The Awesome Factor: 2007

Thursday, December 13, 2007

Sorting Filenames Containing Numbers

Jeff Atwood articulated a problem I usually run into when looking at files in Explorer. If you have a bunch of files with numbers, file "a100.txt" will come before file "a2.txt" when sorted by name. Of course, two is less than 100, so you'd expect "a2.txt" to come first.

Here's a solution in Factor:


: human-sort ( seq -- newseq )
    [ [ digit? ] cut3 >r string>number r> 3array ] map natural-sort
    [ first3 >r number>string r> 3append ] map ;

{ "a100.txt" "a10.txt" } human-sort .
    { "a10.txt" "a100.txt" }

The algorithm is simple: split a filename at the first number, convert that from a string to a number, and let Factor's natural-sort do the rest.

It maps over the sequence, { "a100.txt" "a10.txt" }, and cuts it at the first number inside each element so, after cut3, you get the result on the stack like "a" "100" ".txt", and "a" "10" ".txt". You dip under the top element and convert the to a number, >r string>number r>, and 3array to make a sequence of sequences { { "a" 100 ".txt" } { "a" 10 ".txt" } }. You then call natural-sort on this. Natural-sort knows how to compare element by element, so seeing the "a" "a" is equal it moves onto the next comparison, 100 10. Now that it's in the right order, you still need to turn your sequence back into strings. natural-sort returned { { "a" 10 ".txt" } { "a" 100 ".txt" } }, so map (do something to each element in the sequence) over this and do first3 >r number>string r> 3append, which explodes the array, converts back to a string, and appends three strings together to get the filenames. The dot at the end prints it.

I had to write cut3, but it's a generally useful word for partitioning a sequence that I put into sequences.lib.


: cut3 ( seq pred -- head match tail )
    2dup find drop [
        rot swap cut rot [ not ] compose
        dupd find drop [ cut ] [ f ] if*
    ] [
        drop nip f like f f
    ] if* ;

UPDATE:

I generalized this to work on { "a100b200.txt" "a100b2.txt" } with some suggestions from Slava. If you understood the previous code, this should be pretty understandable too. It keeps calling cut3 until it returns no matches and accumulates the result in a sequence, all the while applying a quotation to whatever matched the predicate. It then compares with natural-sort like before, and transforms these sequences back into filenames. cut3 is cleaned up in this version. I ended up with two general words, cut3 and cut-all that can be put into a library, sequences.lib, and used elsewhere.


: cut-find ( seq pred -- before after )
    dupd find drop dup [ cut ] when ;

: cut3 ( seq pred -- first mid last )
    [ cut-find ] keep [ not ] compose cut-find ;
    
: (cut-all) ( seq pred quot -- )
    [ >r cut3 r> dip >r >r , r> [ , ] when* r> ] 2keep
    pick [ (cut-all) ] [ 3drop ] if ;

: cut-all ( seq pred quot -- seq )
    [ (cut-all) ] { } make ;


: human-sort ( seq -- newseq )
    [ [ digit? ] [ string>number ] cut-all ] map natural-sort
    [ [ dup string? [ number>string ] unless ] map concat ] map ;

{ "a100b2.txt" "a100b200.txt" } human-sort .
    { "a100b2.txt" "a100b200.txt" }

UPDATE TWO:

Here is a much better algorithm that doesn't reconstruct the original filenames and thus can handle leading zeros. It constructs keys that can be sorted by sort-values or sort-keys (sort-values is just as efficient here and it lets you use dup in the map instead of a [ ] keep, which makes the code slightly more legible).


: human-sort ( seq -- newseq )
    [ dup [ digit? ] [ string>number ] cut-all ] { } map>assoc
    sort-values keys ;

{ "000a.txt" "00a.txt" } human-sort .
    { "00a.txt" "000a.txt" }

Monday, November 26, 2007

Cross-platform Factor Install Script

Recently, I wrote an installer/updater bash script that downloads the git repository, compiles, and bootstraps Factor. On Linux, it compiles Factor for console-only if any of the libraries required for the graphical interface are missing (freetype, GL, GLU, X11).

To install Factor to a directory: ./factor.sh install
To update your Factor repository: ./misc/factor.sh update

The script source code is available for browsing or for download.

Tested on Windows XP/Vista in Cygwin, Debian and Ubuntu Linux, and Mac OS X.

Please send bug reports and improvements. Thanks!

Sunday, November 18, 2007

Where's my Raptor!??

My Macbook's dock displayed the default icon instead of a raptor. The fix was to move the move the .app directory to another name, then to move it back. Now I have a raptor!

Does 10.4 do some caching? It probably had it cached wrong since we got the new icon.

Wednesday, November 07, 2007

Doubly-linked Lists, Queues, and Heaps

Doubly-linked Lists and Queues

Factor has had doubly-linked lists for years now, but they were not well-documented or polished. Now, they're documented and have replaced the queues library.

An example of a dlist usage:

<dlist> "red" over push-front "blue" over push-front dup pop-back .
    "red"

You can add/remove nodes of a dlist with push-front, push-back, pop-front, pop-back, delete-node, and search with dlist-find, dlist-contains?.

Finding the length of a dlist is O(1) since it stores the length as dlist-length, a tuple slot.

Heaps

Heaps have been updated to allow for <min-heap> and <max-heap> data structures. Adding elements to a heap is achieved with heap-push ( value key heap -- ), while popping elements is heap-pop ( heap -- value key ).

Factor's green threads implementation had been using a hack for the sleep-queue: each time a new entry was added it would modify a non-growable array, which would then be sorted by the smallest timeout. Adding a sequence of sleep continuations would take O(n^2 log n) time! Running 10000 [ [ 100 sleep ] in-thread ] times should spawn 10000 threads and sleep for 100 ms in each one, and with the old sleep-queue implementation it takes over a minute on my Macbook. Now it's just O(n log n), which takes a second or two.

Monday, September 24, 2007

Windows SEH Revisited

It turns out there are Win32 API calls to set up a structured exception handler (SEH) on Windows NT, which is easier and more reliable to use than inline assembly. To install an exception handler, call AddVectoredExceptionHandler. An exception handler gets passed a PEXCEPTION_POINTERS struct, which is a PEXCEPTION_RECORD and a CONTEXT*, both of which are defined in the header files. The context structure is different on every platform because it lists the contents of the CPU registers at the time of the exception, while the ExceptionRecord struct is the same, and contains the ExceptionCode and the address where it occurred (in ExceptionInformation[1]).

The job of the exception handler is to determine where program execution should proceed after it returns. For Factor, the errors we handle are memory access errors, which happen when a stack overflows or underflows and hits a guard page, division by zero, and 'any other error'. We can't just jump to these Factor exception handlers inside the Win32 exception handler, but we can set the instruction pointer, the EIP register, to our handler and return EXCEPTION_CONTINUE_EXECUTION. In this way, we can let the operating system catch errors and report them to the user as "Data stack underflow", "Division by zero", and continue running Factor without expensive checks for each memory access or division.

The stack pointer at the time of the exception is also important. If we were executing compiled Factor code, as determined by checking the fault address against the C predicate in_code_heap_p, then we set a global with this address to continue execution after the exception is handled. However, if we are in C code, then the global is left as NULL.

Here is the code--much cleaner, and more correct, than before.

void c_to_factor_toplevel(CELL quot)
{
    AddVectoredExceptionHandler(0, (void*)exception_handler);
    c_to_factor(quot);
    RemoveVectoredExceptionHandler((void*)exception_handler);
}

long exception_handler(PEXCEPTION_POINTERS pe)
{
    PEXCEPTION_RECORD e = (PEXCEPTION_RECORD)pe->ExceptionRecord;
    CONTEXT *c = (CONTEXT*)pe->ContextRecord;

    if(in_code_heap_p(c->Eip))
        signal_callstack_top = (void*)c->Esp;
    else
        signal_callstack_top = NULL;

    if(e->ExceptionCode == EXCEPTION_ACCESS_VIOLATION)
    {
        signal_fault_addr = e->ExceptionInformation[1];
        c->Eip = (CELL)memory_signal_handler_impl;
    }
    else if(e->ExceptionCode == EXCEPTION_FLT_DIVIDE_BY_ZERO
            || e->ExceptionCode == EXCEPTION_INT_DIVIDE_BY_ZERO)
    {
        signal_number = ERROR_DIVIDE_BY_ZERO;
        c->Eip = (CELL)divide_by_zero_signal_handler_impl;
    }
    else
    {
        signal_number = 11;
        c->Eip = (CELL)misc_signal_handler_impl;
    }

    return EXCEPTION_CONTINUE_EXECUTION;
}

void memory_signal_handler_impl(void)
{
    memory_protection_error(signal_fault_addr,signal_callstack_top);
}

void divide_by_zero_signal_handler_impl(void)
{
    general_error(ERROR_DIVIDE_BY_ZERO,F,F,signal_callstack_top);
}

void misc_signal_handler_impl(void)
{
    signal_error(signal_number,signal_callstack_top);
}

Saturday, September 08, 2007

Destructors in Factor

After spending way too much time trying to perfect Factor's win32 api code, I wrote a word I should have written long ago: with-destructors. What this allows you to do is allocate a system resource, add a destructor, and automate the resource cleanup, even when an exception is thrown. Take this buggy code as an example of resource leaks.


TUPLE: mallocs one two three ;

: three-mallocs-buggy ( -- obj )
    100 malloc
    200 malloc
    300 malloc
    \ mallocs construct-boa ;

Any one of these calls to malloc could fail. If the first one fails, an error is thrown and no resources are lost. However, if the second or third fail, nothing will ever clean up after the successful allocations, and resources are leaked!

One alternative is to put each malloc into a tuple slot as they succeed. This solution is quite verbose and needs an extra cleanup word (boilerplate).


TUPLE: mallocs one two three ;

: cleanup-mallocs ( mallocs -- )
    dup mallocs-one [ free ] when*
    dup mallocs-two [ free ] when*
    dup mallocs-three [ free ] when* ;

: three-mallocs-verbose ( -- obj )
    \ mallocs construct-empty
    f
    [
        drop
        100 malloc over set-mallocs-one
        200 malloc over set-mallocs-two
        300 malloc over set-mallocs-three
        t
    ] [
        [ cleanup-mallocs ] unless
    ] cleanup ;

We need a boolean because we only want to cleanup up resources if something fails. See how we save each malloc as it's created? Otherwise it could get lost. This tedious method is how much of the win32 native io (io completion ports) is implemented right now. Notice that the cleanup word doesn't even set all the slots in the tuple to f, so if you called cleanup-mallocs twice somehow, your program would hopefully crash (sooner rather than later!). More boilerplate would fix it.

Instead, let's wrap each returned resource in a destructor.


TUPLE: mallocs one two three ;

: three-mallocs ( -- obj )
    [
        100 malloc dup [ free ] f add-destructor
        200 malloc dup [ free ] f add-destructor
        300 malloc dup [ free ] f add-destructor
        \ mallocs construct-boa
    ] with-destructors ;

Ah! This is marginally more work than the first example, but is 100% correct. The word add-destructor ( obj quot always? -- ) takes an arbitrary object, a destructor quotation (some code), and a boolean to tell it under which circumstances to cleanup the resource. Calling add-destructor with t will always clean up the resource; calling it with f will only clean up if the quotation passed to with-destructors fails. Thus, a cleanup routine is required elsewhere, but we can worry about that later. The duplicated code could be factored out if you find yourself using it often, but I have chosen not to here because of the tricky boolean flag for add-destructor. In practice, I need to save about half of the resources and to destroy the other half very soon after creation. However, it still might be best to factor out the duplicate code:

: destruct-malloc-on-fail ( obj -- ) [ free ] f add-destructor ;

This example is trivial compared to using win32 for memory mapped io, which requires: escalating two privileges, opening a file, creating a file mapping, calling map view of file, and lowering both privileges, any of which could fail! This series of calls allocates two file handles and requires unmapping the file during cleanup. The four calls to the privileges routines call malloc, and this could also leak resources!

This complexity is the norm when writing code for performance and reliability in win32.

The destructor implementation is simple:


USING: continuations kernel namespaces sequences vectors ;
IN: destructors

SYMBOL: destructors
SYMBOL: errored?
TUPLE: destructor obj quot always? ;

<PRIVATE

: filter-destructors ( -- )
    errored? get [
        destructors [ [ destructor-always? ] subset ] change
    ] unless ;

: call-destructors ( -- )
    destructors get [
        dup destructor-obj swap destructor-quot call
    ] each ;

PRIVATE>

: add-destructor ( obj quot always? -- )
    \ destructor construct-boa destructors [ ?push ] change ;

: with-destructors ( quot -- )
    [
        [ call ] [ errored? on ] recover
        filter-destructors call-destructors
        errored? get [ rethrow ] when
    ] with-scope ; inline

with-destructors and add-destructor make up the main interface. If the quotation passed to with-destructors succeeds, the always-destructs are filtered out of the destructor sequence, and call-destructors destroys the objects that are left.

Hopefully this library will make dealing with system resources in Factor all but trivial.

Friday, August 31, 2007

Managed malloc and free

The Factor Windows backend has to use manual memory management for Windows runtime callbacks. This opens up a small amount of code to double free errors. And crashes. But Factor is better than that; why crash when you can easily prevent it?

The new malloc adds an entry to a global hashtable, mallocs, whenever new memory is allocated. Upon calling free, it checks that the buffer is still allocated before making the actual memory cleanup call. This managed version of malloc is now the default. To get plain old libc memory allocation, call (malloc) and (free), though the overhead is negligible compared to having your program crash.

Does it work? Yes! Opening a new UI window adds two new mallocs, and closing it takes them away. After running test-all there were two mallocs that went unchecked; both were bugs and corrected in five minutes by inspection. Finally, there is no way to call free twice on a pointer using the managed malloc/free since, if that pointer is not in the mallocs hashtable, an error is thrown. Now, to check that you balance malloc/free calls, you can run your code and make sure the malloc hash doesn't have extra entries. Here's a snippet to count the entries:

mallocs get-global assoc-size .

The code works for calloc and realloc as well, and can be perused here.

Wednesday, June 20, 2007

Roman Numeral Conversion in Factor

I whipped up some words to convert integers to Roman numerals. The word <PRIVATE changes the IN: vocabulary to roman.private to hide the implementation from the library user. Of course you can access these private words, like all words in Factor, with a USE:.

The algorithm is simple. Going from high to low, iterate the Roman numeral values and /mod (integer division with remainder) each with the input, outputting the divisor and replacing the input with the remainder. This algorithm treats 4s and 9s as digits, just as it treats single letters as digits (i, v, x, etc). Without the 4s and 9s, you end up getting longer answers that, while logical, are wrong, e.g. 9 is "ix", not "viv". (I found this bug in the first iteration while writing unit tests.)

The words we care about, >roman and >ROMAN, are placed IN: roman because of the PRIVATE> word, which drops back to the public vocabulary roman. The > in a word's name is a convention for words that do conversions; the parentheses around the word (>roman) mean it's an implementation word; you should never have a (>roman) without also having a >roman. Picking these names is done purely by convention--the only forbidden word names are numbers and words that start with a ", which parse as strings. Everything until whitespace is a word name.

The conversion from Roman numerals back to integers and roman+, roman*, etc are in roman.factor.

USING: arrays assocs kernel math math.vectors namespaces
quotations sequences sequences.private strings ;
IN: roman

<PRIVATE

: roman-digits ( -- seq )
    { "m" "cm" "d" "cd" "c" "xc" "l" "xl" "x" "ix" "v" "iv" "i" } ;

: roman-values ( -- seq )
    { 1000 900 500 400 100 90 50 40 10 9 5 4 1 } ;

TUPLE: roman-range-error n ;

: roman-range-check ( n -- )
    dup 1 3999 between? [
        drop
    ] [
        <roman-range-error> throw
    ] if ;

: (>roman) ( n -- )
    roman-values roman-digits [
        >r /mod swap r> <repetition> concat %
    ] 2each drop ;

PRIVATE>

: >roman ( n -- str )
    dup roman-range-check [
        (>roman)
    ] "" make ;

: >ROMAN ( n -- str ) >roman >upper ;

Friday, April 27, 2007

Windows CE SEH for ARM with GCC

Structure exception handling (SEH) is a low-level way to handle software and hardware exceptions in Microsoft Windows. Other exception handling mechanisms are built on top of SEH. Factor uses SEH in order to report stack underflow/overflow errors on Windows. Microsoft Visual Studio (VS) would usually take care of the tricky implementation details by supplying __try, __except, and __finally keywords, but Factor's compiler relies on globally assigning the datastack and retainstack to specific registers, a feature which VS does not support. Conversely, GCC does not support __try, so we are left to implement the exception handler in assembly.

The internal implementation is both OS and architecture dependent, meaning that we have to support SEH on Windows NT on x86, and Windows CE for x86 and ARM. Digging through MSDN leads to a page called SEH in RISC Environments, where RISC standing for "Reduced Instruction Set Computer". (You are just supposed to know that ARM is RISC and x86 is CISC). I recommend reading about SEH on MSDN, but it basically says that SEH on RISC uses Virtual Unwinding and that you will need to set the PDATA structure for your function in order to set up the exception handler. Virtual unwinding traverses the framestack until it finds a frame with an exception handler, at which time it will actually unwind the framestack to that point and call the exception handler.

All we need to do is define void run_toplevel(void){...} to install our exception_handler and call Factor's "main" subroutine, run.

vm/os-windows-ce.c


long exception_handler(PEXCEPTION_RECORD rec, void *frame, void *ctx, void *disp
atch)
{
    memory_protection_error(
        rec->ExceptionInformation[1] & 0x1ffffff, // mask off process slot
        native_stack_pointer());
    return -1; /* unreachable */
}

vm/os-windows-ce-arm.S


 .text

 .globl run_toplevel

 .word exception_handler
 .word 0

run_toplevel:
 ldr pc, _Prun

_Prun:  .word run

 .section .pdata
 .word run_toplevel
 .word 0xc0000002 | (0xFFFFF << 8)

The C code passes the memory fault address, stored in ExceptionInformation[1], and the native stack pointer (another assembly function that simply moves ESP -> EAX) to a function that converts the fault address into a Factor stack error message, e.g. "Datastack underflow". The fault address is actually a slotized address, meaning that it is relative to some process slot which must be masked off. Annoyingly, the exceptions happen at different address ranges on the emulator and on a real mobile device. Slotized addresses in this context are not well documented on MSDN, but the Windows CE Blog Team quickly answered my email. (Thanks!)

The assembly code is mostly boilerplate. The .text directive starts us in the executable code section. We're going to be exporting void run_toplevel(void){...} in order to call it from C, so we declare it as a .globl and start the definition. We simply call our "main" function, void run(void){...}, which is the platform-dependent run function to start the Factor interpreter.

Hopefully, if someone else has to do this it will take much less time.

Thursday, April 12, 2007

Building Factor in Cygwin now supported

Factor has only compiled on MinGW since we made the Windows Factor port a couple of years ago. The problem with Cygwin has been its dependence on cygwin1.dll, which slows your program. Also, the Cygwin installer doesn't add it to your path, so that is one more step (and not standard). Additionally, structured exception handling (SEH) works the first time but hangs on the second time it triggers with the dll, e.g. on the second stack underflow.

However, Eric Mertens, a new Factor user, found a compiler flag to disable cygwin.dll:
-mno-cygwin

Three cheers for new users!

This flag fixed both the dll dependency and the SEH bug, allowing all unit tests to pass. The only other change I had to make for the port was to #include <wchar.h> for the compiler to find wcslen().

So now you can compile Factor with either MinGW or Cygwin, and both versions are officially supported.

The Awesome Factor