Monday, September 24, 2007

Windows SEH Revisited

It turns out there are Win32 API calls to install an exception handler on Windows NT, which is easier and more reliable than setting up a structured exception handler (SEH) frame with inline assembly. To install a handler, call AddVectoredExceptionHandler (strictly speaking, this is vectored exception handling, a process-wide relative of frame-based SEH). The handler is passed a PEXCEPTION_POINTERS, a pointer to a struct holding a PEXCEPTION_RECORD and a CONTEXT*, both of which are defined in the header files. The CONTEXT structure is different on every platform because it lists the contents of the CPU registers at the time of the exception, while the EXCEPTION_RECORD struct is the same everywhere and contains the ExceptionCode, the ExceptionAddress of the faulting instruction and, for access violations, the memory address that could not be accessed (in ExceptionInformation[1]).

The job of the exception handler is to determine where program execution should proceed after it returns. For Factor, the errors we handle are memory access errors (which happen when a stack overflows or underflows and hits a guard page), division by zero, and 'any other error'. We can't just jump to the Factor error handlers from inside the Win32 exception handler, but we can set the instruction pointer, the EIP register, to our handler and return EXCEPTION_CONTINUE_EXECUTION. In this way, we let the operating system catch errors, report them to the user as "Data stack underflow" or "Division by zero", and continue running Factor without expensive checks on each memory access or division.

The stack pointer at the time of the exception is also important. If we were executing compiled Factor code, as determined by checking the instruction pointer (EIP) against the C predicate in_code_heap_p, then we save the stack pointer (ESP) in a global so that execution can continue from there after the exception is handled. However, if we are in C code, then the global is set to NULL.

Here is the code--much cleaner, and more correct, than before.

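void c_to_factor_toplevel(CELL quot)
{
    /* RemoveVectoredExceptionHandler expects the handle returned by
       AddVectoredExceptionHandler, not the handler function itself. */
    PVOID handle = AddVectoredExceptionHandler(0,
        (PVECTORED_EXCEPTION_HANDLER)exception_handler);
    c_to_factor(quot);
    RemoveVectoredExceptionHandler(handle);
}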

/* Vectored handlers are called with the CALLBACK (stdcall) convention on x86. */
LONG CALLBACK exception_handler(PEXCEPTION_POINTERS pe)
{
    PEXCEPTION_RECORD e = (PEXCEPTION_RECORD)pe->ExceptionRecord;
    CONTEXT *c = (CONTEXT*)pe->ContextRecord;

    /* Remember the stack pointer only if the fault hit compiled Factor code. */
    if(in_code_heap_p(c->Eip))
        signal_callstack_top = (void*)c->Esp;
    else
        signal_callstack_top = NULL;

    if(e->ExceptionCode == EXCEPTION_ACCESS_VIOLATION)
    {
        /* ExceptionInformation[1] is the address that could not be accessed. */
        signal_fault_addr = e->ExceptionInformation[1];
        c->Eip = (CELL)memory_signal_handler_impl;
    }
    else if(e->ExceptionCode == EXCEPTION_FLT_DIVIDE_BY_ZERO
        || e->ExceptionCode == EXCEPTION_INT_DIVIDE_BY_ZERO)
    {
        signal_number = ERROR_DIVIDE_BY_ZERO;
        c->Eip = (CELL)divide_by_zero_signal_handler_impl;
    }
    else
    {
        /* Any other error. */
        signal_number = 11;
        c->Eip = (CELL)misc_signal_handler_impl;
    }

    /* Resume at the EIP chosen above instead of the faulting instruction. */
    return EXCEPTION_CONTINUE_EXECUTION;
}

void memory_signal_handler_impl(void)
{
    memory_protection_error(signal_fault_addr,signal_callstack_top);
}

void divide_by_zero_signal_handler_impl(void)
{
    general_error(ERROR_DIVIDE_BY_ZERO,F,F,signal_callstack_top);
}

void misc_signal_handler_impl(void)
{
    signal_error(signal_number,signal_callstack_top);
}

Saturday, September 08, 2007

Destructors in Factor

After spending way too much time trying to perfect Factor's win32 api code, I wrote a word I should have written long ago: with-destructors. What this allows you to do is allocate a system resource, add a destructor, and automate the resource cleanup, even when an exception is thrown. Take this buggy code as an example of resource leaks.

TUPLE: mallocs one two three ;

: three-mallocs-buggy ( -- obj )
    100 malloc
    200 malloc
    300 malloc
    \ mallocs construct-boa ;

Any one of these calls to malloc could fail. If the first one fails, an error is thrown and no resources are lost. However, if the second or third fails, nothing will ever clean up after the successful allocations, and resources are leaked!

One alternative is to store each malloc result in a tuple slot as soon as it succeeds. This solution is quite verbose and needs an extra cleanup word (boilerplate).

TUPLE: mallocs one two three ;

: cleanup-mallocs ( mallocs -- )
    dup mallocs-one [ free ] when*
    dup mallocs-two [ free ] when*
    mallocs-three [ free ] when* ;

: three-mallocs-verbose ( -- obj )
    \ mallocs construct-empty
    f
    [
        drop
        100 malloc over set-mallocs-one
        200 malloc over set-mallocs-two
        300 malloc over set-mallocs-three
        t
    ] [
        [ cleanup-mallocs ] unless
    ] cleanup ;

We need a boolean because we only want to clean up resources if something fails. See how we save each malloc as it's created? Otherwise it could get lost. This tedious method is how much of the win32 native io (io completion ports) is implemented right now. Notice that the cleanup word doesn't even set the slots in the tuple to f, so if you somehow called cleanup-mallocs twice, the same memory would be freed twice and your program would hopefully crash (sooner rather than later!). More boilerplate would fix it.
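
For comparison, here is roughly what the same bookkeeping looks like in C, including the extra boilerplate that clears each slot so a second cleanup is harmless. This is only a sketch: the mallocs struct mirrors the tuple above, while the injectable alloc parameter and stingy_alloc are made up so that a failing allocation can be simulated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { void *one, *two, *three; } mallocs;

/* Free whatever was allocated and clear each slot,
   so calling this twice does not free anything twice. */
void cleanup_mallocs(mallocs *m)
{
    assert(m != NULL);
    free(m->one);   m->one = NULL;    /* free(NULL) is a no-op */
    free(m->two);   m->two = NULL;
    free(m->three); m->three = NULL;
}

/* Store each allocation in its slot as soon as it succeeds.
   Returns true on success; on failure everything allocated so far is
   released. alloc is a parameter only so failures can be simulated. */
bool three_mallocs(mallocs *m, void *(*alloc)(size_t))
{
    assert(m != NULL && alloc != NULL);
    m->one = m->two = m->three = NULL;
    if ((m->one = alloc(100)) != NULL
        && (m->two = alloc(200)) != NULL
        && (m->three = alloc(300)) != NULL)
        return true;
    cleanup_mallocs(m);
    return false;
}

/* Hypothetical allocator for demonstration: refuses requests over 250 bytes. */
void *stingy_alloc(size_t n)
{
    return n > 250 ? NULL : malloc(n);
}
```

Calling three_mallocs(&m, stingy_alloc) fails on the third allocation, and the first two blocks are freed before it returns false. Every function that acquires more than one resource ends up repeating this pattern by hand.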

Instead, let's wrap each returned resource in a destructor.

TUPLE: mallocs one two three ;

: three-mallocs ( -- obj )
    [
        100 malloc dup [ free ] f add-destructor
        200 malloc dup [ free ] f add-destructor
        300 malloc dup [ free ] f add-destructor
        \ mallocs construct-boa
    ] with-destructors ;

Ah! This is marginally more work than the first example, but it does not leak if an allocation fails. The word add-destructor ( obj quot always? -- ) takes an arbitrary object, a destructor quotation (some code), and a boolean that tells it under which circumstances to clean up the resource. Calling add-destructor with t will always clean up the resource; calling it with f will only clean up if the quotation passed to with-destructors fails. With f, a successfully allocated resource outlives with-destructors, so a cleanup routine is still required elsewhere, but we can worry about that later. The duplicated code could be factored out if you find yourself using it often, but I have chosen not to here because of the tricky boolean flag for add-destructor. In practice, I need to save about half of the resources and to destroy the other half very soon after creation. However, it still might be best to factor out the duplicate code:
: destruct-malloc-on-fail ( obj -- ) [ free ] f add-destructor ;

This example is trivial compared to using win32 for memory-mapped io, which requires escalating two privileges, opening a file, creating a file mapping, calling MapViewOfFile, and lowering both privileges, any of which could fail! This series of calls allocates two handles (one for the file, one for the mapping) and requires unmapping the file during cleanup. The four calls to the privilege routines also call malloc, so they could leak resources too!

This complexity is the norm when writing code for performance and reliability in win32.

The destructor implementation is simple:

USING: continuations kernel namespaces sequences vectors ;
IN: destructors

SYMBOL: destructors
SYMBOL: errored?
TUPLE: destructor obj quot always? ;

<PRIVATE

: filter-destructors ( -- )
    errored? get [
        destructors [ [ destructor-always? ] subset ] change
    ] unless ;

: call-destructors ( -- )
    destructors get [
        dup destructor-obj swap destructor-quot call
    ] each ;

PRIVATE>

: add-destructor ( obj quot always? -- )
    \ destructor construct-boa destructors [ ?push ] change ;

: with-destructors ( quot -- )
    [
        [ call ] [ errored? on ] recover
        filter-destructors call-destructors
        errored? get [ rethrow ] when
    ] with-scope ; inline

with-destructors and add-destructor make up the main interface. If the quotation passed to with-destructors succeeds, filter-destructors keeps only the destructors whose always? flag is set, and call-destructors runs those. If the quotation fails, nothing is filtered: every registered destructor runs, and then the error is rethrown.
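
For readers who don't know Factor, the same rule can be modelled in a few lines of C. Treat it as a sketch under assumptions, not a port: C has no exceptions, so the body reports failure through its return value and nothing is rethrown, the fixed-size array stands in for the vector, and demo, bump and demo_body are made-up helpers that exist only to show which destructors fire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DESTRUCTORS 16

typedef void (*destructor_fn)(void *obj);

typedef struct {
    void *obj;
    destructor_fn fn;
    bool always;   /* true: run on success and failure; false: failure only */
} destructor;

typedef struct {
    destructor items[MAX_DESTRUCTORS];
    size_t count;
} destructor_scope;

/* Models add-destructor: remember obj together with its cleanup function. */
bool add_destructor(destructor_scope *s, void *obj, destructor_fn fn, bool always)
{
    assert(s != NULL && fn != NULL);
    if (s->count == MAX_DESTRUCTORS)
        return false;
    s->items[s->count++] = (destructor){ obj, fn, always };
    return true;
}

/* Models with-destructors: run body, then run every destructor on failure,
   or only the always ones on success. Returns what body returned. */
bool with_destructors(bool (*body)(destructor_scope *s, void *arg), void *arg)
{
    assert(body != NULL);
    destructor_scope s = { .count = 0 };
    bool ok = body(&s, arg);
    for (size_t i = 0; i < s.count; i++)
        if (!ok || s.items[i].always)
            s.items[i].fn(s.items[i].obj);
    return ok;
}

/* Hypothetical demo helpers: count how often each kind of destructor runs. */
typedef struct { int always_hits; int on_fail_hits; bool succeed; } demo;

void bump(void *obj) { ++*(int *)obj; }

bool demo_body(destructor_scope *s, void *arg)
{
    demo *d = arg;
    add_destructor(s, &d->always_hits, bump, true);
    add_destructor(s, &d->on_fail_hits, bump, false);
    return d->succeed;
}
```

With succeed set to true only always_hits is incremented; with succeed set to false both counters are, which matches the behaviour of the t and f flags described earlier.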

Hopefully this library will make dealing with system resources in Factor all but trivial.