Thursday, February 14, 2008

Disassembler Vocabulary "ported" to Windows

Slava wrote a vocabulary that sends gdb a process id and a range of addresses to disassemble, which streamlines the process of writing compiler optimizations. The original code only worked on Unix because it calls unix:getpid, which is a Unix system call. The Windows equivalent is GetCurrentProcessId. Since we need to call the Unix version on Mac OS X and Linux, and the Windows API call on Windows XP, we use a Factor design pattern -- the HOOK:.

The HOOK:



The word in question is called make-disassemble-cmd. Behold its majesty:

QUALIFIED: unix

M: pair make-disassemble-cmd
    in-file [
        "attach " write
        unix:getpid number>string print
        "disassemble " write
        [ number>string write bl ] each
    ] with-file-out ;


You can see where it's calling unix:getpid. Knowing the Windows API call from above, it's easy to write a version that works on Windows:

M: pair make-disassemble-cmd
    in-file [
        "attach " write
        GetCurrentProcessId number>string print
        "disassemble " write
        [ number>string write bl ] each
    ] with-file-out ;


Obviously, this is unacceptable, because now it doesn't work on Unix! If we give the Windows version of make-disassemble-cmd a different name, there are still two nearly identical copies of the word, and both get loaded on platforms where one of them shouldn't be. We really just want to swap out the one call that changed, so...

Let's make a HOOK:

! in io.launcher
HOOK: current-process-handle io-backend ( -- handle )

! in io.unix.launcher
M: unix-io current-process-handle ( -- handle ) getpid ;

! in io.windows.launcher
M: windows-io current-process-handle ( -- handle ) GetCurrentProcessId ;

! in tools.disassembler
M: pair make-disassemble-cmd
    in-file [
        "attach " write
        current-process-handle number>string print
        "disassemble " write
        [ number>string write bl ] each
    ] with-file-out ;


Now, there is just one word that will do the work on both platforms. The relevant code is only loaded into the image on the correct platform, and the problem is solved without renaming lots of words, without #ifdefs, and without copy/pasting.
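If you want to poke at the dispatch itself without touching any real system calls, here is a minimal sketch. Everything in it is made up for illustration: the vocabulary hook-sketch, the variable backend, the singletons fake-unix and fake-windows, and the word pid-call-name are hypothetical stand-ins for io-backend and the real launcher code, and the methods just return the name of the call they would make.

```factor
USING: io namespaces ;
IN: hook-sketch

! Hypothetical stand-ins for the real io-backend variable and its values.
SYMBOL: backend
SINGLETON: fake-unix
SINGLETON: fake-windows

! A hook is a generic word that dispatches on the value of a variable
! instead of on an object from the stack.
HOOK: pid-call-name backend ( -- str )

! One method per backend; each returns the name of the call it stands for.
M: fake-unix pid-call-name "getpid" ;
M: fake-windows pid-call-name "GetCurrentProcessId" ;

! Bind the variable, then call the very same word under each backend.
fake-unix backend [ pid-call-name print ] with-variable
fake-windows backend [ pid-call-name print ] with-variable
```

The caller never mentions a platform. It just calls pid-call-name, and whatever backend happens to be set to picks the method. In the real code the variable is set once for the running platform, so only that platform's method ever needs to be loaded.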
