Sunday, October 19, 2008

Moving primitives out of the Factor VM: Factor vs C

In a previous lifetime, I added environment variable primitives to the VM. Well, it turns out that a better place for them is in the Factor basis vocabulary root, so this post is about moving them again.

To move a primitive out of the VM, implement its functionality in Factor code and replace usages with your word if necessary, remove it from vm/primitives.c and core/bootstrap/primitives.factor, remove the primitive code from the VM, make a new image, recompile Factor, and bootstrap. Basically, do the inverse of the previous post.

What's more interesting, unless you have to actually remove a primitive someday and need this reference, is comparing the code for each version side-by-side. The C code has to call functions to prevent the garbage collector from moving data around when it shouldn't, but it's written without the usual ifdefs you will find in most cross-platform C code, so overall it's fairly clean. The Factor version has a high-level protocol that is implemented by both backends across separate files, with one-liners for most of the Unix definitions and high-level combinators for the Windows ones. I find the Factor version much easier to understand and I believe it's more maintainable. Factor is a better C than C.

High-level environment variable interface

Factor's high-level environment variable words let you get a single variable or all of them, set a single variable or all of them, and unset a variable. On Windows you cannot set all of the variables at once, and on Windows CE the whole concept of environment variables does not exist.

Here is the code for the main vocabulary. Notice that there are hooks on the os word, which will be a value like macosx or winnt or linux. The boilerplate at the bottom is for loading the platform-specific code.

USING: assocs combinators kernel sequences splitting system
vocabs.loader ;
IN: environment

HOOK: os-env os ( key -- value )

HOOK: set-os-env os ( value key -- )

HOOK: unset-os-env os ( key -- )

HOOK: (os-envs) os ( -- seq )

HOOK: (set-os-envs) os ( seq -- )

: os-envs ( -- assoc )
(os-envs) [ "=" split1 ] H{ } map>assoc ;

: set-os-envs ( assoc -- )
[ "=" swap 3append ] { } assoc>map (set-os-envs) ;

{
{ [ os unix? ] [ "environment.unix" require ] }
{ [ os winnt? ] [ "environment.winnt" require ] }
{ [ os wince? ] [ ] }
} cond

Unix environment variables, before and after

DEFINE_PRIMITIVE(os_env)
{
char *name = unbox_char_string();
char *value = getenv(name);
if(value == NULL)
dpush(F);
else
box_char_string(value);
}

DEFINE_PRIMITIVE(os_envs)
{
GROWABLE_ARRAY(result);
REGISTER_ROOT(result);
char **env = environ;

while(*env)
{
CELL string = tag_object(from_char_string(*env));
GROWABLE_ARRAY_ADD(result,string);
env++;
}

UNREGISTER_ROOT(result);
GROWABLE_ARRAY_TRIM(result);
dpush(result);
}

DEFINE_PRIMITIVE(set_os_env)
{
char *key = unbox_char_string();
REGISTER_C_STRING(key);
char *value = unbox_char_string();
UNREGISTER_C_STRING(key);
setenv(key, value, 1);
}

DEFINE_PRIMITIVE(unset_os_env)
{
char *key = unbox_char_string();
unsetenv(key);
}

DEFINE_PRIMITIVE(set_os_envs)
{
F_ARRAY *array = untag_array(dpop());
CELL size = array_capacity(array);

/* Memory leak */
char **env = calloc(size + 1,sizeof(CELL));

CELL i;
for(i = 0; i < size; i++)
{
F_STRING *string = untag_string(array_nth(array,i));
CELL length = to_fixnum(string->length);

char *chars = malloc(length + 1);
char_string_to_memory(string,chars);
chars[length] = '\0';
env[i] = chars;
}

environ = env;
}
Factor
USING: alien alien.c-types alien.strings alien.syntax kernel
layouts sequences system unix environment io.encodings.utf8
unix.utilities vocabs.loader combinators alien.accessors ;
IN: environment.unix

HOOK: environ os ( -- void* )

M: unix environ ( -- void* ) "environ" f dlsym ;

M: unix os-env ( key -- value ) getenv ;

M: unix set-os-env ( value key -- ) swap 1 setenv io-error ;

M: unix unset-os-env ( key -- ) unsetenv io-error ;

M: unix (os-envs) ( -- seq )
environ *void* utf8 alien>strings ;

: set-void* ( value alien -- ) 0 set-alien-cell ;

M: unix (set-os-envs) ( seq -- )
utf8 strings>alien malloc-byte-array environ set-void* ;

os {
{ macosx [ "environment.unix.macosx" require ] }
[ drop ]
} case

MacOSX environment variables, before and after

On OSX, we have to use a function to access the environment variable.

#ifndef environ
extern char ***_NSGetEnviron(void);
#define environ (*_NSGetEnviron())
#endif
Factor
USING: alien.syntax system environment.unix ;
IN: environment.unix.macosx

FUNCTION: void* _NSGetEnviron ( ) ;

M: macosx environ _NSGetEnviron ;

Windows NT environment variables, before and after

Draw your own conclusions.

DEFINE_PRIMITIVE(os_env) 
{
F_CHAR *key = unbox_u16_string();
F_CHAR *value = safe_malloc(MAX_UNICODE_PATH * 2);
int ret;
ret = GetEnvironmentVariable(key, value, MAX_UNICODE_PATH * 2);
if(ret == 0)
dpush(F);
else
dpush(tag_object(from_u16_string(value)));
free(value);
}

DEFINE_PRIMITIVE(os_envs)
{
GROWABLE_ARRAY(result);
REGISTER_ROOT(result);

TCHAR *env = GetEnvironmentStrings();
TCHAR *finger = env;

for(;;)
{
TCHAR *scan = finger;
while(*scan != '\0')
scan++;
if(scan == finger)
break;

CELL string = tag_object(from_u16_string(finger));
GROWABLE_ARRAY_ADD(result,string);

finger = scan + 1;
}

FreeEnvironmentStrings(env);

UNREGISTER_ROOT(result);
GROWABLE_ARRAY_TRIM(result);
dpush(result);
}

DEFINE_PRIMITIVE(set_os_env)
{
F_CHAR *key = unbox_u16_string();
REGISTER_C_STRING(key);
F_CHAR *value = unbox_u16_string();
UNREGISTER_C_STRING(key);
if(!SetEnvironmentVariable(key, value))
general_error(ERROR_IO, tag_object(get_error_message()), F, NULL);
}

DEFINE_PRIMITIVE(unset_os_env)
{
if(!SetEnvironmentVariable(unbox_u16_string(), NULL)
&& GetLastError() != ERROR_ENVVAR_NOT_FOUND)
general_error(ERROR_IO, tag_object(get_error_message()), F, NULL);
}

DEFINE_PRIMITIVE(set_os_envs)
{
not_implemented_error();
}
Factor
USING: alien.strings fry io.encodings.utf16 kernel
splitting windows windows.kernel32 ;
IN: environment.winnt

M: winnt os-env ( key -- value )
MAX_UNICODE_PATH "TCHAR" <c-array>
[ dup length GetEnvironmentVariable ] keep over 0 = [
2drop f
] [
nip utf16n alien>string
] if ;

M: winnt set-os-env ( value key -- )
swap SetEnvironmentVariable win32-error=0/f ;

M: winnt unset-os-env ( key -- )
f SetEnvironmentVariable 0 = [
GetLastError ERROR_ENVVAR_NOT_FOUND =
[ win32-error ] unless
] when ;

M: winnt (os-envs) ( -- seq )
GetEnvironmentStrings [
<memory-stream> [
utf16n decode-input
[ "\0" read-until drop dup empty? not ]
[ ] [ drop ] produce
] with-input-stream*
] [ FreeEnvironmentStrings win32-error=0/f ] bi ;

1 comment:

William said...

"Factor is a better C than C."
I think that "C" can be replaced with a lot of languages here.