A Scheme interpreter for VxWorks.
[ Download ]
Vx-scheme is a compact, fairly efficient implementation of the
Scheme programming language. It has some special features designed
to allow it to integrate with the VxWorks real-time operating
system shell. It is very nearly compliant to the R4RS
language standard: in particular, it supports continuations with
infinite lifetimes and all the procedures mentioned in the
specification, including the optional ones (such as
UsesVX-scheme was briefly used in a humanoid robotics project. See Chris Gaskett's page at James Cook University in Cairns, Australia.
Scheme is sort of a hobby of mine, as I'm a fan of Abelson, Sussman and Sussman's seminal text Structure and Interpretation of Computer Programs [SICP]. This implementation was born as I tried to follow the Scheme implementation experiments in chapters 4 and 5 of that book using C as the implementation language. Once that was working well enough, I then undertook to implement the complete R4 standard of the language, thinking it looked pretty easy! The R4 standard is a marvel of concision, but still conceals within it about 200 procedures and special forms. I also considered how the Scheme language might be integrated with VxWorks, an RTOS made by my former employer, Wind River (today I work at Google). That RTOS has a simple control shell based on C syntax, that's integrated with the runtime symbol table. Could Scheme do that as well? I wanted to find out.
First of all I wanted to learn the concepts in Scheme language implementation first-hand, so that I could grow as a programmer. To that end, I restrained myself from peeking at other implementations of Scheme on the net "to see how it should be done." That temptation was very hard to resist with Aubrey Jaffer's SCM, which is considerably faster than my implementation and almost as small. I did use SCM as a "reference implementation" to find out the "right behavior" in a number of tricky circumstances, and I made extensive use of his R4 test suite during the project.
(My first version model had the special forms implemented as C
functions. But this made complete support of
VxWorks programmers prize compact implementations, and this was perhaps the one constraint I adhered to most strongly. The entire executable code is less than 64Kb of code/data on VxWorks and FreeBSD. The system allocates an initial budget of only 10000 cons cells, and garbage collects when that budget is exhausted (interestingly, most of my benchmarks can actually survive on that small ration!) Whenever the code bloated over 64Kb, I would stop work and wring out the fat. (While the interpreter is written in C++ for notational convenience's sake, my tight size budget forbade the use of STL classes, or indeed any sort of template. The program uses a rather strict subset of C++'s features). Not having STL at hand meant I had to implement an efficient symbol store for the program (I like Knuth's implementation of AVL for jobs like this. A hash table might have been better, but AVL never runs out of gas. Knuth's implementation, which eschews recursion, is just about as fast as a balanced tree can get, IMO.)
I also didn't want any time-consuming implementation shortcuts.
Each and every procedure mentioned in the R4
spec has its own natural, efficient C implementation. (For example,
we don't ever do syntactic transformation, e.g., transforming
That said, it's still an interpreter, and doesn't attempt any compilation (or even the kinds of syntactic transformation that could save time rather than lose it). The "register machine" at the heart of the evaluator is a fairly faithful implementation of the design given in SICP.
VxWorks Shell Integration
In the VxWorks "C" shell, you can work with integer variables and call functions with a mix of integer and string parameters. VxScheme can do these things too. In the event that a symbol has no value in any of the current execution context's enclosing environments, vx-scheme will "fall through" to the VxWorks symbol table. If it finds a data symbol, it returns the current value of the variable as the value of the expression. If it finds a text symbol, vx-scheme constructs a procedure that will marshal its arguments into integer form and invoke the underlying function. It's meant to seem simple on the user side:
...of course, in Scheme syntax, the expression " i " would have the value of a procedure object, right? To actally call the procedure, we surround it with parens as usual (yes, this takes getting used to).
VxWorks variables shine right through:
Vx-scheme converts string arguments by passing the address to
the underlying C function (just as the standard target shell would
do). This has the interesting side effect of letting us use
Of course one thing we get out of this whole exercise is looping and conditionals for the VxWorks shell, things it doesn't have under its current C-expression syntax. Scheme also does a pretty good job of working with files. A good example of these things is the code for the test suite driver. Here's another where we spawn a handful of tasks with different delays.
That quote-mark in front of taskDelay is to prevent the evaluator from substituting a lambda value in place of the variable. We don't want that this time; instead we want the address of taskDelay to be passed to the 'sp' function! The quote allows these two cases to be treated correctly (all in the same S-expression).
This little bit of code creates three tasks and pends them all on the same semaphore.
This would be more interesting if we had a means of spawning a task to compute a Scheme subexpression, which could then deliver its value to a waiting continuation in the spawning environment. But that's a story for another day...
These are the platforms that I habitually run the code on. There's a shell script, "runtest", in the root of the distribution that will run a modest test suite on FreeBSD and Cygwin. This includes Jaffer's R4 compliance test as well as some things I picked up off the net and from SICP.
Memory and Garbage Collection
This implementation uses a fairly standard cons cell structure. The fundamental cell is a struct of two machine pointers, car and cdr. They are required to contain 8-byte aligned addresses, leaving the lower three bits free for tagging. The LSB is the "atom" tag, and the other two are for GC marking/freelist management.
Integers that will fit in 24 bits are stored directly "in the pointer" with no heap allocation. No special syntax is required to access this feature and spillover to 32-bit arithmetic is automatic (though such numbers come from cell storage).
Garbage collection is straight mark & sweep. We start with a budget of 10000 conses. When an allocation fails, GC is attempted at that moment. If we succeed in leaving at least 20% of the slab free, we're happy. Otherwise, we note that our GC attempt wasn't very successful, and arrange to allocate another slab at the next allocation failure.
Deficiencies, and Deviations from the R4 standard
The R4 standard insists that hardware integer overflow must be handled "correctly": either by spilling over into a multiple-precision format, or raising an error. Vx-scheme does the one thing that is forbidden: silently returning the wrong answer! Integer arithmetic in vx-scheme is essentially a "front-end" for the integer arithmetic provided by the C compiler and the hardware, and no apologies are made for it.
Similarly, Scheme insists that when an object is written out, this
should be done so that when it is read back in, the exact same object
is produced. This is tricky for real numbers. In vx-scheme, again, the
floating point I/O is a "front end" for that provided by
While the R4 standard doesn't require any better, perhaps now is the time to point out that we only support 32-bit integers and 64-bit reals as numeric data types, declining to implement support for rationals, arbitrary-precision integers, and complex numbers. If Scheme itself seems an odd thing to run atop an RTOS, these features would make it even more so!
Error handling in vx-scheme is weak. There's no backtrace facility or debugger. When error conditions arise (typically finding the wrong type of atom in an argument list) we print a message indicating the type of atom expected and longjmp back to the top of the read-eval-print loop.
Vx-Scheme is lazy about checking argument lists. If a call to a standard procedure supplies extra arguments, these are typically ignored. Other examples of slothfulness (like letting cond's have multiple else's, etc.) abound. Vx-Scheme is of course interested in giving the correct result for correctly-formed expressions, but is amazingly indulgent of ill-formed ones.
Vx-Scheme is not quite reentrant. While there are very few global variables (and most of those have invariant values that could be shared among multiple threads), some aspects of vx-scheme (such as memory management) have not yet been made MT-safe. Part of the reason for this: how to implement a multi-threaded scheme on an RTOS is actually an interesting question! One way to do it would be to have separate Scheme name/evaluation spaces running in different VxWorks tasks. But another way to do it would share the namespace and allow the "spawning" of threads to evaluate Scheme subexpressions and return the resulting values to the waiting tasks. I haven't decided which (if any) of these models to attempt.
If I were willing to learn about and implement R5's
syntax-extension facility, this implementation would be within
striking distance of R5 compliance. R4 requires no support for
defining new special forms whatsoever. But I decided to add support
for defmacro, a very handy tool. It's strong enough to implement
the stream processing features detailed in SICP's
§3.5. Here's a more pedestrian example, the
(I include this example because I think Scheme's
Benchmarking is facilitated by the
Chris Gaskett generously contributed an init file for Aubrey Jaffer's SLIB. This is integrated into the test suite (on Linux and Cygwin).
This scheme interpreter has a facility wherein the evaluator will call out to an OS layer just before it is about report that a symbol has no binding. This gives the OS a chance to supply a binding using its own resources (in the case of VxWorks, the C symbol table).
How this works is the OS returns a magic cell, which is a
kind of atom that has two function pointers in it,
VxWorks functions are a little more complicated: they actually return a lambda expression which delegates to a special function called "vx-invoke" whose job is to dispatch a call to the VxWorks function after converting the arguments.
But I think magic cells are a neat hack: kind of like java bean or COM “properties.” I'm sure it's been done before, of course, but it's interesting to speculate on other uses for this trick.
There's an interface to supply a “flag word” to the Scheme process. On UNIX/Cygwin, the environment variable ‘T’ can be set to an integer which sets any of the bits in the following table. On VxWorks, the global variable vxSchemeDebug can be set to the desired value using the C shell. The bit values are:
Check out the VxWorks Cookbook at John Gordon's site, bluedonkey.org. It's an excellent collection of VxWorks (and related) embedded technique assembled by one of the masters.
Permit me to thank the following people who have helped with this research:
Copyright © 2002-2003 Colin Smith.