Header2Scheme:
An Automatic C++ to Scheme Interface Generator
Header2Scheme version 1.4 is now available, and contains a few bug fixes and enhancements. The source code is available
for download.
The Problem
There are many situations in which more text-based interactivity is
desired in a C or C++ program than simple printf's and
scanf's. Unfortunately, creating a full-fledged interpreter is a
challenging and time-consuming task. Tcl is one solution that many
people choose, but it has several disadvantages, among them that it is
slow and stores all data types as character strings. Scheme is a very
elegant interpreted language for which there are many interpreter
implementations. Unfortunately, up until now, Scheme bindings have been
created for relatively few useful libraries, so acceptance of Scheme as
a general interpreted language solution has been minimal.
One solution
Header2Scheme is a program which reads in a directory tree full of C++
header files and generates C++ code from them. This new code,
when compiled and linked with SCM (Aubrey Jaffer's Scheme
interpreter), implements the back end for a Scheme interface to the
classes defined by these header files.
Why SCM?
I looked at a few Scheme interpreters before deciding on SCM, among
them Scheme48 and libScheme. Each had its advantages, but SCM had
the distinction of being the fastest of the pack. Since my work focuses
mainly on real-time graphics applications, SCM had the obvious
advantage. (However, libScheme had a very elegant, object-oriented
programming interface which was very appealing.)
Basic syntax defined by Header2Scheme
Given a C++ class:
class Foo {
public:
int memberFunction();
Yabba *class_variable;
};
The Scheme interface becomes
(define my-foo (new-foo))
(define my-int (-> my-foo 'memberfunction))
(define my-yabba (-> my-foo 'class_variable))
An alternative calling interface is
(foo::memberfunction my-foo)
(foo::class_variable my-foo)
More on the backend
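When interfacing C++ and Scheme, there are several issues to take into
consideration: conversion of basic data types, pointers and references,
type checking, side effects on arguments passed by reference, and class
inheritance. I have tried to address all of these problems in
Header2Scheme.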
Basic C types often have analogous Scheme data structures:
- arrays/vectors
- strings
- ints, floats
Header2Scheme generates code to convert fixed-length C arrays into Scheme
vectors and back. (See the section below on pointers and references.)
Strings have essentially the same representation in Scheme and C, modulo a
type tag in Scheme. Floats and ints are likewise similarly represented and
have all the necessary conversion code generated by Header2Scheme.
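The array conversion described above can be sketched in C++. This is only an illustration under stated assumptions: SchemeVector is a hypothetical stand-in for the interpreter's vector type (here a std::vector<double>), and the function names arrayToVector and vectorToArray are invented for this sketch. They are not taken from Header2Scheme or SCM.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Scheme vector of reals; a real backend
// would use the interpreter's own vector representation instead.
using SchemeVector = std::vector<double>;

// Copy a fixed-length C array into a freshly allocated "Scheme" vector.
template <std::size_t N>
SchemeVector arrayToVector(const float (&arr)[N]) {
    return SchemeVector(arr, arr + N);
}

// Copy a "Scheme" vector back into a fixed-length C array.
// Returns false (and leaves arr untouched) if the lengths differ.
template <std::size_t N>
bool vectorToArray(const SchemeVector &vec, float (&arr)[N]) {
    if (vec.size() != N) return false;
    for (std::size_t i = 0; i < N; ++i)
        arr[i] = static_cast<float>(vec[i]);
    return true;
}
```

Because the array length is part of the C type, the sketch can reject a vector of the wrong length before anything is written.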
More sophisticated C data structures (i.e. the ones which Header2Scheme is
designed to provide access to) are represented in the Scheme backend as
pointers with a type tag.
All "complicated" C structures are represented in the Scheme backend
essentially as C pointers with a type tag. On the Scheme side of things,
there are no notions of pointers, references or dereferencing; everything
is an "object".
When making a function call (see the section below on type checking)
pointers are automatically dereferenced.
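A pointer with a type tag might look roughly like the following. This is a minimal sketch, not the generated code: TaggedPointer, wrapFoo, unwrapFoo and callMemberFunction are hypothetical names, and the tag is a plain string here for readability.

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper: what the Scheme side sees as an opaque "object".
struct TaggedPointer {
    std::string typeTag;  // name of the C++ class, e.g. "Foo"
    void *ptr;            // address of the underlying C++ object
};

struct Foo {
    int value;
    int memberFunction() { return value; }
};

// Wrap a Foo so it can be handed to the Scheme side.
TaggedPointer wrapFoo(Foo *foo) {
    assert(foo != nullptr);
    return TaggedPointer{"Foo", foo};
}

// Recover the Foo behind a tagged pointer, or nullptr if the tag does
// not match. The Scheme user never dereferences anything by hand.
Foo *unwrapFoo(const TaggedPointer &obj) {
    if (obj.typeTag != "Foo") return nullptr;
    return static_cast<Foo *>(obj.ptr);
}

// Roughly what a wrapper for (-> my-foo 'memberfunction) might do.
// Returns false on a wrong-type argument instead of crashing.
bool callMemberFunction(const TaggedPointer &obj, int &result) {
    Foo *foo = unwrapFoo(obj);
    if (foo == nullptr) return false;
    result = foo->memberFunction();
    return true;
}
```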
One of the most elegant aspects of Scheme is that there is no static type
checking; everything is considered to be "data". Because Header2Scheme
fundamentally is dealing with C, however, it necessarily enforces some type
checking at run time.
When you call a member function of a class via Scheme, the code that
Header2Scheme produced checks the Scheme arguments' types and attempts to
find the appropriate (possibly overloaded) C++ function. If one can't be
found, the interpreter produces a wrong type error (as opposed to a
segmentation fault). Otherwise, the arguments are converted to their C
equivalents, the function is called, and the return value, if any, is
converted to Scheme format and returned to the interpreter.
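The check-then-dispatch sequence described above can be illustrated with a small C++ sketch. Everything in it is an assumption made for illustration: SchemeValue is a deliberately simplified value type, Scaler is an invented class with one overloaded member function, and a C++ exception stands in for the interpreter's wrong type error.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified Scheme value: either an integer or a real.
struct SchemeValue {
    enum Kind { Integer, Real } kind;
    long i;    // valid when kind == Integer
    double d;  // valid when kind == Real
};
SchemeValue makeInteger(long v) { return SchemeValue{SchemeValue::Integer, v, 0.0}; }
SchemeValue makeReal(double v)  { return SchemeValue{SchemeValue::Real, 0, v}; }

// An invented C++ class with an overloaded member function.
struct Scaler {
    std::string scale(int)    { return "scale(int)"; }
    std::string scale(double) { return "scale(double)"; }
};

// Sketch of a wrapper: inspect the argument types, pick the matching
// overload, convert the arguments and call it. If no overload matches,
// report a wrong-type error rather than calling with bad data.
std::string callScale(Scaler &obj, const std::vector<SchemeValue> &args) {
    if (args.size() == 1 && args[0].kind == SchemeValue::Integer)
        return obj.scale(static_cast<int>(args[0].i));
    if (args.size() == 1 && args[0].kind == SchemeValue::Real)
        return obj.scale(args[0].d);
    throw std::invalid_argument("wrong type or number of arguments");
}
```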
In Scheme, all arguments to functions are passed by
value. Unfortunately, C++ allows passing pointers and references to
variables during function calls, and many C functions would become
unusable if they were unable to side-effect their parameters. Because
of this, the backend that Header2Scheme generates allows side-effects
to propagate, where possible, to all arguments that are passed by
reference (either via a pointer or a reference). (Scheme integers and
characters cannot be mutated because they are always passed by
value.) Note that this can cause unpleasant and
unexpected effects in code:
class SbVec3f {
(...)
void getValue(float &x, float &y, float &z);
(...)
};
> (define x 0.0)
> (define y 0.0)
> (define z 0.0)
> (define my-list (list x y z))
> my-list
(0.0 0.0 0.0)
> (define my-vec (new-SbVec3f 3 4 5))
> (-> my-vec 'getValue x y z)
> x
3.0
> y
4.0
> z
5.0
> my-list
(3.0 4.0 5.0)
If, on the other hand, x had instead been changed with the standard
Scheme set!:
> (set! x 2.0)
> my-list
(0.0 0.0 0.0)
my-list is (correctly) not mutated in this example.
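One way to picture the difference between the two cases is to model each Scheme real as a heap-allocated box that both the variable and the list refer to. This is an assumed model used only for illustration, not a description of SCM's internals; Box, boxReal, writeThrough and rebind are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical model: a Scheme real lives in a heap-allocated box, and
// variables and list cells hold references to that box.
using Box = std::shared_ptr<double>;

Box boxReal(double v) { return std::make_shared<double>(v); }

// Models a wrapper for a by-reference parameter such as float &x: the
// result is written into the existing box, so every holder sees it.
void writeThrough(const Box &box, double v) {
    assert(box != nullptr);
    *box = v;
}

// Models set!: the variable is rebound to a fresh box, and anything
// else that still refers to the old box is left unchanged.
void rebind(Box &variable, double v) { variable = boxReal(v); }
```

In this model, writing through x changes what a list built from x displays, while rebinding x does not.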
Consider the following two classes:
class a {
public:
void foo();
};
class b : public a {
public:
void bar();
};
The member function "foo" can be called from an object of type "a" or
"b". However, the member function "bar" can only be called from an object
of type "b". Standard Scheme has no built-in notion of inheritance, so the
question remains of how to decide at runtime whether a given member
function can correctly be called for a given Scheme object.
Header2Scheme solves this problem by reconstructing its own idea of the
desired class hierarchy in a pre-processing step while the backend to the
Scheme interface is being generated. This hierarchy is then reconstructed
when the Scheme interpreter is run, and is referenced to resolve run-time
questions of type checking.
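A run-time hierarchy lookup of this kind could be sketched as follows. The table layout and the names Hierarchy and isA are assumptions for illustration, and the sketch handles single inheritance only.

```cpp
#include <map>
#include <string>

// Hypothetical run-time table: class name -> name of its base class
// ("" for a class with no base). Assumed to contain no cycles.
using Hierarchy = std::map<std::string, std::string>;

// True if an object tagged `actual` may be used where `required` is
// expected, i.e. `required` is `actual` itself or one of its ancestors.
bool isA(const Hierarchy &h, std::string actual, const std::string &required) {
    while (!actual.empty()) {
        if (actual == required) return true;
        auto it = h.find(actual);
        if (it == h.end()) return false;  // unknown class name
        actual = it->second;              // walk up to the base class
    }
    return false;
}
```

With the two classes above registered as a (no base) and b (base a), a call to foo on a b object passes the check, while a call to bar on an a object is rejected.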
What has been done with Header2Scheme?
I have used Header2Scheme to create a Scheme binding for Open Inventor, a
3D graphics toolkit developed by Silicon Graphics. This package is called
Ivy, and is available from the Ivy home page.
Changes in version 1.4
- Fixed bugs in generated code for pointers returned from functions.
- Added long/short qualifiers in typedefs file for ints and doubles, and
output them in the argument extraction code so references to these types
work.
- Rewrote the C++ wrappers for the Scheme interpreter so that they
provide a more consistent user interface (the prompt is always shown
when the interpreter is listening) and are more robust (Ctrl-C no longer
crashes the interpreter).
h2s-1.4.tar.gz (290K) contains the source
code and documentation for Header2Scheme. Header2Scheme is Copyright 1995
Kenneth B. Russell and is covered under the GNU General Public License.
Other links
GUILE is a collaborative
effort to make embedded Scheme more ubiquitous.
SWIG is a more up-to-date glue code
generator than Header2Scheme which supports multiple language bindings.
Feedback
Please send me email if you
have any comments, questions or suggestions, or have problems with the
distribution.
Kenneth B. Russell - kbrussel@media.mit.edu
$Id: index.html,v 1.15 2001/06/02 05:20:33 kbrussel Exp $