Scheme convergence [Was: Re: 3.99.3] from Lars Thomas Hansen on 1998-10-06 (STk_mailing

From: Lars Thomas Hansen <lth_at_ccs.neu.edu>
Date: Tue, 06 Oct 1998 11:05:21 -0400

This is long, but as someone who spends a considerable fraction of his
time working on a Scheme implementation that is very, very different
from STk, I'd like to add a few comments to the thread.

(1) "There can only be one" (Highlander principle)

STk is a very useful system for certain kinds of applications and
prototyping, but it is not a suitable basis for many kinds of research
due to its architecture. It's not alone in this: SCM, Guile, SIOD, Elk,
and probably most other quasi-portable interpreters have the same
problems. Some of these problems are:

- Performance: my rule of thumb is that these interpreters are two
   orders of magnitude slower than a good native-compiled system like
   Chez Scheme.

   Reasons: interpreters are inherently somewhat slow; they are usually
   implemented in a straightforward (non-clever) way, making them slower;
   all objects, including small integers and other data with high
   allocation rates, are heap-allocated; allocation is not fast; and
   garbage collectors are not very good.

- Massive amounts of C code: it's bad enough to have to debug and
   maintain a C run-time system; in these systems the entire Scheme
   run-time library, for the most part perfectly expressible in Scheme,
   is written in highly stylized and usually brittle C.

   Reason: largely, to get reasonable performance from the slow interpreter.

- Dependence on approximate-roots collection: as far as I know, all the
   interpreters mentioned above rely on conservative stack scanning to
   be able to interoperate with C.

   Reason: the interpreters need to interoperate with C to allow
   the run-time library to be written in C.

- Non-moving, non-generational garbage collectors: approximate-roots
   collection requires that objects identified by approximate roots
   (i.e., referenced from the stack) not be moved by the GC; for simplicity's
   sake, the GC moves no objects.

   Reason: The interpreter implementors are more interested in getting
   something out the door than in spending their lives on GC research and
   are happy with simple collectors; I can't say I blame them.

- Hard-to-use FFIs: the FFIs typically require stub functions to be
   written in the same highly-stylized C that the run-time library
   is written in. This makes interfacing to foreign functions hard
   on the programmer (but easy on the Scheme implementer).

   Reason: The stub-requiring FFIs can be accommodated by a portable
   interpreter; a non-stub-requiring FFI requires a little assembly code
   for a C "apply" function.

- Non-clever implementations of continuations: continuations are usually
   implemented using the "stack" strategy (the entire stack is copied out
   on capture and copied back on throw). Implementing simple multitasking
   on top of continuations is not very attractive due to the costs.
   Implementing exceptions on top of continuations is not attractive.

   Reason: simplicity, quasi-portability.

- Quasi-portability: several of these systems are pretty Unix-specific.

   Reason: Scheme interpreter writers reside in universities, where Unix
   is the dominant platform.
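
To make the approximate-roots point concrete, here is a minimal C sketch.
It is not taken from STk or from any of the systems named above; the toy
heap and the names heap, looks_like_cell, mark_from, and scan_roots are
invented for illustration, and a plain array of words stands in for the
C stack. It assumes a mainstream platform where a pointer converts to
uintptr_t as a flat address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy heap (hypothetical): a fixed array of two-word cells. */
#define HEAP_CELLS 8

typedef struct cell {
    uintptr_t car;
    uintptr_t cdr;
} cell;

static cell heap[HEAP_CELLS];
static bool marked[HEAP_CELLS];   /* mark bits kept outside the cells */

/* Conservative test: does this word look like the address of a cell?
   The collector cannot know whether the word really is a pointer or
   just an integer with the same bit pattern. */
static bool looks_like_cell(uintptr_t word, size_t *index_out)
{
    uintptr_t lo = (uintptr_t)&heap[0];
    uintptr_t hi = (uintptr_t)&heap[HEAP_CELLS];

    assert(index_out != NULL);
    if (word < lo || word >= hi)
        return false;                      /* outside the heap */
    if ((word - lo) % sizeof(cell) != 0)
        return false;                      /* not on a cell boundary */
    *index_out = (size_t)((word - lo) / sizeof(cell));
    return true;
}

/* Mark a cell and everything reachable from it; the fields of a cell
   are also examined conservatively in this sketch. */
static void mark_from(uintptr_t word)
{
    size_t i;

    if (!looks_like_cell(word, &i) || marked[i])
        return;
    marked[i] = true;
    mark_from(heap[i].car);
    mark_from(heap[i].cdr);
}

/* Scan n words standing in for the stack; return the number of cells
   that must be treated as live (and, being approximate roots or
   reachable from them, left where they are). */
static size_t scan_roots(const uintptr_t *words, size_t n)
{
    size_t i, live = 0;

    for (i = 0; i < HEAP_CELLS; i++)
        marked[i] = false;
    for (i = 0; i < n; i++)
        mark_from(words[i]);
    for (i = 0; i < HEAP_CELLS; i++)
        if (marked[i])
            live++;
    return live;
}
```

If the scanned words are an ordinary small integer, the address of
heap[0], and a word equal to the address of heap[5], then cell 0, any
cell it points to, and cell 5 are all marked. The collector cannot tell
whether that last word is a pointer or an integer that happens to have
that value, so it cannot rewrite it, and cell 5 has to stay where it is.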

As it happens, our research implementation of Scheme shares none of
these problems (indeed was designed not to have any of these problems),
and is therefore suitable for the kind of research we do: optimizing
compilers, sophisticated garbage collection, continuation stuff,
eventually native threads on multiprocessors, with very limited time for
maintaining a lot of C code that may need to change when the run-time
system conventions change (we're a small group).

Using any of the systems mentioned above for our research would probably
require such massive changes to them that it would amount to a
near-complete reimplementation; using any of them is therefore not an
attractive proposition, and that's why we invented yet another Scheme
implementation.

(2) "We must stand together or we shall surely hang together"

Not everyone working on Scheme is interested in using Scheme for serious
application development, and not everyone is particularly interested in
Scheme taking over Perl's place. Scheme is interesting in that it has
very few restrictions and hence makes programming pleasant and efficient
implementation difficult, yet it is also small. The combination makes
it a tractable and attractive research platform for issues related to
language implementation, and that's how some (including myself) use it.

A plethora of implementations isn't bad in itself; witness the richness
of approaches this has brought us. Scheme->C pioneered compilation to
C, a technique that is now used by Bigloo, Gambit, RScheme, and by an
experimental compiler in MzScheme. It also pioneered mostly-copying
garbage collection, a clever technique used by commercial and research
systems alike. Gambit-C and MzScheme are notable for their portability.
RScheme has a real-time (generational?) garbage collector. Scheme48 is
the basis for scsh and the Vlisp project on provably correct
implementation. STk and DrScheme have experimented with very different
GUI toolkits, both portable. SCM begat SLIB.

(3) Continued growth of STk

The problem with too many Scheme systems, if there is one, is not that
everyone writes his own system, but that almost everyone writes a system
to do research with, on one level or another. Application programmers
get short shrift, and STk users are application programmers.

Hardware is fast and getting faster, and for that reason STk can survive
as its usage area grows to include ever-larger programs. However, I
suggest to you that this is not desirable.

I think a reasonable growth path for the STk community would be to
*abandon* the STk core and move the interesting features of the system
-- object system, Tk interface -- to an independently-supported
well-performing portable Scheme system like Gambit-C, MzScheme, RScheme,
Scheme48, or Bigloo. (Not all of which are portable in interesting
ways, unfortunately.) Sure, there are disadvantages to this, notably
that one has less control over the run-time system, hence a smaller bag
of tricks to choose from, but in the long run it will almost certainly
be a big win.

Observe how the scsh people avoided reimplementing a Scheme system and
instead used Scheme48, which was stable and available.

(4) Scheme must grow

On a more positive (!) note, a Scheme workshop was held recently at
ICFP. The attendees, who included a number of implementers, Scheme
Report authors, and users, approved: a structured mechanism for
submitting proposals for language features or functionality, a
repository of which will be hosted on www.schemers.org; a feature for
checking the Scheme implementation at macro-expand time; an exception
handling system; a mechanism for source code inclusion (INCLUDE
directive); and improvements to the I/O system. The group also examined
proposals for Unicode, records, and changes to the language to
incorporate IEEE arithmetic. While the workshop did not have official
status wrt the Revised Report or the IEEE standard, the spirit of
cooperation and willingness to make progress was notable.

--lars
Received on Tue Oct 06 1998 - 17:06:18 CEST
