Re: interesting test of hash-table vs a-list

From: Lars Thomas Hansen <lth_at_ccs.neu.edu>
Date: Mon, 03 May 1999 10:52:19 -0400

Harvey Stein writes:

>Lars Thomas Hansen <lth_at_ccs.neu.edu> writes:
>
> > The best compromise between performance and the distribution given by
> > hashing on the characters of the print name is probably obtained by
> > computing the hash value when the symbol is created and storing it in
> > the symbol for future use. Both MacScheme and Larceny use this trick.
> > The code for symbol-hash is then simple: a type test and a lookup.
> >
> > In Larceny, the break-even point of hash tables vs. alists is at about
> > 25 elements, according to Will Clinger.
>
>What about scrambling the bits of the address, or using the address
>modulo hash table size with the hash table size being a prime number?

In STk this might make quite a bit of sense because the collector is
nonmoving (although that will not give you the distribution you would
get from hashing on the print name, so you rely on getting some benefit
from that prime number).

In a system with a moving collector, hashing on the address will be
about as expensive as a hash on EQ?, which requires occasional rehashes
and so on when things move. No doubt this can be made affordable (which
does _not_ imply competitive with simpler methods), but the complexity
seems uncalled for when memoizing the hash code is such a
straightforward thing to do and gives such a reasonable tradeoff.

--lars
Received on Mon May 03 1999 - 16:53:00 CEST

This archive was generated by hypermail 2.3.0 : Mon Jul 21 2014 - 19:38:59 CEST