Garbage collection, part two...

From: Paul Anderson <paul_at_grammatech.com>
Date: Wed, 05 Apr 2000 13:06:28 -0400

Erick:

Sorry for that last truncated post. I accidentally hit the
send button before I was finished with it.

As I was saying....

The fix to the first garbage collection problem is attached
to that email. I think it is self-contained, but if you
have any questions please ask.

The second problem is much more subtle, and harder to fix.

It only shows up when the file containing STk_execute_Tcl_lib_cmd
is compiled at a high level of optimization. We see it when
we compile with gcc version 2.8.1 at -O2, and only on sparc/solaris.
However, there is no reason why another aggressive optimizer
on another platform would not trigger it.

BTW, this was found after several days of excellent detective
work by my colleague Chi-Hua Chen.

In that function there is an array of string-pointers (char **argv)
which is used as the argument to the Tcl function invocation.
This is created by iterating through the arguments and calling
STk_convert_for_Tcl. A side effect of doing this is to create
conv_res, which is an STk vector that contains SCM string values
that point to the same strings as those in argv.
The comment in that function indicates that conv_res is used to
avoid GC problems. However, it isn't quite right.
What happens is that conv_res gets collected and because it
contains pointers to the strings in argv, these strings are
getting freed, and argv is then invalid.

When this function is optimized, conv_res gets placed in a register.
Normally this will protect it from being collected. However, with
aggressive optimization, because the value of conv_res is not used
afterwards, that register is then re-used for something else.
Then, when the GC is invoked sometime in the call
 (*W->fct)(W->ptr, STk_main_interp, argc, argv);
the value that was in conv_res is then collected and
all hell breaks loose.

I see two ways of fixing this. However each has its disadvantages,
and I am not sure if there are other places in the code where the
same problem might show up.

Fix 1 is to trick the optimizer into keeping the value conv_res
in a register. This can be done by passing it to a dummy
procedure. For example:

   void dummy(SCM v) { } /* Should be in a separate compilation unit */

   tkres = (*W->fct)(W->ptr, STk_main_interp, argc, argv);
   dummy(conv_res); /* This reference forces the optimizer NOT to
                        discard the value until afterwards. */
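
A variation on the same idea, sketched below, replaces the dummy call
with a store to a volatile variable. This is not taken from the STk
sources: SCM is a stand-in typedef, call_command stands in for the
(*W->fct)(...) call, and KEEP_ALIVE, keep_alive_sink and
run_with_live_value are hypothetical names. A volatile store is a
side effect the optimizer may not drop, so the value has to remain
available until that point. Like dummy(), it only helps if the
compiler keeps the value somewhere the collector actually scans.

```c
#include <assert.h>
#include <stddef.h>

typedef void *SCM;  /* stand-in for STk's SCM type (assumption) */

/* Hypothetical sink. A store to a volatile object must be performed,
   so the stored value has to stay live until the store. */
static void *volatile keep_alive_sink;

#define KEEP_ALIVE(v) ((void)(keep_alive_sink = (void *)(v)))

/* Stand-in for the Tcl command call, (*W->fct)(...) in the original. */
static int call_command(int argc, char **argv)
{
    (void)argv;
    return argc;
}

int run_with_live_value(SCM conv_res, int argc, char **argv)
{
    int tkres;

    assert(argv != NULL);
    tkres = call_command(argc, argv);
    KEEP_ALIVE(conv_res);   /* conv_res is still referenced here */
    return tkres;
}
```

The macro is used exactly where dummy(conv_res) appears above: after
the call during which a collection might happen.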

Fix 2 is to explicitly protect conv_res from being garbage collected
using STk_gc_protect.
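
The shape of Fix 2 can be sketched with a toy model. Nothing below is
the STk API: toy_gc_protect, toy_gc_unprotect, toy_is_rooted,
probe_command and call_protected are hypothetical stand-ins, and the
real signature of STk_gc_protect should be checked in the STk sources.
The sketch only shows the bracketing: register the location of
conv_res as a root before the call, and remove it afterwards.

```c
#include <assert.h>
#include <stddef.h>

typedef void *SCM;             /* stand-in for STk's SCM type (assumption) */

#define MAX_ROOTS 16
static SCM *roots[MAX_ROOTS];  /* locations the toy collector treats as roots */
static size_t n_roots;

/* Hypothetical analogue of STk_gc_protect: register a root location. */
static void toy_gc_protect(SCM *loc)
{
    assert(n_roots < MAX_ROOTS);
    roots[n_roots++] = loc;
}

/* Hypothetical inverse: drop one registration of loc, if present. */
static void toy_gc_unprotect(SCM *loc)
{
    size_t i;
    for (i = n_roots; i-- > 0; ) {
        if (roots[i] == loc) {
            roots[i] = roots[--n_roots];
            return;
        }
    }
}

/* What a mark phase would ask: does some root currently hold obj? */
static int toy_is_rooted(SCM obj)
{
    size_t i;
    for (i = 0; i < n_roots; i++)
        if (*roots[i] == obj)
            return 1;
    return 0;
}

/* Sample command: reports whether its argument is rooted right now. */
static int probe_command(SCM obj)
{
    return toy_is_rooted(obj);
}

/* Protect before the call, unprotect after it. `command` stands in
   for (*W->fct)(W->ptr, STk_main_interp, argc, argv). */
static int call_protected(SCM conv_res, int (*command)(SCM))
{
    int tkres;
    toy_gc_protect(&conv_res);
    tkres = command(conv_res);  /* a collection could run in here */
    toy_gc_unprotect(&conv_res);
    return tkres;
}
```

Calling call_protected(obj, probe_command) returns 1 because obj is
rooted for the duration of the call; once it returns, obj is no
longer rooted. Unlike Fix 1, this does not depend on where the
optimizer chooses to keep conv_res.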

The trouble with both fixes is that we don't know where else they might
need to be applied. I am not sure I know how to characterise
exactly where such a situation might arise. I know that it is
at least the following:

1. An SCM value is created such that it can be put in a register, and
2. it references a dynamically created structure (such as a string), and
3. that reference is copied elsewhere, and
4. the SCM value becomes a candidate for collection before the copy
     made in 3 is dereferenced.

Please help us understand which of the above is the better solution
(or if there is any other way), and also how we should go about finding
other such places.

Best regards,

Paul

______
Paul Anderson. GrammaTech, Inc. Tel: +1 607 273-7340
mailto:paul_at_grammatech.com http://www.grammatech.com
Received on Wed Apr 05 2000 - 19:07:47 CEST
