More info on the Linux dump saga - partial success.

From: Harvey J. Stein <hjstein_at_math.huji.ac.il>
Date: Wed, 14 Dec 1994 13:05:54 +0200

Summary:

It looks like STk's dump & restore code works under Linux as long as
you link stk-bin statically! If stk-bin is dynamically linked, then
restoring images seg faults.

To recap:

I followed Amancio Hasty's advice to see if stk's dump/restore code
would work under Linux.

After recompilation, dump files would be created, but would cause
segmentation faults when I tried to use them.

Here's the new news:

I recompiled stk with debugging on so that I could see a stack trace
of what caused the seg fault, and I stopped getting seg faults!!!!!
The -image switch started working fine!!!!

Looking things over a bit, I figured that this was probably because I
was now creating a statically linked binary. So, I re-linked stk-bin
dynamically & I started getting the seg fault again.

So, it seems that if one links stk-bin statically under Linux (instead
of dynamically) one can then use the dumping/restoring features.

As for why this matters, I don't know. Getting a stack trace didn't
help much:

hjstein_at_udun:~/STk-2.1.4/Src$ gdb stk-bin
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.12 (i486-unknown-linux), Copyright 1994 Free Software Foundation, Inc...
(gdb) run -no-tk -name foo -image test.dump
Starting program: /home/hjstein/STk-2.1.4/Src/stk-bin -no-tk -name foo -image test.dump

Program received signal SIGSEGV, Segmentation fault.
0x60034b10 in _end ()
(gdb) bt
#0 0x60034b10 in _end ()
#1 0x0 in _entry ()
(gdb) The program is running. Quit anyway (and kill it)? (y or n) y

So, I tried running & stepping through in the debugger. I found that
I'm getting a seg fault on the line:

  apply(gcont, LIST1(ntruth));

in the function internal_restore in dump.c.

Stepping through until the error occurs indicates it occurs on the
last line of lthrow (in cont.c), namely:

  longjmp(C_ENV(tmp), JMP_THROW);

I don't know exactly what's going on, but it seems like STk dumps by
just writing out all of memory (as in a core dump). To restore the
stack properly, it seems that it captures a continuation before dumping
so that it can restore by just resuming that continuation.
I'm not sure because I know nothing about continuations in Scheme.
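
For anyone following along, here is a minimal sketch (not STk's code) of
the setjmp/longjmp mechanism that the lthrow line above relies on. The
names saved_env, throw_to_saved and capture_and_throw are made up for
illustration, and the value given to JMP_THROW is a placeholder, not
STk's definition.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>

/* Placeholder value for illustration; STk's real JMP_THROW may differ. */
#define JMP_THROW 2

/* Hypothetical stand-in for the environment stored in a continuation. */
static jmp_buf saved_env;

/* Jump back to wherever setjmp(saved_env) was last called. */
static void throw_to_saved(void)
{
    longjmp(saved_env, JMP_THROW);
}

/* Save the current context, then resume it through longjmp. */
int capture_and_throw(void)
{
    switch (setjmp(saved_env)) {
    case 0:
        /* First return from setjmp: the context has just been saved. */
        throw_to_saved();
        assert(!"not reached: longjmp does not return");
        return -1;
    case JMP_THROW:
        /* Second return from setjmp, caused by the longjmp above. */
        printf("resumed via longjmp, code %d\n", JMP_THROW);
        return JMP_THROW;
    default:
        return -1;
    }
}
```

A jmp_buf typically holds raw register values such as the stack pointer
and program counter, so a longjmp through one that was written out and
read back in can only work if those addresses mean the same thing in
the new process. Whether that is what goes wrong in the dynamically
linked case is only a guess.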

In any case, it seems that for some reason this works if the Linux
binary is statically linked & doesn't work if it is dynamically
linked. As to why, I can only guess...

Anyway, that's the current status. I don't know where I'll go from
here, aside from seeing how big stk-bin is when it's compiled -O2 &
statically linked. The -g statically linked guy is 5meg, but that's
mostly from -g, since the -g dynamically linked binary is also about
5meg.

Maybe someone who knows more about dynamic libraries under Linux can
pick up the ball...

Dr. Harvey J. Stein
Berger Financial Research
hjstein_at_math.huji.ac.il
Received on Wed Dec 14 1994 - 12:07:15 CET
