Re: Read-line bug in STk 4.0.1 - missing control-Ms.

From: Harvey J. Stein <hjstein_at_bfr.co.il>
Date: 27 Dec 1999 10:48:04 -0500

Ian Wild <ian.wild_at_eurocontrol.be> writes:

> I think read-line is the WRONG place to do this - it should
> be an option affecting the whole file once, not each
> individual access to it. Have you considered an approach
> like Tcl's [fconfigure]?

So you're suggesting a (port-options) function or set of functions?
Sounds reasonable. Alternatively, one could have the port creation
fcns take arguments for this, but there're so many different port
creation fcns that that'd add up to a lot of changes.

Actually, it'd be much better to have a function to do this than to
hang the options off the port creation fcns because the latter doesn't
allow one to change the options. Consider using a package which
returns a port. If the package opens the port with newline conversion
turned on, but you *really* need to get the data verbatim, you'd be
stuck.

In either case, it'd have to affect not only read-line, but all other
reading functions (read, read-char, peek-char, char-ready?, port->*,
and any other ones I missed going through the docs). It looks like
it'd require a fairly large rewrite of the I/O portion of STk. I also
find it a little scary to think that low level fcns like read-char
might read multiple characters to get one character. Especially
worrisome is using char-ready? with read-char to get non-blocking
reading. Consider char-ready? returning #t because a \r can be read,
but read-char hanging because the \n isn't ready yet.
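
To make the char-ready?/read-char hazard concrete, here's a small
simulation. It's in Python rather than STk, and FakePort and its
methods are made up for illustration (nothing here is STk's API).
read_char returns None where a real port would block.

```python
class FakePort:
    """Hypothetical port holding only the characters that have arrived so far."""

    def __init__(self):
        self.buf = ""

    def feed(self, data):
        # Simulate more data arriving on the underlying file descriptor.
        self.buf += data

    def char_ready(self):
        # Naive check: is any raw character available?
        return len(self.buf) > 0

    def read_char(self):
        """Return one converted character, or None where a real port would block."""
        if not self.buf:
            return None
        if self.buf[0] == "\r":
            if len(self.buf) < 2:
                # Can't tell yet whether this \r starts a \r\n pair.
                return None
            if self.buf[1] == "\n":
                self.buf = self.buf[2:]
                return "\n"
        c = self.buf[0]
        self.buf = self.buf[1:]
        return c


port = FakePort()
port.feed("\r")
ready = port.char_ready()      # a raw character is available
got = port.read_char()         # but the converted read can't complete
port.feed("\n")
after = port.read_char()       # now the \r\n pair collapses to one newline
```

Here ready is true while got is None, which is exactly the case where
a blocking read-char would hang after char-ready? said #t.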

WRT general read-line behavior, the current mechanism isn't really
converting end of line tokens. It's not really converting \r\n to \n,
it's just deleting \r. If it were really converting, namely by
reading until \r\n and then returning everything except for the \r\n,
then we'd have problems in general on newline = \n platforms. If it
were to read until \n & delete trailing \r characters, it wouldn't be
quite as dangerous as the current behavior, it would more or less work
in general, but it could still bite people. In any case, getting this
sort of stuff to work properly with EOF, peek-char & char-ready? makes
me nervous...
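
The three behaviors compared above can be sketched side by side. This
is Python for illustration only, working on whole strings instead of
ports; the function names are made up, and treating "delete trailing
\r characters" as stripping every trailing \r is my assumption.

```python
def delete_all_cr(data):
    # The behavior described as current: split on \n, drop every \r anywhere.
    return [line.replace("\r", "") for line in data.split("\n")]


def strip_trailing_cr(data):
    # Read until \n, then delete \r characters only at the end of the line.
    return [line.rstrip("\r") for line in data.split("\n")]


def strict_crlf(data):
    # Strict conversion: only the exact pair \r\n ends a line.
    return data.split("\r\n")


# One embedded \r, one \r\n line ending, one bare \n line ending.
sample = "a\rb\r\nc\nd"

all_deleted = delete_all_cr(sample)
trailing_only = strip_trailing_cr(sample)
strict = strict_crlf(sample)
```

On this sample the first version silently loses the embedded \r
("a\rb" becomes "ab"), the second keeps it, and the strict version
treats the bare \n as data, so "c\nd" comes back as a single line.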

It's not in RnRS, so I guess you could do whatever you want with it,
but I always like a) to see read-line read a line as defined by the
platform on which it's running, and b) to have read-line be
invertible. By being invertible I mean that it should be possible to
reconstruct the input from the output. For read-line, that'd mean
that the original input should be the output strings concatenated
together with newlines in between (if read-line strips them), except
possibly for a missing trailing newline. Some implementations do
manage to get that case right, too. For example, some read-lines
(such as C's fgets) return all characters up to and including the
trailing newline, so that the original input is exactly the
concatenation of the output strings. Guile doesn't do this with
read-line, but does provide %read-line, which returns a dotted pair
consisting of the line and the terminator, either #\newline or
#<eof>.
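
Here's a rough sketch of the invertibility idea, in the spirit of
returning the line together with its terminator. It's Python, not
Guile or STk; read_line_pair and the other names are hypothetical, and
None stands in for the EOF object.

```python
import io


def read_line_pair(stream):
    """Return (line, terminator); terminator is "\n", or None at end of input."""
    chars = []
    while True:
        c = stream.read(1)
        if c == "":
            return ("".join(chars), None)
        if c == "\n":
            return ("".join(chars), "\n")
        chars.append(c)


def read_all_pairs(text):
    # newline="" keeps \r and \n exactly as they appear in the text.
    stream = io.StringIO(text, newline="")
    pairs = []
    while True:
        pair = read_line_pair(stream)
        pairs.append(pair)
        if pair[1] is None:
            return pairs


def reconstruct(pairs):
    # Invert the reads: concatenate each line with its terminator.
    return "".join(line + (term or "") for line, term in pairs)


with_newline = read_all_pairs("a\nb\n")
without_newline = read_all_pairs("a\nb")
```

Because the terminator is reported, input with and without a trailing
newline produces different results, and reconstruct gives back the
original text in both cases.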

In particular, being invertible means that read-line can't be
ambiguous about end of line marking - for a given read, the terminator
has to be exactly one of \r\n or \n. It can't accept both.

So, I guess I'd be most in favor of an option to read-line to say what
EOL should be, with the default being the same as the hosting
platform, and with strict conversion being done. Setting port options
is a close second, although it looks like a lot of work...
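
As a sketch of what I mean by an EOL option with strict conversion
(Python for illustration; this read_line and its eol parameter are
hypothetical, not an existing STk signature):

```python
import io
import os


def read_line(stream, eol=os.linesep):
    """Read up to the exact eol sequence and return the line without it.

    Returns None when nothing is left to read. Only the full eol
    sequence ends a line; any other \r or \n is ordinary data.
    """
    buf = ""
    while True:
        c = stream.read(1)
        if c == "":
            return buf if buf else None
        buf += c
        if buf.endswith(eol):
            return buf[: -len(eol)]


# newline="" keeps the stream from translating line endings itself.
dos = io.StringIO("a\r\nb\nc\r\n", newline="")
first = read_line(dos, eol="\r\n")
second = read_line(dos, eol="\r\n")
third = read_line(dos, eol="\r\n")
```

With eol set to \r\n the bare \n in the middle stays inside the second
line, so nothing is guessed at and nothing is dropped. Note that this
simple version returns None at end of input and so doesn't report
whether the last line had a terminator.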

--
Harvey Stein
Bloomberg LP
hjstein_at_bfr.co.il
Received on Mon Dec 27 1999 - 16:48:40 CET
