[kaffe] Bug report (java.io.StreamTokenizer)
Hermanni Hyytiälä
hemppah@cc.jyu.fi
Mon Jun 30 00:30:02 2003
Hi,
Please see my answer inline.
On Sat, 2003-06-28 at 05:56, Ito Kazumitsu wrote:
> Hi Hermanni,
>
> Kaffe's java.io.StreamTokenizer checks each character in the
> following order:
>
> isWhitespace
> isNumeric
> isAlphabetic
> chr=='/' && CPlusPlusComments && parseCPlusPlusCommentChars()
> chr=='/' && CComments && parseCCommentChars()
> isComment
> isStringQuote
>
> So '#' is treated as a word character (isAlphabetic) before
> it is checked against isComment.
>
> I do not think Sun's API document clearly defines in what order
> character types should be checked. So it can be said that treating
> '#' as a word character is not a bug but so specified.
>
> But in order to make the behavior of kaffe's java.io.StreamTokenizer
> similar to Sun's, I suggest that the checking order be changed
> as follows (the more specific, the earlier):
>
> isWhitespace
> chr=='/' && CPlusPlusComments && parseCPlusPlusCommentChars()
> chr=='/' && CComments && parseCCommentChars()
> isComment
> isStringQuote
> isNumeric
> isAlphabetic
According to the JLS (first edition), the nextToken method of the
java.io.StreamTokenizer class checks character types in the following order:
whitespace
numeric character
alphabetic character
comment character
string quote character
comment //
comment /*
However, I haven't tested whether using this order changes the
behaviour of Kaffe's java.io.StreamTokenizer (could someone
test it ?-)).
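A minimal test for this could look like the sketch below. It assumes the original report involved a '#' that was first made a word character and then a comment character. The class name, the helper method and the input string are made up for illustration and are not taken from the report. In Sun-derived implementations commentChar() replaces the earlier attributes of the character, so there I would expect the output [foo, baz]. An implementation that keeps both attributes and checks isAlphabetic first would return '#' and "bar" as tokens instead.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical test class, not part of Kaffe or of the original report.
class CommentOrderCheck {

    // Tokenizes the input and returns every token as a string.
    static List<String> tokens(String input) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.wordChars('#', '#');  // assumption: '#' is a word character first
        st.commentChar('#');     // and is then declared a comment character

        List<String> result = new ArrayList<String>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                result.add(st.sval);
            } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                result.add(String.valueOf(st.nval));
            } else {
                // Ordinary character: ttype holds the character itself.
                result.add(String.valueOf((char) st.ttype));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Everything from '#' to the end of the line should be skipped
        // if '#' is handled as a comment character.
        System.out.println(tokens("foo # bar\nbaz"));
    }
}
```

Running the same class on Sun's JDK and on Kaffe and comparing the printed lists should show whether the checking order makes a visible difference.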
The documentation of the nextToken method can be found at
http://java.sun.com/docs/books/jls/first_edition/html/javaio.doc14.html#29287
Thanks,
Hermanni