[kaffe] Bug report (java.io.StreamTokenizer)

Hermanni Hyytiälä hemppah@cc.jyu.fi
Mon Jun 30 00:30:02 2003


Hi,

Please see my answer inlined.

On Sat, 2003-06-28 at 05:56, Ito Kazumitsu wrote:
> Hi Hermanni,
> 
> Kaffe's java.io.StreamTokenizer checks each character in the
> following order:
> 
>   isWhitespace
>   isNumeric
>   isAlphabetic
>   chr=='/' && CPlusPlusComments && parseCPlusPlusCommentChars()
>   chr=='/' && CComments && parseCCommentChars()
>   isComment
>   isStringQuote
>
> So '#' is treated as a word character (isAlphabetic) before
> it is checked against isComment.
> 
> I do not think Sun's API document clearly defines in what order
> character types should be checked.  So it can be said that treating
> '#' as a word character is not a bug but so specified.
> 
> But in order to make the behavior of kaffe's java.io.StreamTokenizer
> similar to Sun's,  I suggest that the checking order be changed
> as follows (the more specific, the earlier):
> 
>   isWhitespace
>   chr=='/' && CPlusPlusComments && parseCPlusPlusCommentChars()
>   chr=='/' && CComments && parseCCommentChars()
>   isComment
>   isStringQuote
>   isNumeric
>   isAlphabetic


According to the JLS (first edition), the nextToken method of the
java.io.StreamTokenizer class checks character types in the following
order:

whitespace
numeric character
alphabetic character
comment character
string quote character
comment //
comment /*

However, I haven't tested whether following this order changes the
behaviour of Kaffe's java.io.StreamTokenizer (could someone test
it ?-)).
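
A minimal sketch of such a test might look like the following. It is not
the original test case from the bug report: the class name
TokenizerOrderCheck, the helper tokens(), and the input string are made up
for illustration, and marking '#' as both a word character and a comment
character is only an assumed way of triggering the ordering difference.
It sticks to old-style APIs (StringBuffer, no generics) so that it has a
chance of compiling for older class libraries as well.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;

public class TokenizerOrderCheck {

    // Tokenizes the input and returns the tokens joined by single spaces.
    // (Hypothetical helper, not part of any library.)
    static String tokens(String input) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));

        // Assumption for this sketch: '#' is first declared a word
        // character and then a comment character, so the outcome depends
        // on how the implementation stores and orders the two attributes.
        st.wordChars('#', '#');
        st.commentChar('#');

        StringBuffer out = new StringBuffer();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (out.length() > 0) {
                out.append(' ');
            }
            if (st.ttype == StreamTokenizer.TT_WORD) {
                out.append(st.sval);            // word token
            } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                out.append(st.nval);            // numeric token
            } else {
                out.append((char) st.ttype);    // ordinary character
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // If '#' is handled as a comment character, everything from '#'
        // to the end of the first line is skipped.
        System.out.println(tokens("foo # bar\nbaz"));
    }
}
```

If the comment attribute wins, only "foo" and "baz" should come out; if
'#' is still handled as a word character, "#" and "bar" should show up in
the output as well. Running the same class on Kaffe and on Sun's VM and
comparing the two lines would show whether they disagree.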

The documentation of the nextToken method can be found at
http://java.sun.com/docs/books/jls/first_edition/html/javaio.doc14.html#29287



Thanks,
Hermanni