[kaffe] Bug report (java.io.StreamTokenizer)
Ito Kazumitsu
ito.kazumitsu@hitachi-cable.co.jp
Fri Jun 27 19:54:01 2003
Hi Hermanni,
In message "[kaffe] Bug report"
on 03/06/27, Hermanni Hyytiälä <hemppah@cc.jyu.fi> writes:
> Token: # (type: -3)
> The tokenizer is initialized in
> sandStorm.main.SandstormConfig$configSection (starts from line 610) like
> this:
> tok = new StreamTokenizer(in);
> tok.resetSyntax();
> tok.wordChars((char)0, (char)255);
> tok.whitespaceChars('\u0000', '\u0020');
> tok.commentChar('#');
> tok.eolIsSignificant(true);
This initialization makes every character from 0 to 255 a
word character.
So '#' is both a word character and a comment character.
(Sun's API document says, "Each character can have zero or more of these
attributes.")
Kaffe's java.io.StreamTokenizer checks each character in the
following order:
isWhitespace
isNumeric
isAlphabetic
chr=='/' && CPlusPlusComments && parseCPlusPlusCommentChars()
chr=='/' && CComments && parseCCommentChars()
isComment
isStringQuote
So '#' is treated as a word character (isAlphabetic) before
it is checked against isComment.
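To reproduce the case outside SandStorm, a minimal sketch along the
following lines may help. The class name TokenizerOrderDemo, the helper
tokenize() and the sample input are my own assumptions for illustration;
only the five configuration calls are taken from the report.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class TokenizerOrderDemo {
    /* Hypothetical helper: tokenize the input with the reported settings. */
    static List<String> tokenize(String input) {
        StreamTokenizer tok = new StreamTokenizer(new StringReader(input));
        /* Same configuration as in the report. */
        tok.resetSyntax();
        tok.wordChars((char) 0, (char) 255);
        tok.whitespaceChars('\u0000', '\u0020');
        tok.commentChar('#');
        tok.eolIsSignificant(true);

        List<String> out = new ArrayList<>();
        try {
            while (tok.nextToken() != StreamTokenizer.TT_EOF) {
                switch (tok.ttype) {
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD(" + tok.sval + ")");
                        break;
                    case StreamTokenizer.TT_EOL:
                        out.add("EOL");
                        break;
                    default:
                        /* Ordinary character returned as its own token. */
                        out.add("CHAR(" + (char) tok.ttype + ")");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        /* Sample input (an assumption): one word, a '#' comment, one word. */
        System.out.println(tokenize("foo # comment\nbar\n"));
    }
}
```

With Sun's implementation I would expect the '#' line to be skipped as a
comment, so only foo, bar and the two end-of-line tokens are listed. With
kaffe's current checking order, '#' shows up as a word token (type -3), as
in the report.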
I do not think Sun's API document clearly defines the order in which
character types should be checked. So it can be argued that treating
'#' as a word character is not a bug but behavior the specification allows.
But in order to make the behavior of kaffe's java.io.StreamTokenizer
similar to Sun's, I suggest that the checking order be changed
as follows (the more specific, the earlier):
isWhitespace
chr=='/' && CPlusPlusComments && parseCPlusPlusCommentChars()
chr=='/' && CComments && parseCCommentChars()
isComment
isStringQuote
isNumeric
isAlphabetic
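Why the order matters for a character carrying several attributes can be
modelled in isolation. The following is only a sketch: CheckOrderSketch,
Attr and classify() are hypothetical names and are not part of kaffe's
StreamTokenizer, and the C/C++ comment checks are left out for brevity.

```java
import java.util.EnumSet;
import java.util.Set;

public class CheckOrderSketch {
    /* Hypothetical stand-ins for the per-character flags. */
    enum Attr { WHITESPACE, NUMERIC, ALPHABETIC, COMMENT, STRING_QUOTE }

    /* Current checking order (C/C++ comment checks omitted). */
    static final Attr[] OLD_ORDER = {
        Attr.WHITESPACE, Attr.NUMERIC, Attr.ALPHABETIC,
        Attr.COMMENT, Attr.STRING_QUOTE
    };

    /* Suggested checking order: the more specific, the earlier. */
    static final Attr[] NEW_ORDER = {
        Attr.WHITESPACE, Attr.COMMENT, Attr.STRING_QUOTE,
        Attr.NUMERIC, Attr.ALPHABETIC
    };

    /* Return the first attribute in the given order that the character has,
       or null if it has none (an ordinary character). */
    static Attr classify(Set<Attr> attrs, Attr[] order) {
        for (Attr a : order) {
            if (attrs.contains(a)) {
                return a;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        /* '#' is both a word character and a comment character. */
        Set<Attr> hash = EnumSet.of(Attr.ALPHABETIC, Attr.COMMENT);
        System.out.println("old order: " + classify(hash, OLD_ORDER));
        System.out.println("new order: " + classify(hash, NEW_ORDER));
    }
}
```

In this model, '#' is classified as ALPHABETIC under the old order and as
COMMENT under the new one.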
Please try this patch.
--- java/io/StreamTokenizer.java.orig Tue Feb 19 09:47:49 2002
+++ java/io/StreamTokenizer.java Sat Jun 28 11:48:50 2003
@@ -116,14 +116,6 @@
/* Skip whitespace and return nextTokenType */
parseWhitespaceChars(chr);
}
- else if (e.isNumeric) {
- /* Parse the number and return */
- parseNumericChars(chr);
- }
- else if (e.isAlphabetic) {
- /* Parse the word and return */
- parseAlphabeticChars(chr);
- }
/* Contrary to the description in JLS 1.ed,
C & C++ comments seem to be checked
before other comments. That actually
@@ -145,6 +137,14 @@
else if (e.isStringQuote) {
/* Parse string and return word */
parseStringQuoteChars(chr);
+ }
+ else if (e.isNumeric) {
+ /* Parse the number and return */
+ parseNumericChars(chr);
+ }
+ else if (e.isAlphabetic) {
+ /* Parse the word and return */
+ parseAlphabeticChars(chr);
}
else {
/* Just return it as a token */