[kaffe] Re: [Classpathx-xml] Unwanted SAXParseException
David Brownell
david-b@pacbell.net
Sun Oct 19 08:20:04 2003
Ito Kazumitsu wrote:
> :> This is the bug right here ... you're creating an InputSource
> :> without a System ID, which is why you get a later complaint
> :> about an InputSource that's missing such an ID!
>
> Yes, I knew about it and deliberately made such an incomplete
> program.
>
> The problem is that Sun's JAXP or Xerces does not throw
> an exception about this.
Which is a problem because (a) that's always been a violation
of the SAX spec, and (b) it hides bugs in the software layers
above SAX, so (c) they're wrongly called parser bugs.
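
For reference, here is a minimal sketch of the caller-side fix: give the
InputSource a system ID whenever it is built from a stream or reader. The
class name, the helper sourceFor(), and the file: URI are made up for
illustration; only InputSource, setSystemId() and getSystemId() come from
the SAX API.

```java
import java.io.StringReader;
import org.xml.sax.InputSource;

public class SystemIdExample {
    // Hypothetical helper: wraps in-memory XML text in an InputSource
    // and attaches the system ID that the caller knows to be correct.
    static InputSource sourceFor(String xml, String systemId) {
        InputSource source = new InputSource(new StringReader(xml));
        // Without this call getSystemId() returns null, and the parser
        // has no base URI for resolving relative references.
        source.setSystemId(systemId);
        return source;
    }

    public static void main(String[] args) {
        // The URI below is an assumed example location, not a real file.
        InputSource source =
            sourceFor("<doc/>", "file:///tmp/example/doc.xml");
        System.out.println(source.getSystemId());
    }
}
```

The point is that the application, not the parser, is the layer that
knows where the text came from, so that is where the ID should be set.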
> :> It's unhealthy to have infrastructure guessing about such
> :> things, since it doesn't really have the facts to guess right.
> :> For example, the resolver is allowed to change the system ID
> :> based on the public ID, in which case the (local) system ID
> :> would clearly be wrong.
>
> I think so, too.
>
> As for the patch to gnu/xml/aelfred2/SAXDriver.java, I think
> it should be removed.
Yes.
> Then occurs another question. Is it correct that
> gnu/xml/aelfred2/XmlParser.java does this in pushURL()?
>
> systemId = source.getSystemId ();
> if (systemId == null) {
> handler.warn ("missing system ID, using " + ids [1]);
> systemId = ids [1];
> }
You didn't see that warning though, right? Odd. And you didn't
see it _after_ applying your patch either.
Guessing ids[1] seems wrong; rather than recording systemIdGuessed
as in your patch, I'd just remove the assignment. And the test,
and the warning, since the SAXDriver should handle that (see my
response to Nic).
> Giving the warning of "missing system ID" should be postponed
> until the system ID is really needed.
There are several contexts in which it may be needed. There's
the one that follows immediately -- and then there's the case
of using it to resolve relative URIs much later on, which is
the one that's painful to debug. (And which it seemed the
Apache and Sun folk never tracked down to its root cause...)
Having just the one warning to cover both paths saves code/data
space, and helps minimize the number of _undetected_ ways that
things can misbehave.
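
To illustrate the "much later on" case, here is a rough sketch using
java.net.URI rather than the parser's own code. The class name, the
resolve() helper, and the example.org URIs are assumptions for the
example. With a base system ID the relative reference becomes absolute;
without one it simply stays relative, and nothing complains until
something tries to open it.

```java
import java.net.URI;

public class RelativeResolve {
    // Hypothetical helper: resolves a relative reference against the
    // document's system ID, if there is one.
    static URI resolve(String baseSystemId, String relative) {
        URI ref = URI.create(relative);
        if (baseSystemId == null) {
            // No base available: the reference is returned unchanged,
            // still relative, with no error at this point.
            return ref;
        }
        return URI.create(baseSystemId).resolve(ref);
    }

    public static void main(String[] args) {
        // Assumed example URIs, for illustration only.
        URI withBase =
            resolve("http://example.org/docs/book.xml", "chapters/ch1.xml");
        URI noBase = resolve(null, "chapters/ch1.xml");

        System.out.println(withBase + " absolute=" + withBase.isAbsolute());
        System.out.println(noBase + " absolute=" + noBase.isAbsolute());
    }
}
```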
- Dave