Rewriting byte codes

Mon, 1 Apr 2002 09:35:13 +0200

On Sun, Mar 31, 2002 at 03:40:48PM -0800, Jim Pick wrote:
> 
> On Sun, Mar 31, 2002 at 09:02:46PM +0200, Erik Corry wrote:
> > 
> > Hi
> > 
> > I'd like to make some changes to Kaffe to make it simpler to
> > do more precise GC.
> 
> I'm all for getting some better GC capabilities into Kaffe.
> 
> > What would be needed at a minimum would
> > be:
> > 
> > * Split local variables that can be either reference or non-reference
> >   into two.  (The alternative is to keep track of when they are what).
> 
> What do the other VMs with non-conservative GC do?

Hotspot splits them, but any VM needs to keep track of the stack.
It would be nice if the compiler didn't reuse stack positions
for reference and non-reference values.  I don't know if any
compilers do that, but it would save the rewriting and cost
very little.

> > * Keep track of which stack positions are references when we do a
> >   call or new.  Luckily the stack is zapped on an exception so we
> >   don't have trouble there.
> 
> Where does the information about which stack positions are references
> get stored?  On the stack, I suppose?

That's one solution.  The other is to have a structure that
correlates byte code position with stack types.

If you cache the top few stack positions in registers then you
can reserve some registers for references and others for non-references.
Then you just need to tag the part of the stack (if any) that you spill
out to memory.  This isn't really going to fly in its simplest form on
machines with fewer than 32 registers, and for the interpreter it means
gcc only, since no other compilers let you control register usage so
closely (trouble with exceptions).

> I need to look at some other VM implementations to see how they keep
> track of references on the stack (not to mention objects on the heap).
> 
> > * Make copies of jsr code so that each basic block can only be called
> >   from one jsr call point chain (default the null chain of course)
> 
> I think I understand the idea behind splitting the local variables
> (since the local variable used to store the return PC from a jsr call
> could also be used to store a reference).

This is a different optimisation, the main point of which is
to make verification about 10 times easier.  See
http://citeseer.nj.nec.com/freund98costs.html

> > There's an obvious problem that when we split the variable, one
> > of the new variables could end up above the 256 byte level.  Similarly
> > we could end up with more than 65536 bytes in a method.
> > In order to fix this a little byte code rewriting is inevitable.  I
> > would suggest we rewrite the byte codes such that:
> > 
> > * All insns are wide (16 bit index to local variables), and we drop
> >   the wide prefix
> > * All inline constants are native-endian
> > * All pc indexes are 32 bits and of course native endian (mostly a
> >   problem with the exception table I think, also replace GOTO with
> >   GOTO_W throughout)
> 
> The ideas sound nice, although I we should think seriously about what
> the tradeoffs are before making this the default.  The rewriting phase
> is some additional overhead, and there is some code expansion (but
> maybe the resulting code will be faster).

Yes, perhaps we can avoid the rewriting in the common case, ie
no jsrs and not more than 256 local variables even after splitting.

> I think it would be nice to have a generic spot where we can rewrite
> bytecode before JITing it.  I imagine that could be used for some
> other optimizations too.

The native byte ordering is a win under any circumstances I think.

> Theoretically, somebody could hand code a subroutine that did
> something different with the jsr return PC value that is pushed onto
> the stack (instead of using an astore, as is typically found in a
> finally clause).  In real life, that probably never happens, but isn't
> it still legal bytecode?

I don't think it is, actually.

The only slightly tricky thing you can do with them is to return
several levels in one by returning with the previous return value
instead of the most recent one.  So we would need to keep track of
that.  But you can't recurse, you can't have more than one return
point and you of course can't construct a synthetic return value.
We need to check those things anyway in the byte code verifier.

-- 
Erik Corry erik@arbat.com