I made some more discoveries on the m68k/NetBSD port last night.  Since I am
about to head off to work, and then fly across the continent, I am posting
my findings now, in the hope that someone (Kiyo?) can profit from them in my
absence.

First, my previously posted FLUSH_DCACHE is broken.  I currently have it as:

#define FLUSH_DCACHE(beg, end)                                  \
        __asm__ __volatile__(                                   \
                "movem%.l %/d0-%/d7/%/a0-%/a5,%/sp@-\n\t"       \
                "move%.l %0,%/a1\n\t"                           \
                "move%.l %1,%/d1\n\t"                           \
                "sub%.l %/a1,%/d1\n\t"                          \
                "move%.l %#0x80000004,%/d0\n\t"                 \
                "trap   %#12\n\t"                               \
                "movem%.l %/sp@+,%/d0-%/d7/%/a0-%/a5"           \
                :                                               \
                : "g" (beg), "g" (end) )

#endif

Previously, I had used 0x80000006 as the control word; this is wrong, as it
will have no effect.

Kiyo: warning: the kernel will not flush your 3/260's L2 cache.  I don't think
this is a problem, since the writes write through and the instruction fetch
is on the same side of the cache, but be aware that something stinky might
be happening.  I have checked the kernel sources for all m68k/NetBSD ports.

For anyone: question: is it wise to target specific registers like this?
I'm not too familiar with in-line assembler, but if we are asking GCC to pick
two registers for %0 and %1, then could GCC ever pick a1, d0 or d1?  For
example, if it ever picked %a1 for %1, then we'd have:

    move.l (something),%a1
    move.l %a1,%d1

Since the first move has already overwritten %a1, %d1 gets the wrong value.
What stops GCC from doing this?  This only affects code where we need to put
values in specific registers before a call.  To be safe, I'd rather write
FLUSH_DCACHE as a real function, as follows:

    link %fp,#0
    movem.l %d2-%d7/%a2-%a5,%sp@-
    move.l %fp@(8),%a1
    move.l %fp@(12),%d1
    sub.l %a1,%d1
    move.l #0x80000004,%d0
    trap #12
    movem.l %sp@+,%d2-%d7/%a2-%a5
    unlk %fp
    rts

I wrote this off the top of my head, so check carefully.
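
On the register question above: with a "g" constraint and no clobber list,
GCC is free to place an operand in a1, d0 or d1, so the hazard is real.  One
way to avoid it while staying in C is to bind each value to its register with
a local register variable and let GCC do the loading.  The following is only
a sketch, untested on real hardware.  The names flush_length, flush_dcache and
CACHECTL_CMD are mine; the register roles (d0 command, a1 address, d1 length)
are taken from the macro above; and I am assuming the trap leaves all other
registers except a0 intact, which should be checked against the kernel.  On
anything that is not m68k/NetBSD the function compiles to a no-op.

```c
#include <assert.h>
#include <stddef.h>

/* Command word taken from the macro above; its meaning is kernel-defined. */
#define CACHECTL_CMD 0x80000004UL

/* Number of bytes covered by the half-open range [beg, end). */
static size_t flush_length(const void *beg, const void *end)
{
    assert((const char *)end >= (const char *)beg);
    return (size_t)((const char *)end - (const char *)beg);
}

/* Hypothetical replacement for FLUSH_DCACHE, written as a real function. */
static void flush_dcache(void *beg, void *end)
{
    size_t len = flush_length(beg, end);
#if defined(__NetBSD__) && defined(__m68k__)
    /* Pin each value to the register the trap is expected to read. */
    register unsigned long cmd  __asm__("d0") = CACHECTL_CMD;
    register unsigned long size __asm__("d1") = len;
    register void *addr         __asm__("a1") = beg;

    /* "+r" tells GCC the trap may modify these registers; a0 is listed
     * as clobbered on the assumption that it is a scratch register too. */
    __asm__ __volatile__("trap #12"
        : "+r" (cmd), "+r" (size), "+r" (addr)
        :
        : "a0", "cc", "memory");
#else
    /* Nothing to do on other targets; keep the compiler quiet. */
    (void)beg;
    (void)len;
#endif
}
```

Because the operands are tied to fixed registers and the scratch register is
declared, GCC can neither hand out a1/d0/d1 for something else nor assume
their contents survive the trap.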


Now for the new stuff: sysdepCallMethod for NetBSD is broken.  It currently
reads:

#define sysdepCallMethod(CALL) do {                             \
        int extraargs[(CALL)->nrargs];                          \
        register int d0 asm ("d0");                             \
        register int d1 asm ("d1");                             \
        int *res;                                               \
        int *args = extraargs;                                  \
        int argidx;                                             \
        for(argidx = 0; argidx < (CALL)->nrargs; ++argidx) {    \
                if ((CALL)->callsize[argidx])                   \
                        *args++ = (CALL)->args[argidx].i;       \
                else                                            \
                        *args++ = (CALL)->args[argidx-1].j;     \
        }                                                       \
        asm volatile ("jsr      %2@\n"                          \
         : "=r" (d0), "=r" (d1)                                 \
         : "a" ((CALL)->function),                              \
           "r" ((CALL)->nrargs * sizeof(int))                   \
         : "cc", "memory");                                     \
        if ((CALL)->retsize != 0) {                             \
                res = (int *)(CALL)->ret;                       \
                res[1] = d1;                                    \
                res[0] = d0;                                    \
        }                                                       \
} while (0)


There are two problems with this code:

1. Registers d2-d7/a2-a5 are clobbered by JIT-generated code, but are not saved
   here.  GCC keeps automatic variables in these registers, and expects them
   to survive function calls.  In particular, the `success' variable tested at
   line 464 of classMethod.c is clobbered.  This is why the static class
   constructor of java.lang.Runtime "fails".

2. At no point is the location of the arguments indicated to the JIT code.
   We build an argument vector in the automatic variable `extraargs', but
   we don't pass its address to the called code, so I'm not sure how the code
   can ever find it.
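
To make the second problem concrete, here is a small portable sketch of the
same marshalling loop in which the vector's address is handed to the callee
explicitly.  Everything in it is hypothetical: call_info, word_fn,
marshal_args, call_words and sum_words are stand-ins of mine, not Kaffe's
types, and the convention "pointer plus count" is an assumption, since the
real JIT calling convention is exactly what still has to be looked up.  I
also assume a 64-bit value occupies two slots, high half first, with
callsize 0 marking the second slot.  For the first problem, one option
(untested) would be to name d2-d7/a2-a5 in the asm statement's clobber list,
so that GCC saves whichever of them it is using.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SLOTS 16

/* Hypothetical stand-in for the call descriptor; not Kaffe's real type. */
typedef struct {
    int     nrargs;               /* number of 32-bit slots              */
    char    callsize[MAX_SLOTS];  /* 0 = second half of a 64-bit value   */
    int64_t value[MAX_SLOTS];     /* 64-bit values live in their 1st slot */
} call_info;

/* Assumed callee convention: vector address and length passed explicitly. */
typedef int32_t (*word_fn)(const int32_t *argv, int argc);

/* Flatten the descriptor into 32-bit words, high half first. */
static int marshal_args(const call_info *c, int32_t *out)
{
    int i;
    assert(c->nrargs >= 0 && c->nrargs <= MAX_SLOTS);
    for (i = 0; i < c->nrargs; i++) {
        if (c->callsize[i] == 2) {
            out[i] = (int32_t)(uint32_t)((uint64_t)c->value[i] >> 32);
        } else if (c->callsize[i] == 0) {
            assert(i > 0);        /* must follow the slot holding the value */
            out[i] = (int32_t)(uint32_t)(uint64_t)c->value[i - 1];
        } else {
            out[i] = (int32_t)c->value[i];
        }
    }
    return c->nrargs;
}

/* Build the vector, then tell the callee where it is. */
static int32_t call_words(word_fn fn, const call_info *c)
{
    int32_t words[MAX_SLOTS];
    int n = marshal_args(c, words);
    return fn(words, n);
}

/* Example callee: adds up every word it was given. */
static int32_t sum_words(const int32_t *argv, int argc)
{
    int32_t total = 0;
    int i;
    for (i = 0; i < argc; i++)
        total += argv[i];
    return total;
}
```

The point is only that the callee receives argv; in the macro above nothing
equivalent appears among the asm operands.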

Given the extreme disparity between this code and the Linux version, my next
step would be to figure out the calling conventions for JIT code (is this on
the kaffe.org site somewhere?) and rewrite this function from scratch.

Enough for now.  Hopefully Kiyo can do something with this.  At 2 hours per
compile, it must be painful on that 3/260.