[kaffe] Allocating objects in the stack

Heikki Linnakangas hlinnaka@iki.fi
Wed Jun 18 08:13:01 2003


Hello,

I've been pondering for some time if it was possible to allocate some
objects in the stack insted of the heap. Stack allocation is faster,
stack doesn't need to be garbage collected, and it keeps the memory
footprint smaller.

Last weekend I decided to look into the problem. I figured that if a
method constructs a new object, and never assigns it to member variable,
never gives it as an argument to other methods etc. it should be safe to
allocate it in the stack. Little googleing revealed that indeed I wasn't
the first one to think about this, and I found a nice paper [1] that
outlines the above restrictions in a more formal way. Gay and Steensgard
also discuss some other code transformations that could leverage stack
allocation even more.

I grapped the kaffe source code and implemented a very crude version of
the method described in the paper. This initial implementation is very
limited. Stack allocation is done only for primitive arrays, and only if
the method doesn't contain any loops. Escape analysis is very limited.
And no JIT implementation yet.

I thought I'd post the initial implementation right away before I
tackle the more complicated stuff because I'd like to receive comments as
early as possible.

The implementation introduces a new synthetic instruction, NEWARRAY_STACK,
which is identical to NEWARRAY, except that those allocations have been
determined safe to be made in the stack. NEWARRAY instructions are changed
to NEWARRAY_STACK instructions in the code analyzation phase if the
reference doesn't escape the method.

Obviously there is a lot of room for improvement:

1. Allocations that are made inside a loop are not suitable for stack
allocation. The current implementation just checks if the method has any
loops at all, and doesn't do stack allocation if there is. A proper
loop detection should determine if the allocation is made inside the loop.

I believe it would be fairly easy by looking at the relations between the
basic blocks. I haven't looked at that yet.

2. Currently, only primitive arrays are supported. Reference arrays would
be trivial to add, but objects are more complicated because they have
constructors that could let the reference to escape. Objects would need
item 3 solved.

3. Currently all references that are given as arguments to method
calls are considered to escape, including the "this" variable of the
callee and most importantly all constructors. However, I'd say that most
methods don't really let the argument escape. Think about all the
setter/getter-methods, toString etc.

Escape analysis should be made for method arguments also. That way most
method calls would not inhibit stack allocation. Flags should be attached
to all methods that indicates which arguments the implementation let's
escape.

4. JIT implementation. Could someone help me to figure out how to allocate
something in the stack?

I've included the initial implementation as a diff against the current CVS
head. You'll have to compile with "--with-engine=INTRP". AFAIK, it
works, but I haven't tested it very much. The regression tests passed.

It's hard to find code that really benefits from this limited
implementation, but you can try the following simple example:

public class StackAllocationTest {
  public static void main(String[] args) {
    int i = 0;
    while(i < 10) {
       i = inc(i);
    }
    System.out.println(i);
  }
  public static int inc(int i) {
    int[] ii = new int[1];
    ii[0] = i;
    ii[0]++;
    return ii[0];
  }
}

Run the above code with "-vmdebug INTRPA" to see the NEWARRAY_STACK
instruction in action. CODEANALYSE will show some details of the escape
analysis.

[1] Fast Escape Analysis and Stack Allocation for Object-Based Programs
(2000), David Gay, Bjarne Steensgaard
http://citeseer.nj.nec.com/gay00fast.html

PS. The KaffeInternal.java regression test is broken. It expects a
ClassNotFoundException, but a SecurityException is thrown.

- Heikki

Index: kaffevm/bytecode.h
===================================================================
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/bytecode.h,v
retrieving revision 1.2
diff -c -r1.2 bytecode.h
*** kaffevm/bytecode.h	29 Jul 2002 16:02:32 -0000	1.2
--- kaffevm/bytecode.h	16 Jun 2003 20:19:54 -0000
***************
*** 217,220 ****
--- 217,223 ----
  #define JSR_W 201
  #define BREAKPOINT 202

+ /* Synthetic bytecodes */
+ #define NEWARRAY_STACK 255
+
  #endif
Index: kaffevm/code-analyse.c
===================================================================
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/code-analyse.c,v
retrieving revision 1.37
diff -c -r1.37 code-analyse.c
*** kaffevm/code-analyse.c	6 Feb 2003 21:35:07 -0000	1.37
--- kaffevm/code-analyse.c	16 Jun 2003 20:19:57 -0000
***************
*** 49,55 ****
  	3, 3, 1, 1, 1, 4, 3, 3, 5, 5, 1, 1, 1, 1, 1, 1,
  	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
  	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
! 	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
  };

  static void mergeFrame(codeinfo*, int, int, frameElement*, Method*);
--- 49,55 ----
  	3, 3, 1, 1, 1, 4, 3, 3, 5, 5, 1, 1, 1, 1, 1, 1,
  	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
  	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
! 	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2,
  };

  static void mergeFrame(codeinfo*, int, int, frameElement*, Method*);
***************
*** 74,79 ****
--- 74,80 ----
  	bool wide;
  	codeinfo *codeInfo;
  	localUse* localuse;
+ 	bool containsLoops = false;

  DBG(CODEANALYSE,
  	dprintf(__FUNCTION__ " %p: %s.%s\n", THREAD_NATIVE(),
***************
*** 150,155 ****
--- 151,158 ----
  		case IF_ACMPEQ: case IF_ACMPNE:
  		case IFNULL:	case IFNONNULL:
  			tabpc = pc + WORD(pc+1);
+ 			if(tabpc <= pc)
+ 			  containsLoops = true;
  			SET_STARTOFBASICBLOCK(tabpc);
  			SET_JUMPFLOW(pc, tabpc);
  			pc = pc + INSNLEN(pc);
***************
*** 158,163 ****
--- 161,168 ----
  			break;
  		case GOTO:
  			tabpc = pc + WORD(pc+1);
+ 			if(tabpc <= pc)
+ 			  containsLoops = true;
  			SET_STARTOFBASICBLOCK(tabpc);
  			SET_JUMPFLOW(pc, tabpc);
  			pc = pc + INSNLEN(pc);
***************
*** 167,172 ****
--- 172,179 ----
  			break;
  		case GOTO_W:
  			tabpc = pc + DWORD(pc+1);
+ 			if(tabpc <= pc)
+ 			  containsLoops = true;
  			SET_STARTOFBASICBLOCK(tabpc);
  			SET_JUMPFLOW(pc, tabpc);
  			pc = pc + INSNLEN(pc);
***************
*** 175,180 ****
--- 182,188 ----
  			}
  			break;
  		case JSR:
+ 			containsLoops = true;
  			tabpc = pc + WORD(pc+1);
  			SET_STARTOFBASICBLOCK(tabpc);
  			SET_JUMPFLOW(pc, tabpc);
***************
*** 183,188 ****
--- 191,197 ----
  			SET_NORMALFLOW(pc);
  			break;
  		case JSR_W:
+ 			containsLoops = true;
  			tabpc = pc + DWORD(pc+1);
  			SET_STARTOFBASICBLOCK(tabpc);
  			SET_JUMPFLOW(pc, tabpc);
***************
*** 363,368 ****
--- 372,387 ----
  		}
  	}
  #endif
+
+ 	/* Scan the code for stack allocation candidates */
+ 	if(!containsLoops) {
+ 	  for (pc = 0; pc < codeInfo->codelen; pc++) {
+ 	    if(IS_STARTOFINSTRUCTION(pc) && INSN(pc) == NEWARRAY && !IS_ESCAPES(pc)) {
+ 	      INSN(pc) = NEWARRAY_STACK;
+ 	    }
+ 	  }
+ 	}
+
  	return (true);
  }

***************
*** 877,882 ****
--- 896,902 ----
  			STACKIN(1, TINT);
  			STACKIN(0, TOBJ);
  			STKPOP(3);
+ 			STACKESCAPE(0);
  			INCPC(1);
  			break;

***************
*** 1385,1390 ****
--- 1405,1411 ----

  		case ARETURN:
  			STACKIN(0, TOBJ);
+ 			STACKESCAPE(0);
  			STKPOP(1);
  			INCPC(1);
  			break;
***************
*** 1468,1473 ****
--- 1489,1495 ----
  			case '[':
  			case 'L':
  				STACKIN(0, TOBJ);
+ 				STACKESCAPE(0);
  				STKPOP(1);
  				break;
  			default:
***************
*** 1521,1526 ****
--- 1543,1549 ----
  			}
  			if (!FIELD_ISPRIM(finfo.field)) {
  				STACKIN(0, TOBJ);
+ 				STACKESCAPE(0);
  				STACKIN(1, TOBJ);
  				STKPOP(2);
  			}
***************
*** 1580,1585 ****
--- 1603,1609 ----
  				switch (sig[0]) {
  				case '[':
  					STACKIN(idx, TOBJ);
+ 					STACKESCAPE(idx);
  					idx -= 1;
  					while (sig[0] == '[') {
  						sig++;
***************
*** 1593,1598 ****
--- 1617,1623 ----
  					break;
  				case 'L':
  					STACKIN(idx, TOBJ);
+ 					STACKESCAPE(idx);
  					idx -= 1;
  					while (sig[0] != ';') {
  						sig++;
***************
*** 1682,1687 ****
--- 1707,1713 ----
  				switch (sig[0]) {
  				case '[':
  					STACKIN(idx, TOBJ);
+ 					STACKESCAPE(idx);
  					idx -= 1;
  					while (sig[0] == '[') {
  						sig++;
***************
*** 1695,1700 ****
--- 1721,1727 ----
  					break;
  				case 'L':
  					STACKIN(idx, TOBJ);
+ 					STACKESCAPE(idx);
  					idx -= 1;
  					while (sig[0] != ';') {
  						sig++;
***************
*** 1782,1787 ****
--- 1809,1815 ----
  				switch (sig[0]) {
  				case '[':
  					STACKIN(idx, TOBJ);
+ 					STACKESCAPE(idx);
  					idx -= 1;
  					while (sig[0] == '[') {
  						sig++;
***************
*** 1795,1800 ****
--- 1823,1829 ----
  					break;
  				case 'L':
  					STACKIN(idx, TOBJ);
+ 					STACKESCAPE(idx);
  					idx -= 1;
  					while (sig[0] != ';') {
  						sig++;
***************
*** 1877,1884 ****
--- 1906,1918 ----
  			break;

  		case NEWARRAY:
+ 		case NEWARRAY_STACK:
  			STACKIN(0, TINT);
  			STACKOUT(0, TOBJ);
+ DBG(CODEANALYSE,
+ 			dprintf("stack candidate at %d\n", pc);
+     )
+ 			STACKMARKALLOC(0);
  			INCPC(2);
  			break;

***************
*** 1917,1922 ****
--- 1951,1957 ----

  		case ATHROW:
  			STACKIN(0, TOBJ);
+ 			STACKESCAPE(0);
  			STKPOP(1);
  			INCPC(1);
  			break;
***************
*** 2006,2011 ****
--- 2041,2048 ----

  	/* Merge locals */
  	for (m = 0; m < meth->localsz; m++) {
+ 	        if(from[m].type != TUNASSIGNED)
+ 		  to[m].introducedpc = from[m].introducedpc;
  		if (from[m].type != TUNASSIGNED && from[m].type != to[m].type && to[m].type != TUNSTABLE) {
  			SET_NEEDVERIFY(pc);
  			if (to[m].type == TUNASSIGNED) {
***************
*** 2019,2024 ****
--- 2056,2063 ----

  	/* Merge stacks */
  	for (m = sp; m < meth->localsz + meth->stacksz; m++) {
+ 	        if(from[m].type != TUNASSIGNED)
+ 		  to[m].introducedpc = from[m].introducedpc;
  		if (from[m].type != TUNASSIGNED && from[m].type != to[m].type && to[m].type != TUNSTABLE) {
  			SET_NEEDVERIFY(pc);
  			if (to[m].type == TUNASSIGNED) {
Index: kaffevm/code-analyse.h
===================================================================
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/code-analyse.h,v
retrieving revision 1.17
diff -c -r1.17 code-analyse.h
*** kaffevm/code-analyse.h	6 Feb 2003 21:35:07 -0000	1.17
--- kaffevm/code-analyse.h	16 Jun 2003 20:19:58 -0000
***************
*** 15,20 ****
--- 15,21 ----
  	Hjava_lang_Class*	type;
  	uint8			used;
  	uint8			modified;
+   int32 introducedpc;
  } frameElement;

  typedef struct perPCInfo {
***************
*** 85,90 ****
--- 86,92 ----
  #define	FLAG_NEEDVERIFY			0x0040
  #define	FLAG_DONEVERIFY			0x0080
  #define	FLAG_STARTOFINSTRUCTION		0x0100
+ #define	FLAG_ESCAPES		        0x0200

  #define	FLAGS(_pc)			codeInfo->perPC[_pc].flags
  #define	STACKPOINTER(_pc)		codeInfo->perPC[_pc].stackPointer
***************
*** 102,107 ****
--- 104,111 ----
  					 (meth->c.bcode.code[(pc)+3]))
  #define	INCPC(V)			pc += (V)

+ #define SET_ESCAPES(PC)                 FLAGS(PC) |= FLAG_ESCAPES
+
  #define	SET_STARTOFBASICBLOCK(PC)	ATTACH_NEW_BASICBLOCK(PC); \
  					SET_NEWFRAME(PC); \
  					FLAGS(PC) |= FLAG_STARTOFBASICBLOCK
***************
*** 146,151 ****
--- 150,156 ----
  #define	IS_UNREACHABLE(pc)		((IS_STARTOFBASICBLOCK(pc) || \
  					  IS_STARTOFEXCEPTION(pc)) && \
  					  !IS_DONEVERIFY(pc))
+ #define IS_ESCAPES(pc)                  (FLAGS(pc) & FLAG_ESCAPES)

  #define	ALLOCFRAME()			KMALLOC((codeInfo->stacksz+codeInfo->localsz+1) * sizeof(frameElement))

***************
*** 250,255 ****
--- 255,274 ----
  					FRAME(pc)[(L)].used = 1
  #define	LOCALIN(L, T)			LOCAL_CHECKTYPE(L, T); \
  					LCL(L).used = 1
+
+ #define ESCAPE(E)                       { \
+                                                 if((E).introducedpc != 0) { \
+ DBG(CODEANALYSE, \
+                                                         dprintf(__FUNCTION__" %d escapes at (%d)\n", \
+                                                                 (E).introducedpc, pc); \
+ ) \
+                                                         SET_ESCAPES((E).introducedpc); \
+                                                 } \
+                                         }
+
+ #define STACKESCAPE(S)                  ESCAPE(SF(S))
+ #define STACKMARKALLOC(S)               SF(S).introducedpc = pc;
+ #define LOCALESCAPE(L)                  ESCAPE(LCL(L))

  struct _methods;
  bool analyzeMethod(struct _methods*, codeinfo**, errorInfo*);
Index: kaffevm/kaffe.def
===================================================================
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/kaffe.def,v
retrieving revision 1.25
diff -c -r1.25 kaffe.def
*** kaffevm/kaffe.def	26 May 2003 07:27:00 -0000	1.25
--- kaffevm/kaffe.def	16 Jun 2003 20:20:02 -0000
***************
*** 2795,2800 ****
--- 2795,2815 ----
  	softcall_newarray(stack(0), stack(0), low);
  }

+ define_insn(NEWARRAY_STACK)
+ {
+ 	/*
+ 	 * ... size ->  ..., object ref
+ 	 */
+ 	check_pc (0);
+ 	check_stack_int(0);
+
+ 	low = (uint8)getpc(0);
+ 	trace_jcode (("newarray_stack %d\n", low));
+
+ 	softcall_newarray_stack(stack(0), stack(0), low);
+ }
+
+
  define_insn(ANEWARRAY)
  {
  	/*
Index: kaffevm/object.c
===================================================================
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/object.c,v
retrieving revision 1.19
diff -c -r1.19 object.c
*** kaffevm/object.c	23 Apr 2002 10:43:41 -0000	1.19
--- kaffevm/object.c	16 Jun 2003 20:20:03 -0000
***************
*** 103,108 ****
--- 103,154 ----
          return (cls);
  }

+
+ /*
+  * Returns the amount of memory that should be allocated for an
+  * array of given type and length
+  */
+ size_t soft_array_size(jint type, int count) {
+   Hjava_lang_Class* elclass = TYPE_CLASS(type);
+ DBG(NEWOBJECT,
+ 	dprintf("soft_array_size class %s count %d\n",  (elclass ? elclass->name->data : "<none>"), count);
+     )
+ 	if (CLASS_IS_PRIMITIVE(elclass) || elclass == PtrClass) {
+ 		size_t total_count = (TYPE_SIZE(elclass) * count) + ARRAY_DATA_OFFSET;
+ 		return total_count;
+ 	}
+ 	else {
+ 		size_t total_count = (PTR_TYPE_SIZE * count) + ARRAY_DATA_OFFSET;
+ 		return total_count;
+ 	}
+ }
+
+ /*
+  * Allocate a new array, of whatever types.
+  */
+ Hjava_lang_Object*
+ newArrayChecked_stack(Hjava_lang_Class* elclass, int count, errorInfo *info, void *location)
+ {
+ 	Hjava_lang_Class* class = 0;
+ 	Hjava_lang_Object* obj = (Hjava_lang_Object*) location;
+
+ 	if (obj) {
+ 		class = lookupArray(elclass, info);
+ 	} else {
+ 		postOutOfMemory(info);
+ 	}
+
+ 	if (class) {
+ 		obj->dtable = class->dtable;
+ 		ARRAY_SIZE(obj) = count;
+ 	}
+ DBG(NEWOBJECT,
+ 	dprintf("newArrayStack %p class %scount %d\n", obj,
+ 		(class ? class->name->data : "<none>"), count);
+     )
+ 	return (obj);
+ }
+
  /*
   * Allocate a new array, of whatever types.
   */
Index: kaffevm/object.h
===================================================================
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/object.h,v
retrieving revision 1.4
diff -c -r1.4 object.h
*** kaffevm/object.h	20 Dec 2000 02:41:52 -0000	1.4
--- kaffevm/object.h	16 Jun 2003 20:20:03 -0000
***************
*** 40,45 ****
--- 40,49 ----
  					 struct _errorInfo *);
  Hjava_lang_Object*	newObject(struct Hjava_lang_Class*);
  struct Hjava_lang_Class* newClass(void);
+
+ size_t soft_array_size(jint type, int count);
+ Hjava_lang_Object*	newArrayChecked_stack(struct Hjava_lang_Class*, int,
+ 					struct _errorInfo *, void *);
  Hjava_lang_Object*	newArrayChecked(struct Hjava_lang_Class*, int,
  					struct _errorInfo *);
  Hjava_lang_Object*	newArray(struct Hjava_lang_Class*, int);
Index: kaffevm/soft.c
===================================================================
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/soft.c,v
retrieving revision 1.51
diff -c -r1.51 soft.c
*** kaffevm/soft.c	7 Aug 2002 08:40:39 -0000	1.51
--- kaffevm/soft.c	16 Jun 2003 20:20:04 -0000
***************
*** 65,70 ****
--- 65,95 ----
  }

  /*
+  * soft_newarray_stack
+  */
+ void*
+ soft_newarray_stack(jint type, jint size, void *location)
+ {
+ 	Hjava_lang_Object* obj;
+ 	errorInfo info;
+
+ 	if (size < 0) {
+ 		throwException(NegativeArraySizeException);
+ 	}
+
+ 	obj = newArrayChecked_stack(TYPE_CLASS(type), size, &info, location);
+ 	if (obj == 0) {
+ 		throwError(&info);
+ 	}
+
+ DBG(NEWINSTR,
+ 	dprintf("New array of %s [%d] (%p)\n",
+ 		TYPE_CLASS(type)->name->data, size, obj); )
+
+ 	return (obj);
+ }
+
+ /*
   * soft_newarray
   */
  void*
Index: kaffevm/soft.h
===================================================================
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/soft.h,v
retrieving revision 1.9
diff -c -r1.9 soft.h
*** kaffevm/soft.h	30 May 2002 22:47:32 -0000	1.9
--- kaffevm/soft.h	16 Jun 2003 20:20:05 -0000
***************
*** 17,22 ****
--- 17,23 ----
  struct Hjava_lang_Object;

  void*	soft_new(struct Hjava_lang_Class*);
+ void*	soft_newarray_stack(jint, jint, void *location);
  void*	soft_newarray(jint, jint);
  void*	soft_anewarray(struct Hjava_lang_Class*, jint);
  void	soft_initialise_class(struct Hjava_lang_Class*);
Index: kaffevm/intrp/icode.h
===================================================================
RCS file: /cvs/kaffe/kaffe/kaffe/kaffevm/intrp/icode.h,v
retrieving revision 1.15
diff -c -r1.15 icode.h
*** kaffevm/intrp/icode.h	7 May 2001 10:04:35 -0000	1.15
--- kaffevm/intrp/icode.h	16 Jun 2003 20:20:06 -0000
***************
*** 293,298 ****
--- 293,299 ----
  #define	softcall_lookupinterfacemethod(r, n, t)	(r)[0].v.taddr = soft_lookupinterfacemethod((t)[0].v.taddr, (n)->class, (n)->idx)
  #define	softcall_new(r, t)			(r)->v.taddr = soft_new(t)
  #define	softcall_newarray(r, s, t)		(r)->v.taddr = soft_newarray(t, (s)->v.tint)
+ #define	softcall_newarray_stack(r, s, t)	(r)->v.taddr = soft_newarray_stack(t, (s)->v.tint, alloca(soft_array_size(t, (s)->v.tint)));
  #define	softcall_anewarray(r, s, t)		(r)->v.taddr = soft_anewarray(t, (s)->v.tint)
  #define	softcall_athrow(s)			soft_athrow((s)[0].v.taddr)
  #define	softcall_checkcast(n, o, t)		soft_checkcast(t, (o)->v.taddr)