1. Interface 2. Changing the compile-time defaults 3. Programming considerations 4. Bugs
GarbageCollector
. An object of this
type encapsulates the state of a heap from which memory is allocated and
automatically reclaimed. More than one GarbageCollector
can be in use
simultaneously. Pointers allocated in one GarbageCollector
must not be stored
in memory allocated by a different GarbageCollector
.
The header file also defines GC_PTR
, the type of pointers to application objects. By
default GC_PTR
is defined as void *
, but the application can change this by
defining GC_PTR
as a different pointer type before including gsgc.h
or
gsgc.c
.
GarbageCollector *GC_new(size_t nbytes)
GarbageCollector
that can provide approximately
nbytes
of storage before performing a collection. (Specifically, nbytes
is
the requested size of 'new space'.) The actual initial overhead will be approximately
six times the value of nbytes
, and will increase as necessary to
accommodate the allocation behaviour of the program. The underlying memory is
obtained from the system by calling malloc
.
GC_PTR GC_malloc(GarbageCollector *gc, size_t nbytes)
nbytes
size and returns a pointer to its first
byte. The object is assumed to contain pointers (and nothing but pointers)
and will be aligned accordingly. Each pointer within the object must always
be the address of another object allocated through the same gc
, or zero. The
non-zero pointers within the object will be followed when constructing the
graph of live objects during garbage collection.
GC_PTR GC_malloc_atomic(GarbageCollector *gc, size_t nbytes)
nbytes
size and returns a pointer to its first
byte. The object is assumed to contain bytes (and nothing but bytes). The
contents of the object will be ignored during garbage collection.
GC_PTR GC_malloc_mapped(GarbageCollector *gc, size_t nbytes, long map)
nbytes
size and returns a pointer to its first
byte. The object can contain pointers, bytes, or a mixture of bytes and
pointers. The map
argument describes where the pointers are located in the
body of the object.
Starting with the least significant, each successive bit in map
indicates
whether each successive pointer-sized (and -aligned) field in the object will
contain a pointer (the map
bit is 1) or atomic bytes (the map
bit is 0). The
most significant bit of map (the sign bit) is considered to repeat for the
rest of the object.
For example, an object that has three initial pointer fields followed by
atomic bytes should be allocated with map
= 7. An object that begins with
four pointer-sized fields containing atomic bytes, and the rest of the object
containing pointers, will be allocated with map
= -16. An object that has
pointers only at pointer offsets 1 and 3 will be allocated with map
= 10.
From the above it follows that GC_malloc
is equivalent to GC_malloc_mapped
with map
= -1, and GC_malloc_atomic
is equivalent to GC_malloc_mapped
with map
= 0. The constants GC_ATOMIC
and GC_POINTERS
are correspondingly defined as
0 and -1 in gsgc.h.
void GC_add_root(GarbageCollector *gc, GC_PTR *ptr)
gc
that the pointer stored at address ptr
is a permanent root for
garbage collection. The pointer stored at ptr
must always be the address of an
object allocated through gc
, or zero.
GC_BEGIN(gc)
The GC_BEGIN
macro expands to a variable declaration and initialisation.
Therefore only one GC_BEGIN
may appear in any given compound statement or
other block scope. (Using more than one will result in a compilation error
due to the multiple definitions of the variable declared by GC_BEGIN
.)
void GC_push_root(GarbageCollector *gc, GC_PTR *ptr, char *memo)
gc
, or
zero. This function can only be called in the same scope as a GC_BEGIN
, which
must precede it.
The memo
is a string that can be used for debugging and diagnostic purposes
(e.g., the source file name and line number of the call, and the name of the
variable pushed). It is safe to pass 0.
GC_END(gc)
GC_BEGIN
within the same compound statement or
block scope. All roots added to gc
using GC_push_root
since the corresponding
GC_BEGIN
are removed from gc
. GC_END
must be called before leaving any scope
in which a GC_BEGIN
has been executed, whether the exit is due to control
reaching the end of the compound statement or by an explicit return
or goto
statement.
GC_PTR GC_check_store(GarbageCollector *gc, GC_PTR dst, GC_PTR src)
src
pointer into a pointer field of the object dst
.
(This is to catch writing a new pointer into an old object, which must
remember the old object as a potential root for subsequent collections.) It
is vital that every store of a pointer into a pointer field be checked by this
function. Failure to do so will (not might) result in catastrophic failure of
the program.
void GC_collect_all(GarbageCollector *gc)
void GC_collect(GarbageCollector *gc)
int GC_size(GC_PTR ptr)
long GC_is_pointers(GC_PTR ptr)
ptr
. (The result will be zero (GC_ATOMIC
)
if the object contains no pointers, -1 (GC_POINTERS
) if it contains only pointers, and
some other non-zero value if it contains a mixture of pointers and atomic bytes.)
void GC_delete(GarbageCollector *gc)
gc
and the objects that were allocated
through it.
gsgc.h
and/or gsgc.c
.
(They are shown here with their default definitions.)
#define GC_PTR void *
#define GC_API /* blank */
static
.
#define GC_MAX_AGE 4
#define GC_MAX_REMEMBERED 1024
#define GC_MIN_NEW_SPACE 4*1024*1024
GC_new
. (The size of new
space will only increase if the application attempts to allocate an
object that is larger than the initial size. Old space grows as
required, to accommodate the size of the working set.)
#define GC_STATS 0
GC_delete
.
#define GC_ALIGN sizeof(void *)
#define GC_INFO 0
#define GC_DEBUG 0
GC_malloc
functions
must always be stored in variables that are declared as
roots, or in pointer fields of objects that are reachable
from those roots, whenever an allocation is performed.
Some care must be taken with function parameters, which have to be treated just like any other variable.
Also, any pointer to an object allocated within an argument expression will be invisible to the collector during the evaluation of the other arguments. Explicit variables should be used to temporarily store and protect any argument values when a collection might occur before the final transfer of control to the callee function. A good rule of thumb is to avoid allocating any object in an argument expression; always store the allocated pointer in a variable, then use the variable in the argument list. It will then be obvious when any of these variables need to be made roots during the evaluation of the remaining arguments.
A Lisp primitive push
that prepends a 'key-value' pair to the
front of an association list could be coded as follows, demonstrating many of
the rules mentioned above.
#define STRINGIFY(X) #X #define GC_MEMO(F, L, V) F":"STRINGIFY(L)":"V #define GC_VAR(GC, VAR) GC_push_root(GC, (GC_PTR *)&VAR, GC_MEMO(__FILE__, __LINE__, #VAR)) struct Pair { oop head, tail; }; union Object { ... struct Pair Pair; ... }; typedef union Object *oop; oop make_pair(GarbageCollector *gc, oop head, oop tail) { GC_BEGIN(gc); // push new root context GC_VAR(gc, head); GC_VAR(gc, tail); // add two new roots oop obj= GC_malloc(gc, sizeof(struct Pair)); obj->Pair.head= head; obj->Pair.tail= tail; GC_END(gc); // pop the root context return obj; } oop push(GarbageCollector *gc, oop key, oop value, oop alist) { GC_BEGIN(gc); GC_VAR(gc, alist); // key and value ignored after allocation in make_pair oop tmp= make_pair(gc, key, value); // avoid allocation within an argument list //GC_VAR(gc, tmp); // protect tmp here if any allocations before call oop blist= make_pair(gc, tmp, alist); GC_END(gc); return blist; }The collector must be informed whenever a pointer value is written into an object. This is done by calling
GC_check_store
,
as demonstrated in these Lisp primitives that modify the value stored
in the head of a list or in the middle of an array.
oop replace_head(GarbageCollector *gc, oop pair, oop value) { pair->Pair.head= value; GC_check_store(gc, pair, value); // notify write of value into pair return value; } struct Array { oop pointers[0]; }; // sized at allocation, zero or more pointers oop set_array_at(GarbageCollector *gc, oop array, int index, oop value) { assert((unsigned)index < GC_size(array) / sizeof(oop)); array->Array.pointers[index]= value; GC_check_store(gc, array, value); // notify write of value into array return value; }(The second function above illustrates that the first argument to
GC_check_store
is a pointer to the object written
into, regardless of where within that object the write occurs.)
Variables that are roots, and pointer fields within allocated objects, must
always contain either a valid pointer to a previously-allocated object or
zero. Pointers whose value is zero are ignored by the collector. If
additional pointer values need to be ignored,
GC_IS_POINTER
should be redefined to answer zero for those
values. An application that uses tagged integers, where odd pointers (least
significant bit 1) encode the value of the integer 'object' directly in the
remaining bits of the pointer, could define GC_IS_POINTER
as
follows:
static inline int gc_is_pointer(void *ptr) { return ptr && !((long)ptr & 1); // non-zero and LS bit clear } #define GC_IS_POINTER(PTR) gc_is_pointer(PTR) #include "gsgc.c"