Pepsi -- not quite The Real Thing

$Id: pepsi.html.in 330 2006-07-22 02:47:43Z piumarta $
corresponds to the idst-5.4 release


Contents:
1   Introduction
2   Compiling and running programs
2.1   Object files and shared libraries
2.2   Compiling the compiler
3   Syntax
3.1   Imperatives
3.2   Prototype definitions
3.3   Translation unit variables
3.4   Method definitions
4   Semantics
4.1   Blocks
4.2   Prototypes and objects
4.3   Everything is first-class
5   Pragmatics
6   The runtime system: introspection and intercession
6.1   Object layout and object pointers
6.2   Essential protocol of runtime objects
6.3   Runtime examples
7   Caveats and gotchas for Smalltalk programmers
8   Appendices
8.1   Compiler directives
8.2   Compiler types
9   Resources

1   Introduction

This is a cardboard cut-out of a prototype-based language similar to Smalltalk. It is currently a standalone language, code-named 'Pepsi' (in deference to The Real Thing, code-named 'Coke', a dynamic execution engine on which a replacement for Pepsi will be built). It is intended:

Hopefully it will also serve (at some time or another) to demonstrate that:

But mostly I am fed up with battling C++ (and its ridiculous over-educated type system) and want a platform in which 'Coke' development can continue unhindered by type 'safety'. (As of the instant the 'Pepsi' compiler successfully compiled itself, I hope never to write another line of C++ in my life.)

It would be nice if it ran fast too. The last time I benchmarked it I got about nine times Squeak speed, but this is likely to go down (with increased generality and dynamism in the lowest levels of the implementation) and up (with the sophistication of the implementation), and over the long term things could go either way. (The GC will probably have measurable impact on the performance of 'real'/long-running systems and applications too. Without profiling it I'm not sure how well/badly the current conservative GC is holding up.)

2   Compiling and running programs

The fundamental object and messaging model is called 'Id'. The Id compiler is called 'idc'. The hard-wired language of Pepsi looks quite like Smalltalk and so the suffix '.st' was hijacked for source files. The compiler compiles the files named on the command line to create an executable whose name is derived from the input files (by removing '.st' suffixes). The command
idc foo.st
compiles the file foo.st to create an executable file called foo. The -o option overrides the default name of the output file, if required.

The search path for imported and included files can be extended with the -I option (which can appear any number of times). The command

idc -I../st80 -I.../MyClassLibrary -o bar foo.st
builds the program bar from the source foo.st, searching ../st80 and .../MyClassLibrary for included files.

2.1   Object files and shared libraries

The default behaviour is to compile a single source file into an object file and then link it into an executable program. The -c option tells the compiler not to link the executable program. The command
idc -c bar.st
compiles bar.st into the object file bar.o. Any number of .o files can be linked when compiling an executable program. The command
idc foo.st bar.o baz.o
compiles foo.st into the executable foo combining it with previously compiled object files bar.o and baz.o. (The examples/static directory contains an example of linking multiple object files into a monolithic program.)

The -s option tells the compiler to generate a shared library from the source file. (The resulting library can be loaded into an already-running program.) The command

idc -s bar.st
compiles bar.st into the shared library bar.so, which can be loaded into a running program with the import: directive. (The directory examples/dynamic contains an example of importing shared libraries into a running program.)
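
The command forms described in this section can be summarised in a small helper. This is only a sketch in Python, not part of the idst distribution: the names default_output and idc_command are made up for illustration, and nothing here actually runs the compiler. It uses only the options documented above (-o, -I, -c, -s), and the '.o' and '.so' names follow the bar.st examples.

```python
def default_output(source, mode="exe"):
    """Name of the file idc is described as producing for one '.st' source."""
    stem = source[:-3] if source.endswith(".st") else source
    suffix = {"exe": "", "object": ".o", "shared": ".so"}[mode]
    return stem + suffix


def idc_command(inputs, output=None, include_dirs=(), mode="exe"):
    """Build an idc argument list from the documented options only."""
    if mode not in ("exe", "object", "shared"):
        raise ValueError("unknown mode: %r" % (mode,))
    argv = ["idc"]
    if mode == "object":
        argv.append("-c")              # compile only, do not link
    elif mode == "shared":
        argv.append("-s")              # build a shared library
    for directory in include_dirs:
        argv.append("-I" + directory)  # -I may appear any number of times
    if output is not None:
        argv += ["-o", output]         # override the default output name
    argv += list(inputs)
    return argv
```

For example, idc_command(['foo.st'], output='bar', include_dirs=['../st80']) gives the argument list for a command of the same shape as the -I example above.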

2.2   Compiling the compiler

The compiler source directory contains several directories, as follows:
boot      A version of the idc compiler precompiled to C source files, used for bootstrapping the idc compiler.
doc       Documentation (including the file you are reading).
examples  A collection of small and large example programs.
gcX.Y     The conservative garbage collector used by the Id runtime.
idc       The source for the idc compiler itself (written entirely in idst).
lib       The source for the Id runtime library.
st80      A Smalltalk-like 'class' library.
To build the compiler, type
make
in the top-level directory (the one containing the directories listed above). It should build the GC, runtime library, and then the compiler itself. To install the compiler and runtime libraries, become the superuser ('root') and type
make install

If all that sounds too complicated, ask me to make you a binary distribution.

The compiler has been tested (and is known to work) on:

3   Syntax

Most of the syntax is the same as Smalltalk-80. Comments are contained within double quotes:
"this is ignored"
The few minor additions to Smalltalk-80 syntax are to accommodate compilation from plain text files, variadic blocks (and methods), an expanded range of literal types, and direct access to non-printing characters in Character and String literals.

Programs are translated one source file (with zero or more additional source files being imported) at a time. To the compiler, this body of code is called a translation unit. The compiler always processes one complete translation unit at a time, and currently (this is a temporary limitation) a translation unit must contain an entire program (all object and method definitions required, with no external or unresolved references).

A translation unit consists of a sequence of definitions and imperatives. Definitions either create a new prototype or add a method to an existing prototype. Imperatives are sequences of code that are executed in order when the program is run.

3.1   Imperatives

A literal block can appear at the top-level (outside any other kind of definition):
[ statements ]
The code within the block is executed at the moment 'control' nominally reaches the block within the source file at runtime. This is handy for initialising complex data structures (think of it as a means to obtain behaviour similar to class initialisation methods) and also for starting the whole program in motion at the end of the source (something akin to a 'main' method, if you like).

Top-level imperatives can also take the form

{ directive optionalArguments... }
to direct the compiler to perform some unusual action. The most commonly used directive is import:. The imperative
{ import: name }
asks the compiler to search for a file called 'name.st' and make the global declarations within it available to the importing program. The complete list of supported directives is given in the appendix 'Compiler directives'.

3.2   Prototype definitions

Two top-level forms provide for the creation of new prototypes:
name ( listOfSlots )
creates a new 'root' prototype (it has no parent, or 'delegate') and binds it to name. The prototype contains zero or more named slots, similar to instance variables. The definition could be read as: "name is listOfSlots".

Such a prototype has no useful behaviour (it can't even clone itself to create useful application objects). Adding a minimum of primitive behaviour (e.g., cloning) is the first thing you'll want to do to such an object.

The second form:

name : parent ( listOfSlots )
is similar, except the new prototype delegates to the named parent object and inherits the parent object's slots before adding its own. Such definitions could be read as: "name extends parent with listOfSlots".

(This is every bit as bogus as a single inheritance mechanism being used to share state and behaviour, but I'm still trying to figure out how to separate delegation from the sharing of state without sacrificing performance. Only allowing slots to be accessed by name in their defining prototype, forcing inherited slots to be accessed by message send, is probably the way to go. Better still, making all state accesses into message sends -- especially assignments.)

3.3   Translation unit variables

The top-level form
name := [ expressions ]
creates a new variable with the given name and binds it to the value of the last expression. (The expressions are separated by periods, causing all but the last to become statements.)

3.4   Method definitions

Methods are just 'named blocks', tied to a particular prototype only by permitting direct access to the state within that prototype. (Therein lies yet another reason to abolish direct access to state.) This is reflected in the syntax of the top-level form for adding methods (named blocks) to a prototype:
name pattern [ statements ]
where name identifies a prototype object (defined as described above), pattern looks (more or less) like a Smalltalk-80 message pattern, and statements is a block (notice the brackets) providing the behaviour for the method. The pattern component can be a unary, binary or keyword message pattern.

Extensions to Smalltalk's fixed-arity messages include additional and variadic formal arguments. Additional formal arguments for unary and keyword selectors are written like block arguments and can appear before or after the initial opening bracket. For example, two additional formal arguments could be written

name selector :arg1 :arg2 [ statements ]
name selector [ :arg1 :arg2 | statements ]
(where selector is a unary or keyword selector). Unary or keyword message sends can pass additional actual arguments by prefixing each additional argument with a colon. To ask the receiver to add two numbers:
Object add :x :y
[
  ^x + y
]

[
  | sum |
  sum := self add :3 :4.
]
Variadic arguments can be attached to unary or keyword methods. This is indicated by an ellipsis in the message pattern immediately following the last named argument. The pattern for unary and keyword syntax therefore also includes:
name unarySelector ... [ statements ]
name keywords: arguments ... [ statements ]

(Simply for lack of time, there is currently no friendly syntax to recover the 'rest' arguments within the body of a message. Wizards, however, can easily recover these arguments by writing some low-level magic inside a method body.)

3.4.1   Blocks

Blocks are similar to Smalltalk-80 blocks, but allow for local (block-level) temporaries:
[ statements ]
[ :arguments | statements ]
[ | temporaries | statements ]
[ :arguments | | temporaries | statements ]
Both arguments and temporaries are strictly local to the block and will not conflict (other than in name) with similarly-named arguments or temporaries in lexically disjoint blocks. The compiler currently disallows the shadowing of names.

(This means that you cannot set a method-level temporary by naming it as a block argument. It also means two blocks in the same method that share an argument or temporary name will each refer to a completely different value, regardless of the common name.)

3.4.2   Assignment

The Smalltalk-80 'left arrow' assignment operator is gone. The corresponding form is:
identifier := expression
with the ':=' operator having the lowest precedence of any operator (including keyword message sends) and associating from right to left (so that chained assignments such as 'x := y := 0' work as expected).

3.4.3   Message sends

Message sends are similar to Smalltalk-80: unary, binary and keyword messages have the same precedence as in Smalltalk-80 and cascaded messages (with the ';' operator) work in exactly the same manner.
primary unarySelector
unaryMessage binarySelector unaryMessage
binaryMessage keywords: binaryMessages
receiver messageSend ; messageSend
(Treating the binary selectors differently, by introducing several levels of implicit precedence based on the operator name to provide the traditional arithmetic order of evaluation, would also be a possibility.)

An extension to Smalltalk-80 syntax allows unary and keyword message sends to provide additional actual arguments. (See the discussion above on additional and variadic formal arguments.) The simplest possible change that would allow this is to drop the name part of a 'keyword' (but keep the colon):

receiver unarySelector : anonymousArgument
receiver keywords: arguments : anonymousArgument
with as many ': argument' pairs as required. (Anonymous arguments can only appear after a unary message or the arguments associated with a proper keyword; no further 'keyword: argument' pairs are allowed after the first ': anonymousArgument' that occurs in a keyword send.)

3.4.4   Parentheses

If you don't like the precedence defined by unary, binary, and keyword sends, put parentheses around expressions to force evaluation order.

3.4.5   Literals

Literals are immutable. In other words: literals created by the compiler cannot be modified by the program. This was done for two reasons:
  1. It's cleaner, making the semantics simpler to explain (no more confusing behaviour when a program inadvertently modifies a literal causing some method someplace to have behaviour different to that implied by its source code).
  2. My C compiler puts literals in a read-only data section, at one point causing me a certain amount of stress while debugging what was ultimately correct code but containing an attempt to write into a read-only location. If all compiler-generated literals are immutable then this particular platform idiosyncrasy ceases to be of any concern whatsoever.
A handful of new classes (ImmutableArray, ImmutableByteArray, ImmutableWordArray) are present in the library to accommodate the above.

In addition to literal Arrays

#( elements... )
we also have literal WordArrays
#{ integers... }
and ByteArrays
#[ integers... ]
(where each integer must be between 0 and 255). In Array literals, nested Array, ByteArray and WordArray literals can appear without the initial '#' (although one can be supplied if you like).

Integer literals themselves are in decimal by default, with the usual

radixInteger r valueInteger
syntax supported. For the hackers out there, I saw no reason to avoid supporting
0xvalueInteger
for hexadecimal integers too. Digits greater than '9' in hexadecimal literals (in either of the above syntaxes) or in literals of any base greater than ten (in the 'r' syntax) can be specified using upper- or lower-case letters.
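
The three integer syntaxes can be modelled in a few lines of Python. This is a sketch of the rules as described here, not the scanner used by idc; the function name parse_integer_literal is made up, it assumes the 'r' form is written without embedded spaces (as in Smalltalk-80, e.g. 16r1F), it assumes bases from 2 to 36, and it ignores any sign.

```python
import re

# Hypothetical patterns for the unsigned literal forms described above.
_HEX = re.compile(r"^0x([0-9A-Za-z]+)$")
_RADIX = re.compile(r"^(\d+)r([0-9A-Za-z]+)$")
_DECIMAL = re.compile(r"^(\d+)$")


def parse_integer_literal(text):
    """Answer the value of a decimal, 'radix r value' or '0x' literal."""
    match = _HEX.match(text)
    if match:
        base, digits = 16, match.group(1)
    else:
        match = _RADIX.match(text)
        if match:
            base, digits = int(match.group(1)), match.group(2)
        else:
            match = _DECIMAL.match(text)
            if not match:
                raise ValueError("not an integer literal: %r" % (text,))
            base, digits = 10, match.group(1)
    if not 2 <= base <= 36:
        raise ValueError("unsupported radix: %d" % base)
    # int() accepts upper- and lower-case letters for digits above 9 and
    # raises ValueError for digits that are not valid in the given base.
    return int(digits, base)
```

With these rules '16r1F', '16r1f', '0x1F' and '31' all denote the same value.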

Smalltalk-80 Character literals are supported:

$character
as are non-printing Characters either by mnemonic or by explicit value (following the ANSI 'escape sequence' conventions):
syntax  asciiValue  ASCII designation
$\a     7           bel (alert)
$\b     8           bs (backspace)
$\t     9           ht (horizontal tab)
$\n     10          nl (newline)
$\v     11          vt (vertical tab)
$\f     12          np (new page, or form feed)
$\r     13          cr (carriage return)
$\e     27          esc (escape)
$\\     92          \ (a single backslash character)
(Extended mnemonic names such as '$\newline' for '$\n' could easily be supported too.) In the event that a non-printing character literal not in the above list is required, a generic octal escape is provided:
$\octalNumber
where octalNumber is precisely three (no more, no less) octal digits in the range '000' to '377' specifying the value of the Character. In other words, '$\n' and '$\012' are the same Character, and '$\000' is the 'nul' Character (ascii value zero).
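
As a cross-check of the table and the octal rule, here is a small Python sketch that maps a Character literal to its ASCII value. The function name character_value is hypothetical and the code only models the rules stated above; it is not taken from the idc scanner.

```python
# Mnemonic escapes from the table above, keyed by the character after '\'.
MNEMONICS = {"a": 7, "b": 8, "t": 9, "n": 10, "v": 11,
             "f": 12, "r": 13, "e": 27, "\\": 92}


def character_value(literal):
    """Answer the ASCII value denoted by a literal such as $a, $\\n or $\\012."""
    if len(literal) < 2 or literal[0] != "$":
        raise ValueError("not a Character literal: %r" % (literal,))
    body = literal[1:]
    if body[0] != "\\":
        if len(body) != 1:
            raise ValueError("not a Character literal: %r" % (literal,))
        return ord(body)                      # plain $character form
    escape = body[1:]
    if escape in MNEMONICS:
        return MNEMONICS[escape]              # mnemonic form
    if len(escape) == 3 and all(d in "01234567" for d in escape):
        value = int(escape, 8)                # exactly three octal digits
        if value <= 0o377:
            return value
    raise ValueError("unknown escape in %r" % (literal,))
```

Both '$\n' and '$\012' answer 10 under this model, and '$\400' is rejected because it lies outside the range '000' to '377'.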

String literals obey much the same rules as Smalltalk-80. Adjacent String literals:

'like''this'
are concatenated with an intervening single quote:
like'this
However, the conventions that apply to '\' in escaping single Character literals also apply to characters within a String. You could write a String literal that contains two lines, each terminated by a newline with the whole String terminated by a nul Character:
'like\nthis\n\000'
(I was very, very tempted to make consecutive String literals simply concatenate without the implicit intervening single quote, as in other languages that support juxtaposed String literals. I may yet change this so that single quotes inside Strings must be escaped
'like\'this'
to bring them into line with other languages. (Escaping the embedded single quote does already work just fine, but it isn't currently the unique means to introduce a single quote into a String -- which is a bug.) If you think that's bad, just consider that it took all my self control to avoid making Character literals look like 'a' 'b' and 'c', and Strings look like "abc" -- with some necessary change to comments too.)
Note: The 'character escape' rules above apply to Symbols too. If you want to write the literal symbol for the 'remainder on division' binary message, you have to say '#\\\\' (since the first and third backslash characters escape the second and fourth). I think this is a bug (character escapes should only be recognised if the Symbol is created from a String [so '#'\\\\' == #\\' would hold]) and intend to fix it sometime. In the meantime: beware!
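
The String rules above (doubled quote, backslash mnemonics, three-digit octal escapes, and the escaped quote) can be sketched in Python as follows. The function decode_string_body is a made-up name; it takes the text between the outer single quotes and models only what this section describes, not the actual idc scanner.

```python
# Escapes recognised after a backslash inside a String (table above, plus \').
STRING_ESCAPES = {"a": 7, "b": 8, "t": 9, "n": 10, "v": 11,
                  "f": 12, "r": 13, "e": 27, "\\": 92, "'": 39}


def decode_string_body(body):
    """Answer the characters denoted by the text between the outer quotes."""
    out = []
    i = 0
    while i < len(body):
        c = body[i]
        if c == "'":
            # A quote inside the body is only legal as the doubled form ''.
            if body[i:i + 2] != "''":
                raise ValueError("unescaped single quote at index %d" % i)
            out.append("'")
            i += 2
        elif c == "\\":
            name = body[i + 1:i + 2]
            digits = body[i + 1:i + 4]
            if name in STRING_ESCAPES:
                out.append(chr(STRING_ESCAPES[name]))
                i += 2
            elif (len(digits) == 3 and all(d in "01234567" for d in digits)
                  and int(digits, 8) <= 0o377):
                out.append(chr(int(digits, 8)))
                i += 4
            else:
                raise ValueError("bad escape at index %d" % i)
        else:
            out.append(c)
            i += 1
    return "".join(out)
```

Under this model the bodies like''this and like\'this decode to the same nine characters, which is the redundancy described above as a bug.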

3.4.6   Anything else...?

If you find something that does not seem to be explained here (either some feature in the sources that I wrote, or something you think should work but doesn't) then please let me know so I can fix this document.

4   Semantics

The semantics are similar to Smalltalk-80, with three main differences:

4.1   Blocks

The restrictions placed on Blocks by Smalltalk-80 have been eliminated, and the (end-user) notion of BlockContext has been replaced by BlockClosure (in several variations according to optimisability). When you write a block '[...]' in a program, what you create is a BlockClosure (and not a partially-crippled, half-initialised activation context, as would be the case in Smalltalk-80).

Block contexts (activated BlockClosures) have strictly local arguments and temporaries. The value of an argument or temporary can never come into contact with, nor be affected in any way by, an enclosing lexical context. They are quite literally inaccessible. You cannot, for example, implicitly assign to a method temporary by naming it as a block argument.

BlockClosures can 'close-over' local state defined in a lexically-enclosing scope. In such cases, the closed-over state will be preserved on exit from the enclosing scope, leaving it accessible to future activations of blocks defined within that scope. Each time the defining scope is entered, fresh copies of closed-over state are created. (In other words, block closures 'see' the state associated with the activation in which they were created, rather than that associated with the closure in which they were created. Things like 'fixTemps' are completely unnecessary.)

All BlockClosures are first-class (they can be stored or passed upward for activation at a later time) although block activations are strictly LIFO, with no exceptions. (Your hardware really, really wants things to be this way.)

(For the terminally-curious: closed-over state, corresponding to any variables that appear 'free' within a lexically-nested scope, is stored in a heap-allocated 'state vector' independent of the defining method or block activation context. These state vectors persist for as long as there are reachable block closures that reference them -- either explicitly, as their defining context, or implicitly, by holding a reference to a free variable stored within the vector.)
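
Python closures happen to follow the same rule (fresh closed-over state for each activation of the defining scope, preserved after that scope exits), so the behaviour can be illustrated by analogy. This is Python, not Pepsi, and the names are invented for the example.

```python
def make_counter():
    count = 0                  # local state of this activation, closed over below

    def increment():
        nonlocal count         # refers to the state of the defining activation
        count += 1
        return count

    return increment           # the closure outlives the activation


first = make_counter()         # each call creates fresh closed-over state
second = make_counter()
```

Calling first twice answers 1 and then 2, while the first call of second still answers 1: the two closures see different copies of count.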

4.1.1   Non-local returns

An explicit return statement inside a block behaves just like in Smalltalk-80: the method activation in which the block closure was originally created will return the indicated value.

There is currently one limitation: blocks containing non-local returns make no attempt to detect whether their defining method context has already returned. Attempting to return from a block whose method activation has already exited, rather than resulting in a friendly runtime error along the lines 'this block cannot return', will most likely provoke a segmentation fault and core dump. (This is really easy to fix; I'm just too lazy to deal with it right now.)
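
The intended behaviour, including the 'friendly' failure that is currently missing, can be modelled in Python with an exception that carries the identity of the defining method activation. This is only an analogy: the runtime itself uses the _nlreturn and _nlresult functions described later, and every name below is invented for the sketch.

```python
class NonLocalReturn(Exception):
    """Raised by a block's '^value' and caught by its home method activation."""

    def __init__(self, home, value):
        super().__init__("this block cannot return")
        self.home = home
        self.value = value


def detect(items, predicate):
    """Model of a method whose block argument contains a non-local return."""
    home = object()                            # identifies this activation

    def block(item):
        if predicate(item):
            raise NonLocalReturn(home, item)   # '^item' inside the block

    try:
        for item in items:
            block(item)
    except NonLocalReturn as nlr:
        if nlr.home is not home:
            raise                              # belongs to another activation
        return nlr.value                       # the method itself returns
    return None


def make_stale_block():
    """Answer a block whose home activation has exited by the time it runs."""
    home = object()

    def block(value):
        raise NonLocalReturn(home, value)

    return block
```

In this model, running the block answered by make_stale_block raises an uncaught NonLocalReturn whose message is 'this block cannot return', rather than corrupting the stack.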

4.2   Prototypes and objects

Well, it's all just objects really.

Objects are created by being cloned, which creates an uninitialised shallow copy of the original object. By convention the 'reusable' object that you clone, to make a new object to be modified and otherwise abused, is the 'prototype' for its 'clone family'. All members of a clone family share the same behaviour (response to messages), including the 'prototype' at the head of the clone family. If you modify the behaviour of the prototype (or any other member of its clone family) then the behaviour of all members of the clone family (including that of the prototype) is modified, identically. This is something of a compromise between Lieberman-style prototypes (simple conventions, since there is no 'meta' class organisation to manage, but harder to implement efficiently) and class-instance systems (easier to implement efficiently, but imposing more complex organisational conventions on their surrounding systems).
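
A toy model of clone families in Python may make the sharing concrete. It is a sketch under simplifying assumptions: the classes Family and Obj are invented, there is no delegation between families, and clone copies the slot values (the real clone is described above as uninitialised).

```python
class Family:
    """Behaviour shared by every member of one clone family."""

    def __init__(self):
        self.methods = {}


class Obj:
    def __init__(self, family, slots=None):
        self.family = family
        self.slots = dict(slots or {})

    def clone(self):
        # Shallow copy of the state; the family (behaviour) is shared.
        return Obj(self.family, self.slots)

    def define(self, selector, function):
        # Adding behaviour through any member changes the whole family.
        self.family.methods[selector] = function

    def send(self, selector, *args):
        return self.family.methods[selector](self, *args)


prototype = Obj(Family(), {"x": 3, "y": 4})
member = prototype.clone()
member.define("sum", lambda self: self.slots["x"] + self.slots["y"])
```

Although 'sum' was defined through member, the prototype responds to it as well, because both share one Family.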

In other words, a prototype (in the sense of the present discussion) is nothing more than an object that has been:

In yet other words, writing:
Foo : Point ()
is equivalent to:
" add 'Foo' to the set of visible named prototypes, then... "
  Foo := ObjectMemory allocate: Point byteSize + N "size of Foo slots in bytes".
  Foo methodDictionary: (MethodDictionary new parent: Point methodDictionary).
This results in a useful idiom for creating shared structures:
BadVisibilityZone : Dictionary ()
[
    (BadVisibilityZone := BadVisibilityZone new)
        at: 'Archer'      put: #below;
        at: 'Warrior'     put: #below;
        at: 'Sparrowhawk' put: #above;
        at: 'Cardinal'    put: #above.
]
(although I'm not suggesting that this is either the best idiom or, by a long way, a secure and desirable one.)

Note 1: the explicit reinitialisation (by sending 'new') of the prototype is required since the implicit cloning in the prototype specification creates an uninitialised object (in all respects other than having a valid method dictionary installed in it).

Note 2: this kind of idiom rapidly grows too verbose and was the motivation for translation-unit variables. The above example can also be written:

BadVisibilityZone := [ Dictionary new
    at: 'Archer'      put: #below;
    at: 'Warrior'     put: #below;
    at: 'Sparrowhawk' put: #above;
    at: 'Cardinal'    put: #above;
    yourself
]

4.2.1   Random thoughts on class-like behaviour

The easiest thing is just to mix 'meta' and 'application' behaviour:
Point : Object ( x y )

Point new
[
    self := super new.
    x := 0.
    y := 0.
]

Point magnitude
[
    ^((x * x) + (y * y)) sqrt
]
The only 'bizarre' (or not, according to your perspective) thing about this is that any 'instance' of 'Point' will be able to create new 'Points' in response to 'new'.

Another possibility would be to create parallel hierarchies, with class behaviour defined in one and instance behaviour in the other.

Point : Object ()          "the 'class' side"
aPoint : anObject ( x y )  "the 'instance' side"

Point new
[
    self := aPoint clone.
    x := 0.
    y := 0.
]

aPoint magnitude
[
    ^((x * x) + (y * y)) sqrt
]

4.3   Everything is first-class

In case you hadn't already noticed, 'self' is a variable. (As are 'nil', 'true', and 'false'.) If you assign to 'self' inside a method, the receiver instantly changes identity and retains the new identity through to the end of the method (or the next assignment to 'self'), including any implicit return of 'self' at the end of the method. The following have exactly the same behaviour:
Point new
[
    self := self clone.
    x := y := 0
]

Point new
[
    ^super new setX: 0 setY: 0
]
(assuming the existence of 'setX:setY:'), although the former is: (a) cleaner, (b) more in keeping with 'prototype and clone' style (as opposed to 'class and instance' style), and (c) faster. The disadvantage is that 'super new' might not return a Point, after which assigning to 'x' and 'y' directly might not be a good idea. (Yet another reason to abolish direct manipulation of 'inherited' state within methods...)
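
In Python 'self' is also just a local name, so the first form of 'Point new' can be imitated quite directly. This is an analogy only: the class below is invented, and copy.copy stands in for 'clone'.

```python
import copy


class Point:
    def __init__(self):
        self.x = None
        self.y = None

    def new(self):
        self = copy.copy(self)   # like 'self := self clone'
        self.x = self.y = 0      # from here on, the slots of the clone
        return self              # Pepsi would return 'self' implicitly


prototype = Point()
prototype.x, prototype.y = 3, 4
fresh = prototype.new()
```

After the call, fresh is a distinct object with both slots set to 0, and the prototype still holds 3 and 4.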

The only 'special name' to which you cannot assign is 'super'. (Actually, I never tried to assign to super. I don't think the Parser will let you, but you might just be able to assign to 'self' by calling it 'super'. Of course, the correct response to assigning to 'super' should be to dynamically re-parent 'self', but that's fraught with semantic complications -- not to mention problems with maintaining consistency in methods that access state directly. Again, a great reason to get rid of it.)

5   Pragmatics

The ABI (executable code conventions) is entirely C-compatible. The intention is to integrate seamlessly with other languages/applications, platform libraries and data types, without having (in the vast majority of cases) to leave the object-message paradigm.

In the meantime, primitive behaviour has to be hand-coded (by a wizard) and inserted explicitly into the compiled code at the appropriate point. Code appearing between braces '{...}' is copied verbatim to the output. Such external blocks are legal

provided that the code cannot be confused with a directive ('{ import ...') or WordArray literal ('#{...}').

Here's a trivial example, showing how to send a 'Character' to the 'console', answering 'true' or 'false' depending on whether the operation succeeded:

Character : Object
(
  value    "character's value as a Smalltalk integer"
)

Character putchar
{
  return ((long)self->v_value & 1)
    &&   putchar((long)self->v_value >> 1) >= 0 ? v_self : 0;
}
A few things to note:

The above example could be written to raise a 'primitive failed' error on failure (more in keeping with traditional Smalltalk-80 primitive methods):
Character putchar
[
    | _code |
    _code := value _integerValue.
    {
      if (putchar((long)v__code) >= 0) return v_self;
    }.
    " fall through to failure code... "
    ^self primitiveFailed
]
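
The first version of 'putchar' above tests the low bit of the 'value' slot and shifts it right by one before use. That is consistent with small integers being stored as tagged words of the form (n << 1) | 1; this encoding is an inference from the C fragment, not something stated in the text. The arithmetic, sketched in Python with invented function names:

```python
def tag(n):
    """Encode an integer as a tagged (odd) word: assumed form (n << 1) | 1."""
    return (n << 1) | 1


def is_tagged(word):
    """The '& 1' test from the C fragment above."""
    return (word & 1) == 1


def untag(word):
    """The '>> 1' from the C fragment above (arithmetic shift)."""
    if not is_tagged(word):
        raise ValueError("not a tagged integer: %r" % (word,))
    return word >> 1
```

For example, the character code 65 would be stored as the word 131, and untag(131) recovers 65.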

6   The runtime system: introspection and intercession

The only intrinsic runtime operation (in the sense that it is inaccessible to user-level programs) is the 'memoized' dynamic binding (of selectors to method implementations) that takes place entirely within the method cache. Every other runtime operation (prototype creation, cloning objects, method dictionary creation, message lookup, etc.) is achieved by sending messages to objects, is expressed entirely in idst, and is therefore accessible, exposed and available for arbitrary modification by any user-level program.

6.1   Object layout and object pointers

Objects have a single header word followed by zero or more bytes corresponding to the named slots containing the state of the object.

The header word is a pointer to the object's virtual table. Message sends to the object are resolved (when not present in the method cache) by sending 'lookup:' to the header object. This is the only explicit relationship between an object and the value stored in its header word.

Object pointers correspond to the address in memory of the first slot of an object, one word beyond the object's header (_vtbl pointer). In other words, the object header (containing the _vtbl pointer) is in the word before the one referenced by the object's oop. This is done to allow 'toll-free bridging' of idst objects to C/C++ structs/classes, Objective-C instances, or to native objects in any other language that does not use the same convention of putting a header in the word before an object's address. Allocating the idst _vtbl pointer before (e.g.) a C/C++/ObjC object effectively 'wraps' the foreign object in an 'invisible' idst object, whose layout is identical to (and whose state is stored at the same address as) that expected by the native implementation of the foreign object.
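
The 'header in the word before the oop' convention can be pictured with a toy memory model in Python, where addresses are simply indices into a list of words. The Memory class and its method names are invented for this sketch and say nothing about how the real allocator works.

```python
class Memory:
    """A flat array of words; an 'address' is an index into it."""

    def __init__(self):
        self.words = []

    def allocate(self, vtable, slot_count):
        base = len(self.words)
        self.words.append(vtable)                # header word (_vtbl pointer)
        self.words.extend([None] * slot_count)   # the object's slots
        return base + 1                          # oop: address of the first slot

    def vtable_of(self, oop):
        return self.words[oop - 1]               # header is one word before the oop

    def slot(self, oop, index):
        return self.words[oop + index]

    def set_slot(self, oop, index, value):
        self.words[oop + index] = value
```

Foreign code handed the oop sees only the slots, starting at offset zero, which is what allows the same address to be treated as a plain struct.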

The Id runtime support is manifest in three mechanisms:

6.1.1   Intrinsic objects

The intrinsic objects (all of them prototypes, accessible by name with global visibility) form a small delegation hierarchy as shown below.
_object ()
  _selector : _object ( _size _elements )
  _assoc    : _object ( key value )
  _closure  : _object ( _method data )
  _vector   : _object ( _size "indexable..." )
  _vtable   : _object ( _tally bindings delegate )
All objects delegate to _object. All _objects have a virtual table (either implicit or explicitly stored one word before the address of the object). Virtual tables contain _vectors (one-dimensional fixed-size arrays) containing _associations between _selectors and _closures. A _closure stores a pointer to a method implementation (executable native code) and a pointer to arbitrary data. The method implementation receives the _closure in which it appears as an 'invisible' first argument.

The initial underscore '_' implies that these objects are primitive and not necessarily intended to be included in an end-user object system. Many of their slot names have the same prefix, implying that they store 'primitive' values useful for their state only -- you cannot send messages to the values stored in these slots. All other slots (without underscore prefix) contain references to real objects to which messages can be sent.
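
How a send is resolved (method cache first, then 'lookup:' on the virtual table, following delegates) can be sketched in Python. The names VTable, bind, flush and method_at_put are invented stand-ins. The sketch also assumes that installing a method flushes the cache, which the text does not say, and uses a dictionary for the bindings where the runtime uses a _vector of associations.

```python
method_cache = {}                      # (vtable, selector) -> closure


def flush():
    """Model of flushing the global method cache."""
    method_cache.clear()


class VTable:
    def __init__(self, delegate=None):
        self.bindings = {}             # selector -> closure
        self.delegate = delegate       # parent vtable, or None

    def lookup(self, selector):
        vtable = self
        while vtable is not None:      # search this family, then its delegates
            if selector in vtable.bindings:
                return vtable.bindings[selector]
            vtable = vtable.delegate
        return None

    def method_at_put(self, selector, closure):
        self.bindings[selector] = closure
        flush()                        # assumption: memoised bindings are dropped


def bind(selector, vtable):
    """Answer the closure for selector, memoising the result of lookup."""
    key = (vtable, selector)
    closure = method_cache.get(key)
    if closure is None:
        closure = vtable.lookup(selector)
        if closure is None:
            raise LookupError("no response to " + selector)
        method_cache[key] = closure
    return closure
```

A second bind for the same vtable and selector is answered from method_cache without calling lookup again, which is the 'memoized' binding described at the start of this section.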

6.2   Essential protocol of runtime objects

Compiled code assumes the existence of responses to the following messages:

Ambitious applications can therefore (amongst other tricks) redefine '_object _methodAt:put:' and/or '_vtbl lookup:' to implement unusual dynamic binding behaviour.

The implementation of the above methods (along with several potentially useful auxiliary methods in the runtime classes) can be found in the file 'Smalltalk/runtime.st'.

6.2.1   Intrinsic methods

_vtable _alloc: _size
answer a new object in the receiver's clone family with _size bytes of free space in the new object's body
 
_object _beNilType
designate the vtable of the receiver to be the vtable for nil (the null pointer)
 
_object _beTagType
designate the vtable of the receiver to be the vtable for tagged (odd) pointers
 
_object _delegated
answer a new object in a new clone family delegating to the receiver's family
 
_vtable _delegated
answer a new clone family delegating to the receiver
 
_selector _export: value
associate the receiver (a name) with value in the global namespace (see _import below)
 
_vtable findKeyOrNil: aKey
answer the _closure stored in the receiver at aKey (or nil if aKey is not found)
 
_vtable flush
flush the contents of the global method cache
 
_selector _import
answer the value named by the receiver in the global namespace (see _export: above)
 
_object _import: _library
import (load and initialise) the named _library (dynamic shared object file)
 
_selector _intern: _cString
answer a unique selector as named by the primitive _cString
 
_vtable methodAt: aSelector put: method with: data
install method as the response to sending aSelector to any member of the receiver's family
 
_object _vtable
answer the receiver's vtable

6.2.2   Intrinsic functions

void *_palloc(size_t size)
Returns a pointer to an area of storage with the given size in bytes. The storage is traced by the garbage collector and is therefore suitable for storing object pointers.
 
void *_balloc(size_t size)
Returns a pointer to an area of storage with the given size in bytes. The storage is not traced by the garbage collector and is therefore unsuitable for storing object pointers.
 
void _nlreturn(void *nonLocalReturn, oop result)
Invokes the given nonLocalReturn returning the given result to the non-local caller. Suitable values for nonLocalReturn are stored in the first pointer slot of full closures.
 
oop _nlresult(void)
Recovers the result argument passed to the most recent call to _nlreturn.
 
oop _bind(oop selector, oop receiver)
Answers the _closure associated with the given selector in the given receiver.

6.2.3   Process arguments

Three global variables are defined during initialisation giving access to the command-line arguments and environment of the process:
int _argc
contains a copy of the original value of argc passed to the program at startup.

char **_argv
contains a copy of the original value of argv passed to the program at startup.

char **_envp
contains a copy of the original value of envp passed to the program at startup.
For some examples of the above in use, search for '{' within the library source code.

6.3   Runtime examples

The directory 'examples/reflect' contains code demonstrating how to reimplement much of the runtime support described above with equivalent userland implementations.

7   Caveats and gotchas for Smalltalk programmers

Numbers are signed (positive or negative) and the scanner is not context-sensitive (the syntactic type of each token is uniquely determined by its spelling, irrespective of its position).

8   Appendices

8.1   Compiler directives

In the imperative form
{ directive optionalArguments... }
the following directives are recognised:

8.2   Compiler types

The following compilerTypes must be associated with a concrete type using a { pragma: type compilerType programType } directive before the first use of the corresponding type of literal: