Self-Similar Programming Language

About the Project
Progress Log
Basic Syntax
Meta information
Anti statements
Coroutines
A class as a suspended coroutine
Overloads
Compile-time constants and structs
NSieve example
Inlining with Exceptions
Exceptions and Coroutines
Ownership of Objects
A note on comparisons

Status: Currently working on: Compiling on Linux.

Most recent source code can be checked out from here:
https://guest:self-similar1.0@www.self-similar.com:448/svn/multi
Contact me at alexei.lebedev ... gmail.com

About the Project

Dec 15 2009. I had started thinking about a programming language which I called Self-Similar around year 2000 while at Viewpoint. The project I was working on at the time was a 3D object editor for shapes acquired with our laser and silhouette scanners. I knew what I wanted from the language -- I had initially formulated it as a set of extensions for C++, but I had no idea how to approach building it. I ended up writing a series of articles instead. These articles have been sitting on this page for the last 9 years. Since then, I have moved from 3D graphics and computational geometry into network programming, and subsequently into real-time. Since 2000, many of the items on my wish list for self-similar have been recognized as useful by others (independently of my input, I'm sure). Move constructors and lambda functions are already a reality for C++ programmers today. However, I have not seen ideas such as nested states, lifespan specification and anti-expressions gain popularity.

Instead of waiting for another ten years, I have decided to actually write self-similar and make it a reality. The source code will be (as originally planned) released under a GPL license. I will be using this page to openly document the progress for myself, so think of this site as a scratchpad.
This page looks like a blog, but the sections are not dated, because I intend to change them frequently as the work progresses.

Progress Log

Jan 23 2010: Added FieldAccess, ArrayIndex and Deref statements. The three-argument form is badly in need of a rewrite. Will switch to 64-bit compilation soon.
Jan 18 2010: Rudimentary register allocation implemented. Problems with 8-bit instructions surfaced -- I didn't realize you only have 4 registers in 8-bit mode. Values 4-7 refer to registers AH,CH,DH,BH instead or lower 8 bits of ESP,EBP,ESI,EDI, so all your 8-bit code has to stay in the first 4 registers. My NSieve benchmark is now within 12% of VC++ 6.0. Another problem I is that all my intermediate expressions have a 3-argument form. Either I have to optimize the moves away, or switch to 2-argument form.

Jan 17 2010: Implemented ownership flag transfer based on data flow analysis.

Jan 16 2010: Some constant propagation -- enough to determine that if a variable participating in a conditional expression can only have one value at that point (and that value is a constant), the conditional will be simplified away. Some floating-point support (load, store, + - / *, float and float64 supported). Decided on the naming convention: int (machine-dependent default), int8, int16, int32, int64, float32, float64. Not done: int128, float80.

Jan 11 2010: All statements now compiled; deleted the interpreter.

Jan 10 2010: Coroutines are now compiled. This includes yield, suspend, resume, wait_parent and wait_child.

Jan 7 2010: Exception handling done, at the expense of speed. Any block that throws an exception, and recursively any block that encloses it, has a bool reverse, set to false at the beginning of the block and checked at the end. if reverse is true, the exception propagates to the enclosing block.

Jan 3 2010: Most statements compiled now (rest interpreted)

Dec 29 2009: Automatic folding of constant expressions, using ConstantExpressionQ.

Dec 25 2009: Overload resolution. Used by the print statement.

Basic Syntax

Some taste of the syntax first:

# each line begins with a command word followed by arguments
# arguments are separated logically, not by spaces
command argument1 argument2 argument3

# arguments can be expressions. the following line has one argument
command 1 + 2 * 3

# expressions can be nested
command (subcommand subargument) argument

# continuation lines start with a ...
command argument1
... argument2

# command can be omitted
2+2
a=b

# eol is a compile-time constant equal to "\n"
print "hello world!" eol

# curly braces are required around nested-block arguments.
# the if statement looks like this
if x > y {
    print ">" eol
} x == y {
    print "==" eol
} {
    print "<" eol
}

# { can only the the last token on a line. } can only be the first.
# the switch statement follows the same pattern as if **todo: implement**
switch x 1 {
    print "x is 1" eol
} 2 {
    print "x is 2" eol
} 3 4 5 {
    print "x is 3,4 or 5" eol
} {
    print "x is something else" eol
}

While binary operators are OK, there is an ambiguity: should A b -c be parsed as A (b) (-c) or A (b-c)? There is a similar ambiguity in C, where a/*b is interpreted as starting a comment, whereas a / *b is something entirely different.

# a variable
var x int 3

# an array variable
var x2 [int 10] 0

# changing an array element
[x2 5] = 2
set [x2 5] 2

# two-d array
var x3 [[int 10] 20] 0
set [[x3 0] 0] 1

If I make [] an infix operator, there will be an inconsistency between () and []. () starts a new argument and is never in-fix.

# arguments to functions are listed one at a time
func Plus1 (in A int) return int {
    return A+1
}
var X int (Plus1 3)

Meta information

The syntax of each command is the relative order of its unnamed arguments. In addition to those, any command can have any number of named arguments. These are position-independent.
This gives breathing room for future features. The kinds of things which can be specified using named arguments are:

Any expression in the program can be enclosed in parentheses without changing the program meaning. This means that any expression in the program can be tagged with named arguments. The feature forms the basis for features like units and ownership.

Anti statements

anti print "B"
print "A"

This prints AB. Why? After executing each statement in a block in a top-down fashion, the execution turns around, and executes all anti-statements in a bottom-up fashion. This rule recursively applies to each block in the program. When you throw an exception, you tell control flow to start executing anti-statements now, without finishing the block first. The execution will proceed going back until a catch statement reverses it again.

anti catch
print "A" eol
anti print "~A" eol
print "B" eol
throw
print "C" eol

# output:
# A
# B
# throwing an exception from main
# ~A
# exception caught in main

Note that only "anti catch" makes sense. Catch by itself couldn't catch an exception since code doesn't exist regular statements when an exception is raised.
Anti-statements solve several problems. First, a destructor for an object is simply an anti-statement that follows a constructor. Second, it's a language feature that's available to the user instead of being hidden away. Third, and most importantly, double exceptions are OK now. If you are told to keep going back while you're already going back, is this a fatal error? It is a fatal error in C++ -- which has unintended consequences (one of which is that you can't throw exceptions from destructors, because a destructor might execute during stack unwinding).

# save value of j until the end of the block
var k int j
anti set j k
set j ..new value..

Coroutines

There are two types of functions, called func and proc. A func is a regular function. A proc is a coroutine (think of it a blend between a procedure and a process). A proc can be interrupted in the middle. You can either call a proc or spawn it. Calling is equivalent to spawning, followed by a wait. During a spawn, control is transferred to the child proc. If that proc is interrupted at some before exiting, the parent is still runnable, and continues execution at the next statement. There is an implicit wait for the child proc at the end of the block. This allows allocating eligible procs (those that can be proven to be non-recursive) on the stack of the parent, just like a local variable. The wait-at-the-end bit can appear to be limiting, but it is actually key to good performance. It allows allocating procs on the stack. Procs behave somewhat similar to classes in C++ (with the spawn being a constructor and the wait being a destructor).

proc X {
    var i int 0
    while i < 10 {
        print "A"
        # yields control to the next runnable proc
        yield
        set i i+1
    }
}

# calls X. this prints AAAAAAAAAA
X

# spawns X.
var x X
var i int 0
while i < 10 {
    print "B"
    yield
    set i i+1
}

# prints ABABABABABABABABABAB

Exceptions from procs always propagate to the parent process at the wait point. For calls, it means on the same line. For procs, it means either at the wait statement, or at the end of a scope.

proc Y {
    proc X {
        yield
        throw
    }
    X x
    print "A"
    # exception from X is re-thrown here
}

If you don't reap the child procs at the end of a scope, which is the default behavior of Simula-67 and most other languages with coroutines, you are forced to 1) allocate everything on the heap and 2) use garbage collection, because you're no longer explicitely specifying lifespans.

A class as a suspended coroutine

proc Class (in arg1 int) (in arg2 float) {
    constructor
    anti destructor
    var x int
}
var c Class
set c.x 1

This is definitely of a to-do variety, but note the following: the code above does exactly what a C++ instantiation of an object would do, including what happens if the constructor of the object throws an exception. We create a proc of type Class. The proc executes its main body and enters a wait state before executing any anti-statements. A variable named c refers to the proc, and we can access c's fields without fear, because if C threw an exception during construction phase, this exception would propagate into the parent frame.

The hairy part is what to do if the body of Class contains yield statements. Then, execution detaches and we can no longer guarantee safe access to variables declared after the yield. A second problem is that constructor arguments arg1 and arg2 are forever part of the class's frame, since they need to be accessible to it.

Overloads

# 1b has the type byte (same as int8) (1 byte)
print 1b eol

# 1 has the type int (4 bytes)
print 1 eol

# 1L has the type int64 (8 bytes)
print 1L eol

# 1.f has the type float32 (4 bytes)
print 1f eol

# 1.0 has the type float64 (8 bytes)
print 1.0 eol

# "abc" has the type strptr. eol is a constant "\n"
print abc eol

print 1 " " 1.0 " " 1.f
# output:
# 1 1 1

There are many functions called print; they are distinguished by their arguments, and the closest match is selected when printing.
Each print function takes only one argument. What makes print loop over supplied arguments is a parsing rule called loop_over_args, illustrated in the code below.

func test (in i int) {
	print (i+1) " "
}
# instruct the parser to apply test to each argument in turn
loop_over_args:true
test 1 2 3 4 5

#output
#2 3 4 5 6

Compile-time constants and structs

# compile-time constants are automatically folded.
var x int (2+3)

# mark x2 as a compile-time constant
var x2 int (1+1) const:true

# this variable is now initialized with a compile-time constant
var x3 int (x2+1)

struct XYZ {
    var a int 0
    var b int x2	# this is OK because x2 is a compile-time constant
}

#Disassembly of main
#bytewise_init main.x 4 bytes: 05000000
#bytewise_init main.x3 4 bytes: 03000000
#bytewise_init main.x4 8 bytes: 0000000002000000

Internally, each variable has an optional field called initializer_value. This is an array of bytes that initializes the variable. A type can also have an optional initializer_value. A struct is required to have an initializer_value, which is computed out of initializers of all of its fields. Struct fields are not required to be initialized. In that case they will be filled with zeros.
Structs do not have constructors.

Add a tag to mark to excuse a struct from having an initializer.

Currently, compile time constants are evaluated using a separate function EvaluateCompileTimeExpression, which mostly understands the same rules of up-conversion and evaluation as CompileExpression. A separate code path means possible divergence in the future. Also, compile-time functions are not supported. Considered briefly constructing a Program, compiling the compile-time expression into the program, executing it and obtaining the result, but decided against it at this stage.
Having structs and procs is probably not enough. Abstract data structures such as vectors, hashes, etc., all can have code associated with them, so they aren't structs, but they aren't procs either because procs are heavyweight.

NSieve example

The goal is to submit competitive entries to the computer language shootout game. For now, I can only run the simplest benchmarks with what I've got so far.

C++               Primes up to  20480000   1299069 0.81
Self-similar      Primes up to  20480000   1299069 0.84
Python 2.5 Psyco  Primes up to  20480000   1299069 2.91
Pyhon 2.6.4       Primes up to  20480000   1299069 10.80

C++ is ahead by a factor of 1.6x to 4x. It's probably much more than that on non-memory-bound tests. What surprised me is the really terrible quality of Psyco code. With all the brouhaha surrounding it, I expected it to have decent performance.

func nsieve (in limit int) {
	var is_prime byteary (malloc limit)
	anti free is_prime
	{
		var u int 0
		while u < limit {
			set [is_prime u] 1b
			u += 1
		}
	}
	var primes_found int 0
	var k int 2
	while k < limit {
		if [is_prime k] == 1b {
			primes_found += 1
			var j int k*2
			while j < limit {
				set [is_prime j] 0b
				j += k
			}
		}
		k += 1
	}
	print "primes up to "    limit    ": "    primes_found eol
}

func TimeNSieve (in limit int) {
	var before double (gettime)
	nsieve limit
	var after double (gettime)
	print "time: " (after-before) eol
}

TimeNSieve 2560000

For more details, see Detailed report

Inlining with Exceptions

Internally, each block of code has End as the last statement. It's not really a statement, it's just a point where execution can branch, as will be seen shortly.

Main {
    anti End
    A
    anti B
    C
    anti D
    E
}

In order to compile this code, I need to transform it into linear form, since that's what the CPU wants. Before I do that, I will write, for each statement, the next statement that should come after it in both forward and reverse cases, separated by a comma.

Main {
    anti End            ()
    A                   (C, anti End)
    anti B              (anti End)
    C                   (E, anti B)
    anti D              (anti B)
    E                   (anti D, anti D)
}

Now the statements can be sorted into default execution order. I will erase links that point to the next line, since the CPU follows them implicitely.

Main {
    A                   (, anti End)
    C                   (, anti B)
    E                   (, anti D)
    anti D              ()
    anti B              ()
    anti End            ()
}

So far so good. Now let's say that C was actually a nested block. Let's assume the nested block looks the same as the one we had.

Main {
    A                   (, anti End)
    C {                 (, anti B)
        A2                   (, anti End2)
        C2                   (, anti B2)
        E2                   (, anti D2)
        anti D2              ()
        anti B2              ()
        anti End2            ()
    }
    E                   (, anti D)
    anti D              ()
    anti B              ()
    anti End            ()
}

In order to inline C into the main block, we'll copy C's continuation pointers into C's End statement:

    ...
    C {
        ...
        anti End2            (E, anti B)
    }

Now we can inline the C block by simply moving all of its statements to the parent block.

Main {
    A                    (, anti End)
    C                    (, anti B)
    A2                   (, anti End2)
    C2                   (, anti B2)
    E2                   ()
    anti D2              ()
    anti B2              ()
    anti End2            (, anti B)
    E                    ()
    anti D               ()
    anti B               ()
    anti End             ()
}

This will execute fine in the forward case, but in the reverse there is a problem. An extra decision has to be made at "anti End2" depending on the direction of code execution. To keep the forward case fast, code duplication is unavoidable. But if I write out all anti-statements for each potential transfer of control, there can be code blowup (the amount of code in each anti statement group is O(N), and the number of those blocks can be O(N) as well). Instead, I can make exactly one copy of each anti-block and terminate it with a ret instruction.

InnerCleanup {
    anti D2'
    anti B2'
    anti End2'
    ret
}
MainCleanup {
    anti D'
    anti B'
    anti End'
    ret
}

To implement return, I just push the address of the next block, and call into one of these routines. So if the C2 statement was a return, I'd have to execute anti B2, anti End2, anti B, anti End:

push address of anti B
jmp B2'

If C2 threw an exception instead, I'd emit:

push address of exception handler
push address of anti B'     # note the ' here
jmp B2'

N-level breaks are trivial using this approach.
To improve performance, the optimization pass can choose which blocks to instantiate separately with specific jump-to addresses at the end.
Without code duplication, the way to handle reversals is to have one bool per End statement, storing the reverse flag for the block containing that statement. To propagate the reversal, End copies the flag upstairs and either falls through or jumps to the corresponding instruction.
Self-similar is currently compiled using method #2, one bool per scope. This will change as soon as it becomes the the worst problem I have.

Currently implemented version uses one boolean per nested scope to store reverse flag. Results in a 10% performance penalty (i'm surprised it's not more)

Exceptions and Coroutines

Consider a coroutine that throws an exception:

proc A{
    proc B{
        yield
        throw
    }
    var b B
    # implicit: anti wait b
    ... something ...
}

There are two ways of handling this exception that make sense to me. One, observe that when one co-routine is running, all others are at well-defined points. This means you can cause an exception in any coroutine in a group, so option one is that B rolls back, and an exception is triggered in A when B reaches the end. Option 2 is that we wait for the (implicitely inserted) anti wait, and use that point to pick up the exception. In this case the exception is like a return status of a subprocess. I choose option 2, although I can make a case for option 1, too.

Now consider this:

proc A{
    proc B{
        yield
        throw
    }
    proc C {
        yield
        throw
    }
    var b B
    var c C
    ... something ...
}

In this exampe, A will pick up two exceptions on its way out, one from B and one from C. It was probably already clear from the discussions above that the proper type for an exception is an array of objects, not one object. Note that with synchronous execution, A would most likely get only one exception because B wouldn't have a chance to run (although that cannot be guaranteed at all, since scheduling happens in an undefined order).
One final example:

proc A {
    proc B{
        ...
    }
    var b B
    # implicit: anti wait b
    throw
}

Here, the parent routine throws an exception. What happens now? Again, I choose the wait statement to be the only place where exception information is exchanged. In this case, when A turns around and reaches anti wait b, an exception will be generated in B. This is to accelerate B joining A. Note that the other way (instant firing) would generate a flood of exceptions in all coroutines. I don't have enough experience programming with coroutines to decide this conclusively. Perhaps the right solution is to always raise an exception when it comes to a wait point. Then the right option for the first example is #1. TBD

Not implemented yet

Ownership of Objects

To track resource leaks, I use a compile-time object property called ownership. Symbols with the ownership flag set are not allowed to leave the scope (via code flow analysis). Duplicate ownership is not allowed.
Consider a function like malloc. Its specification is not complete without saying that the return value has ownership:

func malloc (int bytes int) (return byteary ownership:true)
func free (in byteary ary ownership:true)

# this variable now has ownership
var x byteary (malloc 10)

The compiler can now help us find bugs at compile-time (real examples)

{
    var x byteary (malloc 10)
    # Error: In proc main, variable x leaks ownership.
    # Last acquired in "set x (malloc 10)" (line 2)
}

{
    var x byteary (malloc 10)
    free x
    free x
    # Error: In "free x" (line 4) variable x doesn't have ownership.
    # Lost in "free x" (line 3)
}

{
    var x byteary (malloc 10)
    set x (malloc 20)
    # Error: In "set x (malloc 10)" (line 3), variable x already has ownership.
    # Last acquired in "set x (malloc 10)" (line 2)
}

{
    var x byteary (malloc 10)
    free x
    # OK
}


Implementation involves adding one tag (ownership) with two possible values: true and false. Each argument of a function can be marked as taking ownership, and the return value can be marked as giving ownership. An assignment moves ownership.

Dataflow analysis is currently too simplistic.

# user-defined function that creates ownership
func abc (return int ownership:true) {
    return (0 ownership:true)
}
# user-defined function that destroys ownership
func xyz (in x int ownership:true) {
    x ownership:false   # ownership flag removed
}

Note that this method doesn't prevent pointers escaping, thus can't prevent access to freed memory. For that, scoping rules for pointers would need to be enforced (lifespan specification). So an assignment

In a similar vein, a noaccess flag can be added. Any symbol with its noaccess flag set (or potentially set) cannot be referenced.

Linear type systems do this same thing on a larger scale.

A note on comparisons

It is typical for programming languages to provide six comparison operators. Why six? There are three possible comparison tests for two values: less-than, equals, and greater-than. For each of these tests, either outcome is possible, true of false. The comparison succeeds if at least one of the tests is true. Thus, there is a total of eight possible outcomes. The table below illustrates these outcomes and lists the names of C++ operator names corresponding to them.

 <  ==  >   Value  C++ Name    Explanation
 0  0   0   0                  unordered
 0  0   1   1      >
 0  1   0   2      ==
 0  1   1   3      >=
 1  0   0   4      <
 1  0   1   5      !=
 1  1   0   6      <=
 1  1   1   7                  not unordered

So, operator >= corresponds to the comparison outcome (0,1,1). It is true if A is equal to B, or if A is greater than B. The two comparisons in the table above that have no C++ operators associated with them are unordered and not-unordered. It is not possible for two integers to be unordered, but it is possible for floats, whose representation allows for NaNs. A NaN is neither less, or equal, nor greater than any other number.

A result of any computation can be said to have a type -- if only a special one-off type that only that computation produces. The result of a comparison has the type Comparison. Its value begins its life in the status register of the CPU, but can be saved or used later (x86 SAHF, LAHF, PUSHF instructions).

There is no operator in self-similar to test for unordered, but there is a constant of type Comparison called Unordered. You can then write

# branch if 0,0,0
if (Compare a b) == Unordered {
}
// C++ equivalent:
if (isnan(a) || isnan(b)) {
}