Friday, October 30, 2015

Shuffle it

Coming from the muddy world of javascript, I've always desired cleaner code, not only from myself, but from the community as well. I tried to achieve that by creating a small language that parses to javascript, as so many others have done before me. It's based on Joy, developed by Manfred von Thun over a decade ago, although it seems to have been abandoned. A similar language, Forth, was already developed back in the sixties, and is still actively used in embedded computer systems. Like other "purely functional" languages the core mechanic is the mathematical function. Mathematical functions perform operations on a number of values, and return a new value, not touching anything else in the program. Unlike other functional languages, the functions – or rather, "words" – in Joy (and Forth) have no formal parameters. Consider the following function in javascript:

function add(a,b) {
    return a + b;
}
add(2,2); 

Parameters 'a' and 'b' are needed to create the purely mathematical add function (indeed, I just made a common JS operator into a word...). Instead of using parameters, functions in Joy operate on a stack, which is always available to the program. By simply writing values and words in a left-to-right order, the stack is both appended to and consumed. A simple example of adding two integers:

2 2 add


As can be clearly read from this example, two integers are appended to the stack, and lastly the addition function is executed on those two integers on the stack, which are then discarded. After the operation the integer 4 is left on the stack. Had the program continued, that value could have been used for new operations, for example:

2 2 add 3 multiply


Which obviously leaves 12 on the stack.

The advantages of a stack-based programming style for the web is quite evident, as it's pure, concise and presupposes little knowledge of the syntax. Furthermore it doesn't require knowledge on the more complex subject of function composition, as you can simply read the flow of the program. But there's a catch: there are no temporary placeholders other than the stack. So what do you do when you have some value that you need further down the line? As you now know, words consume values on the stack. How do you retain values until you actually use them?

Let's assume that you need to count the number of items in a list, and that value is needed more than once. Counting items in a list is an operation that you'd only want to do once, not only for performance benefits, but because it makes sense. The following program creates a list and counts the items (the list is created just for the example, normally it would have been put on the stack by another operation):

1 2 3 list count


This would leave 3 on the stack. Next, the result is tested for equality, which puts a boolean on the stack. However, this would consume the value (3) that we need further in the program, and so it needs to be duplicated. The first indented line gets executed when the result is true, the second when it's false:

1 2 3 list count dup 2 eq if
    4 add
    5 add

As you may already figured out, dup introduces another kind of operation that is not part of the flow of the program, but is simply there to retain a value that would otherwise be lost forever. These kinds of operations are called stack shuffling. You might even argue that a language like Forth didn't become mainstream because of the lack of temporary placeholders. But, since it hasn't got any, stack shuffling is a dire necessity. It also puts a great burden on the programmer's mind, who has to keep track of what is on the stack at all times. Furthermore, the programmer needs to make choices what to do first and what to retain for later, making code harder to revise.

Stack shuffling has become a true skill in itself, with a number of techniques that make it more endurable, but in the end a person like me can only take up so much stack space in his own mind. In the end it wasn't the program, but the programmer who had a 'stack overflow'. My enthusiasm for stack-based programming eventually died and I went back to the good old functions with parameters, and the good old placeholders for temporary values.