Monday, December 31, 2018

Towards a safer JavaScript

Let's face it, JavaScript by itself gives you enough freedom to write terribly bad code. It may not yet be the new Perl, but one can certainly go for the write-only style that still haunts one of the most useful programming languages in history. The recent ES generations (are we really finally about to call it ECMAScript?) gave my favorite language a huge productivity boost, and along came static type checkers, proper linters and decent style guides. But it's not all sunshine and unicorns shitting rainbows. The community seems to thirst for more and more syntactic sugar, yet at the same time needs more and more linter rules and type constraints to rein itself in...

Perhaps at some point ES will stabilize and a new JavaScript (or whatever we will call it) may arise, discarding all the idiosyncrasies that we tried so hard to get rid of. But I don't have the patience to wait. And no matter what the community or business world says, I'm pretty sure we can discard some things straight away. As I'm an advocate of pure functional programming, I like to strive for limiting imperative code to a very confined space. Stateful operations with side-effects, like object-oriented programming, have their use, but not always and everywhere. Since we already constrain developers with linter rules and types, why not take it a step further and disallow some more "dangerous" operations as well?

While the relative safety of a more limited syntax seems obvious, it also makes for a much leaner and more predictable syntax tree, that is, the structure a parser produces when it reads the source text. This gives us the ability to parse and transmit code more efficiently and to manipulate it with less effort, which has already been observed in LISP, for example. Rewriting programs becomes easier, and this unlocks possibilities for the arcane science of code generation. On the more humble side, domain-specific languages can be created on top of the simplified syntax tree, for instance a language to query a specific database. However, the topic outlined in this paragraph lies outside the scope of this article.

I've worked with pure functional languages enough to see the downsides of having no imperative constructs. The main handicap becomes apparent when producing some low-level utility. There one has to resort to some abstract alternative to concrete state, leading to much more complexity. Yet while it can certainly be tedious to have no access to imperative code, it doesn't mean we should always expose it. Most of the time we are on the consuming side of said low-level utility, and we just need our code to be clean, productive and safe. We want it to communicate what we mean by it, not the procedure by which we arrived at that meaning. But enough has been said of the advantages of removing the imperative mess, let's take a look at how we can actually achieve it.

I would like to propose splitting up JavaScript into an unconstrained and a constrained version. The first is (mostly) for producing low-level stuff and the second is (mostly) for consuming it. One is imperative in nature, the other is declarative. In addition, the latter is a form of lambda calculus, the mathematical ancestor of all functional programming languages. From this it should be clear that I don't mean to just use current ES functionally, but to create a pure, yet related, syntax.

To design this new language, I've taken inspiration from XQuery, the XML querying language. Although it's a strongly typed language (it seems to map to TypeScript quite well), XQuery resembles JavaScript in some respects. XQuery has many nice features that make it more complete than TypeScript, React, Redux, Immutable and RxJS combined (in my opinion). However, if I had to choose one and only one feature, it would be that in XQuery you can't have side-effects. How can we transport this to JavaScript? It might be easier than you think.

Every statement must produce a result

A "block" in JavaScript has its own scope, but must eventually address and change state that lives outside of it, or nothing will happen (the only exception is when return is encountered within a function body). In XQuery a block always returns a value instead, which is provided by the last expression in the block, and return is reserved for disambiguation only. If we applied this rule to the blocks of JavaScript's imperative statements (like if, for, while, try or catch), we would eliminate our imperative mood all at once (well, almost)! We may then either return that value from the enclosing block (e.g. a function), or assign it to a variable.¹
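For comparison, here is a rough sketch of how value-returning blocks can be approximated in JavaScript as it exists today, by wrapping the statement in an immediately invoked arrow function. This is only an emulation and not the proposed syntax; the names classify and parseOrDefault are made up for illustration.

```javascript
// Sketch only: emulating value-returning blocks in current JavaScript.
// `classify` and `parseOrDefault` are hypothetical names.

// An `if` used as an expression, via an immediately invoked arrow function.
function classify(n) {
  const label = (() => {
    if (n > 0) {
      return "positive"
    } else if (n < 0) {
      return "negative"
    } else {
      return "zero"
    }
  })()
  return label
}

// A `try`/`catch` used as an expression in the same way.
function parseOrDefault(text, fallback) {
  const value = (() => {
    try {
      return JSON.parse(text)
    } catch (e) {
      return fallback
    }
  })()
  return value
}
```

The wrapper function and the explicit return in every branch are exactly the noise that the rule above would make unnecessary.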

No more vars

In a pure language, a variable can never be modified, in other words, there is no "state". A variable can only be constant. Yet we may want to reuse a variable name by assigning a new value to a previously bound variable name (although this can lead to confusion about its meaning). In any case, we can keep the ES const keyword as it is, and allow for the reuse of a variable name bound with let. However, we can still modify a variable bound with let outside of its scope, in other words, have side-effects. To prevent this, we introduce a small change: every variable must be bound explicitly. From this it follows that we have no more use for var.
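To make the side-effect concrete, here is a small sketch in current JavaScript. The names total, addImpure and addPure are invented for this illustration. The first function reassigns a let binding that lives outside its own scope, the second only uses names it binds itself.

```javascript
// Sketch only: `total`, `addImpure` and `addPure` are illustrative names.

let total = 0

// Impure: reaches outside its own scope and reassigns `total`.
function addImpure(n) {
  total = total + n
  return total
}

// Pure: everything it uses is bound explicitly, as a parameter or a const.
function addPure(current, n) {
  const next = current + n
  return next
}

const first = addImpure(1)  // total was 0 before this call
const second = addImpure(1) // same call, but total was already changed
```

Calling addImpure twice with the same argument gives two different results, while addPure always returns the same value for the same arguments.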

Example


function test(a, b) {
  // if returns a value
  // the variable a is reused, but no state is retained
  let a =
    if(a > 0) {
      // the value of the expression in this block is returned
      a + 1
    } else {
      // re-assigning a value to a here doesn't have side-effects
      let a = 2
      return a
    }
  return a + b
}
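The example above is written in the proposed syntax and will not run in a current engine: JavaScript today has no if in expression position and rejects a let that redeclares a function parameter. As a hedged approximation, the same function could be written in current JavaScript as follows, where the name a2 is my own stand-in for the rebound a.

```javascript
// Sketch only: an approximation of the example in current JavaScript.
// `a2` is a hypothetical name standing in for the rebound `a`.
function test(a, b) {
  const a2 = (() => {
    if (a > 0) {
      return a + 1
    } else {
      // shadows the parameter inside this block only, no outer state changes
      const a = 2
      return a
    }
  })()
  return a2 + b
}
```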


Sure enough, these are only minor changes, but with great consequences. Obviously there remain some details to work out. For instance, what should loop statements like for and while return? To me the answer is straightforward, because in XQuery for always returns a sequence (as part of the FLWOR expression syntax). A sequence is just an abstract data type that serves as an iterable. The iterable already exists in ES, but sequences allow for more functional operations on this data type.
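As a rough idea of what a for that returns a sequence could look like, here is a sketch built on the generators and iterables that ES already has. The helpers forSeq and filterSeq are hypothetical and only loosely modelled on the for and where clauses of FLWOR; they are not an existing API.

```javascript
// Sketch only: `forSeq` and `filterSeq` are hypothetical helpers.

// A "for" that returns a lazy sequence instead of mutating outer state.
function* forSeq(iterable, body) {
  for (const item of iterable) {
    yield body(item)
  }
}

// Keeps only the items for which the predicate holds.
function* filterSeq(iterable, predicate) {
  for (const item of iterable) {
    if (predicate(item)) yield item
  }
}

const squares = forSeq([1, 2, 3, 4], x => x * x)
const evenSquares = filterSeq(squares, x => x % 2 === 0)
const result = Array.from(evenSquares) // [4, 16]
```

Nothing is computed until Array.from consumes the sequence, which is one reason sequences compose more easily than loops that push into an outer array.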

To conclude, this exploration is the result of my years of struggling to find the best tool for the job, that is, my job. I'm still searching for other functional feats to put in my imaginary ES spin-off, but I hope that what I've described here could be the start of something real and usable. Let me know what you think of these functional fancies!
 



1. The difference in syntax for fat arrow functions with and without return will naturally dissolve because of this, as will the difference between an if and a ternary expression (i.e. ? : ).
