Sunday, May 14, 2017

Forms as state containers Part 4: form generation

This is a small series on HTML forms. The previous post was Forms as state containers Part 3: validation.

With form generation we leave the realm of functional form controls. In addition to barebones client logic, we have to focus on the user. Many programmers tend to forget that, but the devil is in the details. Whether you want to add a simple form to capture an email address for a newsletter subscription, or build an entire multilingual webshop with a checkout system, it's all the same, really. No matter the use case, the accessibility, look & feel and ease of use of the app are just as important as the validity of the data. Unfortunately, this is where many applications fail, as we've all experienced at some point.

There's no single way to solve the "forms problem", as there's an insurmountable gap between user, designer and programmer. In my opinion, the ultimate goal is: enough flexibility, with the ability for designers to rapidly create working prototypes. But how to cater to this without resorting to some WYSIWYG editor that just spits out semi-structured garble?

Clearly, designers shouldn't be bugged about the stuff developers want, like data types and program logic. So, hereby I distance myself from things like JSON Schema-based form generation, as it's too far removed from the common use case. Instead, I propose to include specialized pieces of markup that provide a sensible starting point for designers to work with, as I will outline below.

Instead of just using the built-in HTML form controls, I find it more useful to describe a form in terms of structure. It shouldn't be hard to explain to a non-programmer what a text value, a fieldset or a repeated item is. As mentioned earlier, the link is also an important structure for generating forms. Finally, one structure that isn't even available in JSON Schema is the element, even though elements are exactly the kind of structure we're building!

With structural types it's possible to generate forms as well as validate them, without having to build an entirely new system. These structures can be coerced to real data types later on, which is exactly how form controls like input work anyway. Why not use elements like input and other HTML directly, you ask? Well, because we're actually programming! But we don't want the designers to know that ;-) Seriously, though, using HTML is fine, but making the distinction is vital. Allow me to explain.

Once there was a programming language written purely in markup... Whoah, really?! Yep, there's a language that actually required writing markup by hand. It's called XSLT. One of the problems with XSLT, however, is that it requires tons of knowledge on how to use it. In fact, it's so complex, that although a language never needed a user interface more badly, it can't ever be built (not really, anyway). 

I want to create something simple, a small subset that's just for the purpose of creating forms programmatically, but written in markup, so it can be processed like... well, however you normally process HTML (you know, by hand?). Also, XSLT is XML, which is considered to be too strict for the web, so nowadays we have HTML5... which is exactly the same thing. Oh well, nobody seems to have another answer to XML yet, but that's a discussion for another time. First, let me present "XSLT in HTML".

<fieldset name="personal-info">
  <x name="firstname" />
  <x name="lastname" />
  <x name="email-address" />
  <x list="someid" name="hobbies[]" />
  <link href="/datalist#someid" rel="import" />
</fieldset>


Wait, isn't it kinda like the "schema" in my previous post? It is, but I replaced input with x, because it denotes a "text value" structure. The input is only one control that provides a text value; there's also select (or, rather, selectOne), textarea, and maybe more. The above piece of markup can be inserted in an HTML document, and when processed with my specialized software, it can simply be expanded into a form that has some basic layout. Additional elements like labels are generated on the fly, just like you would with any "web component".
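
As an illustration, the expansion of the first x above might produce something like the following. The exact markup, the class name and the generated label text are my guesses, not a fixed output of any implementation:

```html
<div class="field">
  <label for="firstname">Firstname</label>
  <input id="firstname" name="firstname" type="text" />
</div>
```

The point is that the label and wrapper appear automatically, derived from the name attribute, so the designer never has to write them.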

While the designer can decide upon the appearance of a form, the developer can add data types and constraints that are required for validation. I think a designer is helped by the ability to select the appearance from a template. For example, an enumerated control takes some preset options from a list, but designers shouldn't have to concern themselves with how those options arrive in the form. They shouldn't have to decide between a select or a radio button group: when there are few options, a radio button group will suffice; when there are too many options to view at once, another solution is obvious.

When a more fine-grained approach is required, the designer should be able to style the constituent parts of a form component. In that case, the expansion of the component's content could be further specified, instead of relying on the default. 

Since our "program" is just markup, we have a lot of flexibility. In fact, I'd argue that this will one day be part of HTML, as a standard way of creating logic that is simpler than javascript, which has become a general purpose language. And while that may make programmers happy, it's too complex for the common use case.

You may not be very impressed by any of this, but just wait for the next and last post in this series, and I promise you will be.

Saturday, April 29, 2017

Forms as state containers Part 3: validation

This is a small series on HTML forms. The previous post was Forms as state containers Part 2: managing complex data.

A form is a schema

Since HTML5 there are a lot more attributes available on form components for expressing constraints. This means forms can nowadays be used as a basic schema for validating data. However, HTML5 form components are a motley crew, so we need some guiding principle to actually make sense of it all. JSON Schema provides a coherent set of constraints for JSON data. And because JSON is just javascript, JSON Schema should be readily applicable to HTML forms as well. You may not know or like JSON Schema, or you may not want to learn it because it isn't a standard (yet), so in this post I'll try to patch things up by merging JSON Schema into the HTML forms standard.

To start off, some very useful keywords from JSON Schema translate directly to HTML5 form constraints. For this purpose I'm considering the draft 4 version of the specification.


JSON Schema | HTML5 form constraint        | Purpose
maximum     | max                          | Upper limit for a numeric instance
minimum     | min                          | Lower limit for a numeric instance
maxLength   | maxlength                    | Maximum number of characters in a string [1]
pattern     | pattern                      | Regular expression pattern to match against
required    | required                     | The value must be present [2]
enum        | select / optgroup / datalist | A list of possible values [3]

Notes:
[1] I'm not sure whether surrogate pairs are counted as 1 or 2 characters in either, none or both specifications.
[2] There's a slight difference, as the keyword in JSON Schema expects an array of properties that are required in an object. Also, a form component can be present but not filled in, while a property can only be either present or absent. We can however treat both specs as equal in most cases.
[3] In HTML an enumerated value is expressed as an option element. In JSON Schema the array can contain items of any type, including null. The HTML option element can only contain a text string.
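
As a sketch of how this mapping could be put to work, here's a small javascript function that derives a partial JSON Schema from the constraint attributes of a single form control. The function name and the plain-object input are my own invention, not part of either specification:

```javascript
// Hypothetical sketch: derive a partial JSON Schema from the constraint
// attributes of one form control. `attrs` is a plain object mapping
// attribute names to their string values (as attributes always are).
function constraintsToSchema(attrs) {
  const schema = {};
  if ('max' in attrs) schema.maximum = Number(attrs.max);
  if ('min' in attrs) schema.minimum = Number(attrs.min);
  if ('maxlength' in attrs) schema.maxLength = Number(attrs.maxlength);
  if ('pattern' in attrs) schema.pattern = attrs.pattern;
  if ('required' in attrs) schema.required = true; // boolean attribute
  return schema;
}
```

In a browser this could be fed from an element's attributes; here the input is kept as a plain object so the mapping itself stays visible.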

Types please

The HTML form constraints API says that the data type of every form component is a text string. That's pretty poor, since it's very likely that data submitted by a form will eventually land in another part of your application, or in some kind of data storage, and we're going to need more specific type information. When we have it, we can catch some errors in advance, and in other cases we can automatically convert a value to another type. A natural fit for adding type information to a form component would be a data attribute called type. Data-type... Wow.

Expressing primitive types is very straightforward:

  • input type="text" data-type="string"
  • input type="number" data-type="number" (through type conversion)
  • input type="checkbox" data-type="boolean" (through the checked attribute)
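
The coercion step for these primitives can be sketched in a few lines. The function name is mine, and the "on" check is an assumption based on the default value a checked checkbox submits:

```javascript
// Hypothetical sketch: coerce the string value of a form control to the
// type announced in its data-type attribute.
function coerceValue(value, dataType) {
  switch (dataType) {
    case 'number':
      return Number(value);
    case 'boolean':
      // a checked checkbox submits "on" by default (assumption noted above)
      return value === 'true' || value === 'on';
    default:
      return value; // 'string' or unknown: leave as-is
  }
}
```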

Things get a bit more complicated for objects and arrays. JSON Schema provides the keywords properties for objects, and items for arrays, which can be nested to express any kind of hierarchical data. However, since the form is the schema, we don't always need to express properties or items, as they will naturally appear from the nesting of form components.


For the sake of completeness, let's at least provide a list of properties that can be expected to appear in an object, or rather, a subform. For arrays it should suffice to supply the minItems and maxItems keywords, which express the expected number of items in an array. Optionally we can supply the uniqueItems keyword to state that all items in the array must be unique, but identical values should be rare enough in forms.

Type = Format?

Another inconsistency in HTML forms is that the type attribute actually denotes a format according to JSON Schema. It would be better if this pivotal attribute were renamed to format, while the original was used to express the actual data type of a value, but since the inconsistency is historical, we're stuck with it. When we read format instead of type for input elements, we're pretty close to JSON Schema already.

JSON Schema doesn't limit our imagination considering what formats JSON data can have, so we can use that to our advantage when creating custom form components. So instead of thinking up some custom tag name for a functional web component, you could also just name it input and use the type attribute to semantically express its functionality.

Imports

JSON Schema allows importing schema instances by way of JSON Reference. However, as I've demonstrated in the previous post in this series, form parts may just as easily be imported once the HTML imports specification has become widely adopted.

Unobtrusive javascript!

Now that we've perfectly aligned our constraint attributes with JSON Schema, it's time to start validating our form. We have to write some javascript to do this, but we can finally dance and cheer, because the script is unobtrusive. That's because validation is just an added bonus, and the form will work fine without it.

As I've underlined in part 1 in this series, I consider it good practice to bind events to the top form as much as possible, and use both the type of the event, and the component name or matches(some-css-selector) on the event target, to distinguish relevant situations. This is no different for validation. You can decide if you want to validate only on submission, by inspecting the event type for submit. Or you could inspect the target for a specific form component to validate when a change event is encountered.
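
In code, such a top-level dispatcher might look like the sketch below. The validate callback and the [required] selector are placeholders of my own; the point is only the shape of the dispatch on event type and target:

```javascript
// Hypothetical sketch: one handler at the form level decides what to
// validate, based on the event type and a selector match on the target.
function handleFormEvent(event, validate) {
  if (event.type === 'submit') {
    return validate(event.target); // validate the whole form
  }
  if (event.type === 'change' && event.target.matches('[required]')) {
    return validate(event.target); // validate a single control
  }
  return null; // nothing to validate for this event
}
```

In the browser you would register it once, e.g. form.addEventListener('change', e => handleFormEvent(e, myValidate)), instead of attaching listeners to every control.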

For things like styling and custom validation messages, we've got a lot of that built into the browser nowadays. I leave the implementation of custom validation as an exercise for the reader for now, but expect some github-hosted form power from my hand in the near future.

Next up: Forms as state containers Part 4: form generation

On unobtrusive javascript

This is a short rant that was originally part of my HTML forms write-up, but that I took out, because it became too long.

I haven't been writing unobtrusive javascript for at least a decade, and perhaps that was a mistake, but it was also inevitable. In my own defense: it seems HTML is actually quite incoherent... Yup, I just wrote that. I mean, you have tags that denote text levels, similar to those used in text processing software, like paragraph, header, blockquote and (later on) the HTML5 semantic tag set. You also have tags that denote some concrete representational blocks that can be considered part of the document flow, like img and table, or more generic tags that can be used to manipulate document flow, like div or span. The hyperlink is the VIP of the web, and a different beast altogether. Then there's the form, with its controls, which isn't representational at all, but primarily functional.

I think the distinction between representational and functional elements in HTML has been underestimated, and led to the explosion of (very obtrusive) javascript libraries to create class-based widgets. At least they were generic enough, and controllable through a well-described API. The standard that eventually evolved from this, namely web components, is no different. Although it may fill the gap between built-in markup and custom, javascript-enriched bundles of functionality, it's still mainly this functionality we're after, not text layout. Calling a class-based object an HTML element doesn't really change anything.

Finally, there's the argument that we should use meaningful (or semantic) tags. But there's nothing semantic about HTML itself. The semantic tag set that was added to HTML5 denotes text layout, which may be meaningful to browsers and text processing software, but not necessarily to humans. We should take people with disabilities into account when writing web applications, and for that it's helpful to specify the parts of document flow. But it's a formality that can also be controlled through application code. In the worst case, thinking semantically the HTML5 way impairs programmatic control and generics, simply making it harder to automate the process of creating web applications.

Ironically, semantic tagging is no longer the domain of markup, and moves in the direction of data. Yes, we're back where we started: markup and data are again separated. JSON-LD is a fairly new initiative, but it enables consumers of public web APIs to retrieve information in an automated fashion. This means we only have to decide what to automate: either we write programs that generate separate representations for humans and machines, or we write programs that generate a mix of meaning and functionality in HTML, sprinkled with some unobtrusive javascript.

So I'm like, yeah, whatever man, whatever.

Forms as state containers Part 2: managing complex data

This is a small series on HTML forms. The previous post was Forms as state containers Part 1: forms as a single source of truth

JSON?

The value of a form can be seen as a key/value map, a plain object in JSON. From this it follows that the value of a subform naturally becomes an object in the parent form. In HTML5 subforms can be expressed as a fieldset element. However, the javascript interface of a fieldset is not a natural fit for a subform, since it only groups elements in the parent form: the elements remain accessible in the parent form. By wrapping the fieldset interface in a thin javascript layer, we can treat fieldsets as true subforms, and address their "value" property as an object. In the unlikely event that the form doesn't have access to javascript, we may have another solution, as we'll see later on.

Repeating values obviously map naturally to javascript arrays. The "value" property of, for example, a multi-select gets (or sets) the array value of the form component. This is all quite obvious, you might think, and with the advent of JSON you just send your form data as a JSON document to the server with an Ajax request. However, this does not adhere to the age-old adage of unobtrusive javascript. And although graceful degradation and adherence to standards seem a thing of the past, we may still need to send complex form data over the wire using the built-in mechanisms. Unfortunately, this is not so trivial.

The default internet media type of form data is application/x-www-form-urlencoded; the only other flavor is multipart/form-data. Only these formats are available if you want to send form data to the server using the built-in HTTP methods (AKA "verbs") GET and POST that are available on the HTML form element. There is a solution, however, and it comes from an unlikely place...

Look ma, no JSON!

Originally, PHP was a form processing language for the web, and was even briefly named FI (Form Interpreter). From early on it had a way of handling hierarchical form data on the server. When a field name in a URL-encoded piece of data ends with a matching pair of square brackets, the value is interpreted as an array:

myarray[]=1&myarray[]=2&myarray[]=3

In the receiving PHP script, myarray will contain the following array:

[1,2,3]

If the order is important, indices may be specified:

myarray[2]=1&myarray[1]=2&myarray[0]=3

Translates to:

[3,2,1]

You can even encode key/value maps in post data you send to PHP, by supplying a string instead of an integer within the enclosing square brackets:

mymap[a]=1&mymap[b]=2&mymap[c]=3

On the server, mymap will contain:

{"a": 1, "b": 2, "c": 3}

You guessed it: you can't encode key/value maps with integers as keys, but that doesn't matter, because you also can't use integers as names for form components.
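
Producing this bracket convention from a nested javascript value can be sketched as follows. The function name is mine; it emits explicit indices for arrays, which PHP accepts just as well as the empty brackets:

```javascript
// Hypothetical sketch: encode a nested object/array into PHP-style
// bracket-named, URL-encoded name/value pairs.
function encodeBrackets(value, prefix = '') {
  const pairs = [];
  if (Array.isArray(value)) {
    value.forEach((item, i) => {
      pairs.push(...encodeBrackets(item, `${prefix}[${i}]`));
    });
  } else if (value !== null && typeof value === 'object') {
    for (const key of Object.keys(value)) {
      const name = prefix ? `${prefix}[${key}]` : key;
      pairs.push(...encodeBrackets(value[key], name));
    }
  } else {
    // leaf value: emit one name=value pair
    pairs.push(`${encodeURIComponent(prefix)}=${encodeURIComponent(value)}`);
  }
  return pairs;
}
```

Joining the result with & yields a body ready for an application/x-www-form-urlencoded POST (with the brackets percent-encoded, which PHP decodes transparently).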

HTML imports

We successfully resolved the issue of sending complex data to the server, but we didn't even have a use case yet! Let's take the largest challenge I can currently think of head-on: creating an Excel-like tabular data grid for editing rows and columns with arbitrary information. From a form perspective, we have an array of repeatable, yet identical, subform instances, where the values in each subform can be edited by the user. For the moment, I'll ignore properties for individual rows or columns, like custom formatting or data types.

Can we create such a beast using just HTML forms? Well, we could start by introducing a fieldset element, and stating that it should be interpreted as an array:

<form name="datagrid">
  <fieldset name="row[]">
    <!-- this will contain subform components -->
  </fieldset>
</form>

Column names are initially sequential letters starting with A. So for starters, let's just take A to G to create the default subform that will be imported:

<fieldset name="row[]">
  <input name="A" type="text" />
  <input name="B" type="text" />
  <input name="C" type="text" />
  <input name="D" type="text" />
  <input name="E" type="text" />
  <input name="F" type="text" />
  <input name="G" type="text" />
</fieldset>


Eventually we will want to repeat both rows and columns, so we can add them more generically, but just to drive our point home, I present to you my pure HTML form for editing tabular data:



I didn't even have to use tables. So, how can we repeat subforms without all the copy-pasting I just did? Well, we could just write a simple javascript to repeat rows and columns, but then it wouldn't be pure HTML anymore. What if I could at least import the fieldset, as a snippet of code, and insert it into the main form without the aid of custom scripts? At first I thought HTML imports could provide a standard way of doing this, but alas... HTML imports aren't about importing HTML at all! They're for bundling functionality, which requires a lot of javascript. Yikes! However, the idea isn't bad, so I'll just use it as I think it should work. To cut it short, the above snippet is packed as a separate form piece, and can be considered the default row.

Once the pieces are put together, we obviously need a way to store actual data, or retrieve what we've already stored before, and populate the form with it. We need some formal way of expressing that we want to retrieve multiple rows, which in terms of data means an array. I usually use the following convention: if the URL contains a query, the result will be an array, and if it doesn't, the URL should contain an identifier, in which case the result will be a key/value map.

So, to insert the data located at /row, we create a query to request an array from the server, and import it into the form:

/row?limit=100

The above URL will just return a hundred rows of data. We could use JSON, as the common conviction is that markup is a lot more verbose. But since our intent is to insert subform instances, we should still use plain old HTML, as it's much more predictable. And how much bigger is properly compressed markup, really?

Below is a working prototype, provided the links are resolved directly at their current location. In case there's no data saved yet, there's only the empty default row. Once the form is submitted, the data is stored, and in that case there will be one row of data, and again an empty row at the bottom.

<form>
  <link href="/header.html" rel="import" />
  <link href="/rows?limit=100" rel="import" />
  <link href="/default-row.html" rel="import" />
</form>
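
Since HTML imports don't actually work this way, a small preprocessing step has to expand the links. As a rough string-based sketch, assuming the linked resources have already been fetched into a map of href to HTML (the function name and the map are my own):

```javascript
// Hypothetical sketch: replace <link rel="import"> placeholders with the
// (already fetched) HTML they point to. `resources` maps href -> HTML.
function expandImports(html, resources) {
  return html.replace(
    /<link\s+href="([^"]+)"\s+rel="import"\s*\/?>(?:<\/link>)?/g,
    (match, href) => (href in resources ? resources[href] : match)
  );
}
```

A real implementation would of course work on the DOM and fetch the resources itself; the string version just shows the substitution.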


This is just a very basic prototype that can be enhanced in many ways, for example with sorting and filtering, operations on single cells, you name it. But before you start hacking away with your favorite javascript framework of the month, consider starting from traditional HTML forms.

Next up: Forms as state containers Part 3: validation

Wednesday, April 26, 2017

Forms as state containers Part 1: forms as a single source of truth

Please read the introduction if you haven't already.

Everything is a form

HTML forms are an age-old standard, and by no means replaceable. I created a form builder on top of Dojo Toolkit some years ago, and for all its javascript fanciness, it was driven by the idea that forms can and should be the only means for users to interact with the web. Want to create a fancy slider, a large editable grid, or even a simple (yet interactive) list? Be sure to wrap it in a form element! Why? Because form components manage their own state internally. As soon as your component exposes anything other than a simple "value" property to allow interaction with javascript, you're going to leak state into parts of the application that don't have anything to do with it, making things literally hell to maintain.

Dojo Toolkit was one of the first large javascript frameworks for building widgets AKA web components (known as Dijits), and it had some flaws in the state department, mainly because widgets always inherit from other widgets as a way of isolating application logic. In many, if not all, cases this meant that handing down state through the object hierarchy was inevitable. Newer frameworks learned from Dojo and put internal widget state in explicit containers, which are, let's say, marked for danger. Now, this is as good as it gets. Some developers will argue that component X will eventually need information from component Y, so you should maintain state centrally, outside of components. Enter web forms. And more importantly, web standards. Allow me to explain.

Data isolation

Each component in a form should maintain its own state, no matter how complex. The only thing a form component should expose is its value, which can be retrieved via its name. As you may know, you can extract all values in a form by iterating over all its components. Typically we can represent this data as a key/value map, where each component name is mapped to its value. We don't collect all form data all the time; we only do so when we need to process it. I see three possible moments for this to happen. The first is, obviously, upon submission. The second is when you want to store data offline, so the user won't lose it when something unexpected happens. When should data be stored offline? That depends on how valuable the contents are. It's possible to store on every keystroke, but also after several, or according to a schedule.
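
Collecting a form's value can be sketched like this. The function works on a plain list of name/value entries, so in the browser it could be fed with the entries of a FormData object or any other source; the function name and the array-folding rule for repeated names are my own:

```javascript
// Hypothetical sketch: fold [name, value] entries into a key/value map.
// Repeated names (e.g. a multi-select) collect into an array.
function entriesToValue(entries) {
  const value = {};
  for (const [name, v] of entries) {
    if (name in value) {
      value[name] = [].concat(value[name], v); // second+ occurrence: array
    } else {
      value[name] = v;
    }
  }
  return value;
}
```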

The third moment the value of a form needs to be extracted is more interesting, but also much more rare. It's when an application is used in a secure environment, and each modification should be saved across the wire, like a live connection. This can happen for instance in banking or in multi-user apps, like a collaborative platform. I do have a solution for this, but I won't go into it now, as it really is a lot more complex. I'll divulge the solution in return for some cold hard cash...

Events bubble up

Because in forms, data is so perfectly isolated, it's also a challenge when you do indeed need information across interdependent components. I should probably give an example here to make it more concrete. Let's say I got some nice date picker widget somewhere off the web. I would like to be able to select two dates, and when both are selected, I want to automatically enter the offset between the two dates (say, in hours) in a number box below the date pickers:







OK, maybe not such a nice widget, but hey, I got it like this out of the box for free. So, when both the start and end dates (yeah, you only see times) are set, the hours box is filled. I could register event listeners on both date pickers that fire two handler functions when a date is selected, and in both handlers check whether the other date is set. However, that means I can't make it more generic, for instance when I have multiple instances of this form, like in a calendar app. I can't really create a single widget to do all this internally, because it would violate returning a single "value" from it, as I actually have three values. How to do it then?

Instead of managing events locally, I just register event listeners at the top, which means on the form. Isolation of data doesn't mean I need to isolate events as well, on the contrary. Also, when registering events at the top, I have a central mechanism for managing client logic from the user's side of things. We could separate events by type, most notably "change", "click", "update", "focus", "blur", "submit", and match the event target using a regular CSS selector, as available since the arrival of Element.matches().

Having a single handler for all form events looks a bit like a controller you'd have in the over-popular MVC pattern, or the over-hyped Rx (reactive) way of doing things, all straight out of the box, and more importantly, standards-based. Throw application routing into the mix using some very basic pushState script, and you can start building applications without worrying too much about state, as you've got perfect isolation and central management.

Next up: Forms as state containers Part 2: managing complex data

Forms as state containers (in stateless web apps)

When wiring up a web application there's a lot of state involved. Only recently I learned about state management libraries in javascript, but I won't be talking about them. I think I don't need to introduce them, because I'm convinced you don't need them. Let's take a step back to see what state actually means in web apps.

Traditionally, the web is stateless, and that's a good thing. When you request a page, the server may store information about your request, for instance for analytical purposes, but it doesn't need to do so. As the HTML document is received, nothing special is stored about that document, except maybe some temporary information (like cookies) is written to disk, if you'll allow it. Every time you request the same page, it's rendered the same way and, ignoring the subject of caching, no state is retained.

The HTTP protocol is stateless by design. Therefore, traditional HTML forms are stateless too. Whenever user data is submitted to a server, the client doesn't "know" about it. The browser doesn't receive any response from the server upon submission. Another (stateless) page is simply returned, informing the user that the request either succeeded or failed, in which case they should just try again. As soon as the user navigates away from a page, the information they wished to submit is lost forever. This is a known feature of the web, and users accept it as a fact. As soon as we, the developers, try to "help" a user by retaining their work, that's the moment when things get hairy...

Among other things, retaining client-side information is one of the reasons that single-page (javascript) applications emerged. Instead of navigating by way of HTTP, the user stays on the same location the entire visit, while navigation is taken over by the javascript application. In modern single-page apps (SPA's), this is not even visible. Due to push-state technology, the URL and browser history change just like with HTTP, making the integration seamless. The downside is that the developer now has to deal with the application state.

Since the advent of SPA's a lot of things have changed. Using widgets, in-page requests (Ajax) and rendering based on comparing HTML snapshots (DOM diff), only parts of the page are updated, while the rest just remains as-is, making interaction seem much faster and snappier. These techniques again bring much more state into the SPA, making things downright impossible to manage. Even the most advanced tools keep track of the entire application in one or more data structures that live alongside the UI. Changes happen in the data; rendering the app is merely a side effect to inform the user what's going on.

In a series of posts I would like to explore an alternative to stateful SPA's, in the form of, well, forms, as the title suggests.

Next up: Forms as state containers Part 1: forms as a single source of truth

Sunday, April 23, 2017

JSON, the final frontier: handling semi-structured data the JavaScript way

One of the challenges in javascript remains that there's no convenient way to traverse JSON. Subsetting or modifying complex hierarchies usually requires some sort of custom recursive strategy, which always leads to heavy mental exercise and many errors. Although JSON has pushed XML out of the web stack, there's still some XML legacy to be dealt with. So why not scour the pile of remaining junk for some nifty scrap parts to reuse? Let's travel back to Tatooine...

XPath seems to have had an important role in traversing the DOM, the abstract Document Object Model resulting from interpreting XML documents. It certainly had some useful features that CSS, for instance, lacks, like traversing back up a tree. Unfortunately, the W3C recommendation of XPath is not suitable for JSON.

You might argue that the W3C deemed XPath unfit for JSON, but perhaps it was never given a fair chance. There has been an initiative to create an XPath-like DSL for JSON called JSONPath, but it has probably been collected by a sandcrawler. In this post I would like to explore both the possibility of XPath traversal for JSON and an implementation in javascript.

Let's introduce some data:
<contrib-group>
  <contrib contrib-type="author">
    <name name-style="western">
      <surname>McCrohan</surname>
      <given-names>John</given-names>
    </name>
    <aff>Center for Devices and Radiological Health</aff>
  </contrib>
</contrib-group>
By Jabba's beard, that's XML! Haven't seen that on the web since 2013! Yep, and the XPath to retrieve the author's given-names from these ancient runes would have been

//given-names

Wow. Try doing that in javascript! Now, let's inspect what really happened here. The double slashes actually shortcut the expression straight down to all elements named "given-names", in the order that they appear in the document. This is called an axis-step, and expanded the expression looks like:

/descendant::given-names

Furthermore, instead of the string "given-names", the engine finds a curse word called a QName, which is a "qualified name" in a certain "namespace". Ugh. Never mind, it just means that the engine searches for a node of type element, 'cause those are denoted by a QName. So it will test for that type:

/descendant::element(given-names)

Is there a way to write this expression for JSON? Let's see and have a go at modeling the same data. First, we can get rid of the group-level, because, hey, we have arrays. The root doesn't need a name, so we'll just write an object, and since arrays could contain multiple contributors, let's refer to it as "contribs".

{"contribs":[]}


Secondly, we have to find a place to put that abomination called an attribute.

{"contribs":[
  {"type":"author"}
]}


Easy! Now the rest.

{"contribs":[
  {
    "type": "author",
    "name": {
      "style": "western",
      "surname": "McCrohan",
      "given-names": "John"
    },
    "aff": "Center for Devices and Radiological Health"
  }
]}
Here's a straightforward mapping, glossing over some details I'll address later on. How to find the same data with our simple expression? Can we just have the expression address the key "given-names" in our "contrib" object? Not quite. While modeling the data as JSON we actually shifted the tree one level. Instead of mapping the "given-names" element to an object, we actually mapped it directly to a string!

Object values are not always objects. And in XML, something called text nodes existed, but they didn't have QNames! Let's just pretend that the string value is a text node and that we can select any node that satisfies our name test. In JSON, a node can be anything: an object, array, string, number or boolean. Note also that object keys need not be QNames: a key like "1" is perfectly legal JSON, even though it isn't a valid XML name.

/descendant::node(given-names)
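To make those JSON node kinds concrete, a classifier is nearly a one-liner in javascript (the name kind is my own throwaway helper, not from any spec):

```javascript
// Classify a JSON node; "kind" is a made-up helper, not part of any standard.
function kind(node) {
  if (node === null) return "null";
  if (Array.isArray(node)) return "array";
  return typeof node; // "object", "string", "number" or "boolean"
}

kind({ "given-names": "John" }); // → "object"
kind("John");                    // → "string"
kind([1, 2, 3]);                 // → "array"
```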

We just ruined the beautiful old datamodel for our selfish purposes... but created a cool new spaceship while doing so. At least now we can take it all the way without trying to be correct. Instead of putting our "given-names" expression in hyperdrive to select all descendants, we could also be more precise and navigate the tree one step at a time. Since the engine doesn't have to consider every node in the document, this can give a huge performance boost. For the XML fragment this would be:

/contrib/name/given-names

In JSON you have to be aware that items in an array don't have names, so we have to select an item by its index, in this case the item at index 0. However, the XML people were afraid that starting to count at zero would certainly confuse some Ewoks, so they decided to start counting at one. Since their wisdom was unfathomable, let's just go with it:

/contribs/1/name/given-names

Or, expanded:

/child::node(contribs)/child::node(1)/child::node(name)/child::node(given-names)

Finally, I'd like to look at some more arcane XPath stuff that I would call fat filters. In javascript we'd have to either write a loop to filter or use the built-in filter method on arrays. But in XPath we get that for free. In addition we get a reference to the context, in the form of a dot, as well as some context-dependent functions, for example "position", which refers to the position of the item we're filtering. If we had a nodeset containing multiple contribs, and all we wanted were those of type "author", we could write the following for XML:

/contrib[@contrib-type = "author"]

Expanded this looks more or less like

/child::element(contrib)
+
filter(equals(./attribute::attribute(contrib-type),"author"))

Very straightforward. Like any filter, the engine will simply test each node in the current selection by calling a function that returns either true or false, leaving out the ones that return false. However, in the case of JSON the context we selected above would refer to the array, not an item in it. Luckily the old XML gods had foreseen this and provided a wildcard to select any node for the current axis step. It's like a little bright star on the horizon.
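For comparison, here's the plain-javascript version of that predicate over our JSON model, using the built-in filter method mentioned earlier (the second, "editor" contributor is invented so the filter has something to drop):

```javascript
// Two contributors in the JSON model; the "editor" one is made up for the demo.
const contribs = [
  { "type": "author", "name": { "surname": "McCrohan", "given-names": "John" } },
  { "type": "editor", "name": { "surname": "Solo", "given-names": "Han" } }
];

// The built-in filter method plays the role of the XPath predicate.
const authors = contribs.filter((contrib) => contrib.type === "author");
// → [{ type: "author", name: { surname: "McCrohan", "given-names": "John" } }]
```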

Normally we associate wildcards with name globbing, and not so much with a range of numbers. But droids are smart, right? So let's just have the engine figure out what to do when it encounters a wildcard. In case it encounters an array, it will simply convert the array to a sequence of objects. A sequence looks and feels somewhat like an array, but is actually a monadic type... Very dark and ancient magic indeed, not for the faint of heart and certainly outside the scope of this galaxy. Suffice it to say a sequence can contain zero, one or more items, and everything is a sequence. Anyway, we can now filter anything:
/contribs/*[type = "author"]

Or
/child::node(contribs)/child::node(*)
+
filter(equals(./child::node(type),"author"))

Phew, point made! Just makes you wonder: how the hell could anyone come up with something so alien as JSONPath? We may never know. They were probably from a planet destroyed by a death star. Mercenaries perhaps, who cares. On to the last phase, implementing it into your brain.

To give the functionality an appropriate name I'm just going to go with "select", since we're selecting something. A handy helper will be a seq function, a factory making Sequences. We're also going to need a way to handle sequences, since that's what the select function will return. So instead of using the built-in javascript string comparison, we need a function "equals", that will compare a sequence with something else. I'm proud to present the "select" function:
function select(jsonDocument, ...paths) => Sequence

Where paths is an array of "steps" through the tree, each performed in turn on a smaller selection. Next we'll need some axis functions, like "child" or "descendant" above, but there are actually 13 axes in total. Let a step be some combination of an axis and a node type test. Both will be identified by our engine to perform the necessary operations on the current context. The first two examples from our exploration would translate to javascript like this:
select(contribData, seq(descendant(),node("given-names")))

select(contribData, seq(child(),node("contribs")), seq(child(),node(1)), 
  seq(child(),node("name")), seq(child(),node("given-names")))

For the filter example, we could just filter the result from a selection by writing something like:
filter(
  select(contribData, seq(child(),node("contribs")), seq(child(),node("*"))), 
  function(context) {
    return equals(select(context,seq(child(),node("type"))), "author");
  }
)

However, with XPath it's possible to just continue making the selection narrower, so it would be nice if we could provide the "select" function with some mechanism that doesn't filter the entire result right away, but processes the current subset once the step is encountered. For this we could make filter a function with just the filtering part, and not the input yet.
select(contribData, seq(child(),node("contribs")), seq(child(),node("*")), 
  filter(function(context) {
    return equals(select(context,seq(child(),node("type"))), "author");
  })
)

We could also create a filter method on Sequences, which is fine but less in the spirit of our XML ancestors. Furthermore, there's a lot more going on with types than I can describe here, as the XPath data model uses the entire XML Schema type system. So keeping this a bit more open-ended may well prevent trouble later.
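To show that the pieces actually fit together, here is a minimal sketch of the whole engine. Every name in it (select, seq, child, descendant, node, filter, equals) is the hypothetical one from this post, not a published library, and a plain array stands in for a proper Sequence type:

```javascript
// A node test matches an object key or array index; "*" matches anything.
function node(name) {
  return (key) => name === "*" || String(key) === String(name);
}

// The child axis yields [key, value] pairs; array indices are shifted to
// start at one, XPath-style.
function child() {
  return function* (ctx) {
    if (Array.isArray(ctx)) {
      for (let i = 0; i < ctx.length; i++) yield [i + 1, ctx[i]];
    } else if (ctx !== null && typeof ctx === "object") {
      yield* Object.entries(ctx);
    }
  };
}

// The descendant axis just recurses through child.
function descendant() {
  const kids = child();
  return function* walk(ctx) {
    for (const [key, value] of kids(ctx)) {
      yield [key, value];
      yield* walk(value);
    }
  };
}

// A step combines an axis with a node test, mapping one selection to the next.
function seq(axis, test) {
  return (selection) => {
    const next = [];
    for (const ctx of selection) {
      for (const [key, value] of axis(ctx)) {
        if (test(key)) next.push(value);
      }
    }
    return next;
  };
}

// A filter step keeps only the items for which the predicate holds.
function filter(predicate) {
  return (selection) => selection.filter(predicate);
}

// equals compares a (poor man's) sequence with a plain value.
function equals(sequence, value) {
  return sequence.some((item) => item === value);
}

// select threads the document through each step in turn.
function select(jsonDocument, ...steps) {
  return steps.reduce((selection, step) => step(selection), [jsonDocument]);
}

// The examples from this post:
const contribData = {
  "contribs": [{
    "type": "author",
    "name": { "style": "western", "surname": "McCrohan", "given-names": "John" },
    "aff": "Center for Devices and Radiological Health"
  }]
};

select(contribData, seq(descendant(), node("given-names")));
// → ["John"]

select(contribData, seq(child(), node("contribs")), seq(child(), node("*")),
  filter((c) => equals(select(c, seq(child(), node("type"))), "author")));
// → [ the whole "author" contrib object ]
```

Note how the filter step composes just like any other step, which is exactly the open-endedness argued for above.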

I promised to come back to some details of the test data I glossed over. I didn't mention that the XML fragment is part of JATS, the Journal Article Tag Suite, a standard for publishers. In this standard, believe it or not, the "aff" element, standing for the affiliation of a contributor, is allowed to appear both at the level of the contrib-group and under a contrib! Woah, for real? That means our mapping to JSON was rather incomplete, not to mention plain wrong, because we changed the contrib-group to a contribs array, and since arrays aren't objects, we can't put the "aff" key under "contribs".

Ok, we had been warned... but how can we fix it?! Well, arrays can't have strings as indices, but objects can certainly have number-like keys (JSON keys are always strings, but "1" is a perfectly fine string). So, we could turn the array into an object and use numbers as keys:
{"contribs":{
  "1": {
    "type": "author",
    "name": {
      "style": "western",
      "surname": "McCrohan",
      "given-names": "John"
    }
  },
  "aff": "Center for Devices and Radiological Health"
}}

Perfect! Everything still works. One more minor thing: that the "aff" element could appear at two levels suggests that, besides the "contrib" element, the "contrib-group" element was repeatable as well, so you could group contributors with the same affiliation. This means our model is now incorrect at that level too. Blast! How clever, these XML folks! Well, no way around it anymore: we just have to accept the fact that XML was more flexible than JSON in this respect. Perhaps we could just introduce a minor convention to model elements in JSON, so we could still use the XPath engine we just created.

For starters, we could use some reserved keys to model QNames, attributes and child nodes. Let's use dollar signs, because they're on your keyboard (I think), aren't allowed in XML and are already reserved elsewhere in JSON land:
{
  "$qname": "contrib-group",
  "$children": [{
    "$qname": "contrib",
    "$attrs": {"contrib-type": "author"},
    "$children":[{
      "$qname":"name",
      "$attrs": { "name-style": "western"},
      "$children": [{
        "$qname": "surname",
        "$children": ["McCrohan"]
      },{
        "$qname":"given-names",
        "$children": ["John"]
      }]
    }]
  }]
}
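As a sanity check on this convention, a tiny serializer can turn it back into XML (toXml is my own throwaway helper, and it skips real-world concerns like escaping and namespaces):

```javascript
// Serialize the $qname/$attrs/$children convention back to an XML string.
// Anything that isn't an object is treated as a text node.
function toXml(node) {
  if (typeof node !== "object") return String(node);
  const attrs = Object.entries(node.$attrs || {})
    .map(([name, value]) => ` ${name}="${value}"`)
    .join("");
  const children = (node.$children || []).map(toXml).join("");
  return `<${node.$qname}${attrs}>${children}</${node.$qname}>`;
}

toXml({ "$qname": "surname", "$children": ["McCrohan"] });
// → "<surname>McCrohan</surname>"
```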

Oh dear... The JSON version above is quite small, but it clearly shows how much more memory was needed back when XML was the preferred format. And all this just so that some information could be shared without creating redundancy... Hey, wait a minute, we don't have much redundancy nowadays anyway. And even before XML this wasn't a problem, when we already had the same tabular data we still use today. Because redundancy is minimized by creating relationships between the pieces of data we want to reuse, right?

So why did the JATS folks do it like this? Perhaps because they thought someone would be editing this by hand, and could introduce minor discrepancies when repeating information. Editing XML by hand... cruel times indeed. Or perhaps it was done because they were already thinking about the presentation of the data, where it would be unexpected for a reader to find the same information repeated, and so they forgot about separation of concerns. We will never know. It seems modeling data has always been a challenge, and nobody gets it right the first time.

But that's for another chapter in this very, very, very, very, very, very, very, very long saga.