Saturday, April 29, 2017

Forms as state containers Part 3: validation

This post is part of a small series on HTML forms. The previous post was Forms as state containers Part 2: managing complex data.

A form is a schema

Since HTML5 there are a lot more attributes available on form components for expressing constraints. This means forms can nowadays be used as a basic schema for validating data. However, HTML5 form components are a motley crew, so we need a guiding principle to make sense of it all. JSON Schema provides a coherent set of constraints for JSON data. And because JSON is just javascript, JSON Schema should be readily applicable to HTML forms as well. You may not know or like JSON Schema, or you may not want to learn it because it isn't a standard (yet), so in this post I'll try to patch things up by merging JSON Schema into the HTML forms standard.

To start off, some very useful keywords from JSON Schema translate directly to HTML5 form constraints. For this purpose I'm considering draft 4 (draft-04) of the specification.

JSON Schema   HTML5 form constraint          Purpose
maximum       max                            Upper limit for a numeric instance
minimum       min                            Lower limit for a numeric instance
maxLength     maxlength                      Maximum length of characters in a string [1]
pattern       pattern                        Regular expression pattern to match against
required      required                       The value must be present [2]
enum          select / optgroup / datalist   A list of possible values [3]

1. I'm not sure whether a surrogate pair counts as one or two characters in either specification.
2. There's a slight difference, as the keyword in JSON Schema expects an array of properties that are required in an object. Also, a form component can be present, but not filled in, while a property can only be either present or absent. We can however treat both specs as equal in most cases.
3. In HTML an enumerated value is expressed as an option element. In JSON Schema the array can contain items of any type, including null. The HTML option element can only contain a text string.
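
To make the mapping in the table above concrete, here is a minimal sketch that derives HTML constraint attributes from a JSON Schema property. The helper name constraintAttributes and the example schema are made up for illustration; only the keyword and attribute names come from the two specifications. The enum keyword is left out, since it maps to child elements rather than to an attribute.

```javascript
// Keyword-to-attribute pairs taken from the table above.
const KEYWORD_TO_ATTRIBUTE = {
  maximum: "max",
  minimum: "min",
  maxLength: "maxlength",
  pattern: "pattern",
};

// Hypothetical helper: derive HTML constraint attributes for one property.
// `required` is the array of required property names from the parent schema,
// which is where draft-04 keeps that information.
function constraintAttributes(name, propertySchema, required = []) {
  const attributes = {};
  for (const [keyword, attribute] of Object.entries(KEYWORD_TO_ATTRIBUTE)) {
    if (keyword in propertySchema) {
      // HTML attribute values are always strings.
      attributes[attribute] = String(propertySchema[keyword]);
    }
  }
  if (required.includes(name)) {
    // Boolean attributes are expressed by their mere presence.
    attributes.required = "";
  }
  return attributes;
}

// Example schema, invented for this sketch.
const exampleSchema = {
  type: "object",
  required: ["age"],
  properties: {
    age: { type: "number", minimum: 0, maximum: 150 },
    nickname: { type: "string", maxLength: 20, pattern: "^[a-z]+$" },
  },
};

const ageAttributes = constraintAttributes(
  "age",
  exampleSchema.properties.age,
  exampleSchema.required
);
const nicknameAttributes = constraintAttributes(
  "nickname",
  exampleSchema.properties.nickname,
  exampleSchema.required
);
```

In a browser, each resulting pair could be applied to an input with setAttribute(name, value).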

Types please

The HTML form constraints API says that the data type for all form components is a text string. That's pretty poor, since it's very likely that data submitted by a form will eventually land in another part of your application, or some kind of data storage, and we're going to need more specific type information. When we have it, we can catch some errors in advance, and in other cases we can automatically convert a value to another type. A natural fit for adding type information to a form component would be in a data attribute called type. Data-type... Wow.

Expressing primitive types is very straightforward:

  • input type="text" data-type="string"
  • input type="number" data-type="number" (through type conversion)
  • input type="checkbox" data-type="boolean" (through the checked attribute)
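
As a rough sketch of how the three cases above could be read back as typed values: the helper name typedValue is my own invention, and the plain objects below merely stand in for DOM elements, which expose a data-type attribute as element.dataset.type.

```javascript
// Hypothetical helper: read a form component's value as the type named in
// its data-type attribute.
function typedValue(element) {
  switch (element.dataset.type) {
    case "number":
      // An empty number input has no value; report that as null, not 0.
      return element.value === "" ? null : Number(element.value);
    case "boolean":
      // Checkboxes carry their state in `checked`, not in `value`.
      return element.checked;
    default:
      // Everything else stays the text string the form already provides.
      return element.value;
  }
}

// Stand-ins for real input elements, so the sketch also runs outside a browser.
const ageInput = { type: "number", dataset: { type: "number" }, value: "42" };
const emptyInput = { type: "number", dataset: { type: "number" }, value: "" };
const agreeInput = { type: "checkbox", dataset: { type: "boolean" }, value: "on", checked: true };
const nickInput = { type: "text", dataset: { type: "string" }, value: "Ada" };

const typedAge = typedValue(ageInput);
const typedEmpty = typedValue(emptyInput);
const typedAgree = typedValue(agreeInput);
const typedNick = typedValue(nickInput);
```

Treating an empty number input as null is a design choice of this sketch; Number("") would otherwise silently turn it into 0.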

Things get a bit more complicated for objects and arrays. JSON Schema provides the keywords properties for objects, and items for arrays, which can be nested to express any kind of hierarchical data. However, since the form is the schema, we don't always need to express properties or items, as they will naturally appear from the nesting of form components.

For the sake of being complete, let's at least provide a list of properties that can be expected to appear in an object, or rather, a subform. For arrays it should suffice to supply the minItems and maxItems keywords, which express the expected number of items in an array. Optionally we can supply the uniqueItems keyword that can be used to inform that all items in the array must be unique, but identical values should be rare enough in forms.
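
The array keywords could be checked along these lines. This is only a sketch: the function name checkItems is hypothetical, and how the keywords end up on the form (for example as data-min-items and data-max-items attributes on the element wrapping the repeated components) is an assumption of mine.

```javascript
// Hypothetical helper: check a list of item values against the minItems,
// maxItems and uniqueItems keywords. Returns the names of violated keywords.
function checkItems(items, { minItems = 0, maxItems = Infinity, uniqueItems = false } = {}) {
  const errors = [];
  if (items.length < minItems) errors.push("minItems");
  if (items.length > maxItems) errors.push("maxItems");
  // A Set compares primitives by value, which is enough for form values.
  if (uniqueItems && new Set(items).size !== items.length) {
    errors.push("uniqueItems");
  }
  return errors;
}

const tooFew = checkItems(["a"], { minItems: 2 });
const duplicated = checkItems(["a", "b", "a"], { maxItems: 5, uniqueItems: true });
const fine = checkItems(["a", "b"], { minItems: 1, maxItems: 3, uniqueItems: true });
```

When the limits are read from data attributes they arrive as strings, so they would need a Number() conversion before being passed in.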

Type = Format?

Another inconsistency in HTML forms is that the type attribute actually denotes what JSON Schema calls a format. It would be better if this pivotal attribute were renamed to format, with type reserved for the actual data type of a value, but since the inconsistency is historical we're stuck with it. When we read format instead of type for input elements, we're pretty close to JSON Schema already.

JSON Schema doesn't limit our imagination regarding the formats JSON data can have, so we can use that to our advantage when creating custom form components. Instead of thinking up a custom tag name for a functional web component, you could just name it input and use the type attribute to express its functionality semantically.


JSON Schema allows importing schema instances by way of JSON Reference. However, as I've demonstrated in the previous post in this series, form parts may just as easily be imported once the HTML imports specification has become widely adopted.

Unobtrusive javascript!

Now that we've perfectly aligned our constraint attributes with JSON Schema, it's time to start validating our form. We have to write some javascript to do this, but we can finally dance and cheer, because the script is unobtrusive. That's because validation is just an added bonus, and the form will work fine without it.

As I've underlined in part 1 of this series, I consider it good practice to bind events to the top form as much as possible, and to use both the type of the event and the component name or matches(some-css-selector) on the event target to distinguish relevant situations. This is no different for validation. You can validate only on submission by checking for the submit event type. Or you can inspect the target for a specific form component and validate it when a change event comes in.
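
A minimal sketch of such a handler might look like this. The names createFormHandler, validateComponent and validateForm are placeholders of my own, and the [data-type] selector assumes the data attribute proposed earlier in this post.

```javascript
// Hypothetical dispatcher for events that bubble up to the top form.
// `validateComponent` and `validateForm` are placeholders for real validation.
function createFormHandler({ validateComponent, validateForm }) {
  return function handleFormEvent(event) {
    if (event.type === "submit") {
      // On submit the event target is the form itself.
      if (!validateForm(event.target)) event.preventDefault();
      return;
    }
    if (event.type === "change" && event.target.matches("[data-type]")) {
      // Only components that declare a data type are validated on change.
      validateComponent(event.target);
    }
  };
}

// In a browser, one handler on the form covers both events:
//   const handler = createFormHandler({ validateComponent, validateForm });
//   form.addEventListener("submit", handler);
//   form.addEventListener("change", handler);
```

Because the handler only looks at event.type and event.target, components added to the form later are covered without binding anything new.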

Things like styling and custom validation messages are largely built into the browser nowadays. I leave the implementation of custom validation as an exercise for the reader for now, but expect some GitHub-hosted form power from my hand in the near future.
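
As one possible starting point for that exercise, the browser's constraint validation API exposes a validity object with one flag per failed constraint, plus setCustomValidity for overriding the message. The sketch below assumes nothing beyond those two; the message texts and the name applyCustomMessage are made up.

```javascript
// Hypothetical messages, keyed by the flag names of the validity object.
const MESSAGES = {
  valueMissing: "Please fill in this field.",
  rangeUnderflow: "This value is too low.",
  rangeOverflow: "This value is too high.",
  tooLong: "This text is too long.",
  patternMismatch: "This value has the wrong format.",
};

// Pick the message for the first failed constraint and hand it to the
// browser. Passing an empty string clears any earlier custom message.
function applyCustomMessage(element) {
  const flag = Object.keys(MESSAGES).find((key) => element.validity[key]);
  const message = flag ? MESSAGES[flag] : "";
  element.setCustomValidity(message);
  return message;
}
```

Only the built-in flags are inspected, never customError, so calling the function again after the user corrects the value resets the message to an empty string.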

Next up: Forms as state containers Part 4: form generation
