Rethinking Web Development: Canvas UI

Feb 10, 2014 at 9:28PM

Caleb Doxsey

So here’s a sadly common story:

A startup with an existing web application wants to get into the mobile space by producing both an Android and an iOS app. Seeing as they already have the skills needed to produce a web application, and they’d rather not learn, build and maintain another two versions of their application, they attempt to create a mobile application using web technologies via PhoneGap.

Almost without fail that first version of their application is a disaster: slow, clunky, unstable and riddled with bugs. They’ll spend the next 6 months trying to get it to work in some sensible way. Eventually they’ll admit defeat, scrap the old app, and build a new, native one from scratch.

And that’s pretty depressing. One of the supposed advantages of the web was that we could finally do away with so many operating system idiosyncrasies. No longer would we have to make 3 versions of every piece of software (Windows, Mac, Unix); instead we could make one application which could run everywhere. And then after perhaps half a decade, we threw all that out and recreated the multiplicity of platforms we thought we’d gotten rid of.

Actually it’s worse than that. A mere 3 versions of the application is a pipe dream. In reality you also have to consider all the various devices, in all their various resolutions and capabilities, and all the various versions of the operating systems. This list of Android devices is the stuff developer nightmares are made of. Serious Android shops really do have hundreds of mobile phones to test on (because it’s the only way to be sure the app actually works for your users).

I can’t believe I’m about to say this, but it makes the almost 10 years we spent working around IE6 bugs seem easy by comparison.

Looking back at the one Android app I built, though, I did come to appreciate one thing. I think perhaps our original inclination about building one app for multiple platforms wasn’t entirely hopeless - it was just in the wrong direction.

Why do we think that HTML, CSS and Javascript represent the ideal set of technologies to build user interfaces? When you actually consider their origins and the original purpose for which they were designed, they are a remarkably poor fit for what we ended up doing with them. HTML is a language to represent hypertext - that is to say a language that gives some structure to basically textual content, and to provide links between different sets of textual content. It’s designed for an encyclopedia. But we aren’t building encyclopedias, at least not most of us, and how much of a modern web application even resembles textual content?

Now I’m not sure I can actually make this argument - it’s the kind of thing that’s really hard to recognize until you see an alternative - but I’ll give it a shot anyway. Here are a few of the problems I see:

Fatally flawed by its original ties to the hideously complex SGML, its far too forgiving nature toward malformed content and its dearth of elements to represent concepts we actually work with, HTML is a hack we put up with because we have to. As a technology it failed to deliver on its ultimate goal: the Semantic Web. (I think we can thank Google for managing to give structure to a system which was supposed to have it, but never actually did)
When it comes to web applications the separation of the semantic meaning of content (HTML) from its presentation (CSS) and interactive function (Javascript) is not actually all that useful. Most of what makes up a user interface has no real meaning outside of its intended function inside that interface. Structuring buttons as buttons, lists as lists, tables as tables and text as text may make the life of a data-crawling robot easier, but the end-user doesn’t ultimately care about such things.

Proof of this can be seen in this basic point: nobody stores their structured, semantic data as HTML. If I query Twitter for a Tweet I get back a JSON object that describes everything from its title to its author and date of publication. I take this object, transform it into HTML and CSS, and include it in my page. If HTML delivered on what it promised, APIs like Twitter’s would return content as HTML which I could just drop into my page. But it doesn’t deliver, and HTML is a cumbersome and inadequate tool to represent structured data.
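
As a rough illustration of that round trip, here is a minimal sketch of turning a Tweet-like JSON object into markup. The field names (`author`, `text`, `createdAt`) and the `escapeHtml` and `tweetToHtml` helpers are made up for this example and are not Twitter’s actual response format.

```javascript
// Hypothetical helper: escape text so it is safe to place inside HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical transform from a Tweet-like object to an HTML fragment.
// The field names are assumptions for illustration only.
function tweetToHtml(tweet) {
  return (
    '<div class="tweet">' +
    '<span class="author">' + escapeHtml(tweet.author) + "</span>" +
    '<p class="text">' + escapeHtml(tweet.text) + "</p>" +
    '<time datetime="' + escapeHtml(tweet.createdAt) + '">' +
    escapeHtml(tweet.createdAt) + "</time>" +
    "</div>"
  );
}

// The structured data arrives as JSON; the HTML is only a presentation step.
var json = '{"author":"caleb","text":"1 < 2 & 3","createdAt":"2014-02-10"}';
var html = tweetToHtml(JSON.parse(json));
```

The structure lives in the JSON; the HTML is something I build afterwards, purely for display.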

And yet web developers, suckled as they were at the teat of Designing with Web Standards, are obsessed with the pointless pursuit of semantic value.
WYSIWYG editors, though popular for a time (with products like Dreamweaver), ultimately fell by the wayside. Probably most web designers today know HTML & CSS, and perhaps even work directly in them. Maybe that’s a good thing, but something about it just seems wrong. If a designer is comfortable with Photoshop why shouldn’t he be able to export what he makes in a usable format?

I spent many years translating Photoshop mocks into HTML & CSS. I don’t do it anymore, but that job still exists. Why? Shouldn’t a computer be able to do this?
CSS is not easy. It’s deceptive, because some things are trivial. But the layout model, coupled with its cascading nature, means that full-fledged UIs can be surprisingly difficult to implement. Everyday examples include centering content, implementing float & overflow properly, trying to use the incredibly complex z-index algorithm, and - of course - working around subtle browser inconsistencies.

Now if you don’t agree with those examples, consider the fact that it took years for browsers to finally mostly pass the Acid3 test. If the people who write browsers can’t implement these things properly, how much hope do you have of using them properly?
CSS is, in general, not very reusable. Have you ever tried to take a component you wrote in one site and put it in another? Cascading rules lead to complex, surprising and unpredictable interactions among components, and there doesn’t seem to be any general approach to writing CSS that solves these problems. (though certainly many have tried) The best solution I’ve seen is a style guide, but the truth is most web sites are a hodge-podge of 10,000ish line CSS files that no one can really grasp.
Newish web development tools are remarkably conservative. LESS, SASS and all their cousins are languages designed to produce CSS. But they’re still basically CSS in how they work. No one is re-inventing the layout model and about the riskiest thing you see are automatic polyfills or image substitutions with data-uris. The same can be said for HTML generation languages (with most looking like Markdown and friends - glorified variable substitution languages) and even most Javascript competitors like Coffeescript.

It seems to me that a healthier ecosystem would see more radical offerings along the lines of the Google Web Toolkit. Not that there haven’t been attempts to pull this off, rather the attempts are generally of very low quality. Maybe a better base set of technologies would make things like this easier to pull off.

I could probably keep going, but at this point I want to introduce a radically different way of writing web applications which I called Canvas UI.

Canvas UI

The idea is this: rather than using HTML & CSS as the bedrock of frontend web development we use a more rudimentary graphical API (Canvas, WebGL or SVG), build frameworks on top of that base and develop our actual application using an entirely different set of technologies.

Over the last couple weeks I threw together an example which implements the same functionality as my previous WebRTC example. Source code is available here. Though crude, buggy and clearly not a viable solution, I think there’s enough here to show what it could look like.

The application is drawn on a large, full-page canvas, with the only HTML & CSS being used to create that canvas. All drawing is done via Javascript in an OO class-based component model like this:

```
var incomingTitle = new Text({
	text: "Incoming",
	fontFamily: FONT,
	fontSize: "20px",
	color: "#333",
	left: function() {
		return this.parent().left() + (this.parent().width() / 2 - this.width() / 2);
	},
	top: function() {
		return this.parent().top() + (this.parent().height() / 2 - this.height() / 2) - 40;
	}
});
```
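
To make the idea of function-valued properties concrete, here is a minimal sketch of how such a component model might resolve them. The `Component` constructor and its `add` method are hypothetical and are not the code from my example; the sketch also assumes fixed `width` and `height` values, where a real `Text` component would measure its text.

```javascript
// Hypothetical base class: every option becomes an accessor method that
// returns either the literal value or the result of calling the supplied
// function with the component as `this`.
function Component(options) {
  var self = this;
  this._parent = null;
  Object.keys(options).forEach(function (name) {
    var value = options[name];
    self[name] = function () {
      return typeof value === "function" ? value.call(self) : value;
    };
  });
}

// Return the containing component, or null for the root.
Component.prototype.parent = function () {
  return this._parent;
};

// Attach a child so that its layout functions can consult this component.
Component.prototype.add = function (child) {
  child._parent = this;
  return child;
};

var panel = new Component({ left: 100, top: 50, width: 400, height: 300 });
var title = panel.add(new Component({
  width: 80,
  height: 20,
  left: function () {
    return this.parent().left() + (this.parent().width() / 2 - this.width() / 2);
  },
  top: function () {
    return this.parent().top() + (this.parent().height() / 2 - this.height() / 2) - 40;
  }
}));

console.log(title.left(), title.top()); // prints 260 150
```

Because positions are computed on demand, nothing is cached in this sketch; a real framework would presumably want to avoid recalculating the whole chain on every frame.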

Not for the faint of heart, this is a no-frills approach to web development. To list just a few of the things you’d need to pull this off:

Canvas has no DOM. Objects are painted to the screen and have no other existence outside of what you give them. Therefore if you want to implement events you will have to do them entirely manually. (That is to say, hook them to the body and calculate the position of your GUI elements to determine whether or not a click fell on them)
Some graphical elements are quite complex in their implementation: for example a large block of text. You won’t get scrollbars, multiline layout, floating images (or other content) or the kind of CSS positioning trickery you may be accustomed to: line-height, text-overflow, indentation, padding, margins, … (unless of course you choose to implement them)
Forms will have to be completely rethought. None of those controls will exist anymore.
Many of the everyday tools web developers use won’t be all that useful anymore. Inspecting a page won’t tell you much, and you can’t go adjusting CSS rules on the fly.
For that matter accessibility is pretty much shot too… (though maybe you could work around this by dumping your textual content to the page in a format screen readers could figure out)
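
The manual event handling mentioned in the first point can be sketched roughly as follows. This is an illustration under assumptions, not the code from my example: the component list, `hitTest` and `attachClicks` are hypothetical names, components are treated as plain rectangles, and the listener is attached to the canvas rather than the body.

```javascript
// Hypothetical rectangular components, listed back to front.
var components = [
  { name: "background", left: 0, top: 0, width: 800, height: 600 },
  { name: "acceptButton", left: 300, top: 400, width: 200, height: 50 }
];

// Return the topmost component containing the point (x, y), or null.
function hitTest(components, x, y) {
  for (var i = components.length - 1; i >= 0; i--) {
    var c = components[i];
    if (x >= c.left && x < c.left + c.width &&
        y >= c.top && y < c.top + c.height) {
      return c;
    }
  }
  return null;
}

// Translate a click into canvas coordinates and dispatch it by hand.
// Assumes `canvas` is the full-page canvas element in a browser.
function attachClicks(canvas, components, onClick) {
  canvas.addEventListener("click", function (event) {
    var rect = canvas.getBoundingClientRect();
    var x = event.clientX - rect.left;
    var y = event.clientY - rect.top;
    var target = hitTest(components, x, y);
    if (target) {
      onClick(target, event);
    }
  });
}
```

Walking the list from the end means the component painted last wins, which matches what the user sees. Hover, focus and keyboard events would each need the same treatment.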

So you might wonder why anyone in their right mind would suggest such a thing. It’s because it offers maximum flexibility. Sure, throwing out everything a modern browser gives us is hard, but think of the other side: no longer beholden to the whims of the W3C and browser makers, you can develop entirely different approaches to rendering GUIs. To give a few examples:

You could make an engine which could translate Android or iOS applications into web applications (and thus reverse what everyone is currently trying to do). My example implements drawable shapes similar to how Android does it:
```
new Shape({
  solid: {
    color: "#F5F5F5"
  },
  stroke: {
    color: "#DDDDDD"
  },
  corners: {
    bottomLeftRadius: 10,
    bottomRightRadius: 10
  }
})
```
That wasn't too hard to put together, but the sky's the limit here on how it could work.
You can change browsers in ways that wouldn’t be possible otherwise. For example you could implement the LaTeX text rendering algorithms which would give you superior text-wrapping and hyphenation, justification and kerning.
A lot of what we do now could be generated: you can make your own layout languages, generate javascript plumbing code for ajax and events and abstract away having to think about resources. (We tend to think this is impossible but Android manages to do it just fine)
This freedom opens up rendering options we never really had. For example imagine a comic speech bubble:

In traditional web development you would implement this using a whole bunch of images thrown together or really crazy CSS. (like this) In Canvas it’s a formula:
```
ctx.moveTo(l, t+a);
ctx.arcTo(l, t, l+a, t, a);
ctx.lineTo(r-a, t);
ctx.arcTo(r, t, r, t+a, a);
ctx.lineTo(r, b-a);
ctx.arcTo(r, b, r-a, b, a);
ctx.lineTo(l+40, b);
ctx.lineTo(l+15, b+15);
ctx.lineTo(l+20, b);
ctx.lineTo(l+a, b);
ctx.arcTo(l, b, l, b-a, a);
ctx.lineTo(l, t + a);
```
That may look complicated, but it’s a lot more flexible. There are only a few low-level, primitive operations you need to learn.
Graphics like this are re-usable. That speech bubble is easy to restyle (changing its fill style) and transform (adjusting its dimensions) and I can copy and paste it into another project. With an SVG backend you could potentially export things directly from an editor.
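
To show what that re-use might look like, here is a minimal sketch that wraps the speech bubble formula in a function. The names `speechBubblePath` and `drawSpeechBubble` and the `box` and `style` parameter shapes are hypothetical; the sketch also adds `beginPath` and swaps the final `lineTo` for `closePath`, which the snippet above leaves out.

```javascript
// Trace the speech-bubble outline on any object that exposes the Canvas 2D
// path methods. l, t, r, b are the left, top, right and bottom edges and a
// is the corner radius. The tail offsets (40, 15, 20) come from the snippet.
function speechBubblePath(ctx, l, t, r, b, a) {
  ctx.beginPath();
  ctx.moveTo(l, t + a);
  ctx.arcTo(l, t, l + a, t, a);   // top-left corner
  ctx.lineTo(r - a, t);
  ctx.arcTo(r, t, r, t + a, a);   // top-right corner
  ctx.lineTo(r, b - a);
  ctx.arcTo(r, b, r - a, b, a);   // bottom-right corner
  ctx.lineTo(l + 40, b);          // start of the tail
  ctx.lineTo(l + 15, b + 15);     // tip of the tail
  ctx.lineTo(l + 20, b);          // end of the tail
  ctx.lineTo(l + a, b);
  ctx.arcTo(l, b, l, b - a, a);   // bottom-left corner
  ctx.closePath();                // straight line back to the start
}

// Restyling and resizing are just different arguments.
function drawSpeechBubble(ctx, box, style) {
  speechBubblePath(ctx, box.left, box.top,
                   box.left + box.width, box.top + box.height, style.radius);
  ctx.fillStyle = style.fill;
  ctx.fill();
  ctx.strokeStyle = style.stroke;
  ctx.stroke();
}
```

In a browser this could be called as `drawSpeechBubble(canvas.getContext("2d"), { left: 20, top: 20, width: 200, height: 80 }, { radius: 10, fill: "#fff", stroke: "#333" })`, and the same function can be dropped into another project unchanged.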

And I suspect given a full-blown framework with lots of features we’d see ways of doing things we couldn’t imagine before.

So where do we go from here? Well the truth is I don’t really have much drive to finish this project. I already know all the standard web technologies, and UIs are mostly a means to an end for me. I’d rather focus on real coding. But maybe some day we’ll see some radically different approaches to building UIs for the web.

This is the final post in this series. The previous 4 were: WebRTC, Non-RESTful APIs, Cloud IDEs and Static Hosting.