2015-07-03 19:22 Why Go and Rust are Competitors

This is a post in response to http://dave.cheney.net/2015/07/02/why-go-and-rust-are-not-competitors.

It's a popular idea that Go is a programming language suitable for a particular set of problems (web services, command line applications) but not others (games, embedded development) where a language like Rust or C++ would be more appropriate. I think this notion is misguided and I will attempt to explain why in this post.

Go is Evolving

First, it's important to distinguish between Go as it exists in its current form and Go as it may exist in the (perhaps near) future. For example, at one time Go wasn't a particularly good language for Windows development:

Go compilers support two operating systems (Linux, Mac OS X) and three instruction sets.

http://web.archive.org/web/20100310145452/http://golang.org/doc/install.html

And now Go works great on Windows. In fact Go is one of the most capable cross-platform languages - as easy to use as Java without the need for an installed runtime.

A more recent example is mobile development. With 1.5 it is now possible to create cross-platform (Android + iOS) native mobile applications, albeit with limited API support. Mobile development was not part of the original goal for the language, but now it stands ready to do something which very few other languages have been able to achieve - work equally well on Android and iOS.

So Go is a language that has proven itself more than capable of expanding its scope - crucially without compromising its mission. When considering its competitiveness we have to keep this in mind and ask slightly different questions: Is there something about Go's fundamental design and vision that prevents it from being appropriate for a particular set of problems where Rust would make more sense?

Control

Go and Rust embody two very different methodologies about how to write software. Go has a runtime which manages memory and handles task scheduling. Because of this the language can be kept quite small and its type system is minimalistic: structured types, with interfaces for dynamic dispatch.

Rust does not have a runtime:

Requiring a runtime limits the utility of the language, and makes it undeserving of the title "systems language". All Rust code should need to run is a stack.

The Rust Design FAQ

Instead the language focuses on enforcing memory safety through its type system. Rust programmers (like C or C++ programmers) are required to think carefully about the allocation and deallocation of memory. (consider Borrow) Similarly, thread management and scheduling are manual. Notably Rust removed lightweight thread support because its developers couldn't get it to perform well.

This difference is fundamental, but the superiority of the Rust approach should not be granted by Go programmers. It's not at all obvious that managing your own memory and scheduling your own tasks always leads to superior performance.

In fact, for a certain set of problems, it will almost certainly lead to worse performance. For example consider a web server. A web server fields many requests by reading the HTTP payload, performing some work (perhaps hitting a database or compiling a template) and then writing it back out again. As programmers we tend to think of performance for a program like this in terms of its raw components: can we speed up reading and writing the HTTP payload, can I use less memory when pulling data out of the database, etc...? By making each request as fast and small as possible we can make a faster web server.

But this gets things exactly backwards. The chief problem for modern programmers is not getting our program as small as possible, rather it's the full utilization of the resources available to us. You have a server with a vast ocean of memory and dozens of cores. Are you using these resources?

That small, fast version of the program you put so much effort into optimizing may indeed serve a single request very quickly, but a web server is intended to handle multiple requests concurrently. You'd be better off using more threads and memory to perform more requests at the same time:

|-request-1-|-request-2-|-request-3-|
vs
|-----request-1-----|
|-----request-2-----|
|-----request-3-----|

Unfortunately task scheduling is hard and once you start going down this route you realize that you'd really like a unified approach to the problem. If I use epoll for my HTTP sockets, I probably also want to use it for file or database access. But how do you construct a standard library that allows you to do this?

Before long you will find yourself recreating everything that Go provides out of the box.

Problem Domains

So perhaps Rust isn't well suited for massively scaled web servers, but what about other problem domains? Transforming a simple program into one that fully utilizes the resources on your computer is difficult. Is going in the other direction just as hard?

I don't think so.

First of all the need for a truly simple program is not nearly as great as many people imagine. One of the most successful games in history (Minecraft) was written in Java - a language not exactly known for its simplicity or small memory footprint. Many games increasingly utilize a hybrid approach to their design: a game engine written in C++, but most of the actual game code written in a higher level language like Lua.

Really the issue here isn't that Go is unsuited for this task - any complex game engine will involve similar resource utilization problems, and Go does provide the levers needed to reach levels below the language - the real issue is one of tooling. Companies use these game engines because they already work. Rolling a brand new one would be a massive investment, but it's not impossible.

And Go has gotten better at playing well with other platforms. What would be wrong with combining Go with the game engine instead of a scripting language?

But even the smallest platforms aren't out of reach. With the changes in 1.5 Go is staged to provide much more flexibility in how its runtime is implemented. The garbage collector is already much more friendly to realtime applications, but perhaps Go could include a pluggable or tweakable garbage collector and task scheduler which makes sense for the particular platform you are targeting.

In other words, in its current form Go may not be appropriate for these platforms, but it's not because the methodology is flawed, it's because the compiler and runtime technology is not yet intelligent enough. But in the future it may be - and given recent history - it seems reasonable to expect it to happen.

Computers Aren't Simple

There's also another side to this debate which is often lost in the discussion. The decisions Go makes are already decisions that were made a long time ago by the operating system. Rust may not have a garbage collector or task scheduler but Linux does. All of your memory access is virtual and the operating system has to keep track of that memory. (who owns which segments, when they can be freed, etc...) In the same way everyone already runs their software on an operating system which implements thread preemption at least at the process level. You're getting a sophisticated task scheduler merely by running your program in Linux.

The truth is that the battle Rust is fighting was lost a long time ago. Intelligence is deeply embedded throughout the whole stack. Optimizing a simple computer was hard enough when computers were dumb (when they largely matched their abstraction), but optimizing a smart computer is nearly hopeless when it was designed to make typical ways of operating fast.

Consider RAM: theoretically "random access", but in practice multi-level caching means nearby access is faster than far away access. Or SSDs, whose controllers already implement log-structured storage.

There are many more examples, but the point is that with each additional smart layer, optimization becomes much more difficult - and therefore you are better off using the simplified abstraction and letting the compiler and runtime worry about the messy details.

Rust and Go do Compete

Contrary to popular belief Go and Rust already compete in many problem domains. Rust was created as a language to implement a web rendering engine. There's no good reason Go couldn't be used for this task, and Go programmers shouldn't cede this claim.

The languages are radically different in design - they embody different approaches to writing software. This divide is not domain specific - there isn't a set of problems where Go makes sense, and a different set of problems where Rust makes sense. Rather the divide is philosophical (or ideological) and for that reason probably intractable.

Like many programmers I try to view the world in clear, black and white rationality. I see a problem, see a solution, and have a difficult time imagining how someone could come up with such a radically different solution to the same problem.

One way around this is to imagine that perhaps the reason for the different solutions was because my problem was distinctly different from someone else's. In our case perhaps I was writing a web app, and someone else was building an embedded app. But this line of thinking is a trap.

Our differences may be rooted in different circumstances, but they don't have to be. Ideological divides just aren't that simple. Go and Rust overlap and compete, both sides of the debate have rational proponents, and there's no obvious, irrefutable, objective argument for the superiority of one over the other. The debate between the two (like any philosophical debate) is worth having but we shouldn't pretend like both sides of the debate actually agree.

tl;dr

Programming is hard and Go makes it easier. The trade-off for this ease of use is loss of control. At the moment there is some cost for that trade-off in terms of performance - a cost that seems entirely worth it to me. In the long run a combination of Moore's law and compiler and runtime improvements will see that cost diminish. Perhaps one day the control deemed so necessary by C++ developers will seem as quaint as the techniques needed to write software for early consoles.

2014-04-21 22:10 How to Build Things With Go: Tries

I'm working on a new series of video tutorials for how to build things with Go. The first one is on Tries:

Source Code: github.com/calebdoxsey/tutorials

2014-02-10 21:28 Rethinking Web Development: Canvas UI

So here’s a sadly common story:

A startup with an existing web application is seeking to get into the mobile space by producing both an Android and iOS app. Seeing as they already have the necessary skills to produce a web application and they’d rather not have to learn, build and maintain another two versions of their application, they attempt to create a mobile application using web technologies via PhoneGap.

Almost without fail that first version of their application is a disaster. Slow, clunky, unstable and riddled with bugs, they’ll spend the next 6 months trying to get it to work in some sensible way. Eventually they’ll admit defeat, scrap the old app, and build a new, native one from scratch.

And that’s pretty depressing. One of the supposed advantages of the web was that we could finally do away with so many operating system idiosyncrasies. No longer would we have to make 3 versions of every piece of software (windows, mac, unix) and instead we could make one application which could run everywhere. And then after perhaps half a decade, we threw all that out and recreated the multiplicity of platforms we thought we’d gotten rid of.

Actually it’s worse than that. A mere 3 versions of the application is a pipe dream. In reality you also have to consider all the various devices, in all their various resolutions and capabilities, and all the various versions of the operating systems. This list of Android devices is the stuff developer nightmares are made out of. Serious Android shops really do just have 100s of mobile phones to test on. (because it’s the only way to be sure it actually works for your users)

I can’t believe I’m about to say this, but it makes the almost 10 years we spent working around IE6 bugs seem easy by comparison.

Looking back at the one Android app I built I did come to appreciate one thing about it though. I think perhaps our original inclination about building one app for multiple platforms wasn’t entirely hopeless - it was just in the wrong direction.

Why do we think that HTML, CSS and Javascript represent the ideal set of technologies to build user interfaces? When you actually consider their origins and the original purpose for which they were designed, they are a remarkably poor fit for what we ended up doing with them. HTML is a language to represent hypertext - that is to say a language that gives some structure to basically textual content, and to provide links between different sets of textual content. It’s designed for an encyclopedia. But we aren’t building encyclopedias, at least not most of us, and how much of a modern web application even resembles textual content?

Now I’m not sure I can actually make this argument - it’s the kind of thing that’s really hard to recognize until you see an alternative - but I’ll give it a shot anyway. Here are a few of the problems I see:

  • Fatally flawed by its original ties to the hideously complex SGML, its far too forgiving nature for malformed content and its dearth of elements to represent concepts we actually work with, HTML is a hack we put up with because we have to. As a technology it failed to deliver on its ultimate goal: the Semantic Web. (I think we can thank Google for managing to give structure to a system which was supposed to have it, but never actually did)
  • When it comes to web applications the separation of the semantic meaning of content (HTML) from its presentation (CSS) and interactive function (Javascript) is not actually all that useful. Most of what makes a user interface has no real meaning outside of its intended function inside that interface. Structuring buttons as buttons, lists as lists, tables as tables and text as text may make the life of a data-crawling robot easier, but the end-user doesn’t ultimately care about such things.

    Proof of this can be seen in this basic point: nobody stores their structured, semantic data as HTML. If I query Twitter for a Tweet I get back a JSON object that describes everything from its title to its author and date of publication. I take this object, transform it into HTML and CSS, and include it in my page. If HTML delivered on what it promised, APIs like Twitter’s would return content as HTML which I could just drop into my page. But it doesn’t deliver and HTML is a cumbersome and inadequate tool to represent structured data.

    And yet web developers, suckled as they were at the teat of Designing with Web Standards, are obsessed with the pointless pursuit of semantic value.

  • WYSIWYG editors, though popular for a time (with products like Dreamweaver), ultimately fell by the wayside. Probably most web designers today know HTML & CSS, and perhaps even work directly in them. Maybe that’s a good thing, but something about it just seems wrong. If a designer is comfortable with Photoshop why shouldn’t he be able to export what he makes in a usable format?

    I spent many years translating Photoshop mocks into HTML & CSS. I don’t do it anymore, but that job still exists. Why? Shouldn’t a computer be able to do this?

  • CSS is not easy. It’s deceptive, because some things are trivial. But the layout model, coupled with its cascading nature, means that full-fledged UIs can be surprisingly difficult to implement. Everyday examples include centering content, implementing float & overflow properly, trying to use the incredibly complex z-index algorithm, and - of course - working around subtle browser inconsistencies.

    Now if you don’t agree with those examples, consider the fact that it took years for browsers to finally mostly pass the Acid3 test. If the people who write browsers can’t implement these things properly, how much hope do you have of using them properly?

  • CSS is, in general, not very reusable. Have you ever tried to take a component you wrote in one site and put it in another? Cascading rules lead to complex, surprising and unpredictable interactions among components, and there doesn’t seem to be any general approach to writing CSS that solves these problems. (though certainly many have tried) The best solution I’ve seen is a style guide, but the truth is most web sites are a hodge-podge of 10,000ish line CSS files that no one can really grasp.

  • Newish web development tools are remarkably conservative. LESS, SASS and all their cousins are languages designed to produce CSS. But they’re still basically CSS in how they work. No one is re-inventing the layout model and about the riskiest thing you see are automatic polyfills or image substitutions with data-uris. The same can be said for HTML generation languages (with most looking like Markdown and friends - glorified variable substitution languages) and even most Javascript competitors like Coffeescript.

    It seems to me that a healthier ecosystem would see more radical offerings along the lines of the Google Web Toolkit. Not that there haven’t been attempts to pull this off, rather the attempts are generally of very low quality. Maybe a better base set of technologies would make things like this easier to pull off.

I could probably keep going, but at this point I want to introduce a radically different way of writing web applications which I called Canvas UI.

Canvas UI

The idea is this: rather than using HTML & CSS as the bedrock of frontend web development we use a more rudimentary graphical API (Canvas, WebGL or SVG), build frameworks on top of that base and develop our actual application using an entirely different set of technologies.

Over the last couple weeks I threw together an example which implements the same functionality as my previous WebRTC example. Source code is available here. Though crude, buggy and clearly not a viable solution, I think there’s enough here to show what it could look like.

The application is drawn on a large, full-page canvas, with the only HTML & CSS being used to create that canvas. All drawing is done via Javascript in an OO class-based component model like this:

var incomingTitle = new Text({
	text: "Incoming",
	fontFamily: FONT,
	fontSize: "20px",
	color: "#333",
	left: function() {
		return this.parent().left() + (this.parent().width() / 2 - this.width() / 2);
	},
	top: function() {
		return this.parent().top() + (this.parent().height() / 2 - this.height() / 2) - 40;
	}
});
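
To make the function-valued properties above a little more concrete, here is a minimal sketch of how such a component could resolve them. This is not the framework from the example; Component, _resolve and the page/title objects are hypothetical names, and the only idea being shown is that a property is either a plain value or a function called with the component as this:

```javascript
// Hypothetical minimal component: each option is either a plain value or a
// function that is evaluated lazily with `this` bound to the component.
function Component(options, parent) {
  this._options = options;
  this._parent = parent || null;
}

Component.prototype.parent = function() {
  return this._parent;
};

// Look up an option, calling it if it is a function.
Component.prototype._resolve = function(name) {
  var value = this._options[name];
  return typeof value === "function" ? value.call(this) : value;
};

// Expose left(), top(), width() and height() as methods.
["left", "top", "width", "height"].forEach(function(name) {
  Component.prototype[name] = function() {
    return this._resolve(name);
  };
});

// A 200x100 page with a 50x20 title centered horizontally inside it.
var page = new Component({ left: 0, top: 0, width: 200, height: 100 });
var title = new Component({
  width: 50,
  height: 20,
  left: function() {
    return this.parent().left() + (this.parent().width() / 2 - this.width() / 2);
  },
  top: function() {
    return this.parent().top() + (this.parent().height() / 2 - this.height() / 2) - 40;
  }
}, page);
```

Because positions are recomputed on every call, resizing the parent and repainting would be enough to move the child. A real framework would probably want to cache these values per frame.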

Not for the faint of heart, this is a no-frills approach to web development. To list just a few of the things you’d need to pull this off:

  • Canvas has no DOM. Objects are painted to the screen and have no other existence outside of what you give them. Therefore if you want to implement events you will have to do them entirely manually. (That is to say, hook them to the body and calculate the position of your GUI elements to determine whether or not a click fell on them)
  • Some graphical elements are quite complex in their implementation: for example a large block of text. You won’t get scrollbars, multiline layout, floating images (or other content) or the kind of CSS positioning trickery you may be accustomed to: line-height, text-overflow, indentation, padding, margins, … (unless of course you choose to implement them)
  • Forms will have to be completely rethought. None of those controls will exist anymore.
  • Many of the everyday tools web developers use won’t be all that useful anymore. Inspecting a page won’t tell you much, and you can’t go adjusting CSS rules on the fly.
  • For that matter accessibility is pretty much shot too… (though maybe you could work around this by dumping your textual content to the page in a format screen readers could figure out)
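
As an illustration of the first point in that list, manual event handling mostly boils down to hit testing. The sketch below assumes each component can be described by a simple bounding box and that later entries in the array are painted on top; hitTest, dispatchClick and the shape of the component records are made up for the example:

```javascript
// Find the topmost component whose bounding box contains the point (x, y).
// Components are assumed to be listed in paint order, so we search backwards.
function hitTest(components, x, y) {
  for (var i = components.length - 1; i >= 0; i--) {
    var c = components[i];
    var inside = x >= c.left && x < c.left + c.width &&
                 y >= c.top && y < c.top + c.height;
    if (inside) {
      return c;
    }
  }
  return null;
}

// Route a click at canvas coordinates (x, y) to the component that was hit.
// Returns the component, or null if the click landed on empty canvas.
function dispatchClick(components, x, y) {
  var target = hitTest(components, x, y);
  if (target && typeof target.onClick === "function") {
    target.onClick({ x: x, y: y, target: target });
  }
  return target;
}
```

In a browser you would call dispatchClick from a single click listener, after converting the event's clientX and clientY into canvas coordinates (for example by subtracting the values from canvas.getBoundingClientRect()).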

So you might wonder why anyone in their right mind would suggest such a thing. It’s because it offers maximum flexibility. Sure throwing out everything we have given to us with a modern browser is hard, but think of the other side: no longer beholden to the whims of the W3C and browser makers you can develop entirely different approaches to rendering GUIs. To give a few examples:

  • You could make an engine which could translate Android or iOS applications into web applications (and thus reverse what everyone is currently trying to do). My example implements drawable shapes similar to how Android does it:

    new Shape({
      solid: {
        color: "#F5F5F5"
      },
      stroke: {
        color: "#DDDDDD"
      },
      corners: {
        bottomLeftRadius: 10,
        bottomRightRadius: 10
      }
    })

    That wasn't too hard to put together, but the sky's the limit here on how it could work.

  • You can change browsers in ways that wouldn’t be possible otherwise. For example you could implement the LaTeX text rendering algorithms which would give you superior text-wrapping and hyphenation, justification and kerning.

  • A lot of what we do now could be generated: you can make your own layout languages, generate javascript plumbing code for ajax and events and abstract away having to think about resources. (We tend to think this is impossible but Android manages to do it just fine)

  • This freedom opens up rendering options we never really had. For example imagine a comic speech bubble:

    In traditional web development you would implement this using a whole bunch of images thrown together or really crazy CSS. (like this) In Canvas it’s a formula:

    ctx.moveTo(l, t+a);
    ctx.arcTo(l, t, l+a, t, a);
    ctx.lineTo(r-a, t);
    ctx.arcTo(r, t, r, t+a, a);
    ctx.lineTo(r, b-a);
    ctx.arcTo(r, b, r-a, b, a);
    ctx.lineTo(l+40, b);
    ctx.lineTo(l+15, b+15);
    ctx.lineTo(l+20, b);
    ctx.lineTo(l+a, b);
    ctx.arcTo(l, b, l, b-a, a);
    ctx.lineTo(l, t + a);

    That may look complicated, but it’s a lot more flexible. There are only a few low-level, primitive operations you need to learn.

  • Graphics like this are re-usable. That speech bubble is easy to restyle (changing its fill style) and transform (adjusting its dimensions) and I can copy and paste it into another project. With an SVG backend you could potentially export things directly from an editor.
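
To show what that re-use might look like, the speech bubble formula can be wrapped in a function that takes its edges and corner radius as arguments. This is only a sketch: traceSpeechBubble and createRecordingContext are hypothetical names, the tail offsets are copied from the formula above, and the recording context is just a stand-in so the path can be inspected outside a browser:

```javascript
// Trace a rounded speech bubble onto any object that offers the Canvas 2D
// path methods moveTo, lineTo and arcTo.
// l, t, r, b are the left, top, right and bottom edges; a is the corner radius.
function traceSpeechBubble(ctx, l, t, r, b, a) {
  ctx.moveTo(l, t + a);
  ctx.arcTo(l, t, l + a, t, a);      // top-left corner
  ctx.lineTo(r - a, t);
  ctx.arcTo(r, t, r, t + a, a);      // top-right corner
  ctx.lineTo(r, b - a);
  ctx.arcTo(r, b, r - a, b, a);      // bottom-right corner
  ctx.lineTo(l + 40, b);
  ctx.lineTo(l + 15, b + 15);        // tip of the tail
  ctx.lineTo(l + 20, b);
  ctx.lineTo(l + a, b);
  ctx.arcTo(l, b, l, b - a, a);      // bottom-left corner
  ctx.lineTo(l, t + a);
}

// Stand-in context that records every call instead of drawing.
function createRecordingContext() {
  var calls = [];
  function record(name) {
    return function() {
      calls.push([name].concat(Array.prototype.slice.call(arguments)));
    };
  }
  return {
    calls: calls,
    moveTo: record("moveTo"),
    lineTo: record("lineTo"),
    arcTo: record("arcTo")
  };
}

var recorder = createRecordingContext();
traceSpeechBubble(recorder, 0, 0, 200, 100, 10);
```

With a real 2D context you would call ctx.beginPath() first, then traceSpeechBubble, then set fillStyle and call fill() and stroke(). Restyling the bubble then means changing those last few lines and nothing else.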

And I suspect given a full-blown framework with lots of features we’d see ways of doing things we couldn’t imagine before.

So where do we go from here? Well the truth is I don’t really have much drive to finish this project. I already know all the standard web technologies, and UIs are mostly a means to an end for me. I’d rather focus on real coding. But maybe some day we’ll see some radically different approaches to building UIs for the web.

This is the final post in this series. The previous 4 were: WebRTC, Non-RESTful APIs, Cloud IDEs and Static Hosting.

2014-01-30 21:05 Rethinking Web Development: WebRTC

Fundamentally web applications are client-server applications. Web developers write code that runs on a server and end users (clients) connect to that server (via a browser using HTTP) to perform tasks. In recent years this rather standard definition of the web application has come under fire. Increasingly code is no longer run on a server (but rather via javascript on the client) and HTTP is no longer, necessarily, the protocol used to communicate between the two machines (SPDY being the rather obvious alternative, but also things like WebSockets which don't really fit in the HTTP bucket).

However, even more dramatic than those two shifts has been the relatively recent introduction of WebRTC. If you're not familiar with WebRTC (the RTC meaning Real Time Communication), it's a technology that allows for peer-to-peer communication. That is to say, end users can communicate with one another directly, without the need for an intermediate server.

It seems to me, at least at this moment, that this is a technology that is generally not well understood and its potential has not been fully realized. WebRTC ought to be a seismic shift in the way we build web applications. It's not yet, but I suspect it will be in a few years.

In this post, the penultimate post in the series, I will give a brief overview of how to use WebRTC and then discuss some of the possible implications for web development.

How to Use WebRTC

Getting started with WebRTC is not easy. Most of the documentation is very confusing (good luck understanding the spec), there aren't a ton of examples out there yet, and, up to this point at least, the technology has been in a constant state of flux. Nevertheless WebRTC is not merely experimental, it's a fully functional technology available in both Firefox and Chrome today.

To underline that point, if you've not seen it, you should see the AppRTC project. This is a video chat app, similar to Skype, implemented almost entirely in the browser (with a small bit of server code) and using peer-to-peer transfer of data. For a mere demonstration, it's surprisingly useful and calls into question all of the applications out there that attempt to implement this functionality using custom Java, Flash or similar installed applications.

But back to the question at hand: how does one use WebRTC?

For the purposes of this tutorial I built a small chat application which consists of two components: a server-side Go application which facilitates the initial signaling process between the two peers and a client-side Javascript application which implements the actual WebRTC workflow. First let's take a look at a high level description of the process.

For this example, suppose we have two end users who want to talk to each other: Joe and Anna.

Joe arrives first, connects to a server, subscribes to a topic and waits for someone else to show up.

Anna arrives next, connects to the same server, subscribes to the same topic and at this point the server tells Joe that someone else connected.

Joe sends an "offer" to Anna (via the server) indicating that he wants to establish a peer to peer connection.

Anna receives the "offer" and sends an "answer" to Joe.

Both Joe and Anna trade ICE candidates. ICE stands for Interactive Connectivity Establishment (described here) and basically represents the various ways the two parties can reach each other.

Finally the connection is made, one of the ICE candidates is agreed upon and Joe & Anna can communicate directly with one another.
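
The steps above can be sketched without any networking at all. What follows is a toy, in-memory stand-in for the server (createRelay and createPeer are made-up names, and the SDP strings are placeholders) that only models who tells whom what. ICE candidates and the real RTCPeerConnection calls are left out:

```javascript
// In-memory stand-in for the signaling server (no sockets involved).
function createRelay() {
  var topics = {}; // topic -> { userId: handler }
  return {
    // Register the user first, then tell everyone else on the topic.
    subscribe: function(topic, userId, handler) {
      var peers = topics[topic] || (topics[topic] = {});
      peers[userId] = handler;
      Object.keys(peers).forEach(function(id) {
        if (id !== userId) {
          peers[id]({ type: "SUBSCRIBED", topic: topic, from: userId, to: id });
        }
      });
    },
    // Forward a message to its recipient, if they are on the topic.
    send: function(msg) {
      var peers = topics[msg.topic] || {};
      if (peers[msg.to]) {
        peers[msg.to](msg);
      }
    }
  };
}

// A peer that follows the offer/answer steps and logs what it receives.
function createPeer(relay, topic, id, log) {
  relay.subscribe(topic, id, function(msg) {
    log.push(id + " got " + msg.type + " from " + msg.from);
    if (msg.type === "SUBSCRIBED") {
      relay.send({ type: "OFFER", topic: topic, from: id, to: msg.from, data: "offer-sdp" });
    } else if (msg.type === "OFFER") {
      relay.send({ type: "ANSWER", topic: topic, from: id, to: msg.from, data: "answer-sdp" });
    }
  });
}

var log = [];
var relay = createRelay();
createPeer(relay, "lobby", "joe", log);  // Joe arrives first and waits
createPeer(relay, "lobby", "anna", log); // Anna's arrival kicks off the exchange
```

After this runs, log holds three entries in order: joe got SUBSCRIBED from anna, anna got OFFER from joe, and joe got ANSWER from anna.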

Code

The (mostly) complete source code for this example can be found in this gist. The server is implemented in Go, the client in Javascript and communication between the server and the client occurs over a WebSocket. WebSockets are actually fairly straightforward to implement using Go and they make it so I don't have to worry too much about storing things on the server. (As a long-polling comet server would require)

When a user first connects, the application will ask them to enter a topic. This topic is how the two peers are connected on the server (they have to enter the same thing). You can think of it like an agreed-upon virtual meeting location. That topic is sent to the WebSocket like so:

function startWebSocket() {
  ws = new WebSocket("ws://api.badgerodon.com:9000/channel");
  ws.onopen = function() {
    ws.send(JSON.stringify({
      topic: topic,
      type: "SUBSCRIBE"
    }));
  };
}

The server receives this connection, sends back the user's ID and subscribes them to the topic: (IDs are generated using a full cycle PRNG)

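// in the connection handler
id := <-idGenerator
out <- Message{
  Type: "ID",
  To: id,
}
for {
  var msg Message
  // stop reading once the client disconnects or sends bad JSON
  if err := websocket.JSON.Receive(ws, &msg); err != nil {
    break
  }
  if msg.Type == "SUBSCRIBE" {
    msg.Data = out
  }
  in <- msg
}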
// in subscribe
topics[topicId][userId] = out
for id, c := range topic {
  if id != userId {
    send(c, Message{
      Topic: topicId,
      From: userId,
      To: id,
      Type: "SUBSCRIBED",
    })
  }
}
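
For the curious, a full cycle generator is one that visits every value in its range exactly once before repeating, which makes it a cheap way to hand out IDs that don't collide. The server's actual generator isn't shown here, so what follows is only a sketch of one common construction, a linear congruential generator whose constants satisfy the Hull-Dobell conditions, written in Javascript to match the client code. The function name and the constants are illustrative assumptions:

```javascript
// Linear congruential generator: next = (a * x + c) mod m.
// With m a power of two, c odd and a % 4 === 1, the Hull-Dobell conditions
// hold and the generator visits all m values before repeating.
// The constants used below are illustrative, not the ones the server uses.
// Note: a * m must stay below 2^53 for the arithmetic to remain exact.
function createIdGenerator(seed, m, a, c) {
  var x = seed % m;
  return function nextId() {
    x = (a * x + c) % m;
    return x;
  };
}

// A tiny modulus so the whole cycle is easy to inspect.
var nextId = createIdGenerator(0, 16, 5, 3);
var ids = [];
for (var i = 0; i < 16; i++) {
  ids.push(nextId());
}
```

Here ids ends up containing each of the numbers 0 through 15 exactly once, in a scrambled order. A real server would use a much larger modulus.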

And now the first user waits for someone else to show up. When that happens the same process is repeated, except this time the first user is informed that the second user subscribed, so he proceeds to begin the process of establishing a peer-to-peer connection and sends the second user an offer (via the server):

// in the WebSocket message handler
case "SUBSCRIBED":
  to = msg.from;
  startPeerConnection();
  sendOffer();
  break;
// in sendOffer
pc.createOffer(function(description) {
  pc.setLocalDescription(description);
  ws.send(JSON.stringify({
    topic: topic,
    type: "OFFER",
    to: to,
    data: description
  }));
}, ...);

The second user sees this offer and sends an answer in reply:

// in the WebSocket message handler
case "OFFER":
  to = msg.from;
  startPeerConnection();
  setRemoteDescription(msg.data);
  sendAnswer();
  break;
// in sendAnswer
pc.createAnswer(function(description) {
  pc.setLocalDescription(description);
  ws.send(JSON.stringify({
    topic: topic,
    type: "ANSWER",
    to: to,
    data: description
  }));
});

In addition to the offer and the answer both users have also been forwarding along ICE candidates:

pc.onicecandidate = function(evt) {
  // a null candidate just signals that gathering is finished
  if (!evt.candidate) {
    return;
  }
  ws.send(JSON.stringify({
    topic: topic,
    type: "CANDIDATE",
    to: to,
    data: evt.candidate
  }));
};

Finally, once the first user receives an answer and an ICE candidate is agreed upon, the two peers are connected. RTC has the ability to stream audio and video, but for this example I used an RTCDataChannel:

pc = new RTCPeerConnection({
  iceServers: [{
    // stun allows NAT traversal
    url: "stun:stun.l.google.com:19302"
  }]
}, {
  // we are going to communicate over a data channel
  optional: [{
    RtpDataChannels: true
  }]
});
dc = pc.createDataChannel("RTCDataChannel", {
  reliable: true
});

Actually sending messages is trivial:

function onSubmit(evt) {
  evt.preventDefault();
  var text = chatInput.value;
  chatInput.value = "";
  var msg = {
    type: "MESSAGE",
    data: text,
    from: from,
    to: to
  };
  onMessage(msg);
  dc.send(JSON.stringify(msg));
}

Implications

There have been two large, and completely divergent trends when it comes to networked applications: the cloud and decentralized, distributed networks. On the one hand companies like Google, Apple and Facebook have been moving all of their users' information and applications to their data centers. Your email, pictures, games, etc... are no longer on your computer, but rather they are accessed over the internet, and stored in the cloud. This comes with a whole host of advantages, but it also comes with a steep cost: Google reads all your emails, Facebook knows everything about your personal life, Apple can track all your movements and although these companies certainly provide a great deal of value with their products, in the end their goal is not to make you a happy customer (because most of the products are free), but to use the information you freely provide them for their own ends. (and mostly just to serve you ads)

At the same time this has been happening, we've also seen the rise of decentralized, distributed networks: from file-sharing applications (like BitTorrent) and streaming content providers (like Spotify) to instant messaging applications (like Skype) and even digital currencies (like Bitcoin). Decentralized networks have no single, governing authority. Data is spread across the network and communication is peer-to-peer.

And though the architecture of the web has always, fundamentally, been client-server, in some ways peer-to-peer architecture represents a more accurate representation of one of the original goals of the web:

The project started with the philosophy that much academic information should be freely available to anyone. It aims to allow information sharing within internationally dispersed teams, and the dissemination of information by support groups.

Now perhaps at one time, the Googles of this world had intended to organize the world's information. But increasingly, the fundamental goal of companies like Google is not merely to organize the world's information, but to own it. You can see this when they killed Google Reader. Google doesn't want you to read blog posts on other servers, they want everyone to use Google Plus as their blog. With peer-to-peer networks it may be possible to undermine this trend. People can own their own content again.

This example demonstrates one of the most obvious use cases for this technology: video, audio or textual chat among peers. But there are other possibilities:

  • The chat application has a far broader usage than most people realize. Of course there are the typical Google- and Facebook-style chat applications, but there are also feedback libraries (like Olark), support applications (like you might see on Comcast's website) and a whole host of multiplayer games.

    Furthermore, WebRTC is secure by default. In this day and age of NSA skepticism, where Google reads all your email and Facebook knows your entire life history, WebRTC is a breath of fresh air. You can finally have your privacy back and still get the robust accessibility and flexibility of a modern web application.

  • BitTorrent has demonstrated the power of a distributed file sharing network. With HTML5 technologies it is possible to build such a network directly in the browser. For example: ShareFest. Could someone implement an in-browser Dropbox (perhaps similar to SpaceMonkey)? Or Mega? One of the downsides of WebRTC is that users must be connected at all times: once their browser closes they are unreachable. But that issue doesn't seem insurmountable, and maybe with a few helper nodes such a system could be sustained.

  • A distributed social network (à la Diaspora) may have a better chance of succeeding if it's just as easy to use as Facebook or Twitter. Storing all that data is challenging (particularly with large media like pictures and video), but the relationships and textual updates are much more realistic storage-wise. Store that data among your (actual) peers and perhaps it could even be fairly reliable.

  • With the introduction of things like WebGL, WebAudio, PNaCl, asm.js and a myriad of other HTML5 technologies, it's possible to build real games that exist solely in the browser. WebRTC makes it so those games can be multiplayer. This isn't a new idea - consider Artillery - but, as far as I know, it's not something which has really been realized yet.

  • One of the reasons internet companies are so successful is that it's very difficult to build something like Facebook. It requires a tremendous amount of capital and large, robust, highly reliable and fast systems can't be put together by just anyone. How many projects have been sunk by their inability to scale?

    And yet peer-to-peer networks have the potential to scale in a way the cloud never could. A signaling server can handle an enormous amount of load with no issues. If the vast majority of your application can be re-written to run purely on the client, most of your server-side concerns evaporate.

    Of course, most web applications can't be moved entirely to the front-end. Nevertheless this isn't an all-or-nothing game. The more work you can push to your end users, the less work you have to do on your own machines. Is it possible to do some of that background processing in a browser? It might be more possible than you imagine: modern browsers have threads, typed arrays, full-blown databases, offline capabilities, file-system access, etc.

And those are just a few of the things that came to mind in the last week. It's an exciting time to be a web developer and it'll be interesting to see just what turns up in the next few years.

If you managed to make it this far: thanks for sticking with it. Stay tuned for my final post in this series, where I will propose an even more radical change to how we build web applications.

2014-01-27 08:30 Rethinking Web Development: Non-RESTful APIs

Software development is a strange industry. Applications are hard to build: they take months of work, have lots of moving parts and they're extremely risky - often being built under tight deadlines in competitive markets. And they're usually built by surprisingly small teams of developers; developers who probably learned a very large chunk of the knowledge needed to build the application as they built it. It's kind of amazing that anything we build works at all.

And so I'm constantly surprised that we obsess over things which don't really matter. Developers will have vigorous arguments over everything from arbitrary stylistic choices (tabs vs spaces, where to put braces, ...) to tool choice (like which editor to use) and naming conventions. Sometimes it seems like the amount of energy we expend discussing these things is directly proportional to their level of arbitrariness.

Actually it's worse than that. Have you ever met an Architecture Astronaut? They take your crude, simple but working project and transform it into a more "correct" version. It's not actually functionally any different (one hopes), but it now has umpteen levels of indirection, factories, interfaces, powerful abstractions, design patterns and a myriad of other features you never really knew you needed. It also has the quality of being basically unmaintainable because you don't actually understand it anymore.

A remarkable demonstration of this can be seen in the Fizz Buzz Enterprise Edition. It's funny because it's not all that different from reality.

Actually this tendency to transform arbitrary decisions into moral categories and then hang them as a millstone around the neck of others is a much broader phenomenon. At its root is perhaps the need for markers to indicate who is in the group and who is out of it, and then, once we have those markers established, the need to assert our mastery (as a game of one-upmanship). This tendency is only exacerbated by the challenges we're presented with: we aren't actually confident that we know what we're doing and the systems we build are often incredibly difficult to get working. Rather than address the hard problems, we isolate the easy things and focus on them. (Maybe I can't build a distributed database, but I can tell you that your function should start with a capital letter.)

So I thought I might tackle one of these software shibboleths: the RESTful API.

REST

REST is a complicated architectural style described here. I'm not particularly interested in tackling the actual academic meaning of the term, but rather what it has become as a popular buzzword implementation. I have in mind the focus on proper HTTP verbs (GET, POST, PUT, DELETE), resource URIs, and thinking almost entirely in terms of the representation and transfer of resources.

For example suppose we are building a RESTful API for email. We might have a URL structure like this:

{GET|POST|DELETE} /emails/{EMAIL_ID}
{GET|PUT} /emails

GET /emails lists the most recent emails, PUT /emails creates a new one, GET /emails/{EMAIL_ID} gets a particular email, POST /emails/{EMAIL_ID} updates an email, DELETE /emails/{EMAIL_ID} deletes an email.

So here's why I don't like this approach:

State

REST focuses on state, but state is not the primary building block of an application. What does it mean to "create" an email? Does that mean it sends it? How can you "update" or "delete" an email? Suppose you have 1,000,000 emails... listing them all doesn't really work anymore does it? Consider this approach to sending emails:

PUT /emails/{EMAIL_ID}/send

A URL like this doesn't make sense. It would be read "Create a 'send' record for email {EMAIL_ID}". What exactly would you send to this endpoint? An empty JSON object? ({}) It's an uneasy fit.

Suppose instead you add a "state" field to your email:

{"id":1234, "subject": "whatever", ... , "state": "unsent"}

And you would update that record and POST the update. This is a better approach from a URL perspective, but it's much messier from an implementation perspective. On the server I have to detect changes to this object and act accordingly (i.e. the state was changed, therefore I will send the email). Do I merely update the record in my database, schedule the email to be sent later, and respond that everything is fine? Or do I wait for the action to be completed in its entirety? Maybe I add additional types of state:

{ ... "state": "sending" }, { ... "state": "sent" }, { ... "state": "bounced" }, ...

But "state" in this sense is not really a property of an email itself, rather it's more a property of our system: I'm currently sending your email, I sent your email, I tried to send your email but it was blocked, ...
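To make the awkwardness concrete, here is a minimal sketch of the two styles side by side. Everything in it is hypothetical: the in-memory `emails` store, the `outbox` list, the function names, and the assumption that a client requests a send by changing "state" from "unsent" to "sending".

```python
# Hypothetical in-memory store and send queue, for illustration only.
emails = {1234: {"id": 1234, "subject": "whatever", "state": "unsent"}}
outbox = []


def update_email(email_id, changes):
    """Resource-style update: the server infers intent by diffing state."""
    old = emails[email_id]
    # Assumed convention: unsent -> sending means "please send this".
    wants_send = old["state"] == "unsent" and changes.get("state") == "sending"
    emails[email_id] = {**old, **changes}
    if wants_send:
        outbox.append(email_id)  # side effect hidden behind a field change
    return emails[email_id]


def send_email(email_id):
    """Action-style call: the intent is the name of the operation."""
    emails[email_id]["state"] = "sending"
    outbox.append(email_id)
    return emails[email_id]
```

In the first function the important behavior (queueing the email) is a side effect of comparing two records; in the second it is simply what the function does.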

Problems like this aren't unusual - they're typical. Modern web applications aren't glorified Wikipedias, they're the desktop applications of yesteryear: described in terms of user workflow and actions, not in terms of the mere transfer of resources.

Caching is Broken

One of the supposed advantages of a RESTful architecture is that it lends itself to caching. Unfortunately those caching mechanisms are notoriously difficult to use properly.

Consider a web application with a lot of JavaScript (which is basically all of them). Somewhere in the HTML for that site the JavaScript has to be included:

<script src="/assets/js/site.js"></script>

That's web dev 101, and it's wrong. For 2 reasons:

  1. Most web applications change frequently. When you change site.js there's no guarantee that your end user will get the latest version the next time they visit your site unless you explicitly make it so your web server adds headers to invalidate the cache.
  2. If you add headers to invalidate the cache every time a user comes to your site, that means they're downloading 100s of KB of script every time they reload (which can be devastating to performance).

The solution is to use a hash as part of the name of the script and add aggressive caching headers:

<script src="/assets/js/site-d131dd02c5e6eec4.js"></script>

Let's just call a spade a spade here: that's a hack. The modern web developer spends an inordinate amount of time optimizing the performance of their applications to work around issues like this. This is because the architecture to which they're beholden is fundamentally flawed. A well designed system makes the typical case easy, not hard.
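For reference, the fingerprinting step usually looks something like the sketch below. The function names, the choice of SHA-256, the 16-character digest length and the exact header values are all assumptions made for illustration; real build tools differ in the details.

```python
import hashlib
import posixpath


def fingerprinted_name(path, content, digest_chars=16):
    """Insert a hash of the file's bytes before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    stem, ext = posixpath.splitext(path)
    return f"{stem}-{digest}{ext}"


def cache_headers(is_fingerprinted):
    """Cache fingerprinted assets for a year; revalidate everything else."""
    if is_fingerprinted:
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    return {"Cache-Control": "no-cache"}
```

Because the name changes whenever the content changes, the long-lived cache entry never goes stale; the price is that every reference to the script has to be rewritten at build time.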

For more guidance on caching read this article by Google. Speaking of Google, they got so fed up with the slowness of HTTP that they silently replaced it with SPDY on their servers.

Clean URLs

Perhaps you've read this blog post: URLs are for People. Well, I disagree. URLs are not for people. Nobody enters them manually, and rarely do they even bother to look at them. Your domain matters, but outside of that, if you're spending more than a few minutes thinking about how you want to lay out your URLs, you are wasting your time focusing on a part of your system that doesn't actually matter. (And one thing I love about that article is that its two primary examples of bad URLs come from two of the most popular sites on the internet: Google and Amazon...)

HTTP Verbs

If you look at a framework like Ruby on Rails it places a great deal of emphasis on using the correct HTTP verbs. What's bizarre about this is even in the original construction of web servers with simple CGI-based forms, the set of HTTP verbs was not widely supported. GET and POST were widespread, but their lesser-known cousins DELETE and PUT were unreliable. This leads Rails to add code like this:

<input name="_method" type="hidden" value="delete" />
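On the server, something then has to undo the trick before routing. The following is a rough sketch of that step, not Rails' actual implementation; `effective_method`, `ALLOWED_OVERRIDES` and the rule that only POST requests may be overridden are assumptions for the example.

```python
# Verbs a form is allowed to smuggle through the hidden field (assumed set).
ALLOWED_OVERRIDES = {"PUT", "PATCH", "DELETE"}


def effective_method(http_method, form):
    """Return the verb the application should route on."""
    actual = http_method.upper()
    override = form.get("_method", "").upper()
    # Only a POST may pretend to be something else; a GET never should.
    if actual == "POST" and override in ALLOWED_OVERRIDES:
        return override
    return actual
```

So the browser sends a POST, and the framework pretends it received a DELETE.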

So why advocate for a system which doesn't actually work out of the box?

Bi-Directional Communication

RESTful architectures are client-server architectures. All requests originate from the client, and all responses originate from the server. Therefore it is impossible for the server to initiate communication with the client. Sadly almost every application needs server-initiated communication.

For example it'd be great if our email application could tell the client when a new email came through. The RESTful solution to this problem is to poll periodically - a clumsy, inefficient and unreliable process.
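A minimal sketch of such a polling client shows where the cost comes from. The names are hypothetical: `fetch_ids` stands in for a GET /emails request, `on_new` for whatever the UI does with a new message, and `sleep` is injectable only so the loop can be exercised without waiting.

```python
import time


def poll_for_new_emails(fetch_ids, on_new, interval_seconds, rounds,
                        sleep=time.sleep):
    """Ask the server for email ids `rounds` times; report unseen ones."""
    seen = set()
    requests_made = 0
    for _ in range(rounds):
        requests_made += 1  # one full round trip, whether or not anything changed
        for email_id in fetch_ids():
            if email_id not in seen:
                seen.add(email_id)
                on_new(email_id)
        sleep(interval_seconds)
    return requests_made
```

Every round costs a request even when nothing has changed, and a new email can sit unnoticed for up to `interval_seconds` before the client hears about it.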

Alternatives

Perhaps the main alternative to REST when it comes to APIs is RPC, which could be pulled off with any number of mechanisms (AJAX, WebSockets, ...). But RPC is not a magic bullet; it has its own set of issues (which probably led to the creation of REST in the first place). My point is not to offer an architecture which is superior to REST in every way; rather, I think that the application ought to drive the discussion about architecture. If REST works for your application, then by all means use it, but if it doesn't, don't be afraid to use something else.
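As a rough illustration of the RPC style, here is a small dispatcher that routes on the name of an action rather than on a resource and a verb. The message shape ({"method": ..., "params": ...}), the "email.send" name and the handler are invented for this sketch and do not follow any particular RPC standard.

```python
import json

HANDLERS = {}


def rpc(name):
    """Register a function as the handler for the named remote call."""
    def register(fn):
        HANDLERS[name] = fn
        return fn
    return register


@rpc("email.send")
def send(email_id):
    # Hypothetical handler: a real one would queue the email for delivery.
    return {"id": email_id, "status": "queued"}


def handle(message):
    """Decode one JSON request, dispatch it, and encode the reply."""
    request = json.loads(message)
    handler = HANDLERS.get(request.get("method"))
    if handler is None:
        return json.dumps({"error": "unknown method"})
    return json.dumps({"result": handler(**request.get("params", {}))})
```

The same `handle` function could sit behind an AJAX endpoint or a WebSocket; the transport is incidental, and the vocabulary is the application's own actions.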

Too often we measure the quality of an application by its conformity to a set of pre-defined rules - the Thou Shalt Nots of web development - when we should really be treating those rules as suggestions - conventions that someone once found useful in building their own application.