2014-08-25 20:05 How Go is Unique: Static Linking, Composition and Russian Doll Coding

About a month ago I gave a talk at the NYC Go meetup. Here is the video of that as well as the slides and a transcription.




What I was thinking we could do is - this talk's about half an hour, some of it's a little esoteric and not super down to earth - but it may be useful for other people. If you're one of the ones who wants to learn something about Go, like be more hands-on, afterwards we can do a bit of a tutorial or something. I have a bit of a programming problem we can work on if anybody is interested in that, so meet me afterwards and we can do that.

So this talk, the basic idea is that I wanted to show some ways that Go is interesting and different from other programming languages. I don't know if anybody read this article, it was on Hacker News and other places ... this quote is from that article. It says:

I hope my complaints reveal a little bit about how:
· Go doesn't really do anything new.
· Go isn't well-designed from the ground up.
· Go is a regression from other modern programming languages.

And I think this is a very common sentiment in the community. A lot of people feel this way about Go. And to be honest I think a lot of Go developers will wear this sentiment as a badge of honor. Right? Because programming is not just about having the best ideas and working on something crazy new, it's really about getting things done. I think that way, and a lot of other developers do as well, and Go's very much a language of getting things done. It's well designed in that sense, it's very easy to use, and you can write very good, effective code with it.

But I want to say that this is actually not true at all. And actually some of the decisions Go makes are pretty radical, and radical in ways that are surprising. And so I'm going to talk about two of those ways and then I'll go into an actual example of how to write some Go code using them.

So the first example actually has nothing to do with the language per se. I'll give y'all a brief history of computing here. You start with hardware - this is the ENIAC, the original computer that was programmed with switches and cables, so if you wanted to reprogram it, you had to literally unplug things and plug them in. That took a very long time, and so one of the first innovations in computing was software and the stored-program computer.

So this is the von Neumann architecture - modern computers follow the same architecture - and the idea is you store your program in memory along with your data. This means you can program using machine code, but of course as soon as you can do that, you can have a program which makes other programs. That's an assembler for assembly, or a compiler for a language like Go, or C, or any other programming language. That opens up the world of what we can do.

The first thing you'll notice when you go down that avenue and start making more complex pieces of software is that you have code that needs to be reused. And so we introduce libraries. I have a mathematical function I wrote for program A and I want to be able to use it in program B. The simplest way to do that is static linking: I literally take that code and copy it into the other program. I'm reusing it - I'm getting that advantage - but it's actually a copy of the original code.

Now there are two problems you get with that, and probably others... the two basic problems are (1) since you're copying, you end up with a lot of redundant code - and on machines that didn't have a lot of memory, that was a big problem. It's not much of a problem anymore; nobody would really care about extra code, but that was one issue. The other issue is if you want to change it. And that's a bigger problem, even today. To give you an example: OpenSSL - a security library that everybody uses - frequently has bugs which are incredibly important to get fixed. If I have 10 programs and they all use OpenSSL, it would be nice to just update that once and have all of them get the update. And so that's the dynamic linking approach. At an operating-system level this would be DLLs in Windows or SOs in Unix. These are shared code libraries, and the way it works is: I make my program, the compiler inserts, basically, a symbol, and then when the program is run it replaces that symbol with the address of the actual library.

But this is a bigger idea. It's not just an operating system thing. It's an approach to doing software. You see it with compiled virtual-machine languages like Java and .NET. They have dynamic linking in the sense that they have DLLs; Java has jars. And you see it even in interpreted languages: Ruby, Python, Node.js. In other words, what I'm saying is: if I write a program in Python that depends on a library, I need to have that library installed on the machine in order to use my program. That's dynamic linking.

I like to think of it this way. Dynamic linking is, in some ways, the fruit in the garden. And it looks amazing! We say, if we could have this we could be Gods. That's the idea. So there's Eve offering the apple. And we have taken that fruit, but what we didn't realize is that this was Original Sin. This is the curse.

And, maybe you don't agree with me, but to give you an example. I made this slide of things, technologies that people use. And in some ways all of these technologies are examples of ways to handle the problems introduced by dynamic linking.

So these two up here, Chef and Puppet, are ways of doing IT automation, which, in a lot of ways, is installing libraries like OpenSSL. Why do I need to install those on the end system? Because my program depends on them.

These are all dependency management systems: for example bundler in ruby or whatever.

And this one is really interesting: Docker. So with Docker, I can make containers. And the cool thing about containers is I can run them anywhere. Which, if you think about it, is static linking. So it's taking a dynamically linked programming language and turning it into a statically linked one.

So anyway, to get back to Go. Go is a statically linked compiled programming language. So it never took the fruit. I'm not even sure the inventors of the language realized they were doing that on purpose so much as, since Plan 9 didn't do it, they thought, well, why would we? They didn't even think that way. And so Go is very radical in the sense that most programming languages are dynamically linked, but Go is not.

So this is how compilation in Go works. This is important because I think there's a bit of misinformation about dependency management in Go. A lot of people complain that Go doesn't have a good dependency management solution. It doesn't have bundler. It doesn't have npm. I guess the point I'm trying to make is that's not as big a deal as people make it, because since it's statically compiled, you don't have to worry so much: when you put it out and run it, it's gonna work.

This is how it works in Go, to make this point: you have a workspace with these 3 folders: bin, pkg, src. All your source code is stored in the src folder. And so the name of the library - like this github.com/nf/todo - that is the package name when you import it. It's really simple; it's laid out in folders. And then when I run go get, go build and go install, they look in these various folders for things. So if I run go install, it's gonna take this and put it in the bin folder - this is a package main - and if it's not a package main it'll put it in the pkg folder.
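The layout the slide describes, sketched as a tree (GOPATH-era Go; the github.com/nf/todo path comes from the slide, the file name is illustrative):

```
$GOPATH/
├── bin/                      # `go install` puts package-main binaries here
├── pkg/                      # compiled non-main packages land here
└── src/
    └── github.com/nf/todo/   # the import path mirrors the folder path
        └── todo.go
```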

So it's a really simple layout. If you understand this (and if you don't, I'd recommend reading this doc ... it will explain how this works). It's pretty simple; you can understand it in like half an hour. And the reason I think that's important is that dependency management in Go is easy, because "Dependencies are a build time issue, not a runtime issue." If you get your source code in the right place, you're done. So it's just a matter of getting the source code in the right place. There are tools that can do that, like these (gpm, godep, goop). But you don't even need a tool; you could write a bash script that does this. It's actually that simple. And in fact the naive approach of just doing `go get`, which always gets you the latest HEAD version, works pretty well. And the reason it works pretty well is you're doing this at build time. It's not an issue you hit when you say "oh, deploy my code"; it's an issue you hit while you're writing code. Which is exactly the time when you can fix a problem that you find.

But if you're concerned about versioning, there are ways to do it. In fact one of the ways is called vendoring, and that's where you actually copy the code. So instead of being github.com/nf/todo, you sort of put it in your own namespace, and so you're sort-of forking their project and making it your own. So that's one approach, but you can also version with these tools. Basically you'll have a file that will list the projects and a version and it will make sure you get that version every time.

So that's dependency management in Go. The bigger point being, though: static linking is great. It makes a lot of these problems way easier to deal with.

So here's the second topic I want to talk about. Rob Pike has this article, which is a great article - you should read it - called "Less is Exponentially More". Just search for that in Google and you'll find it. And he has this quote:

If C++ and Java are about type hierarchies and the taxonomy of types, Go is about composition.

And what he's driving at here - he asks this question: why is it that Go has not been popular with C++ programmers? And I think he discovers, and he writes about it here, that the reason why is something very fundamental to the approach of writing software. And so this, right here:

Traditional object-oriented languages - and this is also true of strongly typed functional programming languages - focus on types as the fundamental building blocks of software. So in Java, C++, C#, and in Haskell and Scala (and I think even probably in newer programming languages like Swift and Rust), the approach you're taking is you think about types as your fundamental building block. So when you start and say "I need to develop a piece of software that does X," you say "what are my types?" That's where you are going to start. You build your type hierarchy and then from there your code will flow out.

Now Go's approach is not like that. It's very different. In the Go approach you're gonna start with, sort of, what do I need to make it do, and then as you write your program your types will emerge. You'll discover the types in the program. And we'll see an example (maybe) of how that works in a second. But I think this is actually a really fundamental disagreement, and it shows up in areas beyond just programming.

So, this is a famous painting called The School of Athens, and the two figures... well, does anyone know who the two figures in the center are? [audience: Plato and Aristotle] Thank you. So yes, Plato and Aristotle. What's interesting about this painting: you'll notice that Plato is pointing upwards and Aristotle downwards. This is, in some ways, not entirely accurate ... I don't want to get into philosophy too much here, but in Plato's philosophy, types - or what he calls forms, or categories in some ways - this idea of form, which is very similar to type, is fundamental. That's the most important thing. And the famous analogy he gives is called the Allegory of the Cave.

The way he poses it is: imagine you have several people who are chained up, and they're staring at a wall in a cave. And behind them is a fire. And then there are people in front of the fire, but behind them, holding up shapes, and they see shadows on the back of the wall. So they see people walking across, but shadows of people. And this is how they've lived their entire lives. So all they see are the shadows on the back wall. That's the idea.

Now imagine if they were to be unchained and to look around and see the things for themselves. They finally saw the things that made the shadows. And imagine furthermore that they left the cave and saw the real world as it was. So, you could say, in a sense, that the wonder and power of higher-order, let's say lazy and pure, functions are unfurled before them in all their splendor. But then they discover that actually, making a program that says "hello world" is really hard. And that'd be true of Haskell.

And that's, I think, in some ways when you discover Haskell, you get that sensation of, you feel that power, I can finally experience the world as it is. I can move on. But I think sometimes we overestimate our ability to analyze the world around us and really see the types as they are.

And I think you see that problem in languages like Java and C++. We come up with a solution and then we try to put it on the problem and it doesn't fit so well. And then we try to make the problem fit into it. Becomes really difficult. So, I think, in some sense, Go's approach is much better, because it makes it so that we can get a solution that's better suited.

But to get back to the original point I made: you don't have to agree with me on that. I'm just saying that is something of a fundamental disagreement. And that you could see how somebody could have a different opinion. That doesn't mean that Go is a badly designed language. It just means its very different. That's the point.

OK, so here's the example of composition. Here's an example program where you have a rectangle, and rectangle has area. And a circle, and circle has area. They both satisfy the shape interface. Notice they don't "implement" the shape interface - you never say that. The shape interface, sort of, emerges from the types. Because they happen to both have the area function, they get the shape interface.

So in Go generally you're writing has-a relationships; that's the composition bit. I say A has a B. I don't say A is a B. This [pointing to interfaces] gets you something like the is-a relationship, but you're not building a hierarchy. It's getting you, sort of, one level up. And so generally the focus is on has-a, whereas in other languages the focus is on is-a.

And to give you another example of where this composition is useful, you can imagine Unix pipes, where one program's output goes to the input of another program. And as long as you make a program which can take that data and do stuff with it, you can swap it out with a different program. So that's Unix pipes, and Go's model is very similar to that.

So the example I want to look at is: Russian Doll Coding. That's what I'm going to call it. And so in russian doll coding, we have little things, and we have bigger things that look like the little things, and have them inside of them, and you sort of build up from there.

What's nice about this approach to coding is it's easy to break down a very complex problem into a much simpler problem. And Go is actually extremely well suited to these types of problems.

So this is a program I worked on a few months ago. I had to create an app which would take a song, or some other mp3, and several other mp3s (small ones) and insert them into it, and do this streaming. And so you could imagine, you have a piece of music, and you want to put ads in it. This is, sort of, the use case. This is the idea. I have mp3 1, mp3 2 & 3. I'm inserting them into the first one and then I want to stream that over HTTP. Go is incredibly well suited for a problem like this. It can do it extremely efficiently and it's really easy to understand. And this is an example of a problem that falls apart for languages like Ruby. Which we'll see in a second.

So here's the basic structure of what that program would look like. Package main, you import some things, and then you make this http HandleFunc. And this is how you write an http handler in Go. It's pretty simple: I just set the content type to audio/mpeg, and then this tmp io.ReadSeeker bit is the bit we're going to implement. But basically http comes with a ServeContent function which makes this really easy. It sets the appropriate headers and it handles Range requests. So for example if I got disconnected and reconnect, it will resume the download - ServeContent will handle that bit for you. And it does so by taking this ReadSeeker interface.

So what is the ReadSeeker interface? It's really simple: it's the Reader interface and the Seeker interface. Hence the name ReadSeeker. And Reader is just, I read data and put it into a slice of bytes. So it's a forward only, kind of idea. And then Seeker means I can move where that starts. So a good example of a ReadSeeker is a file. I can have a file. I can, sort of, move where I want to go into that file, and then I can start reading. And then I can move somewhere else, and start reading again. And seek and read are very familiar if you're used to working with files.

So basically what we want to do is create a Splice function. So I want to create a new ReadSeeker which is the result of splicing several other ReadSeekers into the first one at given offsets. [pointing to function in slide] This would be the sort of function I want to implement. I have my source, this will be the initial file, and then I have the splice map. And so imagine I want to, say, insert one file at 5 seconds and another one at 10 seconds. I would build that. And then that's gonna return another ReadSeeker. This is the Russian Doll bit here. I'm taking a ReadSeeker and returning a new ReadSeeker.

If we were to put that in code, what I'm trying to do is this: imagine I open 3 files, and then I create my tmp, my io.ReadSeeker we saw before, with the call to Splice. Everybody following this? Any questions?

Here's my 5 seconds, here's my 10 seconds, and I want to create this thing. Now notice this is actually a really small bit of code. And I haven't actually implemented Splice yet, but this is pretty, you know I could get started with that.

So here's the way I started first. And this is very similar to how you would do this in something like Ruby. I said, why don't I just use FFmpeg? FFmpeg is a program out there that can do these kinds of things: it can convert audio, it can concatenate audio files, it can split them, etc. So it already exists, why not just call it? You can use os/exec, which will let you call programs from your Go program, and then you can give it its arguments - you can look up online how to do this - and then take that resulting mp3 and serve it over http.

The problem with this is you get lots of temporary files, which can be kinda clunky to manage, because you gotta make sure you delete them after you're done with them, and it ended up using a lot of memory. I was really disappointed in how FFmpeg implemented this. It did not do this efficiently. It turns out that if you have MP3s of all the same format, you can splice them together on the fly without needing to pull the whole thing into memory. You can take pieces of one and put them in the other; the format allows for that. So I was really disappointed that it didn't do that for me.

And of course this last one, back to the dynamic linking problem: it requires FFmpeg on the server. And who wants to put FFmpeg on their web server?

So we want to implement this in Go. There are basically two pieces I need to build. I need a MultiReadSeeker, which takes multiple ReadSeekers and concatenates them together. So I have ReadSeeker 1, ReadSeeker 2, and then up to ReadSeeker N, and this would be the function definition. So I need to be able to build this thing. And basically the way that would work is: if I seek past the first file, I would seek into the second file. And if I read past it, the same way. So it treats multiple files as one big file.

And the other piece I need is a SectionReadSeeker. That is, instead of the whole file I just want a piece of it. And so it treats this bit here in the middle as a file. This one is actually really simple; [pointing to MultiReadSeeker] this guy is a little tricky - there are some edge cases you gotta think about. But I think you can kinda see how one would implement something like that. And the same goes for this [back to SectionReadSeeker]: you can see how, given an offset and a length, I could implement this. Basically the way it would work is: if I read past the end of the section, I return an error (io.EOF) instead of returning data. And then everything that uses Readers knows how to handle that, because they all implement the same interface.

So to, sort of, get back to how we started. I have this original file, and this is the code I would call inside Splice. I would say "NewMultiReadSeeker", "NewSectionReadSeeker", and I give it from 0 to A, that's the first bit, and then I'd say "File 2", and then NewSectionReadSeeker of the first file, A to B, and then f3, and then the end bit. So now I have 5 pieces, I glued them all together.

And then I have this resulting ReadSeeker that does everything I want it to do. So the only things that are left are: well, first of all, I need to handle mp3s properly. An mp3 is made up of headers, an ID3 tag, and then frames in the middle. So basically I created a strip function, which just gives you the middle bit. And what would this return? Like, what would this call? Anybody? [audience: byte array] This would return a SectionReadSeeker, right? This is just a SectionReadSeeker of a file. I just need to process the first bit to know how to do that.

The other thing I need to create is something to turn a time, like 5 seconds, into an offset. That turns out to be really easy if you read the MP3 spec. The frames tell you how long they are, you can find formulas that will do this, and then maybe create new headers. Maybe - you don't have to do that, you can just return the data. The other reason you need the frames is so you don't cut in the middle of a frame.

But the truth is, actually if you just did the dumb thing, it would probably work ok. You'd get like a little glitch in the file, but people probably wouldn't notice.

So to reiterate our final thing, this is the russian doll bit. I have my outer MultiReadSeeker, inside of it I have SectionReadSeekers, and they have pointers to files. That's the idea.

The reason why this is a powerful way to approach programming is I can think about making this [pointing to SectionReadSeeker], as a sort of, really simple small problem. And then I can think about making this [pointing to MultiReadSeeker], as a small problem, and then once I've done that, I've already finished all the work I need to do for the big problem. That's the composition. Not in the sense of having data, but in the sense of, thinking about the problem. It's actually very similar to functional programming, but it's not functional programming. It's kind of a different thing.

So other examples of this kind of thing. You see it all over the place in io. They love their readers in io. So you have a LimitReader which only reads up to a certain limit, and then you have MultiReader, which is like our MultiReadSeeker, combines a bunch of readers together. You have TeeReader, which is kinda cool, because it, like, reads and as it's reading it writes to another thing. Super handy.

You see this idea in compression. I have a reader, I want to compress it, you can feed it to gzip, it creates a new reader, that's now compressed. And anything that uses a reader can use your compressed reader. So it's really simple.

And then there was a great talk at GopherCon where the speaker built web services using this approach - basically implementing the http handler at various levels to build an entire web stack. So this russian doll coding is a super powerful idea. I don't know if there's a better name for it, but you see it all over the place.

So that is basically it.


Any questions about what I talked about, or something completely different? Yeah.

[audience #1: could you explain why it's functional and then it's not]

So in functional programming - take an example like Haskell - because it's pure, you can reason about functions in a way where you can put them together really easily. So you, sort of, see something similar here, where I'm reasoning about this bigger thing, making it smaller and smaller, and then sort of gluing - in functional programming you glue the functions together, and in this you're gluing interfaces together. So that's the difference.

[audience #1: I like it. But I mean, your point about Haskell making easy things hard. But it also tends to make impossible things do-able. So in a sense of, yes it is hard, if you're coming from, hey I just wanna get shit in and out, then yes, it starts to look much more complicated, so you're actually. And this was one of the criticism of a friend of mine made was, and I was curious what your take on it was. The go code you'd write for 4 cores would not necessarily be the same code you'd write for 32 or 64]

So that's tricky, because I think sometimes programmers think, oh, I have more cores, if I use them my code will go faster, and that's not really true, because you have to change the way you think about what your program does in order to take advantage of that. So for example, my code here - I'll go all the way back to this guy - is not thread safe. I could not break this up into pieces and make it work that way. But the point is, it doesn't actually matter, because this is a concurrent program, if you look at the very beginning. Because what if two people stream an mp3? They both get it. Now each one of these pieces is serial and sequential, not multithreaded, but the thing as a whole - my program as a whole - is. And so when it comes to a webserver, it's not actually very important for me to make my, sort of, working code concurrent, because the program already does more than one thing at a time: it's handling multiple requests. That's what's more important.

So I think that, in Go, those things are orthogonal, they're for different reasons. You could make this concurrent. You could use channels and things. I don't think it would make your program go any faster, because this is an io bound problem.

[audience #2: just theoretically though, is what she said right? If you have a problem that's cpu bound, and you had a multicore cpu and you go from a 2 core to a 32 core, do you have to write your code differently?]

I guess what I'm saying is, in this example I would not.

[audience #3: it's a very situational sort of problem. I mean if you're processing a lot of data and you know it's gonna take a lot of time, but it's something you know you could just simply throw at multiple cores, then you would probably write it the same way, and just use more cores. But if you've got something that doesn't necessarily benefit from more cores, if I run this serially it's just as fast]

[audience #2: I guess I'm just asking if you have a problem like that do you have to build it that way. You know, if you're working on a quad core laptop, do you have to write it different when you put it on a xenon processor or something?] [audience #3: not really] [audience #2: so do you have to say how many cores you're gonna ]

Right. So I think the point is that just throwing more cores at a problem actually doesn't usually result in much; for most problems like this, it's not going to do anything for you. But yeah, that can be a real tricky area, and I think that's also a confusion about Go. People sometimes look at Go and say "Oh, it's super parallel," but that's not what it's trying to do, so that's a different sort of area. And Haskell has some tricks for some of that, but I don't think you can just take a problem and say if we add more cores to it, it goes faster. That's just not the way we should think about it.

[audience #4: I guess what. I would say from a compositional perspective you wouldn't necessarily write it much different. You might tweak it slightly, if you were like "I want to optimize it because I have X number of cores, I'm going to optimize this to run that exact number of cores" and, you know, not do anything crazy, but it's really about, does the problem warrant multiple cores. And that's something people run into all the time, is that, when it comes to Go, people say "Oh this is great" what you're saying, let me just throw more cores at it, but that doesn't necessarily fit the problem]

[audience #2: Like if I want to crawl the internet. So I want to pull down 100,000 pages.]

[audience #5: Right, if you're gonna crawl the internet, you're gonna spin up a bunch of goroutines, one for each website, one per crawl pathway, and then whether you run it on one core or 500 cores, if you're spinning up a goroutine for each website, it will automatically parallelize. It's funny because it's an embarrassingly parallel problem, except for storage, but besides that you can do that]

But that is a good example, because 90% of your code is doing the work of fetching this URL and processing it - that bit stays the same - and all you do is say, now go do that 100 times. So, like I said, those two things are orthogonal; they're different, and they go together really nicely. And actually, if I'd spent more time on this, I think the next step is - like I said, this isn't thread safe, and that's kind of an issue you can get if you're not careful - to see how this dovetails nicely into channels. We see it a little with the http bit, but that is, like, the next step: oh, I see why channels are the way they are, because they go together really nicely. Because those are, sort of, two different things; that's why. Any other questions?

[audience #5: Why are you saying it's not thread-safe?]

What I'm saying is, if I had that MultiReadSeeker and I had two threads try to use it, it would seek and you'd start getting the data from the wrong place. And actually, like I said, making that MultiReadSeeker can be tricky, because even if you did it with just one thread - notice how I'm using the same file multiple times - when it goes to the next file, it needs to make sure to seek first, because otherwise it will be starting in the wrong place. There are, sort of, little edge cases like that, and that's probably why MultiReadSeeker doesn't exist in the standard library. It's not super obvious how you implement it, whereas MultiReader is, because you just start reading the next one - it's really simple. So there's a little bit of trickiness there. Basically I was just trying to say that you can't use that one thing in multiple places. Yeah.

[audience #1: I'm, again, full disclosure, I come from haskell, so, and my excitement, initially I was in a dynamic type, so that was the way to ... blah blah blah, since such is the pain when things go too dynamic, but my question is, with regards to the interfaces, how strict are they? Can I wrap shit as something else and force it down an interface?]

I don't think I understand... they're very strict in the sense that you have to implement the method as it's defined.

[audience #1: So you'd define it as strong static or weak static] I'm not sure I'm getting that... [audience #1: to give you an example, let's say I'm writing something in Haskell. Basically, if I set up the interface to be an Int - basically a limited allocation of memory - and I clap on an Integer anywhere in that entire function, boom, the compiler says No! This has to be all Integer or all Int all the way through. In some of the object-oriented languages, yes, you get that in interfaces, but you can trick it into thinking it's getting an any type, or some other way to wrap it so that you can just force it through, where otherwise, technically according to the language, it should not go]

I'm not sure I can answer it, but yes Go does, which is another thing that was interesting about that article: he said that there are no unsafe capabilities in Go, which is completely false. There's an unsafe package that will let you do anything. So you could put data somewhere and say, no, it's now a float, and it would let you do that. That would be unsafe, but you could do it. So yes, you could trick it that way, but as far as the normal language goes, if it's an int - it is an int. It's not like C, it doesn't auto-cast things. You have to be explicit.

[audience #6: what was the first project you use go in?]

Probably a website, just playing around. The first real serious one I did - I worked for a company where we had a deployment system, and it did something very similar to what Docker did. It was that kind of project. Go does a pretty good job if you have to, like, call lots of programs and work with files and stuff. It does pretty well, very similar to Python... Yeah

[audience #7: so someone told me that doing realtime music software with go is going to be a hard problem because the hardware itself wasn't really concurrent underneath, and I still have no idea what he meant by that]

So Go's scheduler is very sophisticated and it uses the cores available and all that - you can set it up, there's a runtime thing. But maybe they're saying that, for example, newer processors can do multiple things at the same time, right, SIMD and all that stuff. The truth is, actually - I wrote an article once about assembly and Go - if you go look at the math library for example, it has tons of assembly, and they have optimized some of those functions. So if you wanted to you could write a function that uses those instructions, in other words do 8 multiplies at once or whatever. It is possible, you'd have to write it in assembly though.

[audience #7: my friend wrote a fast matrix library, to do that stuff, but still I was wondering if there was something with the cores going on. I mean this person probably works a lot closer to actual]

It might be an example, like with Erlang, which gives you much better guarantees. Maybe it's something like that. [audience #7: or like ChucK, which is strongly timed] Because there's no guarantee that this goroutine will run for this much time. It doesn't do anything like that. [audience #7: yeah, yeah ok] It's magic. Any other questions?

[audience #2: so the GOPATH thing you had at the beginning. Am I supposed to. I setup a common area to pull down github like database libraries/drivers. Am I supposed to pull that in my project, just create everything self contained, is that the whole philosophy? When you did go get and go path.]

So there are different approaches to that problem. One is to have one big GOPATH. You would just pull the code down into this folder. Actually you don't have to - you just do go get, and if you have this environment variable it will [audience #2: so is that for your project? Is it project based where you do it? Say if you have 3 different programs you are working on, do you have one GOPATH per project, or] That would be the approach of having one GOPATH for all the projects, which actually is not a bad approach if you can get away with it. It kinda depends - if it's just your company's code, one GOPATH is probably pretty good, because otherwise you're interfering with each other. You should, in all your projects, be using the same version of some library, otherwise you are going to have other headaches. So if you can get away with it, that's nice. But that's not always possible. If you depend on some 3rd party program that depends on a version of a library that you're not using, then you can get a collision. In other words, say there were 2 versions of this task library and I have some program that uses one and another program that uses a different one - well, this is all in one place, so that kinda sucks. So the other approach is to have different GOPATHs. You can set up a GOPATH for one and a GOPATH for the other. [audience #2: and then you'd pull down the libraries, each for each one] Right, and then you would, like, do git checkout and give it the version you wanted. The other tools will do that for you. [audience #2: so you have to run go run todo.go and it does all the other files together, you don't have to do any kind of makefiles or anything like that?] No, that's right. It's all based on the GOPATH. And in fact those other tools, they do things like: you run the command with the name of the tool, and then go run whatever, and what it does is add to the GOPATH - it keeps the dependencies in a local folder. So those tools will do it for you.
You can look at those tools and see how they do it. There are other approaches.

[audience #8: so you talked a little bit about static linking. Obviously with static linking you're going to have bigger size binaries. So how efficient is the linkage tool? How granular can it go down? Function level? Package level?]

Yeah, I'm not sure about that. Go binaries right now, and this is a known issue, are large. Though large... I work on a lot of Java code and our jars are 180 megs, and they're not that big, so anything is an improvement to me. Go binaries are like 10 or 20 megs, so they're not crazy big. I don't know how granular the linker gets. This also falls apart when you're linking with C programs - some of those are dynamic. But for Go code itself I think that is something they're working on, trying to make it not quite as big. You can also use executable compression - upx or whatever. There's a project out there that can do it for Go, and that helps out a bit, maybe 75%... You can also use differential compression when you push out your code, so if, say, 10k of it changed... rsync will do that...

[audience #9: how do you pin the version?]

So the way that most of these work is that there is a file in your project - a godep file - which has in it the list of projects that you depend on and their versions. And you just say godep whatever. Or you can just always use head, and that's risky.
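As a sketch of what that looks like: godep records dependencies in a Godeps.json file in your project, roughly like this (the import paths and revision hash here are made-up examples):

```json
{
	"ImportPath": "github.com/you/yourproject",
	"GoVersion": "go1.3",
	"Deps": [
		{
			"ImportPath": "github.com/some/library",
			"Rev": "0123456789abcdef0123456789abcdef01234567"
		}
	]
}
```

Running the tool restores each dependency to the pinned revision instead of whatever happens to be at head.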

[audience #10: so as regards binary size, just as an answer, something that happens in the go community is, instead of bringing in a whole library, you might just crop out the bit of code that you need, which will be a bit more duplication, but it will get your binary sizes down. You can see this in the standard library: there are certain string functions, parsing integers into strings and stuff like that, or vice versa. There's a whole library that does that, it's in the standard library. But for things like, I want to say the http package, they decided not to take a dependency on the big conversion library and just brought out the two or three functions they needed ... which is pretty legit]

I think that's another interesting thing that's sort of unique about Go. They're not afraid to violate the DRY principle, and they will. For example, you'll see that with "how do I do a map in Go" and they're like: "just use a for loop" and you're like "that's not a map", but it is. It's only 3 lines. Sometimes repeating yourself is ok. You know it's a very small function, so why not just repeat it. You see that in testing a lot, because you don't want to have libraries for your tests.

[audience #7: no one has ever been fired for using a for loop]

Someone should post that on twitter [audience #7: my friend did already so ...] Any other questions?

2014-04-21 22:10 How to Build Things With Go: Tries

I'm working on a new series of video tutorials for how to build things with Go. The first one is on Tries:

Source Code: github.com/calebdoxsey/tutorials

2014-02-10 21:28 Rethinking Web Development: Canvas UI

So here’s a sadly common story:

A startup with an existing web application is seeking to get into the mobile space by producing both an Android and an iOS app. Since they already have the necessary skills to produce a web application, and they’d rather not have to learn, build and maintain another two versions of their application, they attempt to create a mobile application using web technologies via PhoneGap.

Almost without fail, that first version of the application is a disaster: slow, clunky, unstable and riddled with bugs. They’ll spend the next 6 months trying to get it to work in some sensible way. Eventually they’ll admit defeat, scrap the old app, and build a new, native one from scratch.

And that’s pretty depressing. One of the supposed advantages of the web was that we could finally do away with so many operating system idiosyncrasies. No longer would we have to make 3 versions of every piece of software (Windows, Mac, Unix); instead we could make one application which could run everywhere. And then, after perhaps half a decade, we threw all that out and recreated the multiplicity of platforms we thought we’d gotten rid of.

Actually it’s worse than that. A mere 3 versions of the application is a pipe dream. In reality you also have to consider all the various devices, in all their various resolutions and capabilities, and all the various versions of the operating systems. This list of Android devices is the stuff developer nightmares are made out of. Serious Android shops really do just have 100s of mobile phones to test on. (because it’s the only way to be sure it actually works for your users)

I can’t believe I’m about to say this, but it makes the almost 10 years we spent working around IE6 bugs seem easy by comparison.

Looking back at the one Android app I built, I did come to appreciate one thing about it though. I think perhaps our original inclination about building one app for multiple platforms wasn’t entirely hopeless - it was just in the wrong direction.

Why do we think that HTML, CSS and Javascript represent the ideal set of technologies to build user interfaces? When you actually consider their origins and the original purpose for which they were designed, they are a remarkably poor fit for what we ended up doing with them. HTML is a language to represent hypertext - that is to say a language that gives some structure to basically textual content, and to provide links between different sets of textual content. It’s designed for an encyclopedia. But we aren’t building encyclopedias, at least not most of us, and how much of a modern web application even resembles textual content?

Now I’m not sure I can actually make this argument - it’s the kind of thing that’s really hard to recognize until you see an alternative - but I’ll give it a shot anyway. Here are a few of the problems I see:

  • Fatally flawed by its original ties to the hideously complex SGML, its far too forgiving nature toward malformed content and its dearth of elements to represent concepts we actually work with, HTML is a hack we put up with because we have to. As a technology it failed to deliver on its ultimate goal: the Semantic Web. (I think we can thank Google for managing to give structure to a system which was supposed to have it, but never actually did)
  • When it comes to web applications the separation of the semantic meaning of content (HTML) from its presentation (CSS) and interactive function (Javascript) is not actually all that useful. Most of what makes a user interface has no real meaning outside of its intended function inside that interface. Structuring buttons as buttons, lists as lists, tables as tables and text as text may make the life of a data-crawling robot easier, but the end-user doesn’t ultimately care about such things.

    Proof of this can be seen in this basic point: nobody stores their structured, semantic data as HTML. If I query Twitter for a Tweet I get back a JSON object that describes everything from its title to its author and date of publication. I take this object, transform it into HTML and CSS, and include it in my page. If HTML delivered on what it promised, APIs like Twitter’s would return content as HTML which I could just drop into my page. But it doesn’t deliver, and HTML is a cumbersome and inadequate tool for representing structured data.

    And yet web developers, suckled as they were at the teat of Designing with Web Standards, are obsessed with the pointless pursuit of semantic value.

  • WYSIWYG editors, though popular for a time (with products like Dreamweaver), ultimately fell by the wayside. Probably most web designers today know HTML & CSS, and perhaps even work directly in them. Maybe that’s a good thing, but something about it just seems wrong. If a designer is comfortable with Photoshop why shouldn’t he be able to export what he makes in a usable format?

    I spent many years translating Photoshop mocks into HTML & CSS. I don’t do it anymore, but that job still exists. Why? Shouldn’t a computer be able to do this?

  • CSS is not easy. It’s deceptive, because some things are trivial. But the layout model, coupled with its cascading nature, means that full-fledged UIs can be surprisingly difficult to implement. Everyday examples include centering content, implementing float & overflow properly, trying to use the incredibly complex z-index algorithm, and - of course - working around subtle browser inconsistencies.

    Now if you don’t agree with those examples, consider the fact that it took years for browsers to finally mostly pass the Acid3 test. If the people who write browsers can’t implement these things properly, how much hope do you have of using them properly?

  • CSS is, in general, not very reusable. Have you ever tried to take a component you wrote in one site and put it in another? Cascading rules lead to complex, surprising and unpredictable interactions among components, and there doesn’t seem to be any general approach to writing CSS that solves these problems. (though certainly many have tried) The best solution I’ve seen is a style guide, but the truth is most web sites are a hodge-podge of 10,000ish line CSS files that no one can really grasp.

  • Newish web development tools are remarkably conservative. LESS, SASS and all their cousins are languages designed to produce CSS. But they’re still basically CSS in how they work. No one is re-inventing the layout model and about the riskiest thing you see are automatic polyfills or image substitutions with data-uris. The same can be said for HTML generation languages (with most looking like Markdown and friends - glorified variable substitution languages) and even most Javascript competitors like Coffeescript.

    It seems to me that a healthier ecosystem would see more radical offerings along the lines of the Google Web Toolkit. Not that there haven’t been attempts to pull this off, rather the attempts are generally of very low quality. Maybe a better base set of technologies would make things like this easier to pull off.

I could probably keep going, but at this point I want to introduce a radically different way of writing web applications which I called Canvas UI.

Canvas UI

The idea is this: rather than using HTML & CSS as the bedrock of frontend web development we use a more rudimentary graphical API (Canvas, WebGL or SVG), build frameworks on top of that base and develop our actual application using an entirely different set of technologies.

Over the last couple weeks I threw together an example which implements the same functionality as my previous WebRTC example. Source code is available here. Though crude, buggy and clearly not a viable solution, I think there’s enough here to show what it could look like.

The application is drawn on a large, full-page canvas, with the only HTML & CSS being used to create that canvas. All drawing is done via Javascript in an OO class-based component model like this:

var incomingTitle = new Text({
	text: "Incoming",
	fontFamily: FONT,
	fontSize: "20px",
	color: "#333",
	left: function() {
		return this.parent().left() + (this.parent().width() / 2 - this.width() / 2);
	},
	top: function() {
		return this.parent().top() + (this.parent().height() / 2 - this.height() / 2) - 40;
	}
});
Not for the faint of heart, this is a no-frills approach to web development. To list just a few of the things you’d need to pull this off:

  • Canvas has no DOM. Objects are painted to the screen and have no other existence outside of what you give them. Therefore if you want to implement events you will have to do them entirely manually. (That is to say, hook them to the body and calculate the position of your GUI elements to determine whether or not a click fell on them)
  • Some graphical elements are quite complex in their implementation: for example a large block of text. You won’t get scrollbars, multiline layout, floating images (or other content) or the kind of CSS positioning trickery you may be accustomed to: line-height, text-overflow, indentation, padding, margins, … (unless of course you choose to implement them)
  • Forms will have to be completely rethought. None of those controls will exist anymore.
  • Many of the everyday tools web developers use won’t be all that useful anymore. Inspecting a page won’t tell you much, and you can’t go adjusting CSS rules on the fly.
  • For that matter accessibility is pretty much shot too… (though maybe you could work around this by dumping your textual content to the page in a format screen readers could figure out)

So you might wonder why anyone in their right mind would suggest such a thing. It’s because it offers maximum flexibility. Sure, throwing out everything a modern browser gives us is hard, but think of the other side: no longer beholden to the whims of the W3C and browser makers, you can develop entirely different approaches to rendering GUIs. To give a few examples:

  • You could make an engine which could translate Android or iOS applications into web applications (and thus reverse what everyone is currently trying to do). My example implements drawable shapes similar to how Android does it:

    new Shape({
      solid: {
        color: "#F5F5F5"
      },
      stroke: {
        color: "#DDDDDD"
      },
      corners: {
        bottomLeftRadius: 10,
        bottomRightRadius: 10
      }
    });
    That wasn't too hard to put together, but the sky's the limit here on how it could work.

  • You can change browsers in ways that wouldn’t be possible otherwise. For example you could implement the LaTeX text rendering algorithms which would give you superior text-wrapping and hyphenation, justification and kerning.

  • A lot of what we do now could be generated: you can make your own layout languages, generate javascript plumbing code for ajax and events and abstract away having to think about resources. (We tend to think this is impossible but Android manages to do it just fine)

  • This freedom opens up rendering options we never really had. For example imagine a comic speech bubble:

    In traditional web development you would implement this using a whole bunch of images thrown together or really crazy CSS. (like this) In Canvas it’s a formula:

    ctx.beginPath();
    ctx.moveTo(l, t+a);
    ctx.arcTo(l, t, l+a, t, a);
    ctx.lineTo(r-a, t);
    ctx.arcTo(r, t, r, t+a, a);
    ctx.lineTo(r, b-a);
    ctx.arcTo(r, b, r-a, b, a);
    ctx.lineTo(l+40, b);
    ctx.lineTo(l+15, b+15);
    ctx.lineTo(l+20, b);
    ctx.lineTo(l+a, b);
    ctx.arcTo(l, b, l, b-a, a);
    ctx.lineTo(l, t+a);
    ctx.stroke();

    That may look complicated, but it’s a lot more flexible. There are only a few low-level, primitive operations you need to learn.

  • Graphics like this are re-usable. That speech bubble is easy to restyle (changing its fill style) and transform (adjusting its dimensions) and I can copy and paste it into another project. With an SVG backend you could potentially export things directly from an editor.

And I suspect given a full-blown framework with lots of features we’d see ways of doing things we couldn’t imagine before.

So where do we go from here? Well the truth is I don’t really have much drive to finish this project. I already know all the standard web technologies, and UIs are mostly a means to an end for me. I’d rather focus on real coding. But maybe some day we’ll see some radically different approaches to building UIs for the web.

This is the final post in this series. The previous 4 were: WebRTC, Non-RESTful APIs, Cloud IDEs and Static Hosting.

2014-01-30 21:05 Rethinking Web Development: WebRTC

Fundamentally web applications are client-server applications. Web developers write code that runs on a server and end users (clients) connect to that server (via a browser using HTTP) to perform tasks. In recent years this rather standard definition of the web application has come under fire. Increasingly code is no longer run on a server (but rather via javascript on the client) and HTTP is no longer, necessarily, the protocol used to communicate between the two machines (SPDY being the rather obvious alternative, but also things like WebSockets which don't really fit in the HTTP bucket).

However, even more dramatic than those two shifts has been the relatively recent introduction of WebRTC. If you're not familiar with WebRTC (the RTC meaning Real Time Communication), it's a technology that allows for peer-to-peer communication. That is to say, end users can communicate with one another directly, without the need for an intermediate server.

It seems to me, at least at this moment, that this is a technology that is generally not well understood and its potential has not been fully realized. WebRTC ought to be a seismic shift in the way we build web applications. It's not yet, but I suspect it will be in a few years.

In this post, the penultimate post in the series, I will give a brief overview of how to use WebRTC and then discuss some of the possible implications for web development.

How to Use WebRTC

Getting started with WebRTC is not easy. Most of the documentation is very confusing (good luck understanding the spec), there aren't a ton of examples out there yet, and, up to this point at least, the technology has been in a constant state of flux. Nevertheless WebRTC is not merely experimental, it's a fully functional technology available in both Firefox and Chrome today.

To underline that point, if you've not seen it, you should see the AppRTC project. This is a video chat app, similar to Skype, implemented almost entirely in the browser (with a small bit of server code) and using peer-to-peer transfer of data. For a mere demonstration, it's surprisingly useful and calls into question all of the applications out there that attempt to implement this functionality using custom Java, Flash or similar installed applications.

But back to the question at hand: how does one use WebRTC?

For the purposes of this tutorial I built a small chat application which consists of two components: a server-side Go application which facilitates the initial signaling process between the two peers and a client-side Javascript application which implements the actual WebRTC workflow. First let's take a look at a high level description of the process.

For this example, suppose we have two end users who want to talk to each other: Joe and Anna.

Joe arrives first, connects to a server, subscribes to a topic and waits for someone else to show up.

Anna arrives next, connects to the same server, subscribes to the same topic and at this point the server tells Joe that someone else connected.

Joe sends an "offer" to Anna (via the server) indicating that he wants to establish a peer to peer connection.

Anna receives the "offer" and sends an "answer" to Joe.

Both Joe and Anna trade ICE candidates. ICE stands for Interactive Connectivity Establishment (described here) and basically represents the various ways the two parties can reach each other.

Finally the connection is made, one of the ICE candidates is agreed upon, and Joe & Anna can communicate directly with one another.


The (mostly) complete source code for this example can be found in this gist. The server is implemented in Go, the client in Javascript and communication between the server and the client occurs over a WebSocket. WebSockets are actually fairly straightforward to implement using Go and they make it so I don't have to worry too much about storing things on the server. (As a long-polling comet server would require)

When a user first connects, the application will ask them to enter a topic. This topic is how the two peers are connected on the server (they have to enter the same thing). You can think of it like an agreed-upon virtual meeting location. That topic is sent to the WebSocket like so:

function startWebSocket() {
  ws = new WebSocket("ws://api.badgerodon.com:9000/channel");
  ws.onopen = function() {
    ws.send(JSON.stringify({
      topic: topic,
      type: "SUBSCRIBE"
    }));
  };
}
The server receives this connection, sends back the user's ID and subscribes them to the topic: (IDs are generated using a full cycle PRNG)

id := <-idGenerator
out <- Message{
  Type: "ID",
  To:   id,
}
for {
  var msg Message
  websocket.JSON.Receive(ws, &msg)
  if msg.Type == "SUBSCRIBE" {
    msg.Data = out
  }
  in <- msg
}

// in subscribe
topics[topicId][userId] = out
for id, c := range topics[topicId] {
  if id != userId {
    send(c, Message{
      Topic: topicId,
      From:  userId,
      To:    id,
      Type:  "SUBSCRIBED",
    })
  }
}
And now the first user waits for someone else to show up. When that happens the same process is repeated, except this time the first user is informed that the second user subscribed, so he proceeds to begin the process of establishing a peer-to-peer connection and sends the second user an offer (via the server):

case "SUBSCRIBED":
  to = msg.from;
  pc.createOffer(function(description) {
    ws.send(JSON.stringify({
      topic: topic,
      type: "OFFER",
      to: to,
      data: description
    }));
  }, ...);

The second user sees this offer and sends an answer in reply:

case "OFFER":
  to = msg.from;
  pc.createAnswer(function(description) {
    ws.send(JSON.stringify({
      topic: topic,
      type: "ANSWER",
      to: to,
      data: description
    }));
  }, ...);
In addition to the offer and the answer both users have also been forwarding along ICE candidates:

pc.onicecandidate = function(evt) {
  if (evt.candidate) {
    ws.send(JSON.stringify({
      topic: topic,
      type: "CANDIDATE",
      to: to,
      data: evt.candidate
    }));
  }
};
Finally, once the first user receives an answer and an ICE candidate is agreed upon, the two peers are connected. WebRTC has the ability to stream audio and video, but for this example I used an RTCDataChannel:

pc = new RTCPeerConnection({
  iceServers: [{
    // stun allows NAT traversal
    url: "stun:stun.l.google.com:19302"
  }]
}, {
  // we are going to communicate over a data channel
  optional: [{
    RtpDataChannels: true
  }]
});
dc = pc.createDataChannel("RTCDataChannel", {
  reliable: true
});
Actually sending messages is trivial:

function onSubmit(evt) {
  evt.preventDefault();
  var text = chatInput.value;
  chatInput.value = "";
  var msg = {
    type: "MESSAGE",
    data: text,
    from: from,
    to: to
  };
  dc.send(JSON.stringify(msg));
}

There have been two large, and completely divergent, trends when it comes to networked applications: the cloud and decentralized, distributed networks. On the one hand companies like Google, Apple and Facebook have been moving all of their users' information and applications to their data centers. Your email, pictures, games, etc... are no longer on your computer; rather they are accessed over the internet, and stored in the cloud. This comes with a whole host of advantages, but it also comes with a steep cost: Google reads all your emails, Facebook knows everything about your personal life, Apple can track all your movements, and although these companies certainly provide a great deal of value with their products, in the end their goal is not to make you a happy customer (because most of the products are free), but to use the information you freely provide them for their own ends. (and mostly just to serve you ads)

At the same time this has been happening, we've also seen the rise of decentralized, distributed networks. From file-sharing applications (like BitTorrent) and streaming content providers (like Spotify) to instant messaging applications (like Skype) and even digital currencies (like Bitcoin). Decentralized networks have no single, governing authority. Data is spread across the network and communication is peer-to-peer.

And though the architecture of the web has always, fundamentally, been client-server, in some ways peer-to-peer architecture represents a more accurate representation of one of the original goals of the web:

The project started with the philosophy that much academic information should be freely available to anyone. It aims to allow information sharing within internationally dispersed teams, and the dissemination of information by support groups.

Now perhaps at one time, the Googles of this world had intended to organize the world's information. But increasingly, the fundamental goal of companies like Google is not merely to organize the world's information, but to own it. You can see this when they killed Google Reader. Google doesn't want you to read blog posts on other servers, they want everyone to use Google Plus as their blog. With peer-to-peer networks it may be possible to undermine this trend. People can own their own content again.

This example demonstrates one of the most obvious use cases for this technology: video, audio or textual chat among peers. But there are other possibilities:

  • The chat application has a far broader usage than most people realize. Of course there are the typical Google & Facebook-chat like applications, but there are also feedback libraries (like Olark), support applications (like you might see on Comcast's website) and a whole host of multiplayer games.

    Furthermore, WebRTC is secure by default. In this day and age of NSA skepticism, where Google reads all your email and Facebook knows your entire life history, WebRTC is a breath of fresh air. You can finally have your privacy back and still get the robust accessibility and flexibility of a modern web application.

  • BitTorrent has demonstrated the power of a distributed file sharing network. With HTML5 technologies it is possible to build such a network directly in the browser. For example: ShareFest. Could someone implement an in-browser DropBox? (perhaps, similar to SpaceMonkey?) Or Mega? One of the downsides of WebRTC is that users must be connected at all times: once their browser closes they are unreachable. But that issue doesn't seem insurmountable... and maybe with a few helper nodes such a system could be sustained.

  • A distributed social network (ala Diaspora) may have a better chance of succeeding if it's just as easy to use as Facebook or Twitter. Storing all that data is challenging (particularly with large media like pictures and video), but the relationships and textual updates are much more realistic storage-wise. Store that data among your (actual) peers and perhaps it could even be fairly reliable.

  • With the introduction of things like WebGL, WebAudio, PNACL, asm.js and a myriad of other HTML5 technologies, it's possible to build real games that exist solely in the browser. WebRTC makes it so those games can be multiplayer. This isn't a new idea - consider Artillery - but, as far as I know, it's not something which has really been realized yet.

  • One of the reasons internet companies are so successful is that it's very difficult to build something like Facebook. It requires a tremendous amount of capital and large, robust, highly reliable and fast systems can't be put together by just anyone. How many projects have been sunk by their inability to scale?

    And yet peer-to-peer networks have the potential to scale in a way the cloud never could. A signaling server can handle an enormous amount of load with no issues. If the vast majority of your application can be re-written to run purely on the client, most of your server-side concerns evaporate.

    Of course, most web applications can't be moved entirely to the front-end. Nevertheless this isn't an all or nothing game. The more work you can push to your end users, the less work you have to do on your own machines. Is it possible to do some of that background processing in a browser? It might be more possible than you imagine: modern browsers have threads, typed arrays, full blown databases, offline capabilities, file-system access, etc...

And those are just a few of the things that came to mind in the last week. It's an exciting time to be a web developer and it'll be interesting to see just what turns up in the next few years.

If you managed to make it this far: thanks for sticking with it. Stay tuned for my final post in this series, where I will propose an even more radical change to how we build web applications.

2014-01-27 08:30 Rethinking Web Development: Non-RESTful APIs

Software development is a strange industry. Applications are hard to build: they take months of work, have lots of moving parts and they're extremely risky - often being built under tight deadlines in competitive markets. And they're usually built by surprisingly small teams of developers; developers who probably learned a very large chunk of the knowledge needed to build the application as they built it. It's kind of amazing that anything we build works at all.

And so I'm constantly surprised that we obsess over things which don't really matter. Developers will have vigorous arguments over everything from arbitrary stylistic choices (tabs vs spaces, where to put braces, ...) to tool choice (like which editor to use) and naming conventions. Sometimes it seems like the amount of energy we expend discussing these things is precisely correlated with their level of arbitrariness.

Actually it's worse than that. Have you ever met an Architecture Astronaut? They take your crude, simple but working project and transform it into a more "correct" version. It's not actually functionally any different (one hopes), but it now has umpteen levels of indirection, factories, interfaces, powerful abstractions, design patterns and a myriad of other features you never really knew you needed. It also has the quality of being basically unmaintainable because you don't actually understand it anymore.

A remarkable demonstration of this can be seen in the Fizz Buzz Enterprise Edition. It's funny because it's not all that different from reality.

Actually this tendency to transform arbitrary decisions into moral categories and then hang them as a millstone around the neck of others is a much broader phenomenon. At its root is perhaps the need for markers to indicate who is in the group and who is out of it, and then once we have those markers established the need to assert our mastery (as a game of one-upmanship). This tendency is only exacerbated by the challenges we're presented with: we aren't actually confident that we know what we're doing, and the systems we build are often incredibly difficult to get working. Rather than address the hard problems, we isolate the easy things and focus on them. (Maybe I can't build a distributed database, but I can tell you that your function should start with a capital letter.)

So I thought I might tackle one of these software shibboleths: the Restful API.


REST is a complicated architectural style described here. I'm not particularly interested in tackling the actual academic meaning of the term, but rather what it has become as a popular buzzword. I have in mind the focus on proper HTTP verbiage (GET, POST, PUT, DELETE), resource URIs, and thinking almost entirely in terms of the representation and transfer of resources.

For example suppose we are building a RESTful API for email. We might have a URL structure like this:

{GET|POST} /emails
{GET|PUT|DELETE} /emails/{EMAIL_ID}

GET /emails lists the most recent emails, POST /emails creates a new one, GET /emails/{EMAIL_ID} gets a particular email, PUT /emails/{EMAIL_ID} updates an email, and DELETE /emails/{EMAIL_ID} deletes an email.
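To make the mapping concrete, here's a minimal routing sketch for such an email API, assuming the conventional mapping (POST creates, PUT updates). The handler names are hypothetical placeholders; a real server would dispatch to database-backed handlers.

```javascript
// Map an HTTP verb and path to a (hypothetical) handler name for the
// RESTful email API described above.
function route(method, path) {
  const idMatch = path.match(/^\/emails\/([^\/]+)$/);
  if (path === "/emails") {
    if (method === "GET")  return "listEmails";
    if (method === "POST") return "createEmail";
  } else if (idMatch) {
    if (method === "GET")    return "showEmail";
    if (method === "PUT")    return "updateEmail";
    if (method === "DELETE") return "deleteEmail";
  }
  return "notFound";
}

console.log(route("GET", "/emails"));       // listEmails
console.log(route("DELETE", "/emails/42")); // deleteEmail
```

Note how little of the application's actual behavior this table expresses - which is exactly the problem discussed next.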

So here's why I don't like this approach:


REST focuses on state, but state is not the primary building block of an application. What does it mean to "create" an email? Does that mean it sends it? How can you "update" or "delete" an email? Suppose you have 1,000,000 emails... listing them all doesn't really work anymore does it? Consider this approach to sending emails:

PUT /emails/{EMAIL_ID}/send

A URL like this doesn't make sense. It would be read "Create a 'send' record for email {EMAIL_ID}". What exactly would you send to this endpoint? An empty JSON object? ({}) It's an uneasy fit.

Suppose instead you add a "state" field to your email:

{"id":1234, "subject": "whatever", ... , "state": "unsent"}

And you would update that record and POST the update. This is a better approach from a URL perspective, but it's much messier from an implementation perspective. On the server I have to detect changes to this object and act accordingly (i.e. the state was changed, therefore I will send the email). Do I merely update the record in my database, schedule the email to be sent later, and respond that everything is fine? Or do I wait for the action to be completed in its entirety? Maybe I add additional types of state:

{ ... "state": "sending" }, { ... "state": "sent" }, { ... "state": "bounced" }, ...

But "state" in this sense is not really a property of an email itself, rather it's more a property of our system: I'm currently sending your email, I sent your email, I tried to send your email but it was blocked, ...
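Here's a hedged sketch of what that messiness looks like server-side: an update handler that has to diff the stored and submitted records just to discover that the "update" really meant "send". Both applyUpdate and the mailer object are hypothetical names for illustration.

```javascript
// Sketch of the "state field" approach: the server infers an action
// from a diff between the stored record and the submitted one.
// `mailer` is a hypothetical object with a send() method.
function applyUpdate(oldEmail, newEmail, mailer) {
  if (oldEmail.state === "unsent" && newEmail.state === "sent") {
    // The client "updated" a record, but what it really meant was
    // "send this email" - a side effect hiding inside a state change.
    mailer.send(newEmail);
    return Object.assign({}, newEmail, { state: "sending" });
  }
  return newEmail;
}
```

An explicit "send" action would say all of this directly; instead the intent has to be reverse-engineered from the data.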

Problems like this aren't unusual - they're typical. Modern web applications aren't glorified Wikipedias, they're the desktop applications of yesteryear: described in terms of user workflow and actions not in terms of the mere transfer of resources.

Caching is Broken

One of the supposed advantages of a RESTful architecture is that it lends itself to caching. Unfortunately those caching mechanisms are notoriously difficult to use properly.

Consider a web application with a lot of Javascript (which is basically all of them). Somewhere in the HTML for that site the Javascript has to be included:

<script src="../assets/js/site.js"></script>

That's web dev 101, and it's wrong, for two reasons:

  1. Most web applications change frequently. When you change site.js there's no guarantee that your end user will get the latest version the next time they visit your site, unless you explicitly make it so your web server adds headers to invalidate the cache.
  2. If you add headers to invalidate the cache every time a user comes to your site, that means they're downloading 100s of KB of script every time they reload (which can be devastating to performance).

The solution is to use a hash as part of the name of the script and add aggressive caching headers:

<script src="../assets/js/site-d131dd02c5e6eec4.js"></script>

Let's just call a spade a spade here: that's a hack. The modern web developer spends an inordinate amount of time optimizing the performance of their applications to work around issues like this. This is because the architecture to which they're beholden is fundamentally flawed. A well-designed system makes the typical case easy, not hard.

For more guidance on caching, read this article by Google. Speaking of Google, they got so fed up with the slowness of HTTP that they silently replaced it with SPDY on their servers.

Clean URLs

Perhaps you've read this blog post: URLs are for People. Well, I disagree. URLs are not for people. Nobody enters them manually, and rarely do they even bother to look at them. Your domain matters, but beyond that, if you're spending more than a few minutes thinking about how you want to lay out your URLs, you are wasting your time on a part of your system that doesn't actually matter. (And one thing I love about that article is that his two primary examples of bad URLs are two of the most popular sites on the internet: Google and Amazon...)

HTTP Verbs

If you look at a framework like Ruby on Rails it places a great deal of emphasis on using the correct HTTP verbs. What's bizarre about this is that even in the original construction of web servers with simple CGI-based forms, the full set of HTTP verbs was not widely supported. GET and POST were widespread, but their lesser-known cousins DELETE and PUT were unreliable. This led Rails to add code like this:

<input name="_method" type="hidden" value="delete" />

So why advocate for a system which doesn't actually work out of the box?
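The server-side half of that workaround can be sketched as a tiny function. This mirrors what Rack/Rails and middleware like Express's method-override do, though the real implementations handle more cases (headers, query strings, allowed-method lists).

```javascript
// Sketch of HTTP method override: a POSTed form smuggles the "real"
// verb in a hidden _method field, because browsers won't send PUT or
// DELETE from a plain <form>.
function effectiveMethod(httpMethod, formBody) {
  if (
    httpMethod === "POST" &&
    formBody &&
    typeof formBody._method === "string"
  ) {
    return formBody._method.toUpperCase();
  }
  return httpMethod;
}

console.log(effectiveMethod("POST", { _method: "delete" })); // DELETE
```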

Bi-Directional Communication

RESTful architectures are client-server architectures. All requests originate from the client, and all responses originate from the server. Therefore it is impossible for the server to initiate communication with the client. Sadly almost every application needs server-initiated communication.

For example it'd be great if our email application could tell the client when a new email came through. The RESTful solution to this problem is to poll periodically - a clumsy, inefficient and unreliable process.


Perhaps the main alternative to REST when it comes to APIs is RPC, which could be pulled off with any number of mechanisms (AJAX, WebSockets, ...). But RPC is not a magic bullet; it has its own set of issues (which probably led to the creation of REST in the first place). My point is not to offer an architecture which is superior to REST in every way, but rather that the application ought to drive the discussion about architecture. If REST works for your application, then by all means use it; if it doesn't, don't be afraid to use something else.

Too often we measure the quality of an application by its conformity to a set of pre-defined rules - the Thou Shalt Nots of web development - when we should really be treating those rules as suggestions: conventions that someone once found useful in building their own application.