Juggle Tutor

Reflections on Abandoned Projects

Sep 4, 2017 at 5:00PM

Caleb Doxsey

This long, meandering post is about Juggle Tutor, an interactive, instructional juggling application which I never managed to finish. I'll start by explaining where the project came from, then provide a general overview of my product vision(s), as well as provide a technical breakdown of the technologies I used. Finally I'll explain why I ultimately failed to complete the project.

Over the years I've worked on a lot of side projects. You get this idea in your head and run with it for a few days or weeks with reckless abandon. Sometimes you even manage to produce something. But, at least for me, more often than not, I never actually finish the project. Either I'll lose whatever spark of creativity was driving me, or the project turns out to be too difficult to complete.

I think this is a fairly common tendency among software developers. It's one of those strange areas where this supposedly rigorous, mathematical, engineering discipline is, in practice, a whole lot more like art (or craft anyway) than science.

As I was thinking about this, it also occurred to me that this was a common tendency in my family. Growing up, my father would constantly be working on projects around the house. Some of those were the fairly standard projects you'll find in many homes: painting, re-doing a bathroom, etc. (my dad is very handy, a trait I didn't really inherit...) but a lot of those projects he worked on were of a much more esoteric nature.

When I was about 8 or 9 years old we adopted 2 cats (Ortho and Para) from the humane society. Typical of many cats, particularly in a place like Alabama, our cats were indoor/outdoor cats. If you've ever had an indoor/outdoor cat you'll know that anytime they're outside, they want inside and anytime they're inside, they want outside. So like most people in this situation, we installed a cat door to avoid constantly having to open the door for them. However, a simple cat door wasn't enough for my dad: he decided that he'd put the cat door in a window in the laundry room and build a set of stairs for the cats to use.

As you can see in the picture above, each tread and riser had to be intricately cut and, although it's a bit blurry, you can also see that he's sanded and rounded many of the edges and even installed an awning over the entrance (I guess cats don't want to get wet in the rain?). All of this for 2 cats. And yet the thrill of seeing them use it for the first time probably made the whole project worth it.

I think many software side projects are like this, labors of love born out of a manic, dogged drive to bring your idea into reality, which, from the outside anyway, would appear to be a complete waste of time.

At this point I could probably veer off into a tangent about how we're entirely too focused on efficiency, pragmatism and the economic value of an activity, and how a recreational, almost playful, creativity for its own sake is sorely lacking in the American psyche... but I'll skip that and get back to the subject at hand.

Let me tell you where Juggle Tutor came from.

Origins

In 2012 I jumped ship from a failing startup to pursue a consulting company with some former colleagues. Ultimately that consulting company didn't work out, but in the interim between jobs, I had a lot of free time on my hands to work on my own software projects. One of those projects was successful (my book An Introduction to Programming in Go) but the others never really went anywhere.

At the time the iPad was just coming into its own, and there were some really interesting technologies involving various sensors, computer vision, motion capture (the Xbox Kinect) and even what would later become augmented reality. But, at least in my opinion, although the technology was there, very little software was taking advantage of it. (Since that time there have been more applications in this vein: Pokémon Go, and Snap, which is basically a multi-billion dollar company built around using face detection to add masks to people's faces...)

Worse than unrealized potential, there were aspects about the direction of software that didn't seem very positive to me. Using a smartphone was a virtual, isolating experience and it encouraged unproductive and ultimately unfulfilling habits. And, although I didn't know it at the time, social media would only make these problems worse, managing to encourage very negative forms of social pressure (the social justice virtue-signaling mob descending on wrong-think on Twitter, the artificial, perfect-life posturing you see on Facebook and Instagram, etc.). But did it have to be this way? A smartphone is just a tool - a platform to run software. Why don't we make more apps that encourage the exploration of real environments, positive socialization or the development of real skills?

For example, one idea I had was smartphone laser tag, which would utilize the camera, wifi and a printed target which users would wear. "Shooting" would consist of taking a picture, and a simple image analysis would determine whether the shot hit or missed. I didn't know it at the time, but such a product already existed (apparently in many forms), though it's interesting that it never really seemed to take off in the popular imagination. Maybe internet-native children never played tag...

But it was the development of real skills that I was particularly interested in. Is it possible to build games which are simultaneously fun, recreational activities but which also develop real-world skills and open deeper, more fulfilling experiences? Much early education is of this sort, where learning a skill like reading is essential for human flourishing, but which is also really difficult to learn, especially when you first start. And games could, theoretically anyway, help develop these sorts of skills.

Consider rhythm games, where the player presses buttons on a controller according to on-screen instructions and in time with the music. Popular examples are Guitar Hero or Rock Band, and these games are both social and quite fun. They're also great examples of the way mechanical skills develop - you really can get better at them over time. Unfortunately, the mechanical skills that are developed have very little to do with actual instruments. The skill doesn't transfer, and, speaking from experience, the ability to play an actual guitar is a much more rewarding experience than playing the virtual guitar in Guitar Hero.

But actually learning to play an instrument takes a long time and can be real drudgery for the aspiring musician. It's almost a teenage rite of passage to try and learn guitar for 6 months only to eventually give up and never touch it again. And this is where games might help; they are powerful motivating devices for difficult activities, providing incentivization that can overcome discouragement. It's amazing what a little competition can do to motivate a young person. What if rhythm games could develop the actual skills needed to play an instrument?

I'll just go ahead and tell you that such an application exists. It's called Rocksmith, and though there are flaws in its execution (it doesn't really work with a classical guitar, for example), it's great to see developers trying to build these applications. (An example of doing this with piano is Synthesia.)

So my goal was to take this same basic idea and apply it to Juggling.

The Game

The game would have several components:

The first component would be a demonstration or tutorial of how to juggle. It would explain the mechanics of juggling and provide useful tips for how to learn. Juggling is a skill that consists of many different patterns and types of equipment. With methodical repetition you can train yourself to repeat these patterns consistently and eventually you're juggling. It's a lot like learning to ride a bike.
By utilizing the camera on your device and computer vision algorithms, the tutorial could be interactive. The game would watch what you were doing, assess how well you were completing the trick and provide useful tips as you practiced. If you've ever played Let's Dance or one of its cousins, you'll have some idea of how this might work. (I even looked into purchasing letsjuggle.com, but it was already taken, hence why I renamed it to Juggle Tutor) I have more to say about the implementation of this later.

Let's Dance
Once I had a system which could accurately assess juggling competence, I could use it to implement achievements. There are dozens of different tricks that the aspiring juggler can learn, and each of these could correspond with an achievement (and unlike most games this would be an actual achievement corresponding to a real-world skill). Many tricks build on top of one another, for example the Mill's Mess uses techniques from the Reverse Cascade, so the achievements could be laid out hierarchically - where completing one trick would unlock more advanced tricks.

Reverse Cascade

Mill's Mess

An alternative approach to assessing achievements would be to allow the submission of demonstrations and have humans determine whether or not the trick was completed. (More on problems with this idea later)
The obvious next step for achievements is to build a profile and be able to share it with anyone on the internet. Lots of MMOs have this, like World of Warcraft. You'd log in with Google, complete the achievements and share them with friends. It'd be something like you're a level 7 Juggler, though I could've been creative with titles.
Juggling is a performance art, and often too much emphasis is placed on technical competency, rather than the ability to entertain. Juggling is, in this sense, a means to an end, and a fastidious focus on achievements would ultimately undermine what real Juggling looks like. A sarcastic, biting critique of this mentality can be found in this video from Anthony Gatto, one of the most accomplished Jugglers in the world, where he does something technically amazing, which very few people in the world could hope to accomplish, in front of a few bored and distracted kids:

I bet you didn't make it through that whole video...

So another aspect of the game could be recording exhibitions, which are intended to be entertaining for those who watch them and not mere displays of technical ability. They could be voted on, liked and shared on social media. The site could then serve the dual purpose of both being an interactive game to learn juggling as well as a means to advertise one's own juggling abilities - a juggling certification if you will.

I wanted the game to be accessible to children and young adults; the kind of thing they could pick up with minimal supervision, derive some enjoyment from and, obviously, learn to juggle. But not just learn to juggle, but also learn what tough, self-directed skill development, over a long period of time looks like. A hard-fought, deep satisfaction that most games can't really deliver. A kind of grit if you will, the development of which has lasting implications for the rest of life.

Such was my pie-in-the-sky vision anyway. Let's move on to implementation.

Implementation

Initially I intended to build this as an iPad application, but I ran into a few problems:

I didn't own an iPad or a Mac. You can't really build software for an iPad without a Mac, so this turns out to be a pretty big problem.
I didn't know anything about Objective-C, and, in particular, wasn't sure I could leverage some of the algorithms I thought I would need.
As it became clear that the social dimension of the game was important, I thought limiting it to the iPad didn't really make sense.

It turns out that all the technology necessary to build the application can be done through the browser and since, as a web developer, I was much more comfortable in that environment, I decided to build a web application instead.

It went through many iterations, but the basic design ended up looking like this:

A simple layout, large text, and bold colors were intended to give the application a playful, engaging style. I used Google App Engine for the backend, so I could easily use Go and wouldn't have to worry about maintaining infrastructure for a database. Most of the code is pretty custom, and you can see it here if you're interested. (It won't actually work out of the box as some important files have been removed)

With that version of the application you could record video to YouTube (which let me avoid the expense of storing video and made sharing easy), and there was an offline task which periodically ran to synchronize content (things like the number of likes a video had, or manually deleted videos). Using Google Single Sign-in made this all pretty straightforward.

What I was much more interested in was creating the interactive component. HTML5 provides several APIs which allow the developer access to the hardware needed to implement an application like this.

First you can access the camera using code like this:

var video = document.createElement('video');
document.documentElement.appendChild(video);
video.muted = true;
video.style.display = 'none';

// navigator.getUserMedia is deprecated and URL.createObjectURL no longer
// accepts a MediaStream, so use the promise-based API and srcObject instead.
navigator.mediaDevices.getUserMedia({ video: true, audio: false })
    .then(function(stream) {
        video.srcObject = stream;
        return video.play();
    })
    .catch(function(error) { console.error('failed to load video', error); });

Which renders the video to a hidden element. I then created a couple Canvas elements:

var w = 256; var h = 256;

var offscreenCanvas = document.createElement('canvas');
offscreenCanvas.width = w; offscreenCanvas.height = h;
offscreenCanvas.style.display = 'none';
document.body.appendChild(offscreenCanvas);
var offscreenContext = offscreenCanvas.getContext('2d');

var canvas = document.createElement('canvas');
document.documentElement.appendChild(canvas);
canvas.width = w; canvas.height = h;
var ctx = canvas.getContext('2d');

var buf = ctx.createImageData(w, h);
function fill(arr, width, height) {
    arr = new Uint8ClampedArray(arr);
    buf.data.set(arr);
    ctx.putImageData(buf, 0, 0);
}

And then I had an animation function which took the content of the video frame, ran some code to analyze it, annotated it with shapes or other content, and copied the resulting data to the visible canvas:

function runAnimator() {
  if (video.readyState === video.HAVE_ENOUGH_DATA) {
    offscreenContext.drawImage(video, 0, 0, w, h);
    // do stuff with the offscreen canvas
    var frame = offscreenContext.getImageData(0, 0, w, h);
    fill(frame.data, frame.width, frame.height);
  }
  // schedule the next frame even if the video isn't ready yet,
  // otherwise the loop stops the first time it runs too early
  requestAnimationFrame(runAnimator);
}

From there it's just a matter of cleaning up the visuals and surfacing some Let's Dance-ish content. Actual code (for what it's worth) can be seen here.

Tracking Juggling Balls

I did a bunch of research to try and find ways to implement ball tracking. Probably the first place you'll want to start is OpenCV (some of which has been translated to JavaScript, and you can also probably use it with Emscripten), and there's a ton of academic research on this topic, particularly related to face and hand tracking (for obvious reasons). I read a lot of academic papers, but understood very little of them. (I wonder if anyone else finds it a bit odd that so little academic computer science ever seems to translate to actual software...) You'll find the remains of experiments littered throughout the code, so I'll just explain what my final attempt looked like.

I decided the easiest way to track the balls would be to target a specific color, and since most people have tennis balls lying around, that would be an obvious thing to use. Optic Yellow is such an unusual color that it would be relatively straightforward to isolate it from the surrounding background. First you mask the image to only the colors you're interested in:

void MaskGreenish(Image img) {
  for (int32_t x = 0; x < img.Width; x++) {
    for (int32_t y = 0; y < img.Height; y++) {
      int32_t o = (y * img.Width + x) * 4;

      uint8_t r = img.Data[o + 0];
      uint8_t g = img.Data[o + 1];
      uint8_t b = img.Data[o + 2];
      uint8_t a = img.Data[o + 3];

      if (IsOpticYellow(r, g, b)) {
        img.Data[o + 0] = img.Data[o + 1] = img.Data[o + 2] = 0xFF;
      } else {
        img.Data[o + 0] = img.Data[o + 1] = img.Data[o + 2] = 0x00;
      }
    }
  }
}
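
IsOpticYellow isn't shown above. As a minimal sketch of what such a check could look like: the idea is that a tennis ball is bright, has red and green both fairly high, and has very little blue. The thresholds here are illustrative guesses, not the values from the project, and would need tuning for a real camera and real lighting.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical sketch of the IsOpticYellow predicate used by MaskGreenish.
// All thresholds are assumptions chosen for illustration only.
bool IsOpticYellow(uint8_t r, uint8_t g, uint8_t b) {
  const int ri = r, gi = g, bi = b;
  const bool brightEnough = gi >= 120;               // reject dark pixels
  const bool yellowNotGreen = ri * 10 >= gi * 6;     // red at least 60% of green
  const bool yellowNotOrange = ri <= gi + 20;        // red not much above green
  const bool lowBlue = bi * 2 <= std::min(ri, gi);   // blue well below red and green
  return brightEnough && yellowNotGreen && yellowNotOrange && lowBlue;
}
```

Working in RGB keeps the sketch short; a check based on hue and saturation would probably cope better with changing light, at the cost of a color space conversion per pixel.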

Then, using an algorithm known as union-find and a data structure known as a disjoint set, you find all the contiguous blobs in the image:

int FindLargestBlobs(Image img, vector<Rect>& blobs) {
  int w = img.Width;
  int h = img.Height;

  DisjointSet dsset(w * h);
  // first pass
  for (int x = 0; x < w; x++) {
    for (int y = 0; y < h; y++) {
      int o = y * w + x;
      int ol = dsset.Find(o);
      if ((x - 1) >= 0) {
        int l = y * w + (x - 1);
        if (img.Data[o * 4] == img.Data[l * 4]) {
          int ll = dsset.Find(l);
          dsset.Union(ol, ll);
        }
      }
      if ((x + 1) < w) {
        int r = y * w + (x + 1);
        if (img.Data[o * 4] == img.Data[r * 4]) {
          int rl = dsset.Find(r);
          dsset.Union(ol, rl);
        }
      }
      if ((y - 1) >= 0) {
        int t = (y - 1) * w + x;
        if (img.Data[o * 4] == img.Data[t * 4]) {
          int tl = dsset.Find(t);
          dsset.Union(ol, tl);
        }
      }
      if ((y + 1) < h) {
        int b = (y + 1) * w + x;
        if (img.Data[o * 4] == img.Data[b * 4]) {
          int bl = dsset.Find(b);
          dsset.Union(ol, bl);
        }
      }
    }
  }
  map<int, Rect> rectangles;

  // 2nd pass
  for (int x = 0; x < w; x++) {
    for (int y = 0; y < h; y++) {
      int o = y * w + x;
      if (img.Data[o * 4] > 0) {
        int lbl = dsset.Find(o);
        if (rectangles.count(lbl) == 0) {
          Rect r = {
            x,
            y,
            x,
            y
          };
          rectangles[lbl] = r;
        } else {
          Rect r = rectangles[lbl];
          if (x < r.x1) {
            r.x1 = x;
          }
          if (x > r.x2) {
            r.x2 = x;
          }
          if (y < r.y1) {
            r.y1 = y;
          }
          if (y > r.y2) {
            r.y2 = y;
          }
          rectangles[lbl] = r;
        }
      }
    }
  }
  // order the bounding boxes with compareRectangle, then walk the set
  // backwards to copy out as many as blobs has room for
  set<Rect, compareRectangle> all_rectangles;
  for (map<int, Rect>::iterator it = rectangles.begin(); it != rectangles.end(); ++it) {
    all_rectangles.insert(it->second);
  }
  set<Rect, compareRectangle>::reverse_iterator it = all_rectangles.rbegin();
  int count = 0;
  for (size_t i = 0; i < blobs.size() && it != all_rectangles.rend(); ++i, ++it) {
    blobs[i] = *it;
    count++;
  }
  return count;
}
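
FindLargestBlobs leans on three pieces that aren't shown: DisjointSet, Rect and compareRectangle. What follows is a sketch of how they might look, reconstructed only from the way they're used above; it is an assumption, not the original code. In particular, ordering rectangles by area is a guess based on the function's name, and the field order of Rect is inferred from the {x, y, x, y} initializer.

```cpp
#include <numeric>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical union-find structure exposing the Find/Union calls used above.
class DisjointSet {
 public:
  explicit DisjointSet(int n) : parent_(n), size_(n, 1) {
    std::iota(parent_.begin(), parent_.end(), 0);  // every element starts alone
  }

  // Returns the representative of x's set, halving the path along the way.
  int Find(int x) {
    while (parent_[x] != x) {
      parent_[x] = parent_[parent_[x]];
      x = parent_[x];
    }
    return x;
  }

  // Merges the sets containing a and b, hanging the smaller tree off the larger.
  void Union(int a, int b) {
    a = Find(a);
    b = Find(b);
    if (a == b) return;
    if (size_[a] < size_[b]) std::swap(a, b);
    parent_[b] = a;
    size_[a] += size_[b];
  }

 private:
  std::vector<int> parent_;
  std::vector<int> size_;
};

// Inclusive bounding box in pixel coordinates.
struct Rect {
  int x1, y1, x2, y2;
};

int Area(const Rect& r) { return (r.x2 - r.x1 + 1) * (r.y2 - r.y1 + 1); }

// Orders rectangles by area, smallest first, so reverse iteration over a
// std::set yields the largest blobs. Ties fall back to the coordinates;
// without that, a set would silently drop blobs of equal area.
struct compareRectangle {
  bool operator()(const Rect& a, const Rect& b) const {
    const int areaA = Area(a), areaB = Area(b);
    if (areaA != areaB) return areaA < areaB;
    return std::tie(a.x1, a.y1, a.x2, a.y2) < std::tie(b.x1, b.y1, b.x2, b.y2);
  }
};
```

With path halving and union by size, each Find and Union runs in close to constant time, so most of the cost in the first pass comes from touching every pixel and its four neighbors rather than from the set operations themselves.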

We're targeting the biggest 3 objects in this example, so you'll get back 3 rectangles, which you can then draw on the canvas:

vector<Rect> blobs(3);
int blobsFound = FindLargestBlobs(scratch, blobs);
pp::VarArray balls;
balls.SetLength(blobsFound);
for (int i = 0; i < blobsFound; i++)
{
  Rect r = blobs[i];
  pp::VarDictionary d;
  d.Set("X1", r.x1);
  d.Set("X2", r.x2);
  d.Set("Y1", r.y1);
  d.Set("Y2", r.y2);
  balls.Set(i, d);
}
result.Set("Balls", balls);

It looks like this:

The Juggle Tracker in Action

If you've got some tennis balls lying around, you can try a live demonstration here. (It requires Chrome)

From there I intended to use face detection to find the center line of the image to determine the relative, spatial position of the balls. The cascade pattern (the standard juggling pattern for 3 balls) consists of the balls crossing from waist height on one side, to around forehead level on the other side in a continuous, parabolic motion. Sampling the positions of the balls over time should provide enough information to determine whether or not someone is juggling.
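
I never built this part, so the following is only a sketch of the idea, under some big assumptions: that the positions of a single ball have already been picked out frame by frame (the blob tracker above doesn't keep track of which ball is which), and that the center line is known. In a cascade every ball travels from one side of the body to the other, so counting center line crossings in a window of samples gives a crude signal. The names Sample, CountCenterCrossings and LooksLikeCascade are made up for this example.

```cpp
#include <vector>

// One sampled ball position, e.g. the center of a tracked bounding box.
struct Sample {
  double x;
  double y;
};

// Counts how often a ball's track moves from one side of the vertical line
// x = centerX to the other. Assumes the samples belong to a single ball and
// are ordered by time.
int CountCenterCrossings(const std::vector<Sample>& track, double centerX) {
  int crossings = 0;
  int lastSide = 0;  // -1 = left of center, +1 = right, 0 = not known yet
  for (const Sample& s : track) {
    int side = 0;
    if (s.x < centerX) side = -1;
    else if (s.x > centerX) side = 1;
    if (side == 0) continue;  // exactly on the line: wait for the next sample
    if (lastSide != 0 && side != lastSide) ++crossings;
    lastSide = side;
  }
  return crossings;
}

// Hypothetical heuristic: a track that crosses the center line at least
// minCrossings times within the sampled window is consistent with a cascade.
bool LooksLikeCascade(const std::vector<Sample>& track, double centerX,
                      int minCrossings) {
  return CountCenterCrossings(track, centerX) >= minCrossings;
}
```

A ball held in one hand never crosses the line, while a ball in a cascade crosses once per throw. A fuller version would also want to look at the height of each arc, since the pattern peaks around forehead level.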

And this is about as far as I got with the interactive component of the application. I realized I was pretty deep in the weeds, in over my head, and would need a lot more resources to make this a reality. If you're familiar with it, the Xbox Kinect required an enormous training regime for its artificial intelligence to even get close to consistent results. I didn't have the relevant expertise or resources to pull that off.

Nevertheless I think the technology could work if someone could invest the time needed. I'm a little surprised I haven't seen more done in this area. Your browser is full of pretty amazing, almost entirely unused technologies. (like WebRTC for example)

PNACL

You may've noticed the code examples above are written in C++, not JavaScript. Early attempts at building this were done in JavaScript, but I quickly ran into performance problems. The analysis had to be run in realtime at 30 frames per second or so, and although I could do the masking without much trouble, the union-find algorithm is computationally expensive. JavaScript just wasn't cut out for the task.

However Chrome has a pretty amazing way to work around this problem: PNACL.

Native Client is a sandbox for running low-level code in the browser safely and efficiently. It's a core technology in the Chromium operating system and also used as part of Google App Engine.

The Portable Native Client is a version of Native Client that can be run directly by Chrome without requiring any special access. It allowed me to compile C++ code into a special pseudo-assembly and run it in the browser without having to use JavaScript.

In the above example I compiled a C++ library using tup:

HOME = /Users/caleb
NACL_SDK = $(HOME)/nacl_sdk
PNACL_BIN = $(NACL_SDK)/pepper_49/toolchain/mac_pnacl/bin
PNACL_INCLUDE = $(NACL_SDK)/pepper_49/include
PNACL_LIB = $(NACL_SDK)/pepper_49/lib/pnacl/Release
CXX = $(PNACL_BIN)/pnacl-clang++
LINK = $(PNACL_BIN)/pnacl-ld
FINALIZE = $(PNACL_BIN)/pnacl-finalize
STRIP = $(PNACL_BIN)/pnacl-strip

: foreach app/native/*.cpp |> $(CXX) -std=gnu++11 -o %o -c %f -O4 -I$(PNACL_INCLUDE) |> tmp/%B.o {objs}
: {objs} |> $(CXX) %f -o %o -L$(PNACL_LIB) -lppapi_cpp -lppapi |> tmp/native.bc
: tmp/native.bc |> $(FINALIZE) %f -o %o --compress |> static/ui.pexe

This pexe is then loaded on the page and called via RPC. It exposes one method (Track) which takes in image data and returns the rectangle positions used for tracking.

Sadly PNACL is a dead technology. Google has decided to move forward with WebAssembly (Wasm) instead, which, although it has some major limitations right now, should result in cross-browser support for this kind of thing in the future.

It's worth taking a moment to reflect on the nature of the web right now, where technology that was cutting edge 5 years ago is all but dead, and most people never even knew about it.

The Moderation Curse

By this point I had basically given up. I had run out of time and had to find a job, which ate up any of the time I would've spent on the project.

A few years later I came back to it, scrapped the interactive component, and attempted to move forward by using human intelligence instead. The tutorial would be a mixture of text content, animations and lessons - the sort of educational content you can find all over the web. Once a player felt confident in their ability to perform a trick they would submit video proof, which would then be assessed by an administrator or other users to gain the achievement.

To start with, that'd basically be me, or perhaps some special class of user on the site - a juggling community if you will.

But remember this was a tool intended for children. And it dawned on me, as a believer in human depravity, that allowing the submission by children of personal videos to an online community, even for the noblest of reasons, would ultimately result in a whole lot of sinister behavior, and I couldn't think of a way to avoid this problem.

So if you got rid of the interactive component (because I wasn't able to build it) and the sharing component (because I was too scared to build it), all you had left was a bunch of tutorials and animations. Which might still be an interesting web site (though producing content is an exhausting undertaking if you've ever tried to do it) but the spark was gone, and I decided to move on to other things.

Monetization

Let me finish by discussing the monetization of the idea.

Entrepreneurialism has a bit of a paradox at its core. It would be the height of arrogance to imagine that an idea you came up with is truly unique and that no one had ever tried it before. With a big enough market and enough entrepreneurs, surely someone would've come up with your idea by now. And therefore the reason why your idea isn't out there is either it's too difficult to realize, it won't be profitable, or it is out there and you just don't know about it yet. And yet if everyone were to think this, no one would create anything at all, a sort of Zeno's Paradox of Entrepreneurialism.

I had several ideas for monetization:

The first would be to sell the application directly. Perhaps the first tutorial could be given away and you'd have to pay to unlock the subsequent ones. Costs were minimal because all the video content was hosted on YouTube for free, and, unless the thing really went viral, the relational database elements were minimal.
The second approach would be to utilize advertising or product placement. For example, referral links to juggling equipment on Amazon.
The third approach, and in my opinion the most viable, would be to reach out to an existing juggling equipment manufacturer and propose the application be bundled with juggling equipment. There are hundreds of thousands of juggling sets strewn about the country being given as gifts to children every Christmas or birthday. (That's how I learned to juggle anyway) Just add an insert into the set for this online application to help you learn. I suspect the marginal increase in demand would more than cover its cost.

Summary

So that's Juggle Tutor in a nutshell. Rest in Peace.

I hope to write a few more posts in this series. Stay tuned.