Error Risk


In most machine learning discussions I have with people, I find that the notion of error risk is new to them. Here’s the basic idea: you have a trained machine learning model that’s processing incoming data, and it naturally has an error rate. Let’s say the error rate is 5%, or equivalently, the model is correct about 95% of the time. Error risk, then, is the set of possible negative consequences from incorrect predictions or decisions.

In an extreme example, the error risk for a 747’s autopilot system is perilously high. For a model predicting user shopping behavior on an e-commerce site, the risk is rather low – maybe it recommends the wrong product once or twice, but nobody gets hurt.

The depth of the model’s integration and the speed at which it makes decisions are both correlated with the amount of error risk. If the program in question is running some analytics on the side, merely supplying supplementary information to a human decision-maker, the risk is almost zero. However, if the program is itself making decisions, such as how much to bank right in a 45 mph crosswind or how much of a certain inventory to order from a supplier, the risk increases substantially.

I’ve taken to quantifying error risk by asking the following questions:

  1. Is the program or system making autonomous decisions? If yes, what happens when the wrong decision is made?
  2. If it is making decisions, what is the cycle time / how quickly are those decisions being made?
  3. If it is not making decisions, is the information it’s providing critical or supplementary? (Critical information could be things like cancer diagnostics, whereas supplementary information could be providing simple reports to a digital marketing team.)

Other questions come up in these situations, but the above are the most important.
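For concreteness, here’s a rough sketch of how answers to those questions might be folded into a single score. The weights and the 1,000-decisions-per-hour cap are my own illustrative assumptions, not an established metric:

```python
def error_risk_score(autonomous, decisions_per_hour=0.0, info_critical=False):
    """Toy error-risk score in [0, 1] from the three questions above.

    The weights and the 1,000/hour cap are illustrative assumptions.
    """
    if autonomous:
        # Faster decision cycles leave less room for human review.
        speed = min(decisions_per_hour / 1000.0, 1.0)
        return 0.5 + 0.5 * speed
    # Advisory-only systems: risk hinges on whether the info is critical.
    return 0.3 if info_critical else 0.05

# A supplementary reporting tool scores near 0.05; an autopilot-style
# system deciding many times per hour scores near 1.0.
```

The exact numbers matter far less than the habit of writing the assessment down so it can be argued about.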

Optimal use of machine learning in applications means gaining maximal benefit at minimum risk wherever possible. To get as close as possible to “pure upside” in implementing machine learning, what’s required is some strategic thinking around where the opportunities lie and what the error tolerance might be in those applications. Even state-of-the-art machine learning systems have intrinsic error. Therefore error risk must always be accounted for, even if the error size is tiny.

My ideas about optimal implementation of machine learning borrow heavily from ideas in portfolio optimization, especially the efficient frontier. This is the “sweet spot” in the tradeoff between rewards and risks. As machine learning makes its way into more applications, it’s worth taking the time to consider both the upside and the downside. Measures of optimality can only help to make more informed decisions about how to apply the latest technology.
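To make the efficient-frontier analogy concrete, here is a minimal sketch that screens candidate ML applications, keeping only those not dominated on both benefit and error risk. The application names and numbers are invented for illustration:

```python
def efficient_frontier(candidates):
    """Return the names of applications not dominated by another
    candidate with higher benefit and lower (or equal) error risk."""
    frontier = []
    for name, benefit, risk in candidates:
        dominated = any(
            b >= benefit and r <= risk and (b > benefit or r < risk)
            for n, b, r in candidates if n != name
        )
        if not dominated:
            frontier.append(name)
    return frontier

# (benefit, error risk) pairs are made up for illustration.
apps = [
    ("product recommendations", 0.6, 0.1),
    ("inventory ordering",      0.8, 0.5),
    ("fraud pre-screening",     0.5, 0.4),  # dominated by recommendations
]
```

Here “fraud pre-screening” drops out because “product recommendations” offers more benefit at less risk; the remaining candidates represent genuine tradeoffs worth debating.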

From Paper to Production: Shortening the Ramp

One of the things that strikes me about the current state of machine learning is how long it still takes to get a new algorithm or model into production. From the time that 1) a paper is published to when 2) its contents are evaluated by those doing machine learning in industry and 3) they subsequently commit to developing it, years have passed. It does not need to be this way.

Those doing machine learning are understandably wary of newer methods, and I can see why they might opt to give a method time for hidden problems to be discovered before committing. The long-term viability of a model is often judged by its very ability to remain on the scene after many years, which, though vaguely tautological, remains valid. Models that survive the process are deemed essential, timeless. The rest are disregarded.

There are two sides from which this can be viewed: the business risk side, and the development side. These are deeply intertwined.

The Risk Side

Product managers and tech leadership who are coming to grips with the reality that is the pervasiveness of machine learning have an increasing number of considerations, many of them in technology areas that may not be within their expertise. There may be a bit of silver lining, however: the discussion regarding what machine learning technology makes its way into new products is fundamentally one of risk. To the extent that they can work with those within their organization or its allies who have the expertise to accurately assess the risk of newer methods, they can quantify the level of risk for when things go wrong.

For instance, if a new method can drop the error rate of a certain type of prediction down to 3% (as opposed to a previous 5%), how does that affect the risk statistics of the business? Does it enable broader distribution or reach into a higher market segment? Does it enable new products entirely? These questions must be answered.
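As a back-of-the-envelope sketch of that first question (all numbers below are hypothetical):

```python
def expected_error_cost(volume, error_rate, cost_per_error):
    """Expected cost of model mistakes: volume x error rate x unit cost."""
    return volume * error_rate * cost_per_error

# Hypothetical numbers: 100k predictions per month, $2 average cost per mistake.
old_cost = expected_error_cost(100_000, 0.05, 2.0)  # about $10,000/month
new_cost = expected_error_cost(100_000, 0.03, 2.0)  # about $6,000/month
savings = old_cost - new_cost                       # roughly $4,000/month back
```

A two-point drop in error rate translates directly into dollars, which is a number a product manager can weigh against the development cost of the new method.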

New qualitative capabilities may seem more difficult to judge, but that is not necessarily true. For instance, some newer capabilities involve an AI system describing what’s in a photograph, using completely natural language with understandable sentences. If a product manager or CTO is considering using this capability in a new product or feature, the error rate of the method can still be used to assess the risk to the business that the new feature exposes them to. The degree of risk will vary widely by the specific industry and application, but the process remains the same.

The Development Side

Even if all parties can agree that a new method is tempting enough to use in a feature, somebody still has to code the thing. This is where progress is sluggish. Developing new, unfamiliar models and validating them is a nontrivial effort, even for experienced ML programmers. Assuming you’re doing the first implementation in a given language or environment, it requires a degree of getting into the thought process of the researchers. Often, direct correspondence is needed to clarify details.

While many papers include pseudocode that can be readily translated into a programming language, just as many do not. In that case, you are left to develop a deep understanding of the model’s description and translate its mathematical definition and data structures into a complete implementation. It’s hard work.

This is the part where things can slow down: without a clear understanding of the model and its behavior, it is not possible for a tech lead, data scientist, or ML developer to make accurate judgements about the level of risk or the likelihood of bugs or other surprise behavior. Beyond the error rate, one has to assume that the resulting implementation will have its own quirks and bugs. To assume otherwise would be both unrealistic and foolish.

Many companies may be slower to adopt “bleeding edge” methods, then, because it is simply too difficult to enumerate the implied capabilities and to quantify the risk they impose. How can this be solved?

Shorten the Ramp

Consider the situation where there is some new deep learning model X and a company really wants to use it in their products, but may not have a good way of reasoning through the consequences of doing so. We can point out the main issues:

  • It can be a challenge to arrive at an exact error rate for the specific application before an implementation has been made. The paper will use test datasets, but the model will almost surely behave differently with the data specific to a feature.
  • There is often a break in the communication between those gaining understanding of the model and those assessing how it may affect the business overall. The overall effect could be anything from a smash hit to a total disaster.
  • Even when a model is finished, it will need to land in an environment in which to run. Engineers should keep the infrastructure requirements in mind from the beginning.
  • In a waterfall or waterfall-like process, it is of course not possible to create requirements in the absence of understanding of the capabilities involved. This stalls progress.
  • Agile development is out the window, due to the high sensitivity of the relationship between model performance and feature risk or cost. These aren’t really the kinds of things you can just “ship first, iterate later”. Much needs to be worked out before it goes into the hands of feature developers.

All of this points toward two ways to shorten the ramp to deployment:

  1. If a company is genuinely interested in adopting new algorithms and models in their offerings, they need to provide representative data as soon as possible.
  2. Their engineering and/or data science team(s) need to have tools and infrastructure to support rapid prototyping of new models.

Only with datasets that are representative of what will occur in a production setting can a team judge their implementation and profile its performance. In the paper, a model may boast 90-something percent accuracy, but you may find that for your problem it is a little less, thereby affecting how risky the investment in developing the model is.

This can happen before the implementation phase, by looking at the test data used in a paper. A talented developer or data scientist can format the internal dataset to be similar to that used in a test dataset from the paper, thereby reducing opportunity for errors to arise from differences in data formatting.
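A minimal sketch of that formatting step, assuming a paper whose test set is a CSV of id/text/label columns; the internal field names here are hypothetical:

```python
import csv

def to_paper_format(internal_records, out_path):
    """Re-shape internal records into the column layout of a paper's
    test set.  The schema and field names here are hypothetical."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "text", "label"])  # the paper's assumed columns
        for rec in internal_records:
            # Map internal keys onto the paper's columns.
            writer.writerow([rec["order_id"], rec["review_body"], rec["rating"]])
```

Once internal data sits in the same shape as the paper’s test set, the published evaluation scripts can often be reused as-is, which removes a whole class of formatting bugs from the comparison.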

An example process, then, might look like this:

  1. Product manager decides they need X new capability in their next phase of features.
  2. Their first order of business, then, is to gather and build datasets that are close to the actual problem. At the very least, they should ensure that whoever can build that dataset has access to all the tools and data sources needed to complete it quickly.
  3. Product manager hands over dataset, high-level requirements, and asks data science and/or engineering team(s) to begin investigating models.
  4. Technical team either begins profiling models they already know about or scouting for models that are known to enable the required capabilities.
  5. Development / prototyping begins with the selected model(s) and the datasets provided.
  6. Throughout development process, error rates and other important metrics are reported back to the product manager (or whomever is overseeing the process).
  7. Risk calculations are adjusted as this information flows in. For example, if it’s a photo auto-tagging feature in question, one can determine how many users are likely to experience incorrectly tagged photos and how often, based on the volume of photos and the error rate of the model. From that, one can determine how much of a risk it is to the business – are the users likely to leave if they experience the error, or is it not a deal-breaker?
  8. Once all models have been profiled and tested, a decision can be reached about whether or not to proceed with the feature.

Of course not every company has a product manager or entire teams for data science and engineering, but the overall structure can be applied by those filling the roles – even if it’s all the same person.

In summary, the best way I see to shorten the time from a published paper to a viable production implementation is to 1) provide data as early as possible and 2) ensure engineering has tools to quickly prototype and test models. Both are difficult, but both will pay off considerably for those willing to put in the effort.

I hope this has been helpful to you. Please ask questions in the comments.



Technology and the Limits of Convenience

I need an Apple Watch. Badly. I need it because the distance between my wrist and my coat pocket is simply too much, because I need to save that extra second when checking my phone for notifications. I need it because I need one more device to monitor my health.

While I’m at it, I also need an app that saves me a few seconds booking a table, finding the right bar for my Saturday night, and so on. Hell, anything that can save me those precious seconds throughout my hectic day will have my dollar.

The obvious facetiousness aside (I don’t need any of those things), I’m growing weary of seeing so many startups without missions. Don’t get me wrong, I’m all for creating amazing products. There is, however, an eventual lack of authenticity in the endless striving for greater convenience. These pure-convenience plays face continually diminishing returns.

It is easy to mistake one-off convenience for recurring utility. Entrepreneurs have gotten all too good at tricking themselves – and investors – into believing that their ad hoc gimmick will scale to epic proportions and keep compounding on its original value. Yet only the most central and important of product functions will see this happen. At the periphery, most utility is exhausted almost immediately.

It’s well known that the vast majority of startups fail. You can’t make them all into winners. You can, however, stress the importance of real, lasting, and growing utility. The more apps I see, and the more pitches I hear for the latest and greatest fad ever to hit the App Store, the more I feel like we need to focus on using technology to help us be better people. Beyond convenience, and beyond mere utility, there lies a realm of innovation wherein products are not actually products at all, but catalysts for social movements. Those same movements can help us become better citizens of humanity.

The job of the entrepreneur in all times before has been to find and capture economic opportunity. Now, however, a higher calling is in order: entrepreneurs need to rise to the challenge of taking the higher-level principle of creating things that bring about positive social change, and finding specific opportunities to execute on in service of that greater goal. Building a business still takes as much savvy and boldness as ever, but with the new requirements of relevance to social context and mission-driven offerings. It may be the hardest problem of all, but it will turn out to be the most worthwhile.

We’ve had it all wrong.

All this time, we’ve had it all wrong.

Artificial Intelligence (AI) has been a science for over 50 years now, and in that time has accomplished some amazing things – computers that beat human players at chess and Jeopardy, find the best routes for delivery trucks, optimize drug delivery, and many other feats.  Yet the elusive holy grail of “true AI”, or “sentient AI”, “artificial general intelligence” – by whatever name, the big problem – has remained out of our grasp.

Look at what the words actually say though – artificial intelligence.  Are we sure that intelligence is really the crucial aspect to creating a sentient machine?

I claim that we’ve had it wrong.  Think about it: intelligence is a mere mechanical form, a set of axioms that yield observations and outcomes.  Hypothesis, action, adjustment – ad infinitum.  The theory has been if we could just create the recursively self-optimizing intelligence kernel, BOOM! – instant singularity.  And we’d have our AGI to run our robots, our homes, our shipping lanes, and everything imaginable.

The problem with this picture is that it assumes intelligence is the key underlying factor.  It is not.

I claim the key factor is…

…wait for it…

…consciousness.

Consciousness might be defined as how ‘aware’ an entity is of itself and its environment, which might be measured by how well it can distinguish where it ends and its environment begins, its sense of agency with reference to past actions it performed, and a unified experience of its surroundings that gives it a constantly evolving sense of ‘now’.  This may overlap with intelligence, but it is a different goal: looking in the mirror and thinking “that’s me” is different than being able to beat humans at chess.  A robot understanding “I broke the vase” is different than an intelligence calculating the Voronoi diagram of the pottery’s broken pieces lying on the floor.

Giulio Tononi’s work rings a note in harmony with these ideas.  Best of all, he and others discuss practically useful metrics of consciousness.  Whether Integrated Information Theory is the root of all consciousness or not is immaterial; the point is that this is solid work in a distinctly new direction, and approaches the fundamental problems of AI in a completely new way.

Tononi’s work may be a viable (if perhaps only approximate) solution to the binding problem, and in that way could be immensely useful in designing systems that have a persisting sense of their evolving environment, leading us to sentience.  It is believable that intelligence may be an emergent property of consciousness, but it seems unlikely that intelligence alone is the ingredient for consciousness itself, and that somehow a certain ‘amount’ of intelligence will yield sentience.  One necessarily takes precedence over the other.

Given this, from now on I’ll be focusing my work on Artificial Consciousness, which will differ from Artificial Intelligence namely in its goals and performance metrics: instead of how effectively an agent solved a problem, how aware it was of its position in the problem space; instead of how little error it can achieve, how little ambiguity it can achieve in understanding its own boundaries of existence (where the program ends and the OS begins, where the robot’s body ends and the environment begins).

I would urge you to read Tononi’s work and Adam Barrett’s work here.  My Information Theory Toolkit has several of the functions you’ll need to start experimenting on systems with a few more lines of code (namely, the Kullback-Leibler divergence).
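Independent of any particular toolkit, the Kullback-Leibler divergence between two discrete distributions takes only a few lines; here is a minimal sketch:

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in bits, for discrete distributions given as
    probability lists over the same outcomes."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Identical distributions diverge by zero; a certain outcome measured
# against a fair coin diverges by exactly one bit.
```

Note that KL divergence is asymmetric and blows up when Q assigns zero probability to an outcome P considers possible, which matters when estimating distributions from finite samples.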

In the coming months, I’ll be adding ways to calculate the Information Integration of abstracted systems, i.e., their Phi value.  This computation is NP-hard, so it will have to remain in the domain of small systems for now.  Nonetheless, I believe that if we start designing systems with the intent of maximizing their integration, it will yield some system topologies with more beneficial properties than our usual ‘flat’ system design.
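To give a flavor of why this blows up, here is a toy integration measure – the minimum mutual information across any bipartition of a small binary system. This is a simplified stand-in of my own, not Tononi’s full Phi (which uses a normalized minimum-information partition), but it shows the brute force over bipartitions that makes the exact computation intractable for large systems:

```python
import math
from collections import defaultdict
from itertools import combinations

def bipartition_integration(joint, n):
    """Toy 'integration' of an n-variable binary system: the minimum
    mutual information (in bits) across any bipartition.

    `joint` maps each n-bit state tuple to its probability.  This is a
    simplified stand-in, not Tononi's Phi; the loop over bipartitions is
    what makes exact measures blow up combinatorially.
    """
    def marginal(idxs):
        m = defaultdict(float)
        for s, p in joint.items():
            m[tuple(s[i] for i in idxs)] += p
        return m

    best = float("inf")
    indices = range(n)
    for k in range(1, n // 2 + 1):
        for part_a in combinations(indices, k):
            part_b = tuple(i for i in indices if i not in part_a)
            pa, pb = marginal(part_a), marginal(part_b)
            mi = sum(
                p * math.log2(p / (pa[tuple(s[i] for i in part_a)]
                                   * pb[tuple(s[i] for i in part_b)]))
                for s, p in joint.items() if p > 0
            )
            best = min(best, mi)
    return best

# Two perfectly correlated bits share 1 bit across the only bipartition;
# two independent bits share none.
```

Even this toy version enumerates every bipartition, so the runtime grows exponentially with system size – exactly the obstacle that keeps exact Phi confined to small systems.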

Artificial Intelligence will no doubt continue to give us great advances in many areas, but I for one am embarking on a quest for something subtly but powerfully different: Artificial Consciousness.

Note: If you have some programming skill and would like to contribute to the Information Theory Toolkit, please fork the repository and send me an email so we can discuss possibilities.  I’ll continue to work on this as I can.

The Best & Worst of Tech in 2013

Keeping with tradition, I’ll review some of the trends I noticed this past year, and remark on what they might mean for those of us working in technology.


JavaScript becomes a real language!

To some this might seem trivial, but there’s a lot to be said here.  With the massive growth of Node.js and many associated libraries, Google’s V8 engine has been stirring up the web world.  Write a Node.js program, and I guarantee you’ll never think of a web server the same way again.

Why is this good?  This isn’t an advertisement for Node.js, but I would posit that these developments are good because they open up entire new worlds of productivity – rapid prototyping, readable code, and entirely new ways of thinking about web servers.  Some folks are even running JavaScript on microcontrollers now, a la Arduino.  JavaScript has been unleashed from the confines of the browser, and is maturing into a powerful tool for creating production-quality systems with high scalability and developer productivity.  Exciting!

Cognitive Computing Begins to Take Form

Earlier this year I stated a belief that 2013 would be the year of cognitive systems.  Well, that hasn’t been fulfilled completely, but we’ve nonetheless seen some intriguing developments in that direction.  IBM continues to chug away at their cognitive platforms, and Watson is now deployed full-time as an AI M.D. of sorts.  Siri has notably improved from earlier versions.  Vicarious used their algorithms to crack CAPTCHA.  Two rats communicated techepathically (I just made that word up) with each other from huge distances, and people have been controlling robots with their minds.  It’s been an amazing year.

The cognitive computing/cybernetics duo is going to change, well, everything.  I would argue that cybernetics may just top the list of most transformative technologies, but it has a ways to go before we go full Borg.

Wearables Start to Become a Thing

Ah, wearables.  We’ve waited for nifty sci-fi watches for so long – and lo!  They have come.  Sort of.  They’re on their way, and we’re starting to catch glimpses of what this will actually mean for technology.  I agree with Sergey Brin here: it’ll get the technology out of our hands and integrated into our environment.  Personally I envision tech becoming completely seamless and unnoticeable, nature-friendly and powerful, much like our own biological systems, but that’s another article entirely.

Wearable technology will combine with the “Internet of Things” in ways we can’t yet imagine, and will make life a little easier for some and much, much better for others.

Internet of Things

The long-awaited Internet of Things is finally starting to coalesce into something real.  Apple is filing patents left and right for connected home gear, General Electric is making their way into the space with new research, and plenty of startups are sprouting to address the challenges in the space (and presumably be acquired by one of the big players).

This development is so huge it’s almost difficult to say what it will bring.  One thing is for sure: the possibilities are only limited to one’s imagination.

21st Century Medicine is Shaping up to be AWESOME

Aside from the fact that we now have an artificial intelligence assisting in medical diagnosis, there have been myriad amazing developments in medicine.  From numerous prospects for cures for cancer, HIV, and many other diseases to the advances in regenerative medicine and bionanotechnology, we’re on the fast track to a future wherein medical issues can be resolved quickly and with relatively little pain.  There’s also a different perspective: solve the issue at the deepest root, instead of treating symptoms with drugs.


Every Strategy is a Sell Strategy

This year, tech giants went acquisition-mad.  It seems like every day one of them has blown another few billion dollars on some startup somewhere.

Why is this bad?  It may be good for the little guy (startup) in the short term – they walk away with loads of cash – but in the long term I suspect it will have a curious effect.  It’s almost like business one-night-stand-ism.  You build a company knowing full well that you’re just going to sell it to Google or Facebook.  If not, you fold.

You can see where this goes.  People are often saying they look forward to ‘the next Google’, or ‘the next Facebook’, or whatever.  Well there might not be any.  That is, all the big fish are eating the little fish before they have the chance to become big fish.  Result?  Insanely huge fish.

It’s great that a couple of smart kids can run off, Macbook Pros in hand, and [potentially] make a few billion bucks in a few years, with or without revenue.  But who is going to outlast the barrage of acquisition offers and become the next generation of companies?

Big Data is Still not Clearly Defined

Big Data.  Big data.  BIG.  DATA.

What does it mean?

The buzzword and its many ilk have been floating around for a couple of years now, and still nobody can really define what it does.  Most seem to agree it goes something like: prop up a Hadoop cluster, mine a bunch of stale SQL records in massive company/organization, cast the MapReduce spell and – Hadoopra cadabra!  Sparkling magical insights of pure profit glory appear, fundamentally altering life and the universe forever – and sending you home with bigger paychecks.

I’m all for data analysis.  In fact I believe that a society that makes decisions based on hard evidence and good data-crunching is a smart society indeed.  But the ‘Big Data’ hype has yet to form into anything definitive, and remains a source of noise.  (Big data fanboys, go ahead and flame in the comments.)

America’s Innovation Edge Dulls

It’s true.  I hate to admit it, but it is, undeniably, absolutely true.  America has dropped the ball when it comes to innovation.  That’s not to say we’re not innovating cool things, generating economic activity and all of that – we are.  But that shine has started to fade.  Specifically, America has a problem with denying talented people the right to be here and work.

It could be our hyper-paranoid foreign policy in the wake of 9/11, it could be the flawed immigration system, it could be Washington gridlock or a million other things.  It’s not particularly fruitful to pass the blame now.  We’re turning away the best and the brightest from around the world, and simultaneously continuing to outsource some of what used to be our core competencies.  The bright spot in all of this is that high-tech manufacturing seems to be making a comeback, perhaps in part thanks to 3D printing, but it’s not quite enough.  We need more engineers, more inventors, and more people from outside our borders.  This has always been the place people come to plant the seeds of great ideas.  Let’s stay true to that.

21st Century Mathematics

What does mathematics look like in the 21st century?  I’m in no position to make any declarations, what with not being an expert on math history and all, but I’d like to offer up a couple of brief observations to think on.

If I had to name one candidate for the overall flavor of 21st century mathematics, I’d say complex adaptive systems.  Why?  Because it encapsulates the transition we’re seeing from the rigid, linear, and static to the complex, nonlinear, and dynamic.  I think a lot of this in particular has been motivated by a couple of things: 1. our great and ever-increasing numbers as humans, and 2. the increasing complexity of the technology we use to accomplish our tasks.  Given an exponential increase in population coupled with an exponential increase in the complexity of the technology being used virtually every second of every day, some new mathematics were due to emerge.  Among the more interesting examples you have things like fractal geometry, cellular automata, and ‘system of systems’.  New variations of these and other approaches are appearing daily in academic journals, and some make it to market.
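Of those examples, cellular automata are the easiest to see in action; a one-dimensional “elementary” automaton fits in a few lines (a sketch):

```python
def step(cells, rule=110):
    """One update of an elementary cellular automaton (wraparound edges).

    A cell's next state is the bit of `rule` indexed by its
    (left, self, right) neighborhood read as a 3-bit number.
    """
    n = len(cells)
    return [(rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
            for i in range(n)]

# A single live cell under Rule 110 grows a famously complex pattern
# from an utterly simple update rule:
row = [0] * 10 + [1] + [0] * 10
history = [row]
for _ in range(5):
    row = step(row)
    history.append(row)
```

Rule 110 is a nice emblem of the shift described above: a trivially simple, fully specified local rule whose global behavior is rich enough to be computationally universal.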

The pace is increasing, to the degree that the landscape is changing faster than anybody can keep up with.  That’s technology as a whole.  For mathematics, an entirely new era is on its way in, motivated by society and the thirst for new technology.  The reigning paradigms of this century will likely be vastly complex networked systems, and how to describe them accurately.  This includes anything from social networks to artificial intelligence, transportation systems (including space traffic), economics, neuroscience, biology – almost anything you can think of.  What’s becoming apparent is that everything that was off limits to traditional mathematics is becoming accessible through new frameworks.

These are exciting times!  The future is bright, and there is surely no end to the amount of adventure a motivated person can have this century.