Deep Learning & Temporal Modeling

In most discussions I’ve seen of deep learning, and certainly most of the models demonstrated, there is no discussion of temporal sequence modeling. I was curious where the state of the art was for this task, and wanted to compare it against some of my own intuitions about sequence modeling.

As a first pass, I found a handful of papers that discuss using stacked Restricted Boltzmann Machines in various configurations to achieve temporal learning – they are “Robust Generation of Dynamical Patterns in Human Motion by a Deep Belief Nets“, “Temporal Convolution Machines for Sequence Learning“, and “Sequential Deep Belief Networks“. The last of these three lays out their approach nicely in one sentence: “An L-layer SDBN is formed by stacking multiple layers of SRBMs.” It is, in essence, the stacking game continued.

While the approaches described in the three papers above would seem to yield decent results, I often wonder about extensibility and scalability. RBMs / DBNs have lovely analytical properties, and cleverly get around intractable subproblems with sampling schemes, but their topology is nonetheless locked in all setups I’ve seen. This leaves no room for simulated neurogenesis.

Why would simulated neurogenesis be important? Moving the synapse weights around in a fixed topology yields nice results, but there may be situations where we would want the topology to grow as needed – namely, if the network did not have a representation for a given pattern, it could create one. This assumes we are OK with some inefficiency: rather than ensuring the existing representation space is adequately filled before generating new neurons, we trade a greater amount of space in memory for the ability to grow.
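As a rough illustration of the idea – a toy sketch, not my actual implementation, with made-up thresholds and names – a network that “grows a neuron” when nothing represents the input well enough might look like this:

```python
import math

class GrowingPrototypeNet:
    """Toy network that 'grows a neuron' (a new prototype) when no
    existing unit represents the input well enough."""

    def __init__(self, threshold=1.0, lr=0.1):
        self.threshold = threshold  # novelty threshold for growth
        self.lr = lr                # learning rate for nudging winners
        self.prototypes = []        # each unit is a list of floats

    def _dist(self, a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def observe(self, x):
        """Return the index of the unit representing x, growing if needed."""
        if self.prototypes:
            i, d = min(
                ((i, self._dist(p, x)) for i, p in enumerate(self.prototypes)),
                key=lambda t: t[1],
            )
            if d <= self.threshold:
                # Familiar pattern: move the winning unit toward the input.
                self.prototypes[i] = [
                    p + self.lr * (xi - p) for p, xi in zip(self.prototypes[i], x)
                ]
                return i
        # Novel pattern: create a new unit for it.
        self.prototypes.append(list(x))
        return len(self.prototypes) - 1

net = GrowingPrototypeNet(threshold=1.0)
net.observe([0.0, 0.0])   # first input always grows a unit
net.observe([0.1, 0.0])   # close to unit 0: refines it, no growth
net.observe([5.0, 5.0])   # far from everything: grows unit 1
```

This is essentially the spirit of growing-network schemes like growing neural gas, stripped to its bare minimum.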

Lately I’ve been working on these types of networks, and it is most certainly difficult to do. First you have the basic foundation of any neural network – learning spatial patterns. That part is relatively easy: you can quickly build something that learns to model spatial patterns and produce labels for them.

The next step is creating the dynamics that model temporal sequences. For example, a sequence of words: if I say “four score and-“, many of you will instantly think “seven years ago”. This is a sequence. The spatial patterns are the specific combinations of letters to form words, and the temporal pattern is the sequence of those words. We learn this kind of thing with relative ease, but for machines this is a huge task.

Unsurprisingly, it is in this step that things get complicated. The Deep Learning approach noted above is to specify a fixed time depth. In one paper T=3, meaning the model can handle sequences 3 time steps deep – in our example, up to 2 words in advance. When the network receives “four score and”, if it knows the sequence in question then it is thinking of “seven years”. When it gets to “seven”, it is thinking “years ago”.
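A crude stand-in for the fixed-depth idea (not any of the papers’ actual models) can be written in a few lines: train a lookup from each word to the T−1 words that followed it:

```python
from collections import defaultdict

# Toy fixed-depth sequence model: with depth T=3, each state stores
# the next T-1 = 2 tokens that followed it during training.
T = 3

def train(tokens):
    table = defaultdict(list)
    for i in range(len(tokens) - (T - 1)):
        table[tokens[i]].append(tuple(tokens[i + 1 : i + T]))
    return table

def anticipate(table, token):
    """Return the learned continuations (up to T-1 tokens ahead)."""
    return table.get(token, [])

model = train("four score and seven years ago".split())
anticipate(model, "and")    # [('seven', 'years')]
anticipate(model, "seven")  # [('years', 'ago')]
```

The limitation is right there in the constant: anything that matters more than T steps back is invisible to the model.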

Note that in one of the papers above, they use a different subnetwork entirely to model temporal constraints. Again, while this is nice from an analytical point of view, it likely has little to do with the way the brain works. The brain is essentially a hundred billion cells that know nothing but how to behave in response to electrical and chemical signals. Their emergent behavior gives rise to your consciousness. The brain, at least as far as anybody can tell, does not have a “temporal network”. Temporal information is learned as a natural consequence of the dynamics of the vast network of cells. Somehow, we need to figure out a way to model temporal information inline with the spatial information, and make it all fit together nicely.

That said, the approach I’ve come to borrows a bit from Deep Learning, a little bit from Jeff Hawkins’ and Dileep George’s work on spatiotemporal modeling, and a little from complex systems in general. I’ve been searching for the core information overlap between different approaches, and have found some commonalities. From that, I’ve come to some notes of practice.

First, it almost always becomes necessary to sacrifice analytical elegance for emergent behavior. In many ways, emergent behavior is innately chaotic, and therefore difficult to model mathematically. Stochastic methods yield some insight, but there are higher-level states of emergence that may not be obvious from analysis of a single equation or definition of a system. In this case, it is simpler in practice to use heuristic methods to find emergent complexity, and attempt to characterize it as it is discovered, rather than attempt to discover all possible states from the definition of the system. A characteristic of chaotic systems is that you have to actually advance/evolve them in order to derive their behavior, as opposed to knowing in advance via analytical methods.
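The logistic map is the textbook example of having to evolve a system to know its behavior: a one-line equation whose long-run trajectory can’t be read off the definition, only iterated out.

```python
# The logistic map at r=4 is fully chaotic: nearby starting points
# diverge exponentially, so the only way to know where an orbit goes
# is to actually advance it step by step.
def logistic_orbit(x0, r=4.0, steps=30):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two starting points differing by one part in a million end up
# on completely unrelated trajectories within a few dozen steps.
a = logistic_orbit(0.200000)
b = logistic_orbit(0.200001)
```

The same qualitative problem shows up in these networks: the interesting states are downstream of the dynamics, not visible in the equations.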

Second, and further emphasizing the use of heuristic methods, tuning the model with genetic algorithms tends to yield better results than attempting to solve for optimal parameters explicitly. Perhaps this is merely a difference in style, and if I’d spent more time studying complex systems I might know of better ways to do this, but at my current level of understanding a simple genetic algorithm that swarms over parameter configurations yields better results than attempting to understand what the “perfect network” might look like. There is a philosophical difference in that with this method you’re using the machine to understand the machine, in some sense surrendering control and rendering of insights to the machine itself. Genetic algorithms may be able to find subtleties that my slightly-more-evolved-ape-brain will miss or otherwise fail to conceptualize merely from the definition of the system and associated intuitions.
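For the curious, a bare-bones version of the kind of parameter-swarming GA I mean looks like the sketch below, with a toy fitness function standing in for “how well did the network do with these parameters?” (the function, population sizes, and mutation rate here are all illustrative, not tuned values from my work):

```python
import random

random.seed(0)

def fitness(params):
    # Stand-in for evaluating a network configuration.
    # Here: a toy function peaked at the parameter pair (0.3, 0.7).
    x, y = params
    return -((x - 0.3) ** 2 + (y - 0.7) ** 2)

def evolve(pop_size=30, generations=40, mutation=0.05):
    pop = [[random.random(), random.random()] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]       # keep the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = random.sample(survivors, 2)
            child = [(a + b) / 2 + random.gauss(0, mutation)
                     for a, b in zip(p1, p2)]  # crossover + mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()  # ends up near (0.3, 0.7) without ever "solving" for it
```

The point is that nothing in `evolve` understands the fitness landscape; it just swarms over it, which is exactly the surrender of analytical control described above.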

Deep Learning is advancing quickly, and while it offers some interesting food for thought when attempting to solve temporal modeling problems, I am not yet sold on the notion that it is the final answer to this more general problem. Choices of representation and methods of optimization may be trivial in some cases, but when they differ greatly from the norm they may yield some advantage. Not only that, but sticking closer to the way the brain represents information has done nothing but improve the performance and capabilities of the resulting systems. The path I’m on may all come to nothing, or it may shine light on some new ways to think about temporal modeling problems.

A closing note: A whitepaper describing my work is underway, so you can stare in awe at some formidable-looking equations and cryptic diagrams. I’ve been several years down this path now, and it’s high time to encapsulate all of the work done in a comprehensive overview.


Context & Permutations

In the pursuit of Artificial General Intelligence, one of the challenges that comes up again and again is how to deal with context.  To illustrate: telling a robot to cross the street would seem simple enough.  But consider the context that five minutes ago somebody else told this robot not to cross the street because there was some kind of construction work happening on the other side.  What does the robot decide to do?  Whose instruction does it consider more important?

A robot whose ‘brain’ did not account for context properly would naively go crossing the street as soon as you told it to, ignoring whatever had come before.  This example is simple enough, but you can easily imagine other situations in which the consequences would be catastrophic.

The difficulty in modeling context in a mathematical sense is that the state space can quickly explode, meaning that the number of ways that things can occur and sequences they can occur in is essentially infinite.  Reducing these effective infinities down to manageable size is where the magic occurs.  The holy grail in this case is to have the computational cost of the main algorithm remain constant (or at least linear) even as the number of possible permutations of contextual state explodes.

How is this done?  Conceptually, one needs to represent things sparsely, and have the algorithm that traverses this representation only take into account a small subset of possibilities at a time.  In practice, this means representing the state space as transitions in a large graph, and only traversing small walks through the graph at any given time.  In this space-time tradeoff, space is favored heavily.
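In sketch form (with hypothetical state names, of course), the idea looks something like this: store only the transitions that actually occurred, and walk only a short path through them at query time.

```python
from collections import defaultdict

class ContextGraph:
    """Sparse state-space graph: only transitions that actually occurred
    are stored, and queries walk a small neighborhood at a time."""

    def __init__(self):
        self.edges = defaultdict(dict)  # state -> {next_state: count}

    def observe(self, state, next_state):
        self.edges[state][next_state] = self.edges[state].get(next_state, 0) + 1

    def likely_walk(self, state, depth=3):
        """Greedy walk of bounded depth: per-step work depends on the
        local out-degree, not on the size of the whole state space."""
        path = [state]
        for _ in range(depth):
            nxt = self.edges.get(path[-1])
            if not nxt:
                break
            path.append(max(nxt, key=nxt.get))  # follow most-seen transition
        return path

g = ContextGraph()
g.observe("told: cross", "check context")
g.observe("check context", "warned: construction")
g.observe("warned: construction", "wait")
g.likely_walk("told: cross")
# ['told: cross', 'check context', 'warned: construction', 'wait']
```

Memory grows with observed history, but each decision touches only a handful of edges – the space-for-time trade in miniature.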

The ability to adeptly handle context is of utmost importance for current and future AIs, especially as they take on more responsibility in our world.  I hope that AI developers can form a common set of idioms for dealing with context in intelligent systems, so that they can be collaboratively improved upon.

We’ve had it all wrong.

All this time, we’ve had it all wrong.

Artificial Intelligence (AI) has been a science for over 50 years now, and in that time has accomplished some amazing things – computers that beat human players at chess and Jeopardy, find the best routes for delivery trucks, optimize drug delivery, and many other feats.  Yet the elusive holy grail of “true AI”, or “sentient AI”, “artificial general intelligence” – by whatever name, the big problem – has remained out of our grasp.

Look at what the words actually say though – artificial intelligence.  Are we sure that intelligence is really the crucial aspect to creating a sentient machine?

I claim that we’ve had it wrong.  Think about it: intelligence is a mere mechanical form, a set of axioms that yield observations and outcomes.  Hypothesis, action, adjustment – ad infinitum.  The theory has been if we could just create the recursively self-optimizing intelligence kernel, BOOM! – instant singularity.  And we’d have our AGI to run our robots, our homes, our shipping lanes, and everything imaginable.

The problem with this picture is that it assumes intelligence is the key underlying factor.  It is not.

I claim the key factor is…

…wait for it…

Consciousness.

Consciousness might be defined as how ‘aware’ an entity is of itself and its environment.  This might be measured by how well it can distinguish where it ends and its environment begins, its sense of agency with reference to past actions it performed, and a unified experience of its surroundings that gives it a constantly evolving sense of ‘now’.  This may overlap with intelligence, but it is a different goal: looking in the mirror and thinking “that’s me” is different than being able to beat humans at chess.  A robot understanding “I broke the vase” is different than an intelligence calculating the Voronoi diagram of the pottery’s broken pieces lying on the floor.

Giulio Tononi’s work rings a note in harmony with these ideas.  Best of all, he and others discuss practically useful metrics of consciousness.  Whether Integrated Information Theory is the root of all consciousness or not is immaterial; the point is that this is solid work in a distinctly new direction, and approaches the fundamental problems of AI in a completely new way.

Tononi’s work may be a viable (if perhaps only approximate) solution to the binding problem, and in that way could be immensely useful in designing systems that have a persisting sense of their evolving environment, leading us to sentience.  It is believable that intelligence may be an emergent property of consciousness, but it seems unlikely that intelligence alone is the ingredient for consciousness itself, and that somehow a certain ‘amount’ of intelligence will yield sentience.  One necessarily takes precedence over the other.

Given this, from now on I’ll be focusing my work on Artificial Consciousness, which will differ from Artificial Intelligence primarily in its goals and performance metrics: instead of how effectively an agent solves a problem, how aware it is of its position in the problem space; instead of how little error it can achieve, how little ambiguity it can achieve in understanding its own boundaries of existence (where the program ends and the OS begins, where the robot’s body ends and the environment begins).

I would urge you to read Tononi’s work and Adam Barrett’s work here.  My Information Theory Toolkit has several of the functions you’ll need to start experimenting on systems with a few more lines of code (namely, use the Kullback-Leibler divergence).
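For reference, the KL computation itself is only a few lines; here is a standalone version (not the toolkit’s own API) for discrete distributions:

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) in bits, for discrete
    distributions given as equal-length probability lists."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

uniform = [0.25, 0.25, 0.25, 0.25]
skewed  = [0.70, 0.10, 0.10, 0.10]
kl_divergence(skewed, uniform)   # ≈ 0.64 bits
kl_divergence(uniform, uniform)  # 0.0
```

It quantifies how surprised you’d be using Q as a model when the world actually follows P, which is the basic currency of the information measures Tononi and Barrett build on.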

In the coming months, I’ll be adding ways to calculate the Information Integration of abstracted systems, or their Phi value.  This computation is NP-hard, so it will have to remain in the domain of small systems for now.  Nonetheless, I believe if we start designing systems with the intent of maximizing their integration, it will yield some system topologies that have more beneficial properties than our usual ‘flat’ system design.
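Phi proper is expensive, but a related and much cheaper quantity – the multi-information, or total correlation – gives a feel for what “integration” measures.  Here is a toy version (a rough proxy for intuition, not Tononi’s Phi, which additionally minimizes over partitions):

```python
import math
from collections import Counter
from itertools import product

def entropy(dist):
    """Shannon entropy in bits of a {outcome: probability} mapping."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def multi_information(joint):
    """Total correlation: sum of marginal entropies minus joint entropy.
    `joint` maps tuples of unit states to probabilities."""
    n = len(next(iter(joint)))
    marginals = []
    for i in range(n):
        m = Counter()
        for states, p in joint.items():
            m[states[i]] += p
        marginals.append(m)
    return sum(entropy(m) for m in marginals) - entropy(joint)

# Two perfectly correlated binary units: one full bit of integration.
coupled = {(0, 0): 0.5, (1, 1): 0.5}
# Two independent fair coins: zero integration.
independent = {s: 0.25 for s in product([0, 1], repeat=2)}
multi_information(coupled)      # 1.0
multi_information(independent)  # 0.0
```

Designing “for integration” means preferring topologies that score high on measures like this rather than on task error alone.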

Artificial Intelligence will no doubt continue to give us great advances in many areas, but I for one am embarking on a quest for something subtly but powerfully different: Artificial Consciousness.

Note: If you have some programming skill and would like to contribute to the Information Theory Toolkit, please fork the repository and send me an email so we can discuss possibilities.  I’ll continue to work on this as I can.

The Best & Worst of Tech in 2013

Keeping with tradition, I’ll review some of the trends I noticed this past year, and remark on what they might mean for those of us working in technology.


JavaScript becomes a real language!

To some this might seem trivial, but there’s a lot to be said here.  With the massive growth of Node.js and many associated libraries, Google’s V8 engine has been stirring up the web world.  Write a Node.js program, and I guarantee you’ll never think of a web server the same way again.

Why is this good?  This isn’t an advertisement for Node.js, but I would posit that these developments are good because they open up entire new worlds of productivity – rapid prototyping, readable code, and entirely new ways of thinking about web servers.  Some folks are even running JavaScript on microcontrollers now, a la Arduino.  JavaScript has been unleashed from the confines of the browser, and is maturing into a powerful tool for creating production-quality systems with high scalability and developer productivity.  Exciting!

Cognitive Computing Begins to Take Form

Earlier this year I stated a belief that 2013 would be the year of cognitive systems.  Well that hasn’t been fulfilled completely, but we’ve nonetheless seen some intriguing developments in that direction.  IBM continues to chug away at their cognitive platforms, and Watson is now deployed working full time as an AI M.D. of sorts.  Siri has notably improved from earlier versions.  Vicarious used their algorithms to crack CAPTCHA.  Two rats communicated techepathically (I just made that word up) with each other from huge distances, and people have been controlling robots with their minds.  It’s been an amazing year.

The cognitive computing/cybernetics duo is going to change, well, everything.  I would argue that cybernetics may just top the list of most transformative technologies, but it has a ways to go before we go full Borg.

Wearables Start to Become a Thing

Ah, wearables.  We’ve waited for nifty sci-fi watches for so long – and lo!  They have come.  Sort of.  They’re on their way, and we’re starting to catch glimpses of what this will actually mean for technology.  I agree with Sergey Brin here: it’ll get the technology out of our hands and integrated into our environment.  Personally I envision tech becoming completely seamless and unnoticeable, nature-friendly and powerful, much like our own biological systems, but that’s another article entirely.

Wearable technology will combine with the “Internet of Things” in ways we can’t yet imagine, and will make life a little easier for some and much, much better for others.

Internet of Things

The long-awaited Internet of Things is finally starting to coalesce into something real.  Apple is filing patents left and right for connected home gear, General Electric is making their way into the space with new research, and plenty of startups are sprouting to address the challenges in the space (and presumably be acquired by one of the big players).

This development is so huge it’s almost difficult to say what it will bring.  One thing is for sure: the possibilities are only limited to one’s imagination.

21st Century Medicine is Shaping up to be AWESOME

Aside from the fact that we now have an artificial intelligence assisting in medical diagnosis, there have been myriad amazing developments in medicine.  From numerous prospects for cures for cancer, HIV, and many other diseases to the advances in regenerative medicine and bionanotechnology, we’re on the fast track to a future wherein medical issues can be resolved quickly and with relatively little pain.  There’s also a different perspective: solve the issue at the deepest root, instead of treating symptoms with drugs.


Every Strategy is a Sell Strategy

This year, tech giants went acquisition-mad.  It seems like every day one of them has blown another few billion dollars on some startup somewhere.

Why is this bad?  It may be good for the little guy (startup) in the short term – they walk away with loads of cash – but in the long term I suspect it will have a curious effect.  It’s almost like business one-night-stand-ism.  You build a company knowing full well that you’re just going to sell it to Google or Facebook.  If not, you fold.

You can see where this goes.  People are often saying they look forward to ‘the next Google’, or ‘the next Facebook’, or whatever.  Well there might not be any.  That is, all the big fish are eating the little fish before they have the chance to become big fish.  Result?  Insanely huge fish.

It’s great that a couple of smart kids can run off, Macbook Pros in hand, and [potentially] make a few billion bucks in a few years, with or without revenue.  But who is going to outlast the barrage of acquisition offers and become the next generation of companies?

Big Data is Still not Clearly Defined

Big Data.  Big data.  BIG.  DATA.

What does it mean?

The buzzword and its many ilk have been floating around for a couple of years now, and still nobody can really define what it does.  Most seem to agree it goes something like: prop up a Hadoop cluster, mine a bunch of stale SQL records in a massive company/organization, cast the MapReduce spell and – Hadoopra cadabra!  Sparkling magical insights of pure profit glory appear, fundamentally altering life and the universe forever – and sending you home with bigger paychecks.

I’m all for data analysis.  In fact I believe that a society that makes decisions based on hard evidence and good data-crunching is a smart society indeed.  But the ‘Big Data’ hype has yet to form into anything definitive, and remains a source of noise.  (Big data fanboys, go ahead and flame in the comments.)

America’s Innovation Edge Dulls

It’s true.  I hate to admit it, but it is, undeniably, absolutely true.  America has dropped the ball when it comes to innovation.  That’s not to say we’re not innovating cool things, generating economic activity and all of that – we are.  But that gloss has started to tarnish.  Specifically, America has a problem with denying talented people the right to be here and work.

It could be our hyper-paranoid foreign policy in the wake of 9/11, it could be the flawed immigration system, it could be Washington gridlock or a million other things.  It’s not particularly fruitful to pass the blame now.  We’re turning away the best and the brightest from around the world, and simultaneously continuing to outsource some of what used to be our core competencies.  The bright spot in all of this is that high-tech manufacturing would seem to be making a comeback, perhaps in part thanks to 3D printing, but it’s not quite enough.  We need more engineers, more inventors, and more people from outside our borders.  This has always been the place people come to plant the seeds of great ideas.  Let’s stay true to that.

A slight misunderstanding

Often in discussions of artificial intelligence I see and hear the quote, “The brain does around X calculations per second.”  Usually this number is around 100 trillion.  Why?

This is presumed because the brain is said to have about 100 trillion synapses between all of its neurons.  By treating each synapse as a computational element capable of performing an action based on a stimulus, the brain is then modeled as “something doing 100 trillion calculations per second.”

There are several problems with this:

1. What is the nature of these “calculations” we’re talking about?  Is this simple addition, probability tables, or differential equations?

2. The language of “per second” could wrongly imply that the brain somehow runs on a constant master clock.

3. Are there additional layers of important information being exchanged beyond merely the synapses?

In more detail,

#1: Given the tendency to want to measure things in FLOPS (Floating Point Operations Per Second), I can see why this approach would be appealing.  It’s as simple as just counting how many “computational elements” are in the system, and then saying it can do that many calculations per second, right?  I’m led to think, well, no.  A FLOP is likely to be an extremely simple operation such as addition or multiplication.  Something more complex, such as linear algebra routines or probability functions, will require sophisticated code and hence numerous instructions/FLOPS to execute it.  The argument that “the brain does 100 trillion calculations per second and therefore we will have true AI when computers can do 100 trillion CPS” then is as useful as saying “a human is made of 150 lbs of matter so when you have 150 pounds of matter you’ll have a human”, or something equally ridiculous.  The number of calculations is not completely unimportant, but it is secondary to what kind of calculations are being done.  In the case of classical computers, as stated, complex instruction sets use up many operations to do their work, and so a raw measure of simple calculations isn’t very informative of the system’s overall capability.
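To put numbers on how fast “simple operations” pile up, here is a naive matrix multiply that counts its own primitive operations – a single “calculation” in the everyday sense already costs about 2·n³ FLOPs:

```python
def matmul_counting_flops(A, B):
    """Naive n-by-n matrix multiply that counts its primitive FLOPs."""
    n, flops = len(A), 0
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]  # one multiply + one add
                flops += 2
    return C, flops

identity = [[1.0, 0.0], [0.0, 1.0]]
C, flops = matmul_counting_flops(identity, identity)
# Even this tiny 2x2 product takes 2*n^3 = 16 primitive operations;
# a 1000x1000 product takes about two billion.
```

So counting “computational elements” tells you almost nothing until you know what kind of operation each element is doing.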

#2: We’re used to thinking of calculations happening in a uniform, clock-like way because of the way our chips are designed.  The problem is, the brain, as far as we can tell, processes everything asynchronously.  Each node is operating more or less independently of the others.  That’s not to say that classical computers won’t be useful in emulating brain-like mechanics, but modeling an asynchronous system with a highly synchronized one comes with complications that should not be ignored.

#3: This also ties in with #1.  Namely, due to the passing of information through electrochemical channels, there are additional layers of computation that, as of yet, have been neither completely modeled nor understood.  The actual communication mechanisms of the brain could be simpler than we thought, or they could turn out to be vastly complex.  It’s anybody’s guess right now.  But in any case, a direct conversion from calculations per second (as in the brain) to something like FLOPS (as in classical computers) is like saying “a machine that can add 10 numbers together in a second can also solve 10 high-order partial differential equations per second”.  With extremely clever software something like this may eventually be possible (perhaps a map from addition operators to matrix solvers or something like that), but for now this kind of crude conversion is wildly inaccurate.

I worry that a lot of people are buying this idea that once we get 100 TeraFLOPS machines we’ll somehow have an uber-AI.  Unless software comes a long way, those who are counting on this idea may be very disappointed when this emergence doesn’t happen.

It is worth noting that a quantum computer used for AI would be a completely different picture – different from both classical computers and from the brain.  A behemoth of the quantum variety would be capable of things that neither an Intel i7 nor a human brain can do, but that is another discussion entirely.

Until next time, then.

The Best and Worst of Tech in 2012

The following is a list of what is, in my humble opinion, the best and worst technology and tech culture trends in 2012.  I’ll start with some of the best, go over the worst, and end with more of the best as well as some points for 2013.


1. Aquaponics is taking leaps forward.  Vertical farming, hydroponics, and aquaponics are all picking up steam in the marketplace, especially as Kickstarter and IndieGoGo projects sprout almost daily.  I’ve long felt that we needed better ways to grow food, in urban areas or places with poor soil, and aquaponics has long been a viable solution.  It just hasn’t been widely popularized until now.

2. Quantum physics often makes headlines.  I understand a lot of physicists would see this as a bad thing, given how badly the concepts are botched, but I see the net effect as positive.  More people are curious about deep matters of physics now than ever before, as far as I can tell.  CERN has done a lot to drive this forward, but they are by no means the only player.

The more that quantum physics moves into early education, the better off the next generation will be.  So far, I’d say we’re off to a decent start.

3. Open, online education takes off.  Between Khan Academy, EdX, Coursera, and the countless other superb online education sites, anybody anywhere can now get a decent education despite all other factors in life.  This is especially true as it pertains to programming.  You can now learn nearly any programming language online, for free.  This has already changed education as we know it, and all signs point to its continuing to do so.

4. As already mentioned, crowdsourcing has changed a lot about how we think about projects, businesses, and community.  Especially awesome are the community spaces that are popping up, such as makerspaces, art co-ops, etc.  Despite a shaky economy and the worst national debt in our history, people are pressing forward and doing what they love.

The global scale of these projects is also impressive.  Now there are Kickstarters spanning multiple nations via their founders, demonstrating the platform’s ability to link people of common interest together.

5. Open source software is more prevalent than ever – and is now complemented by open source hardware.  From Hadoop to the Apache suites, Arduino to Raspberry Pi (whose open source-ness may be in question), CentOS to Fedora – open source is king.  This opens up numerous pathways for new development that can’t presently be imagined.

6. Cloud computing gets a foothold.  It’s taken a while, from my perspective, from when I first started hearing about the idea until it started to really take hold.  I think it’s safe to say that cloud computing has finally become a market standard.  Many would argue as to when that was exactly, but I’ll just leave it at “sometime in 2012”.

7. Chips are designed to consume less energy, and produce less heat.  This goes hand in hand with #6 above: people started to realize how much juice we were really taking up, and did something about it.  Not only is this good for businesses, but good for the planet too – in case you somehow forgot that pitch.

8. Parallel computing grows.  From Adapteva to Intel’s new 48-core designs, we’re finally starting to see significant advances in truly parallel processing.  GPUs have had their part to play in this as well, with suites like CUDA and OpenCL driving massive markets forward.

And now, some of THE WORST:

1. The patent wars continue.  There are no measures which can adequately describe how damaging this has been to business, innovation, and ultimately, progress.  Between Apple, Samsung, Google, Oracle, and Microsoft, it’s been fire and brimstone all year in the courthouses.  The only bright side is that patent attorneys are getting paid by the truckload.

One could write an entire thesis on this and barely scratch the surface.  For the purposes of this blog, suffice it to say that the effects of this onslaught will be felt for a long, long time (due to precedents set, laws made, etc.).

2. This follows naturally from #1: the US patent system still does not have comprehensive support for software or algorithms.  Patenting anything in this vein reduces to invoking the kind of convoluted sophistries that only those fluent in legalese could produce.  Acknowledging that software/algorithm patents are a touchy issue, the least we could do is disambiguate what can and cannot be patented.  Without clarity, good business cannot be done.

3. Everything starting with “i”.  iCar.  iWork.  iBot.  iGarage.  It’s gotten both ridiculous and annoying.  In the first place, mimicking Apple is not always the wisest thing to do.  And secondly, it really couldn’t be that hard to be more creative with your names, could it?

4. Everything is “smart”.  SmartCard.  SmartGlass.  SmartTires.  SmartSmart.  SmartSmartSmart.  This too has gotten utterly ridiculous.  Putting “smart” in front of a generic product name does not say anything about the product or the company.  I could make a “SmartStick”, and that wouldn’t change what it most certainly is: a stick.  Likewise, a ‘SmartCard’ is still just a card.  Better to make some indication of what the thing actually does or is useful for.  Even “MetroCard” is fine: it’s a card that you use for the metro.

5. Apps.  I don’t have any problem with apps themselves, but the feverish craze that has everybody chattering like caffeine-injected chipmunks about the latest apps drives me crazy.

Of course mobile technologies have added tremendous value to many aspects of our lives.  I get that.  And of course it’s exciting – they’re nifty gadgets with lots of even niftier sub-gadgets (apps).  But the marketing craze and the lingo surrounding the whole thing has gone far beyond acceptable.  (Too often when I tell people that I develop software, they just default and say, “Oh, so you make apps.”  No.  I do not make apps.  Software is infinitely more than apps.)

To celebrate the insanity, I came up with some great product names this year: iSmartApp.  iAppSmartSmartApp.  iLookSmart.  My personal favorite, a tech company: iSmartSolutions.

Buzzwords are always annoying, but I promise I won’t complain about any more of them in this posting.

6. Still, nobody seems to have a clue as to what to do about the cybersecurity problem.  There’s a lot of panic presented by the media – no surprise there – but there is also reason to be concerned.  As of yet, nobody in particular has stepped up to the plate to offer a feasible fix, however temporary.  As devices grow closer and closer to users (and eventually implanted in the users), this is hugely important.  Quantum-encrypted interconnects would be a nice start, and fortunately there is lots of active research toward that, but it may be taking too long.

More of THE BEST:

9. Space technology is beginning to bloom in the private market.  By now, almost every tech person knows this, but it’s still good to realize how important this is for the human race.  Getting out into the stars has been the ultimate dream, and it is soon to be a reality.

10. 3D printing has also seen huge improvements.  As the precision and the number of materials one can work with continues to grow, this family of techniques is sure to transform manufacturing forever.

11. The world didn’t end after all.  We’ll all have 2013 to develop more amazing technologies and change the world.

2013 promises to be an industrious year.  The economy is going to come back stronger than ever, new tech is right on the horizon, education is being transformed worldwide, and we’re starting to learn from our mistakes.  We have everything to hope for.

“How to Create a Mind” Review

I’ve just finished reading Ray Kurzweil’s new book, “How to Create a Mind”.  In it I found a wealth of good information, especially in the form of thought experiments.

Kurzweil’s latest work ties in a mass of data about the brain, pattern recognition, and his own experiences, creating a sort of roadmap to creating strong AI.  The account is clearly written and concepts are well explained.  He includes some interesting research; perhaps most intriguing are the experiments with split-brain patients, which illuminate more subtle aspects of consciousness.

The grand theory presented in the book, the Pattern Recognition Theory of Mind, has some nice features.  It promises completely asynchronous processing to emulate the brain’s same ability, as well as uniformity of elements.  This uniformity is part of what enables arbitrary regions to be configured to process different types of information, given sufficient exposure to their respective type of data.

Kurzweil’s formulation of hierarchical pattern recognition seems to stem almost exclusively from Hidden Markov Models, specifically of the hierarchical variety.  While these models are indeed useful for many applications, missing throughout the book is an explanation of any explicit role of time.  In Jeff Hawkins’ “On Intelligence”, temporal patterns take a central role in the theories presented, distinguishing it from most traditional machine learning designs.  By contrast, Kurzweil’s PRTM (Pattern Recognition Theory of Mind) does not take time directly into account.  We’re left to assume that temporal patterns are implicit in the changing of spatial patterns, though some definite remark on that would have been helpful.
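For readers unfamiliar with HMMs, the flat, first-order machinery underneath the hierarchical variety is just the forward algorithm – here with hypothetical “silence”/“speech” states and made-up probabilities, not anything from the book:

```python
def forward(obs, states, start, trans, emit):
    """Forward algorithm for a first-order HMM: the probability of an
    observation sequence, summing over all hidden state paths."""
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {
            s: sum(alpha[r] * trans[r][s] for r in states) * emit[s][o]
            for s in states
        }
    return sum(alpha.values())

states = ["silence", "speech"]
start = {"silence": 0.6, "speech": 0.4}
trans = {"silence": {"silence": 0.7, "speech": 0.3},
         "speech":  {"silence": 0.4, "speech": 0.6}}
emit  = {"silence": {"quiet": 0.9, "loud": 0.1},
         "speech":  {"quiet": 0.2, "loud": 0.8}}
forward(["quiet", "loud"], states, start, trans, emit)  # ≈ 0.209
```

Note that time enters only as the ordering of `obs`; there is no explicit temporal variable, which is precisely the ambiguity I wish the book had addressed.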

Most of the book’s real value does not come from detailed algorithmics or mathematical ingenuity, but again from the deep and illuminating thought experiments presented.  Kurzweil has a way of exposing subtle relationships in concepts that no other author can, save Marvin Minsky (another personal favorite, who was a mentor of Kurzweil’s).  The book delivers a powerfully enlightening look into the intricate world of pattern recognition, and presents fascinating and viable avenues of exploration for making intelligent machines.  Anyone who is interested in the brain, AI, robotics, or just technology in general should definitely give this a read.