This is your brain on Pong

Many, many times, I’ve sworn that I would give up AGI research and dedicate my time to things that are more productive. I’ve probably worked on nearly 100 projects. Some were bigger than others, but a quick calculation of the hours that I’ve probably spent coding, reading, thinking, or banging my head against the wall is depressing. None of the work, of course, produced an AGI, but I don’t actually feel too bad about that part because no one else’s work has either, to my knowledge. What’s upsetting is that I could have spent that time snowboarding, or learning violin, or building an internet company. And yet despite this, I’m no better than the village drunk who swears never to drink again: it seems like a good idea, and even doable, the morning after a project has been given up, but sooner or later I’m drinking the AGI punch again.

Of late I’ve run out of shiny new ideas, and so I did a mental survey of what I’ve already tried with the intent to pick what was probably the most successful of my projects and try to expand upon it. GoID players may already be familiar with what I chose: the robotic arm. (This is one of the GoID tasks. Read the description there for the gory details.)

The essential problem is that you need to get the robotic wrist to a particular Cartesian point, using only angular impulses to do so, compensating for gravity, and you need to do it quickly and efficiently. The sensor values that are provided are accurate enough that inverse kinematics could be used, but this is supposed to be an AGI problem, and no one believes that real life uses kinematics, so we’ll have none of that here, thank you. The solution that I wrote (which is an enhanced version of the sample script) uses experience gathered by brute force and ignorance to build up a mapping from pairs of “where I am” and “where I want to be” to “what I need to do to get there”. This approach creates a multi-dimensional problem space that needs to be reasonably populated with samples before it becomes useful. It takes a fair amount of time to search the space, and it takes a great deal of memory to remember all of the samples. Regardless, the result is that this relatively simple script starts off mostly just flailing about, but in time is realistically reaching for targets with determination. For those of you with children, I won’t need to make the obvious analogy.
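To make the idea concrete, here’s a toy sketch of that kind of experience map. The class, the field names, and the nearest-neighbour distance metric are all illustrative assumptions, not the actual script:

```python
import math

class ExperienceMap:
    """Toy memory of (current state, goal) -> action samples,
    queried by nearest neighbour. Everything here is illustrative."""

    def __init__(self):
        self.samples = []  # list of (state, goal, action) tuples

    def remember(self, state, goal, action):
        """Record one brute-force experience."""
        self.samples.append((state, goal, action))

    def recall(self, state, goal):
        """Return the action whose recorded (state, goal) pair is
        closest to the query; None while memory is empty."""
        if not self.samples:
            return None

        def distance(sample):
            st, gl, _ = sample
            return math.dist(st, state) + math.dist(gl, goal)

        return min(self.samples, key=distance)[2]

mem = ExperienceMap()
mem.remember(state=(0.0, 0.0), goal=(1.0, 1.0), action="impulse_a")
mem.remember(state=(0.5, 0.5), goal=(1.0, 1.0), action="impulse_b")
chosen = mem.recall((0.4, 0.6), (1.0, 1.0))  # nearest stored sample wins
```

A linear scan like this is exactly the “fair amount of time to search the space” problem; a real version would want a spatial index.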

And what are our brains but massive storage? Ok, a bit more than just that. But in particular our cerebellum, where movement plans are carried out, moderated by the stream of sensory input, contains around half of our brain cells even though it takes up only a fraction of the space. I believe that this part of our brain uses a learning and execution strategy that is similar to what I implemented in the robotic arm task. Come on, after all my years of research, if only by probability, I had to get something right.

On a related note I started playing tennis again, and so naturally began wondering how it is I learn to hit the ball over the net so that it lands in play. I mean seriously, when you think about it, what are the odds? Sure enough, when you watch beginners play they rarely even get the racket to touch the ball, much less hit a winner. And there are so many variables at work from the moment your opponent hits the ball to the moment you hit it back. (Some would even say it begins before your opponent hits the ball.) You use your projections of the trajectory of the ball to run into place and position your body. You watch very carefully as the ball hits the ground because that initial information about how it bounces is critical. You might even have already incorporated the ball’s spin – if you can make it out – into how it will bounce. Then you wind up for your swing, which again is taking the ball’s spin into account, if only subconsciously, because things are now happening so quickly. But you also note from your peripheral vision where your opponent is so that you don’t make your shot too easy to return. And finally, you start your swing and hope your muscle memory does its best, because it’s impossible to make conscious decisions in the milliseconds it takes for the swing to complete.

In fact, the better you are, the fewer conscious decisions you make during an entire point. You might note that your opponent is in a weak position and rush to the net hoping to make a quick volley, all without conscious help because your subconscious has learned to recognize the combination of variables that leads to this winning conclusion.

You may be thinking that I want to build a tennis-playing robot. I do, but I won’t, although it would be so very cool. It would also be very expensive. (As you all know, all of my research is self-funded, which is to say not funded.) The plan is to create a version of Pong that bots can play. One key initial problem to solve is how a bot can learn movement patterns that play out in real time. So, initially the physical model will be trivial until mechanisms that solve such problems are worked out. Such a model might just have a massless ball that bounces off of walls and paddles at basic reflection angles. Improvements to the game could have a ball that can spin, or paddles that are rounded, or mechanisms on the paddles that can add or remove energy from the ball, or bots that can move horizontally from the baseline. The “real” environment should be complex enough that no bot can predict very far in advance with any accuracy. The intent is to build a bot that can eventually play the game with some skill that has been refined from continued experience.
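The trivial starting model might look something like this minimal sketch: a massless point ball reflecting off the top and bottom walls at equal angles, with paddles and everything else left out (all names and units are placeholders):

```python
def step_ball(pos, vel, dt, height):
    """Advance a massless point ball one tick and reflect it off the
    bottom (y=0) and top (y=height) walls at basic reflection angles.
    Paddle contact, spin, and energy changes are deliberately omitted."""
    x, y = pos
    vx, vy = vel
    x += vx * dt
    y += vy * dt
    if y < 0:            # crossed the bottom wall: mirror back inside
        y, vy = -y, -vy
    elif y > height:     # crossed the top wall: mirror back inside
        y, vy = 2 * height - y, -vy
    return (x, y), (vx, vy)

pos, vel = (0.0, 0.5), (1.0, -1.0)
pos, vel = step_ball(pos, vel, dt=1.0, height=5.0)
```

Mirroring the overshoot back inside (rather than just clamping) keeps the reflection angle correct even when the ball crosses the wall mid-step.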

I suspect that the most interesting work will be in determining optimizations in the learning process, experience retention, and retrieval. It will probably be necessary early on to implement some manner of hierarchy so that general strategies of play can be established that break down into real movement plans.

But the first thing to do is create the simulation of the court. Box2D might work, but I don’t know yet if it handles things like spinning balls bouncing off of walls, or ball spin causing a curved trajectory.
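If Box2D turns out not to model spin-induced curvature, one workaround would be to apply a Magnus-style force by hand each simulation step. This is a rough generic 2D sketch, not Box2D code, and the coefficient is an assumed lumped constant:

```python
def magnus_force(spin, vel, coeff=0.1):
    """Approximate 2D Magnus force: perpendicular to the velocity and
    proportional to the spin (angular velocity). `coeff` lumps together
    air density, ball radius, etc. -- an assumed constant, not physics
    taken from any library."""
    vx, vy = vel
    # Rotate the velocity 90 degrees and scale by spin: positive spin
    # curves the trajectory one way, negative spin the other.
    return (-spin * coeff * vy, spin * coeff * vx)

# Ball moving along +x with topspin gets pushed sideways (+y here).
fx, fy = magnus_force(spin=1.0, vel=(1.0, 0.0))
```

In a Box2D setting this force would be applied to the ball body once per step; in a hand-rolled simulation it just gets added to the ball’s acceleration.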

Work in progress. Comments welcome.

8 thoughts on “This is your brain on Pong”

  1. Interesting article. I’ve used Box2D myself to create a platform game and I think it can do everything you need. You can set friction on the ball and walls, and the effect of spin when the ball rubs against the wall on contact will be taken into account correctly. So you could simulate a snooker game with sidespin on the ball affecting the trajectory, for example. The only non-physical thing I’ve noticed Box2D doing is with spring joints: they always get friction even if you try to turn it off, so the objects connected by the spring eventually lose kinetic energy whatever you do.

    I had thought about writing a program to play a platform game (based on Box2D). The computer could use a tree-structure system to consider future possibilities and go for the best outcome. The AI could use Box2D when thinking, e.g. it “realises” that if it pushes a crate to the right it will fall off its platform onto the ground.

    “no one believes that real life uses kinematics” But why not try this? Use Box2D to do “internal rehearsal” of what physical objects will do. You don’t need to invert the kinematics, just run forward and predict some possible outcomes.


    1. But why not try this?

      That’s certainly worth a try, yes. What I meant was that the brain is not suited to full kinematics calculations. But a reasonable approximation of kinematics might be something the brain could do. What you propose might still be too much for real wet-ware, but it would be interesting to see how well it works, and what problems arise.


  2. One of the most successful developments in AI is chess. It’s so successful we fall into the “a computer can do it, so it can’t have been intelligent after all” trap. So why not use the same techniques in the physical world? Let’s say you design a robot hand to pick up a ball. Here’s the analogy:

    Figure of merit. Chess: having more pieces; robot arm: height of the ball.

    What can I do. Chess: move my pieces; hand: move my fingers.

    Consequences of my actions. Chess: moving my piece into a square results in capturing an opponent’s piece; hand: consequences of moving my fingers predicted by Box2D (might move the piece, knock it over, grasp it).

    You can then use max/min or alpha/beta etc. to search and find the optimal move. E.g. just as a chess computer realises it must surround its opponent’s king with more than one piece, the hand computer can figure out it needs to grasp the ball with at least two fingers. Just one finger will merely slide the ball sideways.

    Now I know real space is continuous and chess is discrete, but it seems the blockworld described above isn’t THAT complex, especially with Box2D to do the “consequences of my actions”. Has this ever been tried?

    Couldn’t we create a household robot this way? Not AGI but still something useful.
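    For what it’s worth, the max/min search being proposed can be sketched generically. This is an illustrative depth-limited minimax with placeholder callbacks (the move generator, successor function, and figure of merit would come from Box2D in the scheme above):

```python
def minimax(state, depth, maximizing, moves, apply_move, score):
    """Depth-limited minimax over a discrete game. `moves(state)` lists
    legal moves, `apply_move(state, m)` returns the successor state, and
    `score(state)` is the figure of merit. All callbacks are illustrative
    placeholders, not tied to any particular simulation."""
    legal = moves(state)
    if depth == 0 or not legal:
        return score(state)
    results = (
        minimax(apply_move(state, m), depth - 1, not maximizing,
                moves, apply_move, score)
        for m in legal
    )
    return max(results) if maximizing else min(results)

# Toy game: the state is a number, each move adds +1 or -1, and the
# maximizer's figure of merit is the number itself.
best = minimax(0, 2, True,
               moves=lambda s: [1, -1],
               apply_move=lambda s, m: s + m,
               score=lambda s: s)
```

Alpha/beta pruning would bolt onto this same skeleton; the open question raised in the post is whether any such search can keep up with a continuous, fast-moving world.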


    1. Yes, chess is one of the most successful developments of AI. This blog, however, is about AGI. I’ll leave it to the master:

      Deep Blue can play chess, but it doesn’t know to come in from the rain. – Marvin Minsky

      It might be, and quite likely is, possible to create a household robot using kinematics (which is the classical way of doing what you propose). But this defeats the purpose of finding biologically feasible ways of solving the problem, i.e. to do it like the brain does it. Do note that a move in chess is a discrete action; a “move” in tennis is a long series of quick actions, with a complex movement plan that is constantly being revised given new information. These are not very comparable. You mention making the choice between using one or two fingers, but that is a discrete choice in a universe where the atomic decision is orders of magnitude smaller. If you tried using max/min or alpha/beta to search that space, you’d have lost the game days ago.

      You might (re-?)read this post, where I sort of explain why I don’t do classical AI. To summarize: it’s been done, and though it can work in narrow cases, overall it just doesn’t work very well.


  3. I certainly take your point here. One of the reasons I mentioned using chess-like tree searching is to allow the use of Box2D, which you mentioned. So the AI will use Box2D “in its head” to make predictions about the consequences of actions. The actual physical behaviour of the world is then also decided by a separate instance of Box2D. I can see Box2D fitting into a search algorithm but can’t see how it would help a neural net. That is one of the great mysteries: how can animal brains apparently create virtual worlds including the past and possible futures?

    “with a complex movement plan that is constantly being revised given new information”

    Doesn’t a chess computer have precisely this? It constantly predicts and then revises based on its opponent’s actual moves.

    Also, I’ve heard it’s currently impossible to design a neural net to play chess. Yet the human brain can play chess, so what is the difference? I think the answer to this question may have profound consequences…


  4. > Yet the human brain can play chess so what is the difference?

    The human brain doesn’t do a deep mechanical search. It recognizes patterns and correlates them with winning moves. There is no way a brain can play like a computer within any reasonable time frame. But what I think is interesting is that we could program computers to play more like humans. That would probably teach us a lot about the brain.

    > Doesn’t a chess computer have precisely this?

    I wouldn’t say “precisely”, no. But I see your point. The words of that sentence mean different things in each context. A “move” in chess is a distinct action that happens on the order of many seconds. A “move” in the brain is a messy change that happens in milliseconds, maybe faster. “New information” to a chess program is the opponent’s distinct last move. But to a brain it’s a torrent of input patterns. You have to normalize the time frames and the quantities of information to say that they are the same. You can of course choose to do this, but there are practical limits that need to be considered. Even a computer that worked orders of magnitude faster than the human brain would have a tough time operating in the real world, if it were to try to search the future too deeply, and rely on distinct information.

    In the case of my Pong idea, I think what you are proposing would result in a bot that sits idle calculating its next ideal move while the ball sails on by. You also need to remember that input data is not necessarily exact. The closer the ball comes, the better your information is, and so you need to be constantly re-calculating and tweaking your movement plan to compensate for previous inaccuracies (floating point error?). Deep search would never be able to keep up. Frankly, I believe that’s why the human brain doesn’t do it.
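    That constant re-planning resembles a receding-horizon loop: plan shallowly from the latest (and steadily improving) estimate, act, and immediately re-plan. A minimal sketch with placeholder callables, nothing more:

```python
def receding_horizon(observe, plan, act, ticks):
    """Re-plan every tick from the freshest observation instead of
    searching deeply once up front. `observe`, `plan`, and `act` are
    illustrative placeholders for sensor input, a cheap shallow
    planner, and motor output respectively."""
    for _ in range(ticks):
        estimate = observe()          # fresher, more accurate each tick
        next_action = plan(estimate)  # shallow plan only: stay cheap
        act(next_action)              # commit, then loop and revise

# Demo with stand-ins: observations count up, the "plan" doubles them,
# and actions are just logged.
log = []
readings = iter(range(10))
receding_horizon(lambda: next(readings), lambda e: e * 2, log.append, 3)
```

The trade-off is explicit: each individual plan is worse than a deep search would produce, but the loop never falls behind the ball.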


  5. Good points here, though the ball would not “sail on by” as things only move in the Box2D world when we issue a “step()” command. We can choose to wait until the AI has done its calculating. But of course the real world is different.

    Perhaps you can help me with this. Does a neural network have a goal in any way? Can it be programmed to optimise some figure of merit? A max/min style algorithm (for playing chess or solving a slider puzzle) constantly tries to improve its figure of merit (number of pieces captured, number of pieces in the correct place), but we cannot tell an NN to optimise a figure of merit in this way.

    So how do we “tell” the pong NN it is supposed to win? Or, in other words, where does animal motivation come from?


  6. Box2D is intended to provide a realistic real-time simulation. If you performed the deep calculations you are proposing with each step() iteration, you would no longer have real-time.

    I believe that any AGI needs to have motivation. I don’t do neural nets (in the sense of collections of simulated neurons), so I don’t know how to implement motivation there or where it comes from in real brains. But my Pong agents will certainly be motivated to win somehow. I’m still working out how action selection will work, so I don’t have an answer yet. But progress is being made at least: the simulated court is working, and I have a plan for starting the agent development.

