That Moment When An AI Velociraptor Learns To Turn Off The Lights
Originally published on Medium.
As Plastic Dinosaur has wandered around the large warehouse space it would call home if it had that concept, it’s been learning as it observes the mobile bipeds it shares the space with. They are more nimble, erect and noisy than it is, for now.
As it dreams, it has started to recognize a pattern. When these bipeds reach out with their upper limbs and touch things, the environment changes in ways it’s curious about. It learns to pay more attention to what human hands touch. That’s how it learned to open doors: it saw human hands reach out, and the barrier that contained it swung open.
At night, when the humans leave, they reach out and touch a point on the barrier, a rectangle that sticks out a bit with a couple of further raised panels on it. And the lights go off. It’s learned that when it is dark around it, the glowing images that map roughly onto the shapes of the humans are the same as the humans. It’s learned to recognize people in the dark.
Curious, it walks over to the panel and pokes at it with its snout. Nothing happens. It pokes again, and the lights go off. It bounces its snout off the panel a few times and the lights go on and off randomly. Eventually, it stops being interested in this because it is hungry. The lights happen to be off. It walks back to its charging block, something that is easy to do in the dark as the block glows warmly. Its sensors tell it how far away everything around it is, so it moves smoothly through the dark and settles down to charge.
It dreams a thousand dreams of light switches and wakes up in the morning when the lights come on. It interacts with Josh, the biped that attacks it with sticks and balls, running from him through the warehouse. And then it turns out the lights. Josh glows in Plastic Dinosaur’s vision, but Josh is blind in the dark.
Shrieking ensues.
This is an article in the series that David Clement, co-founder of Senbionic, and I are collaborating on about the state of the art of neural networks and machine learning, using a fictional robotic velociraptor as a fun foil. It has rubber teeth and claws, so don’t worry about the shrieking. The first article dealt with its body, the second with its neural network brains, and the third with attention loops and features and how they can be used to train a neural network. The fourth dealt with the robot developing a prejudice due to the limitations of machine learning and sample sizes. The fifth dealt with an interesting situation where its instantiated curiosity and model for learning led it to learn to open doors. Yes, shrieking.
Plastic Dinosaur has an aluminum and plastic skeleton, he’s wrapped in a silvery gray smart cloth that has sensors for temperature and movement, and he carries a lot of different types of sensors, including visual sensors such as thermal imaging, and ultrasonic sensors, which mean that in pitch darkness he is completely aware of his immediate surroundings. When the lights are out, humans glow in his eyes and he can walk with perfect confidence toward them without stumbling over anything. Not scary at all.
This story deals with how neural nets learn to pay attention to specific features, giving them priority in a continuum of what they pay attention to. This is referred to as salience, a key aspect of how humans pay attention. We have limited cognitive abilities and sensor sets, and in order to survive, we’ve learned to see some things more clearly and quickly than other things. Studies of salience have been core to neuroscience for decades, but have been given new wind beneath their wings by neural nets which allow deeper insights.
The image above shows machine learning heat maps predicting where human eyes will rest on images. If a group of people looked at these images while cameras recorded where their eyes went, the glowing features are the ones they would have spent the most time looking at. The different colors provide gradations of attention, with hotter colors denoting more attention.
But these maps don’t come from tracking human gazes on these specific images; they come from running images from a common data set past saliency-trained neural nets that predict what humans will look at most. As you can see, different neural nets get different results, but the overlap is remarkable. These are unconscious prediction machines looking at images they literally have no memory of, trained separately on data sets of human eye tracking.
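To make that training shape a bit more concrete, here is a minimal, hypothetical PyTorch sketch of a gaze-prediction net: a toy convolutional model maps an image to a heat map, and a KL-divergence loss (a standard choice in saliency prediction) pulls that heat map toward a fixation-density map built from eye-tracking data. The model, the tensor sizes, and the random stand-in data are all invented for illustration; real saliency models are far deeper and trained on large eye-tracking data sets.

```python
# Toy, hypothetical sketch of training a gaze-prediction net. Names, sizes,
# and the random stand-in data are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySaliencyNet(nn.Module):
    """Image in, single-channel 'where will people look' heat map out."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(32, 1, 1)  # one score per pixel

    def forward(self, x):
        logits = self.head(self.features(x))  # (B, 1, H, W)
        b, _, h, w = logits.shape
        # Normalize each map into a probability distribution over pixels so it
        # can be compared to a fixation-density map with KL divergence.
        return F.log_softmax(logits.view(b, -1), dim=1).view(b, 1, h, w)

model = TinySaliencyNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-ins for a real eye-tracking dataset: a batch of images plus
# fixation-density maps (each map sums to 1 over its pixels).
images = torch.rand(4, 3, 64, 64)
fixations = torch.rand(4, 1, 64, 64)
fixations = fixations / fixations.sum(dim=(2, 3), keepdim=True)

optimizer.zero_grad()
log_pred = model(images)
loss = F.kl_div(log_pred, fixations, reduction="batchmean")
loss.backward()
optimizer.step()
```

Trained on enough recorded fixations instead of random tensors, a model of this shape is what produces the kind of heat maps shown in the image above.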
The human gaze is predictable, and neural nets have been trained to predict it. These neural nets can now run in real time on your smartphone, looking at the world and predicting what you will find salient. I’ve looked at the live demos and screenshots, and it’s fascinating. It’s also interesting to see what they can’t yet recognize as interesting to us. People are always interesting to people, but in many cases today, the instantiated neural networks can’t recognize people in still pictures if they are part of a large visual streetscape. Where human vision would be drawn to them, at present computers don’t necessarily make that connection. This has implications, of course, for autonomous vehicles.
But the human gaze is not what’s happening with Plastic Dinosaur. As a reminder, he has curiousnet, which pays attention to interesting features outside of his body. It doesn’t figure out what is interesting in real time. When Plastic Dinosaur ‘dreams’, as described in earlier discussions of its offline learning cycle, the captured sensory inputs are assessed, and machine learning autodetection processes start triggering saliency based on simple features or on interactions that occur with sufficient frequency.
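As a rough illustration of that idea, here is a deliberately simplified, hypothetical sketch of such an offline pass: recorded observations are replayed, features that were present when the environment changed are tallied, and anything that crosses a frequency threshold is promoted to the salient set. Every name and number is invented for this sketch; the story’s curiousnet would be working with learned representations, not hand-labeled feature strings.

```python
# Simplified, hypothetical sketch of the offline 'dream' pass described above.
from collections import Counter
from dataclasses import dataclass

SALIENCE_THRESHOLD = 25  # arbitrary: how often a feature must matter before PD attends to it

@dataclass
class Observation:
    features: list[str]        # features detected in a recorded frame, e.g. "human_hand"
    environment_changed: bool  # did the world change right after this frame?

def dream_cycle(recorded_day: list[Observation], salient: set[str]) -> set[str]:
    """Replay the day's recordings and promote frequently 'interesting' features."""
    tally = Counter()
    for obs in recorded_day:
        if obs.environment_changed:     # lights toggled, a door opened, ...
            tally.update(obs.features)  # credit every feature present at the time
    for feature, count in tally.items():
        if count >= SALIENCE_THRESHOLD:
            salient.add(feature)        # pay more attention to this tomorrow
    return salient

# After enough frames where a hand and the wall panel were present when the
# lights changed, both cross the threshold and join the salient set.
day = ([Observation(["human_hand", "wall_panel"], True)] * 30
       + [Observation(["chair"], False)] * 100)
print(dream_cycle(day, set()))  # e.g. {'human_hand', 'wall_panel'}
```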
His neural net is truly alien to ours, but in our little story, over time curiousnet learned to consider human hands salient, to pay attention to them, to what they do, and to what they interact with. Curiousnet will watch human hands more than most things in PD’s environment, which might be a bit creepy if he didn’t see everything in his field of vision with equal accuracy, effectively giving him much better peripheral vision than we have. He doesn’t have to obviously look at something, as we do, to be paying attention to it. Different physical models of vision allow different ways of focusing.
Over time, the features that human hands interact with become salient as well. We touch things that are salient. We interact with things which control our environment. We turn lights on and off. We turn the heat up and down. We open and close doors. We write things. We type on computers. These are all interesting to an alien intelligence looking at us. What are we doing with our clever little fingers? Why? Can it be hacked?
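One way to picture that spillover, again as a hypothetical sketch rather than anything from the series, is salience leaking from an already-salient feature onto whatever it touches: each observed touch lets the touched object inherit a fraction of the toucher’s salience, so light panels, door handles, and keyboards gradually climb the attention ranking. The propagation factor, learning rate, and feature names below are arbitrary illustration values.

```python
# Hypothetical sketch of salience spilling over from hands to touched objects.
PROPAGATION = 0.5    # fraction of the toucher's salience a touched object can inherit
LEARNING_RATE = 0.1  # how quickly repeated observations move the estimate

salience = {"human_hand": 1.0}  # hands are already highly salient

def observe_touch(toucher: str, touched: str) -> None:
    """Nudge the touched object's salience toward a share of the toucher's."""
    target = PROPAGATION * salience.get(toucher, 0.0)
    current = salience.get(touched, 0.0)
    salience[touched] = current + LEARNING_RATE * (target - current)

# A few evenings of watching people hit the light panel on their way out:
for _ in range(20):
    observe_touch("human_hand", "light_switch_panel")

# light_switch_panel creeps toward 0.5, far above objects nobody touches.
print(sorted(salience.items(), key=lambda kv: -kv[1]))
```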
What isn’t PD learning to consider salient, according to this little story? Human faces. It’s face blind. Faces are, so far, irrelevant. If PD’s saliency maps were applied to the first row of pictures above, all of the hands would be glowing, not the faces. Humans have instinctual elements which cause significant focus on faces, but why would a neural net care unless we impose our biases on it? What value would a robotic dinosaur get from our faces? This story doesn’t explore it, but that doesn’t mean such value wouldn’t exist.
We can impose our biases about the salient on neural nets, and that has significant value, for example in looking at user experience wireframes and assessing how useful they are in delivering stakeholder value. But we will also be surprised by what the alien quasi-intelligences we are creating consider salient. Because of our own perceptual blinders, we will be surprised by what they pay attention to and what they do as a result.
Salience that isn’t based on our human biases and human sensory range brings interesting questions. What do neural nets truly consider significant? If given senses we don’t have, can we actually train them? What will they — assuming autodetection, Monte Carlo simulation, automatic domain randomization — learn to consider salient? And what will they do about it?