The Pursuit of A.I.
We believe that artificial intelligence
is a tool that can help us solve some of the world's most complex problems.
Problems like climate change,
access to water,
and fighting disease.
Artificial intelligence offers us a chance to accelerate progress
on questions like these
by acting as a multiplier for human ingenuity.
But in order to create artificial intelligence,
we first need to understand exactly what intelligence is.
The hallmark of human intelligence is its generality.
Nobody has expressed this idea more vividly
than science-fiction author Robert A. Heinlein:
"A human being should be able to change a diaper,
plan an invasion, butcher a hog,
conn a ship, design a building,
write a sonnet, balance accounts,
build a wall, set a bone,
comfort the dying, take orders, give orders,
cooperate, act alone,
solve equations, analyze a new problem,
pitch manure, program a computer,
cook a tasty meal,
fight efficiently, die gallantly.
Specialization is for insects."
After sifting through over 70 definitions of intelligence,
my colleague Shane Legg arrived at a very similar conclusion.
Intelligence measures an agent's ability
to achieve goals in a wide range of environments.
Or, in his favourite language: [mathematical formula on the screen]
So let's unpack this definition.
Who or what can be intelligent?
The answer is what we call an Agent:
Someone or something that can take action.
And this could be a human being, it could be a machine,
it could be a piece of software.
In order to be intelligent, an Agent needs to be able to achieve goals.
But not just a few goals, or very specific ones:
goals in a wide range of environments.
Clearly, the definition calls for a generalist.
Specialists need not apply.
So how would you go about building an intelligent machine?
Here's an idea: for every task or problem
think of how a human would solve it.
Then take the human solution
and encode it as a set of rules for the computer.
We sometimes call this approach GOFAI:
Good Old-Fashioned Artificial Intelligence.
This plausible approach has been tried,
but unfortunately it has mostly failed.
People are incredibly good at solving problems,
but incredibly bad at explaining exactly how they do it.
So what else can we do?
Let's turn to the one living proof
that human-level intelligence is actually possible.
Our own human minds.
I don't know what your experience is,
but in my experience, when we learn, we learn from experience.
When we apply that very idea to computer software,
we call it machine learning.
You can think of machine learning as a new way of programming a computer.
We don't feed it rules or instructions,
instead we let it learn from examples and experience.
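To make the idea concrete, here is a toy sketch (the task, data, and numbers are illustrative inventions, not from the talk): a tiny perceptron that is shown labelled examples and nudges its connection weights whenever it gets one wrong, with no hand-written rules about the task itself.

```python
# Learning from examples, not rules: a perceptron adjusts its
# weights whenever it misclassifies a training example.
# (Toy data, invented for illustration.)

def train_perceptron(examples, epochs=20, lr=0.1):
    """examples: list of ((x1, x2), label) with label in {-1, +1}."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), label in examples:
            prediction = 1 if w1 * x1 + w2 * x2 + b > 0 else -1
            if prediction != label:      # wrong? adjust the connections
                w1 += lr * label * x1
                w2 += lr * label * x2
                b  += lr * label
    return w1, w2, b

# Toy task: points well above the line x1 + x2 = 1 are labelled +1.
data = [((0, 0), -1), ((1, 0), -1), ((0, 1), -1),
        ((1, 1), +1), ((2, 1), +1), ((1, 2), +1)]
w1, w2, b = train_perceptron(data)
predict = lambda x1, x2: 1 if w1 * x1 + w2 * x2 + b > 0 else -1
```

Nowhere did we tell the program what separates the two classes; it recovered that from the examples alone.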
Hence for the remainder of the talk,
the protagonist will be an artificial learning agent
that interacts with the world in three different ways.
It observes the state of the world,
it takes action in the world,
and it receives rewards for achieving goals.
The learning agent's brain is an artificial neural network,
a computer architecture inspired by the human brain.
During training, it learns by adapting the connections
between its artificial brain cells
to take actions that lead to the greatest reward in the future.
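The three-way interaction, observe the state, take an action, receive a reward, can be sketched as a minimal loop. The `Corridor` environment below is a hypothetical stand-in invented for illustration: an agent walks a short corridor and is rewarded only on reaching the goal.

```python
# A minimal observe / act / reward loop (illustrative toy environment).

class Corridor:
    """A 1-D corridor; the goal sits at position 4."""
    def __init__(self):
        self.pos = 0
    def observe(self):
        return self.pos                       # the state of the world
    def step(self, action):                   # action: -1 (left) or +1 (right)
        self.pos = max(0, min(4, self.pos + action))
        reward = 1.0 if self.pos == 4 else 0.0
        return reward, self.pos == 4          # reward, episode done?

def run_episode(env, policy, max_steps=20):
    """One pass of the loop: observe, act, receive reward."""
    total = 0.0
    for _ in range(max_steps):
        state = env.observe()                 # observe
        action = policy(state)                # act
        reward, done = env.step(action)       # receive reward
        total += reward
        if done:
            break
    return total

always_right = lambda state: +1               # a trivial fixed policy
```

Training, then, means replacing the fixed policy with one whose parameters are adjusted, episode after episode, toward actions that earn more reward.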
So as an example of training,
let's take our best friend and learning agent, the dog.
And cat lovers will understand why we are not using a cat here.
What would you do if you wanted to train your dog
to sit on command?
First you lure it into a sitting position.
Then you utter the command, "sit",
and then you reward it with a treat.
As you do this again and again,
the dog's brain learns by adapting the connections between its neurons
to get the treat.
So artificial learning agents are trained in much the same way as this,
and that's what we call machine learning.
Science fiction, you might be saying?
Not at all.
The mobile phone responding to your voice commands?
Your photos automatically tagged by content?
Your messages translated into a different language online?
All of these systems have been trained by machine learning.
By feeding them examples, not rules.
But for true artificial intelligence
where will all the experience
and the rich interactions to learn come from?
Surprisingly, we can find inspiration
in Stefan Zweig's famous book "The Royal Game".
The book tells the story of Dr B,
an innocent man who has been arrested and is being held in solitary confinement.
Not unlike our learning agent,
Dr B is alone in a small world, and starved of stimulation.
"They did nothing to us,
other than subjecting us to complete nothingness".
For, as is well known,
nothing on earth puts more pressure on the human mind than nothing.
One day, while waiting for an interrogation,
Dr B manages to steal a book from one of his captors.
A book about the game of chess.
Eager to engage his mind,
Dr B devours the book and learns to play chess.
He replays the master games in the book
again and again.
But after a while, those games have lost their novelty.
Desperate for further diversion,
Dr B attempts to play chess against himself.
But he soon realizes
that in order to play chess against himself,
he needs to split his mind into two halves:
an "I" black and an "I" white.
Now, only now, with two agents in play,
can there be true interaction and learning.
Years later on a cruise ship,
Dr B meets the world chess champion at the time, Mirko Czentovic.
An expert at chess, and only at chess.
In a stunning demonstration of his skills,
Dr B manages to do the impossible:
he wins at chess against the World Chess Champion.
Fast forward 80 years, and Stefan Zweig's story becomes reality
in a way that not even the author could have imagined.
The modern Czentovic, Stockfish,
the computer chess champion of 2016.
A good, old-fashioned Artificial Intelligence
for playing chess, and only chess.
The modern Dr B, AlphaZero,
an artificial learning agent,
that learns to play chess solely by playing against itself.
And not just chess.
Also the difficult games of Shogi and Go.
In a match of 1,000 games,
AlphaZero wins 25 times as many games as Stockfish.
And according to former world chess champion Garry Kasparov,
it did so in great style indeed.
So how does AlphaZero work?
AlphaZero is a learning agent that, much like Dr B,
begins with zero knowledge about the game of chess,
except for the rules.
When it starts playing against itself,
it plays more or less randomly.
But at some point, it stumbles upon a victory,
and that's when the magic starts happening.
AlphaZero begins to learn.
It learns by adapting the connections between its artificial neurons
to make more successful moves more likely to be played.
As it becomes a better player,
it produces even better games from which to learn.
In only eight hours,
AlphaZero learns more about the game of chess
than had been learnt in the past 1,500 years.
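As a heavily simplified sketch of learning by self-play (this is not AlphaZero, which combines deep neural networks with tree search): two copies of one tabular agent play a toy take-away game against each other, and the moves made by the winning side are reinforced. The game, parameters, and update rule below are all illustrative assumptions.

```python
import random

# Self-play on a toy game: players alternately take 1 or 2 stones
# from a pile; whoever takes the last stone wins. Starting from no
# knowledge, winning moves are made more likely, losing moves less.

N = 10                                  # starting pile size
prefs = {(s, a): 1.0 for s in range(1, N + 1) for a in (1, 2) if a <= s}

def choose(pile, greedy=False, epsilon=0.2):
    actions = [a for a in (1, 2) if a <= pile]
    if not greedy and random.random() < epsilon:
        return random.choice(actions)   # explore: play more or less randomly
    return max(actions, key=lambda a: prefs[(pile, a)])

def self_play_game():
    pile, history, player = N, {0: [], 1: []}, 0
    while pile > 0:
        a = choose(pile)
        history[player].append((pile, a))
        pile -= a
        if pile == 0:
            winner = player             # took the last stone
        player = 1 - player
    return winner, history

random.seed(0)
for _ in range(5000):                   # the talk's agents play millions
    winner, history = self_play_game()
    for move in history[winner]:
        prefs[move] += 1.0              # make winning moves more likely
    for move in history[1 - winner]:
        prefs[move] = max(0.1, prefs[move] - 1.0)
```

After training, the agent has learnt, for instance, that with two stones left it should take both and win on the spot, knowledge it was never given, only earned through play.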
But can our learning agents go beyond pure competition,
and become team players?
After all, we humans thrive on cooperation.
In order to find out,
we trained our learning agents on another game.
Capture the flag.
In Capture the Flag, players team up
to capture the opponent team's flag, while protecting their own.
In the video game, players control the movements of their avatar,
and they can tag opponents.
Here's how to capture a flag.
On the left is the agent perspective,
and on the right is a top-down view.
So, run to the opponent's base
and pick up the flag.
Then bring it back to your own base,
but in order to score, you need to make sure
that your own flag is at your own base at the time.
Similar to AlphaZero, we trained the learning agents
by letting them play against themselves for millions of games.
But the environment here is much richer than chess:
it's a 3D environment.
The agents need to learn to cooperate as well as to compete.
And because our goal is generality, we train them on a diverse set of maps,
with different teammates, and against different opponents.
We were excited to discover that the learning agents learnt
some rather advanced behaviours.
For example, they learnt to defend their home base.
They learnt to set up camp in the opponent's base,
to wait until the flag there reappears, so that they can capture it.
And they learnt to follow their teammates, because then they can work together.
Eventually, the learning agents became stronger at the game
than strong human players.
Against a fixed pool of opponents, the learning agents won 74% of their games
as compared to strong human players, who only won 52% of their games.
But what surprised us the most was that the human players
preferred to play with the artificial learning agents on their team.
They said the agents were simply more skilled and reliable.
But enough about artificial learning agents.
What have we as human learning agents learnt so far?
Maybe that the pursuit of artificial intelligence
is a more human endeavour than we thought.
After all, we take inspiration from human intelligence,
when we say that the hallmark of intelligence is its generality.
We take inspiration from human learning,
when we say that intelligent agents need experience and not instructions.
And we take inspiration from human interaction,
when we emphasize that we need to train our learning agents together,
to overcome the nothingness that haunted Dr B in the story.
So to conclude,
using our own human intelligence,
maybe one day we can create true Artificial Intelligence,
a tool that not only plays games
but that can help us tackle some of the great problems of our times
with greater intelligence.
by: Thore Graepel | TEDxExeter