The €1 million Brain Prize, awarded by the Lundbeck Foundation in Denmark,
has gone to three neuroscientists for their work on understanding the mechanisms
of reward in the brain. The winners are:
o Peter Dayan – Director of the Gatsby Computational Neuroscience Unit, University College London
o Ray Dolan – Director of the Max Planck UCL Centre for Computational Psychiatry and Ageing Research
o Wolfram Schultz – Professor of Neuroscience and Wellcome Trust Principal Research Fellow at the University of Cambridge
Collectively, their work examines the ability of humans and animals to link rewards to events and actions. This capacity has been a foundation of our survival, but it can also be at the root of many neurological and psychiatric disorders, such as addiction, compulsive behaviour and schizophrenia. For a species to survive and reproduce, an animal must be able to make decisions that avoid danger and bring benefits (such as food and shelter). This decision-making requires predicting outcomes from environmental cues and previously learned responses. For instance, certain smells may indicate that an animal should prepare to chase prey, or avoid a particular fruit. The brain plays a key role in this decision-making and learning, and at the centre of it is the neurotransmitter dopamine.
In the 1980s, Professor Wolfram Schultz developed a way of recording the activity of neurons
in the brain that use dopamine to transmit information. He found that the
dopamine neurons would respond whenever a monkey was given a fruit juice reward.
Schultz then showed the animals different visual patterns; whenever a certain
pattern was shown, the monkey would receive a reward. After a time the dopamine
neurons began to respond to the visual pattern, rather than the juice reward
(response to the juice reward itself declined over time). Conversely, when no
reward was given (after the correct pattern was shown), the dopamine neuron
activity decreased below normal levels. If the reward was given at another time
or was bigger than expected, the dopamine neuron activity would spike
(1). This was the first clear demonstration of the neurological basis of
a cornerstone of learning theory in comparative and behavioural psychology: Pavlovian conditioning (2).
Building on Schultz’s work, Peter Dayan found that the
pattern of activity from dopamine neurons described by Schultz resembled a
‘reward prediction error’. This signal is the difference between
predicted and actual reward resulting from an action or event. It continuously
updates according to the result of new events and outcomes. Dayan would go on
to work with Schultz to create computational models investigating how the brain
uses information to make predictions and how this information is updated when
new or contrasting information is presented.
Schultz explains the reward prediction error and
resulting learning in the following analogy:
I am standing in front of a drink-dispensing machine in Japan
that seems to allow me to buy six different types of drinks, but I cannot read
the words. I have a low expectation that pressing a particular button will
deliver my preferred blackcurrant juice (a chance of one in six). So I just
press the second button from the right, and then a blue can appears with a
familiar logo that happens to be exactly the drink I want. That is a pleasant
surprise, better than expected. What would I do the next time I want the same
blackcurrant juice from the machine? Of course, press the second button from
the right. Thus, my surprise directs my behavior to a specific button. I have
learned something, and I will keep pressing the same button as long as the same
can comes out. However, a couple of weeks later, I press that same button
again, but another, less preferred can appears. Unpleasant surprise, somebody
must have filled the dispenser differently. Where is my preferred can? I press
another couple of buttons until my blue can comes out. And of course I will
press that button again the next time I want that blackcurrant juice, and
hopefully all will go well.
What happened?
The first button press delivered my preferred can. This pleasant
surprise is what we call a positive reward prediction error. “Error” refers to
the difference between the can that came out and the low expectation of getting
exactly that one, irrespective of whether I made an error or something else
went wrong. “Reward” is any object or stimulus that I like and of which I want
more. “Reward prediction error” then means the difference between the reward I
get and the reward that was predicted. Numerically, the prediction error on my
first press was 1 minus 1/6, the difference between what I got and what I
reasonably expected. Once I get the same can again and again for the same
button press, I get no more surprises; there is no prediction error, I don’t
change my behavior, and thus I learn nothing more about these buttons. But what
about the wrong can coming out 2 weeks later? I had the firm expectation of my
preferred blackcurrant juice but, unpleasant surprise, the can that came out
was not the one I preferred. I experienced a negative prediction error, the
difference between the nonpreferred, lower valued can and the expected
preferred can. At the end of the exercise, I have learned where to get my
preferred blackcurrant juice, and the prediction errors helped me to learn
where to find it.
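The arithmetic in this analogy can be reproduced with a few lines of Python. This is only an illustrative sketch, not code from Schultz's work: it assumes a simple delta-rule update in which the expectation moves a fixed fraction (a hypothetical `learning_rate` of 0.5) of the way towards each outcome, with the preferred can scored as 1 and any other can as 0.

```python
def prediction_error(reward, expected):
    """Reward prediction error: what was received minus what was predicted."""
    return reward - expected


def update(expected, reward, learning_rate=0.5):
    """Move the expectation a fraction of the way towards the received reward.

    The learning rate of 0.5 is an arbitrary choice for illustration.
    """
    return expected + learning_rate * prediction_error(reward, expected)


# Expectation that the chosen button delivers the preferred can.
# Six unreadable buttons, so the starting expectation is 1/6.
expected = 1 / 6

# 1.0 = the preferred can came out, 0.0 = a different can came out.
# Five good presses, then the machine has been refilled differently.
outcomes = [1.0, 1.0, 1.0, 1.0, 1.0, 0.0]

errors = []
for reward in outcomes:
    errors.append(prediction_error(reward, expected))
    expected = update(expected, reward)

for press, error in enumerate(errors, start=1):
    print(f"press {press}: prediction error = {error:+.3f}")
```

Under these assumptions the first press gives a positive error of 1 minus 1/6, the error shrinks towards zero while the same can keeps coming out, and the refilled machine produces a large negative error on the last press.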
Professor Ray Dolan's work has involved
imaging the human brain in order to understand the mechanisms for learning and
decision-making. Advancing the work of Schultz and Dayan, he showed that the
reward prediction error can account for how humans learn, and the role that
dopamine plays within it. He has collaborated with Dayan for the past decade to
investigate human motivation, variations in happiness, and human gambling
behaviour.
Schultz continues to study both animals and
humans, using neuroimaging to study changes in neuron signals in Parkinson’s
patients, smokers and drug addicts. The better we understand the processes that
lead people to take certain actions, the better positioned we are to
intervene.
Professor Sir Colin Blakemore (University of
London), chairman of the Brain Prize selection committee, said:
“The judges concluded that the discoveries made by Wolfram
Schultz, Peter Dayan and Ray Dolan were crucial for understanding how the brain
detects reward and uses this information to guide behaviour. This work is a
wonderful example of the creative power of interdisciplinary research, bringing
together computational explanations of the role of activity in the monkey brain
with advanced brain imaging in human beings to illuminate the way in which we
use reward to regulate our choices and actions. The implications of these
discoveries are extremely wide-ranging, in fields as diverse as economics,
social science, drug addiction and psychiatry.”
Primate research remains an invaluable tool for comparative research into human health and disease. While other animals remain useful as models, non-human primates are arguably the best species for such investigations because of their remarkable similarity to humans. The research performed by Schultz, and built upon by Dayan and Dolan, highlights this simple fact and perhaps also exemplifies why arguments against the use of non-human primates in research need critical consideration. The Brain Prize also shows how animal and non-animal methods are often used together to build our understanding of how the brain works.
Originally published on SPEAKING OF RESEARCH