4: Reinforcement Learning Theory – Flashcards

question
Why don't we simply make viral DNA that encodes GCaMP under a dopamine specific promoter?
answer
-instead of encoding GCaMP directly in the viral DNA, scientists first introduce Cre -the reason is that GCaMP is constantly being modified (to be faster and brighter) -a promoter-specific GCaMP virus would have to be remade for every new version, whereas this way only one Cre virus has to be made
question
Why is cre-recombinase used so much?
answer
-the same Cre-dependent virus can be used in hundreds of different mouse lines (dopamine, serotonin...) -the same dopamine-Cre mice can be used to drive expression of any foreign protein (GFP, ChR2, GCaMP, OptoD1) (1) these tools enable immense combinatorial control: different neuronal populations across the brain can be targeted, and a Cre-dependent virus can be reused in any of these mouse lines to visualize and manipulate neuronal activity in any type of neuron (2) gene promoter fragments are often too long to fit inside most viruses Ex: the string of DNA needed for tyrosine hydroxylase is too long to put in an AAV, so transgenic mice have to be made by adding bacterial artificial chromosomes (BACs) to them
question
Paper 1 vs 2 on Nucleus Accumbens
answer
-Paper 1: interactions between oxytocin, serotonin and glutamate inputs to the nucleus accumbens promote social reward learning -Paper 2: dopamine input to the nucleus accumbens could promote or reinforce social interaction
question
2 things that the nucleus accumbens does
answer
(a) regulates motivational states (b) reinforces our actions and decisions
question
Animals and detecting patterns
answer
a) stimulus-stimulus learning: learning associations between different stimuli across time and space -these do NOT rely on dopamine reinforcement signals -this doesn't require movement, just watching b) procedural learning: learning through trial and error how to move our muscles to elicit specific actions -this requires that you move and do things
question
Stimulus-Stimulus Learning
answer
-you simply observe the world around you and try to discern patterns in incoming information -the brain has several recording devices (video, audio, taste, touch, smell...) -long-term memory records how stimuli in the world relate to you or to each other Ex: the bell rings at these times, that person likes to ruin my day
question
How could stimulus-stimulus learning be used to control movements?
answer
-initially our motor cortex randomly generates some neural activity and then we observe with our senses the consequences of that neural activity -after sampling all possible variations in the motor cortex, babies could know exactly what neural activity is needed to elicit a specific movement >>this is stimulus-stimulus learning because we created a stimulus (neural activity) and we observed a stimulus (specific movement) Ex: these neurons fire so this arm moves
question
How is motor learning in reality?
answer
-we probably don't know which neural pattern produced which movement Ex: if you ask someone to shoot the same goal again, they probably can't do it -we can't repeat our actions with great fidelity -the best the brain can do is tell us whether the action was good or bad
question
Procedural learning: movement vs reinforcement
answer
-it works through reinforcement mechanisms -movement: numerous parts of the brain may be trying to influence behavior, but only those that get through a filter will result in movement >>so when there's movement, some neural activity was successful in getting through the filter -reinforcement: the brain assesses whether things are getting better or worse for the animal (is the filter doing a good or bad job?) >>this will strengthen or weaken recent neural activity so that it is more or less likely to happen in the future
question
How is movement-related neural activity initially generated?
answer
-initially it's random (when you're a baby) -the neural circuits that generated successful movement are made stronger and more likely to be activated in the future, specifically when that context re-arises Ex: you're itchy, and scratching soothes the itch. Next time you're itchy you will make a claw and scratch
question
Movement and goal
answer
-an animal envisions what movement they want to make (pick up a cup) and it's reinforced when they actually pick up the cup or whatever gets them closer to the goal >>do NOT need a desire or goal in mind for reinforcement learning to improve your life >>if any random neural activity seems to improve your life it will be reinforced
question
How does a reinforcement learning signal increase the likelihood that the brain will regenerate same neural activity?
answer
-it's a prediction error signal! -the reinforcement signal alerts the brain when things turn out better or worse than expected
question
Equation for "prediction error signal"
answer
prediction error signal = actual value - expected value of the current situation -a prediction error indicates that there was just an UNEXPECTED change in your current or predicted success -this information is broadcast widely throughout the brain as credit or blame
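A minimal sketch of the equation above. The function name `prediction_error` and the values 0.0 and 1.0 are illustrative assumptions, not part of the flashcard.

```python
def prediction_error(actual: float, expected: float) -> float:
    """Actual minus expected value of the current situation.

    Positive: better than expected. Negative: worse than expected.
    Zero: exactly as expected, so nothing to learn.
    """
    return actual - expected


# Illustrative values only (reward present = 1.0, absent = 0.0).
print(prediction_error(actual=1.0, expected=0.0))  # unexpected reward
print(prediction_error(actual=1.0, expected=1.0))  # fully expected reward
print(prediction_error(actual=0.0, expected=1.0))  # expected reward omitted
```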
question
Where in the brain do actions get reinforced?
answer
-all motor commands get sent to the striatum which is the input nucleus of the basal ganglia (1) this information gets encoded in excitatory glutamatergic inputs from the cortex (2) second important input to the striatum is dopamine -it's believed that dopamine inputs are the reinforcement signal and that glutamate inputs are what gets reinforced (i.e. motor commands)
question
Nucleus Accumbens and Striatum
answer
-NAc is a subregion of the striatum
question
Dopamine Projections
answer
-dopamine neurons project from the VTA to the NAc, striatum, amygdala, PFC, septum, hippocampus... -dopamine is a neuromodulator; there are not that many dopamine neurons, but each sends out large projections
question
Clusters of dopamine neurons
answer
-in general they are homogeneous and fire at the same time -dopamine neurons are unmyelinated so they can't fire very fast (20 times per second) >>this isn't a lot but it broadcasts to the entire brain -firing range: 0-40 Hz
question
Dopamine and extracellular space
answer
-it's cleared from the extracellular space 100 times slower than glutamate or GABA
question
What does this say about dopamine?
answer
-it contains little information but what it says is important
question
Dopamine as a Prediction Error Signal
answer
-dopamine release occurs every time your estimation of the current moment is BETTER than anticipated but withheld every time it's WORSE
question
Ex: when you first learn to move your arm and pick up a cup
answer
-you have low expectations of yourself when you first learn to pick up a cup -any movement or neural activity that gets you closer to your goal will increase dopamine activity -dopamine signaling will strengthen recently active glutamatergic synapses (so the motor commands they encode become more likely to win)
question
What happens when your expectations grow?
answer
-the dopamine system becomes more and more selective about when it fires
question
Three Factor Rule
answer
-abrupt changes in dopamine levels will strengthen or weaken recently active glutamatergic synapses in the striatum -the strength of glutamatergic inputs in the striatum changes when the synapses experience: (a) presynaptic activity (b) postsynaptic activity (c) abrupt changes in local dopamine receptor activity -three factor rule (modified Hebbian learning): neurons that fire together become eligible for dopamine-induced synaptic plasticity Ex: if the dopamine level drops then the synapse gets weaker, but if it goes up then the synapse gets stronger
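The three factor rule can be sketched as a weight update that happens only when all three factors are present. This is a toy illustration: the function name `three_factor_update`, the learning rate of 0.1, and the linear form of the update are assumptions, not something stated in the flashcard.

```python
def three_factor_update(weight, pre_active, post_active, dopamine_change,
                        learning_rate=0.1):
    """Toy three-factor rule for one glutamatergic synapse.

    Factors (a) and (b): pre- and postsynaptic activity make the synapse
    eligible. Factor (c): an abrupt change in dopamine sets the direction
    (up = strengthen, down = weaken). learning_rate is an assumed value.
    """
    eligible = pre_active and post_active
    if not eligible:
        return weight  # no coincident activity, so dopamine has no effect
    return weight + learning_rate * dopamine_change


w = 0.5
print(three_factor_update(w, True, True, +1.0))   # dopamine burst: stronger
print(three_factor_update(w, True, True, -1.0))   # dopamine dip: weaker
print(three_factor_update(w, True, False, +1.0))  # not eligible: unchanged
```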
question
Electrophysiological recordings: monkey predicts food
answer
-the monkey receives different visual stimuli that range from good to bad predictors of food delivery (they learned to distinguish these cues over a year) -the amount of dopamine firing depends on how well the animal predicts food delivery >>>1) unexpected reward = greatest dopamine firing (at reward delivery) >>>2) firing at reward delivery: 25% chance of getting food > 50% > 75% >>>3) when the reward is guaranteed, firing doesn't increase at reward delivery
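The ordering above follows from the prediction error idea if the cue's expected value is taken to be the reward probability times the reward size. A rough sketch under that assumption (the function name `reward_time_error` and the reward size of 1.0 are illustrative):

```python
def reward_time_error(reward_probability, reward_size=1.0):
    """Prediction error at the moment a reward actually arrives.

    Assumes the animal's expectation equals probability * size, so a
    rarer reward is a bigger surprise when it does come.
    """
    expected = reward_probability * reward_size
    return reward_size - expected


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"P(food) = {p:.2f} -> error at reward = {reward_time_error(p):.2f}")
```

In this sketch the error shrinks as the cue becomes a better predictor and is zero when the reward is guaranteed.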
question
Raster plot graph
answer
-each row is one second long -each black DOT marks when one dopamine neuron fires an action potential -there's a summary histogram (bars) on top of the raster plot (dots)
question
Dopamine and knowledge creation
answer
-dopamine is used in thought and decision but not in knowledge creation >>anything that originates within you is reinforced by dopamine
question
Dopamine and meaningless stimuli
answer
-dopamine neurons do not respond to meaningless stimuli
question
Reward firing
answer
-dopamine neurons fire to the INFORMATION about a reward, not the actual reward itself Ex: if you see a star and then get sugar, dopamine neurons will fire to the star
question
Negative Prediction Error
answer
-withheld firing of dopamine neurons -when an expected reward is not received, dopamine neurons abruptly stop firing >>a negative prediction error says that the situation is worse than anticipated
question
When does dopamine firing increase happen for expected vs unexpected reward?
answer
-for an unexpected reward, firing is greatest when the reward is actually received -for an expected reward, firing is greatest at the predictive stimulus and not at the actual reward
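One common way to reproduce this shift from reward to predictor is temporal-difference (TD) learning. TD learning is not named in the flashcards, so treat the following as an assumed model. The function name, the three time steps, and the learning rate are illustrative; the cue is assumed to arrive unpredictably, so the expectation before the cue stays at 0.

```python
def simulate_cue_reward(n_trials=200, n_steps=3, alpha=0.2):
    """TD(0) sketch of a cue followed n_steps later by a reward of 1.0.

    Returns one (error_at_cue, error_at_reward) pair per trial.
    """
    values = [0.0] * n_steps  # learned value of each time step after cue onset
    history = []
    for _ in range(n_trials):
        # Cue onset is unpredicted: error = value of cue state - baseline (0).
        cue_error = values[0] - 0.0
        reward_error = 0.0
        for t in range(n_steps):
            reward = 1.0 if t == n_steps - 1 else 0.0
            next_value = values[t + 1] if t + 1 < n_steps else 0.0
            delta = reward + next_value - values[t]  # prediction error
            values[t] += alpha * delta
            if t == n_steps - 1:
                reward_error = delta
        history.append((cue_error, reward_error))
    return history


history = simulate_cue_reward()
print("first trial (cue, reward):", history[0])
print("last trial  (cue, reward):", tuple(round(x, 3) for x in history[-1]))
```

Early in training the error sits at the reward; after training it has moved to the cue and the reward itself produces almost none.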
question
Phenomenon of blocking
answer
-when a new stimulus ($) is presented together with a stimulus (%) that has already been learned to predict the reward, learning about the new stimulus ($) is blocked (doesn't occur) -this is a classical conditioning phenomenon
question
2 phases of classical conditioning
answer
Phase 1: conditioned stimulus (%) is paired with the unconditioned stimulus (reward) Phase 2: compound stimulus (% + $) is paired with the same reward Test: each conditioned stimulus is tested separately. There is no response to the second cue ($) even though it could be used to predict reward
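Blocking falls out of any learning rule driven by a shared prediction error. The sketch below uses the Rescorla-Wagner rule, which the flashcards do not name, so it is an assumed model. The helper `train`, the 50 trials per phase, and the learning rate of 0.3 are illustrative choices.

```python
def train(strengths, cues, n_trials, reward=1.0, alpha=0.3):
    """Rescorla-Wagner style update: all cues present share one error."""
    for _ in range(n_trials):
        error = reward - sum(strengths[c] for c in cues)
        for c in cues:
            strengths[c] += alpha * error
    return strengths


# Blocking group: Phase 1 (% alone), then Phase 2 (% + $ together).
blocked = {"%": 0.0, "$": 0.0}
train(blocked, ["%"], n_trials=50)
train(blocked, ["%", "$"], n_trials=50)

# Control group: compound training only, no Phase 1.
control = {"%": 0.0, "$": 0.0}
train(control, ["%", "$"], n_trials=50)

print("blocked:", {k: round(v, 3) for k, v in blocked.items()})
print("control:", {k: round(v, 3) for k, v in control.items()})
```

Because % already predicts the reward after Phase 1, the error in Phase 2 is near zero and $ gains almost no strength. In the control group, $ does acquire strength.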
question
Summary: Reinforcement Learning and Dopamine
answer
-we think dopamine encodes a reinforcement learning signal that alerts the brain when the current reward or the expectation of reward is better or worse than anticipated (prediction error signal) -this dopamine signal is broadcast widely throughout the brain
question
Pleasure Signal
answer
-dopamine has been called the pleasure signal for many years, but dopamine does not reliably correlate with pleasure -pleasure>>might be what you feel when you consume rewards or partake in a rewarding experience (situation-dependent: when you eat food, laugh, have sex) Ex: if you get food when you're hungry it's pleasurable -unexpected pleasurable events always increase phasic dopamine signals -dopamine might not increase at the reward itself but at the earliest PREDICTOR of reward! >>dopamine is a teaching signal that notifies the brain as soon as you anticipate value changes
question
2 distinct functions of dopamine
answer
(a) phasic dopamine signal: regulates reinforcement learning by encoding a feedback signal (prediction error signal) -phasic means quick bursts or pauses in activity (b) tonic baseline dopamine levels: regulate motivational state -tonic means constant, slow and steady -this baseline firing is at 4 Hz (4 times per second)
question
What does baseline firing correspond to?
answer
-motivation, effort and vigor of actions
question
How much dopamine is in your extracellular space depends on 3 things:
answer
(a) speed of the tonic activity (b) number of dopamine neurons that are participating in firing at all (c) amount of phasic activity -these can all cause large changes in resting (tonic) extracellular dopamine levels
question
Tonic Dopamine Signaling + Gain setting
answer
-these concentrations influence the "gain" setting in the system -high background levels of dopamine in striatum (extracellularly) lead to high gain setting -low extracellular dopamine levels lead to low gain setting >>this gain setting determines how excitable neurons are within the striatum and how responsive they are to glutamate inputs >>there's no learning in these tonic states just motivation
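A very rough way to picture the gain setting is as a multiplier on glutamate input. The linear form, the function name `striatal_response`, and the numbers below are all assumptions made for illustration; the flashcard only says that higher tonic dopamine means higher gain.

```python
def striatal_response(glutamate_input, tonic_dopamine, base_gain=1.0):
    """Toy model: tonic dopamine scales responsiveness to glutamate input.

    Assumes gain grows linearly with tonic dopamine (a hypothetical
    simplification). No weights change here, so nothing is learned.
    """
    gain = base_gain * tonic_dopamine
    return gain * glutamate_input


same_input = 2.0
print(striatal_response(same_input, tonic_dopamine=0.5))  # low gain
print(striatal_response(same_input, tonic_dopamine=2.0))  # high gain
```

The same glutamate input produces a larger response when tonic dopamine is high.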
question
What happens if you artificially increase dopamine receptor activity in animals
answer
-they move more and become hyperactive -are more engaged in their environment -seem more motivated -are more willing to take risks and do hard things to get rewards
question
What happens if you artificially decrease dopamine receptor activity in animals
answer
-they move less and appear lazy -are less motivated and less interested in pursuing rewards -are less engaged in their environment (ignore predictive stimuli) -will continue to press the lever for food if it's easy to do so, but not if it requires any effort, even for a bigger or better reward
question
What happens if you lose ALL dopamine receptor activity
answer
-Parkinson's -you can't initiate purposeful movement; you become "locked in" >>>it's not that you lose the ability to move, just the motivation to do so (paradoxical) Ex: if a ball is thrown at your face you can still catch it or move away (reactive movement is preserved) >>patients need to be tricked into moving; they won't do it volitionally