Year of publication:
2020
English title of the article:
Dynamic resource allocation during reinforcement learning accounts for ramping and phasic dopamine activity
Persian translation of the article title:
تخصیص منابع پویا در طول حساب های یادگیری تقویتی برای فعالیت دوپامین ramping و مرحله ای
Source:
ScienceDirect - Elsevier - Neural Networks, 126 (2020) 95-107. doi:10.1016/j.neunet.2020.03.005
Authors:
Minryung R. Song, Sang Wan Lee
English abstract:
For an animal to learn about its environment with limited motor and cognitive resources, it should
focus its resources on potentially important stimuli. However, too narrow a focus is disadvantageous
for adaptation to environmental changes. Midbrain dopamine neurons are excited by potentially
important stimuli, such as reward-predicting or novel stimuli, and allocate resources to these stimuli
by modulating how an animal approaches, exploits, explores, and attends. The current study examined
the theoretical possibility that dopamine activity reflects the dynamic allocation of resources for
learning. Dopamine activity may transition between two patterns: (1) phasic responses to cues and
rewards, and (2) ramping activity arising as the agent approaches the reward. Phasic excitation has
been explained by prediction errors generated by experimentally inserted cues. However, when and
why dopamine activity transitions between the two patterns remain unknown. By parsimoniously
modifying a standard temporal difference (TD) learning model to accommodate a mixed presentation of
both experimental and environmental stimuli, we simulated dopamine transitions and compared them
with experimental data from four different studies. The results suggested that dopamine transitions
from ramping to phasic patterns as the agent focuses its resources on a small number of reward-predicting
stimuli, thus leading to task dimensionality reduction. The opposite occurs when the
agent re-distributes its resources to adapt to environmental changes, resulting in task dimensionality
expansion. This research elucidates the role of dopamine in a broader context, providing a potential
explanation for the diverse repertoire of dopamine activity that cannot be explained solely by
prediction error.
Keywords: Prediction error | Salience | Temporal-difference learning model | Pearce-Hall model | Habit | Striatum