
Computational Neuroscience Journal Club


If you have a question about this talk, please contact Puria Radmard.

Please join us for our Computational Neuroscience journal club on Wednesday 14th February at 2pm UK time in the CBL seminar room, or online via Zoom.

The title is ‘Distributional Reinforcement Learning’, presented by Changmin Yu and Puria Radmard.

Summary:

In traditional reinforcement learning algorithms such as temporal difference (TD) learning, the value function maps each state to the expected total future return. Distributional reinforcement learning extends this by mapping states to full distributions over returns, rather than only their expectation. In this session, Changmin and Puria will start with an introduction to both traditional [1] and distributional [2, 3] reinforcement learning. Dabney et al. (2020) [4] show that value representations in VTA dopaminergic neurons are distributional in nature, and that simple changes to classical TD learning can give rise to such distributional value representations. These findings suggest that midbrain dopaminergic neurons follow a distributional, rather than a classical expectation-based, reinforcement learning regime. Prefrontal cortex (PFC) neurons are strongly implicated in decision-making and reward-guided learning, and are anatomically connected with dopaminergic neurons. Muller et al. (2024) [5] present new analyses of existing recordings of primate prefrontal neurons during decision-making tasks, showing that, much as in rodent dopamine neurons [4], PFC neurons exhibit highly diverse degrees of optimism in their value coding and asymmetric scaling of positive versus negative reward prediction errors (RPEs). Moreover, in a task with a dynamic reward structure, the authors show diversity in the learning rates associated with positive and negative RPEs, hinting at a computational role for distributional RL in PFC during decision-making.
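For readers unfamiliar with the asymmetric update rule described in [4] and [5], the following is a minimal sketch (not taken from the talk or the papers): a population of value channels is updated with different learning rates for positive and negative RPEs, so each channel converges to a different expectile of the reward distribution instead of its mean. The function name `distributional_td_update`, its parameters, and the single-step (bandit-like) task are illustrative assumptions.

```python
import numpy as np

def distributional_td_update(values, reward, alpha_plus, alpha_minus):
    """One asymmetric TD update for a single-step task (illustrative sketch).

    values      : per-channel value predictions V_i
    reward      : scalar reward sample r
    alpha_plus  : learning rates applied when the RPE is positive
    alpha_minus : learning rates applied when the RPE is negative

    Each channel i computes its own RPE delta_i = r - V_i and scales it
    asymmetrically, so channels with larger alpha_plus converge to higher
    expectiles of the reward distribution ("optimistic" channels).
    """
    delta = reward - values
    step = np.where(delta > 0, alpha_plus * delta, alpha_minus * delta)
    return values + step

# Example: rewards drawn from a 50/50 bimodal distribution over {0, 10}.
rng = np.random.default_rng(0)
n_channels = 5
values = np.zeros(n_channels)
alpha_plus = np.linspace(0.02, 0.10, n_channels)   # increasingly "optimistic" channels
alpha_minus = alpha_plus[::-1]

for _ in range(20000):
    r = rng.choice([0.0, 10.0])
    values = distributional_td_update(values, r, alpha_plus, alpha_minus)

print(values)  # channels spread across the reward distribution rather than all sitting near the mean of 5
```

With all learning rates equal, every channel would converge to the expected return; the diversity of asymmetries is what yields a distributional code.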

[1] Dayan, P. and Abbott, L.F. (2001). Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press, Cambridge, MA.

[2] Bellemare, M.G., Dabney, W. and Munos, R. (2017). A distributional perspective on reinforcement learning. In International Conference on Machine Learning, pp. 449–458. PMLR.

[3] Dabney, W., Rowland, M., Bellemare, M. and Munos, R. (2018). Distributional reinforcement learning with quantile regression. In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1.

[4] Dabney, W., Kurth-Nelson, Z., Uchida, N. et al. (2020). A distributional code for value in dopamine-based reinforcement learning. Nature 577, 671–675.

[5] Muller, T.H., Butler, J.L., Veselic, S. et al. (2024). Distributional reinforcement learning in prefrontal cortex. Nature Neuroscience.

This talk is part of the Computational Neuroscience series.
