Computational Neuroscience Journal Club

Changmin Yu ( Gatsby Computational Neuroscience Unit, UCL, London, UK)
Wednesday 14 February 2024, 14:00-16:00
CBL Seminar Room, Engineering Department, 4th floor Baker building.

If you have a question about this talk, please contact Puria Radmard.

Please join us for our Computational Neuroscience journal club on Wednesday 14th February at 2pm UK time in the CBL seminar room, or online via Zoom.

The title is ‘Distributional Reinforcement Learning’, presented by Changmin Yu and Puria Radmard.

Summary:

In traditional reinforcement learning algorithms such as temporal difference (TD) learning, the value function maps states to the expected total future return. Distributional reinforcement learning extends this to capture the multiplicity of possible rewards by mapping states to full distributions over returns. In this session, Changmin and Puria will start with an introduction to both traditional [1] and distributional [2, 3] reinforcement learning. Dabney et al., 2020 [4], show the distributional nature of value representation in VTA dopaminergic neurons, and the simple changes to classical TD learning that can bring about distributional value representations. This finding of distributional value coding in midbrain dopaminergic neurons suggests that these neurons follow a distributional rather than a classical expectation-based reinforcement learning regime. Prefrontal cortex (PFC) neurons have been shown to be significantly involved in decision-making and reward-guided learning, and are anatomically connected with dopaminergic neurons. Muller et al., 2024 [5], present new analyses of existing data from primate prefrontal neurons in decision-making tasks, showing that, similar to what was found in rodent dopamine neurons [4], PFC neurons exhibit highly diverse profiles in optimism with respect to value coding, and in asymmetric scaling of positive versus negative reward prediction errors (RPEs). Moreover, in a task with dynamic reward structure, the authors show diversity in the rate of learning associated with positive and negative RPEs, hinting at the computational nature of distributional RL in the PFC for decision-making.
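As a rough illustration of the "simple changes to classical TD learning" mentioned in the summary, the sketch below scales positive and negative prediction errors by different learning rates across a small population of value units. It is a toy, single-state setting with no discounting or bootstrapping, not the model from any of the cited papers; the names `learn_values`, `taus` and `base_lr` are hypothetical and chosen only for this example. A unit with symmetric scaling (`tau = 0.5`) behaves like a classical learner and tracks the mean reward, while asymmetric units settle on more pessimistic or more optimistic estimates.

```python
import random


def learn_values(rewards, taus, base_lr=0.02):
    """Learn one value estimate per unit from a stream of rewards.

    Each unit i scales positive prediction errors by taus[i] and
    negative ones by (1 - taus[i]). All names here are illustrative.
    """
    values = [0.0] * len(taus)
    for r in rewards:
        for i, tau in enumerate(taus):
            delta = r - values[i]  # reward prediction error (single state)
            # Asymmetric scaling: optimistic units (tau > 0.5) learn more
            # from positive errors, pessimistic units from negative ones.
            lr = base_lr * (tau if delta > 0 else 1.0 - tau)
            values[i] += lr * delta
    return values


# Assumed toy reward distribution: 0 or 10 with equal probability.
rng = random.Random(0)
rewards = [rng.choice([0.0, 10.0]) for _ in range(20000)]

taus = [0.1, 0.5, 0.9]  # pessimistic, symmetric (classical), optimistic
values = learn_values(rewards, taus)
for tau, v in zip(taus, values):
    print(f"tau={tau:.1f}  learned value={v:.2f}")
```

In this toy setting the spread of learned values across units carries information about the shape of the reward distribution, which a single expectation-based estimate would discard.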

[1] Dayan, P. and Abbott, L.F. (2001) Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. The MIT Press, Cambridge.
[2] Bellemare, Marc G., Will Dabney, and Rémi Munos. “A distributional perspective on reinforcement learning.” In International conference on machine learning, pp. 449-458. PMLR, 2017.
[3] Dabney, Will, Mark Rowland, Marc Bellemare, and Rémi Munos. “Distributional reinforcement learning with quantile regression.” In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1. 2018.
[4] Dabney, W., Kurth-Nelson, Z., Uchida, N. et al. A distributional code for value in dopamine-based reinforcement learning. Nature 577, 671–675 (2020).
[5] Muller, T.H., Butler, J.L., Veselic, S. et al. Distributional reinforcement learning in prefrontal cortex. Nat Neurosci (2024).

This talk is part of the Computational Neuroscience series.