Computational Neuroscience Journal Club

Guillaume Hennequin and Kris Jensen
Tuesday 12 October 2021, 14:00-15:30
Online on Zoom.

If you have a question about this talk, please contact Jake Stroud.

Please join us for our fortnightly journal club, held online via Zoom, where two presenters jointly present a topic. The next topic is ‘Policy-gradient reinforcement learning’, presented by Guillaume Hennequin and Kris Jensen.

Zoom information: https://us02web.zoom.us/j/84958321096?pwd=dFpsYnpJYWVNeHlJbEFKbW1OTzFiQT09 Meeting ID: 841 9788 6178 Passcode: 659046

Summary: Humans and animals continually learn from interacting with their environment, a paradigm commonly known as reinforcement learning. In the neuroscience literature, this is often framed in terms of Q-learning or temporal-difference learning, where decisions are made on the basis of the learned values of every state and action. In this journal club we focus on an alternative approach to reinforcement learning, in which a policy is instead learned by direct optimization of the expected future reward. We start with an introduction to such ‘policy gradient’ reinforcement learning by deriving the canonical ‘REINFORCE’ algorithm and giving an overview of techniques used to reduce variance and stabilize learning. We then discuss how such policy gradient methods could potentially be implemented in biological circuits using well-known synaptic plasticity rules. Finally, we consider a case study of how policy gradient methods can be used to model biological agents and provide insights into the structure and function of neural circuits.
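For readers who would like a concrete picture before the session, the following is a minimal sketch of a REINFORCE-style update, and is not taken from the talk itself. It assumes a deliberately simple setting: a multi-armed bandit with Gaussian reward noise, a softmax policy over arms, and a running-average baseline as one common variance-reduction technique. The function name `reinforce_bandit` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def reinforce_bandit(mean_rewards, n_steps=5000, lr=0.05,
                     baseline_decay=0.99, seed=0):
    """REINFORCE with a running-average baseline on a toy bandit.

    Assumed setup (illustrative only): pulling arm `a` yields a reward
    drawn from a unit-variance Gaussian centred on mean_rewards[a].
    Returns the learned action probabilities.
    """
    rng = np.random.default_rng(seed)
    mean_rewards = np.asarray(mean_rewards, dtype=float)
    theta = np.zeros(len(mean_rewards))  # policy logits, one per arm
    baseline = 0.0                       # running estimate of average reward

    for _ in range(n_steps):
        pi = softmax(theta)
        a = rng.choice(len(pi), p=pi)            # sample an action
        r = mean_rewards[a] + rng.normal()       # noisy reward

        # For a softmax policy, d log pi(a) / d theta = onehot(a) - pi.
        grad_log_pi = -pi
        grad_log_pi[a] += 1.0

        # Score-function update: step along (r - baseline) * grad log pi(a).
        theta += lr * (r - baseline) * grad_log_pi

        # Update the baseline after it has been used, so it does not
        # depend on the current reward.
        baseline = baseline_decay * baseline + (1.0 - baseline_decay) * r

    return softmax(theta)


if __name__ == "__main__":
    print(reinforce_bandit([0.0, 0.5, 2.0]))
```

Calling `reinforce_bandit([0.0, 0.5, 2.0])` returns the learned action probabilities; with enough steps, most of the probability mass should shift toward the arm with the highest mean reward. Subtracting the baseline leaves the expected gradient unchanged while typically reducing its variance, which is the motivation for the variance-reduction techniques mentioned in the summary.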

Relevant reading:

Levine (2021). Berkeley CS 285 Lecture 5 notes (introduction to policy gradient methods and variance reduction). http://rail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-5.pdf.

Fremaux et al. (2010). “Functional Requirements for Reward-Modulated Spike-Timing-Dependent Plasticity.” https://www.jneurosci.org/content/30/40/13326.

Wang & Kurth-Nelson et al. (2018). “Prefrontal cortex as a meta-reinforcement learning system.” https://www.nature.com/articles/s41593-018-0147-8.

This talk is part of the Computational Neuroscience series.
