Computational Neuroscience Journal Club

  • Speakers: Guillaume Hennequin and Kris Jensen
  • Time: Tuesday 12 October 2021, 14:00-15:30
  • Venue: Online on Zoom.

If you have a question about this talk, please contact Jake Stroud.

Please join us for our fortnightly journal club, held online via Zoom, in which two presenters jointly present a topic. The next topic is ‘Policy-gradient reinforcement learning’, presented by Guillaume Hennequin and Kris Jensen.

Zoom information: https://us02web.zoom.us/j/84958321096?pwd=dFpsYnpJYWVNeHlJbEFKbW1OTzFiQT09
Meeting ID: 841 9788 6178
Passcode: 659046

Summary: Humans and animals continually learn from interacting with their environment, a paradigm commonly known as reinforcement learning. In the neuroscience literature, this is often framed in terms of Q-learning or temporal-difference learning, where decisions are made on the basis of learned values for every state and action. In this journal club we focus on an alternative approach to reinforcement learning in which a policy is instead learned by directly optimizing the expected future reward. We start with an introduction to such ‘policy gradient’ reinforcement learning by deriving the canonical ‘REINFORCE’ algorithm and giving an overview of techniques used to reduce variance and stabilize learning. We then discuss how such policy gradient methods could potentially be implemented in biological circuits using well-known synaptic plasticity rules. Finally, we consider a case study of how policy gradient methods can be used to model biological agents and provide insights into the structure and function of neural circuits.
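
As a rough illustration of the REINFORCE update mentioned in the summary, here is a minimal sketch in Python. It is not taken from the talk or the readings. It assumes a hypothetical two-armed Bernoulli bandit, a softmax policy over action preferences, and a running-average reward baseline as one simple variance-reduction technique. The names `reinforce_bandit`, `reward_probs`, `lr` and `steps` are illustrative choices only.

```python
import math
import random


def softmax(prefs):
    """Convert action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(reward_probs, steps=5000, lr=0.1, seed=0):
    """REINFORCE with a running-average baseline on a Bernoulli bandit.

    reward_probs: hypothetical probability of reward for each action.
    Returns the final action probabilities of the softmax policy.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    theta = [0.0] * n   # policy parameters (action preferences)
    baseline = 0.0      # running mean of past rewards, used to reduce variance
    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(n), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        advantage = reward - baseline
        # For a softmax policy: d log pi(a) / d theta_k = 1[k == a] - pi(k).
        for k in range(n):
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * advantage * grad_log
        # Update the baseline only after it has been used for this step.
        baseline += (reward - baseline) / t
    return softmax(theta)


if __name__ == "__main__":
    print(reinforce_bandit([0.2, 0.8]))
```

The policy parameters move along the reward-weighted gradient of the log-probability of the sampled action, so no value function over states and actions is learned. Subtracting the baseline leaves the expected gradient unchanged while typically lowering its variance.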

Relevant reading:

Levine (2021). Berkeley CS 285 Lecture 5 notes (introduction to policy gradient methods and variance reduction). http://rail.eecs.berkeley.edu/deeprlcourse/static/slides/lec-5.pdf.

Fremaux et al. (2010). “Functional Requirements for Reward-Modulated Spike-Timing-Dependent Plasticity.” https://www.jneurosci.org/content/30/40/13326.

Wang & Kurth-Nelson et al. (2018). “Prefrontal cortex as a meta-reinforcement learning system.” https://www.nature.com/articles/s41593-018-0147-8.

This talk is part of the Computational Neuroscience series.
