Mean-field dynamics and training of deep transformers
SCLW01 - Bridging Stochastic Control and Reinforcement Learning: Theories and Applications

In this talk, we will examine continuous limits of transformer architectures, which form the basis of common generative models. There is a rich literature on the limiting behaviour of neural networks, including the large-width limit of single-layer networks (mean-field analysis) and the large-depth limit of residual networks (neural ODE and SDE analysis). Here, we consider limits of transformers with attention, with scaling in the number of layers, tokens, and attention heads. The analysis reveals that, for plausible training outputs, a McKean–Vlasov limit arises, with or without diffusive common noise. Joint work with William Gibson.

This talk is part of the Isaac Newton Institute Seminar Series.
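The large-depth view of attention mentioned in the abstract can be illustrated as an interacting particle system: tokens are particles, each layer is a small time step, and every token drifts toward a softmax-weighted average of the others. The sketch below is an illustrative toy only, not the speaker's construction; the matrices `Q`, `K`, `V`, the Euler stepping, and the sphere projection (mimicking layer normalisation) are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 4                                   # number of tokens, embedding dimension
Q = rng.normal(size=(d, d))                   # query, key, value projections (toy choices)
K = rng.normal(size=(d, d))
V = rng.normal(size=(d, d))

def drift(X):
    """Attention drift: each token moves toward a softmax-weighted
    combination of the value-projected tokens."""
    S = (X @ Q) @ (X @ K).T / np.sqrt(d)      # scaled dot-product scores, shape (n, n)
    A = np.exp(S - S.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)         # row-wise softmax: interaction weights
    return A @ (X @ V)

# Euler integration of dX/dt = drift(X): many layers ~ many small time steps.
X = rng.normal(size=(n, d))
dt, T = 0.05, 100
for _ in range(T):
    X = X + dt * drift(X)
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # project back to the sphere
```

As `n` grows, the empirical measure of the tokens is the object that (in the literature on such limits) satisfies a mean-field equation of McKean–Vlasov type; the finite-`n` loop above is just its particle approximation.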