Large language models and human cognition
If you have a question about this talk, please contact hes24.

Large language models (LLMs) have remarkable capabilities, but they make mistakes that seem surprisingly elementary. For example, one of today's best LLMs says "A word like delicious but with one letter different is precious", and other state-of-the-art LLMs make similar mistakes. I will argue that LLM failures are understandable given the fundamental techniques on which LLMs are built, namely tokenization, transformers, pretraining, and alignment. I will then discuss how the failure modes of LLMs resemble known failure modes of human thought, and how each of the four central techniques has an analog in human cognition and learning. Moving on to multimodal models and the latest agentic and chain-of-thought models, I will discuss how a similar analysis applies to them as well. The discussion will lead to three conclusions: human cognition is based on processing similar to that done by LLMs; human-level artificial intelligence is hence within reach; but superintelligence, for common meanings of that word, is not on the horizon.

Charles Elkan is currently an affiliate professor of computer science at the University of California, San Diego, where he was previously a tenured full professor for many years. In recent years he has worked in New York as the co-founder of Ficc.ai (https://www.ficc.ai), as a venture partner at Fellows Fund (https://www.fellowsfundvc.com), and as a consultant. Until 2020 he was a managing director and the global head of machine learning at Goldman Sachs, and from 2014 to 2018 he was the first Amazon Fellow, leading scientists and engineers in Seattle, Palo Alto, and New York in machine learning research and development for both e-commerce and cloud computing. He earned his Ph.D. at Cornell University and his undergraduate degree in mathematics at Cambridge. His students have gone on to professorships at universities including Columbia and Carnegie Mellon, and to central roles in industry, including as team lead for multiple generations of ChatGPT and for AI in Google Search.

This talk is part of the ML@CL Ad-hoc Seminar Series.
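The tokenization point in the abstract can be illustrated with a toy sketch. The vocabulary and greedy longest-match segmentation below are hypothetical, chosen only for this example, and are not the tokenizer of any real model: the idea is simply that a model operating on subword tokens never directly sees individual letters, so a character-level question like "one letter different" is not represented in its input.

```python
# Hypothetical subword vocabulary, for illustration only.
TOY_VOCAB = ["del", "icious", "prec", "ious", "ic", "li", "de", "s"]

def tokenize(word, vocab=TOY_VOCAB):
    """Greedy longest-match segmentation of a word into subword tokens."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, shrinking until a match.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # No vocabulary entry matched: fall back to a single character.
            tokens.append(word[i])
            i += 1
    return tokens

# The two words from the abstract segment into entirely different tokens,
# so their letter-by-letter relationship is invisible at the token level.
print(tokenize("delicious"))  # ['del', 'icious']
print(tokenize("precious"))   # ['prec', 'ious']
```

Under this toy segmentation, "delicious" and "precious" share no tokens at all, even though they overlap heavily letter by letter, which is one way to see why letter-counting claims can go wrong.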