
Does Syntax Still Matter in the World of LLMs?


If you have a question about this talk, please contact Michael Schlichtkrull.

Abstract:

Large Language Models (LLMs) have recently shown such impressive results that some cognitive scientists claim syntactic theories should be abandoned as an explanation of human language in favour of LLMs. I will provide evidence that syntax is still beneficial in both scientific and engineering pursuits involving human language. First, LLMs provide neither a prediction nor an explanation of the universal properties of all human languages, unlike the syntactic theory considered here. Second, activity in some regions of the human brain is better accounted for by an incremental syntactic parser than by LLM surprisal. Finally, LLMs can work even better if augmented with syntactic compositional structure. If that is so, you might ask, why is syntax not more popular in NLP? I believe it is because modern hardware accelerators (GPUs and TPUs) are not optimal for tree-like computation, so it is difficult to train large-scale syntactic models. To address this, we have created a JAX library called SynJAX that makes it easier to build syntactic models that run efficiently on GPUs and TPUs.

Bio:

Miloš Stanojević is a Senior Research Scientist at Google DeepMind. Prior to that he was a postdoc at the University of Edinburgh with Mark Steedman, where he worked on Combinatory Categorial Grammars (CCG) and collaborated with Ed Stabler on Minimalist Grammars. He received a PhD from the University of Amsterdam for work on machine translation. His main research interest is in bridging the gap between theoretical linguistics and natural language processing by bringing the right inductive biases to machine learning models of language.

This talk is part of the NLIP Seminar Series.
