
"What it can create, it may not understand" Studying the Limits of Transformers.


If you have a question about this talk, please contact Panagiotis Fytas.

Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning, taking only seconds to produce outputs that would challenge or exceed the capabilities of even expert humans. Yet these same models fail on surprisingly trivial problems. This presents an apparent paradox: how do we reconcile seemingly superhuman capabilities with persistent errors that few humans would make? Are these errors incidental, or do they signal more substantial limitations? In an attempt to demystify Transformers, in this talk I will discuss the limits of LLMs across three different compositional tasks. Our findings show that although LLMs can outperform humans in generation, they consistently fall short of human capabilities on measures of understanding, exhibiting a weaker correlation between generation and understanding performance and greater brittleness to adversarial inputs. We further show that Transformers often solve multi-step compositional problems by reducing multi-step compositional reasoning to linearized subgraph matching, without necessarily developing systematic problem-solving skills. Overall, our findings support the hypothesis that models’ generative capability may not be contingent on understanding capability, and they call for caution in interpreting artificial intelligence by analogy to human intelligence.

This talk is part of the Language Technology Lab Seminars series.
