Language Modelling with Phonemes

Zebulon Youra Goriely (University of Cambridge)
Friday 25 October 2024, 12:00-13:00
Zoom link: https://cam-ac-uk.zoom.us/j/4751389294?pwd=Z2ZOSDk0eG1wZldVWG1GVVhrTzFIZz09.

If you have a question about this talk, please contact Suchir Salhan.

The statistical properties of language and how they may be used in language processing and language acquisition have been studied for many decades. Recently, large language models have demonstrated striking language-learning capabilities, providing evidence for the “richness” of the linguistic stimulus, but they are often trained on data that seems cognitively implausible in terms of both quantity (thousands of human lifetimes) and quality (written text, internet sources). For these models to help us study language, we must think far more carefully about the plausibility of the input: using phonemes instead of letters, using spoken sources, and reducing the quantity of data. We must then determine whether the architectures we use are suitable at this scale and with this input representation. These models can then give us valuable analytical insights into the statistical properties and learnability of language, as well as practical benefits for tasks associated with language modelling and language understanding.

Speaker Biography

Zebulon Goriely is a fourth-year PhD student working on Transformer Language Models and Child Language Acquisition, supervised by Professor Paula Buttery.

This talk is part of the NLIP Seminar Series.
