Formal syntactic theory in the current NLP landscape

If you have a question about this talk, please contact Suchir Salhan.

In its early days, natural language processing relied on formal methods, including formal theories of syntax wherever sentence structure was relevant. In the statistical era, the focus shifted to annotation schemes such as the Penn Treebank and Universal Dependencies, which have their origins in formal theory but prioritize simplicity over consistency. Now, in the era of deep learning, most training forgoes annotation altogether, yet annotated corpora remain crucial for evaluating and interpreting the output of language models. In this context, it is important that the theory underlying the annotation be consistent and, furthermore, developed independently of NLP tasks. I will talk about our recent work with Head-driven Phrase Structure Grammar (HPSG), a formal theory of syntax. We have used HPSG to grow and improve existing corpora of Spanish and to improve the parsing speed of the English Resource Grammar, so that the English corpora can be grown more easily. Currently, we are using the English Resource Grammar to study linguistic properties of texts generated by LLMs, including looking for systematic differences from similar texts written by people. I will present this work in progress at the end of the talk.

This talk is part of the NLIP Seminar Series.

© 2006-2025 Talks.cam, University of Cambridge.