Appellation d'Origine Contrôlée - Language variability: a major challenge for natural language applications
- 👤 Speaker: Thierry Poibeau, LIPN-CNRS, Paris; RCEAL affiliated lecturer
- 📅 Date & Time: Tuesday 05 May 2009, 16:00 - 17:30
- 📍 Venue: GR-06/07, English Faculty Building
Abstract
Many Natural Language Processing (NLP) applications are challenged by the great variability of language – the fact that the same word may refer to various things and that the same idea could be expressed in a variety of ways. This talk will focus on language variability and, in particular, on how existing NLP techniques handle it.
I will first focus on named entities, i.e. sequences of text corresponding to person names, location names, dates, currencies, etc. Named entities are important for text processing since they are good indicators of the content of texts and can serve as a basis for deeper analysis. They are typically considered “rigid designators”, unambiguously referring to a single, stable entity in the world. I will show that this assumption is not always correct; rather, the meaning of a named entity can be affected by context. I will illustrate this with the case of metonymy and show that, although metonymy is a relatively well-understood linguistic phenomenon, it is difficult to analyse using a fully automatic approach.
I will then turn to discourse processing, specifically a task which aims to automatically structure free text according to a set of semantic principles. Automatic discourse analysis is challenging since it requires considering multiple linguistic cues and their interaction in complex patterns. These patterns may include conflicting information among which the parser has to choose. I will present a framework specifically designed to choose an optimal solution from a range of complex, interacting constraints that sometimes contradict one another. This approach is implemented and evaluated for Health Practices Guidelines (i.e. short documents describing the practices that physicians should follow).
In the conclusion, I will discuss the dependency of computational approaches on language usage: the meaning of a linguistic item is largely dependent on context, and the context is difficult to model in advance.
Series
This talk is part of the RCEAL Tuesday Colloquia series.