
Having a closer look at Multiword Expressions


  • Speaker: Aline Villavicencio (Federal University of Rio Grande do Sul, Brazil, and University of Bath)
  • Time: Tuesday 06 February 2007, 16:00-17:30
  • Venue: GR-05, English Faculty Building.

If you have a question about this talk, please contact Teresa Parodi.


The term Multiword Expressions (MWEs) has been used to describe expressions whose syntactic or semantic properties cannot be derived from those of their parts. It covers a large number of related but distinct phenomena, such as phrasal verbs (e.g. come along), nominal compounds (e.g. frying pan), institutionalised phrases (e.g. bread and butter), and many others.

Due to their heterogeneous characteristics, MWEs present a tough challenge for both linguistic and computational work (Sag et al., 2002). However, they are an integral part of language and their importance has long been recognised. Moreover, since the number of such expressions in a speaker's lexicon is comparable to the number of single-word units (Jackendoff, 1997), an appropriate treatment of MWEs is important for many language technology tasks and applications. This is reflected in several existing grammars and lexical resources, where almost half of the entries are MWEs. Nonetheless, regardless of how large a hand-crafted wide-coverage resource is, there will always be words and constructions that it does not include. As a result, MWEs still cause a large number of problems, such as parse failures due to missing lexical entries or syntactic idiosyncrasies.

In this talk I give a brief summary of what multiword expressions are and some of the challenges that they pose. I then present some proposals for detecting MWEs in corpora and subsequently handling them. These range from language-specific methods for dealing with a particular type of MWE to more language-independent approaches for treating MWEs in general, using resources like corpora and the World Wide Web and a combination of linguistic and statistical information.

This talk is part of the RCEAL Tuesday Colloquia series.


