Overview of: Measuring the non-compositionality of multiword expressions. [best paper award at COLING]

If you have a question about this talk, please contact Jimme Jardine.

Laura, recently back from COLING, will give us a brief summary of the conference and present the paper that won the best paper award there.

The paper is: Fan Bu and Xiaoyan Zhu. Measuring the non-compositionality of multiword expressions.

Although she will not cover them in the talk, two other papers Laura found interesting are listed below. They may just tickle your paper bits too…

As always, please volunteer the dates you would be willing to present a paper or two!!

Later, Jimme

-----

Fan Bu and Xiaoyan Zhu. Measuring the non-compositionality of multiword expressions.

Multiword Expressions (MWEs) appear frequently and ungrammatically in the natural languages. Identifying MWEs in free texts is a very challenging problem. This paper proposes a knowledge-free, training-free, and language-independent Multiword Expression Distance (MED). The new metric is derived from an accepted physical principle, measures the distance from an n-gram to its semantics, and outperforms other state-of-the-art methods on MWEs in two applications: question answering and named entity extraction.

Mark Johnson and Katherine Demuth. Unsupervised phonemic Chinese word segmentation using Adaptor Grammars.

Adaptor grammars are a framework for expressing and performing inference over a variety of non-parametric linguistic models. These models currently provide state-of-the-art performance on unsupervised word segmentation from phonemic representations of child-directed unsegmented English utterances. This paper investigates the applicability of these models to unsupervised word segmentation of Mandarin. We investigate a wide variety of different segmentation models, and show that the best segmentation accuracy is obtained from models that capture inter-word “collocational” dependencies. Surprisingly, enhancing the models to exploit syllable structure regularities and to capture tone information does not improve overall word segmentation accuracy, perhaps because the information these elements convey is redundant when compared to the inter-word dependencies.

Shachar Mirkin, Jonathan Berant, Ido Dagan, Eyal Shnarch. Recognising Entailment within Discourse.

Texts are commonly interpreted based on the entire discourse in which they are situated. Discourse processing has been shown useful for inference-based application; yet, most systems for textual entailment – a generic paradigm for applied inference – have only addressed discourse considerations via off-the-shelf coreference resolvers. In this paper we explore various discourse aspects in entailment inference, suggest initial solutions for them and investigate their impact on entailment performance. Our experiments suggest that discourse provides useful information, which significantly improves entailment inference, and should be better addressed by future entailment systems.

This talk is part of the Natural Language Processing Reading Group series.
