Unsupervised Morphological Segmentation with Log-Linear Models

Diarmuid Ó Séaghdha (University of Cambridge)
Monday 08 March 2010, 12:30-13:30
GS15, Computer Laboratory.

If you have a question about this talk, please contact Diarmuid Ó Séaghdha.

At this session of the NLIP Reading Group we’ll be discussing the following paper:

Hoifung Poon, Colin Cherry and Kristina Toutanova. 2009. Unsupervised Morphological Segmentation with Log-Linear Models. In Proceedings of NAACL-09.

Abstract: Morphological segmentation breaks words into morphemes (the basic semantic units). It is a key component for natural language processing systems. Unsupervised morphological segmentation is attractive, because in every language there are virtually unlimited supplies of text, but very few labeled resources. However, most existing model-based systems for unsupervised morphological segmentation use directed generative models, making it difficult to leverage arbitrary overlapping features that are potentially helpful to learning. In this paper, we present the first log-linear model for unsupervised morphological segmentation. Our model uses overlapping features such as morphemes and their contexts, and incorporates exponential priors inspired by the minimum description length (MDL) principle. We present efficient algorithms for learning and inference by combining contrastive estimation with sampling. Our system, based on monolingual features only, outperforms a state-of-the-art system by a large margin, even when the latter uses bilingual information such as phrasal alignment and phonetic correspondence. On the Arabic Penn Treebank, our system reduces F1 error by 11% compared to Morfessor.

This talk is part of the Natural Language Processing Reading Group series.
