University of Cambridge > > Natural Language Processing Reading Group > Unsupervised Multilingual Learning for Morphological Segmentation

Unsupervised Multilingual Learning for Morphological Segmentation

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact Diarmuid Ó Séaghdha.

At this session of the NLIP Reading Group we’ll be discussing the following paper:

Benjamin Snyder and Regina Barzilay. 2008. Unsupervised Multilingual Learning for Morphological Segmentation. In Proceedings of ACL -08.

Abstract: For centuries, the deep connection between languages has brought about major discoveries about human communication. In this paper we investigate how this powerful source of information can be exploited for unsupervised language learning. In particular, we study the task of morphological segmentation of multiple languages. We present a nonparametric Bayesian model that jointly induces morpheme segmentations of each language under consideration and at the same time identifies cross-lingual morpheme patterns, or abstract morphemes. We apply our model to three Semitic languages: Arabic, Hebrew, Aramaic, as well as to English. Our results demonstrate that learning morphological models in tandem reduces error by up to 24% relative to monolingual models. Furthermore, we provide evidence that our joint model achieves better performance when applied to languages from the same family.

This talk is part of the Natural Language Processing Reading Group series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.


© 2006-2023, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity