Bayesian non-parametric models for parsing and translation

If you have a question about this talk, please contact Laura Rimell.

Context-free grammars (CFGs) have long been popular for modelling natural language syntax and translation between human languages. However, the underlying independence assumptions made by the model are much too stringent for accurate data modelling. Considerable research effort has focussed on using linguistic intuitions to enrich CFGs, resulting in state-of-the-art parsing performance. In this talk, I take a different approach by learning an enriched grammar directly from the data without recourse to linguistic knowledge. Instead, the grammar is an emergent structure, found by unsupervised inference in a Bayesian model of tree-substitution grammar (TSG; a.k.a. DOP). Bayesian methods provide an elegant and theoretically principled way to model TSG by including a prior over the grammar and integrating over uncertain events. I’ll describe non-parametric Bayesian models for two related tasks: 1) learning a TSG for syntactic parsing and 2) learning a synchronous TSG for machine translation. The models learn compact and simple grammars, uncovering latent linguistic structures, and in doing so outperform competitive baselines.

This is joint work with Phil Blunsom and Sharon Goldwater.

http://www.dcs.shef.ac.uk/~tcohn/

This talk is part of the NLIP Seminar Series.

