Inducing Synchronous Grammars for Machine Translation
If you have a question about this talk, please contact Laura Rimell.
In this talk I’ll outline my work modelling statistical machine translation
(SMT) as a probabilistic machine learning problem.
Although SMT systems have made large gains in translation quality in recent
years, most are currently induced using a hand-engineered pipeline of
disparate models linked by heuristics. Such techniques are effective
for translating between related languages (e.g. English and French), but they
fail to capture the latent structure necessary to translate between languages
that diverge significantly in syntactic structure, such as Chinese and English.
I’ll present non-parametric Bayesian models for inducing synchronous
context-free grammars. These models are capable of learning the latent
structure of translation equivalence from a corpus of parallel string pairs.
I’ll discuss the difficult inference problems posed by such models and describe
Monte Carlo sampling techniques that can help solve them. Finally, I’ll present
experiments demonstrating competitive results on full-scale translation
evaluations.
This talk is part of the NLIP Seminar Series.