The Geometry of Machine Translation

If you have a question about this talk, please contact Tamara Polajnar.

Most modern statistical machine translation systems are based on linear statistical models. One extremely effective method for estimating the model parameters is minimum error rate training (MERT), which is an efficient form of line search adapted to the highly non-linear objective functions used in machine translation. We will show that MERT can be represented using convex geometry, which is the mathematics of polytopes and their faces. Using this geometric representation of MERT, we investigate whether the optimisation of linear models is tractable in general. It has been believed that the number of feasible solutions of a linear model is exponential in the number of sentences used for parameter estimation; however, we show that the exponential complexity is instead due to the feature dimension. This result has important ramifications because it suggests that the current trend of building statistical machine translation systems by introducing very large numbers of sparse features is inherently not robust.
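As a rough illustration of the line search mentioned in the abstract, the sketch below handles a single sentence. It assumes each candidate translation has a feature vector h and is scored by the linear model w·h. Along a search line w = w0 + γ·d, each candidate's score is a straight line in γ, so the top-scoring candidate changes only at the breakpoints of the upper envelope of those lines. The function names (`upper_envelope`, `line_search_intervals`) and the toy feature vectors are hypothetical and are not taken from the talk.

```python
def upper_envelope(lines):
    """Upper envelope of lines given as (slope, intercept) pairs.

    Returns a list of (start_gamma, index) pairs: line `index` is the
    highest-scoring one from `start_gamma` up to the next entry's start.
    """
    # Process lines by increasing slope; ties are ordered by intercept.
    order = sorted(range(len(lines)), key=lambda i: lines[i])
    hull = []
    for i in order:
        m, c = lines[i]
        start = float("-inf")
        while hull:
            start_j, j = hull[-1]
            mj, cj = lines[j]
            if m == mj:
                # Same slope: the current line has the larger (or equal)
                # intercept because of the sort order, so j is dominated.
                hull.pop()
                continue
            x = (cj - c) / (m - mj)  # gamma where line i overtakes line j
            if x <= start_j:
                hull.pop()  # j is overtaken before it ever becomes best
                continue
            start = x
            break
        hull.append((start, i))
    return hull


def line_search_intervals(features, w0, d):
    """Map each candidate to a line in gamma and return the envelope."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # score(gamma) = w0.h + gamma * d.h  ->  (slope, intercept)
    lines = [(dot(d, h), dot(w0, h)) for h in features]
    return upper_envelope(lines)


if __name__ == "__main__":
    # Toy candidates for one sentence (made-up feature vectors).
    features = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.25), (1.0, -1.0)]
    for start, idx in line_search_intervals(features, (1.0, 0.0), (0.0, 1.0)):
        print(f"from gamma = {start}: candidate {idx} is selected")
```

In this toy setting candidate 2 is never selected for any γ along the line. A full MERT implementation would additionally merge the breakpoints from all sentences and evaluate the error metric on each resulting interval, which this sketch omits.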

This talk is part of the NLIP Seminar Series.


© 2006-2024 Talks.cam, University of Cambridge.