Machine Translation with LSTMs

If you have a question about this talk, please contact Dr Jes Frellsen.

Please note: the room has changed to LR3B in the Inglis Building and the time to 10:30.

Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this talk, I will present a general end-to-end approach to sequence learning that makes minimal assumptions about the sequence structure. The method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of fixed dimensionality, and then another deep LSTM to decode the target sequence from that vector. The main result is that on an English-to-French translation task from the WMT '14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM's BLEU score is penalized on out-of-vocabulary words. While this performance is respectable, it is worse than the state-of-the-art performance on this dataset (37.0), mainly because of the LSTM's inability to translate out-of-vocabulary (OOV) words.

In the second half of the talk, I will present a simple method for addressing the OOV problem. The method consists of annotating each OOV word in the training set with a “pointer” to its origin in the source sentence, which makes it easy to translate the OOV words at test time using a dictionary. The new method achieves a BLEU score of 37.5, which is a new state of the art.
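For illustration, here is a minimal sketch of the kind of encoder-decoder model the abstract describes: one multilayer LSTM compresses the source sentence into a fixed-dimensional state, a second LSTM decodes the target sentence from that state, and a toy post-processing step shows how a "pointer" annotation lets an out-of-vocabulary token be looked up in a dictionary. It is written in PyTorch; all names and sizes (Seq2SeqLSTM, replace_oov, the hidden dimensions) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class Seq2SeqLSTM(nn.Module):
    """Illustrative encoder-decoder built from two multilayer LSTMs."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hidden_dim=512, layers=4):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # The encoder reads the source sentence; its final (h, c) states are the
        # fixed-dimensional vector that summarises the input sequence.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, num_layers=layers, batch_first=True)
        # The decoder is initialised with that vector and generates the target
        # sentence one token at a time.
        self.decoder = nn.LSTM(emb_dim, hidden_dim, num_layers=layers, batch_first=True)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # src_ids, tgt_ids: (batch, time) integer token indices
        _, state = self.encoder(self.src_emb(src_ids))            # state = (h, c)
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)   # teacher forcing
        return self.out(dec_out)                                  # (batch, time, tgt_vocab)


def replace_oov(output_tokens, pointers, source_tokens, dictionary):
    """Toy version of the OOV fix: each emitted <unk> carries a pointer to its
    source position; translate that source word with a dictionary, or copy it."""
    result = []
    for tok, src_pos in zip(output_tokens, pointers):
        if tok == "<unk>" and src_pos is not None:
            src_word = source_tokens[src_pos]
            result.append(dictionary.get(src_word, src_word))
        else:
            result.append(tok)
    return result
```

In this sketch the model would be trained by maximising the likelihood of the reference translation given the source (cross-entropy over the decoder outputs), and at test time the decoder would be unrolled token by token, with any emitted unknown-word token resolved from its source-side pointer via a dictionary, as in the second half of the talk.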

This is joint work with Thang Luong, Oriol Vinyals, Quoc Le, and Wojciech Zaremba.

This talk is part of the Machine Learning @ CUED series.
