Multilingual Image Description with Neural Sequence Models
If you have a question about this talk, please contact Kris Cao.
We introduce multilingual image description, the task of generating descriptions of images given data in multiple languages. This can be viewed as visually-grounded machine translation, in which the image helps disambiguate the language. We present models for this task that are inspired by neural models for image description and machine translation. Our multilingual image description models generate target-language sentences using features transferred from separate models: multimodal features from a monolingual source-language image description model and visual features from an object recognition model. In experiments on a dataset of images paired with English and German sentences, using BLEU and Meteor as metrics, our models substantially improve upon existing monolingual image description models.
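The abstract describes conditioning a target-language decoder on features transferred from two separate pretrained models. The following is a minimal, illustrative sketch of that idea; the dimensions, token IDs, and randomly initialised weights are all assumptions for demonstration and do not come from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (illustrative only): visual features from an
# object-recognition model, multimodal features from a source-language
# (e.g. English) image description model.
VIS_DIM, MM_DIM, HID_DIM, VOCAB = 8, 6, 10, 5
BOS, EOS = 0, 4

# Toy vectors standing in for the transferred features.
visual_feats = rng.normal(size=VIS_DIM)
multimodal_feats = rng.normal(size=MM_DIM)

# Randomly initialised decoder parameters (untrained; for illustration).
W_init = rng.normal(size=(HID_DIM, VIS_DIM + MM_DIM))
W_rec = rng.normal(size=(HID_DIM, HID_DIM + VOCAB))
W_out = rng.normal(size=(VOCAB, HID_DIM))

def one_hot(i, n):
    v = np.zeros(n)
    v[i] = 1.0
    return v

def generate(max_len=10):
    """Greedy decoding: the concatenated transferred features initialise
    the hidden state; each step feeds back the previous target token."""
    h = np.tanh(W_init @ np.concatenate([visual_feats, multimodal_feats]))
    tokens, prev = [], BOS
    for _ in range(max_len):
        h = np.tanh(W_rec @ np.concatenate([h, one_hot(prev, VOCAB)]))
        prev = int(np.argmax(W_out @ h))
        tokens.append(prev)
        if prev == EOS:
            break
    return tokens

sentence = generate()
```

The key design point mirrored here is that the target-language generator itself contains no encoder: it consumes fixed feature vectors produced by separately trained models.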
This talk is part of the NLIP Seminar Series.