Multimodal natural language processing: when text is not enough
If you have a question about this talk, please contact James Thorne.
In this talk I will provide an overview of work on multimodal machine learning, where visual information is used to build richer context models for natural language tasks. Most of the talk will focus on approaches to machine translation that exploit both textual and visual information to deal with complex linguistic ambiguities as well as common linguistic biases. I will cover state-of-the-art approaches and their limitations, and describe a number of studies on whether and when visual information can be beneficial to the task.
This talk is part of the NLIP Seminar Series.