
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages


If you have a question about this talk, please contact Marinela Parovic.

Reliable evaluation benchmarks designed for replicability and comprehensiveness have driven progress in machine learning. However, for lack of a multilingual benchmark, vision-and-language research has mostly focused on English-language tasks. In this talk, I will present the Image-Grounded Language Understanding Evaluation (IGLUE) benchmark, which aims to fill this gap. By both aggregating pre-existing datasets and creating new ones, IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages. The benchmark enables the evaluation of multilingual multimodal models for transfer learning, not only in a zero-shot setting but also in newly defined few-shot learning setups. Evaluating the available state-of-the-art models, we find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks. Moreover, downstream performance is partially explained by the amount of unlabelled textual data available for pretraining, and only weakly by the typological distance between target and source languages. We hope to encourage future research in this area by releasing the benchmark to the community.
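For readers unfamiliar with the three transfer regimes the abstract contrasts, here is a minimal Python sketch of how they differ. Every name in it (fine_tune, evaluate, translate) is a hypothetical placeholder chosen for illustration, not part of the IGLUE codebase or any actual API.

```python
# Minimal sketch of the three cross-lingual transfer regimes contrasted
# in the abstract. All function names here are hypothetical placeholders,
# not the IGLUE codebase's actual API.

def fine_tune(model, labelled_examples):
    """Placeholder: return `model` after training on `labelled_examples`."""
    return model  # no-op in this sketch

def evaluate(model, test_examples):
    """Placeholder: return a task score for `model` on `test_examples`."""
    return 0.0  # no-op in this sketch

def translate(examples, source_lang, target_lang):
    """Placeholder: machine-translate the text fields of `examples`."""
    return examples  # identity in this sketch

def zero_shot(model, en_train, tgt_test):
    # Fine-tune on English labelled data only; evaluate directly on the
    # target-language test set.
    return evaluate(fine_tune(model, en_train), tgt_test)

def few_shot(model, en_train, tgt_shots, tgt_test):
    # As zero-shot, but additionally fine-tune on a handful of labelled
    # target-language examples before evaluating.
    model = fine_tune(model, en_train)
    model = fine_tune(model, tgt_shots)
    return evaluate(model, tgt_test)

def translate_test(model, en_train, tgt_test, tgt_lang):
    # Fine-tune on English; machine-translate the target-language test
    # set into English and evaluate on the translation ("translate-test").
    model = fine_tune(model, en_train)
    return evaluate(model, translate(tgt_test, tgt_lang, "en"))
```

Under this protocol, the abstract's finding that translate-test transfer beats zero-shot transfer corresponds to translate_test scoring higher than zero_shot for the same model and task.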

This talk is part of the Language Technology Lab Seminars series.
