Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks

If you have a question about this talk, please contact Diarmuid Ó Séaghdha.

I’ll be presenting and discussing the following paper from EMNLP 2008:

Rion Snow, Brendan O’Connor, Daniel Jurafsky and Andrew Y. Ng. Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks. Proceedings of EMNLP 2008.

Abstract:

Human linguistic annotation is crucial for many natural language processing tasks but can be expensive and time-consuming. We explore the use of Amazon’s Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web. We investigate five tasks: affect recognition, word similarity, recognizing textual entailment, event temporal ordering, and word sense disambiguation. For all five, we show high agreement between Mechanical Turk non-expert annotations and existing gold standard labels provided by expert labelers. For the task of affect recognition, we also show that using non-expert labels for training machine learning algorithms can be as effective as using gold standard annotations from experts. We propose a technique for bias correction that significantly improves annotation quality on two tasks. We conclude that many large labeling tasks can be effectively designed and carried out in this method at a fraction of the usual expense.

This talk is part of the Natural Language Processing Reading Group series.

