BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Large-scale Retrieval with Ivory and MapReduce - Tamer Elsayed\, C
 airo Microsoft Innovation Centre (CMIC)
DTSTART:20111031T103000Z
DTEND:20111031T113000Z
UID:TALK34417@talks.cam.ac.uk
CONTACT:Microsoft Research Cambridge Talks Admins
DESCRIPTION:It is commonly acknowledged that web-scale collections have ou
 tgrown the capabilities of individual machines\, necessitating the use of 
 clusters to tackle many problems in information retrieval. The release of 
 the 25-terabyte billion-page ClueWeb09 collection in 2009 and the increasi
 ng popularity of Hadoop\, the open source implementation of the MapReduce 
 distributed framework\, have motivated academic researchers to think more 
 seriously about cluster-based distributed retrieval solutions. \nIn this t
 alk\, we will first introduce Ivory\, an end-to-end open-source distribute
 d retrieval system built at the University of Maryland\, College Park\; Iv
 ory takes full advantage of Hadoop and its underlying distributed file sys
 tem for both indexing and retrieval. We will then present an overview of s
 ever
 al research projects that have evolved around Ivory\, such as approximat
 e positional indexing for efficient ranked retrieval\, scalable monolingu
 al and cross-lingual pairwise document similarity\, and automatically ext
 racted pseudo test collections for learning ranking functions for web sea
 rch.\n
LOCATION:Small lecture theatre\, Microsoft Research Ltd\, 7 J J Thomson Av
 enue (Off Madingley Road)\, Cambridge
END:VEVENT
END:VCALENDAR