BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Musketeer: all for one\, one for all in data processing systems  -
  Ionel Gog (University of Cambridge)
DTSTART:20150409T140000Z
DTEND:20150409T150000Z
UID:TALK58375@talks.cam.ac.uk
CONTACT:Eiko Yoneki
DESCRIPTION:Many systems for the parallel processing of big data are avail
 able today. Yet\, few users can tell by intuition which system\, or combin
 ation of systems\, is “best” for a given workflow. Porting workflows b
 etween systems is tedious. Hence\, users become “locked in”\, despite 
 faster or more efficient systems being available. This is a direct consequ
 ence of the tight coupling between user-facing front-ends that express wor
 kflows (e.g.\, Hive\, SparkSQL\, Lindi\, GraphLINQ) and the back-end execu
 tion engines that run them (e.g.\, MapReduce\, Spark\, PowerGraph\, Naiad)
 .\n\nIn this talk\, I will present Musketeer\, a system that decouples the
  ways workflows are defined from the manner in which they are executed. Mu
 sketeer dynamically maps front-end workflow descriptions to a broad range 
 of back-end execution engines. Without requiring any manual porting effort
 \, users now have a choice of many systems. Musketeer currently supports f
 our high-level query languages and generates code for seven popular data p
 rocessing systems\, in some cases speeding up realistic workflows by up to
  9x.\n
LOCATION:FW26\, Computer Laboratory\, William Gates Building
END:VEVENT
END:VCALENDAR
