Musketeer: all for one, one for all in data processing systems
- Speaker: Ionel Gog (University of Cambridge)
- Date & Time: Thursday 09 April 2015, 15:00 - 16:00
- Venue: FW26, Computer Laboratory, William Gates Building
Abstract
Many systems for the parallel processing of big data are available today. Yet, few users can tell by intuition which system, or combination of systems, is “best” for a given workflow. Porting workflows between systems is tedious. Hence, users become “locked in”, despite faster or more efficient systems being available. This is a direct consequence of the tight coupling between user-facing front-ends that express workflows (e.g., Hive, SparkSQL, Lindi, GraphLINQ) and the back-end execution engines that run them (e.g., MapReduce, Spark, PowerGraph, Naiad).
In this talk, I will present Musketeer, a system that decouples the ways workflows are defined from the manner in which they are executed. Musketeer dynamically maps front-end workflow descriptions to a broad range of back-end execution engines. Without requiring any manual porting effort, users now have a choice of many systems. Musketeer currently supports four high-level query languages and generates code for seven popular data processing systems, in some cases speeding up realistic workflows by up to 9x.
Series
This talk is part of the Computer Laboratory Systems Research Group Seminar series.