BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Big Data! Interactively Analyse 100GB of Data using Spark\, Amazon
  EMR and Zeppelin - Raoul-Gabriel Urma and Valentin Dalibard
DTSTART:20170316T190000Z
DTEND:20170316T210000Z
UID:TALK71588@talks.cam.ac.uk
CONTACT:Chih-Chun Chen
DESCRIPTION:Register via eventbrite: https://www.eventbrite.co.uk/e/cambri
 dge-tech-talk-big-data-interactively-analyse-100gb-of-data-using-spark-ama
 zon-emr-and-zeppelin-tickets-32617949164\n\nYou may have been hearing a lo
 t of buzz around Big Data\, Apache Spark\, Amazon Elastic MapReduce (EMR)
  and Apache Zeppelin. What’s the fuss about\, and how can you benefit fr
 om these state-of-the-art technologies?\n\nIn this highly interactive sess
 ion\, you will learn how to leverage Spark to rapidly mine a large real-wo
 rld data set. We will conduct the analysis live entirely using an IPython
  Notebook to show you how easy it can be to get to grips with these technol
 ogies.\n\nIn the first part of the session\, we characterise what Big Data
  is. We will then use a sample of data from the Open Library dataset\, and
  you will learn how to apply common Spark patterns to extract insights and
  aggregate data. In the second part of the session\, you will see how to l
 everage Spark on Amazon EMR to scale your data processing queries over a c
 luster of machines and interactively analyse a large data set (100GB) with
  a Zeppelin Notebook. Along the way\, you will learn gotchas as well as us
 eful performance and monitoring tips.
LOCATION:Eagle Labs\, 28 Chesterton Road\, Cambridge\, CB4 3AZ\, United Ki
 ngdom
END:VEVENT
END:VCALENDAR
