Big Data! Interactively Analyse 100GB of Data using Spark, Amazon EMR and Zeppelin

Raoul-Gabriel Urma and Valentin Dalibard
Thursday 16 March 2017, 19:00-21:00
Eagle Labs, 28 Chesterton Road, Cambridge, CB4 3AZ, United Kingdom.

If you have a question about this talk, please contact Chih-Chun Chen.

Register via eventbrite: https://www.eventbrite.co.uk/e/cambridge-tech-talk-big-data-interactively-analyse-100gb-of-data-using-spark-amazon-emr-and-zeppelin-tickets-32617949164

You may have heard a lot of buzz around Big Data, Apache Spark, Amazon Elastic MapReduce (EMR) and Apache Zeppelin. What’s the fuss about, and how can you benefit from these state-of-the-art technologies?

In this highly interactive session, you will learn how to leverage Spark to rapidly mine a large real-world data set. We will conduct the analysis live, entirely in an IPython Notebook, to show you how easy it can be to get to grips with these technologies.

In the first part of the session, we will characterise what Big Data is. We will then use a sample of data from the Open Library dataset, and you will learn how to apply common Spark patterns to extract insights and aggregate data. In the second part of the session, you will see how to leverage Spark on Amazon EMR to scale your data processing queries over a cluster of machines and interactively analyse a large data set (100GB) with a Zeppelin Notebook. Along the way, you will learn about common gotchas as well as useful performance and monitoring tips.

This talk is part of the Cambridge Spark talks series.

This talk is included in these lists:

Eagle Labs, 28 Chesterton Road, Cambridge, CB4 3AZ, United Kingdom

Note that ex-directory lists are not shown.
