Broadcast Date: June 17, 2019

Level: 200

Every data lake initiative begins with setting up extract, transform, and load (ETL) processes that move data from various data sources into a central data repository. In this tech talk, we will show how you can use AWS Glue to build, automate, and manage ETL jobs on a scalable, serverless Apache Spark platform. We will also show how AWS Glue supports Python shell jobs in addition to Spark jobs.

Learning Objectives

  • Learn about building a data lake on AWS
  • Discover how to create ETL processes using AWS Glue
  • Understand how serverless Spark and Python jobs reduce costs
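To make the ETL pattern the talk covers concrete, here is a minimal pure-Python sketch of an extract, transform, and load flow. This is an illustration only: the function names, field names, and sample data are hypothetical, and a real AWS Glue job would instead use the `awsglue` library's `GlueContext` and `DynamicFrame` APIs on Spark.

```python
import csv
import io


def extract(raw_csv: str) -> list[dict]:
    """Extract: parse raw CSV from a source (e.g. an object in S3) into rows."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows: list[dict]) -> list[dict]:
    """Transform: normalize types and drop incomplete records."""
    out = []
    for row in rows:
        if not row.get("order_id"):
            continue  # skip rows missing the key field
        out.append({
            "order_id": row["order_id"],
            # store currency as integer cents to avoid float drift downstream
            "amount_cents": int(round(float(row["amount"]) * 100)),
        })
    return out


def load(rows: list[dict]) -> str:
    """Load: serialize the cleaned rows back to CSV for the target store."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["order_id", "amount_cents"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample input: one valid row, one row missing its key.
raw = "order_id,amount\n1,19.99\n,5.00\n"
result = load(transform(extract(raw)))
```

In a Glue job the same three stages map onto reading from a source via the Data Catalog, applying transforms, and writing to a target, with Glue provisioning and tearing down the underlying Spark (or Python shell) environment for you.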

Who Should Attend?

Analysts, Developers, Data Scientists, Data Engineers, DBAs

Speaker

  • Raghu Prabhu, Sr. Business Development Manager, AWS

Learn More

To learn more about the services featured in this talk, please visit the AWS Glue product page.


Download the Slide Deck

