How to Build a Data Lake in Amazon S3 & Amazon Glacier


When:
February 1 | 11:00 AM - 11:45 AM PT (2:00 PM - 2:45 PM ET)

Level 200 | Service How To
In this session, we discuss best practices for data ingestion, storage, cataloging, and analysis on Amazon object storage services. We examine ways to reduce or eliminate costly extract, transform, and load (ETL) processes using query-in-place technologies such as Amazon S3 Select, Amazon Glacier Select, Amazon Athena, and Amazon Redshift Spectrum. We also review custom analytics integration using Apache Spark, Apache Hive, Presto, and other technologies in Amazon EMR.

Learning Objectives:
• Understand the options for building an analytics platform that leverages Amazon S3 & Amazon Glacier
• Learn about the key considerations for ETL and other core analytics functions
• Determine whether query-in-place capabilities like Amazon S3 Select, Amazon Glacier Select, Amazon Athena, and Amazon Redshift Spectrum are a good fit for your use case

Who Should Attend: Storage Administrators, Data Scientists, Analytics Professionals

Speaker(s): PD Dutta, Sr. Product Manager, Amazon S3, AWS
