AWS Innovate - Databases & Analytics Edition


Join us for AWS Innovate - Databases and Analytics Edition. AWS Innovate is a free online conference designed for developers and DBAs looking to migrate their databases or build modern, data-driven applications using the latest database technologies, and for data engineers and data analysts looking to migrate their data warehouses or build data lakes to manage growing volumes of data and uncover new insights using a variety of analytics technologies.

Hear the very latest in databases and analytics in the keynote by Raju Gulabani, VP of Databases, Analytics and AI/ML at AWS. Dive deep into any of the 18 breakout sessions across four tracks, delivered by AWS experts. This free online conference is designed to give you practical insights into how to move your database workloads to the cloud, choose the right database for your application development needs, build a data lake quickly in the cloud, and analyze all of your data.

From the world's largest enterprises to emerging start-ups, more customers are deploying databases, data lakes, and data warehouses on AWS than anywhere else! 

100,000+ databases migrated to AWS
10,000+ data lakes and data warehouses deployed on AWS
Hundreds of thousands of customers rely on AWS Databases


Location: Online – Register to gain access to the on-demand recordings

Event Agenda

Full abstracts for the breakout sessions are listed under Session Details below.

Duration | Session Title

45 mins | Keynote

Track 1 - Get started: Moving your data to the cloud

40 mins | Migrate your Oracle and SQL Server databases to Amazon RDS (200)
40 mins | Migrate your on-premises Data Warehouse to Amazon Redshift (200)
40 mins | Why cloud databases like Amazon Aurora are more scalable and reliable (300)
40 mins | Deploying open source databases on AWS (200)


Track 2 - Building apps with modern databases

Which database to choose: Pick the right purpose-built database for the job (200)
Building large scale data-driven apps with AWS databases (300)
Extreme Performance at Cloud Scale: supercharge your real-time apps with Amazon ElastiCache (300)
Databases for building business-critical enterprise apps (300)


Track 3 - Get started: Building your data lake

How to go from zero to data lake in days (200)
Picking the right analytical engine for your needs (200)
Breaking the silos: Extending your Data Warehouse to your data lake (300)
Big data in the era of heavy privacy regulations (200)


Track 4 - Analyzing and getting insights

Amazon Redshift use cases and deployment patterns (400)
Processing Big Data with Hadoop, Spark, and other frameworks in Amazon EMR (300)
Scalable, secure log analytics with Amazon Elasticsearch Service (200)
High-performance data streaming and real-time analytics with AWS (300)


45 mins | Customer spotlight 1: How iRobot built the Roomba to use real-time data to smartly clean your home

A smart vacuum gathers a lot of data. See how iRobot, inventor of the Roomba, uses Redshift for real-time business intelligence and Athena+Aurora for the vacuum’s real-time processing.

or

45 mins | Customer spotlight 2: Equinox’s Data Warehouse modernization journey

This session highlights how Equinox uses their Amazon Redshift data warehouse and Amazon S3 data lake to set up an efficient and flexible analytics platform for clickstream data. This enables them to identify abandoned carts, optimize marketing spend, and analyze online and offline behavior, including behavior from before customers registered an account.

5 mins | Conclusion

Featured Speakers

  • Raju Gulabani, VP of Databases, Analytics & AI, AWS

    Raju Gulabani is VP of Databases, Analytics & AI within AWS at Amazon.com. He is responsible for P&L, product management, engineering, and operations for database services such as Amazon Aurora and Amazon DynamoDB, analytics services such as Amazon Redshift and Amazon EMR, and AI services such as Amazon Lex, Amazon Polly, and Amazon Rekognition. Prior to joining Amazon in 2010, Raju spent four years at Google, where he built the Google Apps business (now known as G Suite). Earlier in his career, he founded an Intel-backed Wi-Fi Voice over IP company and held engineering management positions at Microsoft.

  • Ryan Kelly, Data Architect, Equinox

    Ryan Kelly is a data architect at Equinox, where he helps outline and implement frameworks for data initiatives. He also leads clickstream tracking, which gives teams insight into their digital initiatives. Ryan loves making it easier for people to reach and ingest their data for business intelligence, analytics, and product/service enrichment. He also loves exploring and vetting new technologies to see how they can enhance what Equinox does.

  • Richard Boyd, Cloud Data Engineer, iRobot

    Richard Boyd is a cloud data engineer on the iRobot Corporation’s Cloud Data Platform team, where he builds tools and services to support the world’s most beloved vacuum cleaner. Before joining iRobot, Richard built discrete-event simulators for Amazon’s automated fulfillment centers at Amazon Robotics. His previous roles include cyber warfare systems analyst at MIT and researcher for the Center for Army Analysis. He holds advanced degrees in applied mathematics and statistics.

  • Todd Escalona, Solutions Architect Evangelist, AWS

    As a Solutions Architect Evangelist, Todd spends his time working directly with customers and partners around the globe, listening to understand their goals and working backwards from there. He defines requirements, provides architectural guidance around specific use cases, and assists in designing applications and services that are scalable, reliable, and performant. When he is not speaking at public events or hosting hackathons, Todd’s interests span technologies such as artificial intelligence, machine learning, and serverless event-driven architectures.

  • Sireesha Muppala, Solutions Architect, AWS

  • Liam Morrison, Principal Solutions Architect, AWS

  • Christopher King, Partner Solutions Architect, AWS

  • Shyam Srinivasan, Senior Product Marketing, AI & ML, AWS

    Shyam Srinivasan is a Senior Product Marketing Manager on the AWS AI/ML team, where he leads the strategy for raising awareness among and educating ML developers and data scientists, with a focus on Amazon SageMaker. Shyam constantly engages with customers about machine learning and loves to bring ideas to life with it. Outside of work, Shyam enjoys traveling and playing games with his family.

  • Emily Webber, ML Specialist Solutions Architect, AWS

    Emily Webber has been leading data science projects for many years, piloting applications of machine learning in areas as diverse as social media violence detection, economic policy evaluation, computer vision, reinforcement learning, IoT, and drone and robotic design. She holds a master’s degree from the University of Chicago, where she developed new applications of machine learning for public policy research with the Data Science for Social Good Fellowship. As a Machine Learning Specialist for Amazon Web Services, she guides customers from project ideation to full deployment.

  • Sunil Mallya, Senior AI Solutions Architect, AWS

    Sunil Mallya is a lead in the Machine Learning Solutions Lab at AWS, focused on deep learning and reinforcement learning. He works with AWS customers on transformation and innovation initiatives across verticals, building models for cutting-edge ML/DL/RL applications. Prior to joining AWS, Sunil co-founded Neon Labs, a neuroscience- and machine-learning-based image analysis and video thumbnail recommendation company. He has built large-scale, low-latency systems at Zynga and has an acute passion for serverless computing. He holds a master’s degree in computer science from Brown University.


Frequently Asked Questions

    • Q: Where is this event?
      This event is an online event, hosted by AWS on the INXPO platform.
    • Q: Who should attend this event?
      Developers building data-driven apps; DBAs and data engineers building analytics infrastructure and data pipelines; and analysts and data scientists deriving insights that answer complex business questions and building/training machine learning models.
    • Q: How much does this event cost?
      There is no cost to attend this event.
    • Q: What are the prerequisites before attending the event?
      There are no prerequisites for attending the event. We encourage attendees to browse the Database and Analytics pages on the AWS website to get a brief overview of the services available to them.

Customer Highlights




  • Epic Games’ entire analytics platform runs on AWS. Billions of game events, such as player interactions on the map, player accuracy, damage taken and dealt, and the resources they are using, are all sent to AWS.
  • Yelp stores approximately 18 months’ worth of advertising information in Amazon Redshift. Teams use the information to understand how ads are being delivered and to train models that will result in more relevant future ads.
  • Lyft uses AWS to move faster as a company and manage its exponential growth, leveraging AWS products such as Amazon DynamoDB, Amazon Kinesis, and Amazon Redshift to support more than 100 microservices that enhance every element of its customers’ experience.
  • Airbnb moved its MySQL database to Amazon Relational Database Service (Amazon RDS) because it simplifies much of the time-consuming administrative tasks typically associated with databases and allows difficult procedures, such as replication and scaling, to be completed with a basic API call or through the AWS Management Console.
  • SignalFx is a real-time monitoring platform for cloud infrastructure and applications. Built on streaming analytics, SignalFx alerts in seconds and accelerates resolution with directed troubleshooting.

Session Proficiency Levels Explained

Level 100 (Introductory): sessions will focus on providing an overview of AWS services and features, with the assumption that attendees are new to the topic.

Level 200 (Intermediate): sessions will focus on providing best practices, details of service features and demos with the assumption that attendees have introductory knowledge of the topics.

Level 300 (Advanced): sessions will dive deeper into the selected topic. Presenters assume that the audience has some familiarity with the topic, but may or may not have direct experience implementing a similar solution.

Level 400 (Expert): sessions are for attendees who are deeply familiar with the topic, have implemented a solution on their own already, and are comfortable with how the technology works across multiple services, architectures, and implementations.

Session Details

Track 1 Session 1

Migrate your Oracle and SQL Server databases to Amazon RDS (200)

Organizations today are looking to free themselves from the constraints of on-premises databases and leverage the power of fully managed databases in the cloud. Amazon RDS is a fully managed relational database service that you can use to run your choice of database engines including open source engines, Oracle, and SQL Server in the cloud. Amazon RDS automates time-consuming database administration tasks and adds capabilities such as replication and Multi-AZ failover to make your database deployments more scalable, available, reliable, manageable, and cost-effective. This session covers why you should consider moving your on-premises Oracle and SQL Server deployments to Amazon RDS and the tools to get started.
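
As a taste of what getting started can look like, here is a minimal sketch of provisioning a Multi-AZ SQL Server instance on Amazon RDS with the AWS SDK for Python (boto3); the identifiers, instance size, and credentials below are illustrative placeholders, not recommendations.

    # Minimal sketch: create a Multi-AZ SQL Server instance on Amazon RDS.
    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    response = rds.create_db_instance(
        DBInstanceIdentifier="myapp-sqlserver",   # hypothetical name
        Engine="sqlserver-se",                    # SQL Server Standard Edition
        DBInstanceClass="db.m5.xlarge",
        MasterUsername="admin",
        MasterUserPassword="REPLACE_ME",
        AllocatedStorage=100,                     # GiB
        MultiAZ=True,                             # synchronous standby for failover
        LicenseModel="license-included",
    )
    print(response["DBInstance"]["DBInstanceStatus"])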

Track 1 Session 2

Migrate your on-premises Data Warehouse to Amazon Redshift (200)

Most companies are overrun with data, yet they lack the critical insights to make timely and accurate business decisions due to the cost, complexity, and rigid architectures of traditional data warehouses. They are missing the opportunity to combine large amounts of new, unstructured big data that resides outside their data warehouse with trusted, structured data inside their data warehouse. Amazon Redshift offers 10x the performance at 1/10th the cost of traditional data warehouses, and extends queries to the data lake with no data movement needed. In this session, we discuss how moving to Amazon Redshift enables you to unlock better price-performance and scale, while automating your day-to-day administration tasks. We also show how Amazon Redshift natively integrates with your data lake and enables you to analyze open data formats with SQL without the need to load, transform, or move the data.
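
For illustration, a minimal sketch of one common migration step: bulk-loading staged data into Amazon Redshift with the COPY command over a standard PostgreSQL driver. The cluster endpoint, table, bucket, and IAM role are placeholders.

    # Minimal sketch: parallel bulk load into Redshift from Amazon S3.
    import psycopg2

    conn = psycopg2.connect(
        host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
        port=5439, dbname="dev", user="awsuser", password="REPLACE_ME",
    )
    with conn, conn.cursor() as cur:
        # COPY reads files in parallel directly from S3
        cur.execute("""
            COPY sales
            FROM 's3://my-migration-bucket/sales/'
            IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
            FORMAT AS PARQUET;
        """)
    conn.close()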

Track 1 Session 3

Why cloud databases like Amazon Aurora are more scalable and reliable (300)

Amazon Aurora is a fully managed MySQL- and PostgreSQL-compatible relational database with the speed, reliability, and availability of commercial databases at one-tenth the cost. It is up to five times faster than standard MySQL databases and three times faster than standard PostgreSQL databases. This session provides an overview of Aurora, explores Aurora features such as serverless, global databases, multi-master, replication, and Multi-AZ failover, and helps you get started.
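
As a minimal, illustrative boto3 sketch (all identifiers are placeholders), creating an Aurora MySQL-compatible cluster is a matter of defining the cluster and then adding instances to it, since Aurora separates shared storage from compute.

    # Minimal sketch: Aurora cluster plus one instance.
    import boto3

    rds = boto3.client("rds", region_name="us-east-1")

    rds.create_db_cluster(
        DBClusterIdentifier="myapp-aurora",
        Engine="aurora-mysql",
        MasterUsername="admin",
        MasterUserPassword="REPLACE_ME",
    )
    # Add writer/reader instances to the cluster as needed
    rds.create_db_instance(
        DBInstanceIdentifier="myapp-aurora-writer",
        DBClusterIdentifier="myapp-aurora",
        DBInstanceClass="db.r5.large",
        Engine="aurora-mysql",
    )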

Track 1 Session 4

Deploying open source databases on AWS (200)

Open source databases like MySQL, PostgreSQL, MongoDB, and Redis now rank among the world’s most popular databases. Fast-growing companies and large enterprises alike prefer open source databases due to their low cost, freedom from traditional license models, flexibility, community-backed development and support, and large ecosystems of tools and extensions. While open source databases are widely available, they can become difficult and time-consuming to manage in production environments. AWS Database Services including Amazon RDS (MySQL, PostgreSQL, MariaDB), Amazon ElastiCache (Redis, Memcached), and Amazon DocumentDB (with MongoDB compatibility) make it easy to manage open source database workloads in the cloud with performance, scalability, and availability.

Track 2 Session 1

Which database to choose: Pick the right purpose-built database for the job (200)

Developers building modern applications need purpose-built databases, so they have the freedom to choose the right database for the right job. AWS offers purpose-built relational, key-value, document, in-memory, graph, time series, and ledger databases so you can select the best database based on the application workload, not the other way around. Attend this session to learn how to pick the right database services to address specific application issues.

Track 2 Session 2

Building large scale data-driven apps with AWS databases (300)

Applications today require databases with unlimited scale that respond quicker than ever before. Ecommerce, social media applications, and connected devices need more than what traditional relational databases offer. Attempts to scale a relational database management system (RDBMS) involve upgrades to more powerful, and often proprietary, hardware. This work is also known as “vertical scaling,” and it usually brings the undesirable combination of rising costs, operational complexity, and performance bottlenecks. Come to this session to learn how AWS databases such as Amazon DynamoDB are built for the scale and performance needs of today’s applications, without the complexity of running massively scalable, distributed databases, allowing developers to build applications rather than manage infrastructure.
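
A minimal boto3 sketch of the key-value pattern described above, using a hypothetical Orders table created with on-demand capacity so throughput scales without capacity planning:

    # Minimal sketch: key-value access on Amazon DynamoDB.
    import boto3

    dynamodb = boto3.resource("dynamodb", region_name="us-east-1")

    table = dynamodb.create_table(
        TableName="Orders",
        KeySchema=[{"AttributeName": "order_id", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "order_id", "AttributeType": "S"}],
        BillingMode="PAY_PER_REQUEST",   # on-demand: no capacity to manage
    )
    table.wait_until_exists()

    table.put_item(Item={"order_id": "o-1001", "status": "shipped"})
    print(table.get_item(Key={"order_id": "o-1001"})["Item"])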

Track 2 Session 3

Extreme Performance at Cloud Scale: supercharge your real-time apps with Amazon ElastiCache (300)

Microseconds are the new milliseconds. Real-time applications such as caching, session stores, and other real-time processing need microsecond latency and high throughput to support millions of requests per second. Developers have traditionally relied on specialized hardware, and on workarounds such as disk-based databases combined with data reduction techniques, to manage data for real-time applications. These approaches can be expensive and hard to scale. Learn how you can boost the performance of real-time apps by using Amazon ElastiCache, the fully managed, in-memory AWS database service, for extreme performance, high scalability, availability, and security.
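
For illustration, a minimal sketch of the cache-aside pattern against an ElastiCache for Redis endpoint using the redis-py client; the endpoint and the database loader are placeholders.

    # Minimal sketch: cache-aside reads with ElastiCache for Redis.
    import redis

    cache = redis.Redis(host="my-cache.abc123.use1.cache.amazonaws.com", port=6379)

    def load_profile_from_db(user_id):
        # stand-in for a relational database lookup
        return f"profile-data-for-{user_id}".encode()

    def get_profile(user_id, ttl_seconds=300):
        key = f"profile:{user_id}"
        cached = cache.get(key)       # in-memory read, sub-millisecond
        if cached is not None:
            return cached
        profile = load_profile_from_db(user_id)
        cache.setex(key, ttl_seconds, profile)   # populate cache with a TTL
        return profile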

Track 2 Session 4

Databases for building business-critical enterprise apps (300)

Packaged and custom enterprise apps help organizations manage core business processes, such as sales, billing, customer service, and HR. For decades, developers have built enterprise apps with old-guard commercial databases, but these databases are expensive, proprietary, have high lock-in, impose punitive licensing terms, don’t scale, and are difficult to manage. AWS offers a better way forward with fully managed, cloud-native, modern database services that make it easier, faster, and more cost-effective for you to build enterprise apps. Attend this session to learn about AWS database services, including Amazon Aurora and Amazon RDS, and dive deep into service capabilities such as scalability, reliability, and performance to quickly build enterprise apps.

Track 3 Session 1

How to go from zero to data lake in days (200)

AWS provides the most comprehensive, secure, scalable, and cost-effective portfolio of services for building data lakes for analytics. In this session, you will learn how to discover, load, store, prepare, catalog, and secure your data in a data lake. Then, you will learn how to analyze that data with the largest choice of analytics approaches, including data warehousing, operational analytics, real-time streaming analytics, and even ML and AI. This will give you an overview of what AWS analytics can help you accomplish. Finally, you will hear how leading companies built successful and productive data lakes.
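
As one example of analyzing lake data in place, here is a minimal boto3 sketch of running a SQL query over an S3 data lake with Amazon Athena; the database, table, and results bucket are placeholders.

    # Minimal sketch: query data in S3 with Amazon Athena.
    import boto3

    athena = boto3.client("athena", region_name="us-east-1")

    run = athena.start_query_execution(
        QueryString="SELECT page, COUNT(*) AS hits FROM clicks GROUP BY page",
        QueryExecutionContext={"Database": "my_datalake"},   # Glue catalog database
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )
    # Poll get_query_execution / get_query_results with this ID to fetch rows
    print(run["QueryExecutionId"])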

Track 3 Session 2

Picking the right analytical engine for your needs (200)

AWS offers analytical engines for several use cases, including big data processing using Hadoop and Spark, data warehousing, ad-hoc analysis, real-time analytics, and operational/log analytics. In this session, you will learn which engines you can use to analyze the data stored in your Amazon S3 data lake. You will also learn how to use these engines together to generate new insights, complementing your data warehouse workloads with ad-hoc and real-time analytics engines to quickly incorporate new data into your reports.

Track 3 Session 3

Breaking the silos: Extending your Data Warehouse to your data lake (300)

Traditional data warehouses require data to be loaded before it can be analyzed. This creates silos between the data warehouse, where transformed and structured data is stored in a proprietary format that other analytical engines cannot access, and the data lake, where data is stored as it arrives. Amazon Redshift breaks through data silos by enabling you to query exabytes of data in open formats directly from your Amazon S3 data lake. In this session, you will learn how to use a data catalog such as AWS Glue to crawl the Amazon S3 data lake and create external tables, how to register those external tables in Amazon Redshift, and how to use Redshift to query data in your Amazon S3 data lake using those external tables, all without any data movement, duplication, or transformation.
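
A minimal sketch of that flow: exposing a Glue-cataloged database as an external schema in Amazon Redshift and joining lake data with warehouse data. The cluster endpoint, names, and IAM role are placeholders.

    # Minimal sketch: Redshift external schema over the Glue Data Catalog.
    import psycopg2

    conn = psycopg2.connect(
        host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
        port=5439, dbname="dev", user="awsuser", password="REPLACE_ME",
    )
    conn.autocommit = True
    cur = conn.cursor()

    # Register the crawled Glue database as schema "spectrum"
    cur.execute("""
        CREATE EXTERNAL SCHEMA IF NOT EXISTS spectrum
        FROM DATA CATALOG DATABASE 'my_datalake'
        IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftSpectrumRole';
    """)

    # Join open-format data in S3 with a local warehouse table, no loading
    cur.execute("""
        SELECT c.region, SUM(s.amount)
        FROM spectrum.clicks c
        JOIN sales s ON s.click_id = c.click_id
        GROUP BY c.region;
    """)
    print(cur.fetchall())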

Track 3 Session 4

Big data in the era of heavy privacy regulations (200)

The General Data Protection Regulation (GDPR) and California Consumer Privacy Act (CCPA) are new privacy regulations that have major implications for data management and data analytics. In this session, you will learn about key parts of these new rules that you should consider while designing or evolving your analytics platform. You will then learn about approaches that will help with compliance, including metadata classification, tagging, fine-grained access controls, comprehensive encryption, anonymization, and erasure of customer data.
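
For illustration, a minimal sketch of one technique named above: pseudonymizing a direct identifier before data lands in the lake. The field names and salt handling are placeholders; a real deployment would manage salts in a secrets store and pair this with access controls and encryption.

    # Minimal sketch: pseudonymize an identifier before ingestion.
    import hashlib

    SALT = b"rotate-and-store-me-securely"   # assumption: fetched from a vault

    def pseudonymize(email: str) -> str:
        return hashlib.sha256(SALT + email.lower().encode()).hexdigest()

    record = {"email": "jane@example.com", "page": "/checkout"}
    record["email"] = pseudonymize(record["email"])
    print(record)   # the raw identifier never reaches the data lake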

Track 4 Session 1

Amazon Redshift use cases and deployment patterns (400)

More than 10,000 customers use Amazon Redshift and collectively process over 2 exabytes of data per day. In this session, we will show some common deployment patterns for various use cases across different industries. We will also highlight best practices and offer tips to avoid common pitfalls, based on lessons learned from our customer engagements. You will walk away knowing how to get insights from all of your data in your Redshift data warehouse and Amazon S3 data lake at the best performance and lowest cost.

Track 4 Session 2

Processing Big Data with Hadoop, Spark, and other frameworks in Amazon EMR (300)

Amazon EMR provides a managed Hadoop framework that makes it easy, fast, and cost-effective to process vast amounts of data. You can also run other popular distributed frameworks such as Apache Spark, HBase, Presto, and Flink in EMR. EMR Notebooks, based on the popular Jupyter Notebook, provide a development and collaboration environment for ad hoc querying and exploratory analysis. In this session, you will learn how EMR securely and reliably handles a broad set of big data use cases, including log analysis, web indexing, data transformations (ETL), machine learning, financial analysis, scientific simulation, and bioinformatics.
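
A minimal sketch of the log-analysis use case in PySpark, as it might run from an EMR Notebook; the S3 path and field names are placeholders.

    # Minimal sketch: aggregate application errors with Spark on EMR.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("log-analysis").getOrCreate()

    logs = spark.read.json("s3://my-log-bucket/app-logs/")   # one JSON object per line
    errors_per_service = (
        logs.filter(F.col("level") == "ERROR")
            .groupBy("service")
            .count()
            .orderBy(F.col("count").desc())
    )
    errors_per_service.show()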

Track 4 Session 3

Scalable, secure log analytics with Amazon Elasticsearch Service (200)

You’ve got servers, you’ve got applications, and you’ve got microservices; that means you’ve got logs. They’re not the most exciting data your systems generate, but they’re often the most useful for real-time application monitoring, root-cause analysis, security analytics, and more. Customers like Autodesk, Nike, and Expedia use Amazon Elasticsearch Service to ingest, analyze, and search their log data at multi-petabyte scale. In this session, you will learn about Amazon Elasticsearch Service, how to get data into it, and how to use Kibana to visualize the insights from your log data.
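
For illustration, a minimal sketch of indexing a log event into an Amazon Elasticsearch Service domain over its REST API, after which it is searchable in Kibana. The domain endpoint is a placeholder, and request signing/authentication is omitted for brevity.

    # Minimal sketch: index one log document for search and Kibana dashboards.
    import requests

    endpoint = "https://search-my-domain.us-east-1.es.amazonaws.com"
    doc = {
        "timestamp": "2019-07-01T12:00:00Z",
        "service": "checkout",
        "level": "ERROR",
        "message": "payment gateway timeout",
    }
    resp = requests.post(f"{endpoint}/app-logs/_doc", json=doc)
    print(resp.json())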

Track 4 Session 4

High-performance data streaming and real-time analytics with AWS (300)

For many use cases, timing is critical and the value of data diminishes rapidly, which means every microsecond counts. Amazon Kinesis services and Amazon Managed Streaming for Apache Kafka (Amazon MSK) provide customers with fully managed streaming options, enabling data to be collected, stored, and processed as soon as it is created. In this session, you will learn how to solve data streaming use cases using AWS and how to decide which of the services are best suited to your needs.
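
A minimal boto3 sketch of the producer side of this pattern; the stream name and payload are illustrative. Consumers (for example, a Kinesis Data Analytics application or a Lambda function) can process records within moments of arrival.

    # Minimal sketch: put an event onto a Kinesis data stream.
    import json
    import boto3

    kinesis = boto3.client("kinesis", region_name="us-east-1")

    event = {"ride_id": "r-42", "lat": 47.61, "lon": -122.33, "ts": 1561970000}
    kinesis.put_record(
        StreamName="ride-telemetry",
        Data=json.dumps(event).encode(),
        PartitionKey=event["ride_id"],   # controls shard assignment and ordering
    )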