Big Data & Streaming Data Web Day




 

About the Event

Amazon Web Services (AWS) delivers an integrated suite of services that provide everything you need to capture and process streaming data, and to build a data lake for analytics. Join our Big Data and Streaming Data Web Day to learn more about big data and streaming data solutions, and how to leverage new AWS features for these activities.


Who Should Attend

This Web Day is designed for Developers, DevOps, and Cloud Architects, as well as anyone who is interested in deep technical content on big data and streaming data.


Streaming Data Webinars


September 4, 2019 | 14:00 - 15:00 CEST
Apache Kafka on AWS: Amazon MSK
Apache Kafka is one of the most popular open-source projects for building messaging and streaming applications. Kafka takes data from various sources, makes it available to different applications, and therefore helps to eliminate daily batch jobs. Kafka also plays an important role for Change Data Capture (CDC) and in the world of microservices. This presentation gives an overview of the new Amazon Managed Streaming for Apache Kafka: Amazon MSK.
Based on knowledge gained from several Kafka implementation projects, I will explain some of the technical underpinnings first. You will learn about brokers, topics, and ZooKeeper. Then I will explain what makes Kafka special, analyse major pain points in on-prem Apache Kafka projects, and conclude why the cloud is the best way to use Kafka.
We will show many of the new features of Amazon MSK, such as provisioning Kafka clusters, the Kafka CLI, CloudFormation, extending storage, integration of broker and topic monitoring with CloudWatch, and many more.
Speaker: Frank Munz, Technical Evangelist, AWS
Level: 300 (Advanced)


September 4, 2019 | 15:00 - 16:00 CEST
Build and Run Streaming Applications with Apache Flink and Amazon Kinesis Data Analytics for Java Applications
Stream processing facilitates the collection, processing, and analysis of real-time data and enables the continuous generation of insights and quick reactions to emerging situations. Yet, despite these advantages compared to traditional batch-oriented analytics applications, streaming applications are much more challenging to operate. Some of these challenges include the ability to provide and maintain low end-to-end latency, to seamlessly recover from failure, and to deal with a varying amount of throughput.
We all know and love Flink for taking on those challenges with grace. In this session, we explore an end-to-end example that shows how you can use Apache Flink and Amazon Kinesis Data Analytics for Java Applications to build reliable, scalable, and highly available streaming applications. We discuss how you can leverage managed services to quickly build Flink-based streaming applications and show how managed services can substantially reduce the operational overhead that is required to run the application.
You will not only see how to build streaming applications with Apache Flink on AWS, but also learn how managed services can reduce the overhead that is usually required to build and operate streaming applications to a bare minimum.
Speaker: Dr. Steffen Hausmann, Solutions Architect, AWS
Level: 400 (Expert)


Big Data Webinars


September 5, 2019 | 13:30 - 14:30 CEST
Modern Data Platform - Rethinking Data
A Modern Data Platform combines Traditional Business Intelligence, Big Data, and Machine Learning. It includes structured and unstructured data from internal and external data sources, in both batch and near-real-time fashion.
We will see how to blend these different facets together to create actionable insights that you can provide to your users and use to make downstream systems smarter. Data is treated and stored in a uniform fashion, which reduces the friction of accessing and integrating it and, ultimately, of extracting its value. Data governance, retention, and modelling are applied end-to-end at the level appropriate to the data's purpose. Data from the data sources is captured raw, as-is, preserving all information, but is provided to consumers in the granularity and form they need. We will see machine learning applied to source data that was inaccessible before. Data, derived insights, and their implementation are differentiated by how they are used. Distributing data and providing reports or insights through APIs pose significantly different requirements for a data platform. Using fit-for-purpose solutions makes it easier to implement them.
We will provide a conceptual understanding of how data flows through the platform and how to produce actionable insights to be used at the point of impact. Then we will peel back the covers and have a peek at exemplary implementations using services on AWS. We will also cover what principles are at play and why this is easier than expected.
To make it your own, we will provide a bottom-up approach and a top-down approach to get started.
Speaker: Mariano Kamp, Principal Solutions Architect, AWS
Level: 200 (Intermediate)


September 5, 2019 | 14:30 - 15:30 CEST
Query Your Data in S3 with SQL and Optimize for Cost and Performance
Streaming services allow you to ingest and analyze events continuously in real time. One principle of big data is to store raw data for as long as possible so that future questions can be answered. If the data is permanently stored in Amazon Simple Storage Service (S3), it can be queried at any time with Amazon Athena without spinning up a database. This session shows step by step how the data should be structured so that both costs and response times are reduced when using Athena. The details and effects of compression, partitions, and columnar storage formats are compared. Finally, the CREATE TABLE AS SELECT (CTAS) feature of Amazon Athena is used to derive optimized tables from the raw data for frequently issued queries.
Speaker: Steffen Grunwald, Solutions Architect, AWS
Level: 400 (Expert)
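
As a rough illustration of the ideas named in this abstract (partitions, compression, columnar formats, and CTAS), the following minimal Python sketch builds a Hive-style partitioned S3 prefix and an Athena CTAS statement that writes partitioned, Snappy-compressed Parquet. It is not taken from the session itself: the helper names, bucket, table, and column names are hypothetical, and the choice of Parquet with Snappy is an assumption made for the example.

```python
from datetime import datetime, timezone
from typing import Sequence


def partition_prefix(base: str, event_time: datetime) -> str:
    """Return a Hive-style S3 prefix (key=value path segments) for one day."""
    return (
        f"{base.rstrip('/')}/"
        f"year={event_time:%Y}/month={event_time:%m}/day={event_time:%d}/"
    )


def build_ctas(target: str, source: str, location: str,
               columns: Sequence[str], partition_cols: Sequence[str]) -> str:
    """Build an Athena CTAS statement for partitioned, compressed Parquet.

    Athena expects the partition columns to come last in the SELECT list,
    so they are appended after the regular columns.
    """
    select_list = ", ".join(list(columns) + list(partition_cols))
    partitions = ", ".join(f"'{c}'" for c in partition_cols)
    return (
        f"CREATE TABLE {target}\n"
        f"WITH (\n"
        f"  format = 'PARQUET',\n"
        f"  parquet_compression = 'SNAPPY',\n"
        f"  external_location = '{location}',\n"
        f"  partitioned_by = ARRAY[{partitions}]\n"
        f")\n"
        f"AS SELECT {select_list}\n"
        f"FROM {source}"
    )


# Hypothetical names, used only for illustration.
prefix = partition_prefix(
    "s3://example-bucket/events/",
    datetime(2019, 9, 4, tzinfo=timezone.utc),
)
sql = build_ctas(
    target="events_parquet",
    source="events_raw",
    location="s3://example-bucket/events-parquet/",
    columns=["user_id", "event_type"],
    partition_cols=["year", "month", "day"],
)
print(prefix)
print(sql)
```

Partitioning lets Athena skip prefixes that a query's filter excludes, while a compressed columnar format reduces the bytes scanned per query, which is what Athena bills for.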


September 5, 2019 | 15:30 - 16:30 CEST
Visualize Data Stored in Data Lakes
Storing data in S3 data lakes opens the door to enormous opportunities, including analytics and AI/ML. Many AWS customers are using S3 data lakes as an enabler for strategic data-driven decisions and business innovation. Data visualization is an important aspect of data-driven decisions and helps to create more value out of data stored in S3 data lakes.
This session focuses on the data visualization aspect of S3 data lakes. Amazon QuickSight is a fast, cloud-powered Business Intelligence service for creating and publishing interactive dashboards. Its built-in integration with many AWS services, including S3, Athena, Glue, Redshift, and EMR, makes it a tool of choice for many AWS customers. This session will guide viewers through using various data sources and data processing engines with QuickSight to get more insights from data.
Speaker: Naila Karim, Solutions Architect, AWS
Level: 300 (Advanced)


The technical depth of AWS sessions ranges from 100 (introductory) to 400 (expert). Take a look at the overview here for more guidance.



Event dates: September 4-5, 2019


