Apache Hudi on AWS


AWS Technical Guide

Run Apache Hudi at scale on AWS


Apache Hudi is an open-source transactional data lake framework that simplifies incremental data processing and data pipeline development. AWS offers native support for Apache Hudi, so you can build transactional data lakes on Amazon Simple Storage Service (Amazon S3).

This technical guide explains how to get started with Apache Hudi on different AWS services and offers best practices and recommendations for running Apache Hudi on AWS at scale while optimizing cost and performance.

This guide covers:


  • An overview of Apache Hudi
  • How to get started with Apache Hudi on different AWS services
  • How to query Apache Hudi tables with AWS
  • Best practices for optimizing and tuning Apache Hudi on AWS at scale
