AWS Lake Formation
Try our preview capabilities: Transactions, Row-level Security, and Acceleration
Sign up to preview three new capabilities in AWS Lake Formation: transactions for concurrent updates and consistent query results, row-level security policies for granular access control, and accelerated access through inline filtering, aggregations, and automatic file compaction.
Transactions - Insert, delete, and modify rows concurrently
Data lakes need to show users the correct view of data at all times, even while the data receives simultaneous real-time or frequent updates. A common pattern in data lakes is to organize data into tables composed of rows that can hold structured or semi-structured data. To load streaming data or quickly incorporate changes from source data systems, you need to insert, delete, and modify rows across multiple tables in parallel. Today, developers write custom application code or use open source tools to manage these updates. These solutions are complex and difficult to scale because application code that maintains consistency while concurrently reading and writing the same data is tedious to write, brittle, and error-prone.
AWS Lake Formation introduces new APIs that support atomic, consistent, isolated, and durable (ACID) transactions using a new data lake table type, called a ‘governed table.’ A governed table allows multiple users to concurrently insert, delete, and modify rows across tables, while other users simultaneously run analytical queries and machine learning (ML) models against the same data sets and still get consistent, up-to-date results. The ability to update and delete individual rows in governed tables, such as the record of a customer who has asked to be forgotten, helps users comply with “right to be forgotten” provisions in privacy laws like GDPR and CCPA.
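To make the transaction behavior concrete, here is a minimal in-memory sketch in Python. It is not the Lake Formation API: the names ToyGovernedTable and ToyTransaction and the sample customer rows are hypothetical, and the model covers only atomic commits and isolated reads, with no durability or conflict detection. It shows a reader seeing the same consistent view until a writer's changes are committed all at once.

```python
import threading
from typing import Dict, Optional


class ToyGovernedTable:
    """Hypothetical in-memory stand-in for a table with atomic commits."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._committed: Dict[int, dict] = {}

    def snapshot(self) -> Dict[int, dict]:
        # Readers get a copy of the committed rows only.
        with self._lock:
            return {key: dict(row) for key, row in self._committed.items()}

    def begin(self) -> "ToyTransaction":
        return ToyTransaction(self)


class ToyTransaction:
    """Buffers writes privately and applies them in one step on commit."""

    def __init__(self, table: ToyGovernedTable) -> None:
        self._table = table
        self._writes: Dict[int, Optional[dict]] = {}  # None marks a delete

    def upsert(self, key: int, row: dict) -> None:
        self._writes[key] = dict(row)

    def delete(self, key: int) -> None:
        self._writes[key] = None

    def commit(self) -> None:
        # Holding the lock makes every buffered write visible together.
        with self._table._lock:
            for key, row in self._writes.items():
                if row is None:
                    self._table._committed.pop(key, None)
                else:
                    self._table._committed[key] = row
        self._writes = {}


customers = ToyGovernedTable()
setup = customers.begin()
setup.upsert(1, {"name": "Ana"})
setup.upsert(2, {"name": "Ben"})
setup.commit()

before = customers.snapshot()
txn = customers.begin()
txn.delete(2)                    # for example, a "right to be forgotten" request
txn.upsert(3, {"name": "Cy"})
during = customers.snapshot()    # uncommitted changes are not visible
txn.commit()
after = customers.snapshot()     # both changes appear together
```

In this sketch a query that runs between begin and commit sees neither the delete nor the insert, and a query that runs after commit sees both. A real implementation would also need to handle conflicting writers and persist the data.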
Row-level security
Making sure users have access to only the right data in a data lake is difficult. Some users need access to all data within a dataset, while other users are restricted from seeing columns of sensitive information like social security numbers or rows of data like sales records from other regions. Data lake administrators often maintain multiple copies of data to apply different security policies for different users. This adds complexity, operational overhead, and extra storage costs.
AWS Lake Formation already allows you to set access policies to hide data, such as hiding a column with social security numbers, from users who do not have permission to view that data. With row-level security, you can now set row-level policies in addition to column-level policies. For example, you can now set a policy that gives a regional sales manager access to only the sales data for their region.
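As a rough illustration of how a row-level policy combines with a column-level policy, consider the following Python sketch. It is a simplified model, not Lake Formation's policy syntax or API: the AccessPolicy class, the apply_policy function, and the sample sales rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

Row = Dict[str, object]


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical policy: which columns are visible and which rows match."""

    visible_columns: FrozenSet[str]
    row_filter: Callable[[Row], bool]


def apply_policy(rows: List[Row], policy: AccessPolicy) -> List[Row]:
    # Drop rows the policy excludes, then drop hidden columns from the rest.
    return [
        {col: val for col, val in row.items() if col in policy.visible_columns}
        for row in rows
        if policy.row_filter(row)
    ]


sales = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0, "ssn": "000-00-0001"},
    {"order_id": 2, "region": "APAC", "amount": 75.5, "ssn": "000-00-0002"},
    {"order_id": 3, "region": "EMEA", "amount": 310.0, "ssn": "000-00-0003"},
]

# A regional sales manager sees only EMEA rows and never the ssn column.
emea_manager = AccessPolicy(
    visible_columns=frozenset({"order_id", "region", "amount"}),
    row_filter=lambda row: row["region"] == "EMEA",
)

visible = apply_policy(sales, emea_manager)
```

Because the policy is applied when the data is read, a single copy of the sales table can serve every regional manager, which avoids maintaining a separate copy of the data for each security policy.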
Acceleration - Better performance with filtering, aggregations, and automatic file compaction
Preview support
Please submit the information below to request an invitation to the preview. We will contact you with instructions if you are approved.