EMR Notebooks

Amazon EMR Notebooks is a managed environment based on Jupyter and JupyterLab notebooks that lets users interactively analyze and visualize data, collaborate with peers, and build applications using EMR clusters. EMR Notebooks is designed for Apache Spark. It supports Sparkmagic kernels, which allow you to remotely run queries and code on your EMR cluster using languages like PySpark, Spark SQL, SparkR, and Scala.

With EMR Notebooks, there is no software to install and there are no instances to manage. You can either attach the notebook to an existing cluster or provision a new cluster directly from the console. You can attach multiple notebooks to a single cluster, and you can detach notebooks and re-attach them to new clusters.

EMR Notebooks allows you to:

Monitor and debug Spark jobs directly from your notebook.
Install notebook-scoped libraries on a running EMR cluster.
Associate Git repositories with your notebook for version control and to simplify code collaboration and reuse.
Compare and merge two notebooks using the nbdime utility.

There is no additional cost for using EMR Notebooks. You only pay for the EMR cluster attached to the notebook. It’s easy to create multiple notebooks directly from the EMR console. Follow this step-by-step tutorial to get started.

“By leveraging Redshift Spectrum's ability to query data directly into our Amazon S3 data lake, we have been able to easily integrate new data sources in hours, not days or weeks. This has not only reduced our time to insight, but helped us control our infrastructure costs.”

Elliott Cordo

VP of Data Analytics, Equinox Fitness

Resources

Blog: EMR Notebooks: A managed analytics environment based on Jupyter notebooks.
Tutorial: Associate Git repositories with EMR Notebooks.
Blog: Install Python libraries on a running cluster with EMR Notebooks.
