Building Modern Data Analytics Solutions on AWS

The Building Modern Data Analytics Solutions on AWS collection of one-day, intermediate-level, instructor-led courses dives deep into AWS Lake Formation, AWS Glue, Amazon EMR, Amazon Kinesis, and Amazon Redshift, and into current thinking on building and operating data analytics pipelines that turn data into insights.

Wherever you or your customers are in the data modernization journey, our Building Modern Data Analytics Solutions on AWS collection of courses lets you select the right training to meet your specific learning needs.

Duration
4 days/28 hours of instruction

Public Classroom Pricing
$2,700 (USD)
Group Rate: $2,600

Private Group Pricing

Have a group of 5 or more students? Request special pricing for private group training today.

Building Data Lakes on AWS

Part 1: Introduction to Data Lakes

  • Describe the value of data lakes
  • Compare data lakes and data warehouses
  • Describe the components of a data lake
  • Recognize common architectures built on data lakes

Part 2: Data Ingestion, Cataloging, and Preparation

  • Describe the relationship between data lake storage and data ingestion
  • Describe AWS Glue crawlers and how they are used to create a data catalog
  • Identify data formatting, partitioning, and compression for efficient storage and query
  • Lab 1: Set up a simple data lake

Part 3: Data Processing and Analytics

  • Recognize how data processing applies to a data lake
  • Use AWS Glue to process data within a data lake
  • Describe how to use Amazon Athena to analyze data in a data lake

Part 4: Building a Data Lake with AWS Lake Formation

  • Describe the features and benefits of AWS Lake Formation
  • Use AWS Lake Formation to create a data lake
  • Understand the AWS Lake Formation security model
  • Lab 2: Build a data lake using AWS Lake Formation

Part 5: Additional Lake Formation Configurations

  • Automate AWS Lake Formation using blueprints and workflows
  • Apply security and access controls to AWS Lake Formation
  • Match records with AWS Lake Formation FindMatches
  • Visualize data with Amazon QuickSight
  • Lab 3: Automate data lake creation using AWS Lake Formation blueprints
  • Lab 4: Data visualization using Amazon QuickSight

Part 6: Architecture and Course Review

  • Post-course knowledge check
  • Architecture review
  • Course review

 

Building Batch Data Analytics Solutions on AWS

Part 1: Overview of Data Analytics and the Data Pipeline

  • Data analytics use cases
  • Using the data pipeline for analytics

Part 2: Introduction to Amazon EMR

  • Using Amazon EMR in analytics solutions
  • Amazon EMR cluster architecture
  • Interactive Demo 1: Launching an Amazon EMR cluster
  • Cost management strategies

Part 3: Data Analytics Pipeline Using Amazon EMR: Ingestion and Storage

  • Storage optimization with Amazon EMR
  • Data ingestion techniques

Part 4: High-Performance Batch Data Analytics Using Apache Spark on Amazon EMR

  • Apache Spark on Amazon EMR use cases
  • Why Apache Spark on Amazon EMR
  • Spark concepts
  • Interactive Demo 2: Connect to an EMR cluster and run Scala commands using the Spark shell
  • Transformation, processing, and analytics
  • Using notebooks with Amazon EMR
  • Practice Lab 1: Low-latency data analytics using Apache Spark on Amazon EMR

Part 5: Processing and Analyzing Batch Data with Amazon EMR and Apache Hive

  • Using Amazon EMR with Hive to process batch data
  • Transformation, processing, and analytics
  • Practice Lab 2: Batch data processing using Amazon EMR with Hive
  • Introduction to Apache HBase on Amazon EMR

Part 6: Serverless Data Processing

  • Serverless data processing, transformation, and analytics
  • Using AWS Glue with Amazon EMR workloads
  • Practice Lab 3: Orchestrate data processing in Spark using AWS Step Functions

Part 7: Security and Monitoring of Amazon EMR Clusters

  • Securing EMR clusters
  • Interactive Demo 3: Client-side encryption with EMRFS
  • Monitoring and troubleshooting Amazon EMR clusters
  • Demo: Reviewing Apache Spark cluster history

Part 8: Designing Batch Data Analytics Solutions

  • Batch data analytics use cases

 

Building Streaming Data Analytics Solutions on AWS

Part 1: Overview of Data Analytics and the Data Pipeline

  • Data analytics use cases
  • Using the data pipeline for analytics

Part 2: Using Streaming Services in the Data Analytics Pipeline

  • The importance of streaming data analytics
  • The streaming data analytics pipeline
  • Streaming concepts

Part 3: Introduction to AWS Streaming Services

  • Streaming data services in AWS
  • Amazon Kinesis in analytics solutions
  • Demonstration: Explore Amazon Kinesis Data Streams
  • Practice Lab: Setting up a streaming delivery pipeline with Amazon Kinesis
  • Using Amazon Kinesis Data Analytics
  • Introduction to Amazon MSK
  • Overview of Spark Streaming

Part 4: Using Amazon Kinesis for Real-time Data Analytics

  • Exploring Amazon Kinesis using a clickstream workload
  • Creating Kinesis data and delivery streams
  • Demonstration: Understanding producers and consumers
  • Building stream producers
  • Building stream consumers
  • Building and deploying Flink applications in Kinesis Data Analytics
  • Demonstration: Explore Zeppelin notebooks for Kinesis Data Analytics
  • Practice Lab: Streaming analytics with Amazon Kinesis Data Analytics and Apache Flink

Part 5: Securing, Monitoring, and Optimizing Amazon Kinesis

  • Optimize Amazon Kinesis to gain actionable business insights
  • Security and monitoring best practices

Part 6: Using Amazon MSK in Streaming Data Analytics Solutions

  • Use cases for Amazon MSK
  • Creating MSK clusters
  • Demonstration: Provisioning an MSK Cluster
  • Ingesting data into Amazon MSK
  • Practice Lab: Introduction to access control with Amazon MSK
  • Transforming and processing in Amazon MSK

Part 7: Securing, Monitoring, and Optimizing Amazon MSK

  • Optimizing Amazon MSK
  • Demonstration: Scaling up Amazon MSK storage
  • Practice Lab: Amazon MSK streaming pipeline and application deployment
  • Security and monitoring
  • Demonstration: Monitoring an MSK cluster

Part 8: Designing Streaming Data Analytics Solutions

  • Use case review
  • Class Exercise: Designing a streaming data analytics workflow

Part 9: Developing Modern Data Architectures on AWS

  • Modern data architectures

 

Building Data Analytics Solutions Using Amazon Redshift

Part 1: Overview of Data Analytics and the Data Pipeline

  • Data analytics use cases
  • Using the data pipeline for analytics

Part 2: Using Amazon Redshift in the Data Analytics Pipeline

  • Why Amazon Redshift for data warehousing?
  • Overview of Amazon Redshift

Part 3: Introduction to Amazon Redshift

  • Amazon Redshift architecture
  • Interactive Demo 1: Touring the Amazon Redshift console
  • Amazon Redshift features
  • Practice Lab 1: Load and query data in an Amazon Redshift cluster

Part 4: Ingestion and Storage

  • Ingestion
  • Interactive Demo 2: Connecting your Amazon Redshift cluster using a Jupyter notebook with Data API
  • Data distribution and storage
  • Interactive Demo 3: Analyzing semi-structured data using the SUPER data type
  • Querying data in Amazon Redshift
  • Practice Lab 2: Data analytics using Amazon Redshift Spectrum

Part 5: Processing and Optimizing Data

  • Data transformation
  • Advanced querying
  • Practice Lab 3: Data transformation and querying in Amazon Redshift
  • Resource management
  • Interactive Demo 4: Applying mixed workload management on Amazon Redshift
  • Automation and optimization
  • Interactive Demo 5: Resizing an Amazon Redshift cluster from dc2.large to ra3.xlplus

Part 6: Security and Monitoring of Amazon Redshift Clusters

  • Securing the Amazon Redshift cluster
  • Monitoring and troubleshooting Amazon Redshift clusters

Part 7: Designing Data Warehouse Analytics Solutions

  • Data warehouse use case review
  • Activity: Designing a data warehouse analytics workflow

Part 8: Developing Modern Data Architectures on AWS

  • Modern data architectures

Professionals who would benefit from this training include:

  • Data warehouse engineers
  • Data platform engineers
  • Solutions architects

In this training, you will learn:

  • How to use AWS data services to store, process, analyze, stream, and query data to make decisions with speed and agility at scale
  • How to modernize data solutions end to end
  • Skills to put your data to work to make better, more informed decisions, respond faster to the unexpected, and uncover new opportunities

A full refund will be issued for class cancellations made at least 15 business days before the course begins. Payment is non-refundable for cancellations or reschedules made within 15 business days of the course start date and for no-shows (students who do not attend class).
For reschedules made within 15 business days of the course start date, students must reschedule immediately for a current, published course, up to a maximum of 90 days from the original date.
A student may reschedule a class or exam up to 2 times. No additional reschedules will be allowed.

 

Building Modern Data Analytics Solutions on AWS Schedule

Delivery     Date                       Time                     Price          Status
Live Online  Nov 14th - 17th, 2023      9:00 AM - 5:00 PM ET     $2,700 (USD)
Live Online  Dec 12th - 15th, 2023      12:00 PM - 8:00 PM ET    $2,700 (USD)
Live Online  Jan 10th - 12th, 2024      9:00 AM - 5:00 PM ET     $2,700 (USD)
Live Online  Jan 31st - Feb 2nd, 2024   9:00 AM - 5:00 PM ET     $2,700 (USD)   Guaranteed to Run
Live Online  Feb 14th - 14th, 2024      9:00 AM - 5:00 PM ET     $2,700 (USD)
