Data Engineering on AWS is an AWS Training course for data engineers who design and build data pipelines and analytics infrastructure on AWS. Students learn to architect and implement ingestion workflows for both batch and streaming data, transform data using AWS Glue and Amazon EMR, orchestrate pipelines with AWS Step Functions, and deliver analytics-ready data to Amazon Redshift, Amazon S3 data lakes, and other downstream consumers.
What You Will Learn
- Design and implement batch and streaming data ingestion pipelines using AWS Glue, Amazon Kinesis, and AWS Database Migration Service (AWS DMS)
- Transform and process large datasets using AWS Glue ETL jobs and Apache Spark on Amazon EMR
- Orchestrate complex multi-step data pipelines using AWS Step Functions and Amazon Managed Workflows for Apache Airflow (Amazon MWAA)
- Load processed data into Amazon Redshift and optimize queries with distribution keys and sort keys
- Apply data quality, cataloguing, and governance patterns using AWS Glue Data Catalog and AWS Lake Formation
Who Should Attend
Data engineers, ETL developers, and cloud architects building enterprise-scale data pipelines and analytics infrastructure on AWS.
Prerequisites
SQL proficiency, Python scripting experience, and working knowledge of AWS core services. Prior data engineering experience is expected.



