Building Data Lakes on AWS is an AWS Training course for data engineers and architects who design and implement cloud data lakes. Students learn data lake principles and architecture patterns, as well as how to ingest structured and unstructured data into Amazon S3, use AWS Glue for ETL and cataloguing, apply AWS Lake Formation for centralized governance and fine-grained access control, and query data lake content using Amazon Athena and Amazon Redshift Spectrum.
What You Will Learn
- Design a cloud data lake architecture on AWS using the modern data lake reference architecture
- Ingest batch and streaming data into Amazon S3 using AWS services including Amazon Kinesis and AWS Glue
- Use AWS Glue to crawl, catalogue, and run ETL jobs that transform raw data into queryable formats
- Implement centralized data governance and column-level security using AWS Lake Formation
- Query data lake content using Amazon Athena with partitioning and columnar formats for performance
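
As an illustration of the partitioning mentioned in the last objective, the following is a minimal sketch of building Hive-style partitioned object keys (year=/month=/day=), a layout that Athena can use to limit a query to the matching prefixes. The function name, the "sales" prefix, and the file name are hypothetical and are not taken from the course material.

```python
from datetime import date


def partition_key(prefix: str, day: date, filename: str) -> str:
    """Build a Hive-style partitioned object key (year=/month=/day=)."""
    return (
        f"{prefix}/year={day.year:04d}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


# Hypothetical example values, for illustration only.
key = partition_key("sales", date(2024, 3, 7), "part-0000.parquet")
print(key)  # sales/year=2024/month=03/day=07/part-0000.parquet
```

A query that filters on year, month, and day would then only need to read objects under the matching prefixes, which is where the performance benefit of partitioning comes from.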
Who Should Attend
Data engineers, cloud architects, and analytics professionals building enterprise data lakes on Amazon Web Services.
Prerequisites
Familiarity with core AWS services (S3, IAM, EC2) and basic data engineering concepts. Knowledge at the level of AWS Certified Cloud Practitioner or AWS Certified Solutions Architect – Associate is recommended.




