aws_de

aws_de_project repo

This course provides an overview of using various AWS services for data engineering. It is divided into 6 weeks, each covering a different aspect of data engineering on AWS. Each week consists of a series of content, labs and homework questions, and is expected to take 8-10 hours of effort.

The following prerequisites are required to complete the course:

  • An AWS account
  • Basic knowledge of Python
  • Basic knowledge of SQL
  • Basic knowledge of Terraform

Course Outline

Week 1: Data Ingestion

Objective: Understanding the fundamentals of data ingestion and how to use AWS S3 for storing and retrieving data.
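As a rough illustration of what storing and retrieving data in S3 can look like, here is a minimal sketch. It assumes `s3_client` is a boto3 S3 client (created with `boto3.client("s3")`); the helper names, the `raw/yellow` prefix and the bucket name in the usage comment are hypothetical and not part of the course material.

```python
from pathlib import Path


def s3_key_for(local_path, prefix="raw/yellow"):
    """Build an object key from a prefix and the local file name.

    The "raw/yellow" prefix is a hypothetical layout, not a course requirement.
    """
    return f"{prefix.strip('/')}/{Path(local_path).name}"


def upload_to_s3(s3_client, local_path, bucket, prefix="raw/yellow"):
    """Upload a local file to the bucket and return the object key used."""
    key = s3_key_for(local_path, prefix)
    # boto3 S3 client signature: upload_file(Filename, Bucket, Key)
    s3_client.upload_file(str(local_path), bucket, key)
    return key


def download_from_s3(s3_client, bucket, key, dest_dir="."):
    """Download an object into dest_dir and return the local path."""
    dest = Path(dest_dir) / Path(key).name
    # boto3 S3 client signature: download_file(Bucket, Key, Filename)
    s3_client.download_file(bucket, key, str(dest))
    return dest


# Usage (requires boto3 and AWS credentials; the bucket name is a placeholder):
#   import boto3
#   s3 = boto3.client("s3")
#   key = upload_to_s3(s3, "yellow_tripdata_2021-01.csv.gz", "my-example-bucket")
#   download_from_s3(s3, "my-example-bucket", key, dest_dir="downloads")
```

Passing the client in as an argument keeps the helpers easy to exercise with a stub, so they can be tried out without touching a real bucket.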

Week 2: Data Warehousing with Redshift and AWS Glue

Objective: Understanding the concept of data warehousing and how to use AWS Redshift for data analysis.

Week 3: Workflow Orchestration with Dagster

Objective: Understanding the need for orchestration in managing complex workflows and gaining hands-on experience with Dagster.

Week 4: Analytics Engineering with dbt and AWS ECS

Objective: Understanding the concept of Analytics Engineering and how to use dbt and AWS ECS for data transformation.

Week 5: Batch Processing with AWS EMR

Objective: Understanding the concept of batch processing and how to use AWS EMR for large-scale data processing.

Week 6: Stream Processing with Apache Kafka on AWS

Objective: Understanding the concept of stream processing and how to use Apache Kafka on AWS for real-time data processing.

I created this course because I enjoyed the DataTalks.Club Data Engineering Zoomcamp but could not find an equivalent course focused primarily on AWS services.

Students will use the NYC Taxi Trip dataset for the labs and homework questions. The dataset can be downloaded from the link provided in the DataTalksClub zoomcamp itself or from the link below:

https://github.com/DataTalksClub/nyc-tlc-data/releases/tag/yellow
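For scripted downloads, the following is a minimal sketch that builds the URL of one monthly file and fetches it. It assumes the release assets follow the pattern `yellow_tripdata_YYYY-MM.csv.gz` under the `releases/download/yellow` path; check the file names on the release page above before relying on it. The function names are hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path

# Assumed download base for the "yellow" release tag; verify on the release page.
BASE_URL = "https://github.com/DataTalksClub/nyc-tlc-data/releases/download/yellow"


def yellow_taxi_url(year, month):
    """Return the assumed asset URL for one month of yellow taxi trips."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    return f"{BASE_URL}/yellow_tripdata_{year:04d}-{month:02d}.csv.gz"


def download_month(year, month, dest_dir="data"):
    """Download one monthly file into dest_dir and return its local path."""
    url = yellow_taxi_url(year, month)
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response to disk so the file is not held in memory.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Usage (performs a network request):
#   path = download_month(2021, 1)
```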