Showing 1 changed file with 181 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,181 @@ | ||
Have you found any cool resources about data engineering? Put them here. | ||
|
||
## Learning Data Engineering | ||
|
||
### Courses | ||
|
||
* [Data Engineering Zoomcamp](https://github.com/DataTalksClub/data-engineering-zoomcamp) by DataTalks.Club (free) | ||
* [Big Data Platforms, Autumn 2022: Introduction to Big Data Processing Frameworks](https://big-data-platforms-22.mooc.fi/) by the University of Helsinki (free) | ||
* [Awesome Data Engineering Learning Path](https://awesomedataengineering.com/) | ||
|
||
|
||
### Books | ||
|
||
* [Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann](https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321) | ||
* [Big Data: Principles and Best Practices of Scalable Realtime Data Systems by Nathan Marz, James Warren](https://www.amazon.com/Big-Data-Principles-practices-scalable/dp/1617290343) | ||
* [Practical DataOps: Delivering Agile Data Science at Scale by Harvinder Atwal](https://www.amazon.com/Practical-DataOps-Delivering-Agile-Science/dp/1484251032) | ||
* [Data Pipelines Pocket Reference: Moving and Processing Data for Analytics by James Densmore](https://www.amazon.com/Data-Pipelines-Pocket-Reference-Processing/dp/1492087831) | ||
* [Best books for data engineering](https://awesomedataengineering.com/data_engineering_best_books) | ||
* [Fundamentals of Data Engineering: Plan and Build Robust Data Systems by Joe Reis, Matt Housley](https://www.amazon.com/Fundamentals-Data-Engineering-Robust-Systems/dp/1098108302) | ||
|
||
|
||
### Introduction to Data Engineering Terms | ||
|
||
* [https://datatalks.club/podcast/s05e02-data-engineering-acronyms.html](https://datatalks.club/podcast/s05e02-data-engineering-acronyms.html) | ||
|
||
|
||
### Data engineering in practice | ||
|
||
Conference talks from companies, blog posts, etc. | ||
|
||
* [Uber Data Archives](https://eng.uber.com/category/articles/uberdata/) (Uber engineering blog) | ||
* [Data Engineering Weekly (DE-focused substack)](https://www.dataengineeringweekly.com/) | ||
* [Seattle Data Guy (DE-focused substack)](https://seattledataguy.substack.com/) | ||
|
||
|
||
## Doing Data Engineering | ||
|
||
### Coding & Python | ||
|
||
* [CS50's Introduction to Computer Science | edX](https://www.edx.org/course/introduction-computer-science-harvardx-cs50x) (course) | ||
* [Python for Everybody Specialization](https://www.coursera.org/specializations/python) (course) | ||
* [Practical Python programming](https://github.com/dabeaz-course/practical-python/blob/master/Notes/Contents.md) | ||
|
||
|
||
### SQL | ||
|
||
* [Intro to SQL: Querying and managing data | Khan Academy](https://www.khanacademy.org/computing/computer-programming/sql) | ||
* [Mode SQL Tutorial](https://mode.com/sql-tutorial/) | ||
* [Use The Index, Luke](https://use-the-index-luke.com/) (SQL Indexing and Tuning e-Book) | ||
* [SQL Performance Explained](https://sql-performance-explained.com/) (book) | ||
|
||
|
||
### Workflow orchestration | ||
|
||
* [What is DAG?](https://youtu.be/1Yh5S-S6wsI) (video) | ||
* [Airflow, Prefect, and Dagster: An Inside Look](https://towardsdatascience.com/airflow-prefect-and-dagster-an-inside-look-6074781c9b77) (blog post) | ||
* [Open-Source Spotlight - Prefect - Kevin Kho](https://www.youtube.com/watch?v=ISLV9JyqF1w) (video) | ||
* [Prefect as a Data Engineering Project Workflow Tool, with Mary Clair Thompson (Duke) - 11/6/2020](https://youtu.be/HuwA4wLQtCM) (video) | ||
|
||
|
||
### ETL and ELT | ||
|
||
* [ETL vs. ELT: What’s the Difference?](https://rivery.io/blog/etl-vs-elt/) (blog post) (print version) | ||
|
||
### Data lakes | ||
|
||
* [An Introduction to Modern Data Lake Storage Layers (Hudi, Iceberg, Delta Lake)](https://dacort.dev/posts/modern-data-lake-storage-layers/) (blog post) | ||
* [Lake House Architecture @ Halodoc: Data Platform 2.0](https://blogs.halodoc.io/lake-house-architecture-halodoc-data-platform-2-0/amp/) (blog post) | ||
|
||
|
||
### Data warehousing | ||
|
||
|
||
* [Guide to Data Warehousing. Short and comprehensive information… | by Tomas Peluritis](https://towardsdatascience.com/guide-to-data-warehousing-6fdcf30b6fbe) (blog post) | ||
* [Snowflake, Redshift, BigQuery, and Others: Cloud Data Warehouse Tools Compared](https://www.altexsoft.com/blog/snowflake-redshift-bigquery-data-warehouse-tools/) (blog post) | ||
|
||
|
||
### Streaming | ||
|
||
|
||
* Building Streaming Analytics: The Journey and Learnings - Maxim Lukichev | ||
|
||
### DataOps | ||
|
||
* [DataOps 101 with Lars Albertsson – DataTalks.Club](https://datatalks.club/podcast/s02e11-dataops.html) (podcast) | ||
|
||
|
||
### Monitoring and observability | ||
|
||
* [Data Observability: The Next Frontier of Data Engineering with Barr Moses](https://datatalks.club/podcast/s03e03-data-observability.html) (podcast) | ||
|
||
|
||
### Analytics engineering | ||
|
||
* [Analytics Engineer: New Role in a Data Team with Victoria Perez Mola](https://datatalks.club/podcast/s03e11-analytics-engineer.html) (podcast) | ||
* [Modern Data Stack for Analytics Engineering - Kyle Shannon](https://www.youtube.com/watch?v=UmIZIkeOfi0) (video) | ||
* [Analytics Engineering vs Data Engineering | RudderStack Blog](https://www.rudderstack.com/blog/analytics-engineering-vs-data-engineering) (blog post) | ||
* [Learn the Fundamentals of Analytics Engineering with dbt](https://courses.getdbt.com/courses/fundamentals) (course) | ||
|
||
|
||
### Data mesh | ||
|
||
* [Data Mesh in Practice - Max Schultze](https://www.youtube.com/watch?v=ekEc8D_D3zY) (video) | ||
|
||
### Cloud | ||
|
||
* [https://acceldataio.medium.com/data-engineering-best-practices-how-netflix-keeps-its-data-infrastructure-cost-effective-dee310bcc910](https://acceldataio.medium.com/data-engineering-best-practices-how-netflix-keeps-its-data-infrastructure-cost-effective-dee310bcc910) | ||
|
||
|
||
### Reverse ETL | ||
|
||
* TODO: What is reverse ETL? | ||
* [https://datatalks.club/podcast/s05e02-data-engineering-acronyms.html](https://datatalks.club/podcast/s05e02-data-engineering-acronyms.html) | ||
* [Open-Source Spotlight - Grouparoo - Brian Leonard](https://www.youtube.com/watch?v=hswlcgQZYuw) (video) | ||
* [Open-Source Spotlight - Castled.io (Reverse ETL) - Arun Thulasidharan](https://www.youtube.com/watch?v=iW0XhltAUJ8) (video) | ||
|
||
## Career in Data Engineering | ||
|
||
* [From Data Science to Data Engineering with Ellen König – DataTalks.Club](https://datatalks.club/podcast/s07e08-from-data-science-to-data-engineering.html) (podcast) | ||
* [Big Data Engineer vs Data Scientist with Roksolana Diachuk – DataTalks.Club](https://datatalks.club/podcast/s04e03-big-data-engineer-vs-data-scientist.html) (podcast) | ||
* [What Skills Do You Need to Become a Data Engineer](https://www.linkedin.com/pulse/what-skills-do-you-need-become-data-engineer-peng-wang/) (blog post) | ||
* [The future history of Data Engineering](https://groupby1.substack.com/p/data-engineering?s=r) (blog post) | ||
* [What Skills Do Data Engineers Need](https://www.theseattledataguy.com/what-skills-do-data-engineers-need/) (blog post) | ||
|
||
### Data Engineering Management | ||
|
||
* [Becoming a Data Engineering Manager with Rahul Jain – DataTalks.Club](https://datatalks.club/podcast/s07e07-becoming-a-data-engineering-manager.html) (podcast) | ||
|
||
## Data engineering projects | ||
|
||
* [How To Start A Data Engineering Project - With Data Engineering Project Ideas](https://www.youtube.com/watch?v=WpN47Jddo7I) (video) | ||
* [Data Engineering Project for Beginners - Batch edition](https://www.startdataengineering.com/post/data-engineering-project-for-beginners-batch-edition/) (blog post) | ||
* [Building a Data Engineering Project in 20 Minutes](https://www.sspaeti.com/blog/data-engineering-project-in-twenty-minutes/) (blog post) | ||
* [Automating Nike Run Club Data Analysis with Python, Airflow and Google Data Studio | by Rich Martin | Medium](https://medium.com/@rich_23525/automating-nike-run-club-data-analysis-with-python-airflow-and-google-data-studio-3c9556478926) (blog post) | ||
|
||
|
||
## Data Engineering Resources | ||
|
||
### Blogs | ||
|
||
* [Start Data Engineering](https://www.startdataengineering.com/) | ||
|
||
### Podcasts | ||
|
||
* [The Data Engineering Podcast](https://www.dataengineeringpodcast.com/) | ||
* [DataTalks.Club Podcast](https://datatalks.club/podcast.html) (only some episodes are about data engineering) | ||
|
||
### Communities | ||
|
||
* [DataTalks.Club](https://datatalks.club/) | ||
* [/r/dataengineering](https://www.reddit.com/r/dataengineering) | ||
|
||
|
||
### Meetups | ||
|
||
* [Sydney Data Engineers](https://sydneydataengineers.github.io/) | ||
|
||
### People to follow on Twitter and LinkedIn | ||
|
||
* TODO | ||
|
||
### YouTube channels | ||
|
||
* [Karolina Sowinska - YouTube](https://www.youtube.com/channel/UCAxnMry1lETl47xQWABvH7g) | ||
* [Seattle Data Guy - YouTube](https://www.youtube.com/c/SeattleDataGuy) | ||
* [Andreas Kretz - YouTube](https://www.youtube.com/c/andreaskayy) | ||
* [DataTalksClub - YouTube](https://youtube.com/c/datatalksclub) (only some videos are about data engineering) | ||
|
||
### Resource aggregators | ||
|
||
* [Reading List](https://www.scling.com/reading-list/) by Lars Albertsson | ||
* [GitHub - igorbarinov/awesome-data-engineering](https://github.com/igorbarinov/awesome-data-engineering) (focus is more on tools) | ||
|
||
|
||
## License | ||
|
||
This work is licensed under a Creative Commons Attribution 4.0 International License. | ||
|
||
CC BY 4.0 |