Data on companies listed on the stock exchanges NASDAQ, NYSE, and AMEX with information on company name, stock symbol, last market capitalization and price, sector or industry group, and IPO year.
- Practicing Financial Data Management Techniques
- Application of EDA
- Remote (Web) Data Access Techniques
- Dive into Plotly.js
- Understanding the Implementation of Financial Data & ETL Pipeline Process
- Understanding the Implementation of Docker Containerization
- Understanding Workflow Management Dynamics: programmatically authoring and scheduling workflows, and monitoring them via the built-in Apache Airflow UI
Table of Contents |
---|
Tech Stack 🔍📜 |
Design 📐 |
Conclusions 📌 |
License 🔖 |
- Python 3.6.3
- Colab Notebook
- Plotly.js
- Apache Airflow
- Docker
- While building an app to visualize stock prices and stock statistics with Plotly.js, I realized that every time I wanted the data updated, I had to re-run the entire code manually. That takes a very long time when you need to extract, transform, and publish data for hundreds of stock tickers. To scale the project, I needed a tool that would orchestrate the repetitive tasks behind the scenes while I focused on building new features. After exploring several options for automating the ETL pipeline, I settled on combining Python and Airflow.
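
The extract, transform, and load steps described above can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline: the column names (`Symbol`, `LastSale`, `MarketCap`, `IPOyear`), the `companies` table, the function names, and the sample rows are all assumptions made for the example, and the tickers are placeholders rather than real market data.

```python
import csv
import io
import sqlite3

# Hypothetical sample shaped like the dataset described above.
# Tickers and values are placeholders, not real market data.
SAMPLE_CSV = """Symbol,Name,LastSale,MarketCap,Sector,IPOyear
AAA,Alpha Placeholder Inc.,10.50,1500000000,Technology,2001
BBB,Beta Placeholder Corp.,n/a,n/a,Finance,n/a
CCC,Gamma Placeholder Ltd.,3.25,42000000,Health Care,2015
"""


def extract(raw_csv):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def _to_number(value, cast):
    """Return cast(value), or None when the value is missing or not numeric."""
    try:
        return cast(value)
    except (TypeError, ValueError):
        return None


def transform(rows):
    """Transform: coerce numeric fields and drop rows without a usable price."""
    cleaned = []
    for row in rows:
        price = _to_number(row["LastSale"], float)
        if price is None:
            continue  # a row with no price cannot be plotted
        cleaned.append((
            row["Symbol"].strip(),
            row["Name"].strip(),
            price,
            _to_number(row["MarketCap"], float),
            row["Sector"].strip(),
            _to_number(row["IPOyear"], int),
        ))
    return cleaned


def load(records, conn):
    """Load: upsert records into SQLite so that re-runs do not duplicate rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS companies ("
        "symbol TEXT PRIMARY KEY, name TEXT, last_sale REAL, "
        "market_cap REAL, sector TEXT, ipo_year INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO companies VALUES (?, ?, ?, ?, ?, ?)", records
    )
    conn.commit()
    return len(records)


def run_pipeline(raw_csv, conn):
    """Run the three steps in order and return the number of rows loaded."""
    return load(transform(extract(raw_csv)), conn)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(SAMPLE_CSV, connection), "rows loaded")
```

Keeping each step as a separate function with an idempotent load is what makes the job safe to schedule: in an Airflow setup, each function would typically be wrapped in its own task so a failed step can be retried without repeating the others.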