- 👋 Hi, I’m Arvind
- 👀 I’m interested in programming and mathematics.
- 🌱 Check my commit history to see what I am working on.
- 📫 Reach me on Twitter (@ardTechNation)

Focus: Laying the groundwork for Machine Learning projects with a thorough initiation process.
-
Day 1: Introduction to Project Initiation

Overview of starting a Machine Learning project: understanding goals, stakeholders, and success criteria. Learn the basics of pitching a project to stakeholders.

- Website: Google’s Machine Learning Crash Course - "ML Problem Framing"
- Research Paper: "A Survey of the Usages of Deep Learning in Natural Language Processing" (Young et al., 2018)
- Blog: Towards Data Science - "How to Start a Machine Learning Project"
- YouTube: "Machine Learning Project Lifecycle" by DataCamp
-
Day 2: Pitching and Selling Your Idea

Dive into crafting compelling pitches and aligning project goals with business needs. Practice structuring a pitch for different audiences.

- Website: Stanford Online - "How to Pitch a Project"
- Research Paper: "The Business of Artificial Intelligence" (Brynjolfsson & McAfee, HBR, 2017)
- Blog: Medium - "Selling Your Machine Learning Project to Stakeholders"
- YouTube: "How to Pitch AI Projects" by Siraj Raval
-
Day 3: Problem Framing and Structuring

Learn how to frame complex problems to ensure solvable outcomes. Break down a problem into manageable components and set up a project roadmap.

- Website: Kaggle - "Problem Framing in Data Science"
- Research Paper: "Problem Formulation and Data Collection for Machine Learning" (Wagstaff, 2012)
- Blog: Data Science Central - "Framing ML Problems Like a Pro"
- YouTube: "How to Frame a Machine Learning Problem" by DeepLearning.AI
-
Day 4: Running a Discovery Phase

Explore how to conduct a discovery phase: identifying key questions, risks, and unknowns. Discuss stakeholder interviews and requirement gathering.

- Website: AWS Machine Learning - "Discovery Phase Best Practices"
- Research Paper: "The Role of Discovery in Machine Learning" (Domingos, 2012)
- Blog: Towards Data Science - "Discovery Phase in ML: A Step-by-Step Guide"
- YouTube: "ML Discovery Phase Explained" by freeCodeCamp
-
Day 5: Data Collection and Initial Setup

Address selection bias, manage data collection strategies, and cover best practices for data labeling. Begin planning an initial prototype.

- Website: Google Research - "Data Collection Practices"
- Research Paper: "Data Quality Challenges in Machine Learning" (Krishnan et al., 2016)
- Blog: KDnuggets - "Avoiding Selection Bias in ML Projects"
- YouTube: "Data Collection for Machine Learning" by Sentdex
-
Day 6: Building and Presenting a Prototype

Hands-on session to build a simple prototype. Learn how to present it to stakeholders for feedback and iteration.

- Website: Fast.ai - "Prototyping ML Models"
- Research Paper: "Rapid Prototyping for Machine Learning" (Sculley et al., 2015)
- Blog: Towards Data Science - "How to Build Your First ML Prototype"
- YouTube: "Building an ML Prototype in 10 Minutes" by TensorFlow

Focus: Constructing robust and functional Machine Learning models.
-
Day 1: Data Cleaning Basics

Introduction to data cleaning: identifying missing values, outliers, and inconsistencies. Learn foundational techniques for preparing data.

- Website: OpenML - "Data Cleaning Guide"
- Research Paper: "Data Cleaning: A Practical Perspective" (Chu et al., 2016)
- Blog: Analytics Vidhya - "Data Cleaning 101 for Machine Learning"
- YouTube: "Data Cleaning Tutorial" by Corey Schafer
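
As a rough illustration of the Day 1 topics, the sketch below locates missing values and flags outliers with the common 1.5 × IQR rule. The function names and the `ages` sample are made up for this example, and the IQR rule is only one of several possible outlier criteria.

```python
from statistics import quantiles


def find_missing(values):
    """Return the positions of entries that are None (treated as missing)."""
    return [i for i, v in enumerate(values) if v is None]


def iqr_outliers(values, k=1.5):
    """Return non-missing values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)  # quartiles of the observed values
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in present if v < low or v > high]


# Hypothetical column with two missing entries and one implausible age.
ages = [23, 25, None, 24, 26, 22, 120, None, 27]

print("missing at positions:", find_missing(ages))
print("flagged as outliers:", iqr_outliers(ages))
```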
-
Day 2: Feature Engineering Deep Dive

Explore feature engineering: creating meaningful features, handling categorical data, and transforming raw inputs for better model performance.

- Website: Featuretools - "Automated Feature Engineering"
- Research Paper: "Feature Engineering for Predictive Modeling" (Kuhn & Johnson, 2013)
- Blog: Towards Data Science - "Mastering Feature Engineering"
- YouTube: "Feature Engineering for ML" by StatQuest
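
One simple way to handle categorical data, as mentioned for Day 2, is one-hot encoding. The sketch below is a minimal pure-Python version; `one_hot` and the `colors` sample are hypothetical names chosen for illustration, and in practice a library encoder would usually be used instead.

```python
def one_hot(values):
    """Map each categorical value to a 0/1 vector with a single 1.

    Categories are sorted so that the column order is deterministic.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical categorical feature.
colors = ["red", "green", "red", "blue"]
categories, encoded = one_hot(colors)

print(categories)
for color, row in zip(colors, encoded):
    print(color, row)
```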
-
Day 3: Preprocessing Techniques

Cover vectorization, normalization, and imputation in detail. Practice applying these techniques to sample datasets.

- Website: Scikit-Learn - "Preprocessing Documentation"
- Research Paper: "Data Preprocessing Techniques for Machine Learning" (García et al., 2015)
- Blog: Machine Learning Mastery - "How to Normalize and Impute Data"
- YouTube: "Data Preprocessing in Python" by Krish Naik
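
To make imputation and normalization from Day 3 concrete, here is a small sketch that fills missing values with the column mean and then applies min-max scaling. The helper names and the `raw` sample are assumptions for this example; mean imputation and min-max scaling are just two of the available options.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical numeric column with one missing entry.
raw = [10.0, None, 20.0, 30.0]
filled = impute_mean(raw)
scaled = min_max_scale(filled)

print(filled)
print(scaled)
```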
-
Day 4: Model Selection Strategies

Discuss how to choose the right model for your problem (e.g., regression, classification, clustering). Compare trade-offs of different algorithms.

- Website: TensorFlow - "Model Selection Guide"
- Research Paper: "Model Selection: Beyond the Bayesian/Frequentist Divide" (Burnham & Anderson, 2004)
- Blog: KDnuggets - "Choosing the Right ML Model"
- YouTube: "Model Selection in Machine Learning" by Edureka
-
Day 5: Building a Training Pipeline

Learn to construct an end-to-end training pipeline iteratively. Integrate data preprocessing and model training into a cohesive workflow.

- Website: MLflow - "Training Pipelines"
- Research Paper: "Machine Learning Pipelines: From Research to Production" (Zaharia et al., 2018)
- Blog: Towards Data Science - "End-to-End ML Pipeline Tutorial"
- YouTube: "Building ML Pipelines" by Data School
-
Day 6: Scaling with Distributed Training

Introduction to distributed training: explore data parallelism and model parallelism to handle large datasets and complex models.

- Website: PyTorch - "Distributed Training Docs"
- Research Paper: "Large Scale Distributed Deep Networks" (Dean et al., 2012)
- Blog: AWS Blog - "Scaling ML with Distributed Training"
- YouTube: "Distributed Training Explained" by Hugging Face

Focus: Validating and evaluating models for trustworthiness and real-world applicability.
-
Day 1: Introduction to Model Evaluation

Overview of evaluation strategies: why validation matters and common pitfalls. Introduction to cross-validation.

- Website: Scikit-Learn - "Model Evaluation"
- Research Paper: "A Survey of Cross-Validation Procedures" (Arlot & Celisse, 2010)
- Blog: Towards Data Science - "Why Model Evaluation Matters"
- YouTube: "Model Evaluation Basics" by StatQuest
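
The idea behind k-fold cross-validation, introduced on Day 1, can be sketched as an index splitter: every sample is used for validation exactly once and for training in the remaining folds. `k_fold_indices` is a hypothetical helper written for this example; it uses contiguous folds without shuffling, which real workflows often add.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds.

    When n_samples is not divisible by k, the first folds get one extra sample.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop


# Six samples split into three folds.
folds = list(k_fold_indices(6, 3))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```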
-
Day 2: Advanced Evaluation Techniques

Explore LLM-as-a-judge, LLM juries, backtesting, and behavioral testing. Apply these methods to example scenarios.

- Website: Hugging Face - "LLM Evaluation"
- Research Paper: "Evaluating Large Language Models" (Chang et al., 2023)
- Blog: Medium - "LLM-as-a-Judge: A New Paradigm"
- YouTube: "Advanced Model Evaluation" by DeepLearning.AI
-
Day 3: Business-Aligned Metrics

Learn to frame evaluation metrics (e.g., precision, recall, ROI) in the context of business goals. Practice mapping technical outcomes to real-world impact.

- Website: Google Cloud - "ML Metrics for Business"
- Research Paper: "Aligning AI with Business Goals" (Ng, 2018)
- Blog: DataCamp - "Metrics That Matter in ML"
- YouTube: "Business Metrics in ML" by Simplilearn
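
One possible way to map technical metrics to business impact, in the spirit of Day 3, is to attach a value or cost to each outcome of the confusion matrix. In the sketch below the counts and the monetary figures are invented for illustration, and `net_value` is a hypothetical helper rather than a standard metric.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def net_value(tp, fp, fn, value_per_tp, cost_per_fp, cost_per_fn):
    """Translate outcomes into money: gains from hits minus costs of errors."""
    return tp * value_per_tp - fp * cost_per_fp - fn * cost_per_fn


# Hypothetical results of a churn model on one month of customers.
tp, fp, fn = 80, 20, 40
precision, recall = precision_recall(tp, fp, fn)
value = net_value(tp, fp, fn, value_per_tp=50, cost_per_fp=5, cost_per_fn=30)

print(f"precision={precision:.2f} recall={recall:.2f} net value={value}")
```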
-
Day 4: Preventing Data Leakage

Understand data leakage: causes, consequences, and prevention strategies. Analyze case studies of leakage gone wrong.

- Website: Kaggle - "Data Leakage Tutorial"
- Research Paper: "Leakage in Data Mining" (Kaufman et al., 2012)
- Blog: Towards Data Science - "Avoiding Data Leakage in ML"
- YouTube: "What is Data Leakage?" by Data Science Dojo
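
A frequent source of leakage is computing preprocessing statistics on the full dataset before splitting. The sketch below contrasts fitting a standardizer on the training split only with a leaky fit that also sees the test data. The function names and the tiny `train`/`test` lists are assumptions made for this example.

```python
from statistics import mean, pstdev


def fit_standardizer(values):
    """Learn the mean and population standard deviation from the given data."""
    return mean(values), pstdev(values)


def transform(values, mu, sigma):
    """Apply a previously fitted standardizer."""
    return [(v - mu) / sigma for v in values]


# Hypothetical split: the test point is far from the training data.
train, test = [1.0, 2.0, 3.0], [10.0]

# Correct: statistics come from the training split only.
mu, sigma = fit_standardizer(train)
test_scaled = transform(test, mu, sigma)

# Leaky: the test point influences the statistics used during training.
leaky_mu, leaky_sigma = fit_standardizer(train + test)

print("train-only mean:", mu, "leaky mean:", leaky_mu)
print("test value after train-only scaling:", test_scaled)
```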
-
Day 5: Error Analysis and Debugging

Perform error analysis to identify model weaknesses. Learn techniques to handle imbalanced data and improve performance.

- Website: TensorFlow - "Debugging ML Models"
- Research Paper: "Error Analysis in Machine Learning" (Koh & Liang, 2017)
- Blog: Machine Learning Mastery - "How to Debug Your ML Model"
- YouTube: "Error Analysis in ML" by Andrew Ng
-
Day 6: Practical Validation Workshop

Hands-on session: apply evaluation techniques to a sample model, interpret results, and propose improvements.

- Website: Fast.ai - "Practical ML Validation"
- Research Paper: "Practical Model Evaluation" (Breiman, 2001)
- Blog: KDnuggets - "Hands-On Model Validation"
- YouTube: "ML Validation Workshop" by freeCodeCamp

Focus: Deploying models effectively and optimizing them for production.
-
Day 1: Model Deployment Basics

Introduction to versioning and deploying models. Discuss key operational considerations like latency and scalability.

- Website: AWS SageMaker - "Deploying Models"
- Research Paper: "Deploying Machine Learning Models" (Sculley et al., 2015)
- Blog: Towards Data Science - "ML Deployment 101"
- YouTube: "Deploying ML Models" by TensorFlow
-
Day 2: Serving Predictions Strategically

Explore prediction-serving strategies: batch vs. real-time, human-in-the-loop workflows, and cost-sensitive approaches.

- Website: Google Cloud - "Prediction Serving"
- Research Paper: "Human-in-the-Loop Machine Learning" (Holzinger, 2016)
- Blog: Medium - "Smart Prediction Serving Strategies"
- YouTube: "Serving ML Predictions" by DataCamp
-
Day 3: Trade-Offs in Deployment

Analyze trade-offs: accuracy vs. speed, cost vs. performance. Practice deploying a simple model.

- Website: Microsoft Azure - "ML Deployment Trade-Offs"
- Research Paper: "Trade-Offs in ML Deployment" (Amershi et al., 2019)
- Blog: KDnuggets - "Latency vs. Accuracy in ML"
- YouTube: "ML Deployment Trade-Offs" by Edureka
-
Day 4: Model Optimization Techniques

Learn pruning and quantization to reduce model size and improve efficiency. Introduction to knowledge distillation.

- Website: PyTorch - "Model Optimization"
- Research Paper: "Pruning Neural Networks" (LeCun et al., 1989)
- Blog: Towards Data Science - "Quantization and Pruning Guide"
- YouTube: "Optimizing ML Models" by DeepLearning.AI
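
To give a feel for the Day 4 techniques, the sketch below applies symmetric 8-bit linear quantization and magnitude pruning to a plain list of weights. This is a toy version under simplifying assumptions (one scale per tensor, a fixed pruning threshold); the function names and sample weights are hypothetical, and framework tooling works on real tensors with more options.

```python
def quantize_int8(weights):
    """Symmetric linear quantization to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Map integers back to approximate floating-point weights."""
    return [q * scale for q in quantized]


def magnitude_prune(weights, threshold):
    """Zero out weights whose absolute value is below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


# Hypothetical weights of a tiny layer.
weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
pruned = magnitude_prune(weights, threshold=0.1)

print("quantized:", q)
print("restored:", restored)
print("pruned:", pruned)
```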
-
Day 5: Advanced Optimization with LoRA

Dive into Low-Rank Adaptation (LoRA) for fine-tuning and compressing models. Apply it to a sample use case.

- Website: Hugging Face - "LoRA Documentation"
- Research Paper: "LoRA: Low-Rank Adaptation of Large Language Models" (Hu et al., 2021)
- Blog: Medium - "LoRA for Model Optimization"
- YouTube: "LoRA Explained" by Yannic Kilcher
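
The core idea of LoRA is to keep a pretrained weight matrix W frozen and learn a low-rank update, so the effective weight is W + (alpha / r) · B · A, with B of shape d_out × r and A of shape r × d_in. The sketch below works through that arithmetic on tiny list-based matrices; the helper names and the numbers are illustrative assumptions, not the API of any library.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, b, a, alpha, r):
    """Effective weight W + (alpha / r) * (B @ A); W itself stays frozen."""
    delta = matmul(b, a)
    scaling = alpha / r
    return [
        [w_ij + scaling * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_trainable_params(d_out, d_in, r):
    """Trainable parameters in B (d_out x r) and A (r x d_in)."""
    return r * (d_out + d_in)


# Hypothetical 2 x 3 frozen weight with a rank-1 update.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.0, -0.5]]
W_eff = lora_weight(W, B, A, alpha=2, r=1)

print(W_eff)
print("full:", 4096 * 4096, "LoRA (r=8):", lora_trainable_params(4096, 4096, 8))
```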
-
Day 6: Deployment Simulation

Simulate deploying an optimized model, incorporating versioning, monitoring, and rollback strategies.

- Website: MLflow - "Deployment Simulation"
- Research Paper: "Simulating ML Deployments" (Zaharia et al., 2018)
- Blog: Towards Data Science - "Simulating ML in Production"
- YouTube: "ML Deployment Simulation" by freeCodeCamp

Focus: Ensuring models remain reliable in production by monitoring and adapting to changes.
-
Day 1: Understanding Model Drift

Introduction to distribution shifts: covariate shift, label shift, and concept drift. Discuss why drift matters.

- Website: Evidently AI - "Model Drift Guide"
- Research Paper: "Concept Drift in Machine Learning" (Gama et al., 2014)
- Blog: Towards Data Science - "What is Model Drift?"
- YouTube: "Model Drift Explained" by Data Science Dojo
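
One commonly used signal for covariate shift is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. The sketch below is a minimal version under stated assumptions: fixed bin edges chosen by hand, a small `eps` to avoid taking the log of zero, and made-up sample data. Values outside the edges are not assigned to any bin.

```python
import math


def histogram_fractions(values, edges):
    """Fraction of values in each bin [edges[i], edges[i+1]); last bin is closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return [c / len(values) for c in counts]


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index: sum of (a - e) * ln(a / e) over bins."""
    e = [max(x, eps) for x in histogram_fractions(expected, edges)]
    a = [max(x, eps) for x in histogram_fractions(actual, edges)]
    return sum((a_i - e_i) * math.log(a_i / e_i) for e_i, a_i in zip(e, a))


# Hypothetical feature values at training time and in production.
training = [1, 2, 3, 4, 5, 6, 7, 8]
live = [5, 6, 7, 8, 7, 8, 6, 8]
edges = [0, 4, 8]

print("PSI, same data:", psi(training, training, edges))
print("PSI, shifted data:", psi(training, live, edges))
```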
-
Day 2: Handling Edge Cases and Outliers

Learn techniques to identify and manage edge cases and outliers in production data.

- Website: Scikit-Learn - "Outlier Detection"
- Research Paper: "Outlier Detection in High-Dimensional Data" (Aggarwal, 2015)
- Blog: KDnuggets - "Managing Edge Cases in ML"
- YouTube: "Outlier Handling in ML" by Krish Naik
-
Day 3: Feedback Loops and Their Impact

Explore feedback loops: how they arise and how to mitigate their effects on model performance.

- Website: Google Research - "Feedback Loops in ML"
- Research Paper: "Feedback Loops in Machine Learning Systems" (Sculley et al., 2015)
- Blog: Medium - "Breaking Feedback Loops in ML"
- YouTube: "Feedback Loops in ML" by Sentdex
-
Day 4: Monitoring Tools and Techniques

Use adversarial validation and other practical strategies to monitor models. Set up basic monitoring dashboards.

- Website: Prometheus - "Monitoring ML Models"
- Research Paper: "Adversarial Validation for Model Monitoring" (Sugiyama et al., 2008)
- Blog: Towards Data Science - "Monitoring ML in Production"
- YouTube: "Model Monitoring Tools" by DataCamp
-
Day 5: Building Resilient Models

Discuss techniques (e.g., robust training, regularization) to make models adaptable to shifts. Test resilience on a dataset.

- Website: TensorFlow - "Robust ML Models"
- Research Paper: "Robustness in Machine Learning" (Goodfellow et al., 2014)
- Blog: Machine Learning Mastery - "Resilient ML Models"
- YouTube: "Building Robust ML Models" by DeepLearning.AI
-
Day 6: Case Study and Troubleshooting

Analyze a real-world example of drift, diagnose issues, and propose monitoring and mitigation strategies.

- Website: AWS - "ML Monitoring Case Studies"
- Research Paper: "Practical Lessons from ML in Production" (Polyzotis et al., 2017)
- Blog: KDnuggets - "Troubleshooting ML Drift"
- YouTube: "ML Case Study: Drift" by freeCodeCamp

Focus: Creating adaptive, self-improving Machine Learning systems for long-term success.
-
Day 1: Introduction to Continual Learning

Overview of continual learning: why automation and adaptation are critical for production systems.

- Website: DeepMind - "Continual Learning Research"
- Research Paper: "Continual Learning in Neural Networks" (Kirkpatrick et al., 2017)
- Blog: Towards Data Science - "What is Continual Learning?"
- YouTube: "Continual Learning Intro" by Yannic Kilcher
-
Day 2: Incremental Training Techniques

Learn incremental training to update models with new data. Discuss avoiding catastrophic forgetting.

- Website: PyTorch - "Incremental Learning"
- Research Paper: "Incremental Learning Algorithms" (Losing et al., 2018)
- Blog: Medium - "Incremental Training in ML"
- YouTube: "Incremental Learning Tutorial" by Sentdex
-
Day 3: Retraining Strategies

Explore full retraining vs. fine-tuning vs. transfer learning. Compare their pros and cons with examples.

- Website: Google Cloud - "Retraining ML Models"
- Research Paper: "Transfer Learning and Fine-Tuning" (Yosinski et al., 2014)
- Blog: KDnuggets - "Retraining vs. Fine-Tuning"
- YouTube: "Retraining ML Models" by Data School
-
Day 4: Testing in Production

Cover A/B testing, canary releases, and shadow deployments. Learn how to safely test model updates.

- Website: AWS - "A/B Testing ML Models"
- Research Paper: "A/B Testing at Scale" (Kohavi et al., 2013)
- Blog: Towards Data Science - "Testing ML in Production"
- YouTube: "A/B Testing for ML" by Simplilearn
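
When an A/B test compares conversion rates of a current model and a candidate, a two-proportion z-test is one standard way to judge whether the difference is likely noise. The sketch below assumes large, independent samples; the function name and the traffic numbers are invented for illustration.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return the z statistic and two-sided p-value for rate_b - rate_a."""
    rate_a, rate_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: model A (control) vs. model B (candidate).
z, p_value = two_proportion_z_test(success_a=100, n_a=1000, success_b=130, n_b=1000)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```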
-
Day 5: Advanced Testing Methods

Dive into interleaving experiments and multi-armed bandits for production testing. Simulate a testing scenario.

- Website: Netflix Tech Blog - "Interleaving Experiments"
- Research Paper: "Multi-Armed Bandits in ML" (Lattimore & Szepesvári, 2020)
- Blog: Medium - "Advanced ML Testing Techniques"
- YouTube: "Interleaving Experiments in ML" by DataCamp
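
A multi-armed bandit shifts traffic toward the better-performing variant while the test is still running. The sketch below simulates an epsilon-greedy policy, one of the simplest bandit strategies, choosing between model variants with assumed conversion rates. The function name, the rates, and the seed are hypothetical.

```python
import random


def epsilon_greedy(true_rates, steps, epsilon, seed=0):
    """Simulate epsilon-greedy arm selection and return pulls per arm.

    With probability epsilon (or while some arm is untried) a random arm is
    explored; otherwise the arm with the best observed mean reward is used.
    """
    rng = random.Random(seed)
    pulls = [0] * len(true_rates)
    rewards = [0.0] * len(true_rates)
    for _ in range(steps):
        if rng.random() < epsilon or 0 in pulls:
            arm = rng.randrange(len(true_rates))
        else:
            arm = max(range(len(true_rates)), key=lambda i: rewards[i] / pulls[i])
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        pulls[arm] += 1
        rewards[arm] += reward
    return pulls


# Hypothetical conversion rates for two model variants.
pulls = epsilon_greedy(true_rates=[0.05, 0.10], steps=5000, epsilon=0.1)
print("traffic per variant:", pulls)
```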
-
Day 6: Capstone Project

Tie it all together: design, build, deploy, and monitor a continual learning system. Present your system and reflect on lessons learned.

- Website: Kaggle - "ML Project Ideas"
- Research Paper: "From Research to Production: A Case Study" (Ng, 2018)
- Blog: Towards Data Science - "Building an End-to-End ML System"
- YouTube: "ML Capstone Project Walkthrough" by freeCodeCamp