Commit

add cross links and fix observability information
fmind committed Jul 28, 2024
1 parent b74d653 commit f6ba1f1
Showing 6 changed files with 15 additions and 3 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -6,6 +6,10 @@ This course is designed to dive deep into the intersection of software developme

Whether you are a beginner eager to explore or an experienced professional seeking to enhance your skill set, this course offers valuable insights and hands-on experience.

**Related Resources:**
- **[MLOps Python Package (Example)](https://github.com/fmind/mlops-python-package)**: Kickstart your MLOps initiative with a flexible, robust, and productive Python package.
- **[Cookiecutter MLOps Package (Template)](https://github.com/fmind/cookiecutter-mlops-package)**: Build and deploy Python packages and Docker images for MLOps tasks.

## Key Features

- **Hands-on Python Coding**: Learn to code with Python in a way that's directly applicable to real-world AI projects.
@@ -26,6 +30,7 @@ Whether you are a beginner eager to explore or an experienced professional seeki
4. **Validating**: Focus on code quality with typing, linting, testing, and debugging to ensure your ML projects are robust and maintainable.
5. **Refining**: Dive into advanced MLOps techniques including CI/CD workflows, software containers, and model registries to streamline your operations.
6. **Sharing**: Learn how to effectively organize and document your MLOps projects to ensure they are accessible and collaborative.
7. **Observability**: Gain comprehensive insights into the behavior and performance of your deployed models and infrastructure.

## Installation

1 change: 1 addition & 0 deletions docs/0. Overview/0.0. Course.md
@@ -44,6 +44,7 @@ The course is divided into six in-depth chapters, each focusing on different fac
4. **[Validating](../../4. Validating/)**: Adopt practices like typing, linting, testing, and logging to refine code quality.
5. **[Refining](../../5. Refining/)**: Leverage advanced software development techniques and tools to polish your project.
6. **[Sharing](../../6. Sharing/)**: Foster a productive team environment for effective contributions and communication.
7. **[Observability](../../7. Observability/)**: Implement tools and practices for monitoring your data, models, and infrastructure.

## What's beyond the scope of this course?

2 changes: 2 additions & 0 deletions docs/7. Observability/2. Alerting.md
@@ -42,6 +42,7 @@ To implement an effective alerting system, you need to choose communication chan
- **[Slack](https://slack.com/) and [Discord](https://discord.com/)**: Suitable for real-time team communication, these messaging platforms allow for instant notifications, discussions, and collaboration among team members.
- **[Datadog](https://www.datadoghq.com/)**: A popular monitoring and observability platform, it provides comprehensive alerting capabilities for various system and application metrics, including those related to AI/ML models.
- **[Statuspal](https://statuspal.io/)**: This platform specializes in status page monitoring and incident communication, making it useful for notifying users about any disruptions or downtime related to AI/ML services.
- **[PagerDuty](https://www.pagerduty.com/)**: A popular incident management platform that can be used for routing AI/ML alerts to the right team members, escalating issues if necessary, and ensuring that incidents are addressed promptly.

## How can you implement Alerting (local demo)?

@@ -92,5 +93,6 @@ Here's how you can use the alerting service:
- [Alerting in Datadog](https://docs.datadoghq.com/monitors/manage/status/#alerts)
- [Slack API Documentation](https://api.slack.com/)
- [Discord Developer Documentation](https://discord.com/developers/docs/intro)
- [PagerDuty](https://www.pagerduty.com/)
- [Statuspal](https://statuspal.io/)
- [Plyer](https://plyer.readthedocs.io/)
2 changes: 1 addition & 1 deletion docs/7. Observability/index.md
@@ -8,7 +8,7 @@ Observability in Machine Learning Operations (MLOps) is crucial for gaining insi

- **[7.0. Reproducibility](./0. Reproducibility.md)**: Explore how to make machine learning experiments and pipelines more reproducible using MLflow Projects, enabling others to verify findings, share knowledge, and build upon existing work.
- **[7.1. Monitoring](./1. Monitoring.md)**: Learn the fundamental principles and tools for monitoring AI/ML models, focusing on tracking key metrics, setting up alerts, and understanding changes in model behavior using MLflow Evaluate API and Evidently.
- **[7.2. Alerting](./2. Alerting.md)**: Understand how to design effective alert systems to promptly notify stakeholders of potential issues with models or infrastructure using tools like Slack, Discord, Datadog, and Statuspal.
- **[7.2. Alerting](./2. Alerting.md)**: Understand how to design effective alert systems to promptly notify stakeholders of potential issues with models or infrastructure using tools like Slack, Discord, Datadog, and PagerDuty.
- **[7.3. Lineage](./3. Lineage.md)**: Delve into data and model lineage, discovering how to track the origin and transformation of data and models throughout the ML lifecycle using MLflow Dataset.
- **[7.4. Costs and KPIs](./4. Costs-KPIs.md)**: Explore techniques for managing costs associated with running AI/ML workloads and for defining and tracking key performance indicators (KPIs) aligned with business goals, using MLflow Tracking for analysis.
- **[7.5. Explainability](./5. Explainability.md)**: Explore the concept of explainable AI, focusing on techniques like SHAP to understand model predictions and build trust in AI systems.
6 changes: 5 additions & 1 deletion docs/index.md
@@ -38,7 +38,11 @@ In this chapter, we delve into refining MLOps projects to enhance their efficien

## [Chapter 6: Sharing](./6. Sharing/)

The final chapter focuses on sharing and distributing MLOps projects. We explore tools and practices that enhance collaboration, promote reuse, and facilitate the scaling of machine learning solutions. You will learn how to effectively organize, document, and disseminate your projects to make them more accessible and beneficial to others.
The chapter focuses on sharing and distributing MLOps projects. We explore tools and practices that enhance collaboration, promote reuse, and facilitate the scaling of machine learning solutions. You will learn how to effectively organize, document, and disseminate your projects to make them more accessible and beneficial to others.

## [Chapter 7: Observability](./7. Observability/)

This chapter dives into the essential aspects of observability in MLOps, equipping you with the knowledge and strategies to gain comprehensive insights into the performance, behavior, and health of your deployed models and infrastructure. You'll learn how to ensure reproducibility, implement monitoring and alerting systems, track data and model lineage, manage costs and KPIs, understand model explainability, and monitor infrastructure performance.

## Let's journey together!

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@

[tool.poetry]
name = "mlops-coding-course"
version = "3.0.1"
version = "3.1.0"
description = "Learn how to create, develop, and maintain an MLOps code base."
repository = "https://github.com/MLOps-Courses/mlops-coding-course"
documentation = "https://mlops-coding-course.fmind.dev/"
