Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'main' of github.com:radicalbit/radicalbit-ai-monitoring…
… into features/ROS-496-introduce-text-generation-model-type
Showing 33 changed files with 3,618 additions and 29 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
{ | ||
".": "1.1.0" | ||
".": "1.2.0" | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,82 @@ | ||
--- | ||
sidebar_position: 5 | ||
--- | ||
|
||
# All metrics | ||
List of all available Metrics and Charts. | ||
|
||
## CSV summary | ||
|
||
* Number of variables | ||
* Number of observations | ||
* Number of missing values | ||
* Percentage of missing values | ||
* Number of duplicated rows | ||
* Percentage of duplicated rows | ||
* Number of **numerical** variables | ||
* Number of **categorical** variables | ||
* Number of **datetime** variables | ||
|
||
Summary with all variable names and types (float, int, string, datetime). | ||
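
The counts listed above can be illustrated with a minimal sketch using only the Python standard library. The function name `csv_summary`, the treatment of empty strings as missing values, and the per-cell missing percentage are assumptions made for illustration, not the platform's actual implementation.

```python
import csv
import io


def csv_summary(text: str) -> dict:
    """Compute basic summary counts for CSV text that has a header row."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    n_cells = len(header) * len(data)
    # Assumption: an empty string marks a missing value.
    missing = sum(1 for row in data for cell in row if cell == "")
    # A row counts as duplicated if an identical row appeared earlier.
    seen, duplicated = set(), 0
    for row in data:
        key = tuple(row)
        if key in seen:
            duplicated += 1
        seen.add(key)
    return {
        "n_variables": len(header),
        "n_observations": len(data),
        "missing_values": missing,
        "missing_pct": 100 * missing / n_cells if n_cells else 0.0,
        "duplicated_rows": duplicated,
        "duplicated_pct": 100 * duplicated / len(data) if data else 0.0,
    }


sample = "age,city\n31,Rome\n,Milan\n31,Rome\n"
summary = csv_summary(sample)
print(summary)
```

Counting variables by type (numerical, categorical, datetime) would additionally require a type inference step, which is left out of this sketch.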
|
||
## Data quality | ||
|
||
* **Numerical** variables | ||
  * Average | ||
  * Standard deviation | ||
  * Minimum | ||
  * Maximum | ||
  * Percentile 25% | ||
  * Median | ||
  * Percentile 75% | ||
  * Number of missing values | ||
  * Histogram with 10 bins | ||
* **Categorical** variables | ||
  * Number of missing values | ||
  * Percentage of missing values | ||
  * Number of distinct values | ||
  * For each distinct value: | ||
    * count of observations | ||
    * percentage of observations | ||
* **Ground truth** | ||
  * if categorical, i.e. for a classification model: bar plot *(for both reference and current for an easy comparison)* | ||
  * if numerical, i.e. for a regression model: histogram with 10 bins *(for both reference and current for an easy comparison)* | ||
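
As a rough illustration of the numerical statistics and the 10-bin histogram listed above, here is a standard-library sketch. The function names, the use of `None` for missing values, the sample standard deviation, and the inclusive quantile method are all assumptions; the platform may compute these differently.

```python
import statistics


def describe_numerical(values):
    """Descriptive statistics for a numerical column; None marks a missing value.

    Needs at least two non-missing values.
    """
    present = [v for v in values if v is not None]
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    return {
        "avg": statistics.fmean(present),
        "std": statistics.stdev(present),  # sample standard deviation
        "min": min(present),
        "max": max(present),
        "p25": q1,
        "median": median,
        "p75": q3,
        "missing": len(values) - len(present),
    }


def histogram(values, bins=10):
    """Count values into equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant columns
    counts = [0] * bins
    for v in values:
        # The maximum value is clamped into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


stats = describe_numerical([1, 2, 3, 4, 5, 6, 7, 8, 9, None])
counts = histogram(list(range(11)))
```

Computing the same dictionary for the reference and the current dataset gives the side-by-side comparison described in the list.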
|
||
## Model quality | ||
|
||
* Classification model | ||
  * Number of classes | ||
  * Accuracy *(for both reference and current for an easy comparison)* | ||
  * Line chart of accuracy over time | ||
  * Confusion matrix | ||
  * Log loss, *only for binary classification at the moment* | ||
  * Line chart of log loss over time, *only for binary classification at the moment* | ||
  * For each class: | ||
    * Precision *(for both reference and current for an easy comparison)* | ||
    * Recall *(for both reference and current for an easy comparison)* | ||
    * F1 score *(for both reference and current for an easy comparison)* | ||
    * True Positive Rate *(for both reference and current for an easy comparison)* | ||
    * False Positive Rate *(for both reference and current for an easy comparison)* | ||
    * Support *(for both reference and current for an easy comparison)* | ||
* Regression model | ||
  * Mean squared error *(for both reference and current for an easy comparison)* | ||
  * Root mean squared error *(for both reference and current for an easy comparison)* | ||
  * Mean absolute error *(for both reference and current for an easy comparison)* | ||
  * Mean absolute percentage error *(for both reference and current for an easy comparison)* | ||
  * R-squared *(for both reference and current for an easy comparison)* | ||
  * Adjusted R-squared *(for both reference and current for an easy comparison)* | ||
  * Variance *(for both reference and current for an easy comparison)* | ||
  * Line charts for all of the above over time | ||
  * Residual analysis: | ||
    * Correlation prediction/ground_truth | ||
    * Residuals plot, i.e. scatter plot of standardised residuals against predictions | ||
    * Scatter plot for predictions vs ground truth and linear regression line | ||
    * Histogram of the residuals | ||
    * Kolmogorov-Smirnov test of normality for residuals | ||
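
The regression metrics listed above follow standard textbook definitions. The sketch below shows them in plain Python; the function name `regression_metrics` and its `n_features` parameter (needed for adjusted R-squared) are hypothetical and not taken from the platform.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """Common regression metrics for paired ground truth and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when a ground-truth value is zero.
    mape = 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - (mse * n) / ss_tot
    # Adjusted R-squared penalises the number of features used by the model.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "mape": mape,
        "r2": r2,
        "adj_r2": adj_r2,
    }


metrics = regression_metrics([1, 2, 3, 4], [1, 2, 3, 5], n_features=1)
```

Running the same function on the reference and on the current dataset yields the two values that are shown next to each other for comparison.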
|
||
## Data Drift | ||
|
||
Data drift is computed for all features, using different algorithms depending on the data type (float, int, categorical). The following algorithms are currently used (others will be added in the future): | ||
* [Chi-Square Test](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test) | ||
* [Two-Sample Kolmogorov-Smirnov](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Two-sample_Kolmogorov%E2%80%93Smirnov_test) | ||
* [Population Stability Index](https://scholarworks.wmich.edu/dissertations/3208/) |
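
To make the last algorithm in the list more concrete, here is a minimal sketch of a Population Stability Index computation. The equal-width binning over the reference range, the number of bins, and the `eps` floor for empty bins are assumptions; the platform's binning strategy may differ.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two numerical samples."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0

    def proportions(sample):
        counts = [0] * bins
        for v in sample:
            # Values outside the reference range fall into the first or last bin.
            idx = int((v - lo) / width)
            counts[max(0, min(idx, bins - 1))] += 1
        # eps keeps empty bins from producing log(0) or a division by zero.
        return [max(c / len(sample), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference = list(range(100))
shifted = [v + 50 for v in reference]
print(psi(reference, reference), psi(reference, shifted))
```

Identical distributions give a PSI of zero, and the value grows as the current distribution moves away from the reference.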
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
--- | ||
sidebar_position: 6 | ||
--- | ||
|
||
# Architecture | ||
|
||
In this section we explore the architecture of the Radicalbit AI platform. | ||
The image below shows all the components of the platform: | ||
|
||
![Alt text](/img/architecture/architecture.png "Architecture") | ||
|
||
## API | ||
|
||
The API is the core of the platform: it exposes all the functionalities via REST APIs. | ||
It requires a PostgreSQL database to store data and a Kubernetes cluster to run Spark jobs for metrics evaluation. | ||
Dataset files are kept in a distributed storage. | ||
The REST APIs can be used through the user interface or through the provided Python SDK. | ||
|
||
## UI | ||
|
||
A UI is provided to use the REST APIs through a human-friendly interface. | ||
It covers all the implemented APIs, from model creation to the visualization of all metrics. | ||
|
||
## SDK | ||
|
||
To interact with the API programmatically, a [_Python SDK_](python-sdk.md) is provided. | ||
The SDK implements all functionalities exposed via REST API. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
--- | ||
sidebar_position: 1 | ||
--- | ||
|
||
# Introduction | ||
Let's discover the **Radicalbit AI Monitoring Platform** in less than 5 minutes. | ||
|
||
## Welcome! | ||
This platform provides a comprehensive solution for monitoring and observing your Artificial Intelligence (AI) models in production. | ||
|
||
### Why Monitor AI Models? | ||
While models often perform well during development and validation, their effectiveness can degrade over time in production due to factors such as data shifts or concept drift. The Radicalbit AI Monitoring Platform helps you proactively identify and address potential performance issues. | ||
|
||
### Key Functionalities | ||
The platform provides comprehensive monitoring capabilities to ensure optimal performance of your AI models in production. It analyses both your reference dataset (used for pre-production validation) and the current datasets in use, allowing you to keep the following under control: | ||
* **Data Quality:** evaluate the quality of your data, as high-quality data is crucial for maintaining optimal model performance. The platform analyses both numerical and categorical features in your dataset to provide insights into | ||
  * *data distribution* | ||
  * *missing values* | ||
  * *target variable distribution* (for supervised learning). | ||
|
||
* **Model Quality Monitoring:** the platform provides a comprehensive suite of metrics, currently designed for classification and regression models. \ | ||
  For classification these metrics include: | ||
  * *Accuracy, Precision, Recall, and F1:* These metrics provide different perspectives on how well your model is classifying positive and negative cases. | ||
  * *False/True Negative/Positive Rates and Confusion Matrix:* These offer a detailed breakdown of your model's classification performance, including the number of correctly and incorrectly classified instances. | ||
  * *AUC-ROC and PR AUC:* These summarise the ROC and precision-recall curves, which help visualize your model's ability to discriminate between positive and negative classes. | ||
|
||
  For regression these metrics include: | ||
  * *Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, R²:* These metrics provide different perspectives on how well your model is predicting a numerical value. | ||
  * *Residual Analysis:* This offers a detailed breakdown of your model's performance, comparing predictions with ground truth and predictions with residuals, i.e. the difference between predictions and ground truth. | ||
* **Model Drift Detection:** analyse model drift, which occurs when the underlying data distribution changes over time and can affect model performance. | ||
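
The classification metrics mentioned in the list above can all be derived from the four confusion-matrix counts. The following sketch uses the standard definitions; the function name and the choice to return `0.0` when a denominator is zero are assumptions for illustration.

```python
def per_class_metrics(y_true, y_pred, positive):
    """Precision, recall (TPR), F1 and FPR for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,  # also known as the true positive rate
        "f1": f1,
        "fpr": fpr,
        "support": tp + fn,  # number of true instances of the class
    }


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
result = per_class_metrics(y_true, y_pred, positive=1)
```

For a multiclass model, the same function can be called once per class, treating that class as positive and all others as negative.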
|
||
### Current Scope and Future Plans | ||
This version focuses on classification, both binary and multiclass, and regression models. Support for additional model types is planned for future releases. |
8 changes: 8 additions & 0 deletions
docs/versioned_docs/version-v1.2.0/model-sections/_category_.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"label": "Model sections", | ||
"position": 4, | ||
"link": { | ||
"type": "generated-index", | ||
"description": "Each created model includes three main sections — Overview, Reference, and Current — as well as a summary section called the Launchpad. This document provides an in-depth explanation of each section." | ||
} | ||
} |
54 changes: 54 additions & 0 deletions
docs/versioned_docs/version-v1.2.0/model-sections/current.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
--- | ||
sidebar_position: 4 | ||
--- | ||
|
||
# Current | ||
The Current section stores all the information (statistics, model metrics and charts) related to the current dataset, placed side by side with the reference information. The objective is to highlight every difference between the data over time. Throughout the platform, all the current information is coloured in shades of blue. | ||
|
||
> NOTE: this section always shows the last uploaded current dataset. If you need the analysis of a previous current dataset, you can browse all of them in the `Import` section. | ||
|
||
## Data Quality | ||
The **Data Quality** dashboard contains a descriptive analysis of the current variables (blue) placed side by side with the reference ones (grey). It adapts to the `Model Type` and shows information such as: | ||
|
||
- Number of observations | ||
- Number of classes (not in regression task) | ||
- Ground Truth Distribution | ||
- Histograms for Numerical Features | ||
- Descriptive Statistics for Numerical Features (average, standard deviation, ranges, percentiles, missing values) | ||
- Bar Charts for Categorical Features | ||
- Descriptive Statistics for Categorical Features (missing values, distinct values, frequencies) | ||
|
||
![Alt text](/img/current/current-data-quality.png "Current Data Quality") | ||
|
||
|
||
## Model Quality | ||
|
||
The **Model Quality** dashboard contains all the metrics used to evaluate the model performance on the current dataset and compares these values with the reference. Many of them are computed by comparing the `prediction`/`probability` with the `ground truth`. The platform computes the appropriate metrics according to the chosen `Model Type`. \ | ||
Unlike the reference section, here the metrics are computed over time, using the flagged `timestamp` column and the `granularity` parameter chosen during model creation. | ||
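
The idea of computing a metric over time can be sketched as grouping records into time buckets and evaluating the metric per bucket. In the example below, the record layout and the granularity names (`hour`, `day`, `month`) are hypothetical; the actual `granularity` values accepted by the platform are not specified here.

```python
from collections import defaultdict
from datetime import datetime


def accuracy_over_time(records, granularity="day"):
    """Group (timestamp, ground_truth, prediction) records into time buckets
    and compute the accuracy of each bucket."""
    # Hypothetical granularity names mapped to bucket label formats.
    formats = {"hour": "%Y-%m-%d %H:00", "day": "%Y-%m-%d", "month": "%Y-%m"}
    buckets = defaultdict(list)
    for ts, truth, pred in records:
        buckets[ts.strftime(formats[granularity])].append(truth == pred)
    return {key: sum(hits) / len(hits) for key, hits in sorted(buckets.items())}


records = [
    (datetime(2024, 1, 1, 9), 1, 1),
    (datetime(2024, 1, 1, 15), 0, 1),
    (datetime(2024, 1, 2, 10), 1, 1),
]
daily = accuracy_over_time(records, granularity="day")
monthly = accuracy_over_time(records, granularity="month")
```

Each bucket then becomes one point of the line chart for that metric.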
|
||
![Alt text](/img/current/current-model-quality.png "Current Model Quality") | ||
|
||
|
||
## Data Drift | ||
|
||
The **Data Drift** section contains the outcome of the drift detectors executed for each variable. | ||
According to the field type (categorical or numerical), a specific drift test is applied: | ||
|
||
- Categorical: **Chi-Square Test** | ||
- Numerical: **Two-Sample KS Test** (for `float` variables), **PSI** (for `int` variables) | ||
|
||
If the dot next to the variable name is red, drift has been detected, and the related chart (and statistical description) can be seen in the `Current/Data Quality` section. | ||
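
The mapping from field type to drift test, together with the two-sample Kolmogorov-Smirnov statistic, can be sketched as follows. The function names and the type labels used as dictionary keys are assumptions, and the sketch stops at the statistic: the threshold or p-value that turns it into a drift flag is not described here.

```python
from bisect import bisect_right


def drift_test_for(field_type):
    """Map a field type to the drift test named in the list above."""
    tests = {"categorical": "chi-square", "float": "two-sample KS", "int": "PSI"}
    return tests[field_type]


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for v in ref + cur:
        cdf_ref = bisect_right(ref, v) / len(ref)
        cdf_cur = bisect_right(cur, v) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


print(drift_test_for("float"), ks_statistic([1.0, 2.0, 3.0], [2.5, 3.5, 4.5]))
```

The statistic ranges from 0 (identical empirical distributions) to 1 (no overlap between the two samples).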
|
||
![Alt text](/img/current/current-data-drift.png "Current Data Drift") | ||
|
||
|
||
## Import | ||
|
||
The **Import** section lists the path where your current CSVs are stored. If you have a private AWS account, the files are saved in a dedicated S3 bucket; otherwise, they are saved locally with Minio (which uses the same syntax as S3). | ||
To see your current datasets stored in Minio, visit [http://localhost:9091](http://localhost:9091). | ||
|
||
Here, you can browse all the current datasets you have uploaded over time. | ||
|
||
![Alt text](/img/current/current-import.png "Current Import") | ||
|
25 changes: 25 additions & 0 deletions
docs/versioned_docs/version-v1.2.0/model-sections/launchpad.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
--- | ||
sidebar_position: 1 | ||
--- | ||
|
||
# Launchpad | ||
The launchpad provides a dedicated space for summarising existing models. | ||
|
||
![Alt text](/img/launchpad/launchpad.png "Launchpad") | ||
|
||
It offers a quick overview of key aspects: | ||
|
||
- **Data Quality Percentage:** This metric reflects the proportion of columns without anomalies across the current datasets. Anomalies are identified using the Interquartile Range (IQR) method, and the final percentage displayed is the average of each Current’s anomaly-free ratio. | ||
- **Model Quality Percentage:** This metric is calculated using a Bootstrap Test, based on the historical metrics (the same metrics, grouped by timestamp) from the Model Quality page for the current dataset. By grouping a metric over time (e.g., Accuracy), we obtain multiple instances of the same metric, forming a statistical population. The Bootstrap Test then compares this population with the metric calculated for the Reference dataset, checking whether it falls outside the 95% confidence interval. If so, the metric is flagged as “significantly different” between the Reference and Current datasets. This process is repeated for each model metric, and the percentage of metrics that pass the test is returned. | ||
- **Drift Detection Percentage:** This percentage represents the ratio of features without drift over the total number of features. | ||
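
The bootstrap comparison described for the Model Quality Percentage could look roughly like the sketch below. It is only one possible reading: the percentile bootstrap of the mean, the number of resamples, the fixed seed, and the function names are all assumptions, and the platform's actual test may use a different statistic.

```python
import random


def bootstrap_ci(samples, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `samples`."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        sum(rng.choices(samples, k=len(samples))) / len(samples)
        for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    lower = means[int(tail * n_resamples)]
    upper = means[int((1 - tail) * n_resamples) - 1]
    return lower, upper


def significantly_different(reference_value, current_values):
    """Flag the metric when the reference value falls outside the interval."""
    lower, upper = bootstrap_ci(current_values)
    return not (lower <= reference_value <= upper)


# Accuracy of the current dataset, one value per time bucket.
current_accuracy = [0.90, 0.91, 0.89, 0.92, 0.90, 0.88]
flags = [significantly_different(ref, current_accuracy) for ref in (0.90, 0.50)]
passed_pct = 100 * flags.count(False) / len(flags)
```

Repeating the check for every model metric and taking the share of metrics that are not flagged gives a percentage of the kind shown in the Launchpad.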
|
||
> NOTE: if a metric cannot be computed, the placeholder `--` will be used. | ||
The general **pie chart** represents the averages of each computed percentage across all models. | ||
|
||
|
||
Additional information appears on the right side: | ||
- **Work in Progress:** This section provides real-time updates on model activities, including ongoing and failed jobs. | ||
- **Alerts:** Here, you’ll find any alerts triggered by the percentages above. When an issue lowers a metric from its ideal 100%, the alert identifies the affected model and component. Clicking the alert takes you to the relevant page for more details. | ||
|
||
|
40 changes: 40 additions & 0 deletions
docs/versioned_docs/version-v1.2.0/model-sections/overview.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
--- | ||
sidebar_position: 2 | ||
--- | ||
|
||
# Overview | ||
|
||
The Overview section summarises your reference dataset and your last current dataset, helping you to quickly assess the differences between them and monitor the shape of the data. | ||
|
||
|
||
## Summary | ||
|
||
The **Summary** table provides a side-by-side comparison of key metrics between the current and reference datasets: | ||
|
||
- Number of variables | ||
- Number of observations | ||
- Missing Values | ||
- Missing Values (%) | ||
- Duplicated rows | ||
- Duplicated rows (%) | ||
- Number of numerical columns | ||
- Number of categorical columns | ||
- Number of Datetime columns | ||
|
||
![Alt text](/img/overview/overview-summary.png "Overview Summary") | ||
|
||
|
||
## Variables | ||
|
||
The **Variables** table lists all the columns flagged as `feature` or `ground truth`, hence its name. Each field is shown with its own type, and the `ground truth` is flagged accordingly. | ||
For the meaning of the `Field Type` column, see the *Hands-On Guide*. | ||
|
||
![Alt text](/img/overview/overview-variables.png "Overview Variables") | ||
|
||
|
||
## Output | ||
|
||
The **Output** table lists all the columns flagged as `probability` or `prediction`, and it must include all the fields produced by your model. Each field is shown with its own type, and the `probability` and the `prediction` are flagged accordingly. | ||
|
||
![Alt text](/img/overview/overview-output.png "Overview Output") | ||
|
36 changes: 36 additions & 0 deletions
docs/versioned_docs/version-v1.2.0/model-sections/reference.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
--- | ||
sidebar_position: 3 | ||
--- | ||
|
||
# Reference | ||
The Reference section stores all the information (statistics, model metrics and charts) related to the reference dataset. Throughout the platform, all the reference information is coloured grey. | ||
|
||
|
||
## Data Quality | ||
The **Data Quality** dashboard contains a descriptive analysis of the reference variables. It adapts to the `Model Type` and shows information such as: | ||
|
||
- Number of observations | ||
- Number of classes (not in regression task) | ||
- Ground Truth Distribution | ||
- Histograms for Numerical Features | ||
- Descriptive Statistics for Numerical Features (average, standard deviation, ranges, percentiles, missing values) | ||
- Bar Charts for Categorical Features | ||
- Descriptive Statistics for Categorical Features (missing values, distinct values, frequencies) | ||
|
||
![Alt text](/img/reference/reference-data-quality.png "Reference Data Quality") | ||
|
||
|
||
## Model Quality | ||
|
||
The **Model Quality** dashboard contains all the metrics used to evaluate the model performance. Many of them are computed by comparing the `prediction`/`probability` with the `ground truth`. The platform computes the appropriate metrics according to the chosen `Model Type`. | ||
|
||
![Alt text](/img/reference/reference-model-quality.png "Reference Model Quality") | ||
|
||
|
||
## Import | ||
|
||
The **Import** section lists the path where your reference CSV is stored. If you have a private AWS account, the file is saved in a dedicated S3 bucket; otherwise, it is saved locally with Minio (which uses the same syntax as S3). | ||
To see your reference dataset stored in Minio, visit [http://localhost:9091](http://localhost:9091). | ||
|
||
![Alt text](/img/reference/reference-import.png "Reference Import") | ||
|