Merge branch 'main' of github.com:radicalbit/radicalbit-ai-monitoring into features/ROS-496-introduce-text-generation-model-type
dtria91 committed Dec 10, 2024
2 parents d7673bc + 87488ee commit 0a1390b
Showing 33 changed files with 3,618 additions and 29 deletions.
2 changes: 1 addition & 1 deletion .github/release-manifest.json
@@ -1,3 +1,3 @@
{
-".": "1.1.0"
+".": "1.2.0"
}
19 changes: 19 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,24 @@
# Changelog

## [1.2.0](https://github.com/radicalbit/radicalbit-ai-monitoring/compare/v1.1.0...v1.2.0) (2024-12-10)


### Features

* change banner introducing "Book a demo" ([#197](https://github.com/radicalbit/radicalbit-ai-monitoring/issues/197)) ([7842eb1](https://github.com/radicalbit/radicalbit-ai-monitoring/commit/7842eb148459fed7b6af9768dc41a4ac9012e2d1))
* create a custom component and custom hooks to manage dark mode ([#199](https://github.com/radicalbit/radicalbit-ai-monitoring/issues/199)) ([05b93fb](https://github.com/radicalbit/radicalbit-ai-monitoring/commit/05b93fb3587900091cc1dcea37bb432e1e4abace))
* improve layout and accessibility ([#201](https://github.com/radicalbit/radicalbit-ai-monitoring/issues/201)) ([7016c0f](https://github.com/radicalbit/radicalbit-ai-monitoring/commit/7016c0fa9923740ea1f773440dfff14761be30be))
* ugrade design system to 1.4.0 ([#196](https://github.com/radicalbit/radicalbit-ai-monitoring/issues/196)) ([9358296](https://github.com/radicalbit/radicalbit-ai-monitoring/commit/9358296c0c8d78abe7b88229ff9fc2ec1f770254))
* **ui:** add dark mode ([#195](https://github.com/radicalbit/radicalbit-ai-monitoring/issues/195)) ([1c3bc31](https://github.com/radicalbit/radicalbit-ai-monitoring/commit/1c3bc316a9e75598b11ff731b19392e2de5f7ccd))
* **ui:** improve accessibility ([#198](https://github.com/radicalbit/radicalbit-ai-monitoring/issues/198)) ([8ea1e23](https://github.com/radicalbit/radicalbit-ai-monitoring/commit/8ea1e23d82ab5c3c00e4e01768d4dd87f9de6d4c))
* upgrade design-system ([#200](https://github.com/radicalbit/radicalbit-ai-monitoring/issues/200)) ([6f582e6](https://github.com/radicalbit/radicalbit-ai-monitoring/commit/6f582e6680c26e5806f19c64ee827684026ab57c))


### Bug Fixes

* percentage fix ([#206](https://github.com/radicalbit/radicalbit-ai-monitoring/issues/206)) ([523d197](https://github.com/radicalbit/radicalbit-ai-monitoring/commit/523d1974dbd513bb85107f50de2d5d6f3e5e5304))
* **ui:** improve charts header legend ([#203](https://github.com/radicalbit/radicalbit-ai-monitoring/issues/203)) ([5c3ca24](https://github.com/radicalbit/radicalbit-ai-monitoring/commit/5c3ca24fc539ab434a36ec3dc3492aace6ca2d0b))

## [1.1.0](https://github.com/radicalbit/radicalbit-ai-monitoring/compare/v1.0.1...v1.1.0) (2024-10-31)


2 changes: 1 addition & 1 deletion api/pyproject.toml
@@ -1,7 +1,7 @@
[tool.poetry]
name = "radicalbit-ai-monitoring"
# x-release-please-start-version
-version = "1.1.0"
+version = "1.2.0"
# x-release-please-end
description = "radicalbit platform"
authors = ["Radicalbit"]
82 changes: 82 additions & 0 deletions docs/versioned_docs/version-v1.2.0/all-metrics.md
@@ -0,0 +1,82 @@
---
sidebar_position: 5
---

# All metrics
List of all available Metrics and Charts.

## CSV summary

* Number of variables
* Number of observations
* Number of missing values
* Percentage of missing values
* Number of duplicated rows
* Percentage of duplicated rows
* Number of **numerical** variables
* Number of **categorical** variables
* Number of **datetime** variables

Summary of all variable names and types (float, int, string, datetime).
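
As a rough illustration of how a few of these summary figures could be derived from a CSV file, here is a minimal Python sketch. It is not the platform's implementation (which runs as Spark jobs); the function name `csv_summary`, the result keys, and the rule that an empty string counts as a missing value are all assumptions made for the example.

```python
import csv
import io


def csv_summary(text: str) -> dict:
    """Compute a few summary figures for CSV text that has a header row."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    n_cells = len(header) * len(data)
    # Assumption: an empty string marks a missing value.
    missing = sum(1 for row in data for cell in row if cell == "")
    # Every occurrence of a row after its first one counts as a duplicate.
    duplicated = len(data) - len({tuple(row) for row in data})
    return {
        "n_variables": len(header),
        "n_observations": len(data),
        "missing_cells": missing,
        "missing_cells_perc": 100 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicated,
        "duplicate_rows_perc": 100 * duplicated / len(data) if data else 0.0,
    }


sample = "age,city\n31,Rome\n,Milan\n31,Rome\n"
print(csv_summary(sample))
```

Counting variables by type (numerical, categorical, datetime) would additionally need a type-inference step, which is left out here.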

## Data quality

* **Numerical** variables
  * Average
  * Standard deviation
  * Minimum
  * Maximum
  * Percentile 25%
  * Median
  * Percentile 75%
  * Number of missing values
  * Histogram with 10 bins
* **Categorical** variables
  * Number of missing values
  * Percentage of missing values
  * Number of distinct values
  * For each distinct value:
    * count of observations
    * percentage of observations
* **Ground truth**
  * if categorical, i.e. for a classification model: bar plot *(for both reference and current for an easy comparison)*
  * if numerical, i.e. for a regression model: histogram with 10 bins *(for both reference and current for an easy comparison)*
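
The numerical statistics above can be sketched with the Python standard library. This is only an illustration under stated assumptions: the helper name `describe_numerical`, the "inclusive" quantile method and the equal-width binning are choices made for the example, not details confirmed by the platform.

```python
import statistics


def describe_numerical(values: list[float], bins: int = 10) -> dict:
    """Descriptive statistics plus an equal-width histogram."""
    lo, hi = min(values), max(values)
    # Quartiles; the interpolation method is an assumption of this sketch.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant data
    counts = [0] * bins
    for v in values:
        # The maximum value goes into the last bin instead of overflowing.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return {
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),
        "min": lo,
        "max": hi,
        "p25": q1,
        "median": median,
        "p75": q3,
        "histogram": counts,
    }


stats = describe_numerical([float(v) for v in range(1, 21)])
print(stats["median"], stats["histogram"])
```

Missing values are assumed to have been filtered out (and counted) before calling the helper.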

## Model quality

* Classification model
  * Number of classes
  * Accuracy *(for both reference and current for an easy comparison)*
  * Line chart of accuracy over time
  * Confusion matrix
  * Log loss, *only for binary classification at the moment*
  * Line chart of log loss over time, *only for binary classification at the moment*
  * For each class:
    * Precision *(for both reference and current for an easy comparison)*
    * Recall *(for both reference and current for an easy comparison)*
    * F1 score *(for both reference and current for an easy comparison)*
    * True Positive Rate *(for both reference and current for an easy comparison)*
    * False Positive Rate *(for both reference and current for an easy comparison)*
    * Support *(for both reference and current for an easy comparison)*
* Regression model
  * Mean squared error *(for both reference and current for an easy comparison)*
  * Root mean squared error *(for both reference and current for an easy comparison)*
  * Mean absolute error *(for both reference and current for an easy comparison)*
  * Mean absolute percentage error *(for both reference and current for an easy comparison)*
  * R-squared *(for both reference and current for an easy comparison)*
  * Adjusted R-squared *(for both reference and current for an easy comparison)*
  * Variance *(for both reference and current for an easy comparison)*
  * Line charts for all of the above over time
  * Residual analysis:
    * Correlation prediction/ground_truth
    * Residuals plot, i.e. a scatter plot of standardised residuals against predictions
    * Scatter plot for predictions vs ground truth and linear regression line
    * Histogram of the residuals
    * Kolmogorov-Smirnov test of normality for residuals
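
For readers who want to see the regression metrics spelled out, the following Python sketch uses the textbook definitions. The function name `regression_metrics` and the `n_features` argument are hypothetical, the variance metric is left out because the list does not say which variance is meant, and nothing here is taken from the platform's own code.

```python
import math


def regression_metrics(y_true: list[float], y_pred: list[float], n_features: int) -> dict:
    """Textbook regression metrics computed from ground truth and predictions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    # MAPE is undefined for zero ground-truth values; assumed non-zero here.
    mape = 100 * sum(abs(r / t) for r, t in zip(residuals, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - (mse * n) / ss_tot
    # Adjusted R-squared penalises the number of features used by the model.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "mape": mape,
        "r2": r2,
        "adj_r2": adj_r2,
    }


print(regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0], n_features=1))
```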

## Data Drift

Data drift is computed for every feature, using a different algorithm depending on the data type (float, int or categorical). The following algorithms are currently used, and others will be added in the future:
* [Chi-Square Test](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test)
* [Two-Sample Kolmogorov-Smirnov](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Two-sample_Kolmogorov%E2%80%93Smirnov_test)
* [Population Stability Index](https://scholarworks.wmich.edu/dissertations/3208/)
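
To make the last of these concrete, here is a minimal Python sketch of the Population Stability Index using its usual definition, the sum over bins of `(current - reference) * ln(current / reference)` on bin proportions. The binning on the reference range, the `eps` floor for empty bins and the function name `psi` are assumptions of this sketch; the platform's binning and thresholds may differ.

```python
import math


def psi(reference: list[float], current: list[float], bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between two numerical samples."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant data

    def proportions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range are clamped to the edge bins.
            counts[max(0, min(idx, bins - 1))] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference = [float(v) for v in range(100)]
shifted = [v + 30.0 for v in reference]
print(psi(reference, reference), psi(reference, shifted))
```

Identical samples give a PSI of zero, and the value grows as the two distributions move apart.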
27 changes: 27 additions & 0 deletions docs/versioned_docs/version-v1.2.0/architecture.md
@@ -0,0 +1,27 @@
---
sidebar_position: 6
---

# Architecture

In this section we explore the architecture of the Radicalbit AI platform.
The image below shows all the components of the platform:

![Alt text](/img/architecture/architecture.png "Architecture")

## API

The API is the core of the platform: it exposes all functionality through REST APIs.
It requires a PostgreSQL database to store data and a Kubernetes cluster to run the Spark jobs that evaluate metrics.
Dataset files are kept in distributed storage.
The REST APIs can be used through the user interface or through the provided Python SDK.

## UI

A UI is provided as a human-friendly interface to the REST APIs.
It covers all the implemented APIs, from model creation through to the visualization of all metrics.

## SDK

To interact with the API programmatically, a [_Python SDK_](python-sdk.md) is provided.
The SDK implements all the functionality exposed by the REST API.
33 changes: 33 additions & 0 deletions docs/versioned_docs/version-v1.2.0/index.md
@@ -0,0 +1,33 @@
---
sidebar_position: 1
---

# Introduction
Let's discover the **Radicalbit AI Monitoring Platform** in less than 5 minutes.

## Welcome!
This platform provides a comprehensive solution for monitoring and observing your Artificial Intelligence (AI) models in production.

### Why Monitor AI Models?
While models often perform well during development and validation, their effectiveness can degrade over time in production due to factors such as data shifts or concept drift. The Radicalbit AI Monitoring Platform helps you proactively identify and address potential performance issues.

### Key Functionalities
The platform provides comprehensive monitoring capabilities to ensure optimal performance of your AI models in production. It analyses both your reference dataset (used for pre-production validation) and the current datasets in use, allowing you to keep the following under control:
* **Data Quality:** evaluate the quality of your data, as high-quality data is crucial for maintaining optimal model performance. The platform analyses both numerical and categorical features in your dataset to provide insights into
  * *data distribution*
  * *missing values*
  * *target variable distribution* (for supervised learning).

* **Model Quality Monitoring:** the platform provides a comprehensive suite of metrics, currently designed for classification and regression models. \
  For classification these metrics include:
  * *Accuracy, Precision, Recall, and F1:* These metrics provide different perspectives on how well your model is classifying positive and negative cases.
  * *False/True Negative/Positive Rates and Confusion Matrix:* These offer a detailed breakdown of your model's classification performance, including the number of correctly and incorrectly classified instances.
  * *AUC-ROC and PR AUC:* These summarise the ROC and precision-recall curves, which help visualize your model's ability to discriminate between positive and negative classes.

  For regression these metrics include:
  * *Mean Absolute Error, Mean Squared Error, Root Mean Squared Error, R²:* These metrics provide different perspectives on how well your model is predicting a numerical value.
  * *Residual Analysis:* This offers a detailed breakdown of your model's performance, comparing predictions with ground truth and predictions with residuals, i.e. the difference between predictions and ground truth.
* **Model Drift Detection:** analyse model drift, which occurs when the underlying data distribution changes over time and can affect model performance.
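
The classification metrics listed above all derive from the same per-class counts of true and false positives and negatives. The Python sketch below shows that relationship using the standard one-vs-rest definitions; the function name `per_class_metrics` and the choice to return `0.0` when a denominator is zero are assumptions of the example, not behaviour documented for the platform.

```python
def per_class_metrics(y_true: list, y_pred: list) -> dict:
    """One-vs-rest precision, recall, F1, false positive rate and support."""
    pairs = list(zip(y_true, y_pred))
    result = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        tn = len(pairs) - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0  # also the true positive rate
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        result[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
            "support": tp + fn,
        }
    return result


truth = ["cat", "cat", "dog", "dog", "dog", "bird"]
predicted = ["cat", "dog", "dog", "dog", "cat", "bird"]
for label, values in per_class_metrics(truth, predicted).items():
    print(label, values)
```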

### Current Scope and Future Plans
This version focuses on classification, both binary and multiclass, and regression models. Support for additional model types is planned for future releases.
@@ -0,0 +1,8 @@
{
"label": "Model sections",
"position": 4,
"link": {
"type": "generated-index",
"description": "Each created model includes three main sections — Overview, Reference, and Current — as well as a summary section called the Launchpad. This document provides an in-depth explanation of each section."
}
}
54 changes: 54 additions & 0 deletions docs/versioned_docs/version-v1.2.0/model-sections/current.md
@@ -0,0 +1,54 @@
---
sidebar_position: 4
---

# Current
The Current section stores all the information (statistics, model metrics and charts) related to the current dataset, placed side-by-side with that of the reference dataset. The objective is to highlight every difference between the data over time. Throughout the platform, all the current information is shown in blue or in shades of blue.

> NOTE: this section always shows the most recently uploaded current dataset. If you need the analysis of an earlier current dataset, you can browse them in the `Import` section.

## Data Quality
The **Data Quality** dashboard contains a descriptive analysis of the current variables (blue) placed side-by-side with the reference ones (grey). It adapts to the `Model Type` and shows information such as:

- Number of observations
- Number of classes (not in regression task)
- Ground Truth Distribution
- Histograms for Numerical Features
- Descriptive Statistics for Numerical Features (average, standard deviation, ranges, percentiles, missing values)
- Bar Charts for Categorical Features
- Descriptive Statistics for Categorical Features (missing values, distinct values, frequencies)

![Alt text](/img/current/current-data-quality.png "Current Data Quality")


## Model Quality

The **Model Quality** dashboard contains all the metrics used to evaluate the model performance in the current dataset and compare these values to the reference. Many of them are computed by comparing the `prediction`/`probability` with the `ground truth`. The platform computes the appropriate metrics for the chosen `Model Type`. \
Unlike in the Reference section, the metrics here are computed over time, using the flagged `timestamp` columns and the `granularity` parameter chosen when the model was created.
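
To illustrate what "computed over time" means, the sketch below buckets rows by timestamp and computes accuracy per bucket. It is a simplified Python illustration: the function name `accuracy_over_time`, the row layout and the granularity labels `HOUR`, `DAY` and `MONTH` are hypothetical and may not match the values accepted by the platform.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical granularity labels mapped to a bucket-key format.
BUCKET_FORMATS = {"HOUR": "%Y-%m-%d %H:00", "DAY": "%Y-%m-%d", "MONTH": "%Y-%m"}


def accuracy_over_time(rows, granularity: str = "DAY") -> dict:
    """rows: iterable of (iso_timestamp, ground_truth, prediction) tuples."""
    fmt = BUCKET_FORMATS[granularity]
    buckets = defaultdict(lambda: [0, 0])  # bucket key -> [correct, total]
    for timestamp, truth, prediction in rows:
        key = datetime.fromisoformat(timestamp).strftime(fmt)
        buckets[key][0] += truth == prediction
        buckets[key][1] += 1
    # One accuracy value per time bucket, in chronological order.
    return {key: correct / total for key, (correct, total) in sorted(buckets.items())}


rows = [
    ("2024-12-01T09:00:00", 1, 1),
    ("2024-12-01T15:30:00", 0, 1),
    ("2024-12-02T10:00:00", 1, 1),
    ("2024-12-02T11:00:00", 0, 0),
]
print(accuracy_over_time(rows, "DAY"))
```

Each bucket yields one point of the line chart; a coarser granularity produces fewer, smoother points.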

![Alt text](/img/current/current-model-quality.png "Current Model Quality")


## Data Drift

The **Data Drift** section contains the outcome of the drift detectors executed on each variable.
Depending on the field type (categorical or numerical), a specific drift test is applied:

- Categorical: **Chi-Square Test**
- Numerical: **2-Samples-KS Test** (for `float` variables), **PSI** (for `int` variables)

If the dot next to the variable name is red, drift has been detected, and the related chart (and statistical description) can be seen in the `Current/Data Quality` section.
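
As a rough guide to what the first two tests measure, the Python sketch below computes their test statistics from a reference sample and a current sample. It deliberately stops at the statistic: turning it into a p-value and a drift decision depends on thresholds that are not described here. The function names are hypothetical and this is not the platform's code.

```python
from bisect import bisect_right
from collections import Counter


def chi_square_statistic(reference: list, current: list) -> float:
    """Pearson chi-square statistic on the 2 x k table of category counts."""
    ref_counts, cur_counts = Counter(reference), Counter(current)
    total = len(reference) + len(current)
    statistic = 0.0
    for category in set(ref_counts) | set(cur_counts):
        column_total = ref_counts[category] + cur_counts[category]
        for counts, row_total in ((ref_counts, len(reference)), (cur_counts, len(current))):
            expected = row_total * column_total / total
            statistic += (counts[category] - expected) ** 2 / expected
    return statistic


def ks_statistic(reference: list[float], current: list[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the ECDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


print(chi_square_statistic(["a", "a", "b"], ["a", "b", "b"]))
print(ks_statistic([1.0, 2.0, 3.0], [2.5, 3.5, 4.5]))
```

Both statistics are zero when the two samples have the same distribution of values and grow as the samples diverge.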

![Alt text](/img/current/current-data-drift.png "Current Data Drift")


## Import

The **Import** section lists the paths where your current CSVs are stored. If you have a private AWS account, the files are saved in a dedicated S3 bucket; otherwise, they are saved locally with MinIO (which is S3-compatible).
To see your current datasets stored in Minio, visit the address [http://localhost:9091](http://localhost:9091).

Here, you can browse between all the current datasets you have uploaded over time.

![Alt text](/img/current/current-import.png "Current Import")

25 changes: 25 additions & 0 deletions docs/versioned_docs/version-v1.2.0/model-sections/launchpad.md
@@ -0,0 +1,25 @@
---
sidebar_position: 1
---

# Launchpad
The launchpad provides a dedicated space for summarising existing models.

![Alt text](/img/launchpad/launchpad.png "Launchpad")

It offers a quick overview of key aspects:

- **Data Quality Percentage:** This metric reflects the proportion of columns without anomalies across the current datasets. Anomalies are identified using the Interquartile Range (IQR) method, and the final percentage displayed is the average of each Current’s anomaly-free ratio.
- **Model Quality Percentage:** This metric is calculated using a Bootstrap Test, based on the historical metrics for the current dataset (the same ones shown on the Model Quality page, grouped by timestamp). By grouping metrics over time (e.g., Accuracy), we generate multiple instances of the same metric, forming a statistical population. The Bootstrap Test then compares this population with the metric calculated for the Reference dataset, checking whether the reference value falls outside the 95% confidence interval. If so, the metric is flagged as “significantly different” between the Reference and Current datasets. This process is repeated for each model metric, and the percentage of metrics that pass the test is returned.
- **Drift Detection Percentage:** This percentage represents the ratio of features without drift over the total number of features.
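
The two statistical ideas behind these percentages can be sketched in a few lines of Python. This is an interpretation of the description above, not the platform's code: the helper names, the `1.5 * IQR` fences, the percentile bootstrap on the mean and the number of resamples are all assumptions.

```python
import random
import statistics


def has_iqr_outliers(values: list[float], k: float = 1.5) -> bool:
    """True if any value lies outside the [Q1 - k*IQR, Q3 + k*IQR] fences."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return any(v < q1 - k * iqr or v > q3 + k * iqr for v in values)


def bootstrap_interval(samples: list[float], n_resamples: int = 2000,
                       level: float = 0.95, seed: int = 0) -> tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of the samples."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(samples, k=len(samples)))
        for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    return means[int(tail * n_resamples)], means[int((1 - tail) * n_resamples) - 1]


def significantly_different(reference_value: float, current_values: list[float]) -> bool:
    """Flag the metric when the reference value falls outside the interval."""
    lower, upper = bootstrap_interval(current_values)
    return not (lower <= reference_value <= upper)


accuracy_over_time = [0.90, 0.91, 0.89, 0.92, 0.90, 0.91]
print(significantly_different(0.95, accuracy_over_time))
print(has_iqr_outliers([1.0, 2.0, 3.0, 4.0, 100.0]))
```

In this reading, the Model Quality Percentage is the share of metrics for which `significantly_different` returns `False`, and the Data Quality Percentage is the share of columns for which `has_iqr_outliers` returns `False`.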

> NOTE: if a metric cannot be computed, the placeholder `--` will be used.

The general **pie chart** represents the averages of each computed percentage across all models.


Additional information appears on the right side:
- **Work in Progress:** This section provides real-time updates on model activities, including ongoing and failed jobs.
- **Alerts:** Here, you’ll find any alerts triggered by the percentages above. When an issue lowers a metric from its ideal 100%, the alert identifies the affected model and component. Clicking the alert takes you to the relevant page for more details.


40 changes: 40 additions & 0 deletions docs/versioned_docs/version-v1.2.0/model-sections/overview.md
@@ -0,0 +1,40 @@
---
sidebar_position: 2
---

# Overview

The Overview section recaps your reference dataset and your latest current dataset, helping you quickly assess the differences and monitor the shape of the data.


## Summary

The **Summary** table provides a side-by-side comparison of key metrics between the current and reference datasets:

- Number of variables
- Number of observations
- Missing Values
- Missing Values (%)
- Duplicated rows
- Duplicated rows (%)
- Number of numerical columns
- Number of categorical columns
- Number of Datetime columns

![Alt text](/img/overview/overview-summary.png "Overview Summary")


## Variables

The **Variables** table lists all the columns flagged as `feature` or `ground truth`, hence its name. Each field is shown with its own type, and the `ground truth` is flagged accordingly.
For the meaning of the `Field Type` column, see the *Hands-On Guide*.

![Alt text](/img/overview/overview-variables.png "Overview Variables")


## Output

The **Output** table lists all the columns flagged as `probability` or `prediction`; it must include every field produced by your model. Each field is shown with its own type, and the `probability` and `prediction` are flagged accordingly.

![Alt text](/img/overview/overview-output.png "Overview Output")

36 changes: 36 additions & 0 deletions docs/versioned_docs/version-v1.2.0/model-sections/reference.md
@@ -0,0 +1,36 @@
---
sidebar_position: 3
---

# Reference
The Reference section stores all the information (statistics, model metrics and charts) related to the reference dataset. Throughout the platform, all the reference information is coloured grey.


## Data Quality
The **Data Quality** dashboard contains a descriptive analysis of the reference variables. It adapts to the `Model Type` and shows information such as:

- Number of observations
- Number of classes (not in regression task)
- Ground Truth Distribution
- Histograms for Numerical Features
- Descriptive Statistics for Numerical Features (average, standard deviation, ranges, percentiles, missing values)
- Bar Charts for Categorical Features
- Descriptive Statistics for Categorical Features (missing values, distinct values, frequencies)

![Alt text](/img/reference/reference-data-quality.png "Reference Data Quality")


## Model Quality

The **Model Quality** dashboard contains all the metrics used to evaluate the model performance. Many of them are computed by comparing the `prediction`/`probability` with the `ground truth`. The platform computes the appropriate metrics for the chosen `Model Type`.

![Alt text](/img/reference/reference-model-quality.png "Reference Model Quality")


## Import

The **Import** section lists the path where your reference CSV is stored. If you have a private AWS account, the file is saved in a dedicated S3 bucket; otherwise, it is saved locally with MinIO (which is S3-compatible).
To see your reference dataset stored in Minio, visit the address [http://localhost:9091](http://localhost:9091).

![Alt text](/img/reference/reference-import.png "Reference Import")
