diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
index 9c1ae50..6ea2e40 100644
--- a/.github/workflows/docs.yml
+++ b/.github/workflows/docs.yml
@@ -15,7 +15,7 @@ jobs:
     - name: Install Python
       uses: actions/setup-python@v1
       with:
-        python-version: 3.12.1
+        python-version: 3.11.9
 
     - name: Install packages
       run: |
diff --git a/notebooks/05_covid_anomaly_detection.ipynb b/notebooks/05_covid_anomaly_detection.ipynb
index f8f1156..a7091e7 100644
--- a/notebooks/05_covid_anomaly_detection.ipynb
+++ b/notebooks/05_covid_anomaly_detection.ipynb
@@ -1,5 +1,27 @@
 {
  "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# 5. Training an Anomaly Detection Model for Covid Anomaly Detection\n",
+    "\n",
+    "In this tutorial, we will train an anomaly detection model using a simple [LSTM-AutoEncoder model](https://www.medrxiv.org/content/10.1101/2021.01.08.21249474v1).\n",
+    "Data can be obtained from [this link](https://iscteiul365-my.sharepoint.com/:u:/g/personal/oonia_iscte-iul_pt/ERZLm1ruUNpMqkSwjpqhE9wB_7loVWAC4yZWuIH2RKGOlQ?e=kD4HlI). This is a processed version of data from the original Stanford dataset (Phase 2). The overall pre-processing pipeline is illustrated in the figure below.\n",
+    "\n",
+    "![preprocessing](stanford_data_processing.png)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Data was acquired from different sources (Garmin, FitBit, Apple Watch) and pre-processed into a common format. In this form, the data has two columns: heart rate and number of user steps in the last minute. \n",
+    "Then the processing pipeline was applied to the data. The pipeline is composed of the following steps:\n",
+    "1. Once the data was standardized, the resting heart rate was extracted (``Resting Heart Rate Extractor`` in the figure). This process takes as input `min_minutes_rest`, the number of minutes the user has to be at rest for the heart rate to be considered resting. It looks at the user steps and, when the step count is 0 for `min_minutes_rest` minutes, the heart rate is considered resting. At the end of this process, we have a new dataframe with the date and the resting heart rate of the last minute.\n",
+    "2. The second step is adding labels."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 1,
diff --git a/notebooks/stanford_data_processing.png b/notebooks/stanford_data_processing.png
new file mode 100644
index 0000000..aa37ff9
Binary files /dev/null and b/notebooks/stanford_data_processing.png differ