Jumpstart Library

This repo contains documentation, source code, workshop contents, howtos… for different base techniques, aka "patterns", that can used in Data Science, Data Engineering, AI/ML, Analytics…

The goal is to provide concrete elements that people will be able to learn from, reproduce, adapt and modify to fit their own needs and requirements.

The library currently features 15 different patterns, as well as 2 demos that are full implementation of scenarios where several patterns are combined.

Patterns

In this folder, you will find the following patterns

Pattern 1
Pattern 2
Pattern 3

Demo 1 - XRay analysis automated pipeline

In this demo, we show how to implement an automated data pipeline for chest Xray analysis:

Ingest chest Xrays into an object store based on Ceph.
The Object store sends notifications to a Kafka topic.
A KNative Eventing Listener to the topic triggers a KNative Serving function.
An ML-trained model running in a container makes a risk of Pneumonia assessment for incoming images.
A Grafana dashboard displays the pipeline in real time, along with images incoming, processed and anonymized, as well as full metrics.

See this demo

Demo 2 - Smart City scenario

In this demo, we show how to implement this scenario:

Using a trained ML model, licence plates are recognized at toll location.
Data (plate number, location, timestamp) is send from toll locations (edge) to the core using Kafka mirroring to handle communication issues and recovery.
Incoming data is screened real-time to trigger alerts for wanted vehicles (Amber Alert).
Data is aggregated and stored into object storage.
A central database contains other information coming from licence registry system: car model, color,…
Data analysis leveraging Presto and Superset is done against stored data (database and object storage).

See this demo

Demo 3 - Industrial Condition Based Monitoring

In this demo, we show how to implement industrial condition based monitoring using hybrid cloud ML approach. Machine inference-based anomaly detection on metric time-series sensor data at the edge, with a central data lake and ML model retraining. It also shows how hybrid deployments (cluster at the edge and in the cloud) can be managed, and how the CI/CD pipelines and Model Training/Execution flows can be implemented.

See this demo

Contributing

We recommend to use a virtual python environment to develop for this repository. To create such an environment use

python3 -m venv .venv
source .venv/bin/activate

Then install the dependencies with:

pip -r requirements.txt

Note	This file is generated with `pip3 freeze > requirements.txt`

Then install the pre-commit hook with

pre-commit install

Now you’re ready to contribute code.

Name		Name	Last commit message	Last commit date
Latest commit History 312 Commits
demo1-xray-pipeline		demo1-xray-pipeline
demo2-smart-city		demo2-smart-city
demo3-industrial-condition-monitoring		demo3-industrial-condition-monitoring
patterns		patterns
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.yamllint		.yamllint
LICENSE		LICENSE
README.adoc		README.adoc
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Jumpstart Library

Patterns

Demo 1 - XRay analysis automated pipeline

Demo 2 - Smart City scenario

Demo 3 - Industrial Condition Based Monitoring

Contributing

About

Releases

Packages

Languages

License

DanielFroehlich/jumpstart-library

Folders and files

Latest commit

History

Repository files navigation

Jumpstart Library

Patterns

Demo 1 - XRay analysis automated pipeline

Demo 2 - Smart City scenario

Demo 3 - Industrial Condition Based Monitoring

Contributing

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages