Flink AI Flow is an open-source framework that bridges big data and AI. It manages the entire machine learning project lifecycle as a unified workflow, including feature engineering, model training, model evaluation, model serving, model inference, and monitoring. Throughout the entire workflow, Flink is used as the general-purpose computing engine.
In addition to orchestrating groups of batch jobs, Flink AI Flow supports workflows that contain streaming jobs by leveraging an event-based scheduler (an enhanced version of Airflow). This capability is useful for complex real-time machine learning systems, as well as for real-time workflows in general.
You can use Flink AI Flow to do the following:
- Define a machine learning workflow that includes batch and stream jobs.
- Manage the metadata generated by the machine learning workflow: datasets, models, artifacts, metrics, jobs, etc.
- Run the machine learning workflow.
- Publish and subscribe to events.
To support online machine learning scenarios, Flink AI Flow introduces a notification service and an event-based scheduler. Its current components are:
- SDK: Defines how to build a machine learning workflow and includes the Flink AI Flow API.
- Notification Service: Provides event listening and notification functions.
- Meta Service: Stores the metadata of the machine learning workflow.
- Event-Based Scheduler: A scheduler that triggers jobs when specific events occur.
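The interplay between the Notification Service and the Event-Based Scheduler can be sketched as follows. This is a minimal, self-contained illustration in plain Python, not Flink AI Flow's actual API: the names `Event`, `NotificationService`, `EventBasedScheduler`, `trigger_on`, and `run_demo`, as well as the event keys and job names, are hypothetical and chosen only for this example. It shows why events suit streaming jobs: a long-running job can publish an event without ever finishing, and a downstream job is started in response.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class Event:
    """A key/value notification, e.g. key="model_trained", value="v1"."""
    key: str
    value: str


class NotificationService:
    """Hypothetical in-memory stand-in for a publish/subscribe service."""

    def __init__(self) -> None:
        self._listeners: DefaultDict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, key: str, listener: Callable[[Event], None]) -> None:
        self._listeners[key].append(listener)

    def publish(self, event: Event) -> None:
        # Deliver the event to every listener registered for its key.
        for listener in list(self._listeners[event.key]):
            listener(event)


class EventBasedScheduler:
    """Hypothetical scheduler that starts a job when a matching event arrives."""

    def __init__(self, service: NotificationService) -> None:
        self._service = service
        self.started: List[str] = []

    def trigger_on(self, key: str, job_name: str) -> None:
        def on_event(event: Event) -> None:
            # A real scheduler would submit the job here; this sketch only records it.
            self.started.append(f"{job_name}({event.value})")

        self._service.subscribe(key, on_event)


def run_demo() -> List[str]:
    service = NotificationService()
    scheduler = EventBasedScheduler(service)
    scheduler.trigger_on("model_trained", "evaluate_model")
    scheduler.trigger_on("model_validated", "deploy_model")

    # A streaming training job can publish events while it keeps running.
    service.publish(Event("model_trained", "v1"))
    service.publish(Event("model_validated", "v1"))
    service.publish(Event("model_trained", "v2"))
    return scheduler.started


if __name__ == "__main__":
    print(run_demo())
```

In this sketch each published event starts the job registered for its key, so the upstream job never has to reach a terminal state before downstream work begins. A status-based scheduler, by contrast, would wait for the upstream job to finish, which a streaming job never does.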
To get started with Flink AI Flow, follow the QuickStart guide.
Please see the API Documentation to find the interfaces supported by Flink AI Flow.
Please see the Design Documentation to learn about the design principles of Flink AI Flow.
For simple usage examples of Flink AI Flow, please see the Simple Examples.
We happily welcome contributions to Flink AI Flow. Please see our contribution guide for details.