From 4ccfccb049683e72cf58aeac48916073b5f8de4c Mon Sep 17 00:00:00 2001
From: "jason.lai"
Date: Wed, 3 Jan 2024 01:19:09 +0800
Subject: [PATCH 1/8] docs: add FlytePlugins architecture image

- Add an image of the FlytePlugins architecture to the end of the file

Signed-off-by: jason.lai
---
 .../component_architecture/flytepropeller_architecture.rst | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/docs/concepts/component_architecture/flytepropeller_architecture.rst b/docs/concepts/component_architecture/flytepropeller_architecture.rst
index a04f6dbe4d..3cbee70a4f 100644
--- a/docs/concepts/component_architecture/flytepropeller_architecture.rst
+++ b/docs/concepts/component_architecture/flytepropeller_architecture.rst
@@ -79,3 +79,8 @@ FlyteAdmin Events
 -----------------

 It should be noted that the WorkflowExecutor, NodeExecutor, and TaskHandlers send events to FlyteAdmin, enabling it to track workflows in near real-time.
+
+FlytePlugins
+-----------------
+
+.. image:: https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/concepts/architecture/flytepropeller_plugins_architecture.png
\ No newline at end of file

From 3c313e597bc518262b700a59b40ebe058ce9ea9b Mon Sep 17 00:00:00 2001
From: Future Outlier
Date: Wed, 3 Jan 2024 14:40:44 +0800
Subject: [PATCH 2/8] future outlier text revision

Signed-off-by: Future Outlier
---
 .../flytepropeller_architecture.rst | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/docs/concepts/component_architecture/flytepropeller_architecture.rst b/docs/concepts/component_architecture/flytepropeller_architecture.rst
index 3cbee70a4f..5c79395a73 100644
--- a/docs/concepts/component_architecture/flytepropeller_architecture.rst
+++ b/docs/concepts/component_architecture/flytepropeller_architecture.rst
@@ -19,7 +19,7 @@ A Flyte :ref:`workflow ` is represented as a Directed Acycli
 - ``DynamicNodes`` add nodes to the DAG.
 - ``WorkflowNodes`` allow embedding workflows within each other.

-FlytePropeller is responsible for scheduling and tracking execution of Flyte workflows. It is implemented using a K8s controller and adheres to the established K8s design principles. In this scheme, resources are periodically evaluated and the goal is to transition from the observed state to a requested state.
+FlytePropeller is responsible for scheduling and tracking execution of Flyte workflows. It is implemented using K8s operator and adheres to the established K8s design principles. In this scheme, resources are periodically evaluated and the goal is to transition from the observed state to a requested state.

 In our case, workflows are the resources and they are iteratively evaluated to transition from the current state to success. During each loop, the current workflow state is established as the phase of workflow nodes and subsequent tasks, and FlytePropeller performs operations to transition this state to success. The operations may include scheduling (or rescheduling) node executions, evaluating dynamic or branch nodes, etc. These design decisions ensure that FlytePropeller can scale to manage a large number of concurrent workflows without performance degradation.

@@ -69,7 +69,12 @@ NodeHandlers

 FlytePropeller includes a robust collection of NodeHandlers to support diverse evaluation of the workflow DAG:

-* **TaskHandler (Plugins)**: These are responsible for executing plugin specific tasks. This may include contacting FlyteAdmin to schedule K8s pod to perform work, calling a web API to begin/track evaluation, and much more. The plugin paradigm exposes an extensible interface for adding functionality to Flyte workflows.
+* **TaskHandler (Plugins)**: These are responsible for executing tasks in the Flyte cluster. There are mainly 3 kinds of tasks for the task handler:
+ 1. **Pod Task**: Create a pod in the Kubernetes cluster, execute the task, and then delete the pod.
+ 2. **K8s Operator Backend Plugin**: Install a specific Kubernetes Operator (e.g., Spark, Ray, and Kubeflow) in the cluster, create pods by the Kubernetes Operator, execute the task, and then delete the pods.
+ 3. **Web API Task**: Send REST/gRPC requests to a server and return the response.
+ Note: The Web API Task will not start a pod.
+These are responsible for executing plugin specific tasks. This may include contacting FlyteAdmin to schedule K8s pod to perform work, calling a web API to begin/track evaluation, and much more. The plugin paradigm exposes an extensible interface for adding functionality to Flyte workflows.
 * **DynamicHandler**: Flyte workflow CRs are initialized using a DAG compiled during the registration process. The numerous benefits of this approach are beyond the scope of this document. However, there are situations where the complete DAG is unknown at compile time. For example, when executing a task on each value of an input list. Using Dynamic nodes, a new DAG subgraph may be dynamically compiled during runtime and linked to the existing FlyteWorkflow CR.
 * **WorkflowHandler**: This handler allows embedding workflows within another workflow definition. The API exposes this functionality using either (1) an inline execution, where the workflow function is invoked directly resulting in a single FlyteWorkflow CR with an appended sub-workflow, or (2) a launch plan, which uses a TODO to create a separate sub-FlyteWorkflow CR whose execution state is linked to the parent FlyteWorkflow CR.
 * **BranchHandler**: The branch handler allows the DAG to follow a specific control path based on input (or computed) values.

From d7a3cb11138f7628ee31162de5cdc67242947a7d Mon Sep 17 00:00:00 2001
From: Future Outlier
Date: Wed, 3 Jan 2024 14:48:33 +0800
Subject: [PATCH 3/8] nit

Signed-off-by: Future Outlier
---
 .../component_architecture/flytepropeller_architecture.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/docs/concepts/component_architecture/flytepropeller_architecture.rst b/docs/concepts/component_architecture/flytepropeller_architecture.rst
index 5c79395a73..7013b990b3 100644
--- a/docs/concepts/component_architecture/flytepropeller_architecture.rst
+++ b/docs/concepts/component_architecture/flytepropeller_architecture.rst
@@ -74,7 +74,6 @@ FlytePropeller includes a robust collection of NodeHandlers to support diverse e
  2. **K8s Operator Backend Plugin**: Install a specific Kubernetes Operator (e.g., Spark, Ray, and Kubeflow) in the cluster, create pods by the Kubernetes Operator, execute the task, and then delete the pods.
  3. **Web API Task**: Send REST/gRPC requests to a server and return the response.
  Note: The Web API Task will not start a pod.
-These are responsible for executing plugin specific tasks. This may include contacting FlyteAdmin to schedule K8s pod to perform work, calling a web API to begin/track evaluation, and much more. The plugin paradigm exposes an extensible interface for adding functionality to Flyte workflows.
 * **DynamicHandler**: Flyte workflow CRs are initialized using a DAG compiled during the registration process. The numerous benefits of this approach are beyond the scope of this document. However, there are situations where the complete DAG is unknown at compile time. For example, when executing a task on each value of an input list. Using Dynamic nodes, a new DAG subgraph may be dynamically compiled during runtime and linked to the existing FlyteWorkflow CR.
 * **WorkflowHandler**: This handler allows embedding workflows within another workflow definition. The API exposes this functionality using either (1) an inline execution, where the workflow function is invoked directly resulting in a single FlyteWorkflow CR with an appended sub-workflow, or (2) a launch plan, which uses a TODO to create a separate sub-FlyteWorkflow CR whose execution state is linked to the parent FlyteWorkflow CR.
 * **BranchHandler**: The branch handler allows the DAG to follow a specific control path based on input (or computed) values.

From f9abc4001211c9b7730c6ccc26ffbee19b8369fd Mon Sep 17 00:00:00 2001
From: Future Outlier
Date: Wed, 3 Jan 2024 15:17:48 +0800
Subject: [PATCH 4/8] add rst

Signed-off-by: Future Outlier
---
 .../flytepropeller_architecture.rst | 2 +-
 .../flytepropeller_architecture.rst | 13 +++++++++++--
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/docs/concepts/component_architecture/flytepropeller_architecture.rst b/docs/concepts/component_architecture/flytepropeller_architecture.rst
index 7013b990b3..dacdfdb6cf 100644
--- a/docs/concepts/component_architecture/flytepropeller_architecture.rst
+++ b/docs/concepts/component_architecture/flytepropeller_architecture.rst
@@ -87,4 +87,4 @@ It should be noted that the WorkflowExecutor, NodeExecutor, and TaskHandlers sen
 FlytePlugins
 -----------------

-.. image:: https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/concepts/architecture/flytepropeller_plugins_architecture.png
\ No newline at end of file
+.. image:: https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/concepts/architecture/flytepropeller_plugins_architecture.png
diff --git a/rsts/concepts/component_architecture/flytepropeller_architecture.rst b/rsts/concepts/component_architecture/flytepropeller_architecture.rst
index a04f6dbe4d..dacdfdb6cf 100644
--- a/rsts/concepts/component_architecture/flytepropeller_architecture.rst
+++ b/rsts/concepts/component_architecture/flytepropeller_architecture.rst
@@ -19,7 +19,7 @@ A Flyte :ref:`workflow ` is represented as a Directed Acycli
 - ``DynamicNodes`` add nodes to the DAG.
 - ``WorkflowNodes`` allow embedding workflows within each other.

-FlytePropeller is responsible for scheduling and tracking execution of Flyte workflows. It is implemented using a K8s controller and adheres to the established K8s design principles. In this scheme, resources are periodically evaluated and the goal is to transition from the observed state to a requested state.
+FlytePropeller is responsible for scheduling and tracking execution of Flyte workflows. It is implemented using K8s operator and adheres to the established K8s design principles. In this scheme, resources are periodically evaluated and the goal is to transition from the observed state to a requested state.

 In our case, workflows are the resources and they are iteratively evaluated to transition from the current state to success. During each loop, the current workflow state is established as the phase of workflow nodes and subsequent tasks, and FlytePropeller performs operations to transition this state to success. The operations may include scheduling (or rescheduling) node executions, evaluating dynamic or branch nodes, etc. These design decisions ensure that FlytePropeller can scale to manage a large number of concurrent workflows without performance degradation.

@@ -69,7 +69,11 @@ NodeHandlers

 FlytePropeller includes a robust collection of NodeHandlers to support diverse evaluation of the workflow DAG:

-* **TaskHandler (Plugins)**: These are responsible for executing plugin specific tasks. This may include contacting FlyteAdmin to schedule K8s pod to perform work, calling a web API to begin/track evaluation, and much more. The plugin paradigm exposes an extensible interface for adding functionality to Flyte workflows.
+* **TaskHandler (Plugins)**: These are responsible for executing tasks in the Flyte cluster. There are mainly 3 kinds of tasks for the task handler:
+ 1. **Pod Task**: Create a pod in the Kubernetes cluster, execute the task, and then delete the pod.
+ 2. **K8s Operator Backend Plugin**: Install a specific Kubernetes Operator (e.g., Spark, Ray, and Kubeflow) in the cluster, create pods by the Kubernetes Operator, execute the task, and then delete the pods.
+ 3. **Web API Task**: Send REST/gRPC requests to a server and return the response.
+ Note: The Web API Task will not start a pod.
 * **DynamicHandler**: Flyte workflow CRs are initialized using a DAG compiled during the registration process. The numerous benefits of this approach are beyond the scope of this document. However, there are situations where the complete DAG is unknown at compile time. For example, when executing a task on each value of an input list. Using Dynamic nodes, a new DAG subgraph may be dynamically compiled during runtime and linked to the existing FlyteWorkflow CR.
 * **WorkflowHandler**: This handler allows embedding workflows within another workflow definition. The API exposes this functionality using either (1) an inline execution, where the workflow function is invoked directly resulting in a single FlyteWorkflow CR with an appended sub-workflow, or (2) a launch plan, which uses a TODO to create a separate sub-FlyteWorkflow CR whose execution state is linked to the parent FlyteWorkflow CR.
 * **BranchHandler**: The branch handler allows the DAG to follow a specific control path based on input (or computed) values.
@@ -79,3 +83,8 @@ FlyteAdmin Events
 -----------------

 It should be noted that the WorkflowExecutor, NodeExecutor, and TaskHandlers send events to FlyteAdmin, enabling it to track workflows in near real-time.
+
+FlytePlugins
+-----------------
+
+.. image:: https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/concepts/architecture/flytepropeller_plugins_architecture.png

From 6181b548583ade0ed1d5399501df5d548e599b76 Mon Sep 17 00:00:00 2001
From: Future-Outlier
Date: Sun, 21 Jan 2024 12:04:30 +0800
Subject: [PATCH 5/8] lint

Signed-off-by: Future-Outlier
---
 .../component_architecture/flytepropeller_architecture.rst | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/concepts/component_architecture/flytepropeller_architecture.rst b/docs/concepts/component_architecture/flytepropeller_architecture.rst
index dacdfdb6cf..35137fec65 100644
--- a/docs/concepts/component_architecture/flytepropeller_architecture.rst
+++ b/docs/concepts/component_architecture/flytepropeller_architecture.rst
@@ -19,7 +19,7 @@ A Flyte :ref:`workflow ` is represented as a Directed Acycli
 - ``DynamicNodes`` add nodes to the DAG.
 - ``WorkflowNodes`` allow embedding workflows within each other.

-FlytePropeller is responsible for scheduling and tracking execution of Flyte workflows. It is implemented using K8s operator and adheres to the established K8s design principles. In this scheme, resources are periodically evaluated and the goal is to transition from the observed state to a requested state.
+FlytePropeller is responsible for scheduling and tracking execution of Flyte workflows. It is implemented using a K8s controller and adheres to the established K8s design principles. In this scheme, resources are periodically evaluated and the goal is to transition from the observed state to a requested state.

 In our case, workflows are the resources and they are iteratively evaluated to transition from the current state to success. During each loop, the current workflow state is established as the phase of workflow nodes and subsequent tasks, and FlytePropeller performs operations to transition this state to success. The operations may include scheduling (or rescheduling) node executions, evaluating dynamic or branch nodes, etc. These design decisions ensure that FlytePropeller can scale to manage a large number of concurrent workflows without performance degradation.

@@ -70,10 +70,14 @@ NodeHandlers
 FlytePropeller includes a robust collection of NodeHandlers to support diverse evaluation of the workflow DAG:

 * **TaskHandler (Plugins)**: These are responsible for executing tasks in the Flyte cluster. There are mainly 3 kinds of tasks for the task handler:
+
  1. **Pod Task**: Create a pod in the Kubernetes cluster, execute the task, and then delete the pod.
+
  2. **K8s Operator Backend Plugin**: Install a specific Kubernetes Operator (e.g., Spark, Ray, and Kubeflow) in the cluster, create pods by the Kubernetes Operator, execute the task, and then delete the pods.
+
  3. **Web API Task**: Send REST/gRPC requests to a server and return the response.
  Note: The Web API Task will not start a pod.
+
 * **DynamicHandler**: Flyte workflow CRs are initialized using a DAG compiled during the registration process. The numerous benefits of this approach are beyond the scope of this document. However, there are situations where the complete DAG is unknown at compile time. For example, when executing a task on each value of an input list. Using Dynamic nodes, a new DAG subgraph may be dynamically compiled during runtime and linked to the existing FlyteWorkflow CR.
 * **WorkflowHandler**: This handler allows embedding workflows within another workflow definition. The API exposes this functionality using either (1) an inline execution, where the workflow function is invoked directly resulting in a single FlyteWorkflow CR with an appended sub-workflow, or (2) a launch plan, which uses a TODO to create a separate sub-FlyteWorkflow CR whose execution state is linked to the parent FlyteWorkflow CR.
 * **BranchHandler**: The branch handler allows the DAG to follow a specific control path based on input (or computed) values.
@@ -87,4 +91,5 @@ It should be noted that the WorkflowExecutor, NodeExecutor, and TaskHandlers sen
 FlytePlugins
 -----------------

+Here is an overviw architecture of FlytePlugins:
 .. image:: https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/concepts/architecture/flytepropeller_plugins_architecture.png

From ae642db68ec6c8f0797caf5363c33b1e0f18fc7e Mon Sep 17 00:00:00 2001
From: Future-Outlier
Date: Fri, 26 Jan 2024 15:30:35 +0800
Subject: [PATCH 6/8] nit

Signed-off-by: Future-Outlier
---
 .../component_architecture/flytepropeller_architecture.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/concepts/component_architecture/flytepropeller_architecture.rst b/docs/concepts/component_architecture/flytepropeller_architecture.rst
index 35137fec65..547091ad9b 100644
--- a/docs/concepts/component_architecture/flytepropeller_architecture.rst
+++ b/docs/concepts/component_architecture/flytepropeller_architecture.rst
@@ -92,4 +92,5 @@ FlytePlugins
 -----------------

 Here is an overviw architecture of FlytePlugins:
+
 .. image:: https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/concepts/architecture/flytepropeller_plugins_architecture.png

From 2425e220c2fe9d14e7e6e9b4824bdaf21a303b7a Mon Sep 17 00:00:00 2001
From: Future-Outlier
Date: Fri, 26 Jan 2024 15:48:37 +0800
Subject: [PATCH 7/8] push

Signed-off-by: Future-Outlier
---
 .../component_architecture/flytepropeller_architecture.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/concepts/component_architecture/flytepropeller_architecture.rst b/docs/concepts/component_architecture/flytepropeller_architecture.rst
index 547091ad9b..59fc8941fa 100644
--- a/docs/concepts/component_architecture/flytepropeller_architecture.rst
+++ b/docs/concepts/component_architecture/flytepropeller_architecture.rst
@@ -89,7 +89,7 @@ FlyteAdmin Events
 It should be noted that the WorkflowExecutor, NodeExecutor, and TaskHandlers send events to FlyteAdmin, enabling it to track workflows in near real-time.

 FlytePlugins
------------------
+------------

 Here is an overviw architecture of FlytePlugins:


From f91139f60ca152de1c985163da125b58eee2645b Mon Sep 17 00:00:00 2001
From: Future-Outlier
Date: Fri, 26 Jan 2024 16:46:48 +0800
Subject: [PATCH 8/8] fix typo

Signed-off-by: Future-Outlier
---
 .../component_architecture/flytepropeller_architecture.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/concepts/component_architecture/flytepropeller_architecture.rst b/docs/concepts/component_architecture/flytepropeller_architecture.rst
index 59fc8941fa..c04edbf617 100644
--- a/docs/concepts/component_architecture/flytepropeller_architecture.rst
+++ b/docs/concepts/component_architecture/flytepropeller_architecture.rst
@@ -33,7 +33,7 @@ Components
 FlyteWorkflow CRD / K8s Integration
 -----------------------------------

-Workflows in Flyte are maintained as Custom Resource Definitions (CRDs) in Kubernetes, which are stored in the backing etcd cluster. Each execution of a workflow definition results in the creation of a new FlyteWorkflow CR (Custom Resource) which maintains a state for the entirety of processing. CRDs provide variable definitions to describe both resource specifications (spec) and status' (status). The FlyteWorkflow CRD uses the spec subsection to detail the workflow DAG, embodying node dependencies, etc. The status subsection tracks workflow metadata including overall workflow status, node/task phases, status/phase transition timestamps, etc.
+Workflows in Flyte are maintained as Custom Resource Definitions (CRDs) in Kubernetes, which are stored in the backing etcd cluster. Each execution of a workflow definition results in the creation of a new FlyteWorkflow CR (Custom Resource) which maintains a state for the entirety of processing. CRDs provide variable definitions to describe both resource specifications (spec) and status (status). The FlyteWorkflow CRD uses the spec subsection to detail the workflow DAG, embodying node dependencies, etc. The status subsection tracks workflow metadata including overall workflow status, node/task phases, status/phase transition timestamps, etc.

 K8s exposes a powerful controller/operator API that enables entities to track creation/updates over a specific resource type. FlytePropeller uses this API to track FlyteWorkflows, meaning every time an instance of the FlyteWorkflow CR is created/updated, the FlytePropeller instance is notified. FlyteAdmin is the common entry point, where initialization of FlyteWorkflow CRs may be triggered by user workflow definition executions, automatic relaunches, or periodically scheduled workflow definition executions. However, it is conceivable to manually create FlyteWorkflow CRs, but this will have limited visibility and usability.

@@ -91,6 +91,6 @@ It should be noted that the WorkflowExecutor, NodeExecutor, and TaskHandlers sen
 FlytePlugins
 ------------

-Here is an overview architecture of FlytePlugins:
+Here is an overview architecture of FlytePlugins:

 .. image:: https://raw.githubusercontent.com/flyteorg/static-resources/main/flyte/concepts/architecture/flytepropeller_plugins_architecture.png
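
Reviewer note: the evaluation scheme described in the patched text (workflows are evaluated repeatedly, and each round moves node phases closer to success) can be sketched roughly as below. This is a toy Python illustration under stated assumptions, not FlytePropeller's actual implementation; `Phase`, `Node`, `evaluate_once`, and `run_to_success` are hypothetical names, and the one-round "running" step merely stands in for real task execution.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"


@dataclass
class Node:
    # Hypothetical stand-in for a workflow node and its status.
    name: str
    upstream: list = field(default_factory=list)
    phase: Phase = Phase.QUEUED


def evaluate_once(nodes):
    """Run one evaluation round; return True once every node has succeeded."""
    # Snapshot the observed phases so all decisions in this round
    # are based on the state seen at the start of the round.
    observed = {name: node.phase for name, node in nodes.items()}
    for node in nodes.values():
        if observed[node.name] is Phase.QUEUED:
            # Start a node only when all of its upstream nodes have succeeded.
            if all(observed[up] is Phase.SUCCEEDED for up in node.upstream):
                node.phase = Phase.RUNNING
        elif observed[node.name] is Phase.RUNNING:
            # Assumption: a running node finishes after one round.
            node.phase = Phase.SUCCEEDED
    return all(node.phase is Phase.SUCCEEDED for node in nodes.values())


def run_to_success(nodes, max_rounds=100):
    """Evaluate repeatedly until success; return the number of rounds used."""
    for round_number in range(1, max_rounds + 1):
        if evaluate_once(nodes):
            return round_number
    raise RuntimeError("workflow did not reach success")
```

For example, `run_to_success({"a": Node("a"), "b": Node("b", upstream=["a"])})` keeps `b` queued until `a` has succeeded, which mirrors the idea of moving from the observed state toward the requested one a step at a time.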