From 1e4a083dfb529c58f24111247f94115ea84cb7ba Mon Sep 17 00:00:00 2001 From: Italo AGUIAR Date: Wed, 5 Jun 2024 14:30:55 +0200 Subject: [PATCH 1/7] feat: armonik domain definitions aep some changes definition of metadata changes Update AEP/aep-0000y.md Co-authored-by: qdelamea-aneo <134614217+qdelamea-aneo@users.noreply.github.com> Update AEP/aep-0000y.md Co-authored-by: qdelamea-aneo <134614217+qdelamea-aneo@users.noreply.github.com> last corrections --- AEP/aep-0000y.md | 70 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) create mode 100644 AEP/aep-0000y.md diff --git a/AEP/aep-0000y.md b/AEP/aep-0000y.md new file mode 100644 index 0000000..6839cbc --- /dev/null +++ b/AEP/aep-0000y.md @@ -0,0 +1,70 @@ +# AEP : ArmoniK's Domain Definition + +| |ArmoniK Enhancement Proposal| +---: |:--- +**AEP** | +**Title** | ArmoniK's Domain Definition +**Author** | Italo Aguiar <>, Quentin Delamea <> +**Status** | Draft +**Type** | Standard +**Creation Date** | 2024-06-04 + +# Abstract + +This AEP proposes a comprehensive definition of ArmoniK's domain. A domain is an area of interest or control within our system. Our goal is to clearly define the core domain entities to facilitate further development. + +# Motivation + +The purpose of this document is to establish a clear and consistent definition of ArmoniK's domain, enabling other documents and standards to effectively reuse these concepts. + +# Rationale + +To describe more complex features of ArmoniK accurately, it is crucial to have a shared vocabulary and avoid redundancy. While some definitions already exist in the ArmoniK glossary, this AEP seeks to foster community discussion and approval to ensure a unified understanding and usage of these terms. + +# Specification + +## Worker + +User-developed containerized software capable of performing one or several user-implemented tasks. A worker is built on top of a uniformized runtime environment. 
A worker processes input data and outputs the data it computes; it can also produce new tasks that might be handled by itself or by different workers. A worker embeds an implementation of the communication protocols with the scheduling agent. + +## Scheduling agent + +Containerized software cohabiting with a worker, running a specific algorithm to determine which tasks its associated workers should perform, scheduling tasks on the workers and monitoring their execution. It also manages all interactions between the worker and the databases (retrieving/saving data, creating new tasks, etc.), as well as managing worker errors and retrying/resubmitting failed tasks when necessary. A scheduling agent, like a worker, exists within a single partition. + +## Partition + +Logical segmentation of the cluster's pool of machines to distribute workloads according to usage. This feature is provided and handled by ArmoniK. + +## Task + +An atomic computation that takes one or several inputs and outputs one or several results. A task is launched by a client or by another task and processed by a worker. In ArmoniK, tasks cannot communicate with each other. They can, however, depend on each other via their input/output data, known as data dependency. + +A task is defined by: + +- **TaskOptions**: Set of parameters specifying the execution conditions, for instance: + - *PartitionId*: The Id of the partition where the task must be executed + - *Maximum Duration*: The maximum duration of a task + - *Max Retries*: The maximum number of retries + - *Priority*: The priority level of the task execution (the algorithm performs its best effort on tasks' priorities; however, it is possible that the priority won't always be respected) + +- **Data Dependencies**: Input data for a given task that depends on another unique task. Data dependencies formalize dependencies between tasks. + +- **Expected Outputs**: Data that must be generated as output from a task. 
If a task submits new tasks then it can transfer responsibility for generating all or part of its outputs to the tasks it submits. + +## Blob + +Abstraction for ArmoniK's task-related data, including data dependencies and expected outputs. + +- **Blob Metadata**: Abstraction for referring to a blob, whether it exists or not. Blob metadata might be used when referring to data that is expected to exist in its full form in the future, such as expected outputs or data dependencies before the blob content is created. + +## Session + +A session is a logical container for tasks and associated data (task status, blobs, errors, etc.). Every task is submitted within a session. An existing session can be resumed to retrieve data or submit new tasks. When a session is canceled, all associated executions still in progress are interrupted. + +## Events + +Events are abstractions that enable users to be notified whenever a change happens to certain entities, for example: sessions, tasks, and blobs, which must be specified by users. + +## Copyright + +This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive. \ No newline at end of file From a15b1d137520e773873b69197c92b94caade5b57 Mon Sep 17 00:00:00 2001 From: Italo Aguiar <167756401+italo1aguiar-aneo@users.noreply.github.com> Date: Mon, 10 Jun 2024 14:24:59 +0200 Subject: [PATCH 2/7] Update AEP/aep-0000y.md Co-authored-by: ngruelaneo <100275739+ngruelaneo@users.noreply.github.com> --- AEP/aep-0000y.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AEP/aep-0000y.md b/AEP/aep-0000y.md index 6839cbc..b69b731 100644 --- a/AEP/aep-0000y.md +++ b/AEP/aep-0000y.md @@ -11,7 +11,7 @@ # Abstract -This AEP proposes a comprehensive definition of ArmoniK's domain. A domain is an area of interest or control within our system. Our goal is to clearly define the core domain entities to facilitate further development. 
+This AEP proposes a comprehensive definition of ArmoniK's domain. A domain is an area of interest or control within our system. Our goal is to define precisely the core domain entities to facilitate further development. # Motivation From 6e7956936f21650d0ddd7334c2117f9417755331 Mon Sep 17 00:00:00 2001 From: Italo Aguiar <167756401+italo1aguiar-aneo@users.noreply.github.com> Date: Mon, 10 Jun 2024 14:25:08 +0200 Subject: [PATCH 3/7] Update AEP/aep-0000y.md Co-authored-by: ngruelaneo <100275739+ngruelaneo@users.noreply.github.com> --- AEP/aep-0000y.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AEP/aep-0000y.md b/AEP/aep-0000y.md index b69b731..018ac2d 100644 --- a/AEP/aep-0000y.md +++ b/AEP/aep-0000y.md @@ -15,7 +15,7 @@ This AEP proposes a comprehensive definition of ArmoniK's domain. A domain is an # Motivation -The purpose of this document is to establish a clear and consistent definition of ArmoniK's domain, enabling other documents and standards to effectively reuse these concepts. +The purpose of this document is to establish a clear and consistent definition of ArmoniK's concepts, enabling other documents and standards to effectively reuse them. # Rationale From 2cef73dcb65b90e5274109299c1955761adee3a6 Mon Sep 17 00:00:00 2001 From: Italo Aguiar <167756401+italo1aguiar-aneo@users.noreply.github.com> Date: Mon, 10 Jun 2024 14:25:24 +0200 Subject: [PATCH 4/7] Update AEP/aep-0000y.md Co-authored-by: ngruelaneo <100275739+ngruelaneo@users.noreply.github.com> --- AEP/aep-0000y.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AEP/aep-0000y.md b/AEP/aep-0000y.md index 018ac2d..943dbc1 100644 --- a/AEP/aep-0000y.md +++ b/AEP/aep-0000y.md @@ -19,7 +19,7 @@ The purpose of this document is to establish a clear and consistent definition o # Rationale -To describe more complex features of ArmoniK accurately, it is crucial to have a shared vocabulary and avoid redundancy. 
While some definitions already exist in the ArmoniK glossary, this AEP seeks to foster community discussion and approval to ensure a unified understanding and usage of these terms. +To describe more complex features of ArmoniK accurately, it is crucial to have a shared vocabulary and avoid redundancy or discrepancies. While some definitions already exist in the ArmoniK glossary, this AEP seeks to enable community discussion and approval to ensure a unified understanding and usage of these terms. # Specification From 61642723ed56632c1749c01846e06fb92338b90e Mon Sep 17 00:00:00 2001 From: Italo Aguiar <167756401+italo1aguiar-aneo@users.noreply.github.com> Date: Mon, 10 Jun 2024 14:27:47 +0200 Subject: [PATCH 5/7] Update AEP/aep-0000y.md Co-authored-by: ngruelaneo <100275739+ngruelaneo@users.noreply.github.com> --- AEP/aep-0000y.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AEP/aep-0000y.md b/AEP/aep-0000y.md index 943dbc1..0672982 100644 --- a/AEP/aep-0000y.md +++ b/AEP/aep-0000y.md @@ -37,7 +37,7 @@ Logical segmentation of the cluster's pool of machines to distribute workloads a ## Task -An atomic computation that takes one or several inputs and outputs one or several results. A task is launched by a client or by another task and processed by a worker. In ArmoniK, tasks cannot communicate with each other. They can, however, depend on each other via their input/output data, known as data dependency. +An atomic computation that takes one or several inputs and outputs one or several results. A task is launched by a client or by another task and processed by a worker. In ArmoniK, a task cannot communicate with another one directly. They can, however, depend on each other via their input/output data, known as data dependency. 
A task is defined by: From ed8bccc9712f492b874bede364b01b2ad435aa16 Mon Sep 17 00:00:00 2001 From: Italo Aguiar <167756401+italo1aguiar-aneo@users.noreply.github.com> Date: Mon, 10 Jun 2024 14:32:41 +0200 Subject: [PATCH 6/7] Update AEP/aep-0000y.md Co-authored-by: ngruelaneo <100275739+ngruelaneo@users.noreply.github.com> --- AEP/aep-0000y.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AEP/aep-0000y.md b/AEP/aep-0000y.md index 0672982..3c6b372 100644 --- a/AEP/aep-0000y.md +++ b/AEP/aep-0000y.md @@ -55,7 +55,7 @@ A task is defined by: Abstraction for ArmoniK's task-related data, including data dependencies and expected outputs. -- **Blob Metadata**: Abstraction for referring to a blob, whether it exists or not. Blob metadata might be used when referring to data that is expected to exist in its full form in the future, such as expected outputs or data dependencies before the blob content is created. +- **Blob Metadata**: Abstraction that refers to a blob, whether it exists or not. Blob metadata might be used when referring to data that is expected to exist in its full form in the future, such as expected outputs or data dependencies before the blob content is created. ## Session From af12fa4c4825061f9d12ffaf3c8e0631002263b5 Mon Sep 17 00:00:00 2001 From: Italo Aguiar <167756401+italo1aguiar-aneo@users.noreply.github.com> Date: Mon, 10 Jun 2024 14:32:57 +0200 Subject: [PATCH 7/7] Update AEP/aep-0000y.md Co-authored-by: ngruelaneo <100275739+ngruelaneo@users.noreply.github.com> --- AEP/aep-0000y.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/AEP/aep-0000y.md b/AEP/aep-0000y.md index 3c6b372..99ea700 100644 --- a/AEP/aep-0000y.md +++ b/AEP/aep-0000y.md @@ -59,7 +59,7 @@ Abstraction for ArmoniK's task-related data, including data dependencies and exp ## Session -A session is a logical container for tasks and associated data (task status, blobs, errors, etc.). Every task is submitted within a session. 
An existing session can be resumed to retrieve data or submit new tasks. When a session is canceled, all associated executions still in progress are interrupted. +A session is a logical container for tasks and associated data (task status, blobs, errors, etc.). Every task is submitted within a session. An existing session can be paused and resumed to retrieve data or submit new tasks. When a session is canceled, all associated executions still in progress are interrupted. ## Events