-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: armonik domain definitions aep #47
base: main
Are you sure you want to change the base?
Changes from all commits
1e4a083
a15b1d1
6e79569
2cef73d
6164272
ed8bccc
af12fa4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
# AEP : ArmoniK's Domain Definition | ||
|
||
| |ArmoniK Enhancement Proposal| | ||
---: |:--- | ||
**AEP** | | ||
**Title** | ArmoniK's Domain Definition | ||
**Author** | Italo Aguiar <<[email protected]>>, Quentin Delamea <<[email protected]>> | ||
**Status** | Draft | ||
**Type** | Standard | ||
**Creation Date** | 2024-06-04 | ||
|
||
# Abstract | ||
|
||
This AEP proposes a comprehensive definition of ArmoniK's domain. A domain is an area of interest or control within our system. Our goal is to define precisely the core domain entities to facilitate further development. | ||
|
||
# Motivation | ||
|
||
The purpose of this document is to establish a clear and consistent definition of ArmoniK's concepts, enabling other documents and standards to effectively reuse them. | ||
|
||
# Rationale | ||
|
||
To describe more complex features of ArmoniK accurately, it is crucial to have a shared vocabulary and avoid redundancy or discrepancies. While some definitions already exist in the ArmoniK glossary, this AEP seeks to enable community discussion and approval to ensure an unified understanding and usage of these terms. | ||
|
||
# Specification | ||
|
||
## Worker | ||
|
||
User-developed containerized software capable of performing one or several user-implemented tasks. A worker is built on top of a uniformized runtime environment. A worker process input data, outputing more data from its calculation, as while as it can produce new tasks that might be handled by itself or by different workers. A worker embeds an implementation of the communication protocols with the scheduling agent. | ||
|
||
## Scheduling agent | ||
|
||
Containerized software cohabiting with a worker, running a specific algorithm to determine which tasks its associated workers should perform, scheduling tasks on the workers and monitoring their execution. It also manages all interactions between the worker and the databases (retrieving/saving data, creating new tasks, etc.), as well as managing workers errors and retrying/resubmitting failed tasks when necessary. A scheduling agent, like a worker, exists within a single partition. | ||
|
||
## Partition | ||
|
||
Logical segmentation of the cluster's pool of machines to distribute workloads according to usage. This feature is provided and handled by ArmoniK. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps we should precise that this the creation is done by Armonik only at deployement stage (for now). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As it may change soon I think it's better not to give this precision. |
||
|
||
## Task | ||
|
||
An atomic computation that takes one or several inputs and outputs one or several results. A task is launched by a client or by another task and processed by a worker. In ArmoniK, a task cannot communicate with another one directly. They can, however, depend on each other via their input/output data, known as data dependency. | ||
|
||
A task is defined by: | ||
|
||
- **TaskOptions**: Set of parameters specifying the execution conditions, for instance: | ||
- *PartitionId*: The Id of the partition where the task must be executed | ||
italo1aguiar-aneo marked this conversation as resolved.
Show resolved
Hide resolved
|
||
- *Maximum Duration*: The maximum duration of a task | ||
- *Max Retries*: The maximum number of retries | ||
- *Priority*: The priority level of the task execution (the algorithm performs its best effort on tasks' priorities; however, it is possible that the priority won't always be respected) | ||
|
||
- **Data Dependencies**: Input data for a given task that depends on another unique task. Data dependencies formalize dependencies between tasks. | ||
italo1aguiar-aneo marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
- **Expected Outputs**: Data that must be generated as output from a task. If a task submits new tasks then it can transfer responsibility for generating all or part of its outputs to the tasks it submits. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Above you are speaking about results, why not use this term here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm avoiding using the word "results" since we aim to use Blobs from now on. Maybe I could take out the other usage of this word to avoid confusion. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Following the idea to avoid using the term "result" maybe we should avoid using "data dependencies" but "expected input ids" instead? |
||
|
||
## Blob | ||
|
||
Abstraction for ArmoniK's task-related data, including data dependencies and expected outputs. | ||
|
||
- **Blob Metadata**: Abstraction which refer to a blob, whether it exists or not. Blob metadata might be used when referring to data that is expected to exist in its full form in the futur, such as expected outputs or data dependencies before the blob content is created. | ||
|
||
## Session | ||
|
||
A session is a logical container for tasks and associated data (task status, blobs, errors, etc.). Every task is submitted within a session. An existing session can be paused and resumed to retrieve data or submit new tasks. When a session is canceled, all associated executions still in progress are interrupted. | ||
|
||
## Events | ||
|
||
Events are abstractions that enable users to be notified whenever a change happens to certain entities, for example: sessions, tasks, and blobs, which must be specified by users. | ||
|
||
## Copyright | ||
|
||
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we shouldn't use "pod" once we aim to go bare metal soon and it is mostly associated with k3s