From 24a663c832aee52523d4e47c7a8615aa91943924 Mon Sep 17 00:00:00 2001
From: pryce-turner
Date: Mon, 15 Jul 2024 15:19:24 -0700
Subject: [PATCH] Updated map task information to indicate array node is now the default, added optional return value for min success ratio

Signed-off-by: pryce-turner
---
 .../advanced_composition/map_tasks.md | 30 ++++++-------------
 1 file changed, 9 insertions(+), 21 deletions(-)

diff --git a/docs/user_guide/advanced_composition/map_tasks.md b/docs/user_guide/advanced_composition/map_tasks.md
index c57f37b95b..26330a8cd5 100644
--- a/docs/user_guide/advanced_composition/map_tasks.md
+++ b/docs/user_guide/advanced_composition/map_tasks.md
@@ -67,10 +67,14 @@ If the input size exceeds the concurrency value, multiple batches will run seria
 
 ```python
 @workflow
-def map_workflow_with_additional_params(data: list[int] = [10, 12, 11, 10, 13, 12, 100, 11, 12, 10]) -> list[bool]:
+def map_workflow_with_additional_params(data: list[int] = [10, 12, 11, 10, 13, 12, 100, 11, 12, 10]) -> list[typing.Optional[bool]]:
     return map_task(detect_anomalies, concurrency=1, min_success_ratio=0.75)(data_point=data)
 ```
 
+:::{note}
+Notice the return type of the list has been set to `Optional` when a `min_success_ratio` is added. This is due to the fact we are now tolerating failures, meaning the expected return type from the mapped task may in fact not get returned.
+:::
+
 A map task internally uses a compression algorithm (bitsets) to handle every Flyte workflow node’s metadata,
 which would have otherwise been in the order of 100s of bytes.
 
@@ -162,30 +166,14 @@ pyflyte run --remote \
 
 ## ArrayNode
 
-ArrayNode map tasks serve as a seamless substitution for regular map tasks, differing solely in the submodule
-utilized to import the `map_task` function. Specifically, you will need to import `map_task` from the experimental module as illustrated below:
-
-```python
-from flytekit import task, workflow
-from flytekit.experimental import map_task
-
-@task
-def t(a: int) -> int:
-    ...
-
-@workflow
-def array_node_wf(xs: list[int]) -> list[int]:
-    return map_task(t)(a=xs)
-```
-
-Flyte introduces map task to enable parallelization of homogeneous operations,
+Flyte originally introduced map tasks to enable parallelization of homogeneous operations,
 offering efficient evaluation and a user-friendly API. Because it’s implemented as a backend plugin,
 its evaluation is independent of core Flyte logic, which generates subtask executions that lack full Flyte functionality.
-ArrayNode tackles this issue by offering robust support for subtask executions.
+ArrayNode tackled this issue by offering robust support for subtask executions.
 It also extends mapping capabilities across all plugins and Flyte node types.
-This enhancement will be a part of our move from the experimental phase to general availability.
+Starting with `flytekit` version 1.12.0, ArrayNode is the default `map_task` importable via `from flytekit import map_task`.
 
-In contrast to map tasks, an ArrayNode provides the following enhancements:
+In contrast to the original map tasks, an ArrayNode provides the following enhancements:
 
 - **Wider mapping support**. ArrayNode extends mapping capabilities beyond Kubernetes tasks, encompassing tasks such as Python tasks, container tasks and pod tasks.
 - **Cache management**. It supports both cache serialization and cache overwriting for subtask executions.
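
The note added in the first hunk says that with `min_success_ratio` set, an element of the mapped output may be missing. A minimal sketch of how downstream code might consume such a `list[typing.Optional[bool]]` follows. It is plain Python with no `flytekit` dependency, `summarize_results` is a hypothetical helper that is not part of this patch or of `flytekit`, and it assumes a tolerated failure surfaces as `None`, which is what the `Optional` annotation suggests.

```python
import typing


def summarize_results(results: list[typing.Optional[bool]]) -> dict[str, int]:
    """Count anomalies, normal points, and missing entries in a mapped output.

    Hypothetical helper for illustration only; not part of flytekit.
    Assumes a tolerated subtask failure appears as None in the list.
    """
    # Compare with `is` so that None is never mistaken for False.
    anomalies = sum(1 for r in results if r is True)
    normal = sum(1 for r in results if r is False)
    missing = sum(1 for r in results if r is None)
    return {"anomalies": anomalies, "normal": normal, "missing": missing}


# Stand-in for a map task output in which two subtasks failed.
example_output: list[typing.Optional[bool]] = [False, False, None, False, True, None]
summary = summarize_results(example_output)
```

Checking `r is None` explicitly, rather than relying on truthiness, keeps a legitimate `False` result distinct from a missing one.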