
Commit

Incorporate suggestions
FlorianJacta committed Sep 27, 2023
1 parent d31c583 commit b6bb6ee
Showing 10 changed files with 59 additions and 35 deletions.
Original file line number Diff line number Diff line change
@@ -28,11 +28,10 @@ $ pip install taipy

# Taipy Studio

There are two ways to configure Taipy, either by Python code or using Taipy Studio. Taipy Studio
is a VS Code extension that provides a graphical editor to describe pipelines. The configuration
You have two options for configuring Taipy: either through Python code or by using Taipy Studio. Taipy Studio is a Visual Studio Code extension that offers a graphical editor for defining pipelines. The configuration
can be done easily and quickly with drag and drop.

So, without further delay, let's begin to code!
Now, without any more delay, let's get started coding!

# Concepts

@@ -7,8 +7,8 @@ or all the steps <a href="./../src/src.zip" download>here</a>.
Before looking at some code examples, to understand what a *Scenario* is, you need to
grasp the *Data node* and *Task* concepts.

- [**Data Nodes**](../../../../manuals/core/concepts/data-node.md): are the translation of
variables in Taipy. Data Nodes don't contain the data itself but point to the data and know
- [**Data Nodes**](../../../../manuals/core/concepts/data-node.md): represent variables in Taipy.
Data Nodes don't contain the data itself but point to the data and know
how to retrieve it. These Data Nodes can point to different types of data sources like CSV
files, Pickle files, databases, etc., and they can represent various types of Python variables
such as integers, strings, data frames, lists, and more. They are fully generic and can be
@@ -25,14 +25,12 @@ understand the *Data node* and *Task* concepts.
situations (i.e., various data/parameter values set by end-users).

[**Configuration**](../../../../manuals/core/config/index.md) is a structure to define scenarios.
It represents our Direct Acyclic Graph(s); it models the data sources and parameters, as well as
tasks. Once defined, a configuration acts like a superclass; it is used to instantiate different
scenario instances.
It serves as the blueprint for our Directed Acyclic Graph(s) and models the data sources, parameters, and tasks.
After being defined, a configuration functions like a superclass and is employed to generate various instances of scenarios.
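As a minimal sketch of this blueprint idea (the ids are placeholders, the `predict` body is a stub, and the `tasks_configs` parameter name follows this tutorial's own snippets; assumes Taipy is installed):

```python
from taipy import Config

def predict(dataset, date):
    # Stub model for illustration only; the tutorial's real logic differs.
    return 0

# The blueprint models the data sources, parameters, and outputs...
dataset_cfg = Config.configure_data_node(id="dataset")
date_cfg = Config.configure_data_node(id="date")
predictions_cfg = Config.configure_data_node(id="predictions")

# ...and the task that links them.
predict_task_cfg = Config.configure_task(id="predict",
                                         function=predict,
                                         input=[dataset_cfg, date_cfg],
                                         output=predictions_cfg)

# Defined once, this configuration can instantiate many scenarios later
# via tp.create_scenario(scenario_cfg).
scenario_cfg = Config.configure_scenario(id="my_scenario",
                                         tasks_configs=[predict_task_cfg])
```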

# Configuring a Scenario

Let’s consider a simple scenario representing a single function `predict` taking two inputs: a
dataset and a date to forecast and generate a prediction for the chosen date.
Think of the most basic pipeline: a single function `predict` that takes two inputs, a dataset and a date, and generates a prediction for that date.

See the function code below:
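The actual function body is collapsed in this diff. Purely as an illustration (this is not the tutorial's real model), a naive `predict` could average the values observed before the requested date:

```python
import datetime as dt

def predict(dataset, date):
    """Naive placeholder prediction: the mean of all values observed
    strictly before `date`. `dataset` maps datetimes to numbers."""
    past_values = [value for day, value in dataset.items() if day < date]
    return sum(past_values) / len(past_values)
```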

@@ -18,15 +18,17 @@ Let's discuss some of the essential functions that come along with Taipy.

## Utility of having scenarios

With Taipy, users can make several instances of the same setup. The data can vary among these different scenario instances. It's crucial to grasp the distinctions in data among scenario instances. These variations in scenarios can occur for the following reasons:
Using Taipy, users can create multiple copies of the same setup. These copies can have different data.
Understanding these differences in data between the copies is important.
These variations in data between scenarios can occur for the following reasons:

- Changing data from input data nodes,

- Randomness in a task (random algorithm),

- Different values from parameters set by the end-user, etc.

The developer can directly change the data nodes entities with the _write_ function (see below).
The developer can directly modify the data node entities using the _write_ function (as shown below).

![](config_02.svg){ width=700 style="margin:auto;display:block;border: 4px solid rgb(210,210,210);border-radius:7px" }

@@ -101,7 +103,10 @@ tp.delete(scenario.id)

## Ways of executing the code: Versioning

Taipy provides a [versioning system](../../../../manuals/core/versioning/index.md) to keep track of the changes that a configuration experiences over time: new data sources, new parameters, new versions of your Machine Learning engine, etc. `python main.py -h` opens a helper to understand the versioning options at your disposal.
Taipy provides a [versioning system](../../../../manuals/core/versioning/index.md) to
keep track of the changes that a configuration experiences over time: new data
sources, new parameters, new versions of your Machine Learning engine, etc.
`python main.py -h` opens a helper to understand the versioning options at your disposal.

# Entire code

@@ -22,7 +22,7 @@ or all the steps <a href="./../src/src.zip" download>here</a>.

- *[Generic](../../../../manuals/core/config/data-node-config.md/#generic)*: Taipy provides a generic Data Node that can read and store any data based on a custom _reading_ and _writing_ function created by the user.

This part utilizes the straightforward DAG/execution setup detailed below. The setup comprises the following components:
This section uses a simple DAG/execution setup explained below. The setup includes these components:

1. Three Data Nodes:

@@ -8,16 +8,22 @@ or all the steps <a href="./../src/src.zip" download>here</a>.

[Cycles](../../../../manuals/core/concepts/cycle.md) have been introduced to reflect business situations our customers frequently encounter and can be extremely useful.

For instance, a large fast food chain wants to generate sales forecasts for its stores every week. When creating a given scenario, it needs to be attached to a given week. Typically, only one scenario will be shared among all the scenarios made for a specific week. This specific 'official' scenario is known as the 'Primary' scenario in Taipy.
For example, a big fast-food chain wants to predict its store sales every week. When they create a scenario
for a particular week, it has to be linked to that specific week. Typically, among all the scenarios
created for a given week, only one serves as the official reference.
This special 'official' scenario is called the 'Primary' scenario in Taipy.


Note that Cycles can be ignored entirely if the business problem has no time frequency.


In this step, scenarios are attached to a MONTHLY cycle. Using Cycles, the developer benefits from specific Taipy's functions to navigate through these Cycles. For instance, by providing the Cycle, Taipy can get all the scenarios created in a month. You can also easily get every primary scenario generated for the past X months to monitor KPIs over time.

Let’s slightly change the filter function by passing the month as an argument to get started. You must create a new Data Node representing the month (see the steps below).
In this step, scenarios are connected to a MONTHLY cycle. With the help of Cycles, developers can take advantage
of Taipy's specific functions to work with these Cycles. For instance, by providing the Cycle, Taipy can retrieve
all the scenarios created within a month. You can also easily access every primary scenario generated
in the past X months to track KPIs over time.

Let's make a slight adjustment to the filter function by passing the month as an argument. You'll need to create a new Data Node that represents the month.
Follow the steps below:

```python
def filter_by_month(df, month):
```

@@ -69,10 +75,13 @@ Then to introduce Cycles, you need to set the frequency (predefined attribute) of
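The `filter_by_month` body is collapsed in the snippet above. One plausible completion, illustrative only (the tutorial operates on a pandas DataFrame, while this sketch models the dataset as a list of records to stay dependency-free):

```python
import datetime as dt

def filter_by_month(df, month):
    # Keep only the records whose 'Date' falls in the requested month.
    return [record for record in df if record["Date"].month == month]
```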


As you can see, a Cycle is activated once you have set the desired frequency of the scenario. In this code snippet, since we have specified `frequency=Frequency.MONTHLY`, the corresponding scenario is automatically attached to the correct period (month) when it is created.
In the code snippet above, you can see that a Cycle becomes active once you set the desired frequency for the scenario. Since we have specified
`frequency=Frequency.MONTHLY`, the corresponding scenario is automatically attached to the correct period (month) when it is created.


The Cycle which a Scenario belongs to is based on the _creation_date_ (`datetime.datetime.now()` by default) of the scenario. It can be "attached" to a specific cycle by manually setting its _creation_date_, as we are doing in the following example.
The Cycle to which a Scenario belongs is determined by the _creation_date_
(`datetime.datetime.now()` by default) of the scenario. A scenario can be "attached"
to a specific Cycle by manually setting its _creation_date_, as in the following example.


```python
@@ -86,7 +95,10 @@ scenario_2 = tp.create_scenario(scenario_cfg,
                                name="Scenario 2022/10/5")
```

Scenario 1 and Scenario 2 are two scenario entities/instances created from the same scenario configuration. They belong to the same Cycle but don't share the same Data Nodes. By default, each scenario instance has its own data node instances. They are not shared with any other scenario. The Scope concept can modify this behavior, which is covered in the next step.
Scenario 1 and Scenario 2 are two separate scenario entities created using the same scenario configuration.
They are part of the same Cycle but have different Data Nodes. By default, each scenario instance
has its own data node instances, and they are not shared with any other scenario.
The Scope concept can change this behavior, and we'll discuss it in the next step.


@@ -116,11 +128,14 @@ Month Data Node of Scenario 2 10

## Primary scenarios

In each Cycle, there is a primary scenario. A primary scenario is interesting because it represents the important scenario of the Cycle, the reference. By default, the first scenario created for a cycle is primary.
In each Cycle, there is a primary scenario. A primary scenario is significant because
it serves as the reference scenario of its Cycle. By default, the
first scenario created for a cycle is primary.

### GUI-Core visual elements

GUI-Core elements may be used to manipulate Cycles. Cycles are visible in *scenario_selector* or *data_node_selector*. It is also possible to make a scenario primary directly through the `scenario` visual element.
You can utilize GUI-Core elements to control Cycles. Cycles can be seen in either the *scenario_selector* or *data_node_selector*.
Additionally, it's possible to designate a scenario as primary directly through the `scenario` visual element.

```python
data_node = None
```

@@ -12,7 +12,7 @@ or all the steps <a href="./../src/src.zip" download>here</a>.

- `Scope.CYCLE`: Extend the scope by sharing data nodes across all scenarios of a given cycle.

- `Scope.GLOBAL`: Finally, extend the scope globally (across all scenarios of all cycles). For example, the initial/historical dataset is usually shared by all the scenarios/pipelines/cycles. It is unique in the entire application.
- `Scope.GLOBAL`: Finally, you can expand the scope globally, applying it across all scenarios in all cycles. For instance, the initial or historical dataset is typically shared by all scenarios and cycles. It is unique throughout the entire application.
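As a sketch of how these scopes are declared (the data node ids and file path are placeholders, and the top-level imports assume a recent Taipy package layout):

```python
from taipy import Config, Scope

# Shared by every scenario in every cycle: unique across the application.
historical_cfg = Config.configure_csv_data_node(id="historical_data",
                                                default_path="data.csv",
                                                scope=Scope.GLOBAL)

# One instance per cycle: shared by all scenarios of the same period.
month_cfg = Config.configure_data_node(id="month", scope=Scope.CYCLE)

# Default behavior: one instance per scenario, shared with no one.
output_cfg = Config.configure_data_node(id="output", scope=Scope.SCENARIO)
```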

![](config_05.svg){ width=700 style="margin:auto;display:block;border: 4px solid rgb(210,210,210);border-radius:7px" }

@@ -53,7 +53,9 @@ or all the steps <a href="./../src/src.zip" download>here</a>.


Cycles are created based on the _creation_date_ of scenarios. In the example below, we force the creation_date to a given date (in real life, the actual creation date of the scenario gets used automatically).
Cycles are generated according to the _creation_date_ of scenarios. In the example
below, we set the creation_date to a specific date (typically, in real situations,
the actual creation date of the scenario is automatically used).

```python
tp.Core().run()
```

@@ -9,9 +9,9 @@ or all the steps <a href="./../src/src.zip" download>here</a>.
Skipping tasks is a crucial capability of Taipy. If you run a function twice with the same inputs,
you'll get the same output. Executing this type of function is inefficient and consumes unnecessary time and resources.

Taipy offers the option to set the _skippable_ attribute for each task. If this attribute is set to
True, Taipy Cores scheduler automatically detects if changes have occurred on any of the input Data Nodes of a task.
If there are no changes, it automatically skips the execution of that task. By default, _skippable_ is set to False.
In Taipy, you have the choice to configure the _skippable_ attribute for each task. When this attribute is set to True,
Taipy Core's scheduler can automatically detect whether any changes have happened to the input Data Nodes of a task.
If there are no changes, it will skip the execution of that task. By default, _skippable_ is set to False.
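Conceptually, skipping is a cache check on a task's inputs. The dependency-free sketch below illustrates the idea only; it is not Taipy's actual mechanism, which tracks edits to input Data Nodes:

```python
def make_skippable(task_function):
    """Wrap a task so it only re-runs when its inputs change."""
    cache = {"inputs": None, "output": None, "ran_once": False}

    def wrapper(*inputs):
        if cache["ran_once"] and cache["inputs"] == inputs:
            # Inputs unchanged since the last run: skip the execution
            # and return the previously computed output.
            return cache["output"]
        cache.update(inputs=inputs,
                     output=task_function(*inputs),
                     ran_once=True)
        return cache["output"]

    return wrapper
```

Calling the wrapped function twice with identical arguments runs the body only once, mirroring how a _skippable_ task is skipped when its input Data Nodes have not changed.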


![](config_06.svg){ width=700 style="margin:auto;display:block;border: 4px solid rgb(210,210,210);border-radius:7px" }
@@ -84,7 +84,9 @@
```
Scenario 1: submit
Value 849
```

When you submit scenario 2, the scheduler skips the initial task in this second scenario. This is because both scenarios use the same Data Nodes for this task, and there have been no changes in the input Data Nodes since the last time the task ran when we submitted scenario 1.
When you submit scenario 2, the scheduler skips the initial task in this second scenario.
This happens because both scenarios utilize the same Data Nodes for this task, and there haven't been any changes
in the input Data Nodes since the last time the task ran when we submitted scenario 1.

```python
# the first task has already been executed by scenario 1
```

@@ -130,7 +132,7 @@ Results:

```
[2022-12-22 16:20:10,257][Taipy][INFO] job JOB_count_values_... is completed.
```

Here, we change the input Data Node of the pipeline so Taipy will re-run the correct tasks to ensure that everything is up-to-date.
In this case, we modify the input Data Node so that Taipy will execute the necessary tasks to ensure that everything is current and up-to-date.


@@ -28,7 +28,8 @@ def add(nb):

![](config_07.svg){ width=700 style="margin:auto;display:block;border: 4px solid rgb(210,210,210);border-radius:7px" }

This line of code changes the execution mode. _standalone_ makes Taipy Core asynchronous. Here, a maximum of two tasks are able to run concurrently.
This line of code alters the execution mode. Setting it to _standalone_ makes Taipy Core work asynchronously.
In this configuration, a maximum of two tasks can run simultaneously.

```python
Config.configure_job_executions(mode="standalone", max_nb_of_workers=2)
```

@@ -8,16 +8,17 @@ or all the steps <a href="./../src/src.zip" download>here</a>.

![](config_08.svg){ width=700 style="margin:auto;display:block;border: 4px solid rgb(210,210,210);border-radius:7px" }

Taipy provides a mechanism to compare data nodes of scenarios by providing a function directly into the scenario's configuration.
Taipy offers a way to compare data nodes across scenarios by including a function directly in the configuration of the scenario.

## Step 1: Declare which data nodes the comparison functions apply to:

In this example, we want a comparison applied to the '_output_' Data Node. It is indicated in the comparators parameter of the `configure_scenario()`.
In this example, we want to apply a comparison to the '_output_' Data Node. This is indicated in the _comparators_ parameter of `configure_scenario()`.

```python
scenario_cfg = Config.configure_scenario(id="multiply_scenario",
                                         name="my_scenario",
                                         tasks_configs=[pipeline_cfg],
                                         tasks_configs=[first_task_cfg,
                                                        second_task_cfg],
                                         comparators={output_data_node_cfg.id: compare_function},)
```
## Step 2: Implement the comparison function (`compare_function()`) used above.
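The implementation is collapsed in this diff. A minimal illustrative comparator (assuming, per Taipy's comparator convention, that the '_output_' value of each compared scenario arrives as a positional argument) could be:

```python
def compare_function(*data_node_results):
    # Label each scenario's '_output_' value so the comparison can be
    # displayed side by side; one positional argument per scenario.
    return {f"Scenario {index}": result
            for index, result in enumerate(data_node_results)}
```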
@@ -6,7 +6,8 @@ or all the steps <a href="./../src/src.zip" download>here</a>.

*Estimated Time for Completion: 15 minutes; Difficulty Level: Advanced*

To have an action after changing a job status, we can [subscribe a function](../../../../manuals/core/entities/orchestrating-and-job-execution.md/#subscribe-to-job-execution) to a scenario. A status change calls this function. This feature allows the creation of logs or particular events for Taipy GUI.
To perform an action after a job status change, you can [subscribe a function](../../../../manuals/core/entities/orchestrating-and-job-execution.md/#subscribe-to-job-execution) to a scenario.
When there is a status change, this function is triggered. This feature enables the creation of logs or specific events for the Taipy GUI.

```python
def callback_scenario_state(scenario, job):
```

0 comments on commit b6bb6ee