Merge branch 'master' into terryhung/add-the-new-type-of-admin-seed-p…
…orjects-in-config

Signed-off-by: terry.hung <[email protected]>
Terryhung committed Oct 25, 2024
2 parents ea2e834 + 7f92953 commit fedad63
Showing 13 changed files with 366 additions and 77 deletions.
18 changes: 18 additions & 0 deletions .github/workflows/dependency-review.yml
@@ -0,0 +1,18 @@
name: 'Dependency Review'
on: [pull_request]
permissions:
  contents: read
  pull-requests: write
jobs:
  dependency-review:
    runs-on: ubuntu-latest
    steps:
      - name: 'Checkout Repository'
        uses: actions/checkout@v4
      - name: Dependency Review
        uses: actions/dependency-review-action@v4
        with:
          comment-summary-in-pr: on-failure
          # Licenses need to come from https://spdx.org/licenses/
          deny-licenses: GPL-1.0-only, GPL-1.0-or-later, GPL-2.0-only, GPL-2.0-or-later, GPL-3.0-only, GPL-3.0-or-later

12 changes: 12 additions & 0 deletions CONTRIBUTING.md
@@ -1,3 +1,15 @@
# Contributing to Flyte

For information related to contributing to Flyte, please check out the [Contributing to Flyte](https://docs.flyte.org/en/latest/community/contribute/index.html) section of the documentation.

## Recommendation Order (For Beginners)
* Set up your dev environment
* Read the following and run at least 5 examples. Pay close attention to the generated outputs, the Graph view, task
  logs, etc. Repeat with as many examples as you need to get an initial understanding of what an execution looks like:
  * https://docs.flyte.org/en/latest/user_guide/introduction.html
  * https://docs.flyte.org/en/latest/flytesnacks/userguide.html
* Finish reading the [Concepts](https://docs.flyte.org/en/latest/user_guide/concepts/main_concepts/index.html)
* Finish reading the [Control Plane](https://docs.flyte.org/en/latest/user_guide/concepts/control_plane/index.html)
* Finish reading the [Component Architecture](https://docs.flyte.org/en/latest/user_guide/concepts/component_architecture/index.html)
* Choose 2 good first issues from the following and start solving them using what you have learned.
* Get familiar with using [ImageSpec to push images to localhost for development](https://docs.flyte.org/en/latest/user_guide/customizing_dependencies/imagespec.html#image-spec-example)
63 changes: 63 additions & 0 deletions docs/community/contribute/contribute_docs.md
@@ -57,3 +57,66 @@ Deployment and API docs mostly use reStructured Text. For more information, see

You can cross-reference multiple Python modules, functions, classes, methods, and global data in documentations. For more information, see the [Sphinx documentation](https://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html#cross-referencing-python-objects).

### Quickstart

Flyte Documentation is primarily maintained in two locations: [flyte](https://github.com/flyteorg/flyte) and [flytesnacks](https://github.com/flyteorg/flytesnacks).

#### Tips
The following are some tips to include various content:
* **Images**
Flyte maintains all static resources in the [static-resources repo](https://github.com/flyteorg/static-resources).
Upload your images to this repo, open a PR, and then refer to the image in the documentation.
Note that the image URL should be in the format `https://raw.githubusercontent.com/flyteorg/static-resources/<git sha of your commit in PR>/<your image path>`.
* **Source code references (Link format)** <br>
`.rst` example:
```{code-block}
.. raw:: html

<a href="https://github.com/flyteorg/<source repo name>/blob/<git sha>/<target file path>#L<from line>-L<to line>">View source code on GitHub</a>
```

`.md` example:
```{code-block}
[View source code on GitHub](https://github.com/flyteorg/<source repo name>/blob/<git sha>/<target file path>#L<from line>-L<to line>)
```
* **Source code references (Embedded format)** <br>
`.rst` example:
```{code-block}
.. rli:: https://raw.githubusercontent.com/flyteorg/<source repo name>/<git sha>/<target file path>
:lines: <from line>-<to line>
```

`.md` example:
````{code-block}
```{rli} https://raw.githubusercontent.com/flyteorg/<source repo name>/<git sha>/<target file path>
:lines: <from line>-<to line>
```
````

This way, the nested code block is properly displayed without breaking the Markdown structure.
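
The URL formats from the tips above can be assembled mechanically. The following is a minimal Python sketch and not part of the Flyte tooling: the helper names `raw_url` and `blob_link`, the placeholder SHA, and the sample file paths are all hypothetical.

```python
GITHUB_ORG = "flyteorg"


def raw_url(repo: str, git_sha: str, path: str) -> str:
    """Build a raw.githubusercontent.com URL pinned to a specific commit."""
    return (
        f"https://raw.githubusercontent.com/{GITHUB_ORG}/{repo}/"
        f"{git_sha}/{path.lstrip('/')}"
    )


def blob_link(repo: str, git_sha: str, path: str, first: int, last: int) -> str:
    """Build a github.com blob URL that highlights lines first..last."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return (
        f"https://github.com/{GITHUB_ORG}/{repo}/blob/{git_sha}/"
        f"{path.lstrip('/')}#L{first}-L{last}"
    )


if __name__ == "__main__":
    # Placeholder SHA and paths, for illustration only.
    print(raw_url("static-resources", "0123abc", "img/example.png"))
    print(blob_link("flytesnacks", "0123abc", "examples/example.py", 10, 20))
```

Pinning both helpers to a git SHA mirrors the recommendation in this guide: the referenced lines stay stable even when the file later changes on the default branch.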

#### Open a pull request
[This is an example PR](https://github.com/flyteorg/flyte/pull/5844)

Each time you update your PR, it triggers the CI build, so there’s no need to build the docs locally. Flyte uses the CI process `"docs/readthedocs.org:flyte"`, which builds the documentation after each PR.
Be sure to include the following CI-build preview link in your PR description so reviewers can easily preview the changes:
```{code-block}
https://flyte--<PR number>.org.readthedocs.build/en/<PR number>/<relative path>.html
```
The relative path is based on the `docs` directory.
For example, if the full path is `flyte/docs/user_guide/advanced_composition/chaining_flyte_entities.md`, then the relative path would be `user_guide/advanced_composition/chaining_flyte_entities` + `.html`.
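
As a rough illustration of the path rule above, the sketch below derives the preview link from a source path. The function name `preview_url` and the PR number `1234` are placeholders chosen for this example; only the URL template comes from this guide.

```python
from pathlib import PurePosixPath


def preview_url(pr_number: int, source_path: str) -> str:
    """Return the Read the Docs preview URL for a file under `docs`."""
    parts = PurePosixPath(source_path).parts
    if "docs" not in parts:
        raise ValueError("path must contain a 'docs' directory")
    # Keep everything after the first 'docs' component and drop the extension.
    relative = PurePosixPath(*parts[parts.index("docs") + 1:]).with_suffix("")
    return (
        f"https://flyte--{pr_number}.org.readthedocs.build/"
        f"en/{pr_number}/{relative}.html"
    )


if __name__ == "__main__":
    print(
        preview_url(
            1234,  # placeholder PR number
            "flyte/docs/user_guide/advanced_composition/chaining_flyte_entities.md",
        )
    )
```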

#### Important note
In the `flytesnacks` repository, most Python comments using `# xxxx` are not imported into the documentation.
You may notice some overlap between `flytesnacks` and `flyte` docs, but what is displayed primarily comes from the `flyte` repository.

Otherwise, take care of the following points:
````{important}
* Make sure `:lines:` are aligned correctly.
* Use a git SHA to reference example code instead of the master branch or a relative path, so the referenced lines do not shift when the file changes.
* Build the documentation by submitting a PR instead of building it locally.
* For `flytesnacks`, run `make fmt` before submitting the PR.
* Before uploading commits, use `git commit -s` to sign off. This step is often forgotten during the first submission.
* Run `codespell` on the modified files to check for any spelling mistakes before pushing.
* When referencing code or images, use a git SHA along with GitHub raw content links.
````
3 changes: 3 additions & 0 deletions docs/deployment/configuration/general.rst
@@ -128,6 +128,9 @@ as the base container configuration for all primary containers. If both containe
names exist in the default PodTemplate, Flyte first applies the default
configuration, followed by the primary configuration.

Note: Init containers can be configured with similar granularity using "default-init"
and "primary-init" init container names.

The ``containers`` field is required in each k8s PodSpec. If no default
configuration is desired, specifying a container with a name other than "default"
or "primary" (for example, "noop") is considered best practice. Since Flyte only
102 changes: 65 additions & 37 deletions docs/deployment/configuration/monitoring.rst
@@ -5,7 +5,7 @@ Monitoring

.. tags:: Infrastructure, Advanced

.. tip:: The Flyte core team publishes and maintains Grafana dashboards built using Prometheus data sources, which can be found `here <https://grafana.com/grafana/dashboards?search=flyte>`__.
.. tip:: The Flyte core team publishes and maintains Grafana dashboards built using Prometheus data sources. You can import them to your Grafana instance from the `Grafana marketplace <https://grafana.com/orgs/flyteorg/dashboards>`__.

Metrics for Executions
======================
@@ -87,53 +87,81 @@ Flyte Backend is written in Golang and exposes stats using Prometheus. The stats

Both ``flyteadmin`` and ``flytepropeller`` are instrumented to expose metrics. To visualize these metrics, Flyte provides three Grafana dashboards, each with a different focus:

- **User-facing dashboards**: Dashboards that can be used to triage/investigate/observe performance and characteristics of workflows and tasks.
The user-facing dashboard is published under ID `13980 <https://grafana.com/grafana/dashboards/13980>`__ in the Grafana marketplace.
- **User-facing dashboard**: it can be used to investigate performance and characteristics of workflow and task executions. It's published under ID `22146 <https://grafana.com/grafana/dashboards/22146-flyte-user-dashboard-via-prometheus/>`__ in the Grafana marketplace.

- **System Dashboards**: Dashboards that are useful for the system maintainer to investigate the status and performance of their Flyte deployments. These are further divided into:
- `DataPlane/FlytePropeller <https://grafana.com/grafana/dashboards/13979>`__: execution engine status and performance.
- `ControlPlane/Flyteadmin <https://grafana.com/grafana/dashboards/13981>`__: API-level monitoring.
- Data plane (``flytepropeller``): `21719 <https://grafana.com/grafana/dashboards/21719-flyte-propeller-dashboard-via-prometheus/>`__: execution engine status and performance.
- Control plane (``flyteadmin``): `21720 <https://grafana.com/grafana/dashboards/21720-flyteadmin-dashboard-via-prometheus/>`__: API-level monitoring.

The corresponding JSON files for each dashboard are also located at ``deployment/stats/prometheus``.
The corresponding JSON files for each dashboard are also located in the ``flyte`` repository at `deployment/stats/prometheus <https://github.com/flyteorg/flyte/tree/master/deployment/stats/prometheus>`__.

.. note::

The dashboards are basic dashboards and do not include all the metrics exposed by Flyte.
Feel free to use the scripts provided `here <https://github.com/flyteorg/flyte/tree/master/stats>`__ to improve and -hopefully- contribute the improved dashboards.

How to use the dashboards
~~~~~~~~~~~~~~~~~~~~~~~~~

1. We recommend installing and configuring the Prometheus operator as described in `their docs <https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/getting-started.md>`__.
This is especially true if you plan to use the Service Monitors provided by the `flyte-core <https://github.com/flyteorg/flyte/blob/master/charts/flyte-core/templates/propeller/service-monitor.yaml>`__ Helm chart.

2. Enable the Prometheus instance to use Service Monitors in the namespace where Flyte is running, configuring the following keys in the ``prometheus`` resource:

.. code-block:: yaml

   spec:
     serviceMonitorSelector: {}
     serviceMonitorNamespaceSelector: {}

.. note::

The above example configuration lets Prometheus use any ``ServiceMonitor`` in any namespace in the cluster. Adjust the configuration to reduce the scope if needed.

3. Once you have installed and configured the Prometheus operator, enable the Service Monitors in the Helm chart by configuring the following keys in your ``values`` file:

.. code-block:: yaml

   flyteadmin:
     serviceMonitor:
       enabled: true
   flytepropeller:
     serviceMonitor:
       enabled: true

Setup instructions
~~~~~~~~~~~~~~~~~~

The dashboards rely on a working Prometheus deployment with access to your Kubernetes cluster and Flyte pods.
Additionally, the user dashboard uses metrics that come from ``kube-state-metrics``. Both of these requirements can be fulfilled by installing the `kube-prometheus-stack <https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack>`__.

Once the prerequisites are in place, follow the instructions in this section to configure metrics scraping for the corresponding Helm chart:

.. tabs::

   .. group-tab:: flyte-core

      Save the following in a ``flyte-monitoring-overrides.yaml`` file and run a ``helm upgrade`` operation pointing to that ``--values`` file:

      .. code-block:: yaml

         flyteadmin:
           serviceMonitor:
             enabled: true
             labels:
               release: kube-prometheus-stack # This label is particular to the kube-prometheus-stack
             selectorLabels:
               - app.kubernetes.io/name: flyteadmin
         flytepropeller:
           serviceMonitor:
             enabled: true
             labels:
               release: kube-prometheus-stack
             selectorLabels:
               - app.kubernetes.io/name: flytepropeller
           service:
             enabled: true

      The above configuration enables the ``serviceMonitor`` that Prometheus can then use to automatically discover services and scrape metrics from them.

   .. group-tab:: flyte-binary

      Save the following in a ``flyte-monitoring-overrides.yaml`` file and run a ``helm upgrade`` operation pointing to that ``--values`` file:

      .. code-block:: yaml

         configuration:
           inline:
             propeller:
               prof-port: 10254
               metrics-prefix: "flyte:"
             scheduler:
               profilerPort: 10254
               metricsScope: "flyte:"
             flyteadmin:
               profilerPort: 10254
         service:
           extraPorts:
           - name: http-metrics
             protocol: TCP
             port: 10254

      The above configuration enables the ``serviceMonitor`` that Prometheus can then use to automatically discover services and scrape metrics from them.

.. note::

By default, the ``ServiceMonitor`` is configured with a ``scrapeTimeout`` of 30s and ``interval`` of 60s. You can customize these values if needed.

With the above configuration in place you should be able to import the dashboards in your Grafana instance.
With the above configuration completed, you should be able to import the dashboards in your Grafana instance.

16 changes: 12 additions & 4 deletions docs/user_guide/development_lifecycle/caching.md
@@ -19,23 +19,31 @@ Let's watch a brief explanation of caching and a demo in this video, followed by
```

### Input Caching

In Flyte, input caching allows tasks to automatically cache the input data required for execution. This feature is particularly useful in scenarios where tasks may need to be re-executed, such as during retries due to failures or when manually triggered by users. By caching input data, Flyte optimizes workflow performance and resource usage, preventing unnecessary recomputation of task inputs.

### Output Caching

Output caching in Flyte allows users to cache the results of tasks to avoid redundant computations. This feature is especially valuable for tasks that perform expensive or time-consuming operations where the results are unlikely to change frequently.

There are four parameters and one command-line flag related to caching.

## Parameters

* `cache` (`bool`): Enables or disables caching of the workflow, task, or launch plan.
By default, caching is disabled to avoid unintended consequences when caching executions with side effects.
To enable caching set `cache=True`.
To enable caching, set `cache=True`.
* `cache_version` (`str`): Part of the cache key.
A change to this parameter will invalidate the cache.
Changing this version number tells Flyte to ignore previous cached results and run the task again if the task's function has changed.
This allows you to explicitly indicate when a change has been made to the task that should invalidate any existing cached results.
Note that this is not the only change that will invalidate the cache (see below).
Also, note that you can manually trigger cache invalidation per execution using the [`overwrite-cache` flag](#overwrite-cache-flag).
* `cache_serialize` (`bool`): Enables or disables [cache serialization](./cache_serializing).
When enabled, Flyte ensures that a single instance of the task is run before any other instances that would otherwise run concurrently.
This allows the initial instance to cache its result and lets the later instances reuse the resulting cached outputs.
Cache serialization is disabled by default.
* `cache_ignore_input_vars` (`Tuple[str, ...]`): Input variables that should not be included when calculating hash for cache. By default, no input variables are ignored. This parameter only applies to task serialization.
* `cache_ignore_input_vars` (`Tuple[str, ...]`): Input variables that Flyte should ignore when deciding if a task’s result can be reused (hash calculation). By default, no input variables are ignored. This parameter only applies to task serialization.

Task caching parameters can be specified at task definition time within `@task` decorator or at task invocation time using `with_overrides` method.
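
To make the effect of `cache_version` and `cache_ignore_input_vars` concrete, here is a toy model of a cache key. It is only an illustration written with the Python standard library: the function `sketch_cache_key` and its hashing scheme are assumptions made for this example and are not Flyte's actual implementation.

```python
import hashlib
import json
from typing import Any, Mapping, Tuple


def sketch_cache_key(
    task_name: str,
    cache_version: str,
    inputs: Mapping[str, Any],
    ignore_input_vars: Tuple[str, ...] = (),
) -> str:
    """Toy cache key; NOT Flyte's real hashing scheme."""
    # Inputs listed in ignore_input_vars do not take part in the hash.
    considered = {k: v for k, v in inputs.items() if k not in ignore_input_vars}
    payload = json.dumps(
        {"task": task_name, "version": cache_version, "inputs": considered},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    first = sketch_cache_key("train", "1.0", {"x": 1, "seed": 7}, ("seed",))
    second = sketch_cache_key("train", "1.0", {"x": 1, "seed": 99}, ("seed",))
    bumped = sketch_cache_key("train", "1.1", {"x": 1, "seed": 7}, ("seed",))
    print(first == second)  # same key: only the ignored input differs
    print(first == bumped)  # different key: the cache version changed
```

In this model, changing an ignored input still produces a cache hit, while bumping the version forces a miss.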

@@ -127,7 +135,7 @@ Task executions can be cached across different versions of the task because a ch

### How does local caching work?

The flytekit package uses the [diskcache](https://github.com/grantjenks/python-diskcache) package, specifically [diskcache.Cache](http://www.grantjenks.com/docs/diskcache/tutorial.html#cache), to aid in the memoization of task executions. The results of local task executions are stored under `~/.flyte/local-cache/` and cache keys are composed of **Cache Version**, **Task Signature**, and **Task Input Values**.
Flyte uses a tool called [diskcache](https://github.com/grantjenks/python-diskcache), specifically [diskcache.Cache](http://www.grantjenks.com/docs/diskcache/tutorial.html#cache), to save task results so they don’t need to be recomputed if the same task is executed again, a technique known as ``memoization``. The results of local task executions are stored under `~/.flyte/local-cache/` and cache keys are composed of **Cache Version**, **Task Signature**, and **Task Input Values**.

Similar to the remote case, a local cache entry for a task will be invalidated if either the `cache_version` or the task signature is modified. In addition, the local cache can also be emptied by running the following command: `pyflyte local-cache clear`, which essentially obliterates the contents of the `~/.flyte/local-cache/` directory.
To disable the local cache, you can set the `local.cache_enabled` config option (e.g. by setting the environment variable `FLYTE_LOCAL_CACHE_ENABLED=False`).
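
The memoization idea described above can be sketched with a plain dictionary. This is a simplified stand-in, assuming an in-memory store instead of `diskcache`; the names `memoize`, `clear_local_cache`, and `square` are hypothetical and do not exist in flytekit.

```python
import inspect
from typing import Any, Callable, Dict, List, Tuple

# In-memory stand-in for the on-disk local cache.
_local_cache: Dict[Tuple[str, str, str], Any] = {}


def memoize(cache_version: str) -> Callable:
    """Toy local cache keyed on cache version, task signature, and input values."""

    def decorator(fn: Callable) -> Callable:
        signature = f"{fn.__name__}{inspect.signature(fn)}"

        def wrapper(**kwargs: Any) -> Any:
            key = (cache_version, signature, repr(sorted(kwargs.items())))
            if key not in _local_cache:
                # Cache miss: run the function and remember its result.
                _local_cache[key] = fn(**kwargs)
            return _local_cache[key]

        return wrapper

    return decorator


def clear_local_cache() -> None:
    """Drop every cached entry (loosely analogous to clearing the local cache)."""
    _local_cache.clear()


calls: List[int] = []  # records each real execution of square


@memoize("1.0")
def square(n: int) -> int:
    calls.append(n)
    return n * n


if __name__ == "__main__":
    square(n=3)
    square(n=3)
    print(len(calls))  # number of real executions so far
```

Because the signature is part of the key, changing the function's parameters or return annotation yields a new key, which mirrors how a modified task signature invalidates an existing entry.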