Commit: Refresh sasquatch docs

afausti committed Feb 8, 2024
1 parent c977c87 commit 995cf7d
Showing 13 changed files with 308 additions and 112 deletions.
42 changes: 16 additions & 26 deletions .github/workflows/docs.yaml
@@ -14,43 +14,34 @@ name: Docs
- "renovate/**"
- "tickets/**"
- "u/**"
tags:
- "*"
workflow_dispatch: {}
release:
types: [published]

jobs:

docs:

runs-on: ubuntu-latest
timeout-minutes: 10

steps:
- name: Checkout
uses: actions/checkout@v3

- name: Print GitHub event name
run: echo "${{ github.event_name }}"

- name: Filter paths
uses: dorny/paths-filter@v2
id: filter
- uses: actions/checkout@v4
with:
filters: |
docs:
- ".github/workflows/docs.yaml"
- "docs/**"
fetch-depth: 0 # full history for setuptools_scm

- name: Install graphviz
if: steps.filter.outputs.docs == 'true'
- name: Install Graphviz
run: sudo apt-get install graphviz

- name: Build docs
if: steps.filter.outputs.docs == 'true'
- name: Run tox
uses: lsst-sqre/run-tox@v1
with:
python-version: "3.11"
tox-envs: docs
python-version: "3.10"
tox-envs: "docs"
# Add docs-linkcheck when the docs and PyPI package are published
# tox-envs: "docs,docs-linkcheck"

# Only attempt documentation uploads for tagged releases and pull
# requests from ticket branches in the same repository. This avoids
# version clutter in the docs and failures when a PR doesn't have access
# to secrets.
- name: Upload to LSST the Docs
@@ -60,8 +51,7 @@ jobs:
dir: "docs/_build/html"
username: ${{ secrets.LTD_USERNAME }}
password: ${{ secrets.LTD_PASSWORD }}
if: >-
steps.filter.outputs.docs == 'true'
&& github.event_name != 'merge_group'
if: >
github.event_name != 'merge_group'
&& (github.event_name != 'pull_request'
|| startsWith(github.head_ref, 'tickets/'))
2 changes: 2 additions & 0 deletions .gitignore
@@ -6,3 +6,5 @@ venv
src/sasquatch.egg-info
.tox
.DS_Store
docs/user-guide/phalanx.code-workspace
src/sasquatch/__pycache__
Binary file added docs/_static/sasquatch_architecture_single.png
1 change: 0 additions & 1 deletion docs/_static/sasquatch_architecture_single.svg

This file was deleted.

6 changes: 0 additions & 6 deletions docs/apis.rst

This file was deleted.

59 changes: 59 additions & 0 deletions docs/developer-guide/index.rst
@@ -0,0 +1,59 @@
###############
Developer guide
###############

This part of the Sasquatch documentation contains information primarily of interest to developers of Sasquatch itself.


Architecture overview
=====================

.. toctree::
   :caption: Architecture overview

.. figure:: /_static/sasquatch_architecture_single.png
   :name: Sasquatch architecture overview

Kafka
-----

In Sasquatch, `Kafka`_ is used as a message queue to InfluxDB and for data replication between Sasquatch :ref:`environments`.

Kafka is managed by `Strimzi`_.
In addition to the Strimzi components, Sasquatch uses the Confluent Schema Registry and the Confluent Kafka REST proxy to connect HTTP-based clients with Kafka.

.. _Kafka: https://kafka.apache.org
.. _Strimzi: https://strimzi.io
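
As an illustration of the HTTP path, the sketch below posts a record to a Kafka topic through the REST proxy.
This is a minimal sketch using the Confluent REST Proxy v2 JSON embedded format; the proxy URL and topic name are hypothetical, and production Sasquatch topics are typically Avro-encoded with schemas registered in the Schema Registry.

.. code:: python

   import requests

   # Hypothetical in-cluster REST proxy URL and topic name.
   url = "http://sasquatch-rest-proxy:8082/topics/lsst.example.telemetry"
   headers = {"Content-Type": "application/vnd.kafka.json.v2+json"}
   payload = {"records": [{"value": {"temperature": 21.3, "band": "r"}}]}

   # Produce one record to the topic via the REST proxy.
   response = requests.post(url, json=payload, headers=headers)
   response.raise_for_status()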

Kafka Connect
-------------

In Sasquatch, Kafka connectors are managed by the `kafka-connect-manager`_ tool.

The InfluxDB Sink connector consumes Kafka topics, converts the records to the InfluxDB line protocol, and writes them to an InfluxDB database.
Sasquatch :ref:`namespaces` map to InfluxDB databases.

The MirrorMaker 2 source connector is used for data replication.
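
To illustrate the conversion, a record consumed from a Kafka topic maps to a single line protocol entry of the form ``measurement,tag_set field_set timestamp``.
The sketch below uses hypothetical topic, tag, and field names; which fields are written as InfluxDB tags depends on the connector configuration.

.. code:: python

   # A (hypothetical) record consumed from the lsst.example.telemetry topic...
   record = {"band": "r", "temperature": 21.3, "timestamp_ns": 1707350400000000000}

   # ...becomes one InfluxDB line protocol entry.
   line = (
       f"lsst.example.telemetry,band={record['band']} "
       f"temperature={record['temperature']} {record['timestamp_ns']}"
   )
   # lsst.example.telemetry,band=r temperature=21.3 1707350400000000000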


InfluxDB Enterprise
-------------------

InfluxDB is a `time series database`_ optimized for efficient storage and analysis of time series data.

InfluxDB organizes the data in measurements, fields, and tags.
In Sasquatch, Kafka topics (telemetry topics and metrics) map to InfluxDB measurements.

InfluxDB provides an SQL-like query language called `InfluxQL`_ and a more powerful data scripting language called `Flux`_.
Both languages can be used in Chronograf for data exploration and visualization.
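
For a flavor of the two languages, selecting one field from a measurement looks roughly as follows.
The query strings are shown embedded in Python; the measurement, field, and bucket names are hypothetical.

.. code:: python

   # InfluxQL: SQL-like query over a (hypothetical) measurement.
   influxql = (
       'SELECT "actualPosition" FROM "lsst.example.telemetry" '
       "WHERE time > now() - 1h"
   )

   # Flux: pipe-forward data scripting language, same (hypothetical) query.
   flux = """
   from(bucket: "efd/autogen")
     |> range(start: -1h)
     |> filter(fn: (r) => r._measurement == "lsst.example.telemetry")
     |> filter(fn: (r) => r._field == "actualPosition")
   """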

Read more about the Sasquatch architecture in `SQR-068`_.

.. _kafka-connect-manager: https://kafka-connect-manager.lsst.io/
.. _time series database: https://www.influxdata.com/time-series-database/
.. _InfluxQL: https://docs.influxdata.com/influxdb/v1.8/query_language/
.. _Flux: https://docs.influxdata.com/influxdb/v1.8/flux/
.. _SQR-068: https://sqr-068.lsst.io


7 changes: 4 additions & 3 deletions docs/documenteer.toml
@@ -1,6 +1,6 @@
[project]
title = "sasquatch"
copyright = "2023 Association of Universities for Research in Astronomy, Inc. (AURA)"
copyright = "2024 Association of Universities for Research in Astronomy, Inc. (AURA)"

[project.python]
package = "sasquatch"
@@ -11,8 +11,9 @@ disable_primary_sidebars = [
"index",
"changelog",
]
extensions = ["sphinxcontrib.youtube"]

extensions = [
"sphinxcontrib.youtube",
]

[sphinx.intersphinx.projects]
python = "https://docs.python.org/3/"
59 changes: 6 additions & 53 deletions docs/index.rst
@@ -5,64 +5,17 @@
Overview
########

Sasquatch is the Rubin Observatory service for recording, displaying, and alerting on telemetry data and metrics.
Sasquatch is the Rubin Observatory's service for metrics and telemetry data.

Sasquatch is currently deployed at the Summit, USDF and test stands through `Phalanx`_.

Sasquatch architecture
======================


.. figure:: /_static/sasquatch_architecture_single.svg
:name: Sasquatch architecture overview.

Kafka
-----

In Sasquatch, `Kafka`_ is used as a message queue to InfluxDB and for data replication between Sasquatch :ref:`environments`.

Kafka is managed by `Strimzi`_.
In addition to the Strimzi components, Sasquatch uses the Confluent Schema Registry and the Confluent Kafka REST proxy to connect HTTP-based clients with Kafka.

.. _Kafka: https://kafka.apache.org
.. _Strimzi: https://strimzi.io

Kafka Connect
-------------

In Sasquatch, Kafka connectors are managed by the `kafka-connect-manager`_ tool.

The InfluxDB Sink connector consumes Kafka topics, converts the records to the InfluxDB line protocol, and writes them to an InfluxDB database.
Sasquatch :ref:`namespaces` map to InfluxDB databases.

The MirrorMaker 2 source connector is used for data replication.

Sasquatch connectors are configured in `Phalanx`_.

InfluxDB
--------

InfluxDB is an open-source `time series database`_ optimized for efficient storage and analysis of time series data.

InfluxDB organizes the data in measurements, fields, and tags.
In Sasquatch, Kafka topics (telemetry topics and metrics) map to InfluxDB measurements.

InfluxDB provides an SQL-like query language called `InfluxQL`_ and a more powerful data scripting language called `Flux`_.
Both languages can be used in Chronograf for data exploration and visualization.

Read more about Sasquatch architecture in `SQR-068`_.

.. _Phalanx: https://phalanx.lsst.io
.. _kafka-connect-manager: https://kafka-connect-manager.lsst.io/
.. _time series database: https://www.influxdata.com/time-series-database/
.. _InfluxQL: https://docs.influxdata.com/influxdb/v1.8/query_language/
.. _Flux: https://docs.influxdata.com/influxdb/v1.8/flux/
.. _SQR-068: https://sqr-068.lsst.io
Built on Kafka and InfluxDB, Sasquatch offers a comprehensive solution for collecting, storing, and querying time-series data.

Sasquatch is currently deployed at the Summit, USDF and test stands through `Phalanx`_.

.. toctree::
   :hidden:

   User guide <user-guide/index>
   APIs <apis>
   Environments <environments>
   Developer guide <developer-guide/index>

.. _Phalanx: https://phalanx.lsst.io
37 changes: 37 additions & 0 deletions docs/user-guide/analysistools.rst
@@ -0,0 +1,37 @@
.. _analysis-tools:

########
Overview
########

The `Analysis Tools`_ package is used to create QA metrics from the `LSST Pipelines`_ outputs.

Currently, the Analysis Tools metrics are dispatched to the ``usdfdev_efd`` Sasquatch environment under the ``lsst.dm`` namespace.

The EFD Python client can be used to query these metrics.

For example, to get the list of analysis tools in the ``lsst.dm`` namespace, you can use:

.. code:: python

   from lsst_efd_client import EfdClient

   client = EfdClient("usdfdev_efd", db_name="lsst.dm")
   await client.get_topics()

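Once the topic names are known, the same client can be used to inspect and query an individual metric.
The following is a minimal sketch, assuming the client's ``get_fields`` method and the underlying ``influx_client`` query interface; the topic name is hypothetical.

.. code:: python

   # List the fields of a (hypothetical) Analysis Tools metric topic.
   fields = await client.get_fields("lsst.dm.exampleMetric")

   # Query the last 30 days of that metric with InfluxQL.
   result = await client.influx_client.query(
       'SELECT * FROM "lsst.dm.exampleMetric" WHERE time > now() - 30d'
   )
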
Example notebooks
=================

.. grid:: 3

   .. grid-item-card:: Analysis Tools metrics
      :link: https://github.com/lsst-sqre/sasquatch/blob/main/docs/user-guide/notebooks/AnalysisTools.ipynb
      :link-type: url

      Learn how to query Analysis Tools metrics using the EFD Python client and InfluxQL.


.. _LSST Pipelines: https://pipelines.lsst.io
.. _Analysis Tools: https://pipelines.lsst.io/v/daily/modules/lsst.analysis.tools/index.html
9 changes: 3 additions & 6 deletions docs/user-guide/efdclient.rst
@@ -5,9 +5,9 @@
The EFD Python client
#####################

The EFD Python client is based on `aioinflux`_, an asyncio client for accessing EFD data from a notebook in the RSP.
The EFD Python client provides convenience methods for accessing EFD data.

For example, at the USDF environment you can instantiate the EFD client using:
For example, at USDF you can instantiate the EFD client using:

.. code::

@@ -17,14 +17,11 @@ For example, at the USDF environment you can instantiate the EFD client using:

   await client.get_topics()
where ``usdf_efd`` is an alias to the :ref:`environment <environments>`.
It helps to discover the InfluxDB API URL and the credentials to connect to the EFD database.
It helps to discover the InfluxDB API URL and the credentials to connect to Sasquatch.

The EFD Python client provides convenience methods for accessing EFD data.
Read more about the methods available in the `EFD client documentation`_.
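
For instance, here is a minimal sketch of retrieving a time series with the client's ``select_time_series`` method, assuming ``client`` is the ``EfdClient`` instance created above; the topic and field names are illustrative and should be replaced with real EFD topics.

.. code:: python

   from astropy.time import Time, TimeDelta

   # Query the last hour of an (illustrative) telemetry topic and field.
   end = Time.now()
   start = end - TimeDelta(3600, format="sec")

   df = await client.select_time_series(
       "lsst.sal.MTMount.azimuth",
       ["actualPosition"],
       start,
       end,
   )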

.. _EFD client documentation: https://efd-client.lsst.io
.. _aioinflux: https://aioinflux.readthedocs.io/en/stable/


InfluxQL
--------
5 changes: 5 additions & 0 deletions docs/user-guide/index.rst
@@ -11,6 +11,11 @@ User guide
   The EFD Python client <efdclient>
   Working with timestamps <timestamps>

.. toctree::
   :caption: Analysis Tools metrics

   Overview <analysistools>

.. toctree::
   :caption: Data exploration and visualization with Chronograf
