DM-47199: Docs updates
* Steps to provision Sasquatch in a new environment
* IDF environment descriptions
* Clarify that strimzi-access-operator phalanx app is needed for kafka
direct connection guide.
fajpunk committed Oct 31, 2024
1 parent be3a7ab commit 509b4e6
Showing 9 changed files with 209 additions and 1 deletion.
Binary file added docs/_static/bootstrap_forwarding_rule.png
Binary file added docs/_static/forwarding_rule_details.png
Binary file added docs/_static/gcp_ip_addresses.png
Binary file added docs/_static/promote_ip_address.png
1 change: 1 addition & 0 deletions docs/developer-guide/index.rst
@@ -18,6 +18,7 @@ A Sasquatch developer is responsible for maintaining the Sasquatch components an
kafka-shutdown
broker-migration
connectors
new-environment

.. toctree::
:caption: Troubleshooting
141 changes: 141 additions & 0 deletions docs/developer-guide/new-environment.rst
@@ -0,0 +1,141 @@
################################
Deploying into a new environment
################################

Deploying Sasquatch into a new environment requires multiple ArgoCD syncs with some manual information gathering and updating in between.


Enable Sasquatch in Phalanx
===========================

#. Cut a `Phalanx`_ development branch.
#. Ensure the ``strimzi`` and ``strimzi-access-operator`` Phalanx applications are enabled and synced in the new environment by adding them to the :samp:`environments/values-{environment}.yaml` file, and adding a blank :samp:`values-{environment}.yaml` file to their ``applications/`` directories.
`These docs <https://phalanx.lsst.io/developers/switch-environment-to-branch.html>`_ can help you enable them from your development branch.
#. Enable the ``sasquatch`` app in the environment.
For the :samp:`applications/sasquatch/values-{environment}.yaml` file, copy one from an existing environment that has the same enabled services that you want in the new environment.
Change all of the environment references to the new environment, and change or add anything else you need for the new environment.
#. Comment out any ``loadBalancerIP`` entries in the :samp:`applications/sasquatch/values-{environment}.yaml` file.
We'll fill these in later.
#. In the new environment's ArgoCD, point the ``sasquatch`` app at your Phalanx development branch, and sync it.

This first sync will not be successful.
The `cert-manager`_ ``Certificate`` resource will be stuck in a progressing state until we update some values and provision some DNS.

.. _Phalanx: https://phalanx.lsst.io
.. _cert-manager: https://cert-manager.io/
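
The "comment out any ``loadBalancerIP`` entries" step above can be done by hand, but here is a minimal sketch of scripting it. The function name ``comment_out_load_balancer_ips`` is hypothetical, and the sketch assumes each ``loadBalancerIP`` key sits on its own line, as in the ``idfint`` values file.

```python
def comment_out_load_balancer_ips(values_text: str) -> str:
    """Return the values file text with active loadBalancerIP lines commented out.

    Hypothetical helper: it assumes every ``loadBalancerIP`` key is on its own
    line and leaves lines that are already commented untouched.
    """
    out = []
    for line in values_text.splitlines(keepends=True):
        stripped = line.lstrip(" \t")
        if stripped.startswith("loadBalancerIP:"):
            # Keep the original indentation so the entry is easy to restore later.
            indent = line[: len(line) - len(stripped)]
            line = f"{indent}# {stripped}"
        out.append(line)
    return "".join(out)
```

Running it a second time on its own output changes nothing, so it is safe to re-apply while you iterate on the values file.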

Gather IP addresses and update Phalanx config
=============================================

.. note::

The public IP address gathering and modification described here only applies to environments deployed on `GCP`_.
This process will be different for other types of environments.

#. Get the broker ids, which are the node ids of the Kafka brokers.
In this example, the broker ids are ``0``, ``1``, and ``2``:

.. code::

   ❯ kubectl get kafkanodepool -n sasquatch
   NAME         DESIRED REPLICAS   ROLES            NODEIDS
   controller   3                  ["controller"]   [3,4,5]
   kafka        3                  ["broker"]       [0,1,2]

#. A GCP public IP address will be provisioned for each of these broker nodes.
Another IP address will be provisioned for the external `kafka bootstrap servers`_ endpoint.
You can see all of the provisioned IP addresses in your GCP project here: :samp:`https://console.cloud.google.com/networking/addresses/list?authuser=1&hl=en&project={project name}`:

.. figure:: /_static/gcp_ip_addresses.png
:name: GCP IP addresses

#. One by one, click on the ``Forwarding rule`` links in each row until you find the ones annotated with :samp:`\{"kubernetes.io/service-name":"sasquatch/sasquatch-kafka-{broker node id}"\}` for each broker node.
Note the IP address and broker node id for each one.

.. figure:: /_static/forwarding_rule_details.png
:name: Forwarding rule details

#. Find and note the IP address that is annotated with ``{"kubernetes.io/service-name":"sasquatch/sasquatch-kafka-external-bootstrap"}``:

.. figure:: /_static/bootstrap_forwarding_rule.png
:name: Bootstrap forwarding rule

#. Promote all of these IP addresses to GCP Static IP Addresses by choosing the option in the three-vertical-dots menu for each IP address (you may have to scroll horizontally).
This makes sure that we won't lose these IP addresses and have to update DNS later:

.. figure:: /_static/promote_ip_address.png
:name: Promote IP address

#. Update the :samp:`applications/sasquatch/values-{environment}.yaml` ``strimzi-kafka.kafka`` config with ``loadBalancerIP`` and ``host`` entries that correspond with the node ids that you found.
Here is an example from ``idfint``.
Note that the broker node ids are in the ``broker`` entries, and that the ``host`` entries have numbers in them that match those ids.

.. code:: yaml

   strimzi-kafka:
     kafka:
       externalListener:
         tls:
           enabled: true
         bootstrap:
           loadBalancerIP: "35.188.187.82"
           host: sasquatch-int-kafka-bootstrap.lsst.cloud
         brokers:
           - broker: 0
             loadBalancerIP: "34.171.69.125"
             host: sasquatch-int-kafka-0.lsst.cloud
           - broker: 1
             loadBalancerIP: "34.72.50.204"
             host: sasquatch-int-kafka-1.lsst.cloud
           - broker: 2
             loadBalancerIP: "34.173.225.150"
             host: sasquatch-int-kafka-2.lsst.cloud

#. Push these changes to your Phalanx branch and sync ``sasquatch`` in ArgoCD.

.. _GCP: https://cloud.google.com
.. _kafka bootstrap servers: https://kafka.apache.org/documentation/#producerconfigs_bootstrap.servers
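
The steps in this section can be sketched in code, which may help when there are more than a few brokers. This is only an illustration: both function names are hypothetical, the parser assumes the column layout of the ``kubectl get kafkanodepool`` output shown above, and the host naming scheme (:samp:`{prefix}-{broker id}.{domain}`) is inferred from the ``idfint`` example. The IP addresses still have to be collected by hand from the GCP console.

```python
import json


def broker_ids_from_table(kubectl_output: str) -> list[int]:
    """Collect node ids from pools whose ROLES column includes "broker".

    Assumes ROLES and NODEIDS are the last two columns and are printed as
    compact JSON arrays, as in the example output in this guide.
    """
    broker_ids = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            roles = json.loads(fields[-2])
            node_ids = json.loads(fields[-1])
        except json.JSONDecodeError:
            continue  # header row, shell prompt, or anything else that is not data
        if isinstance(roles, list) and isinstance(node_ids, list) and "broker" in roles:
            broker_ids.extend(node_ids)
    return sorted(broker_ids)


def render_external_listener(
    bootstrap_ip: str,
    broker_ips: dict[int, str],
    host_prefix: str,
    domain: str = "lsst.cloud",
) -> str:
    """Render the strimzi-kafka externalListener values as YAML text."""
    lines = [
        "strimzi-kafka:",
        "  kafka:",
        "    externalListener:",
        "      tls:",
        "        enabled: true",
        "      bootstrap:",
        f'        loadBalancerIP: "{bootstrap_ip}"',
        f"        host: {host_prefix}-bootstrap.{domain}",
        "      brokers:",
    ]
    for broker_id in sorted(broker_ips):
        lines += [
            f"        - broker: {broker_id}",
            f'          loadBalancerIP: "{broker_ips[broker_id]}"',
            f"          host: {host_prefix}-{broker_id}.{domain}",
        ]
    return "\n".join(lines) + "\n"
```

For the ``idfint`` example, calling ``render_external_listener("35.188.187.82", {0: "34.171.69.125", 1: "34.72.50.204", 2: "34.173.225.150"}, "sasquatch-int-kafka")`` should produce the same entries as the values fragment above.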

Provision DNS for TLS certificate
=================================

#. Provision ``CNAME`` records (probably in AWS Route53) for `LetsEncrypt`_ verification for each of the ``host`` entries in the ``strimzi-kafka.kafka`` values.
Continuing with the ``idfint`` example:

.. code:: text

   _acme-challenge.sasquatch-int-kafka-0.lsst.cloud (_acme-challenge.tls.lsst.cloud)
   _acme-challenge.sasquatch-int-kafka-1.lsst.cloud (_acme-challenge.tls.lsst.cloud)
   _acme-challenge.sasquatch-int-kafka-2.lsst.cloud (_acme-challenge.tls.lsst.cloud)
   _acme-challenge.sasquatch-int-kafka-bootstrap.lsst.cloud (_acme-challenge.tls.lsst.cloud)

#. Provision ``A`` records for each of the ``host`` entries with their matching IP address values:

.. code:: text

   sasquatch-int-kafka-0.lsst.cloud (34.171.69.125)
   sasquatch-int-kafka-1.lsst.cloud (34.72.50.204)
   sasquatch-int-kafka-2.lsst.cloud (34.173.225.150)
   sasquatch-int-kafka-bootstrap.lsst.cloud (35.188.187.82)

#. Wait for the ``Certificate`` Kubernetes resource to provision in ArgoCD. This might take several minutes.

.. _LetsEncrypt: https://letsencrypt.org
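
The record lists above follow directly from the ``host`` and ``loadBalancerIP`` values, so they can be derived rather than typed. The sketch below is one way to do that, and to check which ``A`` records resolve yet. The function names are hypothetical, and the ``CNAME`` target ``_acme-challenge.tls.lsst.cloud`` is taken from the ``idfint`` example, so it may differ in other environments.

```python
import socket
from typing import Callable

# CNAME target copied from the idfint example; an assumption for other environments.
ACME_TARGET = "_acme-challenge.tls.lsst.cloud"


def dns_records(
    hosts: dict[str, str], acme_target: str = ACME_TARGET
) -> list[tuple[str, str, str]]:
    """Return (record type, name, value) tuples for a host -> IP address mapping."""
    records = [
        ("CNAME", f"_acme-challenge.{host}", acme_target) for host in sorted(hosts)
    ]
    records += [("A", host, hosts[host]) for host in sorted(hosts)]
    return records


def unresolved_hosts(
    hosts: dict[str, str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> list[str]:
    """Return the hosts whose A record does not resolve to the expected IP yet."""
    pending = []
    for host, expected_ip in sorted(hosts.items()):
        try:
            actual = resolve(host)
        except OSError:  # socket.gaierror is a subclass: the name does not resolve
            actual = None
        if actual != expected_ip:
            pending.append(host)
    return pending
```

The ``resolve`` parameter exists so the check can be exercised without real DNS. With the default resolver, an empty list from ``unresolved_hosts`` only says the ``A`` records are visible from where you run it. It says nothing about whether the ``Certificate`` has been issued.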

Configure Gafaelfawr OIDC authentication
========================================

Sasquatch assumes that Chronograf will use OIDC authentication.
Follow `these instructions <https://gafaelfawr.lsst.io/user-guide/openid-connect.html#chronograf>`_ to set it up.

.. warning::

This requires a Gafaelfawr restart.
It could also affect all of the apps in an environment if done incorrectly.
If your new environment is a production environment, you should probably wait for a maintenance window to do this step!

Merge your Phalanx branch!
==========================

If all is well, of course.
58 changes: 58 additions & 0 deletions docs/environments.rst
@@ -17,6 +17,12 @@ The table below summarizes the Sasquatch environments and their main entry point
+---------------------------+---------------------------------------------------+-----------------------------------+----------------+
| :ref:`USDF dev<usdfdev>` | https://usdf-rsp-dev.slac.stanford.edu/chronograf | ``usdfdev_efd`` | Not required |
+---------------------------+---------------------------------------------------+-----------------------------------+----------------+
| :ref:`IDF<idf>` | https://data.lsst.cloud/chronograf | (not available) | Not required |
+---------------------------+---------------------------------------------------+-----------------------------------+----------------+
| :ref:`IDF int<idfint>` | https://data-int.lsst.cloud/chronograf | (not available) | Not required |
+---------------------------+---------------------------------------------------+-----------------------------------+----------------+
| :ref:`IDF dev<idfdev>` | https://data-dev.lsst.cloud/chronograf | ``idfdev_efd`` | Not required |
+---------------------------+---------------------------------------------------+-----------------------------------+----------------+
| :ref:`TTS<tts>` | https://tucson-teststand.lsst.codes/chronograf | ``tucson_teststand_efd`` | NOIRLab VPN |
+---------------------------+---------------------------------------------------+-----------------------------------+----------------+
| :ref:`BTS<bts>` | https://base-lsp.lsst.codes/chronograf | ``base_efd`` | Chile VPN |
@@ -75,6 +81,58 @@ Intended audience: Project staff.
- Schema Registry: ``http://sasquatch-schema-registry.sasquatch:8081`` (cluster internal)
- Kafka REST proxy API: ``https://usdf-rsp-dev.slac.stanford.edu/sasquatch-rest-proxy``

.. _idf:

IDF
---

Sasquatch production environment for the community science platform in Google Cloud.
This instance is mainly used for :ref:`application metrics<appmetrics>`.

Intended audience: Project staff.

- Chronograf: ``https://data.lsst.cloud/chronograf``
- InfluxDB HTTP API: ``https://data.lsst.cloud/influxdb``
- Kafdrop UI: ``https://data.lsst.cloud/kafdrop``
- Kafka bootstrap server: ``sasquatch-kafka-bootstrap.lsst.cloud:9094``
- Schema Registry: ``http://sasquatch-schema-registry.sasquatch:8081`` (cluster internal)
- Kafka REST proxy API: (not available)

.. _idfint:

IDF int
-------

Sasquatch integration environment for the community science platform in Google Cloud.
This instance is used for testing.
There is no direct EFD integration.

Intended audience: Project staff.

- Chronograf: ``https://data-int.lsst.cloud/chronograf``
- InfluxDB HTTP API: ``https://data-int.lsst.cloud/influxdb``
- Kafdrop UI: ``https://data-int.lsst.cloud/kafdrop``
- Kafka bootstrap server: ``sasquatch-int-kafka-bootstrap.lsst.cloud:9094``
- Schema Registry: ``http://sasquatch-schema-registry.sasquatch:8081`` (cluster internal)
- Kafka REST proxy API: ``https://data-int.lsst.cloud/sasquatch-rest-proxy``

.. _idfdev:

IDF dev
-------

Sasquatch dev environment for the community science platform in Google Cloud.
This instance is used for testing.

Intended audience: Project staff.

- Chronograf: ``https://data-dev.lsst.cloud/chronograf``
- InfluxDB HTTP API: ``https://data-dev.lsst.cloud/influxdb``
- Kafdrop UI: ``https://data-dev.lsst.cloud/kafdrop``
- Kafka bootstrap server: ``sasquatch-dev-kafka-bootstrap.lsst.cloud:9094``
- Schema Registry: ``http://sasquatch-schema-registry.sasquatch:8081`` (cluster internal)
- Kafka REST proxy API: ``https://data-dev.lsst.cloud/sasquatch-rest-proxy``

.. _tts:

Tucson Test Stand (TTS)
4 changes: 3 additions & 1 deletion docs/user-guide/app-metrics.rst
@@ -1,9 +1,11 @@
.. _appmetrics:

===================
Application metrics
===================

Applications can use Sasquatch infrastructure to publish metrics events to `InfluxDB`_ via `Kafka`_.
Setting certain Sasquatch values in Phalanx will create Kafka user and topic, and configure a Telegraf consumer to put messages from that topic into the ``telegraf-kafka-app-metrics-consumer`` database in the Sasquatch InfluxDB instance.
Setting certain Sasquatch values in Phalanx will create a Kafka user and topic, and configure a Telegraf consumer to put messages from that topic into the ``lsst.square.metrics`` database in the Sasquatch InfluxDB instance.

The messages are expected to be in :ref:`Avro <avro>` format, and schemas are expected to be in the `Schema Registry`_ for any messages that are encoded with a schema ID.

6 changes: 6 additions & 0 deletions docs/user-guide/directconnection.rst
@@ -15,11 +15,17 @@ This guide describes the the most secure and straightforward option, assuming th
Generating Kafka credentials
============================

.. note::

The ``strimzi-access-operator`` `Phalanx`_ app must be enabled.
It provides the ``KafkaAccess`` CRD that is used in this guide.

You can generate Kafka credentials by creating a couple of `Strimzi`_ resources:

* A `KafkaUser`_ resource, in the ``sasquatch`` namespace, to configure a user in the Kafka cluster and provision a Kubernetes Secret with that user's credentials
* A `KafkaAccess`_ resource, in your app's namespace, to make those credentials and other Kafka connection information available to your app

.. _Phalanx: https://phalanx.lsst.io
.. _Strimzi: https://strimzi.io
.. _KafkaUser: https://strimzi.io/docs/operators/latest/configuring.html#type-KafkaUser-reference
.. _KafkaAccess: https://github.com/strimzi/kafka-access-operator
