https://issues.redhat.com/browse/ACM-15420 #7277

Open
wants to merge 6 commits into
base: 2.12_stage
89 changes: 47 additions & 42 deletions observability/observability_enable.adoc
@@ -1,18 +1,18 @@
[#enabling-observability-service]
= Enabling the observability service
= Enabling the Observability service

When you enable the observability service on your hub cluster, the `multicluster-observability-operator` watches for new managed clusters and automatically deploys metric and alert collection services to the managed clusters. You can use metrics and configure Grafana dashboards to make cluster resource information visible, help you save cost, and prevent service disruptions.
When you enable the Observability service on your hub cluster, the `multicluster-observability-operator` watches for new managed clusters and automatically deploys metric and alert collection services to the managed clusters. You can use metrics and configure Grafana dashboards to make cluster resource information visible, help you save cost, and prevent service disruptions.

Monitor the status of your managed clusters with the observability component, also known as the `multicluster-observability-operator` pod.
Monitor the status of your managed clusters with the Observability component, also known as the `multicluster-observability-operator` pod.

*Required access:* Cluster administrator, the `open-cluster-management:cluster-manager-admin` role, or S3 administrator.

* <<prerequisites-observability,Prerequisites>>
* <<enabling-observability,Enabling observability from the command line interface>>
* <<enabling-observability,Enabling Observability from the command line interface>>
* <<creating-mco-cr,Creating the MultiClusterObservability custom resource>>
* <<enabling-observability-ocp,Enabling observability from the {ocp} console>>
* <<disabling-observability,Disabling observability>>
* <<removing-observability-resource,Removing observability>>
* <<enabling-observability-ocp,Enabling Observability from the {ocp} console>>
* <<disabling-observability,Disabling Observability>>
* <<removing-observability-resource,Removing Observability>>

[#prerequisites-observability]
== Prerequisites
@@ -21,7 +21,7 @@ Monitor the status of your managed clusters with the observability component, al
- You must define a storage class in the `MultiClusterObservability` custom resource, if there is no default storage class specified.
- Direct network access to the hub cluster is required. Network access to load balancers and proxies are not supported. For more information, see link:../networking/networking_intro.adoc#networking[Networking].
- You must configure an object store to create a storage solution.
** *Important:* When you configure your object store, ensure that you meet the encryption requirements that are necessary when sensitive data is persisted. The observability service uses Thanos supported, stable object stores. You might not be able to share an object store bucket by multiple {acm-short} observability installations. Therefore, for each installation, provide a separate object store bucket.
** *Important:* When you configure your object store, ensure that you meet the encryption requirements that are necessary when sensitive data is persisted. The Observability service uses Thanos supported, stable object stores. You might not be able to share an object store bucket by multiple {acm-short} Observability installations. Therefore, for each installation, provide a separate object store bucket.
** {acm-short} supports the following cloud providers with stable object stores:

* Amazon Web Services S3 (AWS S3)
@@ -33,19 +33,19 @@ Monitor the status of your managed clusters with the observability component, al


[#enabling-observability]
== Enabling observability from the command line interface
== Enabling Observability from the command line interface

Enable the observability service by creating a `MultiClusterObservability` custom resource instance. Before you enable observability, see xref:../observability/observe_environments.adoc#observability-pod-capacity-requests[Observability pod capacity requests] for more information.
Enable the Observability service by creating a `MultiClusterObservability` custom resource instance. Before you enable Observability, see xref:../observability/observe_environments.adoc#observability-pod-capacity-requests[Observability pod capacity requests] for more information.

*Note:*
*Notes:*

- When observability is enabled or disabled on {ocp-short} managed clusters that are managed by {acm-short}, the observability endpoint operator updates the `cluster-monitoring-config` config map by adding additional `alertmanager` configuration that automatically restarts the local Prometheus.
- The observability endpoint operator updates the `cluster-monitoring-config` config map by adding additional `alertmanager` configurations that automatically restart the local Prometheus. When you insert the `alertmanager` configuration in the {ocp-short} managed cluster, the configuration removes the settings that relate to the retention field of the Prometheus metrics.
- When Observability is enabled or disabled on {ocp-short} managed clusters that are managed by {acm-short}, the Observability endpoint operator updates the `cluster-monitoring-config` config map by adding additional `alertmanager` configuration that automatically restarts the local Prometheus.
- The Observability endpoint operator updates the `cluster-monitoring-config` config map by adding additional `alertmanager` configurations that automatically restart the local Prometheus. When you insert the `alertmanager` configuration in the {ocp-short} managed cluster, the configuration removes the settings that relate to the retention field of the Prometheus metrics.

Complete the following steps to enable the observability service:
Complete the following steps to enable the Observability service:

. Log in to your {acm-short} hub cluster.
. Create a namespace for the observability service with the following command:
. Create a namespace for the Observability service with the following command:

+
[source,bash]
@@ -81,7 +81,7 @@ oc create secret generic multiclusterhub-operator-pull-secret \
----

+
*Important:* If you modify the global pull secret for your cluster by using the {ocp-short} documentation, be sure to also update the global pull secret in the observability namespace. See link:https://docs.redhat.com/documentation/en-us/openshift_container_platform/4.15/html/images/managing-images#images-update-global-pull-secret_using-image-pull-secrets[Updating the global pull secret] for more details.
*Important:* If you modify the global pull secret for your cluster by using the {ocp-short} documentation, be sure to also update the global pull secret in the Observability namespace. See link:https://docs.redhat.com/documentation/en-us/openshift_container_platform/4.15/html/images/managing-images#images-update-global-pull-secret_using-image-pull-secrets[Updating the global pull secret] for more details.

. Create a secret for your object storage for your cloud provider. Your secret must contain the credentials to your storage solution. For example, run the following command:

@@ -225,7 +225,7 @@ Generating access keys using AWS Security Service require the following addition

. Create an IAM policy that limits access to an S3 bucket.
. Create an IAM role with a trust policy to generate JWT tokens for {ocp-short} service accounts.
. Specify annotations for the observability service accounts that requires access to the S3 bucket. You can find an example of how observability on Red Hat OpenShift Service on AWS (ROSA) cluster can be configured to work with AWS STS tokens in the _Set environment_ step. See link:https://www.rosaworkshop.io/[Red Hat OpenShift Service on AWS (ROSA)] for more details, along with link:https://www.rosaworkshop.io/rosa/15-sts_explained/[ROSA with STS explained] for an in-depth description of the requirements and setup to use STS tokens.
. Specify annotations for the Observability service accounts that require access to the S3 bucket. You can find an example of how Observability on a Red Hat {rosa} (ROSA) cluster can be configured to work with AWS STS tokens in the _Set environment_ step. See link:https://www.rosaworkshop.io/[Red Hat {rosa} (ROSA)] for more details, along with link:https://www.rosaworkshop.io/rosa/15-sts_explained/[ROSA with STS explained] for an in-depth description of the requirements and setup to use STS tokens.

[#generate-access-keys]
=== Generating access keys using the AWS Security Service
@@ -408,7 +408,7 @@ YOUR_CLOUD_PROVIDER_SECRET_KEY=$(oc -n open-cluster-management-observability get
echo $YOUR_CLOUD_PROVIDER_SECRET_KEY
----

. Verify that observability is enabled by checking the pods for the following deployments and stateful sets. You might receive the following information:
. Verify that Observability is enabled by checking the pods for the following deployments and stateful sets. You might receive the following information:

+
----
@@ -482,7 +482,7 @@ See link:../apis/observability.json.adoc#observability-api[Observability API] fo
+
For more information, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.15/html-single/machine_management/index#creating-infrastructure-machinesets[Creating infrastructure machine sets].

. Apply the observability YAML to your cluster by running the following command:
. Apply the Observability YAML to your cluster by running the following command:

+
[source,bash]
@@ -492,7 +492,7 @@ oc apply -f multiclusterobservability_cr.yaml
+
All the pods in `open-cluster-management-observability` namespace for Thanos, Grafana and Alertmanager are created. All the managed clusters connected to the {acm-short} hub cluster are enabled to send metrics back to the {acm-short} Observability service.

. Validate that the observability service is enabled and the data is populated by launching the Grafana dashboards.
. Validate that the Observability service is enabled and the data is populated by launching the Grafana dashboards.

. Click the *Grafana link* that is near the console header, from either the console _Overview_ page or the _Clusters_ page.

@@ -522,34 +522,34 @@ multicluster-observability-operator 1/1 1 1 35m ins
installer.namespace: open-cluster-management
----

. _Optional:_ If you want to exclude specific managed clusters from collecting the observability data, add the following cluster label to your clusters: `observability: disabled`.
. _Optional:_ If you want to exclude specific managed clusters from collecting the Observability data, add the following cluster label to your clusters: `observability: disabled`.

The observability service is enabled. After you enable the observability service, the following functions are initiated:
The Observability service is enabled. After you enable the Observability service, the following functions are initiated:

- All the alert managers from the managed clusters are forwarded to the {acm-short} hub cluster.
- All the managed clusters that are connected to the {acm-short} hub cluster are enabled to send alerts back to the {acm-short} observability service. You can configure the {acm-short} Alertmanager to take care of deduplicating, grouping, and routing the alerts to the correct receiver integration such as email, PagerDuty, or OpsGenie. You can also handle silencing and inhibition of the alerts.
- All the managed clusters that are connected to the {acm-short} hub cluster are enabled to send alerts back to the {acm-short} Observability service. You can configure the {acm-short} Alertmanager to take care of deduplicating, grouping, and routing the alerts to the correct receiver integration such as email, PagerDuty, or OpsGenie. You can also handle silencing and inhibition of the alerts.
+
*Note:* Alert forwarding to the {acm-short} hub cluster feature is only supported by managed clusters on a supported {ocp-short} version. After you install {acm-short} with observability enabled, alerts are automatically forwarded to the hub cluster. See xref:../observability/customize_observability.adoc#forward-alerts[Forwarding alerts] to learn more.
*Note:* Alert forwarding to the {acm-short} hub cluster is only supported by managed clusters on a supported {ocp-short} version. After you install {acm-short} with Observability enabled, alerts are automatically forwarded to the hub cluster. See xref:../observability/customize_observability.adoc#forward-alerts[Forwarding alerts] to learn more.

[#enabling-observability-ocp]
== Enabling observability from the {ocp} console
== Enabling Observability from the {ocp} console

Optionally, you can enable observability from the {ocp} console, create a project named `open-cluster-management-observability`. Complete the following steps:
Optionally, you can enable Observability from the {ocp} console. First, create a project named `open-cluster-management-observability`. Then complete the following steps:

. Create an image pull-secret named, `multiclusterhub-operator-pull-secret` in the `open-cluster-management-observability` project.

. Create your object storage secret named, `thanos-object-storage` in the `open-cluster-management-observability` project.

. Enter the object storage secret details, then click *Create*. See step four of the _Enabling observability_ section to view an example of a secret.
. Enter the object storage secret details, then click *Create*. See step four of the _Enabling Observability_ section to view an example of a secret.

. Create the `MultiClusterObservability` custom resource instance. When you receive the following message, the observability service is enabled successfully from {ocp-short}: `Observability components are deployed and running`.
. Create the `MultiClusterObservability` custom resource instance. When you receive the following message, the Observability service is enabled successfully from {ocp-short}: `Observability components are deployed and running`.
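
As a minimal sketch, the `MultiClusterObservability` custom resource that you create in the last step might resemble the following YAML. The resource name `observability` and the secret key `thanos.yaml` are assumptions for illustration. Match them to the names that you used when you created the `thanos-object-storage` secret:

[source,yaml]
----
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability # assumed resource name
spec:
  observabilityAddonSpec: {} # empty object keeps the add-on defaults
  storageConfig:
    metricObjectStorage:
      name: thanos-object-storage # object storage secret from the earlier step
      key: thanos.yaml # assumed key inside the secret that holds the Thanos configuration
----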

[#verifying-thanos-version]
=== Verifying the Thanos version

After Thanos is deployed on your cluster, verify the Thanos version from the command line interface (CLI).

After you log in to your hub cluster, run the following command in the observability pods to receive the Thanos version:
After you log in to your hub cluster, run the following command in the Observability pods to receive the Thanos version:

[source,bash]
----
@@ -559,40 +559,45 @@ thanos --version
The Thanos version is displayed.

[#disabling-observability]
== Disabling observability
== Disabling Observability

You can disable observability, which stops data collection on the {acm-short} hub cluster.
You can disable Observability, which stops data collection on the {acm-short} hub cluster.

[#disabling-observability-on-all-clusters]
=== Disabling observability on all clusters
=== Disabling Observability on all clusters

Disable Observability by removing Observability components on all managed clusters.

Disable observability by removing observability components on all managed clusters.
Update the `multicluster-observability-operator` resource by setting `enableMetrics` to `false`. Your updated resource might resemble the following change:

[source,yaml]
----
spec:
  imagePullPolicy: Always
  imagePullSecret: multiclusterhub-operator-pull-secret
  observabilityAddonSpec: # The ObservabilityAddonSpec defines the global settings for all managed clusters which have observability add-on enabled
    enableMetrics: false #indicates the observability addon push metrics to hub server
  observabilityAddonSpec: <1>
    enableMetrics: false <2>
    workers: <3>
----
<1> Use the `observabilityAddonSpec` parameter to define the global settings for all managed clusters that have the Observability add-on enabled.
<2> Use the `enableMetrics` parameter to indicate whether the Observability add-on pushes metrics to the hub cluster server.
<3> Use the `workers` parameter to add worker nodes to the metric collector process, which shards the federate requests made to Prometheus on your hub cluster. The metric collector then sends separate remote-write requests to Thanos on your hub cluster.
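
For context, the following sketch shows where the `enableMetrics` setting sits in a complete resource. The resource name `observability` is an assumption, and the fragment shows only the fields that relate to disabling metrics collection. Keep the other fields of your existing resource, such as `storageConfig`, unchanged:

[source,yaml]
----
apiVersion: observability.open-cluster-management.io/v1beta2
kind: MultiClusterObservability
metadata:
  name: observability # assumed name; use the name of your existing resource
spec:
  observabilityAddonSpec:
    enableMetrics: false # the add-on on managed clusters stops pushing metrics to the hub cluster
----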
Comment on lines +578 to +584
Contributor Author

@saswatamcode when you have a chance, will you review this part of the PR please?


[#disabling-observability-on-a-single-cluster]
=== Disabling observability on a single cluster
=== Disabling Observability on a single cluster

Disable observability by removing observability components on specific managed clusters. Complete the following steps:
Disable Observability by removing Observability components on specific managed clusters. Complete the following steps:

. Add the `observability: disabled` label to the `managedclusters.cluster.open-cluster-management.io` custom resource.

. From the {acm-short} console _Clusters_ page, add the `observability=disabled` label to the specified cluster.
+
*Note:* When a managed cluster with the observability component is detached, the `metrics-collector` deployments are removed.
*Note:* When a managed cluster with the Observability component is detached, the `metrics-collector` deployments are removed.
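
As an illustration, a `ManagedCluster` resource that carries the label might resemble the following fragment. The cluster name `my-managed-cluster` is hypothetical, and the fragment shows only the metadata. Leave the `spec` of your existing resource as it is:

[source,yaml]
----
apiVersion: cluster.open-cluster-management.io/v1
kind: ManagedCluster
metadata:
  name: my-managed-cluster # hypothetical cluster name
  labels:
    observability: disabled # excludes this cluster from Observability data collection
----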

[#removing-observability-resource]
== Removing observability
== Removing Observability

When you remove the `MultiClusterObservability` custom resource, you are disabling and uninstalling the observability service. From the {ocp-short} console navigation, select *Operators* > *Installed Operators* > *Advanced Cluster Manager for Kubernetes*. Remove the `MultiClusterObservability` custom resource.
When you remove the `MultiClusterObservability` custom resource, you are disabling and uninstalling the Observability service. From the {ocp-short} console navigation, select *Operators* > *Installed Operators* > *Advanced Cluster Manager for Kubernetes*. Remove the `MultiClusterObservability` custom resource.

[#additional-resources-enable-obs]
== Additional resources
@@ -606,9 +611,9 @@ When you remove the `MultiClusterObservability` custom resource, you are disabli
* link:https://www.redhat.com/en/technologies/cloud-computing/openshift-data-foundation[Red Hat OpenShift Data Foundation (formerly known as Red Hat OpenShift Container Storage)]
* link:https://www.ibm.com/docs/en/baw/20.x?topic=storage-preparing-cloud-public-roks[Red Hat OpenShift on IBM (ROKS)]

- See xref:../observability/using_observability.adoc#using-observability[Using observability].
- See xref:../observability/using_observability.adoc#using-observability[Using Observability].

- To learn more about customizing the observability service, see xref:../observability/customize_observability.adoc#customizing-observability[Customizing observability].
- To learn more about customizing the Observability service, see xref:../observability/customize_observability.adoc#customizing-observability[Customizing Observability].

- For more related topics, return to the xref:../observability/observe_environments_intro.adoc#observing-environments-intro[Observability service].

2 changes: 2 additions & 0 deletions release_notes/acm_whats_new.adoc
@@ -84,6 +84,8 @@ For other Application topics, see link:../applications/app_management_overview.a

* You can now use the _Advanced search_ option from the console by selecting the *Advanced search* drop-down button. Specify your query and receive results that match the exact strings that you enter and range-based search parameters. See link:../console/search_console.adoc#search-customization[Search customization and configurations].

* Use the new `workers` parameter in the `ObservabilityAddOn` custom resource definition to add more worker nodes to the metric collector process to shard federate requests made to your hub cluster. See link:../observability/observability_enable.adoc#enabling-observability-service[Enabling the observability service].

See link:../observability/observe_environments_intro.adoc#observing-environments-intro[Observability service introduction].

[#governance-whats-new]