https://issues.redhat.com/browse/ACM-22475 Backup and restore Observability #8222
Merged
+137
−16
21 commits
f4af7cf
https://issues.redhat.com/browse/ACM-22475
dockerymick 7fdc3f9
More changes
dockerymick d33ca43
more updates, modular writing
dockerymick 28c98f2
Removed hidden comments
dockerymick 3de94fb
Apply suggestions from code review
dockerymick b8060c5
Reduced table syntax
dockerymick 34c4cf1
Update backup_restore_obs.adoc
dockerymick ae867a8
Updates
dockerymick c4a6b18
Update backup_restore_obs.adoc
dockerymick 28cf2aa
More updates after reviewing local-cluster details
dockerymick a3dceeb
Removing hidden comment
dockerymick 6a69196
Update business_continuity/backup_restore/backup_restore_config_obs.adoc
dockerymick 8016c5f
Update business_continuity/backup_restore/backup_restore_config_obs.adoc
dockerymick d0d76eb
Update business_continuity/backup_restore/backup_restore_obs.adoc
dockerymick b0a3688
Few more updates after initial peer review
dockerymick fefe642
Update backup_restore_config_obs.adoc
dockerymick 745f95b
Updates after review from peer
dockerymick 4a506ff
Update business_continuity/backup_restore/backup_restore_obs.adoc
dockerymick 5b59223
Updates after dev lead review
dockerymick 768295a
Update backup_restore_obs.adoc
dockerymick 1a8b5fe
Updates after second review from peer
dockerymick
26 changes: 26 additions & 0 deletions
26
business_continuity/backup_restore/backup_restore_config_obs.adoc
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| [#backup-restore-obs-config] | ||
| = Backup and restore configuration for Observability | ||
|
|
||
| The Observability service uses an S3-compatible object store to keep all time-series data collected from managed clusters. Because Observability is a stateful service, it is sensitive to active and passive backup patterns. You must configure Observability to ensure that your data stays safe and keeps its continuity during the hub cluster migration or backup. | ||
|
|
||
| *Notes:* | ||
|
|
||
| - When a managed cluster is detached from the primary hub cluster and reattached to the backup hub cluster, metrics are not collected during the transition. To reduce the gap in metrics, you can script the cluster migration for large fleets. | ||
|
|
||
| - For product backup and restore, the Observability service automatically labels its resources with the `cluster.open-cluster-management.io/backup` label. | ||
|
|
||
| .Resources that are automatically backed up and restored for Observability | ||
| |==== | ||
| | Resource type | Resource name | ||
|
|
||
| | ConfigMaps | ||
| | `observability-metrics-custom-allowlist`, `thanos-ruler-custom-rules`, `alertmanager-config`, `policy-acs-central-status`, Any ConfigMap labeled with `grafana-custom-dashboard` | ||
|
|
||
| | Secrets | ||
| | `thanos-object-storage`, `observability-server-ca-certs`, `observability-client-ca-certs`, `observability-server-certs`, `observability-grafana-certs`, `alertmanager-byo-ca`, `alertmanager-byo-cert`, `proxy-byo-ca`, `proxy-byo-cert` | ||
| |==== | ||
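To check which ConfigMaps and Secrets carry the backup label on a hub cluster, a label-selector query along these lines might help. This is a minimal sketch, not a documented procedure: the helper name `list_obs_backup_resources` is hypothetical, and the namespace is assumed to be `open-cluster-management-observability`. A selector that contains only the label key matches any resource that has the label, whatever its value.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the product.
# Assumption: Observability resources live in this namespace.
OBS_NS="${OBS_NS:-open-cluster-management-observability}"
BACKUP_LABEL="cluster.open-cluster-management.io/backup"

list_obs_backup_resources() {
  # Key-only selector: matches every ConfigMap and Secret that has the
  # backup label, regardless of the label value.
  oc get configmaps,secrets -n "$OBS_NS" -l "$BACKUP_LABEL" -o name
}
```

Running `list_obs_backup_resources` while logged in to the hub cluster prints one resource name per line, which can be compared against the table above.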
|
|
||
| == Additional resources | ||
|
|
||
| - For the steps to complete the backup and restore for Observability, see xref:../backup_restore/backup_restore_obs.adoc#backup-restore-obs[Backing up and restoring Observability service]. | ||
|
|
||
|
||
98 changes: 98 additions & 0 deletions
98
business_continuity/backup_restore/backup_restore_obs.adoc
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,98 @@ | ||
| [#backup-restore-obs] | ||
| = Backing up and restoring Observability service | ||
|
|
||
| Back up and restore the Observability service to keep data safe and to support continuity during the hub cluster migration or backup. To reduce disruption in metric data collection, use the same S3-compatible object store for both the primary and backup hub clusters. | ||
|
|
||
| .Prerequisites | ||
|
|
||
| - Ensure that you can run a restore operation for backup types by completing the xref:../business_continuity/backup_restore/backup_restore.adoc#restoring-backup-restore-operation[Using the restore operation for backup types] process. | ||
|
|
||
| .Procedure | ||
|
|
||
| Complete the following steps to back up and restore the Observability service: | ||
|
|
||
| . To ensure that the Observability service recognizes the hub cluster as a self-managed hub cluster named `local-cluster`, set the `spec.disableHubSelfManagement` parameter in the `MultiClusterHub` custom resource to `false`. | ||
|
|
||
| + | ||
| *Note:* If you change the default name of your `local-cluster` to another value, the results appear under the changed local cluster name. | ||
|
|
||
| . To preserve the tenant ID of the `observatorium` resource, manually back up the resource so that you can restore it later. Run the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc get observatorium -n open-cluster-management-observability -o yaml > observatorium-backup.yaml | ||
| ---- | ||
|
|
||
| . To back up the `observability` deployment, run the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc get mco observability -o yaml > mco-cr-backup.yaml | ||
| ---- | ||
|
|
||
| . Shut down the Thanos compactor on your primary hub cluster by running the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc scale statefulset observability-thanos-compact -n open-cluster-management-observability --replicas=0 | ||
| ---- | ||
|
|
||
| .. Verify that the compactor is not active by running the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc get pods observability-thanos-compact-0 -n open-cluster-management-observability | ||
| ---- | ||
|
|
||
| . Restore the `backup` resources, such as the automatically backed-up ConfigMaps and Secrets that are listed in the backup and restore configuration for Observability. | ||
|
|
||
| . To preserve the tenant ID and maintain continuity in metrics ingestion and querying, restore the `observatorium` resource to the backup hub cluster. Run the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc apply -f observatorium-backup.yaml | ||
| ---- | ||
|
|
||
| . Apply the backed-up `MultiClusterObservability` custom resource to start the Observability service on the newly restored hub cluster. Run the following command: | ||
|
|
||
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc apply -f mco-cr-backup.yaml | ||
| ---- | ||
| + | ||
| The operator starts the Observability service and detects the existing `observatorium` resource, reusing the preserved tenant ID instead of creating a new one. | ||
|
|
||
| . Verify that the Observability service runs on your new hub cluster. Run the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc get pods -n open-cluster-management-observability | ||
| ---- | ||
|
|
||
|
||
| . Verify that the `observability-controller` `managedclusteraddon` does not have a status in the `DEGRADED` column, and that the `PROGRESSING` status is not set to `False`. Run the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc get managedclusteraddons -A | awk 'NR==1 || /observability-controller/' | ||
| ---- | ||
|
|
||
| . Verify metrics collection from your managed clusters by accessing Grafana. | ||
|
|
||
| . Verify that your managed clusters are connected to your new hub cluster by checking for the `Available` status for each managed cluster. | ||
|
|
||
| . Shut down the Observability service on your previous hub cluster by removing the resources. Run the following command: | ||
|
|
||
| + | ||
| [source,bash] | ||
| ---- | ||
| oc delete mco observability | ||
| ---- | ||
|
|
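For repeated migrations, the backup commands and the compactor shutdown on the primary hub cluster could be wrapped in small functions. The following is a sketch under stated assumptions: the function names and the optional output directory argument are hypothetical, and the replica check reads `.status.replicas` from the StatefulSet instead of looking up the pod, which is a substitute for the pod check in the procedure.

```shell
#!/usr/bin/env bash
OBS_NS="open-cluster-management-observability"

backup_observability() {
  # $1: directory for the backup files (defaults to the current directory).
  local dir="${1:-.}"
  oc get observatorium -n "$OBS_NS" -o yaml > "$dir/observatorium-backup.yaml" || return 1
  oc get mco observability -o yaml > "$dir/mco-cr-backup.yaml"
}

stop_compactor() {
  # Scale the Thanos compactor down to zero replicas.
  oc scale statefulset observability-thanos-compact -n "$OBS_NS" --replicas=0
}

compactor_is_stopped() {
  # Succeeds only when the StatefulSet reports zero running replicas.
  local replicas
  replicas="$(oc get statefulset observability-thanos-compact -n "$OBS_NS" \
    -o jsonpath='{.status.replicas}')" || return 1
  [ "$replicas" = "0" ]
}
```

A typical sequence on the primary hub cluster would be `backup_observability ./backup && stop_compactor`, followed by calling `compactor_is_stopped` until it succeeds.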
||
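The add-on verification step filters the `oc get managedclusteraddons -A` output with `awk`. As an illustration of what that filter does, the sketch below isolates it in a function that reads from standard input, so it can be tried on saved output without cluster access. The function name is hypothetical.

```shell
#!/usr/bin/env bash
filter_obs_addon() {
  # Keep the header row (NR==1) and every row that mentions the
  # observability-controller add-on; drop all other add-ons.
  awk 'NR==1 || /observability-controller/'
}

# Usage on a live hub cluster (requires oc and a logged-in session):
#   oc get managedclusteraddons -A | filter_obs_addon
```

Keeping the header row makes it easier to read the `DEGRADED` and `PROGRESSING` columns for each managed cluster.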