[CP 1397] [CP 1391] anr docs update and minor fix during resource cleanup #532

Open
ci-penbot-01 wants to merge 1 commit into ROCm:main from
ci-penbot-01:CP.O2O.pensando.gpu-operator.1397.rocm.gpu-operator.main
@ci-penbot-01
Contributor

Cherry-pick of pensando/gpu-operator#1397


Source PR Description (pensando/gpu-operator#1397):

This PR fixes several documentation issues and a couple of ANR issues. Fixes:
GPUOP-645 - Update documentation for enabling ANR on Openshift environment
GPUOP-647 - Handle cleanup of configmap during uninstall of operator
GPUOP-648 - Update documentation for maxAllowedRunsPerWindow attribute
GPUOP-652 - modify sed command in configMapImage utility to overcome permission issues in Openshift deployment

Cherrypick triggered by: ACP-Automation

* anr docs update and minor fixes

* update sed command to overcome Openshift permission issue

(cherry picked from commit 90b20f956109f7203dcd2eee481b032f814868dc)

Co-authored-by: Uday Bhaskar <udayb@amd.com>
(cherry picked from commit 74a29c11f8c7421fd21e731712c408e2b50e87ed)
@ci-penbot-01
Contributor Author

AI-Assisted Cherry-Pick

Source PR: #1397
Target Branch: main

The cherry-pick operation encountered merge conflicts which were resolved automatically using AI assistance.

Files with conflicts (resolved by AI):

  • docs/autoremediation/auto-remediation.md:91-95
  • internal/controllers/remediation_handler.go:1964-2076
Original conflict in docs/autoremediation/auto-remediation.md
<<<<<<< HEAD
  1. **If using OpenShift AI Operator with CRD `DataScienceCluster`:** Argo Workflows are possibly already deployed by the OpenShift AI Operator, if the Custom Resource Definition (CRD) like workflows.argoproj.io is already existing, no additional installation is needed.
=======
1. **If using OpenShift AI Operator with CRD `DataScienceCluster`:** Argo Workflows are possibly already deployed by the OpenShift AI Operator, if the CustomResourceDefinition like workflows.argoproj.io is already existing, no additional installation is needed.
>>>>>>> 74a29c11... anr docs update and minor fix during resource cleanup (#1391) (#1397)
Original conflict in internal/controllers/remediation_handler.go
<<<<<<< HEAD
=======

func (h *remediationMgrHelper) createConfigMapFromImage(ctx context.Context, devConfig *amdv1alpha1.DeviceConfig) (ctrl.Result, error) {
	logger := log.FromContext(ctx)
	image := devConfig.Spec.RemediationWorkflow.ConfigMapImage
	if image == "" {
		return ctrl.Result{}, nil
	}

	jobName := devConfig.Name + "-" + ConfigMapImageJobSuffix
	namespace := devConfig.Namespace
	configMapName := devConfig.Name + "-" + DefaultConfigMapSuffix

	// Check if the ConfigMap already exists.
	existingCM, cmErr := h.getConfigMap(ctx, configMapName, namespace)
	if cmErr == nil {
		// ConfigMap exists - check if it was created by the same image.
		if existingCM.Annotations != nil && existingCM.Annotations[ConfigMapImageAnnotationKey] == image {
			return ctrl.Result{}, nil
		}
		// Image changed - delete the stale ConfigMap so the new image can recreate it.
		if err := h.deleteConfigMap(ctx, configMapName, namespace); err != nil {
			return ctrl.Result{}, fmt.Errorf("failed to delete stale configMap for image change: %w", err)
		}
		logger.Info("Deleted stale ConfigMap due to image change", "configMap", configMapName, "newImage", image)
	}

	// [additional code...]
}
>>>>>>> 74a29c11... anr docs update and minor fix during resource cleanup (#1391) (#1397)
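
For reviewers checking the AI-resolved conflict, the visible part of `createConfigMapFromImage` reduces to a three-way decision: do nothing, create the ConfigMap, or delete the stale one so it can be recreated. The following is a minimal, self-contained sketch of that decision only, not the operator's implementation. The names `decideConfigMapAction`, `action`, and the annotation key value are hypothetical, and the "create" outcome is an assumption about the elided `[additional code...]` section, inferred from the function name and comments.

```go
package main

import "fmt"

// configMapImageAnnotationKey is a hypothetical placeholder; the real value of
// ConfigMapImageAnnotationKey is not shown in this PR.
const configMapImageAnnotationKey = "example.com/configmap-image"

// action is a hypothetical type describing what the reconciler would do next.
type action string

const (
	actionSkip     action = "skip"     // nothing to do
	actionCreate   action = "create"   // assumed behavior of the elided code
	actionRecreate action = "recreate" // delete the stale ConfigMap, then create
)

// decideConfigMapAction mirrors the branches visible in the conflict excerpt:
// an empty image returns early, a matching annotation returns early, and a
// mismatched annotation triggers deletion of the stale ConfigMap.
func decideConfigMapAction(image string, exists bool, annotations map[string]string) action {
	if image == "" {
		return actionSkip
	}
	if !exists {
		return actionCreate
	}
	// Indexing a nil map is safe in Go and yields the zero value, so a
	// ConfigMap without annotations is treated as stale here.
	if annotations[configMapImageAnnotationKey] == image {
		return actionSkip
	}
	return actionRecreate
}

func main() {
	current := map[string]string{configMapImageAnnotationKey: "registry.example/cm:v1"}

	fmt.Println(decideConfigMapAction("", true, current))
	fmt.Println(decideConfigMapAction("registry.example/cm:v1", false, nil))
	fmt.Println(decideConfigMapAction("registry.example/cm:v1", true, current))
	fmt.Println(decideConfigMapAction("registry.example/cm:v2", true, current))
}
```

One difference worth noting: the excerpt treats any error from `getConfigMap` as "ConfigMap absent" and falls through, whereas this sketch takes an explicit `exists` flag. Whether non-NotFound errors should be handled separately is a question for the source PR, since the rest of the function is elided here.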

Cherry-pick triggered by: ACP-Automation
