Skip to content

Conversation

@ayushsatyam146
Copy link
Member

@ayushsatyam146 ayushsatyam146 commented Sep 16, 2024

Changes

hack/storage-version-migration.sh script is mentioned in the install steps. If that script is executed it runs infinitely without reporting any success or failure. This PR adds changes to put checks in hack/storage-version-migration.sh script to give success or failure response and terminate the execution in finite time.

Fixes #1680

Submitter Checklist

  • Includes tests if functionality changed/was added
  • Includes docs if changes are user-facing
  • Set a kind label on this PR
  • Release notes block has been filled in, or marked NONE

See the contributor guide
for details on coding conventions, github and prow interactions, and the code review process.

Release Notes

NONE

@pull-request-size pull-request-size bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 16, 2024
@openshift-ci openshift-ci bot added the release-note-none Label for when a PR does not need a release note label Sep 16, 2024
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 16, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign adambkaplan for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Member

@adambkaplan adambkaplan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR needs a bit of re-work. The script as is will always complete and return false positives.

Comment on lines +75 to +77
if [[ "${jobStatus}" == *"Complete"* ]]; then
echo "Job ${JOB_NAME} has completed successfully!"
exit 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not sufficient. condition.type tells you if the status condition is reporting the Complete or Failed . The Complete and Failed condition types should always be populated in the Conditions array.

You need to inspect the condition.status field, which can be True, False, or Unknown.

Comment on lines -70 to -73
while [ "$(kubectl -n shipwright-build get job "${JOB_NAME}" -o json | jq -r '.status.completionTime // ""')" == "" ]; do
echo "[INFO] Storage version migraton job is still running"
sleep 10
done
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably where the bug is. Per the Job API documentation:

completionTime(Time):

Represents time when the job was completed. It is not guaranteed to be set in happens-before order across separate operations. It is represented in RFC3339 form and is in UTC. The completion time is set when the job finishes successfully, and only then. The value cannot be updated or removed. The value indicates the same or later point in time as the startTime field.

I suspect that in your case, the job is failing. This script doesn't see a completion time because the Job never succeeds, hence it just runs forever.

See https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/job-v1/#JobStatus

@adambkaplan adambkaplan added this to the Backlog milestone Aug 26, 2025
@adambkaplan
Copy link
Member

/hold

This PR needs to be rebased, and has outstanding comments that have not been addressed.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. release-note-none Label for when a PR does not need a release note size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[BUG] storage version migration script runs infinitely

2 participants