Skip to content

Conversation

@Prucek
Copy link
Member

@Prucek Prucek commented Oct 30, 2025

Fixes: #474

This PR adds a fallback to merging an individual PR, when a batch fails to merge.
Before, when a Batch failed to merge, it didn't take into account that a failure could occur, the action in takeAction was always MergeBatch regardless of the error. This sometimes led to /hold-ing one of the PRs from the batch, and then it would merge.
Now, it will take one of the PRs and it will try to merge it.

🤖 Assisted by Claude.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Prucek
Once this PR has been reviewed and has the lgtm label, please assign cjwagner for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the area/tide Issues or PRs related to prow's tide component label Oct 30, 2025
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 30, 2025
@netlify
Copy link

netlify bot commented Oct 30, 2025

Deploy Preview for k8s-prow ready!

Name Link
🔨 Latest commit ed28fed
🔍 Latest deploy log https://app.netlify.com/projects/k8s-prow/deploys/690493d2ac9cf80008d664e2
😎 Deploy Preview https://deploy-preview-538--k8s-prow.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Oct 30, 2025
@petr-muller
Copy link
Contributor

/cc

Comment on lines 291 to 292
failed = append(failed, pr.Number)
errs = append(errs, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks very suspicious to me. This condition handles an actual failure to merge something that should be mergeable in the first place, otherwise we would not be here (btw, this is the place where Tide hits the annoying errors where it attempts to merge something blocked by incompatible branch protection settings, like #269). That's why the (now removed) "these are user errors" comment was here. Maybe we need to treat this case differently when we are trying to merge a batch, but for non-batch merges I don't see why this behavior would need to change.

I'll need to doublecheck.
/hold

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the conditions to only if len(merged) > 0

enableScheduling: true,
},
{
name: "batch merge fails, falls back to individual PR merge (issue #474)",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this what the "batch merge errors but continues if a PR is unmergeable" test on L1899 is supposed to test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a situation where something was merged (merged: 2), but the bug I am trying to solve is that nothing got merged

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 30, 2025
Comment on lines +1442 to +1443
// If the entire batch failed to merge (e.g., due to conflicts), fall through
// to attempt merging individual PRs to break the tie.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#474 is really weird to me and I'm really not sure if the change proposed in this PR actually helps with anything.

Have you managed to reproduce or at least really understand what was actually happening in #474 (Tide logs would really help with that, it's a pity we haven't captured them when the problem was happening).

My thinking is: if we are here, then we have at least one successful batch job of a presubmit that is required for merge. That implies that all pull requests involved must be git-mergeable (not necessarilty GH-mergeable) without conflicts, otherwise the batch job would fail right at the beginning, when clonerefs merges all involved pulls into the base ref. If that happened, we would not have a successful batch job and we would not get here to try merge PRs based on that fact.

Additionally, if at least one PR from the batch is GH-mergeable (and therefore, if the fallback single merge on L1448 has any chance of succeeding) , it would merge already. The PRs are attempted to be merged sequentially and the code tracks keepTrying for merge failures that should only affect that single PR. It only returns "stop trying" on "we do not have permissions to merge" errors and "merge commits are forbidden" errors. Otherwise it continues and tries to merge other PRs. If none of the PRs merged it means that all of them failed to merge individually (and fallback individual merge has no chance of merging) or Tide hit one of the hard errors (and fallback merge has no chance of merging).

This is all really strange to me, because I remember that we actually saw #474 when it was happening. I just do not understand what is the actual failure case from which the user can recover through individual merges.

I think the idea that if we once have a passing batch job that fails to merge all its PRs then Tide gets stuck on this and does not let any other PR to merge (because it will always prefer to act on the passing Batch job it sees) is sound, I just do not understand the conditions that lead to that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/tide Issues or PRs related to prow's tide component cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 pr's are not compatible in merge pool

3 participants