maven_install.json should be mergeable by Git when two non-overlapping dependency subgraphs are updated #758

@thirtyseven

Currently, Git cannot automatically merge maven_install.json when two separate changes have updated it, because the file contains top-level hashes of both its inputs and its contents, and those hashes are invalid after any automated merge strategy is applied. In high-velocity repos this is undesirable. Consider a PR that updates maven_install.json and has a long CI run: by the time CI completes, another PR that also updated the file may already have been merged. If the two PRs touched non-overlapping parts of maven_install.json, Git should ideally be able to merge them both.
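To make the failure mode concrete, here is a minimal Python sketch. It does not reproduce the actual hashing scheme used for maven_install.json; `content_hash` and the `com.example` coordinates are hypothetical stand-ins for "a single hash over the whole file". It only illustrates that the hash of the merged contents matches neither PR's hash, so whichever hash line survives the merge is wrong.

```python
import hashlib
import json


def content_hash(artifacts: dict) -> str:
    """Hypothetical file-level hash: SHA-256 over a canonical JSON dump."""
    canonical = json.dumps(artifacts, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Hypothetical pinned versions; not taken from any real lock file.
base = {"com.example:a": "1.0", "com.example:b": "1.0"}

# Two PRs that update non-overlapping artifacts.
pr_one = {**base, "com.example:a": "1.1"}
pr_two = {**base, "com.example:b": "1.1"}

# What a clean three-way merge of the artifact entries would produce.
merged = {**base, "com.example:a": "1.1", "com.example:b": "1.1"}

# Both PRs rewrite the same top-level hash line with different values,
# and neither value is valid for the merged contents.
stale_after_merge = content_hash(merged) not in {
    content_hash(pr_one),
    content_hash(pr_two),
}
print("top-level hash must be recomputed after merge:", stale_after_merge)
```

Because both PRs rewrite the same hash line, Git also reports a textual conflict on that line even though the artifact entries themselves merge cleanly.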

Hashing the contents is a good safeguard against users manually editing the file, and hashing the inputs is useful for detecting when the file is out of date, but a single top-level hash is brittle with respect to merges. To address both use cases, we propose the following changes to the structure of maven_install.json:

  1. Hashing contents: Each artifact in maven_install.json would contain a hash of itself and its direct dependencies (à la a Merkle tree). In theory, if two maven_install.json files are merged and their changes did not touch the same tree, there should be no conflicts. If they did touch the same tree (e.g. because a common dependency was updated in a conflicting way), a merge conflict will still be introduced and a repin will be required.
  2. Hashing inputs: Less sure about this, but my idea is that we could add a boolean to each artifact marking whether or not it was directly requested by maven_install(). If the set of inputs requested by maven_install() matches the set of artifacts marked as requested in maven_install.json, then we can be confident that a repin isn't required.
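The two ideas above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the proposed file format: the entry fields `version`, `deps`, and `requested`, the helper names `artifact_hash`, `all_hashes`, and `needs_repin`, and the `com.example` coordinates are all hypothetical. It also assumes the resolved dependency graph is acyclic.

```python
import copy
import hashlib


def artifact_hash(coord, artifacts, cache=None):
    """Merkle-style hash: an artifact's own data plus its direct deps' hashes."""
    if cache is None:
        cache = {}
    if coord not in cache:
        entry = artifacts[coord]
        digest = hashlib.sha256()
        digest.update(f"{coord}\n{entry['version']}\n".encode("utf-8"))
        # Sort deps so the hash does not depend on listing order.
        for dep in sorted(entry["deps"]):
            digest.update(artifact_hash(dep, artifacts, cache).encode("utf-8"))
        cache[coord] = digest.hexdigest()
    return cache[coord]


def all_hashes(artifacts):
    """Compute the per-artifact hash for every entry, sharing one cache."""
    cache = {}
    return {coord: artifact_hash(coord, artifacts, cache) for coord in artifacts}


def needs_repin(requested_inputs, artifacts):
    """True if the requested inputs differ from the entries marked requested."""
    pinned = {coord for coord, entry in artifacts.items() if entry["requested"]}
    return set(requested_inputs) != pinned


# Hypothetical pinned graph: app-lib depends on common; other-lib stands alone.
artifacts = {
    "com.example:app-lib": {"version": "1.0", "deps": ["com.example:common"], "requested": True},
    "com.example:other-lib": {"version": "2.0", "deps": [], "requested": True},
    "com.example:common": {"version": "1.0", "deps": [], "requested": False},
}
before = all_hashes(artifacts)

# Updating other-lib leaves the hashes in the app-lib tree untouched.
bumped_other = copy.deepcopy(artifacts)
bumped_other["com.example:other-lib"]["version"] = "2.1"
after_other = all_hashes(bumped_other)

# Updating the shared dependency changes every hash above it in the tree.
bumped_common = copy.deepcopy(artifacts)
bumped_common["com.example:common"]["version"] = "1.1"
after_common = all_hashes(bumped_common)

changed_by_other = {c for c in before if before[c] != after_other[c]}
changed_by_common = {c for c in before if before[c] != after_common[c]}
print(sorted(changed_by_other))
print(sorted(changed_by_common))
```

In this sketch a change only alters the hashes of the artifact itself and of the artifacts that depend on it, so two PRs touching disjoint trees edit disjoint lines. `needs_repin` compares only the requested coordinates; whether requested versions, exclusions, or repositories also need to be compared is left open here, as in the proposal.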

There is probably more that needs to be fleshed out here. I will try to put together a PoC and see how it works in our repo, which has about 400 dependencies pinned by our maven_install.
