Description
Currently, Git cannot automatically merge maven_install.json when two separate changes have updated it, because the file contains top-level hashes of both its inputs and its contents, and those hashes are invalid after any automated merge strategy is applied. In repos with high enough velocity, this is undesirable. Consider a PR that updates maven_install.json and has a long CI run time: by the time CI has completed, another PR that updated the file may already have been merged. If the two PRs touched non-overlapping parts of maven_install.json, Git should ideally be able to merge them both.
Hashing the contents is a good safeguard against users manually editing the file, and hashing the inputs is useful for detecting when the file is out of date, but a single top-level hash is brittle w/r/t merges. Addressing these two use cases, we propose the following changes to the structure of maven_install.json:
- Hashing contents: Each artifact in maven_install.json would contain a hash of itself and its direct dependencies (à la a Merkle tree). In theory, if two maven_install.json files are merged where the changes did not touch the same tree, there should be no conflicts. If they did touch the same tree (e.g. because a common dependency was updated in a conflicting way), a merge conflict will still be introduced and a repin will be required.
- Hashing inputs: Less sure about this, but my idea is we could add a boolean to each artifact marking whether it was directly requested by maven_install() or not. If the set of requested inputs from maven_install() matches the requested artifacts from maven_install.json, then we should be confident that a repin isn't required.
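
To make the two ideas above concrete, here is a rough Python sketch. It is not the actual maven_install.json schema or rules_jvm_external code: the field names ("version", "deps", "requested", "hash"), the function names, and the com.example coordinates are all made up for illustration, and it assumes the dependency graph is acyclic.

```python
import hashlib
import json


def artifact_hash(coord, artifacts, cache=None):
    """Merkle-style hash of one artifact: its own fields plus the hashes of its direct deps.

    `artifacts` maps a coordinate to a dict with hypothetical keys such as
    "version", "deps" and "requested". Assumes there are no dependency cycles.
    """
    if cache is None:
        cache = {}
    if coord in cache:
        return cache[coord]
    entry = artifacts[coord]
    # Exclude the dep list (covered via dep hashes) and any stored hash field.
    own = {k: v for k, v in entry.items() if k not in ("deps", "hash")}
    dep_hashes = sorted(artifact_hash(d, artifacts, cache) for d in entry.get("deps", []))
    payload = json.dumps({"coord": coord, "own": own, "deps": dep_hashes}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    cache[coord] = digest
    return digest


def needs_repin(requested, artifacts):
    """True if the coordinates passed to maven_install() differ from those marked as requested."""
    pinned = {c for c, e in artifacts.items() if e.get("requested", False)}
    return set(requested) != pinned


# Two independent trees: a -> c and b -> d (hypothetical coordinates).
artifacts = {
    "com.example:a": {"version": "1.0", "deps": ["com.example:c"], "requested": True},
    "com.example:b": {"version": "1.0", "deps": ["com.example:d"], "requested": True},
    "com.example:c": {"version": "1.0", "deps": []},
    "com.example:d": {"version": "1.0", "deps": []},
}

for coord in sorted(artifacts):
    print(coord, artifact_hash(coord, artifacts)[:12])
print("repin needed:", needs_repin(["com.example:a", "com.example:b"], artifacts))
```

With per-artifact hashes like these, bumping com.example:d would change only the hashes of d and b, so a concurrent change under com.example:a would touch different lines of the file and Git could merge both.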
There is probably more that needs to be fleshed out here. I will try to put together a PoC and see how it works in our repo, which has roughly 400 dependencies pinned by our maven_install.