| 1 | +--- |
| 2 | +feature: lockfile-generation |
| 3 | +start-date: 2025-10-09 |
| 4 | +author: dish (@pyrox0) |
| 5 | +co-authors: (find a buddy later to help out with the RFC) |
| 6 | +shepherd-team: (names, to be nominated and accepted by RFC steering committee) |
| 7 | +shepherd-leader: (name to be appointed by RFC steering committee) |
| 8 | +related-issues: (will contain links to implementation PRs) |
| 9 | +--- |
| 10 | + |
| 11 | +# Summary |
| 12 | + |
| 13 | +[summary]: #summary |
| 14 | + |
| 15 | +When adding packages to nixpkgs, or migrating them within nixpkgs, some packages do not contain lockfiles |
| 16 | +that would allow us to lock their dependencies correctly. This is an issue with packages based on the |
| 17 | +NPM, Yarn, PNPM, and Cargo package managers, but is not limited to them. Therefore, we should support |
| 18 | +some mechanism for creating these sorts of lockfiles, **without** bloating the nixpkgs tarball with |
| 19 | +megabytes of lockfiles that are only useful for a handful of packages. |
| 20 | + |
| 21 | +# Motivation |
| 22 | + |
| 23 | +[motivation]: #motivation |
| 24 | + |
| 25 | +I have been migrating many packages from the deprecated `nodePackages` package set into the main nixpkgs |
| 26 | +package set (under `pkgs/by-name`). Many packages, such as [alex](https://github.com/get-alex/alex), |
| 27 | +most (or all) packages by [`sindresorhus`](https://github.com/sindresorhus), and even projects using |
| 28 | +Cargo and Rust such as [`proxmox-backup-client`](https://git.proxmox.com/?p=proxmox-backup.git;a=tree), do not have lockfiles. |
| 29 | +Without a lockfile, their dependencies cannot be pinned. The current solution is to generate these lockfiles, vendor them |
| 30 | +within nixpkgs, and then copy them into the source tree (mostly in the `postPatch` phase) so that the package |
| 31 | +builds reproducibly. |
| 32 | + |
| 33 | +The issue, of course, is that this massively increases the size of the nixpkgs tarball. nixpkgs issues such as |
| 34 | +[#327064](https://github.com/NixOS/nixpkgs/issues/327064) and [#327063](https://github.com/NixOS/nixpkgs/issues/327063) have found |
| 35 | +that the effects are staggering: vendored lockfiles added nearly 6MiB (at the time) to the nixpkgs tarball |
| 36 | +and also contributed significantly to nixpkgs eval time (see [this comment](https://github.com/NixOS/nixpkgs/issues/320528#issuecomment-2227185095)). |
| 37 | + |
| 38 | +However, I believe there are solutions to this problem. The |
| 39 | +[`rustPlatform.fetchCargoVendor`](https://nixos.org/manual/nixpkgs/stable/#vendoring-of-dependencies) |
| 40 | +function allows for using upstream Cargo.lock files without parsing them within |
| 41 | +nixpkgs, which removes much of the eval time penalty that such parsing |
| 42 | +used to cause. This does not solve the issue for packages that do not ship |
| 43 | +their own lockfiles, however: those still require vendoring a lockfile |
| 44 | +within nixpkgs, which contributes to the evaluation time and tarball size |
| 45 | +issues described above. |
| 46 | + |
| 47 | +Note also that the following design DOES NOT rely on [`import-from-derivation`](https://nix.dev/manual/nix/latest/language/import-from-derivation.html), as far as I know. This is a critical point for anything in |
| 48 | +nixpkgs, as we should not require IFD to evaluate nixpkgs or build any packages. |
| 49 | + |
| 50 | +# Detailed design |
| 51 | + |
| 52 | +[design]: #detailed-design |
| 53 | + |
| 54 | +I propose that a new repo, named `NixOS/lockfile-storage` (or something similar), |
| 55 | +be created in the NixOS organization, along with a GitHub App that manages it. This app |
| 56 | +would help streamline the process of generating these lockfiles. |
| 57 | + |
| 58 | +Note: for the rest of this document, I will refer to the above repository as "LFR", for "Lockfile Repository", |
| 59 | +and to the GitHub app/bot that manages it as the "LFR Bot". |
| 60 | + |
| 61 | +## Workflow |
| 62 | +The following diagram illustrates an idealized workflow using this new process. |
| 63 | + |
| 64 | +Note also that I've designed this to be as simple as possible from the end user's (PR author's) perspective. The |
| 65 | +only additional friction they should have from this process is potentially waiting for someone to |
| 66 | +approve an LFR PR, so that the lockfile will be accessible. |
| 67 | + |
| 68 | +Caveats: The diagram assumes that the PR Author is also a committer, and it skips over a potential approval |
| 69 | +step for PRs in the lockfile repo, which would exist to keep the repo safe over time. |
| 70 | + |
| 71 | +Notes: `LFR` is the Lockfile Repo, and `LFR_Bot` is the mentioned GitHub App that would do this |
| 72 | +automatically. |
| 73 | + |
| 74 | +```mermaid |
| 75 | +sequenceDiagram |
| 76 | + participant auth as PR Author |
| 77 | + participant nixpkgs@{ "type": "database" } |
| 78 | + auth->>nixpkgs: Submits PR |
| 79 | + auth->>nixpkgs: Comment mentioning @LFR_Bot |
| 80 | + create participant LFR_Bot@{"type": "queue"} |
| 81 | + nixpkgs-->>LFR_Bot: processes ping |
| 82 | + LFR_Bot->>LFR_Bot: Creates lockfile with tooling |
| 83 | + create participant LFR@{ "type": "database" } |
| 84 | + LFR_Bot->>LFR: Creates PR associated with nixpkgs PR |
| 85 | + note left of LFR: Processes and<br>merges PR |
| 86 | + destroy LFR |
| 87 | + LFR-->>LFR: |
| 88 | + LFR_Bot->>nixpkgs: Adds "LFR:processed" label to PR |
| 89 | + LFR_Bot->>nixpkgs: Comments link to lockfile for fetching |
| 90 | + auth->>nixpkgs: Updates PR with lockfile |
| 91 | + auth->>nixpkgs: Merges PR |
| 92 | +``` |
| 93 | + |
| 94 | +## Lockfile Repo Bot |
| 95 | + |
| 96 | +For ease of contribution and extensibility, this bot should be written in Python, a la [nixpkgs-merge-bot](https://github.com/NixOS/nixpkgs-merge-bot). |
| 97 | + |
| 98 | +When a user submits a PR, they should be able to immediately ping the bot to start the process of creating the lockfile. |
| 99 | +This may involve an approval step, if users are not committers or maintainers, but otherwise should execute automatically. |
| 100 | + |
| 101 | +The bot would be a simple Python script that receives two inputs: a source URL and a package manager type. |
| 102 | +The source URL would be the `git clone`-able URL for a git repo, which is the only currently envisioned |
| 103 | +source type, but again this could be designed to be easily extensible. The package manager type would be |
| 104 | +one of a fixed set of strings, such as "npm", "yarn", "yarn_berry", "cargo", or a similar identifier. Given |
| 105 | +these two inputs, the script would first clone the repository, then execute the |
| 106 | +package manager's command to create a lockfile without actually installing dependencies, such as: |
| 107 | + |
| 108 | +* npm: `npm install --package-lock-only` |
| 109 | +* yarn berry: `yarn install --mode=update-lockfile` |
| 110 | +* cargo: `cargo generate-lockfile` |
| 111 | +* etc... |
| 112 | + |
| 113 | +This allows us to follow a similar pattern to tools such as [Renovate](https://www.mend.io/renovate/) or [Dependabot](https://github.com/dependabot), |
| 114 | +which can update lockfiles but do not need to install all dependencies to do so, which allows them to scale better. |
| 115 | +Further, not installing code allows us to avoid many security issues that come from downloading and running untrusted code. |
| 116 | + |
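The generation step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the bot's actual implementation: the names `LOCKFILE_COMMANDS`, `lockfile_command`, and `generate_lockfile` are hypothetical, and only the three commands listed above are covered.

```python
import subprocess
from pathlib import Path

# Package manager type -> (lockfile-only command, lockfile that command writes).
# The commands are the ones listed in this section; the table name is hypothetical.
LOCKFILE_COMMANDS: dict[str, tuple[list[str], str]] = {
    "npm": (["npm", "install", "--package-lock-only"], "package-lock.json"),
    "yarn_berry": (["yarn", "install", "--mode=update-lockfile"], "yarn.lock"),
    "cargo": (["cargo", "generate-lockfile"], "Cargo.lock"),
}


def lockfile_command(pm_type: str) -> tuple[list[str], str]:
    """Return (argv, lockfile name) for a supported type, else raise ValueError."""
    try:
        return LOCKFILE_COMMANDS[pm_type]
    except KeyError:
        raise ValueError(f"unsupported package manager type: {pm_type!r}") from None


def generate_lockfile(source_url: str, pm_type: str, workdir: Path) -> Path:
    """Clone source_url into workdir/src and create a lockfile without installing."""
    # Validate the type first, so unsupported requests never touch the network.
    argv, lockfile_name = lockfile_command(pm_type)
    checkout = workdir / "src"
    # "--" stops git from interpreting a hostile URL as a command-line option.
    subprocess.run(
        ["git", "clone", "--depth", "1", "--", source_url, str(checkout)],
        check=True,
    )
    subprocess.run(argv, cwd=checkout, check=True)
    return checkout / lockfile_name
```

Keeping the supported types in a single table means that adding an ecosystem is a one-line change, which fits the extensibility goal stated above.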
| 117 | +## Lockfile Repo |
| 118 | + |
| 119 | +This would be a very simple repo: one folder per ecosystem, and within it one folder per package that uses that package manager. That way, there are no "too many files" errors at the top level, and if you know a package's ecosystem and name, you can find its lockfile easily. |
| 120 | + |
| 121 | +Further, updates to a package should automatically update the lockfile and replace it in the |
| 122 | +lockfile repo, NOT add an additional file to the repo. This is to prevent unbounded growth of the |
| 123 | +repo, and to try and keep it reasonable to clone if needed (though again, the workflow is designed to |
| 124 | +hopefully not need that). Finally, when a package is dropped, this should be detected |
| 125 | +automatically and a PR removing it from the LFR should be automatically generated. Old lockfiles, |
| 126 | +however, would still be accessible since they are stored in the data of previous commits. |
| 127 | + |
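As a rough sketch of the layout described in this section, the path of a lockfile could be derived from the ecosystem and package name alone. Everything here is an assumption made for illustration: the helper names `lockfile_path` and `pinned_url` are hypothetical, the repository name is the one proposed earlier and may change, and `package` is assumed to be a name without slashes. Because the path contains no version, an update overwrites the same file, and an older lockfile stays reachable by pinning a commit in GitHub's raw-content URL.

```python
from pathlib import PurePosixPath

# Assumption: the repository name proposed in this RFC; the final name may differ.
LFR_REPO = "NixOS/lockfile-storage"


def lockfile_path(ecosystem: str, package: str, lockfile_name: str) -> PurePosixPath:
    """Return <ecosystem>/<package>/<lockfile_name> inside the LFR."""
    for part in (ecosystem, package, lockfile_name):
        # Reject empty parts and anything that could escape the package folder.
        if not part or "/" in part or part in {".", ".."}:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(ecosystem, package, lockfile_name)


def pinned_url(commit: str, path: PurePosixPath) -> str:
    """Raw-content URL for `path` as of `commit`, unaffected by later updates."""
    return f"https://raw.githubusercontent.com/{LFR_REPO}/{commit}/{path}"
```

For instance, `lockfile_path("npm", "alex", "package-lock.json")` yields `npm/alex/package-lock.json`, and passing that path to `pinned_url` together with a commit hash gives a URL whose content does not change when the lockfile is later replaced.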
| 128 | + |
| 129 | +# Examples and Interactions |
| 130 | + |
| 131 | +[examples-and-interactions]: #examples-and-interactions |
| 132 | + |
| 133 | +From a user perspective, I want this to be as smooth as possible. So, after submitting a PR, you would add a comment with the following structure: |
| 134 | + |
| 135 | +``` |
| 136 | +@NixOS/LFR_Bot [repo] [type] |
| 137 | +``` |
| 138 | + |
| 139 | +For example: |
| 140 | + |
| 141 | +``` |
| 142 | +@NixOS/LFR_Bot https://github.com/NixOS/nixpkgs npm |
| 143 | +``` |
| 144 | + |
| 145 | +which would `git clone` the nixpkgs repo and attempt to create an npm `package-lock.json` file. |
| 146 | + |
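Parsing the comment format shown above might look like the following sketch. The regular expression, the `LockfileRequest` type, and the restriction to `https://` URLs are assumptions made for this example, not part of the proposal; the set of types mirrors the strings mentioned in the bot section.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Types named in this RFC; a real bot would likely load these from configuration.
SUPPORTED_TYPES = {"npm", "yarn", "yarn_berry", "cargo"}

# One command per line: "@NixOS/LFR_Bot <repo-url> <type>".
# [ \t] is used instead of \s so that a command cannot span several lines.
COMMAND_RE = re.compile(
    r"^@NixOS/LFR_Bot[ \t]+(?P<repo>https://\S+)[ \t]+(?P<type>[a-z_]+)[ \t]*\r?$",
    re.MULTILINE,
)


@dataclass(frozen=True)
class LockfileRequest:
    repo: str
    pm_type: str


def parse_comment(body: str) -> Optional[LockfileRequest]:
    """Return the first well-formed request in a comment body, or None."""
    match = COMMAND_RE.search(body)
    if match is None or match["type"] not in SUPPORTED_TYPES:
        return None
    return LockfileRequest(repo=match["repo"], pm_type=match["type"])
```

With this sketch, the example comment above parses to a request for `https://github.com/NixOS/nixpkgs` with type `npm`, while ordinary comments and unsupported types yield `None`.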
| 147 | +These commands should be restricted so that only the original author of the PR or committers can run |
| 148 | +this process, avoiding some potential issues. Further, non-maintainers must wait for an approval |
| 149 | +step on the workflow, possibly with a committer having to comment `@NixOS/LFR_Bot approve` or some |
| 150 | +similar message. |
| 151 | + |
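The permission rules in the previous paragraph can be written down as a small decision function. This is a sketch of one possible reading, not a settled policy: the `Decision` enum and `decide` function are hypothetical, and how the bot would learn whether someone is a committer or maintainer is left open here.

```python
from enum import Enum


class Decision(Enum):
    RUN = "run"
    NEEDS_APPROVAL = "needs_approval"
    REJECT = "reject"


def decide(
    commenter: str,
    pr_author: str,
    is_committer: bool,
    is_maintainer: bool,
    approved_by_committer: bool = False,
) -> Decision:
    """Decide what to do with a lockfile request from `commenter`."""
    # Only the PR author or a committer may trigger generation at all.
    if commenter != pr_author and not is_committer:
        return Decision.REJECT
    # Committers and maintainers run immediately; a PR author who is neither
    # runs only once a committer has approved the request.
    if is_committer or is_maintainer or approved_by_committer:
        return Decision.RUN
    return Decision.NEEDS_APPROVAL
```

Under this reading, a first-time contributor who pings the bot on their own PR gets `NEEDS_APPROVAL` until a committer approves, and a request from an unrelated account is rejected outright.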
| 152 | +# Drawbacks |
| 153 | + |
| 154 | +[drawbacks]: #drawbacks |
| 155 | + |
| 156 | +If the lockfile storage repository is deleted or purged in any way, then anything relying on this |
| 157 | +repository is broken, unless the lockfile has already been fetched locally and stored as a fixed-output derivation (FOD). We |
| 158 | +could approach this problem by having some special fetcher that can use alternative sources, such as |
| 159 | +a CDN or other server (for users in China who may have issues accessing GitHub, for instance). |
| 160 | + |
| 161 | +Further, this is another potential way for someone to sneak untrusted code into nixpkgs, but: |
| 162 | + |
| 163 | +* The lockfiles are generated on nixpkgs-related infrastructure (either self-hosted or GHA), so |
| 164 | +there is no risk of attackers patching lockfiles for malicious purposes. |
| 165 | +* Non-maintainers should go through an initial approval before having this run for their PRs. |
| 166 | +* The lockfiles should also only be generated for supported ecosystems, and if lockfiles already exist, |
| 167 | +they should be used instead of relying on this tool. |
| 168 | + |
| 169 | +# Alternatives |
| 170 | + |
| 171 | +[alternatives]: #alternatives |
| 172 | + |
| 173 | +* Not packaging packages with missing lockfiles |
| 174 | + * A very valid idea! However, there are enough of these sorts of packages, and lockfiles are already such an issue |
| 175 | + for nixpkgs, that we should solve this so that we can move more lockfiles out of tree. |
| 176 | +* Storing lockfiles in-tree |
| 177 | + * See the issues above regarding massively increased tarball size. I aim to prevent that with this RFC, so doing that |
| 178 | + would be antithetical to it. |
| 179 | + |
| 180 | +# Prior art |
| 181 | + |
| 182 | +[prior-art]: #prior-art |
| 183 | + |
| 184 | +Currently, we store lockfiles in-tree. This is bad, for all the reasons mentioned above, and so that is the only prior art I've considered while writing this. |
| 185 | + |
| 186 | +However, you could consider projects like `node2nix`, `cargo2nix`, and others to be prior art for this. |
| 187 | +These involve translating the lockfile and storing that representation in the nixpkgs repo, |
| 188 | +which still contributes to the size issue mentioned. Further, this approach has a larger impact on |
| 189 | +evaluation time, as some of these tools make each dependency its own separate derivation, |
| 190 | +which means a lot of derivations could be instantiated and fetched. That in turn has an |
| 191 | +outsized (though relatively minor) disk size impact due to the ratio of metadata to derivation size. |
| 192 | + |
| 193 | +# Unresolved questions |
| 194 | + |
| 195 | +[unresolved]: #unresolved-questions |
| 196 | + |
| 197 | +Lots of it! I don't know any particular questions that remain unanswered at the moment, but I would be glad to answer those if they come up. |
| 198 | + |
| 199 | +# Future work |
| 200 | + |
| 201 | +[future]: #future-work |
| 202 | + |
| 203 | +This may create additional maintenance burdens in the future, with an additional repo being part of the nixpkgs process in some cases, but I have tried to design this with an eye for maintainability over time. |