
Commit 4212116

[RFC 0191] Lockfile Generation
1 parent c655bda commit 4212116

1 file changed

+203
-0
lines changed

rfcs/0191-lockfile-generation.md

Lines changed: 203 additions & 0 deletions
@@ -0,0 +1,203 @@
---
feature: lockfile-generation
start-date: 2025-10-09
author: dish (@pyrox0)
co-authors: (find a buddy later to help out with the RFC)
shepherd-team: (names, to be nominated and accepted by RFC steering committee)
shepherd-leader: (name to be appointed by RFC steering committee)
related-issues: (will contain links to implementation PRs)
---

# Summary

[summary]: #summary

Some packages that are added to nixpkgs, or migrated within nixpkgs, do not ship lockfiles
that would allow us to lock their dependencies correctly. This is an issue with packages based on the
npm, Yarn, pnpm, and Cargo package managers, but is not limited to them. Therefore, we should support
some mechanism for creating these sorts of lockfiles, **without** bloating the nixpkgs tarball with
megabytes of lockfiles that are only useful for a handful of packages.

# Motivation

[motivation]: #motivation

I have been migrating many packages from the deprecated `nodePackages` package set into the main nixpkgs
package set (under `pkgs/by-name`). Many packages, such as [alex](https://github.com/get-alex/alex),
most (or all) packages by [`sindresorhus`](https://github.com/sindresorhus), and even projects using
Cargo and Rust such as [`proxmox-backup-client`](https://git.proxmox.com/?p=proxmox-backup.git;a=tree), do not have lockfiles.
Currently, the solution is to generate these lockfiles, vendor them within nixpkgs, and then copy
them into the source tree (mostly in the `postPatch` phase), which keeps the package
reproducible.

The issue, of course, is that this massively increases the size of the nixpkgs tarball. nixpkgs issues such as
[#327064](https://github.com/NixOS/nixpkgs/issues/327064) and [#327063](https://github.com/NixOS/nixpkgs/issues/327063) have found
that the effects are staggering: vendored lockfiles added nearly 6 MiB (at the time) to the nixpkgs tarball
and also contributed significantly to nixpkgs eval time (see [this comment](https://github.com/NixOS/nixpkgs/issues/320528#issuecomment-2227185095)).

However, I believe there are solutions to this problem. The
[`rustPlatform.fetchCargoVendor`](https://nixos.org/manual/nixpkgs/stable/#vendoring-of-dependencies)
function allows for using upstream `Cargo.lock` files without parsing them within
nixpkgs, which removes much of the eval time penalty. This does not solve the
issue for packages that do not ship their own lockfiles, however. Those still
require vendoring a lockfile within nixpkgs, which contributes to the above
evaluation time and tarball size issues.

Note also that the following design DOES NOT rely on [`import-from-derivation`](https://nix.dev/manual/nix/latest/language/import-from-derivation.html) (IFD), as far as I know. This is a critical point for anything in
nixpkgs, as we should not require IFD to evaluate nixpkgs or build any packages.

# Detailed design

[design]: #detailed-design

I propose that a new repo, named `NixOS/lockfile-storage` (or something similar),
be created in the NixOS organization, along with a GitHub App for this purpose. This app
would help streamline the process of generating these lockfiles.

Note: for the rest of this document, I will refer to the above repository as "LFR", for "Lockfile Repository",
and to the GitHub app/bot that manages it as the "LFR Bot".

## Workflow

The following diagram illustrates an idealized workflow using this new process.

Note also that I've designed this to be as simple as possible from the perspective of the end user
(the PR author). The only additional friction they should have from this process is potentially
waiting for someone to approve an LFR PR, so that the lockfile will be accessible.

Caveats: the diagram assumes that the PR author is also a committer, and it skips over a potential
approval step for PRs in the lockfile repo, a step that would exist to keep the repo safe over time.

Notes: `LFR` is the Lockfile Repo, and `LFR_Bot` is the mentioned GitHub App that would do this
automatically.

```mermaid
sequenceDiagram
    participant auth as PR Author
    participant nixpkgs@{ "type": "database" }
    auth->>nixpkgs: Submits PR
    auth->>nixpkgs: Comment mentioning @LFR_Bot
    create participant LFR_Bot@{"type": "queue"}
    nixpkgs-->>LFR_Bot: processes ping
    LFR_Bot->>LFR_Bot: Creates lockfile with tooling
    create participant LFR@{ "type": "database" }
    LFR_Bot->>LFR: Creates PR associated with nixpkgs PR
    note left of LFR: Processes and<br>merges PR
    destroy LFR
    LFR-->>LFR:
    LFR_Bot->>nixpkgs: Adds "LFR:processed" label to PR
    LFR_Bot->>nixpkgs: Comments link to lockfile for fetching
    auth->>nixpkgs: Updates PR with lockfile
    auth->>nixpkgs: Merges PR
```

## Lockfile Repo Bot

For ease of contribution and extensibility, this bot should be written in Python, a la [nixpkgs-merge-bot](https://github.com/NixOS/nixpkgs-merge-bot).

When a user submits a PR, they should be able to immediately ping the bot to start the process of creating the lockfile.
This may involve an approval step if the user is not a committer or maintainer, but otherwise it should execute automatically.

The bot would be a simple Python script that receives two things: a source URL and a package manager type.
The source URL would be the `git clone`-able URL for a git repo, which is the only currently envisioned
source type, but again this could be designed to be easily extensible. The package manager type would be
one of a set of strings, such as "npm", "yarn", "yarn_berry", "cargo", or another similar identifier. These
would be the two inputs to the script, which would first clone the repository, then execute the
package manager's command to create a lockfile without actually installing dependencies, such as:

* npm: `npm install --package-lock-only`
* yarn berry: `yarn install --mode=update-lockfile`
* cargo: `cargo generate-lockfile`
* etc...

This allows us to follow a similar pattern to tools such as [Renovate](https://www.mend.io/renovate/) or [Dependabot](https://github.com/dependabot),
which can update lockfiles without installing all dependencies, which allows them to scale better.
Further, not installing code allows us to avoid many security issues that come from downloading and running untrusted code.
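
The script described in this section could be sketched roughly as follows. This is a minimal illustration, not a proposed implementation: the names `LOCKFILE_COMMANDS`, `lockfile_command`, and `generate_lockfile` are hypothetical, the shallow clone is an assumption, and only the three commands listed above are included.

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical table: package manager type -> (command, lockfile it writes).
# The commands are the ones listed in this RFC; none of them installs dependencies.
LOCKFILE_COMMANDS = {
    "npm": (["npm", "install", "--package-lock-only"], "package-lock.json"),
    "yarn_berry": (["yarn", "install", "--mode=update-lockfile"], "yarn.lock"),
    "cargo": (["cargo", "generate-lockfile"], "Cargo.lock"),
}


def lockfile_command(manager: str) -> tuple[list[str], str]:
    """Return (argv, lockfile name) for a supported package manager type."""
    try:
        return LOCKFILE_COMMANDS[manager]
    except KeyError:
        raise ValueError(f"unsupported package manager type: {manager!r}") from None


def generate_lockfile(source_url: str, manager: str) -> str:
    """Clone source_url, run the lockfile command, and return the lockfile text."""
    argv, lockfile_name = lockfile_command(manager)  # reject bad types before cloning
    with tempfile.TemporaryDirectory() as tmp:
        checkout = Path(tmp) / "src"
        # "--" keeps git from interpreting a hostile URL as a command-line option.
        subprocess.run(
            ["git", "clone", "--depth=1", "--", source_url, str(checkout)],
            check=True,
        )
        subprocess.run(argv, cwd=checkout, check=True)
        return (checkout / lockfile_name).read_text()
```

Passing argument lists instead of shell strings means the user-supplied URL never reaches a shell, which fits the goal of not running untrusted input.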

## Lockfile Repo

This would be a very simple repo: a folder for lockfiles from each ecosystem, and within it a folder for each package that uses that package manager. That way, the top level does not run into "too many files" errors, and if you know where a package is, you can find it easily.

Further, updates to a package should automatically update the lockfile and replace it in the
lockfile repo, NOT add an additional file to the repo. This is to prevent unbounded growth of the
repo, and to try to keep it reasonable to clone if needed (though again, the workflow is designed to
hopefully not need that). Finally, when a package is dropped, this should be detected
automatically, and a PR removing it from the LFR should be generated automatically. Old lockfiles,
however, would still be accessible, since they are stored in the data of previous commits.
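
To make the layout concrete, here is a small sketch of how a lockfile path and the set of dropped packages could be computed. The `<ecosystem>/<package>/<lockfile>` shape follows the description above, but the exact file names and the helper names `lockfile_path` and `stale_packages` are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumed lockfile name inside each ecosystem folder; the RFC does not fix these.
LOCKFILE_NAMES = {
    "npm": "package-lock.json",
    "yarn": "yarn.lock",
    "cargo": "Cargo.lock",
}


def lockfile_path(ecosystem: str, package: str) -> PurePosixPath:
    """Return the single path a package's lockfile lives at in the repo."""
    if ecosystem not in LOCKFILE_NAMES:
        raise ValueError(f"unsupported ecosystem: {ecosystem!r}")
    if not package or "/" in package or package in {".", ".."}:
        raise ValueError(f"invalid package name: {package!r}")
    return PurePosixPath(ecosystem) / package / LOCKFILE_NAMES[ecosystem]


def stale_packages(stored: set[str], in_nixpkgs: set[str]) -> list[str]:
    """Packages that still have a stored lockfile but were dropped from nixpkgs."""
    return sorted(stored - in_nixpkgs)
```

Because the path depends only on the ecosystem and the package name, an update overwrites the existing file rather than adding a new one, and the previous version remains reachable through git history.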

# Examples and Interactions

[examples-and-interactions]: #examples-and-interactions

From a user perspective, I want this to be as smooth as possible. So, after submitting a PR, you would add a comment with the following structure:

```
@NixOS/LFR_Bot [repo] [type]
```

For example:

```
@NixOS/LFR_Bot https://github.com/NixOS/nixpkgs npm
```

which would `git clone` the nixpkgs repo and attempt to create an npm `package-lock.json` file.

These commands should be sanitized, and restricted so that only the original author of the PR or
committers can run this process, avoiding some potential issues. Further, non-maintainers must wait
for an approval step on the workflow, possibly with a committer having to comment
`@NixOS/LFR_Bot approve` or some similar message.
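
A sketch of how the bot might parse and gate these comments is shown below. Everything here is an assumption for illustration: the `Generate` and `Approve` types, the `parse_comment` and `may_run` helpers, the strict URL pattern, and the choice to only look at the first line of the comment.

```python
import re
from dataclasses import dataclass
from typing import Optional, Union

BOT_MENTION = "@NixOS/LFR_Bot"
SUPPORTED_TYPES = {"npm", "yarn", "yarn_berry", "cargo"}
# Deliberately strict: https only, no whitespace, quotes, or shell metacharacters.
URL_PATTERN = re.compile(r"https://[A-Za-z0-9._~/:?=;&%+-]+")


@dataclass(frozen=True)
class Generate:
    repo: str
    manager: str


@dataclass(frozen=True)
class Approve:
    pass


def parse_comment(body: str) -> Optional[Union[Generate, Approve]]:
    """Parse the first line of a PR comment; return None if it is not a valid command."""
    lines = body.strip().splitlines()
    if not lines:
        return None
    words = lines[0].split()
    if not words or words[0] != BOT_MENTION:
        return None
    args = words[1:]
    if args == ["approve"]:
        return Approve()
    if len(args) == 2 and URL_PATTERN.fullmatch(args[0]) and args[1] in SUPPORTED_TYPES:
        return Generate(repo=args[0], manager=args[1])
    return None


def may_run(commenter: str, pr_author: str, committers: set[str]) -> bool:
    """Only the PR author or a committer may trigger the bot."""
    return commenter == pr_author or commenter in committers
```

Anything that does not match the expected shape exactly is ignored rather than partially interpreted, which keeps malformed or hostile comments from reaching the clone step.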

# Drawbacks

[drawbacks]: #drawbacks

If the lockfile storage repository is deleted or purged in any way, then anything relying on this
repository is broken, unless the lockfile has already been fetched locally and stored as a
fixed-output derivation (FOD). We could approach this problem by having some special fetcher that
can use alternative sources, such as a CDN or other server (for users in China who may have issues
accessing GitHub, for instance).

Further, this is another potential way for someone to sneak untrusted code into nixpkgs, but:

* The lockfiles are generated on nixpkgs-related infrastructure (either self-hosted or GitHub Actions), so
  there is not a risk of attackers patching lockfiles for malicious purposes.
* Non-maintainers should go through an initial approval before having this run for their PRs.
* The lockfiles should also only be generated for supported ecosystems, and if lockfiles already exist,
  they should be used instead of relying on this tool.
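
The "special fetcher that can use alternative sources" mentioned earlier in this section would most likely be written in Nix, but its logic can be sketched in Python: try each source in order and accept only content whose hash matches the pinned one. The function name `fetch_lockfile` and its parameters are hypothetical.

```python
import hashlib
import urllib.request
from typing import Callable, Iterable


def _download(url: str) -> bytes:
    """Default downloader; network errors surface as OSError subclasses."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def fetch_lockfile(
    urls: Iterable[str],
    expected_sha256: str,
    download: Callable[[str], bytes] = _download,
) -> bytes:
    """Return the first download whose SHA-256 hex digest matches expected_sha256."""
    errors = []
    for url in urls:
        try:
            data = download(url)
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            errors.append(f"{url}: {exc}")
            continue
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return data
        errors.append(f"{url}: hash mismatch")
    raise RuntimeError("no source provided the expected lockfile:\n" + "\n".join(errors))
```

Since the hash is pinned by the consumer, a mirror does not need to be trusted: a wrong or tampered file is rejected and the next source is tried.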

# Alternatives

[alternatives]: #alternatives

* Not packaging packages with missing lockfiles
  * A very valid idea! However, there are enough of these sorts of packages, and lockfiles are already such an issue
    for nixpkgs, that we should solve this so that we can move more lockfiles out of tree.
* Storing lockfiles in-tree
  * See the issues above regarding massively increased tarball size. I aim to prevent that with this proposal,
    so doing that would be antithetical to it.

# Prior art

[prior-art]: #prior-art

Currently, we store lockfiles in-tree. This is bad, for all the reasons mentioned above, and so that is the only prior art I've considered while writing this.

However, you could consider projects like `node2nix`, `cargo2nix`, and others to be prior art for this.
They involve translating the lockfile and storing that representation in the nixpkgs repo,
which still contributes to the size issue mentioned. Further, this also has a larger impact on
evaluation time, as some of these tools model each dependency as its own separate derivation,
which means a lot of derivations could be instantiated and fetched. That poses an
outsized (though relatively minor) disk size impact due to the ratio of metadata to derivation size.

# Unresolved questions

[unresolved]: #unresolved-questions

Probably lots! I am not aware of any particular questions that remain unanswered at the moment, but I would be glad to answer any that come up.

# Future work

[future]: #future-work

This may create additional maintenance burdens in the future, with an additional repo being part of the nixpkgs process in some cases, but I have tried to design this with an eye for maintainability over time.
