Skip to content

Conversation

@brianhelba
Copy link
Contributor

The node_modules installation should be as fast and deterministic as possible, as it's taking place internally within a Python package build. Routine developer-facing information should be omitted, as this is expected to be handled as part of the direct Javascript development process.

See:
https://docs.npmjs.com/cli/v7/commands/npm-install#configuration https://yarnpkg.com/cli/install#options


npm_cmd = normalize_cmd(npm)

install_cmd = ["install", "--immutable"] if is_yarn else ["install", "--no-audit", "--no-fund"]
Copy link
Contributor Author

@brianhelba brianhelba Sep 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be reasonable to use npm ci instead of npm install. npm ci is intended for automated environments where determinism is crucial, and it's roughly equivalent to yarn install --immutable.

However, npm ci does remove and reinstall node_modules. With package download caches, this doesn't use additional bandwidth, but it's definitely slower than npm install.

Do we care more about absolute correctness or speed when building internal JS assets for a Python package?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wary of changing any of this default behavior, as I'm not a JS developer anymore and I'm not sure what affect this would have on existing consumers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, are you comfortable with this change as-is?

  • --no-fund only changes what's written to stdout, to make it cleaner in logs
  • --no-audit speeds up the command by preventing a background HTTP request and prevents additional output to stdout
  • For Yarn, --immetable is already the behavior when this is run in CI (determined via environment variables), so explicitly setting it makes the hook more consistent across all environments. Substantively, it's an important check, as it ensures that the yarn.lock will not be modified by the hook, which is very important to deterministic package builds.

if isinstance(npm, str):
npm = [npm]

is_yarn = (abs_path / "yarn.lock").exists()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's always been possible that this heuristic is incorrect, but the risk is slightly magnified now that we rely on it even if npm (the executable location) is explicitly provided.

I don't see a better alternative though, without attempting to look within npm to do pattern-matching for "yarn", which could still be incorrect (e.g. npm = "/home/john.yarn/bin/npm").

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically this change is what I am uncertain about.

The node_modules installation should be as fast and deterministic as possible,
as it's taking place internally within a Python package build. Routine
developer-facing information should be omitted, as this is expected to be
handled as part of the direct Javascript development process.

See:
https://docs.npmjs.com/cli/v7/commands/npm-install#configuration
https://yarnpkg.com/cli/install#options
@brianhelba
Copy link
Contributor Author

@blink1073 Is this something that you'll be interested in reviewing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants