@VeckoTheGecko
Contributor

@VeckoTheGecko VeckoTheGecko commented Oct 31, 2025

(posting now for visibility and to request feedback/pixi debugging help)

Overview

Fixes #10732

This PR migrates the dev workflow and CI for Xarray across to Pixi, providing the following benefits:

  • Composable environments via dependency groups (called "features" in Pixi)
  • Support for multiple environments
  • Task running
  • Lock file support

See the original issue for more info.

Changes so far in this PR:

  • Add pixi badge to readme
  • Migrated most environment files to Pixi config in pixi.toml, split into features that I thought were sensible. I left out environment-benchmarks.yml and binder/environment, as those interact with asv and Binder respectively. This PR is already big enough, and I think those should be explored another time.
    • I made the environments in Pixi have similar names as the original conda environments to ease migration
  • Introduced a cache-pixi-lock.yml workflow (see below section "Considerations")
  • Updated ci.yaml
    • Fixed now! (Previously 98% there: for some reason, in CI which pytest resolved to .pixi/envs/default/Scripts/pytest, while locally pixi run -e test-all-deps-py313 which pytest resolved to .pixi/envs/test-all-deps-py313/bin/pytest; see the test-pixi-dust branch and the example action run. Any ideas why, @lucascolley?)
  • Update CI additional
  • Update RTD build
  • Migrate minimal environment to Pixi as well
  • Update contributing guidelines (see "Feedback wanted" section below)
  • Migrate hypothesis tests
  • Migrate nightly dev testing

I've tried to make the commits tidy to help with reviewing commit by commit, which might be easier. I also was quite diligent when migrating from the old env files to make sure versions were the same.

Testing instructions

Resources: Pixi SciPy 2025 talk | Docs: Manifest Reference

  • pixi info -> show info about the pixi environments
  • Build documentation: pixi run doc
  • Run tests: pixi run test, then choose the environment you want to run the tests in (or pixi run -e environment_name test)
    • Most often you'll want the test-all-deps-py313 environment (corresponding to the old environment ci/requirements/environment.yml)
  • Run pre-commit: pixi run pre-commit
  • Run mypy typing: pixi run typing

Enter an environment (equivalent to conda activate): pixi shell -e env_name
Exit an environment (equivalent to conda deactivate): exit or Ctrl+D

See all tasks: pixi run

Considerations

Lock files o' lock files

There was some interesting conversation in #10732 (comment) about lock files. To summarise:

We have two choices to handle the lock files, either (a) generate them in CI, or (b) commit them to the repo and periodically update them.

(a) generating in CI (done in this PR):

  • add pixi.lock to .gitignore
  • have a workflow which generates the lock file. Cache this under a key that is date + hash(pixi.toml)
  • have all workflows restore this pixi.lock file for environment creation
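
As a rough illustration of the "date + hash(pixi.toml)" cache key described above, here is a minimal Python sketch. It is not the code used in cache-pixi-lock.yml; the function name lock_cache_key, the pixi-lock_ prefix, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from datetime import date, datetime, timezone
from typing import Optional


def lock_cache_key(manifest_bytes: bytes, today: Optional[date] = None) -> str:
    """Build a cache key from the current date and a hash of the manifest.

    The key changes when the day rolls over (forcing a fresh lock file)
    or when the manifest contents change.
    """
    if today is None:
        # Use UTC so every CI runner agrees on what "today" is.
        today = datetime.now(timezone.utc).date()
    digest = hashlib.sha256(manifest_bytes).hexdigest()
    # "pixi-lock_" is a hypothetical prefix chosen for this sketch.
    return f"pixi-lock_{today.isoformat()}_{digest}"


# Usage (assumes a pixi.toml in the working directory):
# from pathlib import Path
# print(lock_cache_key(Path("pixi.toml").read_bytes()))
```

With a key like this, every workflow that runs on the same day against the same pixi.toml restores the same pixi.lock, which is where the once-a-day generation comes from.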

Pros:

  • lock file is only generated once a day and shared across workflows - saving 40s per run
  • close to what was previously done (with daily caches)
  • minimal changes to workflows (only need to add a few lines - cache-pixi-lock.yml is re-usable across different projects).

Cons:

  • Mismatch between developers' local pixi.lock and what's in CI. Local developers need to periodically delete pixi.lock and regenerate it.
  • Misses the benefit of perfectly reproducible dev environments across developers and CI

(b) commit the lock files

(I think this is the gist of it)

  • commit the lock file (now local devs and CI can use this lockfile) - around 40k lines
  • add GitHub PR automation to automatically update the lockfile every 3 weeks
  • Most of CI works from committed lockfile, but there can be a job bleeding-edge which runs every few days by taking the current lockfile, running an update, and then running tests. Any failures can be automatically reported in an issue
    • Then, to "resolve" that issue you can add a pin in the pixi.toml manifest and talk with upstream to see what's up

Pros:

  • no need to generate lock files in CI
  • perfectly reproducible dev environments across developers and CI

Cons:

  • there is a bunch of added complexity/maintainer burden to setting this up (automated workflows etc)

@lucascolley knows the full extent, as he's been exploring this setup at SciPy

Conclusion

Approach (a) has minimal setup/maintenance with little downside. I think that it's a good solution for smaller projects in particular (we've adopted it at Parcels; cc @maxrjones, this might be interesting based on your comment).

Approach (b) is more robust if having the same environment between all devs is highly valued (@shoyer mentioned during a dev meeting that this would be good for xarray), but requires more setup.

I recommend we go for (a) as is done in this PR, and consider (b) separately.

@lucascolley would it be beneficial to do a write-up of all this on prefix.dev sometime to help guide others dealing with this? I'm happy to write or collab on a blog post.

Feedback wanted: To what extent do we promote Conda dev workflows

Yeah - I don't know. In the projects I'm working on I've gone full Pixi, but those are smaller projects.

I've deleted the old environment files to avoid duplication, but can re-add them to the extent which you want to support conda dev workflows.

I've held off on updating the contributing instructions for this reason.

EDIT: Joined the dev meeting - @keewis doesn't think it's a bad idea to fully migrate dev instructions from conda to Pixi. Later (if people really want conda instructions) we can show how to use Pixi to export a conda-compatible env file - no need for us to maintain two separate env files.


I think that's about it! I don't think I've forgotten anything, but it is late on a Friday so maybe - will update if that's the case :)

Let me know if you want me to drop by the dev meeting on 5 Nov - but I'm happy to keep this async otherwise.


(🎉 for my first significant contribution to Xarray!!!)

- Using the bare-minimum.yml requirements file to act as a starting point to build the composable environments
- Add pixi.lock to gitignore (no need to commit lock files in library repos)
- Update .gitattributes (automatically done by pixi)
- Configure xarray as source dependency with dynamic versioning
Already migrated to pixi
Update requirements files to remove deps handled by Pixi
@dcherian
Contributor

Claude solved this by adding

[tool.uv]
override-dependencies = [
    "dask @ git+https://github.com/dask/dask@main",
]

This must be a uv thing? The tight pin on dask has been around for years.

@VeckoTheGecko
Contributor Author

Thanks @dcherian! Fixed in 8abc993

@dcherian
Contributor

Hmm.. does that apply globally so every env tests against dask main when dask is requested?

@VeckoTheGecko
Contributor Author

> Hmm.. does that apply globally so every env tests against dask main when dask is requested?

No, only to the environments which have the nightly feature

@Illviljan added the run-benchmark, run-upstream, and run-pyright labels on Nov 12, 2025
@dcherian added the run-slow-hypothesis label on Nov 12, 2025
@VeckoTheGecko

This comment was marked as outdated.

@keewis
Collaborator

keewis commented Nov 13, 2025

that doesn't sound right, we have been monkeypatching sys.meta_path for a few months in test_default_engine_h5netcdf.

Instead, I believe the reason is that this particular environment does not contain netcdf4, so removing h5netcdf means that we don't have any other available engine (see also the two erroring tests).

@VeckoTheGecko
Contributor Author

VeckoTheGecko commented Nov 13, 2025

> see also the two erroring tests

Thanks, didn't see those.

I just noticed a bunch of tests in nightly are skipped, though I'm not sure if that's a regression since I don't have prior logs to compare with. The env now has fewer deps than before (previously, the base environment file was taken, and packages were deleted and added using a bash script). Are there any more dependencies that would be good for that env? Grepping through the test suite for @requires_ (filtering out irrelevant items) gives

@requires_bottleneck
@requires_cartopy
@requires_cupy
@requires_fsspec
@requires_iris
@requires_matplotlib
@requires_numbagg
@requires_pint
@requires_pyarrow
@requires_pydap
@requires_seaborn
@requires_zarr

which might be of interest

Collaborator

@keewis left a comment

the nightly env used to have a few more dependencies, and I think we should try to keep it that way: what we're testing is how certain nightly versions interact with the entire library.


# other
cftime = "*"
pint = "*"
Collaborator

move to the array section?

]

[feature.nightly.dependencies]
python = "*"
Collaborator

might be worth pinning this to 3.13?


[feature.nightly.dependencies]
python = "*"

Collaborator

should we add iris, cartopy, and seaborn, as well, here?

The other envs don't have it - can be added later
Labels

  • Automation: Github bots, testing workflows, release automation
  • CI: Continuous Integration tools
  • dependencies: Pull requests that update a dependency file
  • run-benchmark: Run the ASV benchmark workflow
  • run-pyright: Run pyright type checker
  • run-slow-hypothesis: Run slow hypothesis tests
  • run-upstream: Run upstream CI
  • topic-arrays: related to flexible array support
  • topic-documentation
  • topic-plotting
  • topic-rolling

Development

Successfully merging this pull request may close these issues.

Using Pixi for environment management

7 participants