Skip to content

feat: scope stable abi to backend#682

Merged
drbh merged 11 commits into
mainfrom
backend-scoped-stable-abi
Jul 3, 2026
Merged

feat: scope stable abi to backend#682
drbh merged 11 commits into
mainfrom
backend-scoped-stable-abi

Conversation

@drbh

@drbh drbh commented Jun 30, 2026

Copy link
Copy Markdown
Collaborator

this is a work in progress PR that scopes torches stable abi to individual back ends rather than having it be a global attribute.

@HuggingFaceDocBuilderDev

Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions

github-actions Bot commented Jun 30, 2026

Copy link
Copy Markdown

Coverage report — kernels/

Measured on: Python 3.10 / Torch 2.12.0.
Other CI configurations are not included in this number.
Hardware-gated code paths (ROCm/XPU/NPU/Darwin/Windows) are excluded or unreachable on the Linux+CUDA runner.

Total coverage: 84.6% — threshold: 80% — ✅

Per-file breakdown
Name Stmts Miss Cover Missing
src/kernels/__init__.py 10 0 100%
src/kernels/_system.py 6 1 83% 10
src/kernels/_versions.py 63 7 89% 46, 49, 52-53, 56-57, 100
src/kernels/backends.py 194 55 72% 40, 44, 48-51, 68, 90, 108, 117, 121, 125-127, 148, 170, 181, 188-191, 201, 205-225, 233, 256-276
src/kernels/compat.py 8 1 88% 5
src/kernels/deps.py 54 4 93% 58-59, 95, 98
src/kernels/layer/__init__.py 6 0 100%
src/kernels/layer/_interval_tree.py 103 4 96% 23, 52, 147, 150
src/kernels/layer/device.py 48 14 71% 42, 47-49, 91, 96-98, 101, 149, 152, 155-157
src/kernels/layer/func.py 81 7 91% 81, 111, 183, 301, 307, 320, 338
src/kernels/layer/globals.py 5 0 100%
src/kernels/layer/kernelize.py 74 8 89% 255, 281, 289-290, 296, 300, 316-318
src/kernels/layer/layer.py 210 16 92% 167, 210, 216, 229, 337, 417-418, 430, 439, 447, 458, 487, 491, 504, 557, 587
src/kernels/layer/mode.py 14 0 100%
src/kernels/layer/repos.py 130 34 74% 27, 33, 36-41, 61-62, 68, 71-74, 88, 92, 101-102, 108, 111-114, 121-122, 128, 131-134, 141-142, 148, 151-154, 235
src/kernels/lockfile.py 71 46 35% 37-104, 108-131
src/kernels/status.py 49 2 96% 23, 81
src/kernels/utils.py 301 55 82% 65, 77-81, 87-88, 218, 222, 225, 287, 295, 334-335, 373, 404, 409, 444, 673, 676, 678, 684, 697-698, 719-731, 735-742, 750, 754-764, 768-775, 813, 817, 836, 838
src/kernels/variants.py 262 19 93% 56, 87, 108, 138, 247-248, 289, 291, 371-378, 384-390, 421-427, 439-445, 534-536
src/kernels/verify.py 88 1 99% 32
TOTAL 1777 274 85%

Updated by the Test kernels workflow on commit c4a0e0d0688190f0e515b34e01e8936cecd7f861.

Comment thread kernel-builder/src/util.rs Outdated
/// Validate that a per-backend `[torch.stable-abi]` mapping only references
/// backends that are declared in `[general].backends`.
fn validate_stable_abi(build: &CurrentConfig) -> Result<()> {
let v4::Framework::Torch(torch) = &build.framework else {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is still in draft, but vN should not spread across the code base. This validation should probably be in the v4 file itself in the deserialization of Build (since that holds both General and Framework).

Though I'm not 100% sure if this is worth validating. It's nice to be able to remove a backend from the list for quick testing without removing every other occurrence of that backend in build.toml.

Comment thread nix-builder/lib/kernel-config.nix Outdated
@drbh drbh force-pushed the backend-scoped-stable-abi branch from 0633fd1 to defa006 Compare July 1, 2026 16:11
@drbh drbh marked this pull request as ready for review July 1, 2026 20:46
Comment on lines +211 to +214
pub enum TorchAbi {
All(Version),
Mapping(BTreeMap<Backend, Version>),
}

@danieldk danieldk Jul 2, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pasted from Slack to have it in the review:

I thought a bit more about the extended Stable ABI mapping. Let's do this:

• Make v5 of the Build format. In v4 it stays String, in v5 we make it the mapping.
• Conversion from v4 to v5 is simple: when stable-abi is set, enumerate over the backends and set the stable ABI for the available backends..
• If we make v5, we should add edition to general and make it mandatory to set edition = 5. In this way we can start to unambiguously parse different versions in the future..
• If we don't detect edition, ask the user to run kernel-builder update-build (or maybe auto-update from v4 to v5, not on disk, but just in the builder, so that it works without running update-build).

By doing this, we avoid a lot of conditionals that check whether stable-abi is a string or mapping/attrset.

What do you think?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are great recommendations

  1. added v5 in latest changes with edition field and expects a mapping for stable abi
  2. added logic to upgrade v4 to v5 in memory so existing stable-abi="string" should work without issues
  3. updated update-builder to target updating to v5

these changes also include config changes to all of the examples so there are many more files changed in latest but most are one line additions to configs

@drbh

drbh commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator Author

this kernel community PR uses this branch to validate the new version/scoped stable abi builds

@danieldk danieldk left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

One question about a CMake change, plus a suggestion to hardcode the version in the v5 General.

add_compile_definitions(CUDA_KERNEL)
elseif(GPU_LANG STREQUAL "HIP")
if(NOT HIP_FOUND)
if(NOT HIP_FOUND AND NOT PYTORCH_FOUND_HIP)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this additional condition needed? Seems to make it less strict?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran into a build issue with rocm which seems related to torch 2.13. It seems like they no longer expose HIP_FOUND and now expose PYTORCH_FOUND_HIP only

build issue:
https://github.com/huggingface/kernels-community/actions/runs/28620863071/job/84876068475

pr with changes to HIP_FOUND
pytorch/pytorch@d921fd0

HIP_FOUND is not longer found in the codebase when searching on main

I opt'ed to adding PYTORCH_FOUND_HIP instead of replacing in order to avoid any regressions with older torch versions, this way we will only throw if both HIP_FOUND and PYTORCH_FOUND_HIP are false

Comment on lines +36 to +46
let edition = EditionProbe::deserialize(value.clone())
.map_err(|err| de::Error::custom(format!("invalid `general.edition`: {err}")))?
.general
.and_then(|general| general.edition);

match edition {
// Configs with editions
Some(5) => v5::Build::deserialize(value)
.map(BuildCompat::V5)
.map_err(de::Error::custom),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Comment thread kernels-data/src/config/v5.rs Outdated
pub version: usize,

/// Build format edition. Must be `5` for this schema.
pub edition: usize,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to hard-set this to 5. The serde author has a crate for this:

https://crates.io/crates/monostate

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah nice improvement, I've added in the latest commits.

@drbh drbh merged commit 8c4761c into main Jul 3, 2026
51 of 55 checks passed
@danieldk danieldk deleted the backend-scoped-stable-abi branch July 3, 2026 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants