feat(capi): add xet_capi C ABI crate for hf-xet #881

Draft
assafvayner wants to merge 30 commits into main from assaf/c-api

@assafvayner

Contributor

Summary

Adds xet_capi, a new workspace crate exposing the full xet::xet_session::XetSession surface over a C ABI, so hf-xet can be consumed from C / C++ / CGo. It follows the same thin-wrapper pattern as the existing Python (hf_xet), wasm (hf_xet_wasm), and Node (xet_pkg_napi) bindings.

Design: opaque handles freed explicitly; async transfers via handle-based polling (XetOp poll + typed take_*), backed by std threads running the existing _blocking APIs; no callbacks cross the ABI in either direction. Errors are XetStatus codes plus an opaque XetError out-param. Reports are opaque with accessor functions; scalar progress/dedup are flat #[repr(C)] structs. The header include/hf_xet.h is cbindgen-generated and committed.
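The handle-based polling design above can be sketched in a few lines of Rust. This is a minimal illustration under stated assumptions, not the crate's implementation: `DemoOp`, the `demo_op_*` functions, and the `DEMO_*` status values are hypothetical stand-ins for `XetOp`, its poll and take_* functions, and `XetStatus`, whose real signatures are not shown in this description. The worker thread stands in for a call into one of the existing _blocking APIs.

```rust
use std::os::raw::c_int;
use std::sync::mpsc::{self, Receiver, TryRecvError};
use std::thread::{self, JoinHandle};

// Hypothetical status codes for this sketch only.
pub const DEMO_OK: c_int = 0;
pub const DEMO_PENDING: c_int = 1;
pub const DEMO_INVALID: c_int = -1;

/// Opaque to C: callers only ever hold a `*mut DemoOp`.
pub struct DemoOp {
    rx: Receiver<u64>,
    result: Option<u64>,
    worker: Option<JoinHandle<()>>,
}

/// Start the work on a std thread and hand back a handle.
/// (A real cdylib would also export this symbol unmangled.)
pub extern "C" fn demo_op_start(input: u64) -> *mut DemoOp {
    let (tx, rx) = mpsc::channel();
    let worker = thread::spawn(move || {
        // Stand-in for a blocking transfer.
        let _ = tx.send(input.wrapping_mul(2));
    });
    Box::into_raw(Box::new(DemoOp { rx, result: None, worker: Some(worker) }))
}

/// Non-blocking: reports whether the result is ready yet.
pub unsafe extern "C" fn demo_op_poll(op: *mut DemoOp) -> c_int {
    if op.is_null() {
        return DEMO_INVALID;
    }
    let op = unsafe { &mut *op };
    if op.result.is_some() {
        return DEMO_OK;
    }
    match op.rx.try_recv() {
        Ok(value) => {
            op.result = Some(value);
            DEMO_OK
        }
        Err(TryRecvError::Empty) => DEMO_PENDING,
        Err(TryRecvError::Disconnected) => DEMO_INVALID,
    }
}

/// Typed take: moves the result out through `out` but never frees the op.
pub unsafe extern "C" fn demo_op_take_u64(op: *mut DemoOp, out: *mut u64) -> c_int {
    if op.is_null() || out.is_null() {
        return DEMO_INVALID;
    }
    let op = unsafe { &mut *op };
    match op.result.take() {
        Some(value) => {
            unsafe { *out = value };
            DEMO_OK
        }
        None => DEMO_INVALID,
    }
}

/// Explicit free: joins the worker, then releases the handle.
pub unsafe extern "C" fn demo_op_free(op: *mut DemoOp) {
    if op.is_null() {
        return;
    }
    let mut owned = unsafe { Box::from_raw(op) };
    if let Some(worker) = owned.worker.take() {
        let _ = worker.join();
    }
}
```

A caller loops on `demo_op_poll` until it stops returning `DEMO_PENDING`, calls the typed take, and then frees the handle. No function pointer crosses the boundary in either direction.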

Surface

  • Session + auth: xet_session_new/free, xet_init_logging, and three XetAuthConfig-driven builders (upload commit, file-download group, stream group).
  • Uploads: from-path / bytes / streaming, finalize, commit, progress, abort.
  • Downloads: XetFileInfo, download-to-path, finish, progress, abort, task_id; ordered/unordered streaming with pull-based next, progress, cancel, task_id.
  • Polling: XetOp poll + typed take_* (file-metadata / commit-report / download-report / bytes / chunk / void / error).
  • Results: opaque commit/download report handles with accessors; flat XetProgress / XetDedupMetrics; XetBytes; XetError / XetStatus.
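The split between flat structs and opaque handles in the Results bullet can be illustrated with a small sketch. All names here (`DemoProgress`, `DemoReport`, `DemoError`, `DemoStatus`, and the `demo_*` functions) are hypothetical; the actual fields of `XetProgress` and the actual report accessors are not listed in this description. The idea shown is that scalar data is copied out as a `#[repr(C)]` struct, while richer results stay behind a pointer and are read through accessors, with failures reported as a status code plus an error out-param.

```rust
use std::ffi::CString;
use std::os::raw::c_char;
use std::ptr;

/// Flat scalar data: C reads the fields directly, so the layout is pinned.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DemoProgress {
    pub bytes_completed: u64,
    pub bytes_total: u64,
}

/// Status code returned by fallible calls (values are hypothetical).
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DemoStatus {
    Ok = 0,
    InvalidArgument = 1,
}

/// Opaque to C: only reachable through the accessor functions below.
pub struct DemoReport {
    file_hash: CString,
    progress: DemoProgress,
}

/// Opaque error object handed back through an out-param.
pub struct DemoError {
    message: CString,
}

/// Rust-side constructor used only to build a report for this demo.
pub fn demo_report_new(hash: &str, done: u64, total: u64) -> *mut DemoReport {
    let report = DemoReport {
        file_hash: CString::new(hash).expect("hash must not contain NUL"),
        progress: DemoProgress { bytes_completed: done, bytes_total: total },
    };
    Box::into_raw(Box::new(report))
}

/// Store a heap-allocated error in `err` if the caller supplied a slot.
unsafe fn set_error(err: *mut *mut DemoError, text: &str) {
    if !err.is_null() {
        let boxed = Box::new(DemoError { message: CString::new(text).unwrap() });
        unsafe { *err = Box::into_raw(boxed) };
    }
}

/// Accessor that copies the flat struct out of the opaque report.
pub unsafe extern "C" fn demo_report_progress(
    report: *const DemoReport,
    out: *mut DemoProgress,
    err: *mut *mut DemoError,
) -> DemoStatus {
    if report.is_null() {
        unsafe { set_error(err, "report pointer was null") };
        return DemoStatus::InvalidArgument;
    }
    if out.is_null() {
        unsafe { set_error(err, "out pointer was null") };
        return DemoStatus::InvalidArgument;
    }
    unsafe { *out = (*report).progress };
    DemoStatus::Ok
}

/// Borrowed C string; valid until the report is freed.
pub unsafe extern "C" fn demo_report_hash(report: *const DemoReport) -> *const c_char {
    if report.is_null() {
        return ptr::null();
    }
    unsafe { (*report).file_hash.as_ptr() }
}

/// Borrowed C string; valid until the error is freed.
pub unsafe extern "C" fn demo_error_message(err: *const DemoError) -> *const c_char {
    if err.is_null() {
        return ptr::null();
    }
    unsafe { (*err).message.as_ptr() }
}

pub unsafe extern "C" fn demo_error_free(err: *mut DemoError) {
    if !err.is_null() {
        drop(unsafe { Box::from_raw(err) });
    }
}

pub unsafe extern "C" fn demo_report_free(report: *mut DemoReport) {
    if !report.is_null() {
        drop(unsafe { Box::from_raw(report) });
    }
}
```

Keeping the report opaque lets fields be added later without changing the C-visible layout, whereas the flat struct trades that flexibility for a cheap by-value copy.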

Build & safety

  • crate-type = ["cdylib", "staticlib", "rlib"]; release build produces libxet_capi.{a,dylib}.
  • Every fallible extern "C" fn is wrapped in a panic guard (ffi_guard / catch_unwind) so panics never unwind across the ABI; all pointer args null-checked.
  • cbindgen runs in build.rs; the committed header is guarded by an up-to-date test and a symbol-presence test.
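As a rough sketch of the panic-guard idea in the list above: `std::panic::catch_unwind` converts a panic inside the closure into an `Err`, which the wrapper maps to a status code instead of letting the unwind cross the ABI. The names `demo_ffi_guard`, `demo_checked_div`, and the `GUARD_*` codes are hypothetical; the signature of the crate's own ffi_guard is not shown here and may differ. The sketch assumes the default `panic = "unwind"` strategy.

```rust
use std::os::raw::c_int;
use std::panic::{catch_unwind, AssertUnwindSafe};

// Hypothetical status codes for this sketch only.
pub const GUARD_OK: c_int = 0;
pub const GUARD_NULL_ARG: c_int = 1;
pub const GUARD_PANIC: c_int = 2;

/// Run `f`, turning any panic into a status code so that no unwind
/// ever reaches the C caller.
fn demo_ffi_guard<F: FnOnce() -> c_int>(f: F) -> c_int {
    catch_unwind(AssertUnwindSafe(f)).unwrap_or(GUARD_PANIC)
}

/// Example fallible entry point: integer division panics when `b == 0`.
pub unsafe extern "C" fn demo_checked_div(a: u32, b: u32, out: *mut u32) -> c_int {
    demo_ffi_guard(|| {
        // Null-check every pointer argument before touching it.
        if out.is_null() {
            return GUARD_NULL_ARG;
        }
        let quotient = a / b; // panics on division by zero
        unsafe { *out = quotient };
        GUARD_OK
    })
}
```

When the closure panics, `out` is left untouched and the caller sees `GUARD_PANIC`; the default panic hook still prints its message to stderr.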

Test Plan

  • cargo test -p xet_capi — 15 tests pass (default features)
  • cargo test -p xet_capi --features simulation — 16 pass, including e2e_upload_then_download_via_ffi (real upload → download round-trip through the C ABI over a local:// CAS)
  • cargo clippy -p xet_capi --all-targets -- -D warnings and --features simulation — clean
  • cargo +nightly fmt -p xet_capi --check — clean
  • cargo build -p xet_capi --release — produces staticlib + cdylib
  • c_smoke_compiles: tests/smoke.c compiles against the generated header
  • Reviewer: sanity-check include/hf_xet.h against intended C consumer usage

Previously xet_op_take_* freed the op internally on success/error but not on
the wrong-variant path, an inconsistent contract that made double-frees easy
(freeing an already-consumed op joins an already-joined worker thread and
panics with ESRCH). Switch to a single predictable rule: take_* never frees;
the caller always frees every op exactly once with xet_op_free. This is the
RAII/defer-friendly model the language bindings want.

Each performs the same upload -> commit -> download-by-hash round-trip against
a real HF Xet repo via the C API, using a Hub token-refresh URL for auth. All
four were tested end-to-end against assafvayner/xet-c-api-test.
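The ownership rule for xet_op_take_* described above (take_* never frees; the caller frees every op exactly once) maps directly onto an RAII wrapper. The sketch below is illustrative only: `RawOp`, `raw_op_*`, `OwnedOp`, and the `FREE_CALLS` counter are hypothetical names, and the counter exists purely so the example can show that free runs once.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts calls to `raw_op_free`, for demonstration only.
pub static FREE_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an opaque op handle that holds a finished result.
pub struct RawOp {
    value: Option<u64>,
}

pub extern "C" fn raw_op_new(value: u64) -> *mut RawOp {
    Box::into_raw(Box::new(RawOp { value: Some(value) }))
}

/// Moves the result out but never frees the op, on any path.
pub unsafe extern "C" fn raw_op_take_u64(op: *mut RawOp, out: *mut u64) -> bool {
    if op.is_null() || out.is_null() {
        return false;
    }
    match unsafe { (*op).value.take() } {
        Some(v) => {
            unsafe { *out = v };
            true
        }
        None => false,
    }
}

/// The only function that releases the op.
pub unsafe extern "C" fn raw_op_free(op: *mut RawOp) {
    if op.is_null() {
        return;
    }
    drop(unsafe { Box::from_raw(op) });
    FREE_CALLS.fetch_add(1, Ordering::SeqCst);
}

/// RAII owner: frees the op exactly once when it goes out of scope.
pub struct OwnedOp(*mut RawOp);

impl OwnedOp {
    pub fn new(value: u64) -> Self {
        OwnedOp(raw_op_new(value))
    }

    /// Safe wrapper over the typed take; the handle stays owned by `self`.
    pub fn take_u64(&mut self) -> Option<u64> {
        let mut out = 0u64;
        if unsafe { raw_op_take_u64(self.0, &mut out) } {
            Some(out)
        } else {
            None
        }
    }
}

impl Drop for OwnedOp {
    fn drop(&mut self) {
        unsafe { raw_op_free(self.0) };
    }
}
```

Because take never consumes the handle, the wrapper needs no "already freed" flag: success, failure, and wrong-variant paths all leave cleanup to `Drop`. The same shape works with `defer` in Go or a destructor in C++.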