Commit Graph

29 Commits

Author SHA1 Message Date
Assaf Vayner
5868f64ab9 fixing some issues identified in cargo audit (#802)
CI for hf-hub runs `cargo audit` and found many issues in hf-xet's
transitive dependencies. This PR attempts to resolve some of them (not
necessarily all of them).

Main changes:
- dropped derivative and reqwest-retry
- replaced bincode with postcard, only used in testing
- upgrade xet-core rand usage
- added audit CI step and ignoring some issues that we can't easily fix.





<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Medium risk because it removes `reqwest-retry`/`derivative` and
replaces part of the retry classification logic with an in-house
equivalent, which could subtly change HTTP retry behavior; the remaining
changes are dependency/version bumps and test-only serialization swaps.
> 
> **Overview**
> Adds a new CI `cargo audit` job and introduces `.cargo/audit.toml` to
ignore a small set of **dev-only** RustSec advisories with documented
rationale.
> 
> Reduces audit surface by dropping `derivative` (manual `Debug` impl
for `AuthConfig`) and removing `reqwest-retry`, replacing its
status-code classification with a local `Retryable` enum +
`default_on_request_success` helper in `RetryWrapper`.
> 
> Updates workspace deps (notably `rand` to `0.10` and `rand_distr` to
`0.6`) and adjusts call sites to the newer `rand` APIs (`RngExt`
imports, minor test/bench tweaks). Test-only binary serialization
switches from `bincode` to `postcard` (and updates affected tests), with
corresponding lockfile updates across crates.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
26377f4a1c. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
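
For illustration, a minimal sketch of what a local status-code classifier of this kind could look like. The real `Retryable` enum and `default_on_request_success` helper are not shown in this commit message, so the function name `classify_status`, the use of a bare `u16` status code, and the exact status ranges below are assumptions, not the code in `RetryWrapper`:

```rust
/// Outcome of classifying a completed HTTP request (hypothetical local enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Retryable {
    /// The failure is likely temporary, so retrying may succeed.
    Transient,
    /// The failure is permanent, so retrying will not help.
    Fatal,
}

/// Classify a response by its status code (assumed ranges).
/// Returns `None` when the request succeeded and no retry decision is needed.
pub fn classify_status(status: u16) -> Option<Retryable> {
    match status {
        // 2xx: success, nothing to retry.
        200..=299 => None,
        // Request timeout and rate limiting are usually worth retrying.
        408 | 429 => Some(Retryable::Transient),
        // 5xx: server-side errors are treated as temporary.
        500..=599 => Some(Retryable::Transient),
        // Everything else (for example 404 or 401) is treated as permanent.
        _ => Some(Retryable::Fatal),
    }
}
```

Keeping the classification in one small pure function makes it straightforward to unit-test that retry behavior did not change when the external crate was removed.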
2026-04-20 14:49:48 -07:00
Di Xiao
efc8359323 Crates release workflow (#785)
## Summary

Adds two workflows to automate the crates.io release process, and
refactors the CI WASM job into a reusable composite action.

**Release process** (two separate manual steps):

1. **`bump-crates-version.yml`** (triggered via `workflow_dispatch` with
a `version` input): updates version fields in `Cargo.toml` files, runs
`cargo build` + `cargo test` to validate, builds the `hf-xet` Python
wheel and WASM targets to update related `Cargo.lock` files, then opens
a PR (e.g. `crates-release/1.6.0`). The workflow terminates after PR
creation.
2. **`crates-release.yml`** (triggered manually via `workflow_dispatch`
after the version-bump PR is merged): checks out `main`, authenticates
to crates.io via OIDC Trusted Publishing, and publishes crates in
dependency order with index-propagation delays: `xet-runtime` →
`xet-core-structures` → `xet-client` → `xet-data` → `hf-xet`. Requires
manual approval via the `crates-release` GitHub environment.
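
The publish ordering described in step 2 can be sketched as a small release plan. This is only an illustration: the `PublishStep` type, the `release_plan` function, and the delay value a caller passes in are hypothetical and are not taken from the workflow file:

```rust
use std::time::Duration;

/// Crates in the dependency order listed in the release process above.
pub const PUBLISH_ORDER: [&str; 5] = [
    "xet-runtime",
    "xet-core-structures",
    "xet-client",
    "xet-data",
    "hf-xet",
];

/// One step of a hypothetical release plan: the command to run, then how
/// long to wait so the crates.io index can propagate the new version.
#[derive(Debug)]
pub struct PublishStep {
    pub command: String,
    pub wait_after: Duration,
}

/// Build the ordered list of publish steps. No wait is needed after the
/// last crate because nothing depends on it within this release.
pub fn release_plan(index_delay: Duration) -> Vec<PublishStep> {
    let last = PUBLISH_ORDER.len() - 1;
    PUBLISH_ORDER
        .iter()
        .enumerate()
        .map(|(i, name)| PublishStep {
            command: format!("cargo publish -p {name}"),
            wait_after: if i == last { Duration::ZERO } else { index_delay },
        })
        .collect()
}
```

Publishing a crate before its dependencies are visible in the index would fail resolution, which is why each step other than the last carries a wait.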

**Design notes:**
- Split into two workflows to avoid holding a runner while waiting for
the PR to be reviewed and merged
- Version bump is committed to a PR so the repo always reflects the
published version
- Uses OIDC Trusted Publishing (`rust-lang/crates-io-auth-action`) — no
long-lived secrets required. See
https://crates.io/docs/trusted-publishing

**CI refactor:**
- Extracts the nightly Rust/WASM toolchain setup and `hf_xet*_wasm`
builds into a reusable composite action (`.github/actions/build-wasm`)
- The composite action saves and restores the caller's default toolchain
around the nightly build, so callers are not affected
- Adds post-build porcelain checks in CI to fail if either WASM
`Cargo.lock` has uncommitted changes after building
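
As a rough illustration of the porcelain check, the sketch below filters `git status --porcelain` (v1) output for changed `Cargo.lock` files. The function name `dirty_lockfiles` is hypothetical, and the CI job may well do this in shell instead; the sketch also ignores rename entries and quoted paths:

```rust
/// Return the paths of `Cargo.lock` files reported as changed in
/// `git status --porcelain` (v1) output.
pub fn dirty_lockfiles(porcelain: &str) -> Vec<&str> {
    porcelain
        .lines()
        // Each v1 entry is two status characters, a space, then the path.
        .filter_map(|line| line.get(3..))
        // Keep only lockfiles; other changed files are not this check's concern.
        .filter(|path| path.ends_with("Cargo.lock"))
        .collect()
}
```

A CI step built on this idea would fail the job whenever the returned list is non-empty after the WASM build.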

## One-time manual setup required

Before this workflow can run successfully, complete the following:

### GitHub

- [x] Create a GitHub Environment named **`crates-release`**: repo
Settings → Environments → New environment
- [x] Add **required reviewers** to the `crates-release` environment —
this is the manual approval gate before the `publish` job runs

### crates.io — Trusted Publishing

Each crate must have been published manually at least once before
Trusted Publishing can be configured. For each crate, go to its Settings
page on crates.io → **Trusted Publishing** → **Add**, and fill in:

| Field | Value |
|---|---|
| Owner | `huggingface` |
| Repository | `xet-core` |
| Workflow name | `crates-release.yml` |
| Environment | `crates-release` |

- [x] Configure Trusted Publishing for **`xet-runtime`**
- [x] Configure Trusted Publishing for **`xet-core-structures`**
- [x] Configure Trusted Publishing for **`xet-client`**
- [x] Configure Trusted Publishing for **`xet-data`**
- [x] Configure Trusted Publishing for **`hf-xet`**
2026-04-10 03:14:14 -07:00
Pauline Bailly-Masson
2659c69892 🔒 Pin GitHub Actions to commit SHAs (#772)
## 🔒 Pin GitHub Actions to commit SHAs

This PR pins all GitHub Actions to their exact commit SHA instead of
mutable tags or branch names.

**Why?**
Pinning to a SHA prevents supply chain attacks where a tag (e.g. `v4`)
could be moved to point to malicious code.

### Changes

| Workflow | Action | Before | After | SHA |
|---|---|---|---|---|
| `hf-xet-tests.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `hf-xet-tests.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `hf-xet-tests.yml` | `actions/setup-python` | `v6` | `v6` | `a309ff8b426b…` |
| `hf-xet-tests.yml` | `PyO3/maturin-action` | `v1` | `v1` | `04ac600d27cd…` |
| `release.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `release.yml` | `actions/setup-python` | `v6` | `v6` | `a309ff8b426b…` |
| `release.yml` | `PyO3/maturin-action` | `v1` | `v1` | `04ac600d27cd…` |
| `release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `release.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `release.yml` | `actions/setup-python` | `v6` | `v6` | `a309ff8b426b…` |
| `release.yml` | `PyO3/maturin-action` | `v1` | `v1` | `04ac600d27cd…` |
| `release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `release.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `release.yml` | `actions/setup-python` | `v6` | `v6` | `a309ff8b426b…` |
| `release.yml` | `PyO3/maturin-action` | `v1` | `v1` | `04ac600d27cd…` |
| `release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `release.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `release.yml` | `actions/setup-python` | `v6` | `v6` | `a309ff8b426b…` |
| `release.yml` | `PyO3/maturin-action` | `v1` | `v1` | `04ac600d27cd…` |
| `release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `release.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `release.yml` | `PyO3/maturin-action` | `v1` | `v1` | `04ac600d27cd…` |
| `release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `release.yml` | `actions/download-artifact` | `v7` | `v7` | `37930b1c2aba…` |
| `release.yml` | `actions/attest-build-provenance` | `v3` | `v3` | `977bb373ede9…` |
| `release.yml` | `PyO3/maturin-action` | `v1` | `v1` | `04ac600d27cd…` |
| `release.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `release.yml` | `actions/download-artifact` | `v7` | `v7` | `37930b1c2aba…` |
| `ci.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `ci.yml` | `dtolnay/rust-toolchain` | `stable` | `nightly` | `3c5f7ea28cd6…` |
| `ci.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `ci.yml` | `bnjbvr/cargo-machete` | `main` | `main` | `b81ce1560c5f…` |
| `ci.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `ci.yml` | `dtolnay/rust-toolchain` | `1.89.0` | `1.94.1` | `3c5f7ea28cd6…` |
| `ci.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `ci.yml` | `dtolnay/rust-toolchain` | `1.89.0` | `1.94.1` | `3c5f7ea28cd6…` |
| `ci.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `ci.yml` | `dtolnay/rust-toolchain` | `1.89.0` | `1.94.1` | `3c5f7ea28cd6…` |
| `ci.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `ci.yml` | `dtolnay/rust-toolchain` | `1.89.0` | `1.94.1` | `3c5f7ea28cd6…` |
| `ci.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `ci.yml` | `dtolnay/rust-toolchain` | `nightly` | `nightly` | `3c5f7ea28cd6…` |
| `git-xet-release.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `git-xet-release.yml` | `dtolnay/rust-toolchain` | `1.89.0` | `1.94.1` | `3c5f7ea28cd6…` |
| `git-xet-release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `git-xet-release.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `git-xet-release.yml` | `dtolnay/rust-toolchain` | `1.89.0` | `1.94.1` | `3c5f7ea28cd6…` |
| `git-xet-release.yml` | `lando/code-sign-action` | `v3` | `v3` | `a5703d3b5486…` |
| `git-xet-release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `git-xet-release.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `git-xet-release.yml` | `dtolnay/rust-toolchain` | `1.89.0` | `1.94.1` | `3c5f7ea28cd6…` |
| `git-xet-release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `git-xet-release.yml` | `actions/upload-artifact` | `v6` | `v6` | `b7c566a772e6…` |
| `git-xet-release.yml` | `actions/checkout` | `v6` | `v6.0.2` | `de0fac2e4500…` |
| `git-xet-release.yml` | `actions/download-artifact` | `v7` | `v7` | `37930b1c2aba…` |

> 🤖 Generated by `/github-actions-audit` — [security/pin-actions-to-sha]


Closes huggingface/tracking-issues#291


Co-authored-by: di <di@huggingface.co>
2026-04-02 11:23:49 -07:00
Assaf Vayner
9c0cb6e4c8 Reduce workspace dependencies (batches 1-3) (#746)
## Summary

- **Remove unused dependencies**: warp (zero imports), paste (zero
invocations), tower-service (zero imports), and heed misplacement in
xet_core_structures
- **Move mockall to dev-dependencies** in xet_client by gating
`#[automock]` with `#[cfg_attr(test, automock)]`
- **Feature-gate simulation module** behind `simulation` cargo feature
in xet_client, making axum, heed, humantime, futures-util,
human-bandwidth, and tower-http optional
- **Replace duration-str with humantime** (~2 deps vs ~78 transitive
deps) across xet_runtime, xet_client simulation, and simulation crate

## Impact

| Metric | Before | After | Change |
|---|---|---|---|
| hf-xet production deps | 371 | 321 | **-50** |
| Workspace total | 575 | 569 | -6 |

## Test plan

- [x] `cargo check --workspace` passes
- [x] `cargo check -p hf-xet` passes (without simulation feature — key
validation)
- [x] `cargo test --workspace` — all tests pass (4 pre-existing auth
test failures in git_xet unrelated to this PR)
- [x] `cargo tree -p hf-xet -e normal --prefix none | sort -u | wc -l`
confirms 321 deps

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Medium risk because it changes dependency graph and Cargo feature
gating (notably `xet-client` simulation modules and CI test features),
which can affect build/test behavior across targets despite minimal
runtime logic changes.
> 
> **Overview**
> Reduces workspace dependency surface by removing `duration-str`
(replaced with `humantime`) and trimming other transitive-heavy crates;
updates lockfiles accordingly across the workspace, `hf_xet`, and WASM
builds.
> 
> Introduces/propagates a `simulation` Cargo feature: `xet-client`’s
simulation server-related deps become optional and are only
compiled/exported when `feature = "simulation"` is enabled; `git_xet`
adds a `simulation` feature that forwards to dependent crates, and CI
now runs tests with `strict simulation git-xet-for-integration-test`.
> 
> Minor repo hygiene updates include ignoring `.claude/` in `.gitignore`
and wiring the `simulation` crate to depend on `xet-client` with
`features = ["simulation"]` (plus swapping its duration parsing helper
to `humantime`).
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
6abc194398. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 09:54:36 -07:00
Di Xiao
b3edd92a10 Split out cargo bench compile check (#753)
Running `cargo bench --no-run` on every test platform is slow. This PR
- extracts benchmark compilation verification from the Linux and macOS
build_and_test jobs into a dedicated `check-bench-compiles` job so it
runs in parallel with the cargo test jobs;
- skips compiling `git_xet` in release mode, since it contains no
benchmarks and takes the longest to compile due to optimized
linking;
- removes unused clippy component installs from the Windows and macOS
toolchain setup.

The `check-bench-compiles` job finishes faster than
`build_and_test-linux` and `build_and_test-win`, so it does not
introduce extra wait time.
2026-03-25 22:25:20 -07:00
Di Xiao
101837f691 Remove cargo bench from Windows CI (#752)
As discussed, removing this step from Windows CI because it's just too
slow on Windows.
2026-03-23 17:23:42 -07:00
Di Xiao
4d24627180 Fix bench code compilation after repo restructuring (#728)
The last repo restructuring didn't update several pieces of bench code
that are not compiled by default as part of `cargo build`. This PR fixes
those compilation errors and warnings, and adds `cargo bench --no-run` to
CI, which checks compilation but doesn't actually run benchmarks.
2026-03-19 09:28:57 -07:00
Hoyt Koepke
45d38a13a9 Code reorganization towards release of xet cargo package (#693)
This PR is a massive rearrangement of the code base into 5 packages
intended for release on cargo. The directories and corresponding
packages are:

1. xet_runtime/ — compiles into the xet-runtime package. Contains the
runtime, config, and logging management.
2. xet_core_structures/ — compiles into the xet-core-structures package.
Contains core data structures for hashing, shards, and xorbs as well as
internal data structures that depend on these.
3. xet_client/ — compiles into the xet-client package, contains client
code for remotely connecting to the Hugging Face servers.
4. xet_data/ — compiles into the xet-data package, contains the data
processing pipeline: chunking/deduplication, file reconstruction,
clean/smudge operations, and progress tracking.
5. xet_pkg/ — compiles into the hf-xet package, provides the top-level
session-based API for file upload and download with user-facing error
categorization. This is the primary package downstream dependencies
would use. This also contains a single summary error type, XetError,
that translates cleanly into python error types.

In addition, the other tools are: 

- git_xet/ — the git_xet CLI binary crate (location preserved). 
- hf_xet/ -- the hf_xet python package (location preserved).
- simulation/ — the simulation crate for upload scenario benchmarking.
- wasm/ -- the wasm objects. 

The full description — and information for an AI agent to use to update
downstream dependencies — is at
api_changes/update_260309_package_restructure.md.

Summary of moves:

- xet_runtime: became xet_runtime::core inside xet_runtime/.
- utils: became xet_runtime::utils inside xet_runtime/.
- xet_config: became xet_runtime::config inside xet_runtime/.
- xet_logging: became xet_runtime::logging inside xet_runtime/.
- error_printer: became xet_runtime::error_printer inside xet_runtime/.
- file_utils: became xet_runtime::file_utils inside xet_runtime/.
- merklehash: became xet_core_structures::merklehash inside
xet_core_structures/.
- mdb_shard: became xet_core_structures::metadata_shard inside
xet_core_structures/.
- xorb_object: became xet_core_structures::xorb_object inside
xet_core_structures/.
- cas_client: became xet_client::cas_client inside xet_client/.
- hub_client: became xet_client::hub_client inside xet_client/.
- cas_types: became xet_client::cas_types inside xet_client/.
- chunk_cache: became xet_client::chunk_cache inside xet_client/.
- data: became xet_data::processing inside xet_data/.
- deduplication: became xet_data::deduplication inside xet_data/.
- file_reconstruction: became xet_data::file_reconstruction inside
xet_data/.
- progress_tracking: became xet_data::progress_tracking inside
xet_data/.
- xet_session: became xet::xet_session inside xet_pkg/.

- Wasm packages (hf_xet_wasm, hf_xet_thin_wasm): moved from top-level
into wasm/; internal imports updated, public APIs unchanged.
2026-03-11 12:02:38 -07:00
Brian Ronan
17e900a70e Feat: optional request_headers on hf_xet API calls (#661)
Adds support for setting an optional `request_headers` map on the
hf_xet upload and download API calls. This map is augmented with the
hf_xet user agent string and is passed along with the requests to
xetcas.
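
A minimal sketch of one way such a merge could behave. The actual merging rules in `hf_xet/lib.rs` are not described here, so the function name `merge_request_headers`, the case-insensitive lookup, and the choice to append the hf_xet user agent after a caller-supplied one are all assumptions:

```rust
use std::collections::HashMap;

/// Merge optional caller-supplied headers with the hf_xet user agent string.
/// Assumed behavior: an existing User-Agent value is kept and the hf_xet
/// string is appended; otherwise a new User-Agent header is added.
pub fn merge_request_headers(
    request_headers: Option<HashMap<String, String>>,
    xet_user_agent: &str,
) -> HashMap<String, String> {
    let mut merged = request_headers.unwrap_or_default();

    // Header names are case-insensitive, so look for any spelling of the key.
    let existing_key = merged
        .keys()
        .find(|key| key.eq_ignore_ascii_case("user-agent"))
        .cloned();

    match existing_key {
        Some(key) => {
            if let Some(value) = merged.get_mut(&key) {
                value.push(' ');
                value.push_str(xet_user_agent);
            }
        }
        None => {
            merged.insert("User-Agent".to_string(), xet_user_agent.to_string());
        }
    }
    merged
}
```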

This PR also adds unit tests for the map merging behavior to
`hf_xet/lib.rs` and adds support for running them with `cargo test`
and in a GitHub Actions CI step.
2026-02-23 14:43:58 -08:00
Salman Chishti
adbd4fa433 Upgrade GitHub Actions to latest versions (#615)
## Summary
This PR upgrades GitHub Actions to their latest versions for Node.js 24
compatibility and security updates.

## Changes

| Action | Old Version(s) | New Version | Files |
|--------|---------------|-------------|-------|
| actions/attest-build-provenance | v1 | v3 | release.yml |


## Why these changes?
- Keeps actions up to date with latest stable releases
- Updated actions include security fixes and new features

## Testing
These changes only update action versions and don't modify workflow
logic.

---------

Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
2026-01-26 10:22:45 -10:00
Salman Chishti
8ae8501cea Upgrade GitHub Actions for Node 24 compatibility (#600)
Upgrade GitHub Actions to their latest versions to ensure compatibility
with Node 24, as Node 20 will reach end-of-life in April 2026, per [GitHub's
announcement](https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/).

---------

Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
Co-authored-by: di <di@huggingface.co>
2026-01-06 10:11:06 -08:00
Di Xiao
d15295eff3 Clean up dependencies (#595)
- Remove dependencies from Cargo.toml files that are not used.
- Move dependencies directly referencing crates.io from crate level
Cargo.toml to the workspace Cargo.toml.
- Fix using RemoteClient in WASM: AdaptiveConcurrencyController uses
`tokio::time::Instant` which wraps `std::time::Instant` and is not
available in WASM.
- Add [cargo-machete](https://github.com/bnjbvr/cargo-machete) to CI to
check unused dependencies.

No functionality change.
2025-12-15 15:26:02 -08:00
Di Xiao
5f77ffc46a Integration test for ssh access on Windows (#566)
This PR builds on top of
https://github.com/huggingface/xet-core/pull/565 and builds an
integration test to test access to "ssh" and "sh" on Windows through the
"git" (-> "git-lfs") -> "git-xet" call chain.

Out of all the ssh variants, access to programs like "plink", "putty",
"tortoiseplink" or "simple" should be given by the env var
`$GIT_SSH_COMMAND` or `$GIT_SSH`, or by the git config entry
`core.sshCommand`. Direct access to the most commonly used utility "ssh"
and indirect access to "ssh" via "sh -c" on Windows are provided by the
"git" (-> "git-lfs") -> "git-xet" call chain; see
git_xet/tests/test_ssh.rs for details.
2025-11-20 03:22:19 -08:00
Assaf Vayner
f0895142cb move spec to docs (#515)
Publish the xet-spec to the Hub docs from xet-core. This needs to be
merged first, before iterating to get the GitHub workflows working right.
2025-09-29 12:37:21 -07:00
Assaf Vayner
0958579c40 spec draft (#422)
fix XET-681

XET protocol specification initial draft

- documentation of core procedures required for file uploads and
downloads
- format specifications for shards and xorbs
2025-09-29 10:25:25 -07:00
Di Xiao
8ee0a5c958 Cache rust build in actions (#513)
In response to [A Joint Statement on Sustainable
Stewardship](https://openssf.org/blog/2025/09/23/open-infrastructure-is-not-free-a-joint-statement-on-sustainable-stewardship/)
and [Rust Foundation Signs Joint Statement on Open Source Infrastructure
Stewardship](https://rustfoundation.org/media/rust-foundation-signs-joint-statement-on-open-source-infrastructure-stewardship/),
this PR implements caching of dependencies and build artifacts, which
reduces some CI runtime. Cache entry keys are formed as
`os_type`-`arch_type`-`hash of Cargo.lock`; the cache configuration is adapted from
https://docs.github.com/en/actions/tutorials/build-and-test-code/rust#caching-dependencies.
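
To make the key layout concrete, here is a small sketch that assembles a key of that shape. The function name `cache_key` is hypothetical, and `DefaultHasher` is only a stand-in for whatever file hash the workflow actually uses (GitHub's `hashFiles` produces a SHA-256 digest, which the Rust standard library does not provide):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Build a cache key of the form `os_type-arch_type-hash of Cargo.lock`.
/// The hash here is a stand-in, not the digest used by the real workflow.
pub fn cache_key(os_type: &str, arch_type: &str, cargo_lock_contents: &str) -> String {
    let mut hasher = DefaultHasher::new();
    cargo_lock_contents.hash(&mut hasher);
    format!("{os_type}-{arch_type}-{:016x}", hasher.finish())
}
```

Because the lockfile hash is part of the key, any dependency change produces a new cache entry, while unchanged dependencies reuse the existing one on the same OS and architecture.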
2025-09-26 11:16:04 -07:00
Di Xiao
fa030edcd5 upgrade rust edition to 2024; upgrade rustc to 1.89 (#494)
- Upgrade Rust edition and rustc version to bring in some nice features,
e.g. let chains instead of nested if block.
- Fix clippy and format due to the upgrade.
- Fix a bug identified by the new rustc:
6cb0a7fb4e/xet_runtime/src/runtime.rs (L195)
```
#[cfg(not(target_family = "wasm"))]
{
    // A new multithreaded runtime with a capped number of threads
    TokioRuntimeBuilder::new_multi_thread().worker_threads(get_num_tokio_worker_threads())
}
```
Here the closing curly bracket drops the temporary builder while a `&mut
Self` to the dropped value is returned. (Could this be due to a difference
between compilers in how they treat the scope of the `{...}` in
`#[cfg(...)] {...}`?)
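
A self-contained sketch of the usual way to avoid this class of problem: bind the builder to a named local before calling setters that return `&mut Self`. `RuntimeBuilder`, `Runtime`, `build_runtime`, and the thread name are hypothetical stand-ins, not the code in xet_runtime or the tokio API:

```rust
/// Stand-in for a runtime builder whose setters return `&mut Self`.
#[derive(Debug, Default)]
pub struct RuntimeBuilder {
    worker_threads: usize,
    thread_name: String,
}

/// Stand-in for the runtime produced by the builder.
#[derive(Debug, PartialEq)]
pub struct Runtime {
    pub worker_threads: usize,
    pub thread_name: String,
}

impl RuntimeBuilder {
    pub fn new_multi_thread() -> Self {
        Self::default()
    }

    pub fn worker_threads(&mut self, n: usize) -> &mut Self {
        self.worker_threads = n;
        self
    }

    pub fn thread_name(&mut self, name: &str) -> &mut Self {
        self.thread_name = name.to_string();
        self
    }

    pub fn build(&mut self) -> Runtime {
        Runtime {
            worker_threads: self.worker_threads,
            thread_name: self.thread_name.clone(),
        }
    }
}

pub fn build_runtime(num_threads: usize) -> Runtime {
    // The builder is a named local, so it outlives every `&mut Self` that the
    // setters hand back. A block whose tail expression is
    // `RuntimeBuilder::new_multi_thread().worker_threads(n)` would instead
    // yield a reference into a temporary that is dropped too early.
    let mut builder = RuntimeBuilder::new_multi_thread();

    #[cfg(not(target_family = "wasm"))]
    {
        // Platform-specific configuration mutates the local in place.
        builder.worker_threads(num_threads);
    }

    builder.thread_name("xet-worker").build()
}
```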
2025-09-17 10:28:50 -07:00
Di Xiao
0e1f9f4cf0 Git-Xet: LFS custom transfer agent with Xet protocol (#425)
This PR builds a Git integration called `git-xet` that enables users to
upload files using the Xet protocol as part of a standard git push.

This integration builds on the Git LFS custom transfer adapter protocol,
the same mechanism we now use to handle Git LFS uploads for files larger
than 5 GB through multipart PUT.
To enable uploads to Xet, users run `git-xet install`, which writes the
following configuration to the Git config file at a selected scope
[`--system`, `--global` (default), or `--local`]:
```
[lfs "customtransfer.xet"]
	path = git-xet
	args = transfer
	concurrent = true
```
This setup registers a new transfer adapter named xet, allowing Git to
delegate LFS file transfers to the git-xet binary when applicable.

On the server side, support is rolled out in two stages:

Stage 1 (Upload): The Git LFS batch API for the "upload" operation is
updated.

- If a repo is Xet enabled but the user hasn't run `git-xet install`,
moon-landing rejects the request when the user initiates a git push and
returns instructions to install git-xet.

- If a repo is Xet enabled and users have git-xet configured correctly,
moon-landing accepts the request and replies with CAS server URL and
access token, which git-xet will use to upload files to Xet.

- If a repo is NOT Xet enabled, upload goes through the LFS path.
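
The three Stage 1 cases can be summarized as a small decision function. This is only an illustration of the branching described above: the enum, the function name, and the two boolean inputs are hypothetical and do not reflect how moon-landing actually detects the client's configuration:

```rust
/// Possible server responses to an LFS batch "upload" request (illustrative).
#[derive(Debug, PartialEq, Eq)]
pub enum UploadDecision {
    /// Reject the request and tell the user to install git-xet.
    RejectWithInstallInstructions,
    /// Accept and reply with the CAS server URL and access token.
    UseXet,
    /// Fall back to the regular LFS upload path.
    UseLfs,
}

/// Decide how to handle an upload, given whether the repo is Xet enabled
/// and whether the client has git-xet configured.
pub fn decide_upload(repo_xet_enabled: bool, client_has_git_xet: bool) -> UploadDecision {
    match (repo_xet_enabled, client_has_git_xet) {
        (true, false) => UploadDecision::RejectWithInstallInstructions,
        (true, true) => UploadDecision::UseXet,
        // A repo without Xet always goes through LFS, whatever the client has.
        (false, _) => UploadDecision::UseLfs,
    }
}
```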
2025-09-08 16:08:50 -07:00
Assaf Vayner
6203653ecf update api paths to use plural nouns (#482)
Updates paths used by the clients to use latest CAS paths as defined in
the spec.

All paths now use plural nouns, and shard upload no longer uses the hash.
This also removes the prefix and hash from the client trait's
`upload_shard` function.
2025-09-08 13:02:49 -07:00
Di Xiao
740887a453 CI test on macos (#473)
We test on Ubuntu and Windows, so it seems reasonable to test on macOS
too. This also gets CI prepared for git-xet tests.
2025-08-26 10:46:34 -07:00
Assaf Vayner
6beab3b197 enforce linting on hf_xet (#462)
This PR adds an explicit lint command on the hf_xet directory. This is
necessary because it is excluded from the workspace. Other excluded
directories aren't touched very often and are less important for now.
2025-08-18 16:55:05 -07:00
Assaf Vayner
b2fc01d479 thin wasm (#411)
Re-adding a thin wasm crate for JS client development.

checks build in ci job build_and_test-wasm

only includes a wrapper over a chunker and function to compute xorb hash
at the moment.
2025-07-15 10:11:38 -07:00
Di Xiao
9fbd234328 wasm poc (#272)
This implements uploading through Xet protocol in WASM environment, and
makes necessary changes to make dependent crates WASM compatible.
1. Uploading through Xet protocol is done in hf_xet_wasm crate;
2. Separate Cas Client trait definitions into upload and download
functionality groups and disable download for WASM;
3. Disable Cas Client request retry in the WASM environment, which isn't
critical for a POC (until we have a retry strategy that doesn't depend
on time);
4. Disable async CasObject deserialization;
5. Enable in-memory global dedup;

---------

Co-authored-by: Assaf Vayner <assaf@huggingface.co>
2025-06-25 12:08:48 -07:00
Assaf Vayner
80c0a7ffc9 add ci steps to check cargo.lock is up to date (#377)
We keep ending up with an out-of-date hf_xet/Cargo.lock, likely because
people do not always build hf_xet before pushing to the repo. This PR
enforces that hf_xet/Cargo.lock and the root Cargo.lock are up to
date; a CI job will fail if they are not.
2025-06-11 10:25:28 -07:00
Joseph Godlewski
b10690d8c1 Adding windows testing for CI (#226)
* Adds a job to run tests on windows in addition to linux.
* Fixes linting for windows builds
* Fixes the LocalClient that tries to set files as read-only, which,
during tests, will break on windows due to file deletion behaving
differently than on Linux.
* Identified an issue with the disk cache deleting items if there are
simultaneous `put`s to the cache for the same key, range. This is fine
on unix, but on windows, causes errors (again, due to file deletion
behavior differences). This change mitigates the issue, allowing
huggingface_hub tests to pass on windows, but opens up another issue of
us needing to vet our filesystem deletes (e.g. cache eviction) for
correctness on windows.
2025-04-04 21:50:06 +02:00
Assaf Vayner
744ae76b90 bump xet-core to rust 1.86 (#230)
Fixes #225 by skipping ahead to a version where the issue is
fixed.
2025-04-04 16:21:28 +02:00
Rajat Arya
bb75c0b20f CI & Release GH Action updates (#135)
- To support sha2 builds on Windows, changed the sha2 crate dependency
  to not use the asm feature on Windows.
- Trim platforms & vendored ssl
2025-01-10 11:58:41 -08:00
Assaf Vayner
da07266034 run cargo fmt on everything (#59)
* run cargo fmt on everything

* standard rustfmt.toml

* format with nightly toolchain

* format in CI

* fix issue

* fix hf_xet
2024-10-23 17:57:45 -07:00
Di Xiao
1f006e2c12 Add github actions (#4) 2024-09-12 15:12:48 -07:00