15 Commits

Author SHA1 Message Date
Assaf Vayner
0958579c40 spec draft (#422)
fix XET-681

XET protocol specification initial draft

- documentation of core procedures required for file uploads and
downloads
- format specifications for shards and xorbs
2025-09-29 10:25:25 -07:00
Di Xiao
8ee0a5c958 Cache rust build in actions (#513)
In response to [A Joint Statement on Sustainable
Stewardship](https://openssf.org/blog/2025/09/23/open-infrastructure-is-not-free-a-joint-statement-on-sustainable-stewardship/)
and [Rust Foundation Signs Joint Statement on Open Source Infrastructure
Stewardship](https://rustfoundation.org/media/rust-foundation-signs-joint-statement-on-open-source-infrastructure-stewardship/),
this PR implements caching of dependencies and build artifacts, which
reduces CI runtime somewhat. Cache entry keys take the form
`os_type`-`arch_type`-`hash of Cargo.lock`; the cache configuration is adapted from
https://docs.github.com/en/actions/tutorials/build-and-test-code/rust#caching-dependencies.
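
The key layout described above can be sketched as a small helper. This is an illustration only: `cache_key` and its parameters are hypothetical names, and the lockfile hash is assumed to be computed elsewhere (for example by the CI system), not by this function.

```rust
/// Builds a cache key in the `os_type`-`arch_type`-`hash of Cargo.lock` shape.
/// Hypothetical helper: `lock_hash` is assumed to be supplied by the caller.
fn cache_key(os_type: &str, arch_type: &str, lock_hash: &str) -> String {
    format!("{}-{}-{}", os_type, arch_type, lock_hash)
}
```

Because the lockfile hash is part of the key, any dependency change produces a new key and therefore a fresh cache entry.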
2025-09-26 11:16:04 -07:00
Di Xiao
fa030edcd5 upgrade rust edition to 2024; upgrade rustc to 1.89 (#494)
- Upgrade the Rust edition and rustc version to bring in some nice features,
e.g. let chains instead of nested `if` blocks.
- Fix clippy and formatting issues caused by the upgrade.
- Fix a bug identified by the new rustc:
6cb0a7fb4e/xet_runtime/src/runtime.rs (L195)
```
#[cfg(not(target_family = "wasm"))]
{
    // A new multithreaded runtime with a capped number of threads
    TokioRuntimeBuilder::new_multi_thread().worker_threads(get_num_tokio_worker_threads())
}
```
Here the closing curly brace drops the temporary builder while a `&mut
Self` referring to the dropped value is returned. (This may be due to a
difference between compilers in how they treat the scope of the "{...}" in
`#[cfg(...)] {...}`?)
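
A minimal sketch of one way to avoid this pattern, assuming a builder whose setters return `&mut Self`. `RuntimeBuilder` and `configured_threads` are hypothetical stand-ins, not the code in `xet_runtime`. One plausible explanation, offered as an assumption rather than a confirmed diagnosis, is that the 2024 edition drops temporaries in a block's tail expression when the block exits, so a `&mut` borrow of such a temporary cannot escape the block.

```rust
/// Hypothetical stand-in for a builder whose setters return `&mut Self`.
#[derive(Debug, Default)]
struct RuntimeBuilder {
    worker_threads: usize,
}

impl RuntimeBuilder {
    fn new_multi_thread() -> Self {
        Self::default()
    }

    /// Setter that returns a borrow of the builder, not an owned value.
    fn worker_threads(&mut self, n: usize) -> &mut Self {
        self.worker_threads = n;
        self
    }

    /// Stand-in for building a runtime: reports the configured thread count.
    fn build(&mut self) -> usize {
        self.worker_threads
    }
}

fn configured_threads(n: usize) -> usize {
    // Bind the builder to a named local so it outlives the inner block.
    let mut builder = RuntimeBuilder::new_multi_thread();
    {
        // Configure in place; the `&mut Self` return value is discarded here
        // instead of being returned out of the block.
        builder.worker_threads(n);
    }
    builder.build()
}
```

Keeping the builder in a local and mutating it in place means no reference to a temporary ever leaves the block, regardless of how the edition scopes temporaries.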
2025-09-17 10:28:50 -07:00
Di Xiao
0e1f9f4cf0 Git-Xet: LFS custom transfer agent with Xet protocol (#425)
This PR builds a Git integration called `git-xet` that enables users to
upload files using the Xet protocol as part of a standard git push.

This integration builds on the Git LFS custom transfer adapter protocol,
the same mechanism we now use to handle Git LFS uploads for files larger
than 5 GB through multipart PUT.
To enable uploads to Xet, users run `git-xet install`, which writes the
following configuration to the Git config file at a selected scope
[`--system`, `--global` (default), or `--local`]:
```
[lfs "customtransfer.xet"]
	path = git-xet
	args = transfer
	concurrent = true
```
This setup registers a new transfer adapter named xet, allowing Git to
delegate LFS file transfers to the git-xet binary when applicable.
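
As a rough illustration, the same section could be produced with three `git config` invocations using the `lfs.customtransfer.<name>.*` keys. The sketch below only builds the command strings; `Scope` and `install_commands` are hypothetical names, and this is not a description of how `git-xet install` is actually implemented.

```rust
/// Git config scopes named in the text; `Global` is the stated default.
#[derive(Clone, Copy, Debug)]
enum Scope {
    System,
    Global,
    Local,
}

impl Scope {
    /// Maps a scope to the corresponding `git config` flag.
    fn flag(self) -> &'static str {
        match self {
            Scope::System => "--system",
            Scope::Global => "--global",
            Scope::Local => "--local",
        }
    }
}

/// Returns `git config` command lines equivalent to the section shown above.
fn install_commands(scope: Scope) -> Vec<String> {
    let settings = [("path", "git-xet"), ("args", "transfer"), ("concurrent", "true")];
    settings
        .iter()
        .map(|(key, value)| {
            format!("git config {} lfs.customtransfer.xet.{} {}", scope.flag(), key, value)
        })
        .collect()
}
```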

On the server side, support is rolled out in two stages:

Stage 1 (Upload): The Git LFS batch API for the "upload" operation is
updated.

- If a repo is Xet enabled but users haven't run `git-xet install`,
moon-landing rejects the request when users initiate `git push` and
returns instructions to install git-xet.

- If a repo is Xet enabled and users have git-xet configured correctly,
moon-landing accepts the request and replies with the CAS server URL and
an access token, which git-xet uses to upload files to Xet.

- If a repo is NOT Xet enabled, upload goes through the LFS path.
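
The three cases above amount to a small decision table, sketched here under stated assumptions: `UploadRoute` and `route_upload` are hypothetical names for illustration, and the real server logic in moon-landing is not shown in this log.

```rust
/// Hypothetical outcome of a Git LFS batch "upload" request.
#[derive(Debug, PartialEq)]
enum UploadRoute {
    /// Repo is Xet enabled but the client lacks the xet transfer adapter.
    RejectWithInstallHint,
    /// Repo is Xet enabled and the client is configured: reply with CAS details.
    Xet,
    /// Repo is not Xet enabled: fall back to the regular LFS path.
    Lfs,
}

/// Chooses a route from the two conditions described in the list above.
fn route_upload(repo_xet_enabled: bool, client_has_xet_adapter: bool) -> UploadRoute {
    match (repo_xet_enabled, client_has_xet_adapter) {
        (true, false) => UploadRoute::RejectWithInstallHint,
        (true, true) => UploadRoute::Xet,
        (false, _) => UploadRoute::Lfs,
    }
}
```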
2025-09-08 16:08:50 -07:00
Assaf Vayner
6203653ecf update api paths to use plural nouns (#482)
Updates paths used by the clients to use latest CAS paths as defined in
the spec.

All paths now use plural nouns, and shard upload no longer uses the hash.
This also removes the prefix and hash from the client trait's `upload_shard` function.
2025-09-08 13:02:49 -07:00
Di Xiao
740887a453 CI test on macos (#473)
We test on Ubuntu and Windows, so it seems reasonable to test on macOS
too. This also gets CI prepared for git-xet tests.
2025-08-26 10:46:34 -07:00
Assaf Vayner
6beab3b197 enforce linting on hf_xet (#462)
This PR adds an explicit lint command on the hf_xet directory. This is
necessary because it is excluded from the workspace. Other excluded
directories aren't touched very often and are less important for now.
2025-08-18 16:55:05 -07:00
Assaf Vayner
b2fc01d479 thin wasm (#411)
Re-adding a thin wasm crate for JS client development.

The build is checked in the CI job build_and_test-wasm.

At the moment it only includes a wrapper over a chunker and a function to
compute the xorb hash.
2025-07-15 10:11:38 -07:00
Di Xiao
9fbd234328 wasm poc (#272)
This implements uploading through the Xet protocol in a WASM environment, and
makes the changes needed to make dependent crates WASM compatible.
1. Uploading through the Xet protocol is done in the hf_xet_wasm crate;
2. Separate Cas Client trait definitions into upload and download
functionality groups and disable download for WASM;
3. Disable Cas Client request retry in the WASM environment, which isn't
critical for a POC (until we have a retry strategy that doesn't depend
on time);
4. Disable async CasObject deserialization;
5. Enable in-memory global dedup.

---------

Co-authored-by: Assaf Vayner <assaf@huggingface.co>
2025-06-25 12:08:48 -07:00
Assaf Vayner
80c0a7ffc9 add ci steps to check cargo.lock is up to date (#377)
hf_xet/Cargo.lock keeps going out of date, likely because people are not
always building hf_xet before pushing to the repo. This PR enforces that
hf_xet/Cargo.lock and the root Cargo.lock are up to date; a CI job fails
if they are not.
2025-06-11 10:25:28 -07:00
Joseph Godlewski
b10690d8c1 Adding windows testing for CI (#226)
* Adds a job to run tests on Windows in addition to Linux.
* Fixes linting for Windows builds.
* Fixes the LocalClient, which tries to set files as read-only; during
tests this breaks on Windows because file deletion behaves
differently than on Linux.
* Identified an issue with the disk cache deleting items if there are
simultaneous `put`s to the cache for the same key, range. This is fine
on Unix, but on Windows it causes errors (again, due to file deletion
behavior differences). This change mitigates the issue, allowing
huggingface_hub tests to pass on Windows, but opens up another issue:
we need to vet our filesystem deletes (e.g. cache eviction) for
correctness on Windows.
2025-04-04 21:50:06 +02:00
Assaf Vayner
744ae76b90 bump xet-core to rust 1.86 (#230)
Fixes #225 by skipping ahead to a version where the issue is fixed.
2025-04-04 16:21:28 +02:00
Rajat Arya
bb75c0b20f CI & Release GH Action updates (#135)
- To support sha2 builds on Windows, changed the sha2 crate dependency
  to not use the asm feature on Windows.
- Trim platforms & vendored ssl
2025-01-10 11:58:41 -08:00
Assaf Vayner
da07266034 run cargo fmt on everything (#59)
* run cargo fmt on everything

* standard rustfmt.toml

* format with nightly toolchain

* format in CI

* fix issue

* fix hf_xet
2024-10-23 17:57:45 -07:00
Di Xiao
1f006e2c12 Add github actions (#4)
2024-09-12 15:12:48 -07:00