xet-data

Data processing pipeline for chunking, deduplication, and file reconstruction. Intended to be used through the API in the hf-xet package.

Overview

  • Content-defined chunking — Gear-hash based chunking for deduplication
  • Deduplication — Probe and register chunks against metadata shards
  • File reconstruction — Reassemble files from deduplicated chunk references
  • Progress tracking — Hooks for upload/download progress reporting
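
To make the first three items above concrete, here is a minimal, self-contained sketch of Gear-hash chunking, chunk-level deduplication, and reconstruction. It is not this crate's API: the names `gear_table`, `chunk_boundaries`, `chunk_key`, and `ChunkStore` are hypothetical, the table seed, size limits, and mask are arbitrary illustration values, and `DefaultHasher` stands in for the cryptographic chunk hash a real system would need.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Build a 256-entry gear table from a fixed seed with SplitMix64.
/// Assumption: illustrative values only, not the table used by xet-data.
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    for entry in table.iter_mut() {
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        *entry = z ^ (z >> 31);
    }
    table
}

/// Return the exclusive end offset of each content-defined chunk in `data`.
/// A cut is made when the masked gear hash is zero (once the chunk has at
/// least `min_size` bytes) or when the chunk reaches `max_size` bytes.
fn chunk_boundaries(data: &[u8], min_size: usize, max_size: usize, mask: u64) -> Vec<usize> {
    let table = gear_table();
    let mut boundaries = Vec::new();
    let mut start = 0usize;
    let mut hash = 0u64;
    for (i, &byte) in data.iter().enumerate() {
        // Gear update: shift out old history, mix in the new byte.
        hash = (hash << 1).wrapping_add(table[byte as usize]);
        let len = i + 1 - start;
        if len < min_size {
            continue;
        }
        if hash & mask == 0 || len >= max_size {
            boundaries.push(i + 1);
            start = i + 1;
            hash = 0;
        }
    }
    // Whatever remains after the last cut forms the final chunk.
    if start < data.len() {
        boundaries.push(data.len());
    }
    boundaries
}

/// Stand-in chunk identifier. Assumption: a 64-bit non-cryptographic hash is
/// used here only to keep the sketch dependency-free.
fn chunk_key(chunk: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    chunk.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical in-memory chunk store keyed by chunk hash.
#[derive(Default)]
struct ChunkStore {
    chunks: HashMap<u64, Vec<u8>>,
}

impl ChunkStore {
    /// Chunk `data`, store only chunks not seen before, and return the
    /// ordered list of chunk keys needed to rebuild the file.
    fn add_file(&mut self, data: &[u8], min_size: usize, max_size: usize, mask: u64) -> Vec<u64> {
        let mut recipe = Vec::new();
        let mut start = 0;
        for end in chunk_boundaries(data, min_size, max_size, mask) {
            let chunk = &data[start..end];
            let key = chunk_key(chunk);
            self.chunks.entry(key).or_insert_with(|| chunk.to_vec());
            recipe.push(key);
            start = end;
        }
        recipe
    }

    /// Reassemble a file from its chunk keys; `None` if a chunk is missing.
    fn reconstruct(&self, recipe: &[u64]) -> Option<Vec<u8>> {
        let mut out = Vec::new();
        for key in recipe {
            out.extend_from_slice(self.chunks.get(key)?);
        }
        Some(out)
    }
}
```

The mask in this sketch is meant to select high bits of the hash (for example `0xFF00_0000_0000_0000`), because with a left-shifting gear hash the low bits depend only on the most recent few bytes. Adding the same file twice leaves the store unchanged and yields the same list of keys, which is the deduplication effect described above.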

This crate is part of xet-core.

License

Apache-2.0