xet-core/xet_data/README.md
Hoyt Koepke 0d9f78aaf4 Add README.md files and Cargo.toml updates needed for publishing hf-xet (#773)
This PR adds crates.io-facing metadata (homepage, readme, keywords,
categories) for the publishable crates, along with crate README files
and concise crate-level docs so crates.io and docs.rs pages have better
context.
2026-04-03 12:34:47 -07:00

xet-data

Data processing pipeline for chunking, deduplication, and file reconstruction. This crate is intended to be used through the API of the hf-xet package.

Overview

  • Content-defined chunking — Gear-hash based chunking for deduplication
  • Deduplication — Probe and register chunks against metadata shards
  • File reconstruction — Reassemble files from deduplicated chunk references
  • Progress tracking — Hooks for upload/download progress reporting
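
The first three items can be illustrated with a minimal, self-contained sketch. It is not the xet-data API: every name here (`gear_table`, `chunk`, `chunk_key`, `ChunkStore`) is hypothetical, the size limits and boundary mask are arbitrary illustrative values, the standard library's `DefaultHasher` stands in for a real content hash, and an in-memory `HashMap` stands in for metadata shards.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Illustrative limits only; not the values used by xet-data.
const MIN_CHUNK: usize = 64;
const MAX_CHUNK: usize = 1024;
// Test the top 8 bits of the rolling hash, since they depend on the
// longest window of preceding bytes.
const MASK: u64 = 0xFF << 56;

/// Builds a deterministic table of 256 pseudo-random values (splitmix64).
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0;
    for entry in table.iter_mut() {
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        *entry = z ^ (z >> 31);
    }
    table
}

/// Splits `data` at content-defined boundaries using a gear rolling hash.
fn chunk<'a>(data: &'a [u8], gear: &[u64; 256]) -> Vec<&'a [u8]> {
    let mut chunks = Vec::new();
    let mut start = 0;
    let mut hash: u64 = 0;
    for (i, &byte) in data.iter().enumerate() {
        // Gear update: shift out old influence, mix in the new byte.
        hash = (hash << 1).wrapping_add(gear[byte as usize]);
        let len = i + 1 - start;
        if (len >= MIN_CHUNK && hash & MASK == 0) || len >= MAX_CHUNK {
            chunks.push(&data[start..=i]);
            start = i + 1;
            hash = 0;
        }
    }
    if start < data.len() {
        chunks.push(&data[start..]); // trailing partial chunk
    }
    chunks
}

/// Stand-in content key; a real system would use a cryptographic hash.
fn chunk_key(chunk: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    chunk.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical in-memory chunk store keyed by content hash.
#[derive(Default)]
struct ChunkStore {
    chunks: HashMap<u64, Vec<u8>>,
}

impl ChunkStore {
    /// Chunks a file, stores only unseen chunks, and returns the ordered
    /// list of chunk references that describes the file.
    fn add_file(&mut self, data: &[u8], gear: &[u64; 256]) -> Vec<u64> {
        chunk(data, gear)
            .into_iter()
            .map(|c| {
                let key = chunk_key(c);
                self.chunks.entry(key).or_insert_with(|| c.to_vec());
                key
            })
            .collect()
    }

    /// Reassembles a file from its chunk references, or returns `None`
    /// if any referenced chunk is missing.
    fn reconstruct(&self, refs: &[u64]) -> Option<Vec<u8>> {
        let mut out = Vec::new();
        for key in refs {
            out.extend_from_slice(self.chunks.get(key)?);
        }
        Some(out)
    }
}
```

In this sketch, calling `add_file` twice with the same bytes returns the same reference list and stores no new chunks, and `reconstruct` on that list yields the original bytes. Because boundaries are chosen from the content rather than from fixed offsets, an edit near the start of a file tends to change only nearby chunks, which is what makes chunk-level deduplication effective.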

This crate is part of xet-core.

License

Apache-2.0