mirror of
https://github.com/huggingface/xet-core.git
synced 2026-06-04 13:30:29 +08:00
## Summary
- Run codespell across tracked files in the repo and fix unambiguous
spelling typos
- All edits are in comments, docstrings, an issue template, and one log
message; no logic changes
- 22 typos fixed across 19 files (e.g. retreived→retrieved,
elegible→eligible, occurances→occurrences, gauranteed→guaranteed,
endianess→endianness, archetectures→architectures)
## Cases left for follow-up (not in this PR)
A few hits were ambiguous and need human judgment:
- `xet_core_structures/src/metadata_shard/shard_file_manager.rs:1400`:
comment "but delet" appears truncated
- `xet_core_structures/src/metadata_shard/shard_format.rs:1577`:
"invalid somes" likely meant "invalid ones"
- `xet_data/src/deduplication/chunking.rs:564`: comment trails off
("on other po")
False positives left untouched: `serde::ser::*` module paths,
"process-global statics" (Rust `static` items), "implementor(s)"
(valid alternate of "implementer"), "re-used", "unparseable".
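
To keep these false positives from resurfacing in later runs, they could be recorded in a codespell config file. The fragment below is only a sketch and is not part of this PR: it assumes a `.codespellrc` at the repo root and assumes codespell reported the words under these exact lowercase spellings.

```ini
# Hypothetical .codespellrc (not added by this PR).
# Words listed here are skipped by codespell; the entries mirror the
# false positives noted above and may need adjusting to match the
# exact tokens codespell reports.
[codespell]
ignore-words-list = ser,statics,implementor,implementors,re-used,unparseable
```

The same list can be passed ad hoc with `codespell -L ser,statics,...` if a checked-in config file is not wanted.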
## Test plan
- [x] `cargo check --workspace --lib --all-features` passes
- [ ] CI green on the draft PR
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Low Risk**
> Low risk: changes are limited to spelling fixes in comments/docs, an
issue template string, and a single log message, with no functional code
modifications.
>
> **Overview**
> Fixes a set of unambiguous spelling typos across the repo (primarily
Rust comments/docstrings plus `.github/ISSUE_TEMPLATE/bug-report.yml`
and `api_changes/README.md`).
>
> Also corrects one user-facing log line in `hf_xet` ("cofigured" ->
"configured"); otherwise behavior is unchanged.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
e615df87a8. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->