mirror of
https://github.com/huggingface/xet-core.git
synced 2026-06-04 13:30:29 +08:00
There is no publicly documented Xet CAS endpoint. To interact with Xet
CAS, every public client must obtain the CAS endpoint from the same
route that issues the CAS token.
Currently, users need to:
1. first construct a CAS token URL with respect to a certain operation
("read" or "write", targeted repo type, targeted repo, targeted
revision),
2. send a request to this URL to get a CAS token and CAS endpoint,
3. use the CAS endpoint to build a `XetSession`,
4. use the `XetSession` instance and the CAS token and CAS token URL to
build an upload or download group.
This is a rather complicated setup. This PR addresses this blocker by
eagerly "refresh"-ing the CAS token if no CAS endpoint is provided, so
users can instead:
1. build a `XetSession`,
2. construct a CAS token URL with respect to a certain operation ("read"
or "write", targeted repo type, targeted repo, targeted revision),
3. use the `XetSession` instance and the CAS token URL to build an
upload or download group.
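The decision behind the eager refresh can be sketched as follows. This is a minimal illustration, not the actual xet-core implementation: `TokenInfo`, `CasConnection`, `GroupConfig`, and `resolve` are hypothetical names, and the `fetch` closure stands in for the HTTP request to the CAS token URL.

```rust
use std::time::SystemTime;

/// Hypothetical token plus expiry, mirroring what the token route returns.
#[derive(Debug, Clone, PartialEq)]
pub struct TokenInfo {
    pub token: String,
    pub expiry: SystemTime,
}

/// Hypothetical resolved connection details for one upload or download group.
#[derive(Debug, Clone, PartialEq)]
pub struct CasConnection {
    pub endpoint: String,
    pub token_info: Option<TokenInfo>,
}

/// Hypothetical builder inputs; all fields are optional until `resolve` runs.
#[derive(Debug, Default)]
pub struct GroupConfig {
    pub endpoint: Option<String>,
    pub token_info: Option<TokenInfo>,
    pub token_refresh_url: Option<String>,
}

/// Resolve the connection for a group.
/// `fetch` is an assumption: it models the request to the CAS token URL.
pub fn resolve<F>(config: GroupConfig, fetch: F) -> Result<CasConnection, String>
where
    F: FnOnce(&str) -> Result<CasConnection, String>,
{
    match config.endpoint {
        // Endpoint known ahead of time: no request, token info is used as-is.
        Some(endpoint) => Ok(CasConnection {
            endpoint,
            token_info: config.token_info,
        }),
        // Endpoint unknown: eagerly call the token URL and take both the
        // endpoint and the token from its response.
        None => {
            let url = config
                .token_refresh_url
                .ok_or_else(|| String::from("no CAS endpoint and no token refresh URL"))?;
            fetch(&url)
        }
    }
}
```

In this sketch, a config with neither an endpoint nor a token refresh URL is rejected at build time, which is one reasonable way to surface the misconfiguration early.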
So effectively, there will be two common patterns:
Pattern A: endpoint known ahead of time — no eager refresh, token_info
is used as-is
```
let session = XetSessionBuilder::new().build()?;
let commit = session
    .new_upload_commit()?
    .with_endpoint(cas_url)
    .with_token_info(token, expiry)
    .with_token_refresh_url(refresh_url, /*Auth headers*/)
    .build_blocking()?;
```
Pattern B: endpoint unknown — build call fetches it; token_info seeded
from response
```
let session = XetSessionBuilder::new().build()?;
let commit = session
    .new_upload_commit()?
    .with_token_refresh_url(token_refresh_url, /*Auth headers*/)
    .build_blocking()?;
```
Other changes:
1. `with_endpoint()` and `with_custom_headers()` configuration is moved
from the `XetSession` level down to the operation level, because
multiple operations with different CAS endpoints can coexist in the
same session instance.
2. The builders for the different operations (`XetUploadCommit`,
`XetFileDownloadGroup`, `XetDownloadStreamGroup`) are refactored to share
common code under `struct AuthGroupBuilder<G>`.
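As a rough illustration of how one generic builder can serve several operation types while keeping the endpoint per operation, consider the sketch below. Only the name `AuthGroupBuilder<G>` and the method names `with_endpoint` and `with_custom_headers` come from this PR. The `AuthSettings` struct, the `AuthGroup` trait, the `Sketch*` types, and all signatures are assumptions made for the example and may not match the real code.

```rust
use std::marker::PhantomData;

/// Hypothetical settings shared by every operation builder.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct AuthSettings {
    pub endpoint: Option<String>,
    pub custom_headers: Vec<(String, String)>,
}

/// Hypothetical trait: any operation that can be built from the shared settings.
pub trait AuthGroup: Sized {
    fn from_settings(settings: AuthSettings) -> Self;
}

/// Generic builder; `G` selects which operation `build` produces.
pub struct AuthGroupBuilder<G> {
    settings: AuthSettings,
    _group: PhantomData<G>,
}

impl<G: AuthGroup> AuthGroupBuilder<G> {
    pub fn new() -> Self {
        Self { settings: AuthSettings::default(), _group: PhantomData }
    }

    /// Per-operation endpoint (signature is an assumption).
    pub fn with_endpoint(mut self, endpoint: impl Into<String>) -> Self {
        self.settings.endpoint = Some(endpoint.into());
        self
    }

    /// Per-operation headers (signature is an assumption).
    pub fn with_custom_headers(mut self, headers: Vec<(String, String)>) -> Self {
        self.settings.custom_headers = headers;
        self
    }

    pub fn build(self) -> G {
        G::from_settings(self.settings)
    }
}

/// Stand-ins for the real upload and download operation types.
pub struct SketchUploadCommit {
    pub settings: AuthSettings,
}

pub struct SketchDownloadGroup {
    pub settings: AuthSettings,
}

impl AuthGroup for SketchUploadCommit {
    fn from_settings(settings: AuthSettings) -> Self {
        Self { settings }
    }
}

impl AuthGroup for SketchDownloadGroup {
    fn from_settings(settings: AuthSettings) -> Self {
        Self { settings }
    }
}
```

Because the settings live in the builder rather than in a session object, two operations created side by side can each carry their own endpoint and headers.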
API changes
This folder contains a record of API changes in main. It's intended for AI agents to read so that they can correctly apply merges or update dependencies and PRs.
The updates are listed by date in the form: `update_<yymmdd>_<description>.md`
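A small helper along these lines could generate names in that form. It is only a sketch: the function `update_file_name` is hypothetical, and the way the description is normalized (lowercased, with runs of non-alphanumeric characters collapsed to a single underscore) is an assumption, since the naming rule above does not specify it.

```rust
/// Build a file name in the `update_<yymmdd>_<description>.md` form.
/// Assumption: the description is lowercased and every run of
/// non-alphanumeric characters becomes a single underscore.
pub fn update_file_name(year: u32, month: u32, day: u32, description: &str) -> String {
    let mut slug = String::new();
    for ch in description.chars() {
        if ch.is_ascii_alphanumeric() {
            slug.push(ch.to_ascii_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('_') {
            slug.push('_');
        }
    }
    // Drop a trailing separator left by punctuation at the end of the description.
    let slug = slug.trim_end_matches('_');
    // Two-digit year, month, and day, zero-padded.
    format!("update_{:02}{:02}{:02}_{}.md", year % 100, month, day, slug)
}
```

For example, `update_file_name(2025, 1, 31, "Move endpoint to operation level")` yields `update_250131_move_endpoint_to_operation_level.md`.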
When applying a merge, rebase, or downstream update, all AI agents should first scan this folder to understand what relevant information may need to be applied.
When creating a PR that involves an API change potentially requiring downstream updates, an AI agent should create such a file. This file should be human-readable but contain enough information to correctly apply the needed changes without scanning the code.