WeKnora/internal at 29820e4cac6c5f226ea20a95de2dd03db12339ae

pub_soft/WeKnora

mirror of https://github.com/Tencent/WeKnora.git synced 2026-06-04 13:30:32 +08:00

young1lin 29820e4cac docs(chat): clarify cached-token semantics for explicit-cache providers

`cached_tokens` is reported by every OpenAI-compatible provider that
supports prompt caching, but how it becomes non-zero differs by mode:

- Implicit caching (OpenAI, Azure OpenAI, DeepSeek, …) populates the
  field automatically whenever a prompt prefix matches a previous
  request within the provider's cache TTL. No client-side opt-in.

- Explicit caching (Qwen on Aliyun, Anthropic Claude, …) only
  populates the field after the caller attaches `cache_control:
  {"type": "ephemeral"}` to the relevant message / content block.
  Until that opt-in is applied upstream of the request, the field
  stays zero even when the prefix is otherwise byte-stable.

Without this distinction documented, the previous commit reads as if
`TokenUsage.CachedTokens` will show non-zero values for Qwen / Claude
once this PR lands — which is not the case. The plumbing here is a
prerequisite (stable prefix via sorted tools) and a meter (visibility
of the field), but the explicit-cache opt-in itself is out of scope
and lives elsewhere.
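
The "stable prefix via sorted tools" prerequisite can be illustrated with a short sketch. `ToolDef` and `sortedTools` are hypothetical names, and sorting by tool name is an assumed ordering key; the actual change may order definitions differently.

```go
package main

import (
	"fmt"
	"sort"
)

// ToolDef is a hypothetical stand-in for a function/tool definition.
type ToolDef struct {
	Name        string
	Description string
}

// sortedTools returns a copy ordered by name, so the serialized tool
// list does not depend on map iteration or registration order.
func sortedTools(tools []ToolDef) []ToolDef {
	out := make([]ToolDef, len(tools))
	copy(out, tools)
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	// Go randomizes map iteration order, which would otherwise change
	// the prompt prefix from one request to the next.
	registry := map[string]ToolDef{
		"web_search": {Name: "web_search", Description: "search the web"},
		"db_query":   {Name: "db_query", Description: "query the database"},
		"kb_lookup":  {Name: "kb_lookup", Description: "look up a knowledge base"},
	}

	tools := make([]ToolDef, 0, len(registry))
	for _, t := range registry {
		tools = append(tools, t)
	}

	for _, t := range sortedTools(tools) {
		fmt.Println(t.Name)
	}
}
```

A byte-stable tool list is what lets a provider match the prompt prefix against an earlier request; it does not by itself enable explicit caching.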

Document this on `TokenUsage.CachedTokens` and the `cachedTokens`
helper so callers do not mistake observability for activation.
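
A meter of this kind might look like the sketch below. It assumes the provider reports the count at `usage.prompt_tokens_details.cached_tokens`, as OpenAI's chat completions API does; the struct layouts and the `parseUsage` function are hypothetical and are not taken from the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// promptTokensDetails and usage model the assumed wire format.
type promptTokensDetails struct {
	CachedTokens int `json:"cached_tokens"`
}

type usage struct {
	PromptTokens        int                  `json:"prompt_tokens"`
	CompletionTokens    int                  `json:"completion_tokens"`
	PromptTokensDetails *promptTokensDetails `json:"prompt_tokens_details"`
}

// TokenUsage is a hypothetical shape for the type named in the commit.
type TokenUsage struct {
	PromptTokens     int
	CompletionTokens int
	// CachedTokens only observes what the provider reports. A zero value
	// may mean "no cache hit" or "explicit caching was never opted into".
	CachedTokens int
}

// cachedTokens returns the reported count, or 0 when the details
// object is absent from the response.
func cachedTokens(u usage) int {
	if u.PromptTokensDetails == nil {
		return 0
	}
	return u.PromptTokensDetails.CachedTokens
}

// parseUsage decodes a raw usage object into TokenUsage.
func parseUsage(raw []byte) (TokenUsage, error) {
	var u usage
	if err := json.Unmarshal(raw, &u); err != nil {
		return TokenUsage{}, err
	}
	return TokenUsage{
		PromptTokens:     u.PromptTokens,
		CompletionTokens: u.CompletionTokens,
		CachedTokens:     cachedTokens(u),
	}, nil
}

func main() {
	raw := []byte(`{"prompt_tokens":2048,"completion_tokens":10,"prompt_tokens_details":{"cached_tokens":1024}}`)
	tu, err := parseUsage(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", tu)
}
```

The helper only reads the field. Nothing in it attaches `cache_control`, which is the distinction between observability and activation that the commit documents.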

2026-05-25 16:47:14 +08:00

fix(agent/tools): sort function definitions for deterministic ordering

2026-05-25 16:47:14 +08:00

feat(retriever): add OpenSearch type prep ahead of Phase 3 driver

2026-05-22 20:43:50 +08:00

feat(assets): add ASR test audio file and embed it in the application

2026-04-02 21:27:27 +08:00

feat: Implement deadlock retry mechanism for chunk creation

2026-04-22 21:17:21 +08:00

docs: add Chinese RBAC guide and link with shared space docs

2026-05-21 12:28:31 +08:00

feat(obs): support Huawei Cloud OBS storage

2026-05-18 19:38:23 +08:00

feat(system-info): surface DB migration errors with troubleshooting links

2026-05-14 16:34:50 +08:00

fix(feishu): tolerate partial wiki node listing failures

2026-05-12 17:40:59 +08:00

feat(tenant): implement tenant creation limit and error handling

2026-05-18 17:28:58 +08:00

feat(agent): human-in-the-loop approval for MCP tool calls (#1173)

2026-05-10 22:57:12 +08:00

feat: expose KB ↔ vector store binding in list, editor, and detail UI

2026-05-22 17:40:10 +08:00

fix(agent): exclude wiki-only KBs from quick-answer (RAG) mode

2026-05-12 16:27:28 +08:00

fix(docparser): address review feedback on PR #1404

2026-05-21 11:42:56 +08:00

refactor(logger): support LOG_FORMAT template and harden level coloring

2026-05-22 20:31:54 +08:00

feat(mcp): implement reconnection logic for MCP tool calls and tool listing

2026-03-31 11:57:15 +08:00

chore(rbac): update default behavior for tenant RBAC configuration

2026-05-18 21:24:28 +08:00

docs(chat): clarify cached-token semantics for explicit-cache providers

2026-05-25 16:47:14 +08:00

feat(knowledge-base): implement per-user pinning for knowledge bases

2026-05-18 17:28:58 +08:00

chore(runtime): silence gin per-route logs and emit env config banner at startup

2026-05-17 15:27:52 +08:00

feat: optimize security and deployment of agent skills

2026-02-04 20:08:49 +08:00

fix(summary): preserve image caption/OCR text in document summaries

2026-05-22 17:25:39 +08:00

feat(redis): add REDIS_USERNAME support for Redis ACL

2026-02-04 19:38:40 +08:00

feat(observability): extend Langfuse tracing across asynq pipeline

2026-04-24 13:16:47 +08:00

docs(chat): clarify cached-token semantics for explicit-cache providers

2026-05-25 16:47:14 +08:00

fix(initialization): surface upstream and SSRF errors verbatim in test-connection responses

2026-05-17 15:27:52 +08:00