Phase 3 (#1440) gate flip. PR 1 (#1445) + PR 2a (#1481) + PR 2b (#1482)
laid the type prep + driver skeleton + read/write paths as gated dead
code; this PR wires every activation surface so opensearch becomes a
registerable VectorStore engine.
Activation wiring
- internal/types: validEngineTypes / GetVectorStoreTypes (with HNSW
bounds + knn_engine enum + Immutable hints) / retrieverEngineMapping /
buildEnvStoreForDriver — every gated surface now recognises
"opensearch". IndexConfig grows four omitempty HNSW fields (HNSWM /
HNSWEFConstruction / HNSWEFSearch / KNNEngine), keeping other engines'
serialised config byte-identical.
- internal/container: createOpenSearchEngine + the switch case in
createEngineServiceFromStore; the RETRIEVE_DRIVER=opensearch env path
in initRetrieveEngineRegistry; NewEngineFactory now closes over the
AuditLogService (the EngineFactory type itself is unchanged).
- internal/application/service/vectorstore_healthcheck.go: a
testOpenSearchConnection case so CreateStore's connectivity probe
accepts opensearch instead of returning 400.
- internal/application/repository/retriever/opensearch/transport.go:
NewOpenSearchClient is exported so the factory and env path can build
the TLS-hardened client; healthcheck.go reuses the unexported
probeVersion / probeKNNPlugin for the service-layer probe.
Service-layer validation
- validateOpenSearchIndexConfig validates the HNSW caps (m 2-100,
ef_construction 2-4096, ef_search 1-10000, knn_engine ∈ lucene|faiss).
Shards/replicas continue to be enforced by the flat ValidateIndexConfig.
Create-only: UpdateStore mutates the name only.
- validateConnectionConfig requires addr for opensearch.
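The HNSW caps listed above can be sketched as a small range check. This is a minimal illustration, not the actual validateOpenSearchIndexConfig: the HNSWConfig type, the helper names, and the choice to treat a zero value as "not set" are assumptions made for the example.

```go
package main

import "fmt"

// HNSWConfig is a hypothetical stand-in for the four HNSW fields on IndexConfig.
// A zero value is treated as "not set" here, mirroring the omitempty behaviour.
type HNSWConfig struct {
	M              int
	EFConstruction int
	EFSearch       int
	KNNEngine      string
}

// checkRange accepts an unset (zero) value or one inside [lo, hi].
func checkRange(name string, v, lo, hi int) error {
	if v == 0 {
		return nil
	}
	if v < lo || v > hi {
		return fmt.Errorf("%s must be between %d and %d, got %d", name, lo, hi, v)
	}
	return nil
}

// ValidateHNSW applies the caps quoted in the commit message.
func ValidateHNSW(c HNSWConfig) error {
	checks := []struct {
		name      string
		v, lo, hi int
	}{
		{"hnsw_m", c.M, 2, 100},
		{"hnsw_ef_construction", c.EFConstruction, 2, 4096},
		{"hnsw_ef_search", c.EFSearch, 1, 10000},
	}
	for _, ck := range checks {
		if err := checkRange(ck.name, ck.v, ck.lo, ck.hi); err != nil {
			return err
		}
	}
	switch c.KNNEngine {
	case "", "lucene", "faiss":
		return nil
	default:
		return fmt.Errorf("knn_engine must be lucene or faiss, got %q", c.KNNEngine)
	}
}
```

Calling ValidateHNSW(HNSWConfig{M: 101}) returns an error, while an all-zero config passes because every field is treated as unset.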
Sync implementations (stubs.go shrinks)
- CopyIndices (copy.go) mirrors the Elasticsearch / Qdrant pattern —
search → BatchSave with the source_id remap for generated questions —
so dim/keyword routing and the source_id contract come from BatchSave
for free. embeddingMap is keyed by the *target* SourceID because
OpenSearch's BatchSave looks up embeddings by SourceID
(lookupEmbedding), not by chunk_id (the ES driver's convention).
Pagination is from/size; copies larger than max_result_window
(default 10000) need the scroll-based async path that lands later.
- BatchUpdateChunkEnabledStatus / BatchUpdateChunkTagID (bulk_update.go)
group the input by target value and issue one _update_by_query per
group over the cross-dim <base>_* pattern. Caller values flow through
bound script params only — never string-interpolated into the Painless
source — closing the script-injection surface.
- inspectByQueryResponse (byquery.go) mirrors inspectBulkResponse: the
full failure reason goes to the debug log only; the returned error
carries the bounded id + type.
- UpdateByQueryParams.Refresh is *bool in opensearch-go v4.6.0 (the same
shape as DeleteByQuery's quirk), so refresh=wait_for is not
expressible; we use refresh=true.
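The group-then-update approach with bound script params might look roughly like the sketch below. The field names chunk_id and is_enabled, the helper names, and the exact query shape are assumptions for illustration; only the idea (one request per target value, caller values passed via params rather than spliced into the Painless source) comes from the description above.

```go
package main

import "sort"

// groupByEnabled splits chunk ids by their target value so that each group
// can be sent as a single _update_by_query request.
func groupByEnabled(status map[string]bool) map[bool][]string {
	groups := make(map[bool][]string)
	for id, enabled := range status {
		groups[enabled] = append(groups[enabled], id)
	}
	for _, ids := range groups {
		sort.Strings(ids) // deterministic request bodies keep logs and tests stable
	}
	return groups
}

// updateByQueryBody builds one request body. The script source is a constant;
// the caller-supplied value travels only through the bound "params" map.
func updateByQueryBody(ids []string, enabled bool) map[string]any {
	return map[string]any{
		"query": map[string]any{
			"terms": map[string]any{"chunk_id": ids}, // hypothetical field name
		},
		"script": map[string]any{
			"lang":   "painless",
			"source": "ctx._source.is_enabled = params.enabled", // hypothetical field name
			"params": map[string]any{"enabled": enabled},
		},
	}
}
```

Because the source string never changes, a malicious or malformed value can at worst end up as data in params, never as script code.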
Driver-owned audit (DIP)
- A new opensearch.AuditSink interface (with nopSink + WithAuditSink
functional option) lets the driver emit opensearch.index_created and
opensearch.reindex_executed events without importing any service
package — the service layer implements the interface. NewRepository
takes opts, so existing 4-arg test call sites keep compiling unchanged.
- internal/container/audit_sink.go bridges AuditSink to AuditLogService.
When the context carries no tenant (the env-path registration ctx
during boot, for example) the adapter skips the emit with a warning
rather than silently writing tenant_id=0, which would collide with the
system-scope sentinel.
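A minimal sketch of the driver-owned sink pattern described above, assuming a single Emit method and a simplified NewRepository that takes only options (the real constructor keeps its existing positional arguments; the method signature and the recordingSink helper are hypothetical):

```go
package main

import "context"

// AuditSink is the driver-side port; the service layer supplies the adapter.
// The Emit signature is an assumption made for this example.
type AuditSink interface {
	Emit(ctx context.Context, event string, fields map[string]string)
}

// nopSink is the default, so the driver never needs a nil check.
type nopSink struct{}

func (nopSink) Emit(context.Context, string, map[string]string) {}

type Repository struct {
	audit AuditSink
}

// Option is a functional option applied by NewRepository.
type Option func(*Repository)

// WithAuditSink swaps in a real sink; a nil sink keeps the no-op default.
func WithAuditSink(s AuditSink) Option {
	return func(r *Repository) {
		if s != nil {
			r.audit = s
		}
	}
}

// NewRepository is simplified: variadic options let old call sites compile unchanged.
func NewRepository(opts ...Option) *Repository {
	r := &Repository{audit: nopSink{}}
	for _, opt := range opts {
		opt(r)
	}
	return r
}

// CreateIndex stands in for the real index creation and emits the audit event.
func (r *Repository) CreateIndex(ctx context.Context, name string) {
	r.audit.Emit(ctx, "opensearch.index_created", map[string]string{"index": name})
}

// recordingSink is a test double that remembers event names.
type recordingSink struct {
	events []string
}

func (s *recordingSink) Emit(_ context.Context, event string, _ map[string]string) {
	s.events = append(s.events, event)
}
```

The dependency points from the service layer toward the driver's interface, so the opensearch package never imports a service package.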
Frontend + polish
- FieldSchema (frontend/src/api/vector-store.ts) gains min/max/enum/
immutable. VectorStoreSettings.vue is now schema-driven: a closed
`enum` renders a t-select; number inputs use the schema's `:min`/`:max`
and fall back to the legacy replica-vs-shard heuristic only when the
schema does not pin them; a danger-coloured warning fires when
insecure_skip_verify is toggled on (the switch and warning are wrapped
in a vertical stack so the warning sits on its own row below the switch).
- i18n: labels for hnsw_m / hnsw_ef_construction / hnsw_ef_search /
knn_engine / insecure_skip_verify plus the warning copy in en-US,
ko-KR, zh-CN, ru-RU.
- docker-compose.dev.yml: an opensearch profile (single-node 3.3.2 with
security plugin disabled for dev only). OpenSearch Dashboards lives in a
separate, opt-in opensearch-ui profile so the heavy UI container is not
forced up alongside the cluster (the driver e2e is fully curl-verifiable
against :9200). The new docs/dev/opensearch-integration-test.md covers the
end-to-end exercise and the single-node guidance (set replicas=0 to keep
the cluster Green).
Gating-guard tests flipped
- The "OpenSearch is NOT in validEngineTypes / mapping / types list /
env builder / stubs" guard tests from PR 1 / PR 2 are replaced by
their positive counterparts in this PR. The test suite was the
activation checklist; the activation flip is its diff.
Backward compatibility
- Additive everywhere. IndexConfig's new HNSW fields are omitempty so
other engines' serialised config is byte-identical. Existing
Elasticsearch / Qdrant / Milvus / Weaviate / Doris / TencentVectorDB
stores are untouched. No migrations.
Test plan
- go build ./... clean
- go vet ./... clean
- gofmt -l clean on touched files
- go test ./... — only TestOssEnsureBucket_CreateFails (Aliyun OSS
endpoint), the docreader gRPC tests, and the doris SQL-shape tests
fail; all three are pre-existing on upstream/main and untouched by
this PR.
- New tests across internal/types, opensearch, service and container —
including a full end-to-end env-path test that exercises
initRetrieveEngineRegistry with RETRIEVE_DRIVER=opensearch against an
httptest cluster.
This commit refines the language used in the knowledge parsing documentation and user interface. Key changes include:
- Updated the description of the `finalizing` state to clarify that it refers to ongoing optimization tasks rather than just completion.
- Modified the confirmation message for canceling parsing to replace "enhancement" with "optimization" for consistency across multiple languages.
- Enhanced the UI to better reflect the current parsing status, including a new function to display appropriate status messages during in-flight parsing.
These changes aim to improve user understanding and experience when interacting with the knowledge parsing features.
Lets users stop an in-flight document parse to free up LLM / worker
resources without losing the chunks and index already written. The
core insight is that the previous parse_status=completed flipped as
soon as primary chunks landed, while the most expensive subtasks
(graph extract = N LLM calls per chunk, plus summary, question
generation) were still running in the background — so "completed"
wasn't actually terminal from a resource standpoint.
State machine
pending -> processing -> finalizing -> completed
  |
  +-> cancelled (any of the three in-flight states)
  +-> failed
  +-> deleting
`finalizing` is the new post-process fan-out window. parse_status
only promotes to `completed` once pending_subtasks_count (a new
column tracking summary + question + per-chunk graph extract)
drains to zero via atomic FinalizeSubtask. Wiki ingest is
intentionally excluded from the counter — it's a KB-scoped
debounced batch and would otherwise pin parse_status in
`finalizing` for the wiki batch window.
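The drain-to-zero promotion can be illustrated with an in-memory stand-in. The real FinalizeSubtask is described as an atomic repository operation; the mutex-guarded struct below is only an assumption-laden sketch of the same rule, with hypothetical type and method names.

```go
package main

import "sync"

// Knowledge is a hypothetical in-memory stand-in for the knowledge row.
type Knowledge struct {
	mu              sync.Mutex
	ParseStatus     string
	PendingSubtasks int
}

// FinalizeSubtask decrements the counter and promotes the row to completed
// when it reaches zero. It reports whether this call did the promotion.
func (k *Knowledge) FinalizeSubtask() bool {
	k.mu.Lock()
	defer k.mu.Unlock()
	if k.ParseStatus != "finalizing" || k.PendingSubtasks <= 0 {
		return false // cancelled, deleted, or already drained: leave the row alone
	}
	k.PendingSubtasks--
	if k.PendingSubtasks == 0 {
		k.ParseStatus = "completed"
		return true
	}
	return false
}

// Cancel moves any in-flight state to cancelled and zeroes the counter.
func (k *Knowledge) Cancel() bool {
	k.mu.Lock()
	defer k.mu.Unlock()
	switch k.ParseStatus {
	case "pending", "processing", "finalizing":
		k.ParseStatus = "cancelled"
		k.PendingSubtasks = 0
		return true
	}
	return false
}
```

Guarding the decrement on the current status is what keeps a late subtask from flipping a cancelled row back to completed.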
Backend
- New ParseStatusFinalizing + pending_subtasks_count column with
migration 000056.
- knowledgeRepository.SetFinalizing transitions processing -> finalizing
conditionally so a racing cancel cannot be clobbered.
- knowledgeRepository.FinalizeSubtask atomically decrements the
counter and self-promotes the row to completed when it hits zero.
- KnowledgePostProcess restructured to compute expected subtask
count up front, flip to finalizing (or completed when no
enrichment is enabled), and only then fan out subtasks. Subtask
handlers (summary, question, graph extract) defer-decrement on
terminal exit using the existing isFinalAsynqAttempt convention.
- New POST /api/v1/knowledge/{id}/cancel-parse handler accepting
pending / processing / finalizing. Marks the row cancelled,
zeroes the counter, best-effort dequeues asynq tasks via a new
TaskInspector abstraction (asynq mode walks the pending/scheduled/
retry queues; Lite mode is a no-op), and scrubs the pending wiki ingest op.
- SpanTracker.AbortAttempt flat-sweeps every still-running span
for the attempt via a new repo.CancelAllOpenSpans helper so the
trace viewer's striped bars all flip to cancelled, even leaf
generations whose parent stage already EndSpan'd (multimodal
fan-out pattern). knowledge_post_process closes its postSpan
via SkipSpan on the cancel/deleting entry guard so a worker
that opens a span AFTER the cancel sweep doesn't leak it.
- Housekeeping and resetPendingTasks sweep finalizing rows
identically to processing so a crash/restart can't strand them.
- DeleteKnowledge/DeleteKnowledgeList proactively dequeue
downstream tasks via the same TaskInspector path.
- ChunkExtractService gets a cancel entry guard so the most
expensive enrichment (graph extract) bails immediately when the
parent knowledge is aborted.
Frontend
- New cancelKnowledgeParse API client + "Stop parsing" entry in
both list view and card view more menus, gated on
pending/processing/finalizing.
- Polling predicate refactored to a shared isParseInFlight helper
that recognises `finalizing` (previously the doc list silently
stopped polling once parse_status flipped from processing).
- Knowledge processing timeline: isPolling includes finalizing,
new isHardTerminal short-circuits LIVE for cancelled/failed/
completed so stranded child spans cannot pin LIVE on.
- DocumentListView.computeStatus distinguishes finalizing
("增强中", "Enhancing") from completed and shows the previous "生成摘要中"
("Generating summary") copy when summary_status is still pending under
finalizing. Also adds a cancelled badge.
- i18n: statusFinalizing / statusCancelled / cancelParse* keys
across zh-CN, en-US, ko-KR, ru-RU.
Docs / SDK
- docs/api/knowledge.md: documents the new finalizing state,
cancel-parse semantics, and which statuses accept cancel.
- client (Go SDK): CancelKnowledgeParse with docstring listing
the cancellable statuses.
Extend the builtin_models.yaml loader so the YAML file becomes a complete
source of truth for the rows it owns. Builds on the previous commit's
managed_by column.
Lifecycle contract:
- Every UPSERTed entry is tagged managed_by="yaml".
- The DoUpdates list now includes deleted_at, so an entry that was
soft-deleted (e.g. via UI/API) is automatically resurrected when it
reappears in the file. Closes the "ghost row that exists but is
invisible" failure mode.
- After all UPSERTs, the loader soft-deletes rows where
managed_by='yaml' AND id NOT IN (current YAML id set). Removing an
entry from YAML is now the supported way to retire a built-in model —
no manual SQL needed.
- Rows tagged managed_by='' (UI/API/SQL-seeded built-ins) are invisible
to the reconcile path and never touched.
- When a YAML entry sets is_default=true, the loader first clears
is_default on any other rows in the same (tenant_id, type) bucket,
mirroring the invariant enforced by the API path
(repository.UnsetDefaultModel).
Failure handling stays defensive:
- File missing / not a regular file / parse error: warn and skip; the
drift sweep is NOT executed so a malformed file cannot wipe rows.
- Per-entry UPSERT error: warn, drop the id from the keep-set so the
sweep also leaves the existing row alone ("leave alone on failure").
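The drift-sweep rule from the lifecycle contract can be sketched as a pure function over rows. The Row type and function name are hypothetical, and the sketch covers only the selection rule (managed_by="yaml" and id absent from the current YAML id set); it does not model the failure-handling path.

```go
package main

import "sort"

// Row is a hypothetical projection of a models-table row.
type Row struct {
	ID        string
	ManagedBy string // "yaml" for loader-owned rows, "" for UI/API/SQL-seeded rows
	Deleted   bool   // true when deleted_at is set
}

// sweepTargets returns the ids the loader would soft-delete: live rows owned
// by the YAML loader whose id no longer appears in the file.
func sweepTargets(rows []Row, yamlIDs map[string]bool) []string {
	var out []string
	for _, r := range rows {
		if r.ManagedBy != "yaml" || r.Deleted {
			continue // manual rows are invisible to the reconcile path
		}
		if !yamlIDs[r.ID] {
			out = append(out, r.ID)
		}
	}
	sort.Strings(out)
	return out
}
```

With an empty but successfully parsed id set, every live yaml-managed row is returned, which matches the "explicit empty list" test case mentioned above.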
Tests cover: file-missing, parse-error, basic upsert + defaults,
idempotency, ${ENV} interpolation (set vs unset), drift sweep removing
YAML rows, drift sweep ignoring manual rows, soft-delete resurrection,
is_default cleanup across tenant+type, explicit empty list sweeping all
yaml-managed rows, and a regression guard ensuring BeforeCreate does not
overwrite YAML-supplied stable ids.
Docs are rewritten so operators see "delete from YAML and restart" as
the supported removal path; SQL is retained only for the legacy
managed_by='' slice.
Allow built-in models to be declared in config/builtin_models.yaml
instead of inserting rows via SQL. On every startup the file is read
and each entry is UPSERT-ed into the models table (is_builtin=true)
by stable id.
Any string field may reference an environment variable with ${NAME}.
Unset variables are left as the literal placeholder so
misconfiguration surfaces clearly in provider calls rather than
failing silently with an empty token.
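The ${NAME} substitution rule might be implemented along these lines. This is a sketch under assumptions: the regular expression, the function names, and the injectable lookup are illustrative choices, not the loader's actual code.

```go
package main

import (
	"os"
	"regexp"
)

// envRef matches ${NAME} where NAME is a conventional environment variable name.
var envRef = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)\}`)

// interpolate replaces each ${NAME} with the value returned by lookup.
// When lookup reports the variable as unset, the placeholder is kept verbatim
// so the misconfiguration shows up in the provider call.
func interpolate(s string, lookup func(string) (string, bool)) string {
	return envRef.ReplaceAllStringFunc(s, func(ref string) string {
		name := ref[2 : len(ref)-1] // strip "${" and "}"
		if v, ok := lookup(name); ok {
			return v
		}
		return ref
	})
}

// interpolateEnv binds interpolate to the process environment.
func interpolateEnv(s string) string {
	return interpolate(s, os.LookupEnv)
}
```

Passing the lookup as a parameter keeps the set-versus-unset cases testable without touching the real environment.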
The file is optional: missing file, parse errors, and per-entry
upsert failures all log a warning without aborting startup.
docker-compose.yml adds env_file (.env, required:false) so
deployment-specific variables are passed through automatically.
Reuse enqueueKnowledgeListDelete inside DeleteKnowledge so that single-item
delete shares the same hardening as BatchDeleteKnowledge / ClearKnowledgeBase:
asynq retries, business-aware queue routing, and marking-as-deleting inside
the worker.
The endpoint now returns 200 once the delete task has been enqueued; the
response body carries the asynq task_id and the message is updated to
"Delete task submitted". Swagger annotations, generated docs and the Go
client SDK comment are updated to reflect the new asynchronous semantics.
Note: this is a behavior change. Callers that previously assumed the
knowledge was already gone on a 200 response should poll the task status
or accept eventual consistency, matching the existing BatchDeleteKnowledge
contract.
Tenant RBAC headline release: 4-tier role matrix (Owner/Admin/
Contributor/Viewer), per-KB resource ownership, per-tenant audit
log, tenant member management, self-service workspaces.
Also: CLI v0.3/v0.4 GA, KB retrieval fan-out across vector stores,
AES-256-GCM credential encryption at rest, docreader gRPC TLS+Token, Zhipu
embedding, Huawei OBS, vLLM URL for MinerU, Apache Doris compat
modes, server-side user preferences, Go 1.26.0.
See CHANGELOG.md for the full list.
docs(rbac): wire RBAC screenshots into READMEs and RBAC guide
- README.md / README_CN.md / README_JA.md / README_KO.md: replace the
single member-management thumbnail under the v0.6.0 RBAC highlight
with a 2×2 showcase (member management, workspace switcher,
self-service workspace creation, pending invitations).
- docs/RBAC说明.md: add the member-management screenshot to the
existing 前端实际界面 ("Actual frontend UI") showcase so the guide is
self-contained and no longer cross-references README for it.
feat(rbac-ui): link tenant member page to RBAC guide
Add an inline doc-link in the Tenant Members settings page that
opens docs/RBAC说明.md on GitHub in a new tab, complementing the
existing in-app role-matrix popover. New i18n key
tenantMember.learnRbacGuide covered for zh-CN / en-US / ko-KR /
ru-RU.
Replace the English docs/rbac.md with a comprehensive Chinese
docs/RBAC说明.md and a wiki-style summary under docs/wiki/安全认证/.
Explain how tenant RBAC relates to the shared space feature (they
are orthogonal: tenant RBAC is the vertical defense, shared space
is the horizontal collaboration channel) and cross-link the two
docs in both the flat and wiki trees. Update inbound references in
.env.example, docker-compose.yml, and the auth legacy env test to
point at the new file name.
Change the default value of the tenant RBAC configuration from false to true, so role enforcement is on by default. Operators who want a logging-only rollout window now opt into it by explicitly setting the configuration to false.
Updates include:
- Modifications to .env.example and docker-compose.yml to reflect the new default.
- Adjustments in rbac.md documentation to clarify the new default behavior and the opt-in process.
- Code changes across various files to utilize the new pointer-based configuration for EnableRBAC, ensuring nil safety and clearer intent.
Apart from the default flip, no functional changes were introduced; the remaining adjustments primarily enhance clarity and maintainability of the RBAC feature.
DISABLE_REGISTRATION=true used to block /auth/register at the handler
layer only, leaving /auth/config still reporting self_serve. The
frontend therefore kept showing the Register entry even when the env
var was set — clicking it just hit the 403. Two gates, out of sync.
Wire DISABLE_REGISTRATION=true through applyAuthAndTenantDefaults so
it coerces auth.registration_mode to invite_only (env wins over YAML,
matching docs/rbac.md). The handler-side os.Getenv check is now
redundant with IsInviteOnly() and is removed, leaving a single
enforcement path.
Add config tests pinning down the env/YAML matrix, including the
explicit-self_serve override case that would otherwise be the easy
regression to ship.
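The env/YAML matrix these tests pin down reduces to a small precedence rule. A minimal sketch, assuming self_serve is the fallback when YAML leaves the mode empty (the function name and that fallback are assumptions for illustration):

```go
package main

const (
	modeSelfServe  = "self_serve"
	modeInviteOnly = "invite_only"
)

// resolveRegistrationMode applies "env wins over YAML": when
// DISABLE_REGISTRATION=true, the mode is coerced to invite_only even if the
// YAML explicitly says self_serve.
func resolveRegistrationMode(yamlMode string, disableRegistration bool) string {
	if disableRegistration {
		return modeInviteOnly
	}
	if yamlMode == "" {
		return modeSelfServe // assumed default when YAML is silent
	}
	return yamlMode
}
```

With a single resolved value, both /auth/config and the /auth/register handler can read the same answer instead of consulting two separate gates.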
DISABLE_REGISTRATION=true (handler layer) and
WEKNORA_AUTH_REGISTRATION_MODE=invite_only (config layer) were two
env-level ways of saying the same thing: block /auth/register.
Keeping both invites confusion about which wins and risks operators
setting one while expecting the other.
Remove the WEKNORA_AUTH_REGISTRATION_MODE env override so the env
layer exposes a single knob (DISABLE_REGISTRATION). The
auth.registration_mode config field stays — operators who want the
richer behaviour (frontend hides the registration entry via
/auth/config in addition to the server-side 403) flip it in
config.yaml.
No behaviour change for deployments that did not set the env var.
Update docs/rbac.md to document the deliberate split.
The audit_logs table grows monotonically until something purges it.
PR 6 explicitly listed retention as out of scope for v1; this picks
that up.
Adds an `audit.retention_days` config (default 90, 0 disables) and a
small background goroutine `AuditLogRetentionRunner` that fires once
~10 minutes after boot and then every 24h, calling
`AuditLogService.Purge` which DELETEs rows older than the cutoff in a
single statement. The dependency surface is intentionally minimal —
no robfig/cron, no asynq, no migration — because retention has no
wall-clock alignment requirement.
Wiring is the same shape as the existing data source scheduler:
container `Provide(NewAuditLogRetentionRunner)`, an `Invoke` that
calls `Start`, and a `ResourceCleaner.RegisterWithName` for graceful
shutdown.
Defaults preserve operator intent:
- `audit:` section omitted from YAML -> retention_days=90.
- explicit `audit.retention_days: 0` in YAML -> purge disabled.
- `WEKNORA_AUDIT_RETENTION_DAYS=N` env override (incl. N=0).
- `retention_days < 0` is rejected by ValidateConfig.
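The default rules above can be expressed as two small helpers. This is a sketch, not the shipped code: the function names and the use of a nil pointer to represent "section omitted from YAML" are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

const defaultRetentionDays = 90

// effectiveRetentionDays resolves the configured value. A nil pointer models
// an omitted audit section; an explicit 0 is preserved and disables purging.
func effectiveRetentionDays(configured *int) (int, error) {
	if configured == nil {
		return defaultRetentionDays, nil
	}
	if *configured < 0 {
		return 0, fmt.Errorf("audit.retention_days must be >= 0, got %d", *configured)
	}
	return *configured, nil
}

// retentionWindow converts days into the age beyond which rows are purged.
// The second result is false when purging is disabled.
func retentionWindow(days int) (time.Duration, bool) {
	if days <= 0 {
		return 0, false
	}
	return time.Duration(days) * 24 * time.Hour, true
}
```

A purge would then compute its cutoff as now.Add(-window) and delete rows older than that; with the default of 90 days the window is 2160 hours.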
Tests cover the service Purge contract (no-op when disabled, cutoff
math, error propagation) and the runner lifecycle (Start no-op when
disabled, idempotent Start/Stop, ticker cadence, runOnce swallows
errors). 10 new tests, all green.
docs/rbac.md updated with the new YAML / env / behaviour.
Refs: #1303
- Added comprehensive documentation for the role-based access control (RBAC) system within tenants, detailing the purpose, role matrix, enforcement mechanisms, and configuration options.
- Explained the implications of enabling RBAC, including the default settings and how to audit role assignments before activation.
- Included schema references for the `tenant_members` and `audit_logs` tables to assist developers in understanding the underlying data structure.
Wires KnowledgeBase.VectorStoreID and the ownership-aware retrieve factory
into the user-facing knowledge-base lifecycle:
- POST /knowledge-bases validates the requested vector_store_id against
the caller's tenant scope and the engine registry. New error codes
ErrVectorStoreBindingInvalid (2200) and ErrVectorStoreUnavailable (2201)
distinguish the typed branches without echoing UUIDs to the client.
- GET / POST / PUT / PUT-pin responses embed the bound store's display
metadata (name, source, engine_type, status) without exposing any
connection credentials. Cross-tenant shared KBs receive a suppressed
payload (vector_store_id stripped, source="shared") so operator-chosen
store names cannot be enumerated across tenants.
- POST /knowledge-bases/copy synchronously rejects clones whose target
has a different embedding model or vector store, before the async
clone task is enqueued. The async clone worker re-applies the same
checks for defense in depth.
- DELETE /vector-stores/:id refuses to remove a store with bound KBs,
inside a transaction that row-locks the store on PostgreSQL and
serializes via WAL on SQLite. unregister-from-registry is wrapped in
defer/recover so a panic surfaces as a structured warning instead of
silently leaking a stale engine.
- vector_store_id is immutable after creation. The GORM <-:create tag
blocks every ORM update path; the service-layer DTO omits the field
entirely; a reflection-based regression test catches any future
maintainer who adds it back to either layer.
- Empty-string vector_store_id is normalized to nil at both the create
path and inside SharesStoreWith, so rows persisted by callers that
did not run Normalize first cannot trip false same-store comparisons.
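The empty-string normalization in the last bullet can be sketched as follows. The struct is a hypothetical reduction of KnowledgeBase, and treating two unbound KBs as sharing a store is an assumption of this example rather than something the commit states.

```go
package main

// KnowledgeBase is reduced to the one field relevant here.
type KnowledgeBase struct {
	VectorStoreID *string
}

// normalizeStoreID maps both nil and a pointer to "" onto nil.
func normalizeStoreID(id *string) *string {
	if id == nil || *id == "" {
		return nil
	}
	return id
}

// SharesStoreWith normalizes both sides before comparing, so a row persisted
// with "" compares equal to a row persisted with NULL.
func (kb KnowledgeBase) SharesStoreWith(other KnowledgeBase) bool {
	a := normalizeStoreID(kb.VectorStoreID)
	b := normalizeStoreID(other.VectorStoreID)
	if a == nil || b == nil {
		return a == nil && b == nil // assumption: two unbound KBs use the same default store
	}
	return *a == *b
}
```

Normalizing inside the comparison as well as at create time covers rows written by callers that skipped Normalize.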
Part of #993. Depends on #994 and #1310.
When a startup database migration fails (e.g. issue #1319: pg_trgm not
available so 000041 cannot build its trigram index), the application
intentionally keeps booting so operators can reach the UI to diagnose.
However, before this change the system info page silently dropped the
"DB Version" row because the value was empty:
- migration.go only cached the version after a successful m.Up(); the
error path returned early and left migrationVersionSet=false.
- system.go used CachedMigrationVersion's ok=false to skip emitting
db_version, and the JSON tag was already omitempty.
- SystemInfo.vue gated the entire row on v-if="systemInfo?.db_version".
The end result was the most useful diagnostic surface vanishing in the
exact failure mode that needs it most — Wiki ingest and KG features
would silently produce nothing with no UI hint.
Changes:
- migration.go: replace sync.Once-based setter with an RWMutex-guarded
state struct holding {version, dirty, err}. Every failure path now
calls captureMigrationFailure(m, err), which best-effort reads
m.Version() so the cached value still reflects the partial state.
- system.go (handler): always emit db_version (falling back to
"unknown" when no version could be read), append " (failed)" when an
error is recorded, and add db_migration_error to the response.
- swagger / client SDK: keep the API contract in sync with the new
response field.
- SystemInfo.vue: render the DB version row whenever either field is
present, show a "Migration failed" danger tag, and add a full-width
alert below the row carrying the error message plus two links:
1. View troubleshooting guide -> new docs/migration-troubleshooting.md
2. Report an issue -> github.com/Tencent/WeKnora/issues/new,
prefilled with the captured error and environment metadata.
- docs/migration-troubleshooting.md: new self-service guide covering
the common failure modes (missing extension, dirty state, privileges,
out of disk, schema drift) with concrete psql / make commands.
- i18n: add the new keys to zh-CN, en-US, ko-KR, ru-RU.
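The RWMutex-guarded state and the handler's display rule could look roughly like this. It is a sketch only: the type, method names, and the sample error are assumptions, and the dirty flag is stored but not rendered because the commit message does not say how it is displayed.

```go
package main

import (
	"errors"
	"strconv"
	"sync"
)

// errExtensionMissing is an illustrative failure modelled on the pg_trgm case.
var errExtensionMissing = errors.New(`extension "pg_trgm" is not available`)

// migrationState caches the outcome of the startup migration.
type migrationState struct {
	mu         sync.RWMutex
	version    uint
	hasVersion bool
	dirty      bool
	err        error
}

// record is called on both the success and the failure path, so a partial
// version survives even when the migration itself failed.
func (s *migrationState) record(version uint, hasVersion, dirty bool, err error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.version, s.hasVersion, s.dirty, s.err = version, hasVersion, dirty, err
}

// report renders db_version and db_migration_error for the system info response.
func (s *migrationState) report() (dbVersion, migrationError string) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	dbVersion = "unknown"
	if s.hasVersion {
		dbVersion = strconv.FormatUint(uint64(s.version), 10)
	}
	if s.err != nil {
		dbVersion += " (failed)"
		migrationError = s.err.Error()
	}
	return dbVersion, migrationError
}
```

Unlike a sync.Once setter, this state can be written on every path, which is what lets the failure case reach the UI.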
Refs #1319.
Co-authored-by: Cursor <cursoragent@cursor.com>
Add three new optional filters to the document list under a knowledge base
detail page — parse status, source/channel, and updated time range — and
rework multi-select to no longer cause the card title to jitter on hover.
Backend
- Introduce types.KnowledgeListFilter to aggregate optional filter dimensions
(tag, keyword, file_type, parse_status, source, updated_from/to) and switch
ListPagedKnowledgeByKnowledgeBaseID (repository/service/interface) to accept
it instead of a growing positional parameter list.
- The ListKnowledge HTTP handler accepts new parse_status, source, start_time
and end_time query params; time params accept RFC3339, "YYYY-MM-DD HH:MM:SS"
and "YYYY-MM-DD". The repository routes source="manual"/"url" onto the type
column to stay consistent with file_type semantics; other source values match
the channel column.
- Update the four other callers (agent_service, initialization) to pass an
empty filter struct, preserving prior behavior.
Frontend
- Add three controls in the doc-filter-bar (status select, source select,
date-range picker with future-date disabled) wired through getKnowled /
listKnowledgeFiles into the new backend params.
- Replace the hover-triggered card checkbox with an explicit "批量管理" ("Batch manage") mode
(mirrors the session list UX): in card view the checkbox only renders while
batch mode is on, entered via the per-card "..." menu; the list view keeps
its leading checkbox column. Switching from list to grid auto-enables batch
mode when something is already selected, so the selection stays visible.
- DocumentBatchBar now stays open whenever batch mode is on or the selection
is non-empty, and its "取消选择" ("Cancel selection") button both clears the
selection and exits batch mode.
API surface sync
- Regenerate Swagger artifacts (docs/docs.go / swagger.json / swagger.yaml).
- Update docs/api/knowledge.md with the new query parameters.
- Add backward-compatible ListKnowledgeWithFilter + KnowledgeListFilter to the
Go SDK; the existing ListKnowledge keeps its signature.
i18n
- New filter labels in zh-CN / en-US / ko-KR / ru-RU; reuse existing
menu.batchManage / batchManage.cancel for the multi-select strings.
Add scripts and docs for packaging WeKnora into cloud images (AMI,
custom images, snapshots) so users can distribute one-click deployable
templates on any cloud provider.
- scripts/cloud-image/: cloud-agnostic prepare/cleanup/firstboot scripts
plus systemd units. Downloads only the 4 runtime files needed by the
compose stack (~100KB) instead of cloning the full repo, and pins to
any git ref via WEKNORA_REF for reproducible builds.
- firstboot.sh randomizes DB/Redis/JWT/AES secrets on first boot,
writes credentials to /root/weknora-credentials.txt and self-removes.
- docs/cloud-image/: per-platform packaging guides. Includes a guide
for Tencent Cloud Lighthouse / CVM covering image creation, sharing,
and marketplace listing.
Default-on services match the unprofiled compose stack (frontend, app,
docreader, postgres, redis); optional services (qdrant, milvus,
neo4j, langfuse, etc.) remain opt-in via compose profiles to keep the
image size small.
Add an opt-in human approval gate so Agent runs pause before executing
MCP tools that operators flag as dangerous, surface an approval card in
the chat UI, and only resume after the user approves (optionally with
edited args) or rejects.
Backend
- New mcp_tool_approvals table + repo/service to mark per-tool approval
required (PG migration 000042 + sqlite init).
- approval.Gate coordinates RequestAndWait / Resolve with sync.Once
delivery, configurable timeout, and Redis Pub/Sub fan-out so multi-
replica deployments work without sticky sessions.
- MCPTool.Execute integrates the gate; uses a round-level ApprovalCtx
(without the per-tool 60s timeout) for the wait, and re-derives a
fresh 60s exec ctx after approval so CallTool keeps a full window.
- New SSE response types (tool_approval_required / _resolved) and
EventBus events plumb approval state to AgentStreamDisplay.
- REST: list/set per-tool approval flag, resolve pending approval.
- Configurable via agent.tool_approval_timeout_seconds (yaml) or
WEKNORA_AGENT_TOOL_APPROVAL_TIMEOUT env (accepts seconds or Go
duration).
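An env value that "accepts seconds or Go duration" can be parsed by trying an integer first and falling back to time.ParseDuration. This is a sketch with a hypothetical function name; rejecting non-positive values is an assumption of the example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseApprovalTimeout accepts either a bare integer number of seconds
// ("120") or a Go duration string ("2m30s").
func parseApprovalTimeout(raw string) (time.Duration, error) {
	raw = strings.TrimSpace(raw)
	if n, err := strconv.Atoi(raw); err == nil {
		if n <= 0 {
			return 0, fmt.Errorf("timeout must be positive, got %d", n)
		}
		return time.Duration(n) * time.Second, nil
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid timeout %q: %w", raw, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("timeout must be positive, got %s", d)
	}
	return d, nil
}
```

Trying the integer form first matters because time.ParseDuration rejects a bare number such as "120" for lacking a unit.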
Frontend
- MCP settings: per-tool "require approval" switch on the test panel.
- Chat: ToolApprovalCard renders the pause point with editable JSON
args, validation feedback, mm:ss countdown that turns warning/danger
near deadline, and a resolved state that retains context.
- i18n strings added for zh-CN / en-US / ko-KR / ru-RU.
Docs
- docs/zh/mcp-approval.md covering behavior, config, API, deployment
considerations (Redis cross-instance, restart limitations).
Three small follow-ups from the QA review of 13f57ca:
1. KBChunkingDebug error surfacing
Previously a 200 OK response with { success: false, error: "..." }
would have been swallowed under a generic "unexpected response shape"
message. The strict-shape check now distinguishes empty response,
success=false (surfaces resp.error directly), and missing data — so
any future backend-side validation message reaches the user.
2. Token approximation in docs/CHUNKING.md
Child Chunk Size default 384 ≈ 95 EN tokens (was rounded to 80).
384 / 4 chars-per-token = 96; "~95" is the honest figure.
3. API surface in docs/CHUNKING.md
The example only documented PUT /initialization/config/:kbId
(camelCase, documentSplitting envelope). Added explicit notes that
POST/PUT /knowledge-bases use snake_case under chunking_config, and
that POST /chunker/preview also uses the snake_case form plus a text
field. Readers picking the wrong endpoint won't be surprised by the
case mismatch anymore.
https://claude.ai/code/session_01XADhx6mtu2ZYW3DE9Lun6k
Three documentation passes around the adaptive chunking work:
UI
- Frontend ChunkOverlap default consolidated to 80 (was 100), matching
chunker.DefaultChunkOverlap on the backend. Both DEFAULT_CHUNKING_PRESET
and initFormData updated. The KB-load fallback also uses 80 when a
loaded KB has no chunk_overlap stored.
- All four locales (en-US, zh-CN, ko-KR, ru-RU) get rewritten chunking
setting descriptions: each now states the validated range, the default,
and the situations where you'd deviate (FAQ vs narrative, embedder
token limits, language-specific corpora).
Source code
- splitter.go: DefaultChunkSize / DefaultChunkOverlap constants get a
longer block-comment explaining the per-language token math and the
use-case sweet spots, plus the migration note on what the old
inconsistent defaults were.
- KBChunkingSettings.vue: new comment block above ChunkingConfig
documents the slider min/max for each setting, why those bounds
exist, and the recommended TokenLimit values per embedding model.
Repo docs
- New docs/CHUNKING.md: end-to-end guide covering why chunking matters,
the adaptive 3-tier architecture, per-setting reference with ranges
and sweet spots, parent-child explanation, the token-limit table per
embedder (OpenAI / Voyage / Cohere / BGE / MiniLM / Jina), 7 use-case
presets, the debug panel workflow, the API surface, and known
trade-offs (recursive strategy hidden from UI, no auto-reindex on
strategy switch, OCR limitations).
- CHANGELOG.md gets a new [Unreleased] section consolidating all the
adaptive-chunking work shipped on this branch: 5 features, 8
improvements, 6 fixes, 1 docs entry. The entry references
docs/CHUNKING.md for deeper explanation.
https://claude.ai/code/session_01XADhx6mtu2ZYW3DE9Lun6k
Bump version to v0.5.0 across VERSION, frontend/package.json,
frontend/package-lock.json and helm/Chart.yaml.
Highlights:
- Wiki Mode: agent-driven Wiki knowledge system that distills raw
documents into interlinked markdown pages, with a dedicated
WikiBrowser and an interactive knowledge graph visualizing page
references and relationships.
- Observability: Langfuse tracing across the agent ReAct loop, LLM
token usage, tool calls and the asynq async pipeline.
- Customizable indexing strategy: per-knowledge-base toggles for
vector / keyword / Wiki / knowledge-graph indexing.
- Vector Store UI & per-KB binding.
- Yuque connector with full / incremental sync.
- Agent enhancements: json_repair tool, OpenMAIC Classroom skill,
multi-sheet DuckDB Excel analysis.
- Docs: refreshed READMEs (EN/CN/JA/KO), CHANGELOG, QA, regenerated
Swagger and updated architecture diagram with new Wiki/Langfuse
components.
The existing Langfuse integration covered Chat / Embedding / Rerank / VLM /
ASR generations plus the HTTP + asynq spans, but the agent's own execution
tree was invisible: tool calls never appeared, multi-round ReAct iterations
were flat under the HTTP trace, and there was no single node representing
"one agent run".
This change adds three levels of agent-side spans:
- agent.execute: wraps AgentEngine.Execute; records query preview,
knowledge bases, allowed tools, final-answer length and totals on finish.
- agent.round.<N>: wraps each ReAct iteration; records finish_reason,
tool-call count, token usage and duration.
- agent.tool.<name>: wraps each tool invocation; records arguments,
success, duration, output preview (rune-safe, 4KB cap), error, data keys
and image count.
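A minimal sketch of what a rune-safe, byte-capped preview can look like. The helper name truncatePreview and the constant maxPreviewBytes are hypothetical and are not the project's truncateForLangfuse; the sketch assumes the input is valid UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// maxPreviewBytes mirrors the 4KB cap described above (hypothetical name).
const maxPreviewBytes = 4 * 1024

// truncatePreview returns s cut to at most limit bytes without splitting a
// multi-byte UTF-8 sequence. Hypothetical helper for illustration only.
func truncatePreview(s string, limit int) string {
	if limit <= 0 {
		return ""
	}
	if len(s) <= limit {
		return s
	}
	cut := limit
	// Step back until cut points at the first byte of a rune, so the
	// slice s[:cut] ends on a rune boundary.
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut--
	}
	return s[:cut]
}

func main() {
	out := truncatePreview("héllo wörld", 2)
	fmt.Println(out, utf8.ValidString(out))

	long := make([]byte, maxPreviewBytes+100)
	for i := range long {
		long[i] = 'a'
	}
	fmt.Println(len(truncatePreview(string(long), maxPreviewBytes)))
}
```

Cutting on a rune boundary keeps the preview valid UTF-8, so the observability backend never receives a half-encoded character at the end of a truncated tool output.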
To keep the loop's many exit paths (natural stop, stuck loop, empty-content
retry, final_answer, context cancellation) span-safe, the iteration body was
extracted into runReActIteration with a single defer span.Finish() and an
iterOutcome sentinel driving the outer loop. database_query arguments are
redacted (keys only) to avoid leaking raw SQL into the observability
backend, mirroring the existing UI hint policy.
Adds unit tests for the new helpers (truncateForLangfuse, argKeys, dataKeys,
finishToolSpan nil-safety, iterOutcome.String).
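The extraction described above can be sketched as follows. This is an illustration of the pattern only: the span type, the outcome constants, runIteration and runLoop are all hypothetical stand-ins, and the exit conditions are simplified to a round counter.

```go
package main

import (
	"errors"
	"fmt"
)

// iterOutcome tells the outer loop what to do after one iteration.
// The concrete values here are assumptions, not the project's.
type iterOutcome int

const (
	outcomeContinue iterOutcome = iota // run another round
	outcomeStop                        // natural stop or final answer
	outcomeAbort                       // error or cancellation
)

func (o iterOutcome) String() string {
	switch o {
	case outcomeContinue:
		return "continue"
	case outcomeStop:
		return "stop"
	case outcomeAbort:
		return "abort"
	default:
		return "unknown"
	}
}

// span is a stand-in for a tracing span; it only counts Finish calls.
type span struct{ finished int }

func (s *span) Finish() { s.finished++ }

// runIteration holds the whole iteration body. The single defer guarantees
// the span is finished exactly once, whichever return path is taken.
func runIteration(round int, s *span) (iterOutcome, error) {
	defer s.Finish()
	switch {
	case round < 1:
		return outcomeAbort, errors.New("invalid round")
	case round >= 3:
		return outcomeStop, nil
	}
	return outcomeContinue, nil
}

// runLoop is the outer loop: it only inspects the returned outcome.
func runLoop(maxRounds int) (rounds int, spans []*span, err error) {
	for round := 1; round <= maxRounds; round++ {
		s := &span{}
		spans = append(spans, s)
		outcome, iterErr := runIteration(round, s)
		rounds = round
		if outcome == outcomeContinue {
			continue
		}
		return rounds, spans, iterErr
	}
	return rounds, spans, nil
}

func main() {
	rounds, spans, err := runLoop(10)
	fmt.Println(rounds, len(spans), err)
}
```

Because every exit path is a plain return from one function, adding a new exit condition cannot leave a span open; the outer loop never touches the span at all.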
Previously the Langfuse integration only traced in-process HTTP requests
(chat / search / eval), so file uploads and every downstream asynq task
(document parse, chunk embedding, OCR/VLM, summary / question gen, wiki
ingest, datasource sync, etc.) produced either disconnected shallow
traces or no observation at all.
This change threads one trace end-to-end:
- tracer: add SPAN observation type and StartSpan; add ResumeTrace so a
worker can attach to an upstream trace without emitting a duplicate
trace-create; StartGeneration now auto-picks parentObservationId from
ctx so nested trace -> span -> generation trees render correctly.
- types.TracingContext + LangfuseTracingCarrier: embed on all 17 asynq
payloads so trace_id / parent_obs_id / user_id / session_id serialise
into every job.
- langfuse.InjectTracing: injected at 28 enqueue sites before json.Marshal
so the HTTP-layer trace survives the Redis hop.
- langfuse.AsynqMiddleware: mux.Use hook that peeks the payload, either
resumes the upstream trace or opens a standalone asynq.<type> trace
for scheduled jobs, and wraps the handler in a SPAN with task metadata
(id / queue / retry / payload_bytes) plus ERROR level on failure.
- GinMiddleware.shouldTrace: whitelist ingestion / knowledge-mutation /
FAQ / wiki / datasource endpoints so the root trace actually starts.
- Tests: tracer_test.go covers span nesting, error status, and
ResumeTrace no-trace-create guarantee; asynq_test.go covers
InjectTracing round-trip, middleware resume path, and standalone
trace fallback.
- Docs: docs/Langfuse集成.md now lists the covered task types
and documents the cross-process propagation model.
No behavioural change when Langfuse is disabled (all new code paths are
no-ops and carriers serialise to empty strings with omitempty).
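A minimal sketch of the carrier idea, assuming the four fields named above map to omitempty JSON tags. The struct layout, the DocumentParsePayload type and the encode / decode helpers are hypothetical; the real types.TracingContext may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TracingContext carries the identifiers listed above. The field and tag
// layout is an assumption made for this sketch.
type TracingContext struct {
	TraceID     string `json:"trace_id,omitempty"`
	ParentObsID string `json:"parent_obs_id,omitempty"`
	UserID      string `json:"user_id,omitempty"`
	SessionID   string `json:"session_id,omitempty"`
}

// DocumentParsePayload is a hypothetical asynq payload that embeds the
// carrier, so its fields are promoted into the payload's JSON object.
type DocumentParsePayload struct {
	TracingContext
	KnowledgeID string `json:"knowledge_id"`
}

// encode marshals the payload as it would be written to the queue.
func encode(p DocumentParsePayload) string {
	b, err := json.Marshal(p)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// decode is what a worker would do before resuming the upstream trace.
func decode(raw string) DocumentParsePayload {
	var p DocumentParsePayload
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		panic(err)
	}
	return p
}

func main() {
	// Tracing disabled: the carrier adds nothing to the serialised job.
	fmt.Println(encode(DocumentParsePayload{KnowledgeID: "kb-1"}))

	// Tracing enabled: the identifiers survive the round trip.
	p := DocumentParsePayload{KnowledgeID: "kb-1"}
	p.TraceID = "trace-123"
	fmt.Println(decode(encode(p)).TraceID)
}
```

With omitempty on every carrier field, a job enqueued while tracing is disabled serialises exactly as it did before the carrier was embedded.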
Closes #620, #497. Add opt-in Langfuse observability covering all five
model types (chat, embedding, rerank, VLM, ASR) with HTTP-request-scoped
traces and Docker Compose support (both cloud and self-hosted).
Core package internal/tracing/langfuse:
- HTTP client with batched async ingestion (non-blocking in request path)
- Sampling, environment / release tagging, and graceful fallback when
LANGFUSE_* env vars are absent (wrappers become no-ops)
- Gin middleware opens one trace per traced request and finishes it after
the handler chain returns, attaching method / path / user / session
- Trace context is stored under a typed key exported from internal/types
so logger.CloneContext can preserve it across handler / goroutine
boundaries (otherwise each LLM call auto-created an orphan trace,
fragmenting one request into many)
Per-model generation wrappers (opt-in via NewChat/NewEmbedder/...):
- chat: captures prompt, streaming output, token usage + TTFT
- embedding: approximates tokens when the provider omits usage
- rerank: previews query/docs, summarizes results to keep payload small
- vlm: records image count and total bytes, never uploads raw pixels
- asr: records file size and audio duration, never uploads audio bytes
Async title generation (GenerateTitleAsync) now forwards the trace key
into the goroutine so title calls appear under the parent chat trace.
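The forwarding step can be illustrated with a typed context key. This is a sketch under assumptions: contextKey, traceKey, newRequestContext, detach and traceIDFrom are hypothetical names, and the trace is reduced to a string ID.

```go
package main

import (
	"context"
	"fmt"
)

// contextKey is a dedicated type so the key cannot collide with keys
// defined by other packages.
type contextKey string

// traceKey is a hypothetical key name used only in this sketch.
const traceKey contextKey = "langfuse_trace"

// newRequestContext simulates a request-scoped context carrying a trace.
func newRequestContext(traceID string) (context.Context, context.CancelFunc) {
	return context.WithCancel(context.WithValue(context.Background(), traceKey, traceID))
}

// detach builds a context that outlives the request but keeps the trace
// value, which is what a background goroutine needs.
func detach(parent context.Context) context.Context {
	child := context.Background()
	if v := parent.Value(traceKey); v != nil {
		child = context.WithValue(child, traceKey, v)
	}
	return child
}

// traceIDFrom reads the trace back; it returns "" when none is present.
func traceIDFrom(ctx context.Context) string {
	id, _ := ctx.Value(traceKey).(string)
	return id
}

func main() {
	reqCtx, cancel := newRequestContext("trace-123")
	bg := detach(reqCtx)
	cancel() // the HTTP request has finished

	done := make(chan string)
	go func() { done <- traceIDFrom(bg) }()
	fmt.Println(<-done, bg.Err())
}
```

Copying only the known key into a fresh context lets the background call report under the parent trace without inheriting the request's cancellation.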
Docker Compose:
- LANGFUSE_* env passthrough on the `app` service for cloud deployments
- Optional `langfuse` profile spins up a self-hosted Langfuse stack that
reuses WeKnora's existing PostgreSQL (separate database via an idempotent
init container that fixes ICU collation drift) and Redis (separate DB
number), adding only ClickHouse, MinIO, web and worker containers
- web/worker entrypoints URL-encode DB_PASSWORD / REDIS_PASSWORD at start
to avoid Prisma P1013 when passwords contain @ / # / etc.
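The entrypoints do this encoding in the container startup path; the same idea is shown here in Go purely as an illustration. buildDSN and passwordOf are hypothetical helpers, and the host, user and database names are placeholders.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildDSN assembles a connection URL and lets net/url percent-encode the
// credentials, so characters such as '@' and '#' cannot be mistaken for
// URL delimiters.
func buildDSN(user, password, host, db string) string {
	u := url.URL{
		Scheme: "postgresql",
		User:   url.UserPassword(user, password),
		Host:   host,
		Path:   "/" + db,
	}
	return u.String()
}

// passwordOf parses a DSN and returns the decoded password.
func passwordOf(dsn string) (string, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", err
	}
	pw, _ := u.User.Password()
	return pw, nil
}

func main() {
	dsn := buildDSN("langfuse", "p@ss#1", "postgres:5432", "langfuse")
	fmt.Println(dsn)

	pw, err := passwordOf(dsn)
	fmt.Println(pw, err)
}
```

Without the encoding step, a '#' in the password starts the URL fragment and an '@' ends the userinfo section early, so the parser sees a different host and password than intended.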
Docs: docs/Langfuse集成.md covers cloud vs self-hosted, per-model usage
strategy, code map, and resource footprint.
- Create structured wiki from docs/ directory
- Add 17 markdown pages organized in 7 categories
- Include standard Markdown relative path links for navigation
- Add Mermaid knowledge graph visualization in Home.md
Wire VectorStoreService to HTTP with 8 endpoints: types metadata, CRUD
(create/list/get/update/delete), and connection testing (raw + by ID).
Register routes, DI container bindings, and add API documentation.
- Updated relevant files to include provider registration, implementation, and metadata.
- Enhanced frontend components to support provider management and configuration.
- Added localization for new provider settings and messages.
- Implemented backend repository methods for CRUD operations on web search providers.