Introduce opendataloader and PaddleOCR-VL parser engines with tenant-level
settings UI, replace liteparse, and harden Excel/PPT/Markdown parsing.
Optional odl-hybrid sidecar stays local-build only and is excluded from
default dev-start and full profiles.
Phase 3 (#1440) gate flip. PR 1 (#1445) + PR 2a (#1481) + PR 2b (#1482)
laid the type prep + driver skeleton + read/write paths as gated dead
code; this PR wires every activation surface so opensearch becomes a
registerable VectorStore engine.
Activation wiring
- internal/types: validEngineTypes / GetVectorStoreTypes (with HNSW
bounds + knn_engine enum + Immutable hints) / retrieverEngineMapping /
buildEnvStoreForDriver — every gated surface now recognises
"opensearch". IndexConfig grows four omitempty HNSW fields (HNSWM /
HNSWEFConstruction / HNSWEFSearch / KNNEngine), keeping other engines'
serialised config byte-identical.
- internal/container: createOpenSearchEngine + the switch case in
createEngineServiceFromStore; the RETRIEVE_DRIVER=opensearch env path
in initRetrieveEngineRegistry; NewEngineFactory now closes over the
AuditLogService (the EngineFactory type itself is unchanged).
- internal/application/service/vectorstore_healthcheck.go: a
testOpenSearchConnection case so CreateStore's connectivity probe
accepts opensearch instead of returning 400.
- internal/application/repository/retriever/opensearch/transport.go:
NewOpenSearchClient is exported so the factory and env path can build
the TLS-hardened client; healthcheck.go reuses the unexported
probeVersion / probeKNNPlugin for the service-layer probe.
Service-layer validation
- validateOpenSearchIndexConfig validates the HNSW caps (m 2-100,
ef_construction 2-4096, ef_search 1-10000, knn_engine ∈ lucene|faiss).
Shards/replicas continue to be enforced by the flat ValidateIndexConfig.
Create-only: UpdateStore mutates the name only.
- validateConnectionConfig requires addr for opensearch.
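For illustration, the HNSW bounds above could be checked roughly as follows. This is a minimal sketch, not the code in this PR: the HNSWConfig struct, the inRange helper and the "zero means unset" rule are assumptions made for the example; only the numeric ranges and the lucene|faiss enum come from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// HNSWConfig is a hypothetical stand-in for the four HNSW fields on IndexConfig.
// Assumption: a zero value means "not set" and is skipped, mirroring omitempty.
type HNSWConfig struct {
	M              int
	EFConstruction int
	EFSearch       int
	KNNEngine      string
}

// inRange returns an error when v is set and falls outside [lo, hi].
func inRange(name string, v, lo, hi int) error {
	if v == 0 {
		return nil // unset, nothing to validate
	}
	if v < lo || v > hi {
		return fmt.Errorf("%s must be between %d and %d, got %d", name, lo, hi, v)
	}
	return nil
}

// validateHNSW applies the caps quoted in the commit message and collects
// every violation instead of stopping at the first one.
func validateHNSW(c HNSWConfig) error {
	errs := []error{
		inRange("hnsw_m", c.M, 2, 100),
		inRange("hnsw_ef_construction", c.EFConstruction, 2, 4096),
		inRange("hnsw_ef_search", c.EFSearch, 1, 10000),
	}
	switch c.KNNEngine {
	case "", "lucene", "faiss":
		// allowed (empty means unset)
	default:
		errs = append(errs, fmt.Errorf("knn_engine must be lucene or faiss, got %q", c.KNNEngine))
	}
	return errors.Join(errs...) // nil when every entry is nil
}

func main() {
	ok := HNSWConfig{M: 16, EFConstruction: 128, EFSearch: 100, KNNEngine: "faiss"}
	bad := HNSWConfig{M: 101, KNNEngine: "nmslib"}
	fmt.Println("ok config error:", validateHNSW(ok))
	fmt.Println("bad config error:", validateHNSW(bad))
}
```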
Sync implementations (stubs.go shrinks)
- CopyIndices (copy.go) mirrors the Elasticsearch / Qdrant pattern —
search → BatchSave with the source_id remap for generated questions —
so dim/keyword routing and the source_id contract come from BatchSave
for free. embeddingMap is keyed by the *target* SourceID because
OpenSearch's BatchSave looks up embeddings by SourceID
(lookupEmbedding), not by chunk_id (the ES driver's convention).
Pagination is from/size; copies larger than max_result_window
(default 10000) need the scroll-based async path that lands later.
- BatchUpdateChunkEnabledStatus / BatchUpdateChunkTagID (bulk_update.go)
group the input by target value and issue one _update_by_query per
group over the cross-dim <base>_* pattern. Caller values flow through
bound script params only — never string-interpolated into the Painless
source — closing the script-injection surface.
- inspectByQueryResponse (byquery.go) mirrors inspectBulkResponse: the
full failure reason goes to the debug log only; the returned error
carries the bounded id + type.
- UpdateByQueryParams.Refresh is *bool in opensearch-go v4.6.0 (the same
shape as DeleteByQuery's quirk), so refresh=wait_for is not
expressible; we use refresh=true.
Driver-owned audit (DIP)
- A new opensearch.AuditSink interface (with nopSink + WithAuditSink
functional option) lets the driver emit opensearch.index_created and
opensearch.reindex_executed events without importing any service
package — the service layer implements the interface. NewRepository
takes opts, so existing 4-arg test call sites keep compiling unchanged.
- internal/container/audit_sink.go bridges AuditSink to AuditLogService.
When the context carries no tenant (the env-path registration ctx
during boot, for example) the adapter skips the emit with a warning
rather than silently writing tenant_id=0, which would collide with the
system-scope sentinel.
Frontend + polish
- FieldSchema (frontend/src/api/vector-store.ts) gains min/max/enum/
immutable. VectorStoreSettings.vue is now schema-driven: a closed
`enum` renders a t-select; number inputs use the schema's `:min`/`:max`
and fall back to the legacy replica-vs-shard heuristic only when the
schema does not pin them; a danger-coloured warning fires when
insecure_skip_verify is toggled on (the switch and warning are wrapped
in a vertical stack so the warning sits on its own row below the switch).
- i18n: labels for hnsw_m / hnsw_ef_construction / hnsw_ef_search /
knn_engine / insecure_skip_verify plus the warning copy in en-US,
ko-KR, zh-CN, ru-RU.
- docker-compose.dev.yml: an opensearch profile (single-node 3.3.2 with
security plugin disabled for dev only). OpenSearch Dashboards lives in a
separate, opt-in opensearch-ui profile so the heavy UI container is not
forced up alongside the cluster (the driver e2e is fully curl-verifiable
against :9200). The new docs/dev/opensearch-integration-test.md covers the
end-to-end exercise and the single-node guidance (set replicas=0 to keep
the cluster Green).
Gating-guard tests flipped
- The "OpenSearch is NOT in validEngineTypes / mapping / types list /
env builder / stubs" guard tests from PR 1 / PR 2 are replaced by
their positive counterparts in this PR. The test suite was the
activation checklist; the activation flip is its diff.
Backward compatibility
- Additive everywhere. IndexConfig's new HNSW fields are omitempty so
other engines' serialised config is byte-identical. Existing
Elasticsearch / Qdrant / Milvus / Weaviate / Doris / TencentVectorDB
stores are untouched. No migrations.
Test plan
- go build ./... clean
- go vet ./... clean
- gofmt -l clean on touched files
- go test ./... — only TestOssEnsureBucket_CreateFails (Aliyun OSS
endpoint), the docreader gRPC tests, and the doris SQL-shape tests
fail; all three are pre-existing on upstream/main and untouched by
this PR.
- New tests across internal/types, opensearch, service and container —
including a full end-to-end env-path test that exercises
initRetrieveEngineRegistry with RETRIEVE_DRIVER=opensearch against an
httptest cluster.
Follow-up to #1359. Addresses a set of correctness and security gaps in
the initial docreader auth implementation.
- docker-compose: inject GRPC_TLS_*/GRPC_TLS_SERVER_NAME/GRPC_AUTH_TOKEN
into the WeKnora-app service. Without this the Go client never saw the
knobs, so enabling token auth on the server broke every RPC.
- client: bind tokenAuth.RequireTransportSecurity() to TLSEnabled so a
bearer token cannot be sent over an insecure channel once TLS is on.
- server: load_tls_credentials now raises TLSConfigError on misconfig
(cert/key missing, file unreadable, mTLS without CA); main.py exits 1
instead of silently downgrading to insecure.
- server: replace endswith("/Check"|"/Watch") health bypass with exact
match against /grpc.health.v1.Health/{Check,Watch}.
- server: compare tokens with hmac.compare_digest; warn on tokens shorter than 16 bytes.
- server: AuthInterceptor now returns an abort handler matching the
original RPC kind (unary/stream) and uses context.abort, so streaming
RPCs surface UNAUTHENTICATED instead of INTERNAL.
- internal/infrastructure/docparser/grpc_parser.go: drop the duplicated
TLS/tokenAuth block and reuse docreader/client.LoadAuthConfigFromEnv +
BuildDialOptions. Single source of truth for client-side auth.
- Add GRPC_TLS_SERVER_NAME (client SNI override) and
GRPC_MTLS_REQUIRE_CLIENT_CERT (server explicit mTLS toggle); document
the differing CA semantics between client and server in .env*.example.
- Reject half-configured client mTLS (cert XOR key) loudly.
- Fix missing trailing newline in .env.lite.example.
Verified locally: go build ./... and go vet ./... clean; auth.py
fail-fast / token paths smoke-tested.
`${SEARXNG_SECRET:?...}` made the variable mandatory at compose parse time,
which forced *any* compose command (default profile included) to fail when
SEARXNG_SECRET was unset, with a message confusingly claiming the searxng
profile was being started.
Switch to `${SEARXNG_SECRET:-weknora-default-searxng-secret-...}` so the
searxng profile starts zero-config. Default deployments bind searxng to
127.0.0.1 only, so a shared default secret is acceptable; .env.example
now explicitly warns to rotate it before flipping SEARXNG_BIND=0.0.0.0,
since secret_key signs image-proxy URLs.
- Updated .env.example to clarify SEARXNG_SECRET generation and added SSRF_WHITELIST_EXTRA for improved security.
- Modified docker-compose files to bind SearXNG to localhost by default and introduced a one-time initialization service to set up settings.yml correctly.
- Enhanced SearxngProvider with stricter URL validation, ensuring no query or fragment is present in the base URL.
- Added unit tests for SearXNG validation and date parsing to ensure robustness.
- Updated frontend WebSearchSettings to reflect changes in SearXNG instance URL handling.
This commit improves the security and usability of the SearXNG integration, addressing potential misconfigurations and enhancing the developer experience.
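The stricter base-URL check could look roughly like this. It is a sketch only: validateBaseURL is a hypothetical name, and the scheme and host checks are assumptions added for completeness; the commit itself only states that a query or fragment in the base URL is rejected.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// validateBaseURL rejects base URLs that carry a query string or fragment.
// The http/https and non-empty-host checks are assumptions for this sketch.
func validateBaseURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse base URL: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return errors.New("base URL must use http or https")
	}
	if u.Host == "" {
		return errors.New("base URL must include a host")
	}
	// ForceQuery catches a bare trailing "?" with an empty query.
	if u.RawQuery != "" || u.ForceQuery {
		return errors.New("base URL must not contain a query")
	}
	if u.Fragment != "" {
		return errors.New("base URL must not contain a fragment")
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"http://127.0.0.1:8080",
		"http://127.0.0.1:8080/search?q=x",
		"http://127.0.0.1:8080/#results",
	} {
		fmt.Printf("%-36s -> %v\n", raw, validateBaseURL(raw))
	}
}
```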
Closes #620, #497. Add opt-in Langfuse observability covering all five
model types (chat, embedding, rerank, VLM, ASR) with HTTP-request-scoped
traces and Docker Compose support (both cloud and self-hosted).
Core package internal/tracing/langfuse:
- HTTP client with batched async ingestion (non-blocking in request path)
- Sampling, environment / release tagging, and graceful fallback when
LANGFUSE_* env vars are absent (wrappers become no-ops)
- Gin middleware opens one trace per traced request and finishes it after
the handler chain returns, attaching method / path / user / session
- Trace context is stored under a typed key exported from internal/types
so logger.CloneContext can preserve it across handler / goroutine
boundaries (otherwise each LLM call auto-created an orphan trace,
fragmenting one request into many)
Per-model generation wrappers (opt-in via NewChat/NewEmbedder/...):
- chat: captures prompt, streaming output, token usage + TTFT
- embedding: approximates tokens when the provider omits usage
- rerank: previews query/docs, summarizes results to keep payload small
- vlm: records image count and total bytes, never uploads raw pixels
- asr: records file size and audio duration, never uploads audio bytes
Async title generation (GenerateTitleAsync) now forwards the trace key
into the goroutine so title calls appear under the parent chat trace.
Docker Compose:
- LANGFUSE_* env passthrough on the `app` service for cloud deployments
- Optional `langfuse` profile spins up a self-hosted Langfuse stack that
reuses WeKnora's existing PostgreSQL (separate database via an idempotent
init container that fixes ICU collation drift) and Redis (separate DB
number), adding only ClickHouse, MinIO, web and worker containers
- web/worker entrypoints URL-encode DB_PASSWORD / REDIS_PASSWORD at start
to avoid Prisma P1013 when passwords contain URL-reserved characters
such as @ or #.
Docs: docs/Langfuse集成.md covers cloud vs self-hosted, per-model usage
strategy, code map, and resource footprint.
- Added a new `.env.lite.example` file for the Lite version, providing a minimal configuration template.
- Updated `.env.example` to remove deprecated variables and include new Docreader settings.
- Enhanced Docker configurations to support the Lite version, including a new Dockerfile for the Docreader service.
- Introduced a Makefile target for building and running the Lite version, along with packaging capabilities.
- Created GitHub workflows for building and releasing Lite binaries, including Homebrew formula support.
- Implemented a new service file for managing the Lite version as a system service.
This update enables a streamlined, single-binary deployment of WeKnora, reducing external dependencies and simplifying setup.
- Added container name for the sandbox service in both docker-compose.dev.yml and docker-compose.yml.
- Changed the sandbox service's profile from 'sandbox' to 'full'.
- Added logging for skill availability and sandbox mode in the skill handler, improving debugging capabilities.
- Updated .env.example to set default sandbox mode to 'docker' and added timeout and docker image variables.
- Modified docker-compose files to include a sandbox service for building and pulling the sandbox image.
- Adjusted frontend API to reflect sandbox availability for skills, ensuring UI elements are conditionally displayed based on sandbox status.
- Implemented backend logic to disable skills when the sandbox is not enabled, improving error handling and user experience.
- Deleted several SQL migration files related to agent configuration, cleanup of unreferenced models, MCP services, and web search configuration.
- Updated docker-compose files to remove references to obsolete SQL migration scripts, streamlining the development environment setup.
- Improved code maintainability by eliminating unnecessary files and configurations.
- Updated agent configuration to support separate system prompts for web search enabled and disabled states.
- Removed deprecated agent configuration parameters to streamline settings management.
- Enhanced UI components in AgentSettings.vue to allow configuration of custom prompts based on web search status.
- Improved localization in English, Russian, and Chinese for new prompt settings and UI elements.
- Refactored related API and service logic to accommodate changes in agent configuration structure.
- Introduced new MCP services functionality, including management routes and integration with the agent service.
- Updated Docker Compose configuration to include a new SQL migration for MCP services.
- Enhanced Makefile with new migration commands for versioning and creating migrations.
- Improved chat component to handle selected MCP services in message processing.
- Updated migration documentation to reflect new strategies and added MCP services migration files.
- Added 'logs/' and '*.pid' to .gitignore to exclude log files and process ID files from version control.
- Expanded Makefile with new development commands for easier local environment management, including 'dev-start', 'dev-stop', 'dev-restart', 'dev-logs', 'dev-status', 'dev-app', and 'dev-frontend'.
- Updated README_CN.md to include instructions for the new development mode and commands for improved developer experience.