Follow-up to #1359. Addresses a set of correctness and security gaps in
the initial docreader auth implementation.
- docker-compose: inject GRPC_TLS_*/GRPC_TLS_SERVER_NAME/GRPC_AUTH_TOKEN
into the WeKnora-app service. Without this the Go client never saw the
knobs, so enabling token auth on the server broke every RPC.
- client: bind tokenAuth.RequireTransportSecurity() to TLSEnabled so a
bearer token cannot be sent over an insecure channel once TLS is on.
- server: load_tls_credentials now raises TLSConfigError on misconfig
(cert/key missing, file unreadable, mTLS without CA); main.py exits 1
instead of silently downgrading to insecure.
- server: replace endswith("/Check"|"/Watch") health bypass with exact
match against /grpc.health.v1.Health/{Check,Watch}.
- server: compare tokens with hmac.compare_digest; warn on tokens shorter than 16 bytes.
- server: AuthInterceptor now returns an abort handler matching the
original RPC kind (unary/stream) and uses context.abort, so streaming
RPCs surface UNAUTHENTICATED instead of INTERNAL.
- internal/infrastructure/docparser/grpc_parser.go: drop the duplicated
TLS/tokenAuth block and reuse docreader/client.LoadAuthConfigFromEnv +
BuildDialOptions. Single source of truth for client-side auth.
- Add GRPC_TLS_SERVER_NAME (client SNI override) and
GRPC_MTLS_REQUIRE_CLIENT_CERT (server explicit mTLS toggle); document
the differing CA semantics between client and server in .env*.example.
- Reject half-configured client mTLS (cert XOR key) loudly.
- Fix missing trailing newline in .env.lite.example.
Verified locally: go build ./... and go vet ./... clean; auth.py
fail-fast / token paths smoke-tested.
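
For reviewers, the two server-side checks can be illustrated in Go, even though the real server is Python and uses hmac.compare_digest. The sketch below is only an analogue: `isHealthMethod`, `suffixBypass`, `tokenMatches`, and the `/docreader.Parser/Check` method name are hypothetical. It shows why a suffix match is unsafe (any service exposing a method named Check would skip auth) and how a constant-time comparison looks with `crypto/subtle`.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"strings"
)

// Only the standard health-check methods may bypass authentication.
var healthMethods = map[string]bool{
	"/grpc.health.v1.Health/Check": true,
	"/grpc.health.v1.Health/Watch": true,
}

// isHealthMethod uses an exact match on the full gRPC method name.
func isHealthMethod(fullMethod string) bool {
	return healthMethods[fullMethod]
}

// suffixBypass reproduces the old, overly permissive suffix check.
func suffixBypass(fullMethod string) bool {
	return strings.HasSuffix(fullMethod, "/Check") || strings.HasSuffix(fullMethod, "/Watch")
}

// tokenMatches compares tokens in constant time for equal-length inputs.
// ConstantTimeCompare returns 0 right away when the lengths differ.
func tokenMatches(got, want string) bool {
	return subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

func main() {
	other := "/docreader.Parser/Check" // hypothetical non-health method
	fmt.Println(suffixBypass(other))   // true: the old check lets it through
	fmt.Println(isHealthMethod(other)) // false: the exact match does not
	fmt.Println(tokenMatches("abc", "abc"))
}
```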
Closes #620, #497. Add opt-in Langfuse observability covering all five
model types (chat, embedding, rerank, VLM, ASR) with HTTP-request-scoped
traces and Docker Compose support (both cloud and self-hosted).
Core package internal/tracing/langfuse:
- HTTP client with batched async ingestion (non-blocking in request path)
- Sampling, environment / release tagging, and graceful fallback when
LANGFUSE_* env vars are absent (wrappers become no-ops)
- Gin middleware opens one trace per traced request and finishes it after
the handler chain returns, attaching method / path / user / session
- Trace context is stored under a typed key exported from internal/types
so logger.CloneContext can preserve it across handler / goroutine
boundaries (otherwise each LLM call auto-created an orphan trace,
fragmenting one request into many)
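
As a rough picture of "batched async ingestion (non-blocking in request path)", the sketch below queues events on a buffered channel and flushes them in batches from a background goroutine. `Event`, `Batcher`, `NewBatcher`, `Enqueue`, and `Close` are hypothetical names and not the API of internal/tracing/langfuse. Dropping events when the buffer is full is an assumed policy, and a real client would presumably also flush on a timer.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a placeholder for one ingestion record.
type Event struct{ Name string }

// Batcher collects events and hands them to flush in fixed-size batches.
type Batcher struct {
	events chan Event
	flush  func([]Event)
	size   int
	wg     sync.WaitGroup
}

func NewBatcher(buffer, batchSize int, flush func([]Event)) *Batcher {
	b := &Batcher{events: make(chan Event, buffer), flush: flush, size: batchSize}
	b.wg.Add(1)
	go b.loop()
	return b
}

// Enqueue never blocks the caller; it reports false if the event was dropped.
// It must not be called after Close.
func (b *Batcher) Enqueue(e Event) bool {
	select {
	case b.events <- e:
		return true
	default:
		return false // buffer full: drop rather than stall the request
	}
}

func (b *Batcher) loop() {
	defer b.wg.Done()
	batch := make([]Event, 0, b.size)
	for e := range b.events {
		batch = append(batch, e)
		if len(batch) == b.size {
			b.flush(batch)
			batch = make([]Event, 0, b.size)
		}
	}
	if len(batch) > 0 {
		b.flush(batch) // flush the remainder on shutdown
	}
}

// Close stops accepting events and waits for the final flush.
func (b *Batcher) Close() {
	close(b.events)
	b.wg.Wait()
}

func main() {
	b := NewBatcher(16, 2, func(batch []Event) { fmt.Println("flush", len(batch)) })
	for i := 0; i < 5; i++ {
		b.Enqueue(Event{Name: fmt.Sprintf("event-%d", i)})
	}
	b.Close()
}
```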
Per-model generation wrappers (opt-in via NewChat/NewEmbedder/...):
- chat: captures prompt, streaming output, token usage + TTFT
- embedding: approximates tokens when the provider omits usage
- rerank: previews query/docs, summarizes results to keep payload small
- vlm: records image count and total bytes, never uploads raw pixels
- asr: records file size and audio duration, never uploads audio bytes
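
To make the "metadata only" idea concrete, here is a minimal sketch of what a wrapper might record instead of raw payloads. `ImageSummary`, `summarizeImages`, and `preview` are hypothetical helpers, and the truncation rule is an assumption rather than the behavior of the real wrappers.

```go
package main

import "fmt"

// ImageSummary is what a VLM wrapper could attach to a trace in place of pixels.
type ImageSummary struct {
	Count      int
	TotalBytes int
}

// summarizeImages counts images and sums their sizes without copying any data.
func summarizeImages(images [][]byte) ImageSummary {
	s := ImageSummary{Count: len(images)}
	for _, img := range images {
		s.TotalBytes += len(img)
	}
	return s
}

// preview truncates text to at most max runes, marking any cut with "...".
func preview(s string, max int) string {
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	return string(r[:max]) + "..."
}

func main() {
	images := [][]byte{make([]byte, 1024), make([]byte, 2048)}
	fmt.Printf("%+v\n", summarizeImages(images))
	fmt.Println(preview("what is the refund policy for annual plans?", 20))
}
```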
Async title generation (GenerateTitleAsync) now forwards the trace key
into the goroutine so title calls appear under the parent chat trace.
Docker Compose:
- LANGFUSE_* env passthrough on the `app` service for cloud deployments
- Optional `langfuse` profile spins up a self-hosted Langfuse stack that
reuses WeKnora's existing PostgreSQL (separate database via an idempotent
init container that fixes ICU collation drift) and Redis (separate DB
number), adding only ClickHouse, MinIO, web and worker containers
- web/worker entrypoints URL-encode DB_PASSWORD / REDIS_PASSWORD at startup
to avoid Prisma P1013 when passwords contain special characters such as @ or #.
Docs: docs/Langfuse集成.md covers cloud vs self-hosted, per-model usage
strategy, code map, and resource footprint.
- Added a new `.env.lite.example` file for the Lite version, providing a minimal configuration template.
- Updated `.env.example` to remove deprecated variables and include new Docreader settings.
- Enhanced Docker configurations to support the Lite version, including a new Dockerfile for the Docreader service.
- Introduced a Makefile target for building and running the Lite version, along with packaging capabilities.
- Created GitHub workflows for building and releasing Lite binaries, including Homebrew formula support.
- Implemented a new service file for managing the Lite version as a system service.
This update enables a streamlined, single-binary deployment of WeKnora, reducing external dependencies and simplifying setup.