WeKnora/docs/dev/opensearch-integration-test.md
ochan.kwon 40b74e2efa feat(retriever): activate OpenSearch k-NN driver (PR 3 of 3)
Phase 3 (#1440) gate flip. PR 1 (#1445) + PR 2a (#1481) + PR 2b (#1482)
laid the type prep + driver skeleton + read/write paths as gated dead
code; this PR wires every activation surface so opensearch becomes a
registerable VectorStore engine.

Activation wiring
- internal/types: validEngineTypes / GetVectorStoreTypes (with HNSW
  bounds + knn_engine enum + Immutable hints) / retrieverEngineMapping /
  buildEnvStoreForDriver — every gated surface now recognises
  "opensearch". IndexConfig grows four omitempty HNSW fields (HNSWM /
  HNSWEFConstruction / HNSWEFSearch / KNNEngine), keeping other engines'
  serialised config byte-identical.
- internal/container: createOpenSearchEngine + the switch case in
  createEngineServiceFromStore; the RETRIEVE_DRIVER=opensearch env path
  in initRetrieveEngineRegistry; NewEngineFactory now closes over the
  AuditLogService (the EngineFactory type itself is unchanged).
- internal/application/service/vectorstore_healthcheck.go: a
  testOpenSearchConnection case so CreateStore's connectivity probe
  accepts opensearch instead of returning 400.
- internal/application/repository/retriever/opensearch/transport.go:
  NewOpenSearchClient is exported so the factory and env path can build
  the TLS-hardened client; healthcheck.go reuses the unexported
  probeVersion / probeKNNPlugin for the service-layer probe.

Service-layer validation
- validateOpenSearchIndexConfig validates the HNSW caps (m 2-100,
  ef_construction 2-4096, ef_search 1-10000, knn_engine ∈ lucene|faiss).
  Shards/replicas continue to be enforced by the flat ValidateIndexConfig.
  Create-only: UpdateStore mutates the name only.
- validateConnectionConfig requires addr for opensearch.
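
The HNSW caps above can be sketched as a plain range check. This is a minimal illustration, not the actual validateOpenSearchIndexConfig: the hnswConfig struct, the validateHNSW and inRange names, and the choice to treat a zero or empty value as "unset, fall back to the driver default" are all assumptions.

```go
package main

import "fmt"

// hnswConfig is a hypothetical stand-in for the four HNSW fields on IndexConfig.
type hnswConfig struct {
	M              int
	EFConstruction int
	EFSearch       int
	KNNEngine      string
}

// inRange reports whether v is unset (zero, an assumption) or within [lo, hi].
func inRange(v, lo, hi int) bool {
	return v == 0 || (v >= lo && v <= hi)
}

// validateHNSW applies the caps quoted in the commit message.
func validateHNSW(c hnswConfig) error {
	if !inRange(c.M, 2, 100) {
		return fmt.Errorf("hnsw_m must be in [2, 100], got %d", c.M)
	}
	if !inRange(c.EFConstruction, 2, 4096) {
		return fmt.Errorf("hnsw_ef_construction must be in [2, 4096], got %d", c.EFConstruction)
	}
	if !inRange(c.EFSearch, 1, 10000) {
		return fmt.Errorf("hnsw_ef_search must be in [1, 10000], got %d", c.EFSearch)
	}
	switch c.KNNEngine {
	case "", "lucene", "faiss": // empty means unset (an assumption)
	default:
		return fmt.Errorf("knn_engine must be lucene or faiss, got %q", c.KNNEngine)
	}
	return nil
}

func main() {
	fmt.Println(validateHNSW(hnswConfig{M: 16, EFConstruction: 100, KNNEngine: "lucene"}))
	fmt.Println(validateHNSW(hnswConfig{M: 101}))
}
```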

Sync implementations (stubs.go shrinks)
- CopyIndices (copy.go) mirrors the Elasticsearch / Qdrant pattern —
  search → BatchSave with the source_id remap for generated questions —
  so dim/keyword routing and the source_id contract come from BatchSave
  for free. embeddingMap is keyed by the *target* SourceID because
  OpenSearch's BatchSave looks up embeddings by SourceID
  (lookupEmbedding), not by chunk_id (the ES driver's convention).
  Pagination is from/size; copies larger than max_result_window
  (default 10000) need the scroll-based async path that lands later.
- BatchUpdateChunkEnabledStatus / BatchUpdateChunkTagID (bulk_update.go)
  group the input by target value and issue one _update_by_query per
  group over the cross-dim <base>_* pattern. Caller values flow through
  bound script params only — never string-interpolated into the Painless
  source — closing the script-injection surface.
- inspectByQueryResponse (byquery.go) mirrors inspectBulkResponse: the
  full failure reason goes to the debug log only; the returned error
  carries the bounded id + type.
- UpdateByQueryParams.Refresh is *bool in opensearch-go v4.6.0 (the same
  shape as DeleteByQuery's quirk), so refresh=wait_for is not
  expressible; we use refresh=true.
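
The group-then-update idea can be sketched as follows. This is only an illustration of the pattern under stated assumptions: the document field names chunk_id and is_enabled, and the helpers groupByEnabled and updateByQueryBody, are hypothetical and not taken from the driver. The point shown is that the caller's value appears only under script.params, while the Painless source is a constant string.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// groupByEnabled inverts chunkID -> enabled into enabled -> sorted chunk IDs,
// so each distinct target value needs only one _update_by_query call.
func groupByEnabled(status map[string]bool) map[bool][]string {
	groups := make(map[bool][]string)
	for id, enabled := range status {
		groups[enabled] = append(groups[enabled], id)
	}
	for _, ids := range groups {
		sort.Strings(ids) // deterministic request bodies
	}
	return groups
}

// updateByQueryBody builds one request body. The script source is fixed;
// the caller-supplied value travels as a bound parameter.
func updateByQueryBody(ids []string, enabled bool) ([]byte, error) {
	body := map[string]any{
		"query": map[string]any{
			"terms": map[string]any{"chunk_id": ids}, // field name is an assumption
		},
		"script": map[string]any{
			"lang":   "painless",
			"source": "ctx._source.is_enabled = params.enabled", // field name is an assumption
			"params": map[string]any{"enabled": enabled},
		},
	}
	return json.Marshal(body)
}

func main() {
	groups := groupByEnabled(map[string]bool{"c1": true, "c2": false, "c3": true})
	for _, enabled := range []bool{true, false} {
		body, err := updateByQueryBody(groups[enabled], enabled)
		if err != nil {
			panic(err)
		}
		fmt.Printf("POST /<base>_*/_update_by_query?refresh=true\n%s\n", body)
	}
}
```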

Driver-owned audit (DIP)
- A new opensearch.AuditSink interface (with nopSink + WithAuditSink
  functional option) lets the driver emit opensearch.index_created and
  opensearch.reindex_executed events without importing any service
  package — the service layer implements the interface. NewRepository
  takes opts, so existing 4-arg test call sites keep compiling unchanged.
- internal/container/audit_sink.go bridges AuditSink to AuditLogService.
  When the context carries no tenant (the env-path registration ctx
  during boot, for example) the adapter skips the emit with a warning
  rather than silently writing tenant_id=0, which would collide with the
  system-scope sentinel.

Frontend + polish
- FieldSchema (frontend/src/api/vector-store.ts) gains min/max/enum/
  immutable. VectorStoreSettings.vue is now schema-driven: a closed
  `enum` renders a t-select; number inputs use the schema's `:min`/`:max`
  and fall back to the legacy replica-vs-shard heuristic only when the
  schema does not pin them; a danger-coloured warning fires when
  insecure_skip_verify is toggled on (the switch and warning are wrapped
  in a vertical stack so the warning sits on its own row below the switch).
- i18n: labels for hnsw_m / hnsw_ef_construction / hnsw_ef_search /
  knn_engine / insecure_skip_verify plus the warning copy in en-US,
  ko-KR, zh-CN, ru-RU.
- docker-compose.dev.yml: an opensearch profile (single-node 3.3.2 with
  security plugin disabled for dev only). OpenSearch Dashboards lives in a
  separate, opt-in opensearch-ui profile so the heavy UI container is not
  forced up alongside the cluster (the driver e2e is fully curl-verifiable
  against :9200). The new docs/dev/opensearch-integration-test.md covers the
  end-to-end exercise and the single-node guidance (set replicas=0 to keep
  the cluster Green).

Gating-guard tests flipped
- The "OpenSearch is NOT in validEngineTypes / mapping / types list /
  env builder / stubs" guard tests from PR 1 / PR 2 are replaced by
  their positive counterparts in this PR. The test suite was the
  activation checklist; the activation flip is its diff.

Backward compatibility
- Additive everywhere. IndexConfig's new HNSW fields are omitempty so
  other engines' serialised config is byte-identical. Existing
  Elasticsearch / Qdrant / Milvus / Weaviate / Doris / TencentVectorDB
  stores are untouched. No migrations.
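
The byte-identical claim follows from how encoding/json treats omitempty: zero-valued fields are left out of the output entirely. A small sketch, assuming JSON keys that match the registration payload in the integration guide; the two struct types are simplified stand-ins, not the real IndexConfig.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// legacyIndexConfig approximates the config before this PR (assumed shape).
type legacyIndexConfig struct {
	NumberOfShards   int `json:"number_of_shards"`
	NumberOfReplicas int `json:"number_of_replicas"`
}

// indexConfig adds the four HNSW fields, all tagged omitempty.
type indexConfig struct {
	NumberOfShards     int    `json:"number_of_shards"`
	NumberOfReplicas   int    `json:"number_of_replicas"`
	HNSWM              int    `json:"hnsw_m,omitempty"`
	HNSWEFConstruction int    `json:"hnsw_ef_construction,omitempty"`
	HNSWEFSearch       int    `json:"hnsw_ef_search,omitempty"`
	KNNEngine          string `json:"knn_engine,omitempty"`
}

// mustJSON marshals v and panics on error, which cannot occur for these types.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	before := mustJSON(legacyIndexConfig{NumberOfShards: 1, NumberOfReplicas: 1})
	after := mustJSON(indexConfig{NumberOfShards: 1, NumberOfReplicas: 1})
	fmt.Println(before == after) // unset HNSW fields leave no trace in the output

	// Only an engine that sets the HNSW fields sees them serialised.
	fmt.Println(mustJSON(indexConfig{NumberOfShards: 1, HNSWM: 16, KNNEngine: "lucene"}))
}
```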

Test plan
- go build ./... clean
- go vet ./... clean
- gofmt -l clean on touched files
- go test ./... — only TestOssEnsureBucket_CreateFails (Aliyun OSS
  endpoint), the docreader gRPC tests, and the doris SQL-shape tests
  fail; all three are pre-existing on upstream/main and untouched by
  this PR.
- New tests across internal/types, opensearch, service and container —
  including a full end-to-end env-path test that exercises
  initRetrieveEngineRegistry with RETRIEVE_DRIVER=opensearch against an
  httptest cluster.
2026-05-29 16:32:27 +08:00


OpenSearch k-NN driver — local integration test

This guide brings up a single-node OpenSearch cluster and exercises the OpenSearch retrieve engine end to end. The driver lives in internal/application/repository/retriever/opensearch/.

1. Start a dev cluster

docker compose -f docker-compose.dev.yml --profile opensearch up -d

This starts:

  • opensearch on http://localhost:9200 — single-node, security plugin disabled (plain HTTP, no auth/TLS). The image bundles the opensearch-knn plugin.

OpenSearch Dashboards is optional and lives in a separate opensearch-ui profile, so it is not started by --profile opensearch. The whole integration test below is curl-verifiable against :9200. If you want the web UI (Dev Tools console / visual index inspection), start it on demand:

docker compose -f docker-compose.dev.yml --profile opensearch-ui up -d
# opensearch-dashboards on http://localhost:5601 (depends_on pulls the cluster in)

Verify:

curl -s localhost:9200 | jq '.version.distribution, .version.number'
# "opensearch"
# "3.3.2"
curl -s 'localhost:9200/_cat/plugins?format=json' | jq -r '.[].component' | grep opensearch-knn

Production clusters must enable the security plugin (TLS + auth). The dev profile disables it only to keep local setup trivial. When connecting to a secured cluster, set username / password and — for self-signed certs in dev only — insecure_skip_verify=true.

2. Register the store

Option A — DB store (UI / API)

POST /api/v1/vector-stores:

{
  "name": "opensearch-local",
  "engine_type": "opensearch",
  "connection_config": { "addr": "http://localhost:9200" },
  "index_config": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "hnsw_m": 16,
    "hnsw_ef_construction": 100,
    "knn_engine": "lucene"
  }
}

CreateStore runs the connection probe (version + k-NN plugin) before persisting; a bad address / unsupported version / missing plugin is rejected with 400.
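
The two checks behind that probe can be approximated with the same endpoints used in the Verify step. The sketch below is not the driver's probeVersion / probeKNNPlugin: the probe, getJSON and fakeCluster names are hypothetical, the version floor is left out because this guide does not state it, and an httptest server stands in for the cluster so the example runs without Docker.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// getJSON decodes the JSON body of GET url into out.
func getJSON(url string, out any) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("GET %s: status %d", url, resp.StatusCode)
	}
	return json.NewDecoder(resp.Body).Decode(out)
}

// probe checks the distribution and the presence of the k-NN plugin.
func probe(addr string) error {
	var root struct {
		Version struct {
			Distribution string `json:"distribution"`
		} `json:"version"`
	}
	if err := getJSON(addr+"/", &root); err != nil {
		return err
	}
	if root.Version.Distribution != "opensearch" {
		return fmt.Errorf("not an OpenSearch cluster: distribution %q", root.Version.Distribution)
	}
	var plugins []struct {
		Component string `json:"component"`
	}
	if err := getJSON(addr+"/_cat/plugins?format=json", &plugins); err != nil {
		return err
	}
	for _, p := range plugins {
		if p.Component == "opensearch-knn" {
			return nil
		}
	}
	return fmt.Errorf("opensearch-knn plugin not installed")
}

// fakeCluster serves canned responses for the two probed endpoints.
func fakeCluster(withKNN bool) *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, `{"version":{"distribution":"opensearch","number":"3.3.2"}}`)
	})
	mux.HandleFunc("/_cat/plugins", func(w http.ResponseWriter, _ *http.Request) {
		if withKNN {
			fmt.Fprint(w, `[{"component":"opensearch-knn"}]`)
			return
		}
		fmt.Fprint(w, `[]`)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := fakeCluster(true)
	defer srv.Close()
	fmt.Println(probe(srv.URL))
}
```

Pointing probe at http://localhost:9200 instead of the fake server should exercise the dev cluster from step 1.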

Option B — env store

export RETRIEVE_DRIVER=opensearch
export OPENSEARCH_ADDR=http://localhost:9200
# export OPENSEARCH_USERNAME / OPENSEARCH_PASSWORD for a secured cluster
# export OPENSEARCH_INSECURE_SKIP_VERIFY=true   # self-signed dev TLS only

3. Single-node note (important)

On a single-node cluster, any index created with number_of_replicas >= 1 leaves its replica shard unassigned, so the index health goes Yellow. Yellow does not block reads or writes — it is safe for local testing — but to keep the cluster Green set number_of_replicas: 0 at store registration (as in the Option A example above). The driver default is 1 (it assumes a ≥2-node cluster).

4. Exercise the flow

  1. Bind a knowledge base to the store and ingest a few documents.
  2. Confirm the per-dimension index appears: curl -s 'localhost:9200/_cat/indices?v' | grep weknora (e.g. weknora_<storeprefix>_768 + alias, plus weknora_<storeprefix>_keywords).
  3. Run a retrieval query against the bound KB and confirm hits come back.
  4. Copy the KB to another KB and confirm the docs are reindexed (opensearch.reindex_executed audit event).
  5. Toggle chunk enabled-status / tag and confirm _update_by_query applies it.

5. Tear down

docker compose -f docker-compose.dev.yml --profile opensearch down -v

Scope notes

  • Large-batch async reindex / delete (task polling) is a follow-up; the sync paths handle typical KB sizes (pagination is bounded by max_result_window, default 10000).
  • Native hybrid query + search pipeline is out of scope — fusion stays at the service layer (RRF).
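
To make the pagination bound concrete: OpenSearch rejects a search whose from + size exceeds index.max_result_window, so a from/size copy can only cover that many documents. The sketch below plans the pages for a given document count and refuses counts above the window. planPages and the page type are hypothetical helpers for illustration, not driver code.

```go
package main

import "fmt"

// page is one from/size search request.
type page struct {
	From int
	Size int
}

// planPages splits total documents into from/size pages and fails when the
// last page would exceed maxResultWindow (from + size > window).
func planPages(total, pageSize, maxResultWindow int) ([]page, error) {
	if pageSize <= 0 {
		return nil, fmt.Errorf("page size must be positive, got %d", pageSize)
	}
	if total > maxResultWindow {
		return nil, fmt.Errorf("%d docs exceed max_result_window %d; from/size cannot reach them", total, maxResultWindow)
	}
	var pages []page
	for from := 0; from < total; from += pageSize {
		size := pageSize
		if from+size > total {
			size = total - from // trim the final page
		}
		pages = append(pages, page{From: from, Size: size})
	}
	return pages, nil
}

func main() {
	pages, err := planPages(2500, 1000, 10000)
	fmt.Println(pages, err)

	_, err = planPages(10001, 1000, 10000)
	fmt.Println(err)
}
```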