mirror of
https://github.com/Tencent/WeKnora.git
synced 2026-06-04 13:30:32 +08:00
Phase 3 (#1440) gate flip. PR 1 (#1445) + PR 2a (#1481) + PR 2b (#1482) laid the type prep + driver skeleton + read/write paths as gated dead code; this PR wires every activation surface so opensearch becomes a registerable VectorStore engine.

Activation wiring
- internal/types: validEngineTypes / GetVectorStoreTypes (with HNSW bounds + knn_engine enum + Immutable hints) / retrieverEngineMapping / buildEnvStoreForDriver; every gated surface now recognises "opensearch". IndexConfig grows four omitempty HNSW fields (HNSWM / HNSWEFConstruction / HNSWEFSearch / KNNEngine), keeping other engines' serialised config byte-identical.
- internal/container: createOpenSearchEngine + the switch case in createEngineServiceFromStore; the RETRIEVE_DRIVER=opensearch env path in initRetrieveEngineRegistry; NewEngineFactory now closes over the AuditLogService (the EngineFactory type itself is unchanged).
- internal/application/service/vectorstore_healthcheck.go: a testOpenSearchConnection case so CreateStore's connectivity probe accepts opensearch instead of returning 400.
- internal/application/repository/retriever/opensearch/transport.go: NewOpenSearchClient is exported so the factory and env path can build the TLS-hardened client; healthcheck.go reuses the unexported probeVersion / probeKNNPlugin for the service-layer probe.

Service-layer validation
- validateOpenSearchIndexConfig validates the HNSW caps (m 2-100, ef_construction 2-4096, ef_search 1-10000, knn_engine ∈ lucene|faiss). Shards/replicas continue to be enforced by the flat ValidateIndexConfig. Create-only: UpdateStore mutates the name only.
- validateConnectionConfig requires addr for opensearch.

Sync implementations (stubs.go shrinks)
- CopyIndices (copy.go) mirrors the Elasticsearch / Qdrant pattern (search → BatchSave with the source_id remap for generated questions), so dim/keyword routing and the source_id contract come from BatchSave for free. embeddingMap is keyed by the *target* SourceID because OpenSearch's BatchSave looks up embeddings by SourceID (lookupEmbedding), not by chunk_id (the ES driver's convention). Pagination is from/size; copies larger than max_result_window (default 10000) need the scroll-based async path that lands later.
- BatchUpdateChunkEnabledStatus / BatchUpdateChunkTagID (bulk_update.go) group the input by target value and issue one _update_by_query per group over the cross-dim <base>_* pattern. Caller values flow through bound script params only, never string-interpolated into the Painless source, which closes the script-injection surface.
- inspectByQueryResponse (byquery.go) mirrors inspectBulkResponse: the full failure reason goes to the debug log only; the returned error carries the bounded id + type.
- UpdateByQueryParams.Refresh is *bool in opensearch-go v4.6.0 (the same shape as DeleteByQuery's quirk), so refresh=wait_for is not expressible; we use refresh=true.

Driver-owned audit (DIP)
- A new opensearch.AuditSink interface (with nopSink + WithAuditSink functional option) lets the driver emit opensearch.index_created and opensearch.reindex_executed events without importing any service package; the service layer implements the interface. NewRepository takes opts, so existing 4-arg test call sites keep compiling unchanged.
- internal/container/audit_sink.go bridges AuditSink to AuditLogService. When the context carries no tenant (the env-path registration ctx during boot, for example) the adapter skips the emit with a warning rather than silently writing tenant_id=0, which would collide with the system-scope sentinel.

Frontend + polish
- FieldSchema (frontend/src/api/vector-store.ts) gains min/max/enum/immutable. VectorStoreSettings.vue is now schema-driven: a closed `enum` renders a t-select; number inputs use the schema's `:min`/`:max` and fall back to the legacy replica-vs-shard heuristic only when the schema does not pin them; a danger-coloured warning fires when insecure_skip_verify is toggled on (the switch and warning are wrapped in a vertical stack so the warning sits on its own row below the switch).
- i18n: labels for hnsw_m / hnsw_ef_construction / hnsw_ef_search / knn_engine / insecure_skip_verify plus the warning copy in en-US, ko-KR, zh-CN, ru-RU.
- docker-compose.dev.yml: an opensearch profile (single-node 3.3.2 with security plugin disabled for dev only). OpenSearch Dashboards lives in a separate, opt-in opensearch-ui profile so the heavy UI container is not forced up alongside the cluster (the driver e2e is fully curl-verifiable against :9200). The new docs/dev/opensearch-integration-test.md covers the end-to-end exercise and the single-node guidance (set replicas=0 to keep the cluster Green).

Gating-guard tests flipped
- The "OpenSearch is NOT in validEngineTypes / mapping / types list / env builder / stubs" guard tests from PR 1 / PR 2 are replaced by their positive counterparts in this PR. The test suite was the activation checklist; the activation flip is its diff.

Backward compatibility
- Additive everywhere. IndexConfig's new HNSW fields are omitempty so other engines' serialised config is byte-identical. Existing Elasticsearch / Qdrant / Milvus / Weaviate / Doris / TencentVectorDB stores are untouched. No migrations.

Test plan
- go build ./... clean
- go vet ./... clean
- gofmt -l clean on touched files
- go test ./...: only TestOssEnsureBucket_CreateFails (Aliyun OSS endpoint), the docreader gRPC tests, and the doris SQL-shape tests fail; all three are pre-existing on upstream/main and untouched by this PR.
- New tests across internal/types, opensearch, service and container, including a full end-to-end env-path test that exercises initRetrieveEngineRegistry with RETRIEVE_DRIVER=opensearch against an httptest cluster.
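
The HNSW caps listed under service-layer validation (m 2-100, ef_construction 2-4096, ef_search 1-10000, knn_engine ∈ lucene|faiss) can be illustrated with a small standalone sketch. This is not the repository's validateOpenSearchIndexConfig, whose signature is not shown here: the `hnswConfig` struct, `inRange`, and `validateHNSW` are hypothetical names, and treating a zero or empty value as "unset" is an assumption drawn from the fields being omitempty.

```go
package main

import "fmt"

// hnswConfig is a hypothetical stand-in for the four HNSW fields that
// IndexConfig gains; the real field types are not shown in the commit.
type hnswConfig struct {
	M              int
	EFConstruction int
	EFSearch       int
	KNNEngine      string
}

// inRange returns an error when v is set (non-zero) and outside [lo, hi].
// Assumption: zero means "unset", so the engine default applies.
func inRange(name string, v, lo, hi int) error {
	if v == 0 {
		return nil
	}
	if v < lo || v > hi {
		return fmt.Errorf("%s must be between %d and %d, got %d", name, lo, hi, v)
	}
	return nil
}

// validateHNSW applies the caps quoted in the commit message and returns
// the first violation it finds.
func validateHNSW(c hnswConfig) error {
	checks := []error{
		inRange("hnsw_m", c.M, 2, 100),
		inRange("hnsw_ef_construction", c.EFConstruction, 2, 4096),
		inRange("hnsw_ef_search", c.EFSearch, 1, 10000),
	}
	for _, err := range checks {
		if err != nil {
			return err
		}
	}
	switch c.KNNEngine {
	case "", "lucene", "faiss": // empty is treated as unset (assumption)
		return nil
	default:
		return fmt.Errorf("knn_engine must be lucene or faiss, got %q", c.KNNEngine)
	}
}

func main() {
	ok := hnswConfig{M: 16, EFConstruction: 128, EFSearch: 100, KNNEngine: "faiss"}
	fmt.Println(validateHNSW(ok))
	fmt.Println(validateHNSW(hnswConfig{M: 101}))
}
```

Returning the first violation keeps the error message short; a real implementation might prefer to collect every violation so the settings form can flag all invalid fields at once.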
331 lines
12 KiB
Go
package service

import (
	"context"
	"database/sql"
	"encoding/json"
	"fmt"
	"io"
	"net"
	"net/http"
	"strings"
	"time"

	openSearchRepo "github.com/Tencent/WeKnora/internal/application/repository/retriever/opensearch"
	"github.com/Tencent/WeKnora/internal/errors"
	"github.com/Tencent/WeKnora/internal/logger"
	"github.com/Tencent/WeKnora/internal/types"
	"github.com/go-sql-driver/mysql" // MySQL driver for database/sql, used by Doris connection test
	_ "github.com/jackc/pgx/v5/stdlib" // pgx driver for database/sql
	"github.com/qdrant/go-client/qdrant"
	"github.com/tencent/vectordatabase-sdk-go/tcvectordb"
	"github.com/weaviate/weaviate-go-client/v5/weaviate"
	"github.com/weaviate/weaviate-go-client/v5/weaviate/auth"
	wgrpc "github.com/weaviate/weaviate-go-client/v5/weaviate/grpc"
)

const connectionTestTimeout = 10 * time.Second

// TestConnection tests connectivity to a vector database.
// Returns the detected server version on success (e.g., "7.10.1"), empty string if unknown.
func (s *vectorStoreService) TestConnection(
	ctx context.Context,
	engineType types.RetrieverEngineType,
	config types.ConnectionConfig,
) (string, error) {
	switch engineType {
	case types.ElasticsearchRetrieverEngineType:
		return testElasticsearchConnection(ctx, config)
	case types.PostgresRetrieverEngineType:
		return testPostgresConnection(ctx, config)
	case types.QdrantRetrieverEngineType:
		return testQdrantConnection(ctx, config)
	case types.MilvusRetrieverEngineType:
		return testMilvusConnection(ctx, config)
	case types.TencentVectorDBRetrieverEngineType:
		return testTencentVectorDBConnection(ctx, config)
	case types.WeaviateRetrieverEngineType:
		return testWeaviateConnection(ctx, config)
	case types.DorisRetrieverEngineType:
		return testDorisConnection(ctx, config)
	case types.OpenSearchRetrieverEngineType:
		return testOpenSearchConnection(ctx, config)
	case types.SQLiteRetrieverEngineType:
		// SQLite is file-based, no remote connection to test
		return "", nil
	default:
		return "", errors.NewBadRequestError(
			fmt.Sprintf("connection test not supported for engine type: %s", engineType))
	}
}

func testElasticsearchConnection(ctx context.Context, config types.ConnectionConfig) (string, error) {
	// Use plain HTTP GET to the root endpoint instead of the go-elasticsearch SDK.
	// The v8 SDK's TypedClient performs a product check that rejects ES7 servers,
	// so we use a raw HTTP request to support both v7 and v8.
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, config.Addr, nil)
	if err != nil {
		return "", errors.NewBadRequestError("failed to create elasticsearch request: invalid address")
	}
	if config.Username != "" {
		req.SetBasicAuth(config.Username, config.Password)
	}

	client := &http.Client{Timeout: connectionTestTimeout}
	resp, err := client.Do(req)
	if err != nil {
		logger.Warnf(ctx, "Elasticsearch connection test failed: %v", err)
		return "", errors.NewBadRequestError("failed to connect to elasticsearch: connection refused or authentication failed")
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		logger.Warnf(ctx, "Elasticsearch connection test returned status %d", resp.StatusCode)
		return "", errors.NewBadRequestError("failed to connect to elasticsearch: authentication failed or server error")
	}

	// Parse version from response: {"version": {"number": "7.10.1"}, ...}
	body, err := io.ReadAll(io.LimitReader(resp.Body, 4096))
	if err != nil {
		return "", nil // connected but version unknown
	}

	var esInfo struct {
		Version struct {
			Number string `json:"number"`
		} `json:"version"`
	}
	if err := json.Unmarshal(body, &esInfo); err != nil {
		return "", nil // connected but version unparseable
	}

	return esInfo.Version.Number, nil
}

func testPostgresConnection(ctx context.Context, config types.ConnectionConfig) (string, error) {
	testCtx, cancel := context.WithTimeout(ctx, connectionTestTimeout)
	defer cancel()

	if config.UseDefaultConnection {
		// Using the default app DB connection; always reachable if the app is running.
		// Cannot query version without a DB handle; return empty.
		return "", nil
	}

	db, err := sql.Open("pgx", config.Addr)
	if err != nil {
		return "", errors.NewBadRequestError("failed to create postgres connection: invalid configuration")
	}
	defer db.Close()

	if err := db.PingContext(testCtx); err != nil {
		logger.Warnf(ctx, "Postgres connection test failed: %v", err)
		return "", errors.NewBadRequestError("failed to connect to postgres: connection refused or authentication failed")
	}

	// Detect version
	var version string
	if err := db.QueryRowContext(testCtx, "SHOW server_version").Scan(&version); err != nil {
		logger.Warnf(ctx, "Postgres version detection failed: %v", err)
		return "", nil // connected but version unknown
	}

	return version, nil
}

func testQdrantConnection(ctx context.Context, config types.ConnectionConfig) (string, error) {
	testCtx, cancel := context.WithTimeout(ctx, connectionTestTimeout)
	defer cancel()

	port := config.Port
	if port == 0 {
		port = 6334
	}

	client, err := qdrant.NewClient(&qdrant.Config{
		Host:   config.Host,
		Port:   port,
		APIKey: config.APIKey,
		UseTLS: config.UseTLS,
	})
	if err != nil {
		return "", errors.NewBadRequestError("failed to create qdrant client: invalid configuration")
	}
	defer client.Close()

	result, err := client.HealthCheck(testCtx)
	if err != nil {
		logger.Warnf(ctx, "Qdrant connection test failed: %v", err)
		return "", errors.NewBadRequestError("failed to connect to qdrant: connection refused or authentication failed")
	}

	return result.GetVersion(), nil
}

func testMilvusConnection(ctx context.Context, config types.ConnectionConfig) (string, error) {
	// Use TCP dial instead of the Milvus SDK to avoid protobuf namespace conflict
	// between milvus-proto and qdrant-client (both register "common.proto").
	// A TCP dial is sufficient for connectivity verification; the Milvus SDK client
	// creation in container.go (PR 3) will validate full gRPC connectivity.
	// Version detection is not possible with TCP dial alone.
	testCtx, cancel := context.WithTimeout(ctx, connectionTestTimeout)
	defer cancel()

	addr := config.Addr
	if addr == "" {
		addr = "localhost:19530"
	}

	conn, err := (&net.Dialer{}).DialContext(testCtx, "tcp", addr)
	if err != nil {
		logger.Warnf(ctx, "Milvus connection test failed: %v", err)
		return "", errors.NewBadRequestError("failed to connect to milvus: connection refused or server unreachable")
	}
	defer conn.Close()

	return "", nil
}

func testTencentVectorDBConnection(ctx context.Context, config types.ConnectionConfig) (string, error) {
	testCtx, cancel := context.WithTimeout(ctx, connectionTestTimeout)
	defer cancel()

	client, err := tcvectordb.NewRpcClient(config.Addr, config.Username, config.APIKey, &tcvectordb.ClientOption{
		ReadConsistency: tcvectordb.EventualConsistency,
		Timeout:         connectionTestTimeout,
	})
	if err != nil {
		logger.Warnf(ctx, "Tencent VectorDB connection test failed: %v", err)
		return "", errors.NewBadRequestError("failed to connect to tencent vectordb: connection refused or authentication failed")
	}
	defer client.Close()

	if _, err := client.ListDatabase(testCtx); err != nil {
		logger.Warnf(ctx, "Tencent VectorDB list database failed: %v", err)
		return "", errors.NewBadRequestError("failed to connect to tencent vectordb: authentication failed or server error")
	}
	return "", nil
}

func testWeaviateConnection(ctx context.Context, config types.ConnectionConfig) (string, error) {
	testCtx, cancel := context.WithTimeout(ctx, connectionTestTimeout)
	defer cancel()

	host := config.Host
	if host == "" {
		host = "weaviate:8080"
	}
	grpcAddress := config.GrpcAddress
	if grpcAddress == "" {
		grpcAddress = "weaviate:50051"
	}
	scheme := config.Scheme
	if scheme == "" {
		scheme = "http"
	}

	weaviateCfg := weaviate.Config{
		Host: host,
		GrpcConfig: &wgrpc.Config{
			Host: grpcAddress,
		},
		Scheme: scheme,
	}
	if config.APIKey != "" {
		weaviateCfg.AuthConfig = auth.ApiKey{Value: config.APIKey}
	}

	// Weaviate Go client v5 does not expose a Close() method;
	// it uses HTTP + gRPC transports that are managed internally.
	client, err := weaviate.NewClient(weaviateCfg)
	if err != nil {
		logger.Warnf(ctx, "Weaviate connection test failed: %v", err)
		return "", errors.NewBadRequestError("failed to create weaviate client: invalid configuration")
	}

	isReady, err := client.Misc().ReadyChecker().Do(testCtx)
	if err != nil || !isReady {
		logger.Warnf(ctx, "Weaviate connection test failed: ready=%v, err=%v", isReady, err)
		return "", errors.NewBadRequestError("failed to connect to weaviate: server not ready or authentication failed")
	}

	// Detect version via /v1/meta
	meta, err := client.Misc().MetaGetter().Do(testCtx)
	if err != nil || meta == nil {
		return "", nil // connected but version unknown
	}

	return meta.Version, nil
}

// testDorisConnection pings the Doris FE over the MySQL protocol
// (database/sql + go-sql-driver) and queries @@version.
//
// Doris reports @@version in the form "5.7.99 Doris-4.1.0": the first part is
// a MySQL protocol compatibility string, and the real version follows "Doris-".
// Only the bare version (e.g. "4.1.0") is returned, matching the format of the
// Postgres/ES paths.
func testDorisConnection(ctx context.Context, config types.ConnectionConfig) (string, error) {
	testCtx, cancel := context.WithTimeout(ctx, connectionTestTimeout)
	defer cancel()

	if config.Addr == "" {
		return "", errors.NewBadRequestError("failed to create doris connection: addr is required")
	}

	// Database is optional; when none is given, ping against information_schema,
	// which every MySQL-compatible server provides.
	database := config.Database
	if database == "" {
		database = "information_schema"
	}

	// Build the DSN with mysql.Config.FormatDSN() so that special characters such
	// as `@`, `:`, and `/` in the username or password cannot break literal
	// concatenation (fmt.Sprintf gets this wrong; see issues like #1234).
	cfg := mysql.NewConfig()
	cfg.User = config.Username
	cfg.Passwd = config.Password
	cfg.Net = "tcp"
	cfg.Addr = config.Addr
	cfg.DBName = database
	cfg.Timeout = 5 * time.Second
	db, err := sql.Open("mysql", cfg.FormatDSN())
	if err != nil {
		return "", errors.NewBadRequestError("failed to create doris connection: invalid configuration")
	}
	defer db.Close()

	if err := db.PingContext(testCtx); err != nil {
		logger.Warnf(ctx, "Doris connection test failed: %v", err)
		return "", errors.NewBadRequestError("failed to connect to doris: connection refused or authentication failed")
	}

	var version string
	if err := db.QueryRowContext(testCtx, "SELECT @@version").Scan(&version); err != nil {
		logger.Warnf(ctx, "Doris version detection failed: %v", err)
		return "", nil
	}
	if i := strings.Index(version, "Doris-"); i >= 0 {
		return strings.TrimSpace(version[i+len("Doris-"):]), nil
	}
	return version, nil
}

// testOpenSearchConnection verifies the cluster is reachable, runs a
// supported OpenSearch version, and has the k-NN plugin installed. The driver
// owns the probe logic; a generic message is returned on failure so cluster
// internals are not surfaced to the API caller.
func testOpenSearchConnection(ctx context.Context, config types.ConnectionConfig) (string, error) {
	if config.Addr == "" {
		return "", errors.NewBadRequestError("failed to create opensearch connection: addr is required")
	}
	testCtx, cancel := context.WithTimeout(ctx, connectionTestTimeout)
	defer cancel()
	if err := openSearchRepo.TestConnection(testCtx, &config); err != nil {
		logger.Warnf(ctx, "OpenSearch connection test failed: %v", err)
		return "", errors.NewBadRequestError(
			"failed to connect to opensearch: check address, credentials, version (>= 2.4), and that the k-NN plugin is installed")
	}
	// Version is detected during the probe but not surfaced here; lazy index
	// creation re-validates on first use.
	return "", nil
}