mirror of
https://github.com/Tencent/WeKnora.git
synced 2026-06-04 13:30:32 +08:00
Post-review polish on the v0.7 wire / surface contract. Bundles five
follow-ups that landed after the main BREAKING feat commit:
1. Complete context→profile cascade (internal API + YAML schema)
The prior commit renamed only the user-visible surface (commands /
flags / env / project link / envelope field). The internal Go API
and on-disk config schema were still half-renamed — an L-25
self-consistency violation flagged by post-merge review. Closed here:
Internal Go API:
- config.Context → config.Profile
- config.Config.CurrentContext → CurrentProfile
- config.Config.Contexts → Profiles
- LoginOptions.Context → LoginOptions.Profile
- clearContextSecrets() → clearProfileSecrets()
- saveContextRef() → saveProfileRef()
- secrets.Store: param name `context` → `profile` (interface +
FileStore + KeyringStore + MemStore)
- cmdutil.LoadSecret(store, context, key) → LoadSecret(store, profile, key)
- cmdutil.RefreshAndPersist's ctxName → profileName
- Local var `ctx := &config.Profile{...}` → `prof := &config.Profile{...}`
in auth/login.go to eliminate the visual collision with Go stdlib
context.Context that motivated the whole rename in the first place.
On-disk config.yaml schema:
- current_context: → current_profile:
- contexts: → profiles:
- Pre-1.0 break, no compat alias. Users with dogfooded v0.6 configs
must either delete ~/.config/weknora/config.yaml or hand-rename the
two keys above (CHANGELOG migration note added).
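For anyone hand-renaming rather than deleting, the two-key rename can be sketched as below. This is an illustrative helper, not part of the commit: `renameTopLevelKeys` is a hypothetical name, the nested layout under `contexts:` in the sample is assumed, and the approach is a naive line-prefix rewrite that only touches keys starting at column 0.

```go
package main

import (
	"fmt"
	"strings"
)

// renameTopLevelKeys rewrites the two renamed top-level keys of a v0.6-style
// config document. Only lines that begin at column 0 with an old key are
// changed, so nested keys and values are left untouched.
func renameTopLevelKeys(yamlText string) string {
	renames := map[string]string{
		"current_context:": "current_profile:",
		"contexts:":        "profiles:",
	}
	lines := strings.Split(yamlText, "\n")
	for i, line := range lines {
		for oldKey, newKey := range renames {
			if strings.HasPrefix(line, oldKey) {
				lines[i] = newKey + strings.TrimPrefix(line, oldKey)
				break
			}
		}
	}
	return strings.Join(lines, "\n")
}

func main() {
	// Sample input; the fields under "prod" are made up for illustration.
	old := "current_context: prod\ncontexts:\n  prod:\n    host: https://example.invalid\n"
	fmt.Print(renameTopLevelKeys(old))
}
```

A real migration would more safely go through a YAML parser; the line-based form is shown only because the change is limited to two top-level keys.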
Tests / fixtures / golden files:
- factory_test.go YAML fixture + assertion updated.
- acceptance/e2e/e2e_test.go writeContextYAML → writeProfileYAML,
fixture YAML keys updated.
- acceptance/testdata/wire/doctor.error_network.json golden updated
("active context" → "active profile" in hint string).
User-visible prose sweep:
- cmd/mcp/serve.go --help Long: "active context (or --context)" →
"active profile (or --profile)" — most-visible miss.
- cmd/{kb/list, search/kb, session/list, api/api} Short/Long help.
- cmd/auth/login.go stdout: `(context=%s)` → `(profile=%s)`.
- cmd/auth/logout.go error: `"no current context"` → `"no current profile"`.
- cmd/doctor/doctor.go hint string (also the wire golden above).
- cmd/auth/refresh.go error: `"refresh token missing for context"` →
`"refresh token missing for profile"`.
- README.md: `## Multi-context` H2 → `## Multi-profile`; code-block
comment `# current context` → `# current profile`.
Code-comment / docstring sweep across cli/cmd/auth/ and
cli/internal/cmdutil/. Comments referencing Go stdlib context.Context,
the RAG / LLM "context window" concept, and historical CHANGELOG
entries for v0.4 / v0.5 were left alone.
CHANGELOG v0.7 BREAKING entry gains the on-disk-schema bullet under
the existing "context → profile" item.
2. Profile name validation (shell-injection guard)
`envelope.error.retry_command` is a single shell-string field. An
AI agent that exec()s it via `sh -c <retry_command>` was injectable
through a maliciously-named profile:
    weknora auth logout --name 'x; rm -rf ~'
    # would produce: retry_command = "weknora auth logout --name x; rm -rf ~ -y"
`cmd/profile/add.go` already enforced an alphanumeric + `-_.`
allowlist via `validateName`. The `auth login` and `auth logout`
paths bypassed it.
- Moved validation from `cmd/profile/add.go` to
`cli/internal/cmdutil/profilename.go` as exported
`ValidateProfileName` (cmdutil is the import-cycle-safe home;
internal/config can't depend on cmdutil).
- `auth login` runs the validator before any persist call.
- `auth logout` runs the validator on `opts.Name` before
constructing `retry_command`.
- Unit tests (`profilename_test.go`) cover the allowlist, empty
rejection, path-traversal, shell metacharacters (`;`, `&`, `|`,
`$()`, backticks, quotes, whitespace, glob, redirects), and the
user-facing hint text. The shell-metacharacter cases are a
regression guard for the `retry_command` injection shown above.
Wire shape (`retry_command` string → `retry_command_argv []string`)
remains a v0.8 additive change per ROADMAP — this fix removes the
practical exploit path without touching the wire contract.
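The allowlist check described in this item can be sketched as follows. This is a minimal illustration, not the code in `cli/internal/cmdutil/profilename.go`: `validateProfileNameSketch`, the exact regular expression, and the error wording are assumptions, and the real `ValidateProfileName` may apply further rules.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// profileNamePattern is an assumed allowlist: ASCII letters, digits, '-', '_'
// and '.', with a leading letter or digit so that "." and ".." cannot pass.
var profileNamePattern = regexp.MustCompile(`^[A-Za-z0-9][A-Za-z0-9._-]*$`)

// validateProfileNameSketch is a hypothetical stand-in for ValidateProfileName.
// It rejects empty names and anything outside the allowlist, which covers
// shell metacharacters, whitespace, quotes and path separators.
func validateProfileNameSketch(name string) error {
	if name == "" {
		return errors.New("profile name must not be empty")
	}
	if !profileNamePattern.MatchString(name) {
		return fmt.Errorf("invalid profile name %q: use letters, digits, '-', '_' or '.'", name)
	}
	return nil
}

func main() {
	for _, name := range []string{"prod", "team-a.eu_1", "", "../etc", "x; rm -rf ~", "$(id)"} {
		fmt.Printf("%q -> %v\n", name, validateProfileNameSketch(name))
	}
}
```

Running a check of this kind before the name is interpolated into `retry_command` is what keeps the single shell string safe to pass to `sh -c` until the argv form lands.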
3. AI-agent terminology disambiguation
"agent" has three referents in this codebase: (a) WeKnora's
server-side Custom Agent resource, (b) the removed `agent invoke`
verb, (c) external LLM/automation consumers. Per project memory
feedback_no_meta_disambiguation_in_docs, the fix is full-term
naming, not "X has N meanings" prose. Surgical changes at section
headers + ambiguous prose:
- AGENTS.md: "Agent decision shortcuts" → "AI agent decision
shortcuts"; "agent-callable surface" → "AI-agent-callable
surface".
- README.md: "Designed to be agent-first" → "AI-agent-first";
"Other agent ergonomics" → "Other AI-agent ergonomics"; "in
agent contexts" → "in AI-agent contexts"; "for CI / agents" →
"for CI / AI agents".
Anaphoric "agents" inside paragraphs that already established
"AI agents" was left alone — full substitution everywhere would
have been prose noise without clarity gain.
4. Wire-contract review follow-ups
Real findings from a second-pass review of the v0.7 envelope /
streaming / surface design. Per project memory
feedback_check_in_domain_anchor_first, candidate findings were
first verified against the in-domain peer CLI explicitly cited as
the envelope anchor; two earlier-flagged issues turned out to be
in-pattern and were withdrawn.
Surviving fixes:
- AGENTS.md success-envelope example rewritten. The prior example
showed `has_more: false` / `_notice: {}` as if they were always
present, but both fields are `omitempty` and never serialize
when zero / nil. Replaced with three realistic shapes (list /
single resource / mutation with no payload) and added a note
that optional fields are omitted when empty.
- cmd/chat/chat.go Args: MinimumNArgs(1) → ExactArgs(1).
v0.6 silently joined `weknora chat hello world` into
`"hello world"`. v0.7 now rejects multi-arg with exit 2,
matching `weknora session ask`. BREAKING; CHANGELOG entry
added under v0.7 BREAKING.
- internal/output/envelope.go extracts NewEnvelope(data, meta,
profile) constructor. The jq-filter path in
cmdutil.FormatOptions.Emit was manually rebuilding the
envelope literal alongside the canonical WriteEnvelope path —
drift risk when fields are added. Single construction point now.
- internal/cmdutil/factory.go adds AddKBFlag(cmd) helper.
Five files (chat, doc/list, doc/upload, doc/create, doc/fetch)
had verbatim-identical `cmd.Flags().String("kb", ...)`
declarations. Centralised so flag name + help text stay
in sync with Factory.ResolveKB. Docstring reordering + gofmt
fixup landed in the same edit to keep ResolveKB's own godoc
attached to its function.
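The `omitempty` behaviour behind the AGENTS.md example fix in this item can be reproduced with the standard library alone. The struct below is a hypothetical, trimmed-down envelope: only the `has_more` and `_notice` field names come from this commit message, while `envelopeSketch`, `encode`, and the placement of the fields are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelopeSketch is an assumed, simplified success envelope. Both optional
// fields carry omitempty, so a false bool and a nil or empty map are dropped.
type envelopeSketch struct {
	OK      bool              `json:"ok"`
	Data    any               `json:"data"`
	HasMore bool              `json:"has_more,omitempty"`
	Notice  map[string]string `json:"_notice,omitempty"`
}

// encode marshals the envelope and returns the JSON text.
func encode(e envelopeSketch) string {
	b, err := json.Marshal(e)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Zero-valued optional fields never reach the wire.
	fmt.Println(encode(envelopeSketch{OK: true, Data: []string{"kb_1"}}))
	// An empty (non-nil) map is omitted too; only has_more=true is emitted.
	fmt.Println(encode(envelopeSketch{
		OK:      true,
		Data:    []string{"kb_1"},
		HasMore: true,
		Notice:  map[string]string{},
	}))
}
```

Consumers therefore need to treat a missing `has_more` as false and a missing `_notice` as empty, rather than expecting the keys to be present.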
5. OSS-readiness comment / doc sweep
Pre-publication scrub of code, comments, and shipped Markdown to
remove references that only make sense in the development repo:
- AGENTS.md "Deliberate deviations + mainstream alignments"
section: removed peer-project name-drops from the comparison
table; rewrote as five flagged design decisions with rationale
but no specific competitor named. The four rows that previously
contrasted against a named peer CLI now state WeKnora's choice
+ rationale directly. Section header renamed to "Design
decisions worth flagging" since it is no longer a
deviation/alignment matrix.
- CHANGELOG v0.7 BREAKING rationales: three references to a
named peer CLI removed; the context→profile rationale now
cites only mainstream multi-credential CLIs by category (AWS /
Stripe / OpenAI / Anthropic), and the `api -d/--data` removal
rationale cites only `gh api` / `curl`. `chat` BREAKING entry
rationale similarly simplified.
- 35 cross-references to design-spec section numbers (§4.1 /
§4.5 / §5.3 etc.) removed from Go doc comments and test
comments across 13 files. The referenced spec lives outside
the shipped tree; readers of the public repo cannot resolve
them. Each reference replaced with a self-contained semantic
description (e.g. "the batch envelope" / "AGENTS.md section
on the success path").
- Mixed-language strings translated to English:
- Four Go comments: internal/cmdutil/exit.go:213,215,
internal/cmdutil/errors.go:156,
internal/output/batch_test.go:90,
internal/output/envelope_test.go:27.
- One CHANGELOG section title:
`v0.7 — Agent-first wire contract + 命令面集中清理` →
`... + command-surface cleanup`.
- CJK test fixtures (internal/text/truncate_test.go CJK
truncation cases, cmd/session/list_test.go Chinese session
title, acceptance/e2e/e2e_test.go Chinese RAG corpus)
retained — they are intentional test inputs, not stray prose.
- Makefile help comment: `golangci-lint added in PR-9` →
`golangci-lint planned`. Internal PR numbering should not
surface in shipped Makefile prose.
Build green, 28/28 packages, +5 new ValidateProfileName tests.
go vet / gofmt / go mod verify / go mod tidy all clean.
Rationale for the cascade: pre-1.0 is the cheapest moment to close
L-25 self-consistency (L-26). The half-finished internal rename
would have perpetuated the very `context` vs `context.Context`
ambiguity that motivated v0.7's user-visible rename in the first
place.
271 lines
9.2 KiB
Go
package doc

import (
	"context"
	"encoding/json"
	"errors"
	"os"
	"path/filepath"
	"sort"
	"strings"
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"

	"github.com/Tencent/WeKnora/cli/internal/cmdutil"
	"github.com/Tencent/WeKnora/cli/internal/iostreams"
	sdk "github.com/Tencent/WeKnora/client"
)

// scriptedUploadSvc records every CreateKnowledgeFromFile call and returns
// per-path scripted results.
type scriptedUploadSvc struct {
	results map[string]struct {
		k   *sdk.Knowledge
		err error
	}
	called []string

	// Captures from the most-recent call (every recursive iteration writes
	// these; tests that want all-rows can extend to slices later).
	lastMetadata         map[string]string
	lastEnableMultimodel *bool
	lastChannel          string
}

func (s *scriptedUploadSvc) CreateKnowledgeFromFile(
	_ context.Context,
	_, filePath string,
	metadata map[string]string,
	enableMultimodel *bool,
	_, channel string,
) (*sdk.Knowledge, error) {
	s.called = append(s.called, filepath.Base(filePath))
	s.lastMetadata = metadata
	s.lastEnableMultimodel = enableMultimodel
	s.lastChannel = channel
	r, ok := s.results[filepath.Base(filePath)]
	if !ok {
		return &sdk.Knowledge{ID: "doc_" + filepath.Base(filePath), FileName: filepath.Base(filePath)}, nil
	}
	return r.k, r.err
}

func mkTree(t *testing.T, base string, names ...string) {
	t.Helper()
	for _, n := range names {
		full := filepath.Join(base, n)
		require.NoError(t, os.MkdirAll(filepath.Dir(full), 0o755))
		require.NoError(t, os.WriteFile(full, []byte("x"), 0o644))
	}
}

func TestUploadRecursive_WalksAllFiles(t *testing.T) {
	out, _ := iostreams.SetForTest(t)
	dir := t.TempDir()
	mkTree(t, dir, "a.pdf", "b.pdf", "sub/c.pdf")

	svc := &scriptedUploadSvc{}
	opts := &UploadOptions{Recursive: true, Glob: "*"}
	require.NoError(t, runUploadRecursive(context.Background(), opts, &cmdutil.FormatOptions{Mode: cmdutil.FormatText}, svc, "kb_xxx", dir))

	sort.Strings(svc.called)
	assert.Equal(t, []string{"a.pdf", "b.pdf", "c.pdf"}, svc.called)
	got := out.String()
	for _, w := range []string{"a.pdf", "b.pdf", "c.pdf", "Uploaded 3"} {
		assert.Contains(t, got, w)
	}
}

func TestUploadRecursive_GlobFilter(t *testing.T) {
	_, _ = iostreams.SetForTest(t)
	dir := t.TempDir()
	mkTree(t, dir, "doc.pdf", "ignore.txt", "sub/keep.pdf", "sub/also-ignore.md")

	svc := &scriptedUploadSvc{}
	opts := &UploadOptions{Recursive: true, Glob: "*.pdf"}
	require.NoError(t, runUploadRecursive(context.Background(), opts, &cmdutil.FormatOptions{Mode: cmdutil.FormatText}, svc, "kb_xxx", dir))

	sort.Strings(svc.called)
	assert.Equal(t, []string{"doc.pdf", "keep.pdf"}, svc.called)
}

func TestUploadRecursive_PartialFailure_Exits1(t *testing.T) {
	out, _ := iostreams.SetForTest(t)
	dir := t.TempDir()
	mkTree(t, dir, "ok.pdf", "bad.pdf")

	svc := &scriptedUploadSvc{results: map[string]struct {
		k   *sdk.Knowledge
		err error
	}{
		"bad.pdf": {err: errors.New("HTTP error 500: internal")},
	}}
	opts := &UploadOptions{Recursive: true, Glob: "*"}
	err := runUploadRecursive(context.Background(), opts, &cmdutil.FormatOptions{Mode: cmdutil.FormatText}, svc, "kb_xxx", dir)
	require.Error(t, err)

	var typed *cmdutil.Error
	require.ErrorAs(t, err, &typed)
	// CodeServerError preserves the 500 classification of the underlying
	// SDK error - the recursive wrapper just aggregates.
	assert.Equal(t, cmdutil.CodeServerError, typed.Code)

	got := out.String()
	assert.Contains(t, got, "OK") // ok.pdf still succeeded
	assert.Contains(t, got, "FAIL")
	assert.Contains(t, got, "Uploaded 1")
	assert.Contains(t, got, "Failed 1")
}

func TestUploadRecursive_NoMatches(t *testing.T) {
	out, _ := iostreams.SetForTest(t)
	dir := t.TempDir()
	mkTree(t, dir, "only.txt")

	svc := &scriptedUploadSvc{}
	opts := &UploadOptions{Recursive: true, Glob: "*.pdf"}
	require.NoError(t, runUploadRecursive(context.Background(), opts, &cmdutil.FormatOptions{Mode: cmdutil.FormatText}, svc, "kb_xxx", dir))
	assert.Len(t, svc.called, 0)
	assert.Contains(t, strings.ToLower(out.String()), "no files matched")
}

func TestUploadRecursive_NotADirectory(t *testing.T) {
	_, _ = iostreams.SetForTest(t)
	path := writeTempFile(t, "single.pdf")
	svc := &scriptedUploadSvc{}
	err := runUploadRecursive(context.Background(), &UploadOptions{Recursive: true, Glob: "*"}, &cmdutil.FormatOptions{Mode: cmdutil.FormatText}, svc, "kb_xxx", path)
	require.Error(t, err)
	var typed *cmdutil.Error
	require.ErrorAs(t, err, &typed)
	assert.Equal(t, cmdutil.CodeInputInvalidArgument, typed.Code)
	assert.Contains(t, typed.Message, "directory")
}

func TestUploadRecursive_RejectsNameFlag(t *testing.T) {
	_, _ = iostreams.SetForTest(t)
	dir := t.TempDir()
	mkTree(t, dir, "a.pdf")
	svc := &scriptedUploadSvc{}
	opts := &UploadOptions{Recursive: true, Glob: "*", Name: "single-name.pdf"}
	err := runUploadRecursive(context.Background(), opts, &cmdutil.FormatOptions{Mode: cmdutil.FormatText}, svc, "kb_xxx", dir)
	require.Error(t, err)
	var typed *cmdutil.Error
	require.ErrorAs(t, err, &typed)
	assert.Equal(t, cmdutil.CodeInputInvalidArgument, typed.Code)
	assert.Contains(t, typed.Message, "--name")
}

func TestUploadRecursive_PropagatesMultimodelAndMetadata(t *testing.T) {
	_, _ = iostreams.SetForTest(t)
	dir := t.TempDir()
	mkTree(t, dir, "a.pdf")

	svc := &scriptedUploadSvc{}
	mm := true
	opts := &UploadOptions{
		Recursive:        true,
		Glob:             "*",
		EnableMultimodel: &mm,
		Metadata:         []string{"team=alpha"},
		Channel:          "browser_extension",
	}
	require.NoError(t, runUploadRecursive(context.Background(), opts, &cmdutil.FormatOptions{Mode: cmdutil.FormatText}, svc, "kb_xxx", dir))

	require.NotNil(t, svc.lastEnableMultimodel)
	assert.True(t, *svc.lastEnableMultimodel)
	assert.Equal(t, map[string]string{"team": "alpha"}, svc.lastMetadata)
	assert.Equal(t, "browser_extension", svc.lastChannel)
}

func TestUploadRecursive_MetadataInvalid_NoCalls(t *testing.T) {
	_, _ = iostreams.SetForTest(t)
	dir := t.TempDir()
	mkTree(t, dir, "a.pdf")

	svc := &scriptedUploadSvc{}
	opts := &UploadOptions{Recursive: true, Glob: "*", Metadata: []string{"badformat"}}
	err := runUploadRecursive(context.Background(), opts, &cmdutil.FormatOptions{Mode: cmdutil.FormatText}, svc, "kb_xxx", dir)
	require.Error(t, err)
	var typed *cmdutil.Error
	require.ErrorAs(t, err, &typed)
	assert.Equal(t, cmdutil.CodeInputInvalidArgument, typed.Code)
	assert.Empty(t, svc.called, "must fail before any per-file call")
}

// TestUploadRecursive_JSON_BatchEnvelope verifies that --format json emits the
// batch envelope shape: {ok, data:[{id,ok,result?|error?}...], meta:{count,successes,failures}}.
// The per-item id is the file path; result carries {id, name} from the server.
func TestUploadRecursive_JSON_BatchEnvelope(t *testing.T) {
	out, _ := iostreams.SetForTest(t)
	dir := t.TempDir()
	mkTree(t, dir, "ok.pdf", "bad.pdf")

	svc := &scriptedUploadSvc{results: map[string]struct {
		k   *sdk.Knowledge
		err error
	}{
		"bad.pdf": {err: errors.New("HTTP error 500: internal")},
	}}
	opts := &UploadOptions{Recursive: true, Glob: "*"}
	err := runUploadRecursive(context.Background(), opts, &cmdutil.FormatOptions{Mode: cmdutil.FormatJSON}, svc, "kb_xxx", dir)
	require.Error(t, err) // partial failure → typed error

	var env struct {
		OK   bool `json:"ok"`
		Data []struct {
			ID     string `json:"id"`
			OK     bool   `json:"ok"`
			Result *struct {
				ID   string `json:"id"`
				Name string `json:"name"`
			} `json:"result,omitempty"`
			Error *struct {
				Type    string `json:"type"`
				Message string `json:"message"`
			} `json:"error,omitempty"`
		} `json:"data"`
		Meta struct {
			Count     int `json:"count"`
			Successes int `json:"successes"`
			Failures  int `json:"failures"`
		} `json:"meta"`
	}
	require.NoError(t, json.Unmarshal(out.Bytes(), &env), "must be valid JSON: %s", out.String())

	assert.False(t, env.OK, "top-level ok must be false when any item failed")
	assert.Equal(t, 2, env.Meta.Count)
	assert.Equal(t, 1, env.Meta.Successes)
	assert.Equal(t, 1, env.Meta.Failures)
	require.Len(t, env.Data, 2)

	// File paths are used as batch item ids; verify both files appear.
	ids := []string{env.Data[0].ID, env.Data[1].ID}
	assert.True(t, strings.Contains(ids[0], "ok.pdf") || strings.Contains(ids[1], "ok.pdf"), "ok.pdf must appear in batch data")
	assert.True(t, strings.Contains(ids[0], "bad.pdf") || strings.Contains(ids[1], "bad.pdf"), "bad.pdf must appear in batch data")

	// The success item must have a result with server id/name.
	for _, item := range env.Data {
		if strings.Contains(item.ID, "ok.pdf") {
			assert.True(t, item.OK)
			assert.NotNil(t, item.Result)
		} else {
			assert.False(t, item.OK)
			assert.NotNil(t, item.Error)
		}
	}

	// --format json must emit exactly ONE JSON document. Per-file "FAIL"/"OK"
	// progress lines belong on the human path; the typed error is Silent so
	// the root handler doesn't write anything additional to stdout.
	body := out.String()
	assert.NotContains(t, body, "FAIL ", "per-file plain lines must not appear under --format json")
	assert.NotContains(t, body, "OK ", "per-file plain lines must not appear under --format json")

	var typed *cmdutil.Error
	require.ErrorAs(t, err, &typed)
	assert.True(t, typed.Silent, "JSON-path partial failure must be Silent")
	assert.Equal(t, cmdutil.CodeServerError, typed.Code)
}