mirror of
https://github.com/Tencent/WeKnora.git
synced 2026-06-04 13:30:32 +08:00
Post-review polish on the v0.7 wire / surface contract. Bundles five
follow-ups that landed after the main BREAKING feat commit:
1. Complete context→profile cascade (internal API + YAML schema)
The prior commit renamed only the user-visible surface (commands /
flags / env / project link / envelope field). The internal Go API
and on-disk config schema were still half-renamed — an L-25
self-consistency violation flagged by post-merge review. Closed here:
Internal Go API:
- config.Context → config.Profile
- config.Config.CurrentContext → CurrentProfile
- config.Config.Contexts → Profiles
- LoginOptions.Context → LoginOptions.Profile
- clearContextSecrets() → clearProfileSecrets()
- saveContextRef() → saveProfileRef()
- secrets.Store: param name `context` → `profile` (interface +
FileStore + KeyringStore + MemStore)
- cmdutil.LoadSecret(store, context, key) → LoadSecret(store, profile, key)
- cmdutil.RefreshAndPersist's ctxName → profileName
- Local var `ctx := &config.Profile{...}` → `prof := &config.Profile{...}`
in auth/login.go to eliminate the visual collision with Go stdlib
context.Context that motivated the whole rename in the first place.
On-disk config.yaml schema:
- current_context: → current_profile:
- contexts: → profiles:
- Pre-1.0 break, no compat alias. Users on v0.6 dogfooded configs
must delete ~/.config/weknora/config.yaml or hand-rename the two
keys (CHANGELOG migration note added).
Tests / fixtures / golden files:
- factory_test.go YAML fixture + assertion updated.
- acceptance/e2e/e2e_test.go writeContextYAML → writeProfileYAML,
fixture YAML keys updated.
- acceptance/testdata/wire/doctor.error_network.json golden updated
("active context" → "active profile" in hint string).
User-visible prose sweep:
- cmd/mcp/serve.go --help Long: "active context (or --context)" →
"active profile (or --profile)" — most-visible miss.
- cmd/{kb/list, search/kb, session/list, api/api} Short/Long help.
- cmd/auth/login.go stdout: `(context=%s)` → `(profile=%s)`.
- cmd/auth/logout.go error: `"no current context"` → `"no current profile"`.
- cmd/doctor/doctor.go hint string (also the wire golden above).
- cmd/auth/refresh.go error: `"refresh token missing for context"` →
`"refresh token missing for profile"`.
- README.md: `## Multi-context` H2 → `## Multi-profile`; code-block
comment `# current context` → `# current profile`.
Code-comment / docstring sweep across cli/cmd/auth/ and
cli/internal/cmdutil/. Comments referencing Go stdlib context.Context,
the RAG / LLM "context window" concept, and historical CHANGELOG
entries for v0.4 / v0.5 were left alone.
CHANGELOG v0.7 BREAKING entry gains the on-disk-schema bullet under
the existing "context → profile" item.
2. Profile name validation (shell-injection guard)
`envelope.error.retry_command` is a single shell-string field. An
AI agent that exec()s it via `sh -c <retry_command>` was injectable
through a maliciously-named profile:
    weknora auth logout --name 'x; rm -rf ~'
    # would produce: retry_command = "weknora auth logout --name x; rm -rf ~ -y"
`cmd/profile/add.go` already enforced an alphanumeric + `-_.`
allowlist via `validateName`. The `auth login` and `auth logout`
paths bypassed it.
- Moved validation from `cmd/profile/add.go` to
`cli/internal/cmdutil/profilename.go` as exported
`ValidateProfileName` (cmdutil is the import-cycle-safe home;
internal/config can't depend on cmdutil).
- `auth login` runs the validator before any persist call.
- `auth logout` runs the validator on `opts.Name` before
constructing `retry_command`.
- Unit tests (`profilename_test.go`) cover the allowlist, empty
rejection, path-traversal, shell metacharacters (`;`, `&`, `|`,
`$()`, backticks, quotes, whitespace, glob, redirects), and the
user-facing hint text. The shell-metachar test exists as a
regression guard.
Wire shape (`retry_command` string → `retry_command_argv []string`)
remains a v0.8 additive change per ROADMAP — this fix removes the
practical exploit path without touching the wire contract.
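
The guard described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the code in `cli/internal/cmdutil/profilename.go`: the function names, the exact regular expression, the error wording, and the leading-alphanumeric rule (which rejects `.` and `..`) are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// profileNameRE is an assumed allowlist: ASCII letters, digits, '-', '_', '.'.
// Requiring a leading alphanumeric is an assumption of this sketch.
var profileNameRE = regexp.MustCompile(`^[A-Za-z0-9][A-Za-z0-9._-]*$`)

// validateProfileName is a hypothetical stand-in for cmdutil.ValidateProfileName.
func validateProfileName(name string) error {
	if name == "" {
		return fmt.Errorf("profile name must not be empty")
	}
	if !profileNameRE.MatchString(name) {
		return fmt.Errorf("invalid profile name %q: use letters, digits, '-', '_' or '.'", name)
	}
	return nil
}

// buildRetryCommand validates the name before interpolating it, so the
// single-string command cannot carry shell metacharacters from the name.
func buildRetryCommand(name string) (string, error) {
	if err := validateProfileName(name); err != nil {
		return "", err
	}
	return strings.Join([]string{"weknora", "auth", "logout", "--name", name, "-y"}, " "), nil
}

func main() {
	for _, name := range []string{"staging", "team-a.prod_2", "x; rm -rf ~", "../etc", "$(id)", ""} {
		cmd, err := buildRetryCommand(name)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("ok:", cmd)
	}
}
```

Validating before the string is built keeps the current wire shape intact; the planned `retry_command_argv` field would remove the need to join arguments into one shell string at all.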
3. AI-agent terminology disambiguation
"agent" has three referents in this codebase: (a) WeKnora's
server-side Custom Agent resource, (b) the removed `agent invoke`
verb, (c) external LLM/automation consumers. Per project memory
feedback_no_meta_disambiguation_in_docs, the fix is full-term
naming, not "X has N meanings" prose. Surgical changes at section
headers + ambiguous prose:
- AGENTS.md: "Agent decision shortcuts" → "AI agent decision
shortcuts"; "agent-callable surface" → "AI-agent-callable
surface".
- README.md: "Designed to be agent-first" → "AI-agent-first";
"Other agent ergonomics" → "Other AI-agent ergonomics"; "in
agent contexts" → "in AI-agent contexts"; "for CI / agents" →
"for CI / AI agents".
Anaphoric "agents" inside paragraphs that already established
"AI agents" was left alone — full substitution everywhere would
have been prose noise without clarity gain.
4. Wire-contract review follow-ups
Real findings from a second-pass review of the v0.7 envelope /
streaming / surface design. Per project memory
feedback_check_in_domain_anchor_first, candidate findings were
first verified against the in-domain peer CLI explicitly cited as
the envelope anchor; two earlier-flagged issues turned out to be
in-pattern and were withdrawn.
Surviving fixes:
- AGENTS.md success-envelope example rewritten. The prior example
showed `has_more: false` / `_notice: {}` as if they were always
present, but both fields are `omitempty` and never serialize
when zero / nil. Replaced with three realistic shapes (list /
single resource / mutation with no payload) and added a note
that optional fields are omitted when empty.
- cmd/chat/chat.go Args: MinimumNArgs(1) → ExactArgs(1).
v0.6 silently joined `weknora chat hello world` into
`"hello world"`. v0.7 now rejects multi-arg with exit 2,
matching `weknora session ask`. BREAKING; CHANGELOG entry
added under v0.7 BREAKING.
- internal/output/envelope.go extracts NewEnvelope(data, meta,
profile) constructor. The jq-filter path in
cmdutil.FormatOptions.Emit was manually rebuilding the
envelope literal alongside the canonical WriteEnvelope path —
drift risk when fields are added. Single construction point now.
- internal/cmdutil/factory.go adds AddKBFlag(cmd) helper.
Five files (chat, doc/list, doc/upload, doc/create, doc/fetch)
had verbatim-identical `cmd.Flags().String("kb", ...)`
declarations. Centralised so flag name + help text stay
in sync with Factory.ResolveKB. Docstring reordering + gofmt
fixup landed in the same edit to keep ResolveKB's own godoc
attached to its function.
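
The `omitempty` point in the first bullet of item 4 can be checked with a few lines of standard-library Go. The `envelope` struct below is a hypothetical, trimmed-down stand-in (the real type in internal/output/envelope.go has its own fields); only the `has_more` and `_notice` JSON names come from the text above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope is a hypothetical, simplified success envelope used only to
// show how omitempty drops zero-valued fields from the serialized JSON.
type envelope struct {
	Data    any               `json:"data,omitempty"`
	HasMore bool              `json:"has_more,omitempty"`
	Notice  map[string]string `json:"_notice,omitempty"`
}

// render marshals e and returns the JSON text (or the error text).
func render(e envelope) string {
	b, err := json.Marshal(e)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// List with more pages: has_more is true, so it is serialized.
	fmt.Println(render(envelope{Data: []string{"kb_1", "kb_2"}, HasMore: true}))
	// Single resource: has_more is false and _notice is nil, so both vanish.
	fmt.Println(render(envelope{Data: map[string]string{"id": "kb_1"}}))
	// No payload at all: every optional field is omitted.
	fmt.Println(render(envelope{}))
}
```

A consumer therefore has to treat a missing `has_more` as false rather than expect the key to be present.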
5. OSS-readiness comment / doc sweep
Pre-publication scrub of code, comments, and shipped Markdown to
remove references that only make sense in the development repo:
- AGENTS.md "Deliberate deviations + mainstream alignments"
section: removed peer-project name-drops from the comparison
table; rewrote as five flagged design decisions with rationale
but no specific competitor named. The four rows that previously
contrasted against a named peer CLI now state WeKnora's choice
+ rationale directly. Section header renamed to "Design
decisions worth flagging" since it is no longer a
deviation/alignment matrix.
- CHANGELOG v0.7 BREAKING rationales: three references to a
named peer CLI removed; the context→profile rationale now
cites only mainstream multi-credential CLIs by category (AWS /
Stripe / OpenAI / Anthropic), and the `api -d/--data` removal
rationale cites only `gh api` / `curl`. `chat` BREAKING entry
rationale similarly simplified.
- 35 cross-references to design-spec section numbers (§4.1 /
§4.5 / §5.3 etc.) removed from Go doc comments and test
comments across 13 files. The referenced spec lives outside
the shipped tree; readers of the public repo cannot resolve
them. Each reference replaced with a self-contained semantic
description (e.g. "the batch envelope" / "AGENTS.md section
on the success path").
- Mixed-language strings translated to English:
- Four Go comments: internal/cmdutil/exit.go:213,215,
internal/cmdutil/errors.go:156,
internal/output/batch_test.go:90,
internal/output/envelope_test.go:27.
- One CHANGELOG section title:
`v0.7 — Agent-first wire contract + 命令面集中清理` →
`... + command-surface cleanup`.
- CJK test fixtures (internal/text/truncate_test.go CJK
truncation cases, cmd/session/list_test.go Chinese session
title, acceptance/e2e/e2e_test.go Chinese RAG corpus)
retained — they are intentional test inputs, not stray prose.
- Makefile help comment: `golangci-lint added in PR-9` →
`golangci-lint planned`. Internal PR numbering should not
surface in shipped Makefile prose.
Build green, 28/28 packages, +5 new ValidateProfileName tests.
go vet / gofmt / go mod verify / go mod tidy all clean.
Rationale for the cascade: pre-1.0 is the cheapest moment to close
L-25 self-consistency (L-26). The half-finished internal rename
would have perpetuated the very `context` vs `context.Context`
ambiguity that motivated v0.7's user-visible rename in the first
place.
302 lines
8.5 KiB
Go
package doc

import (
	"context"
	"errors"
	"fmt"
	"io"
	"math/rand"
	"sync"
	"time"

	sdk "github.com/Tencent/WeKnora/client"
	"github.com/spf13/cobra"

	"github.com/Tencent/WeKnora/cli/internal/cmdutil"
	"github.com/Tencent/WeKnora/cli/internal/iostreams"
)

// WaitOptions captures `doc wait` flag state.
type WaitOptions struct {
	IDs      []string
	Timeout  time.Duration
	Interval time.Duration
}

// NewCmdWait builds `weknora doc wait <id> [<id>...]`.
//
// Multi-id behaviour is always wait-all: blocks until every id reaches a
// terminal state. Use shell composition
// (`weknora doc wait id1 && weknora doc wait id2`) when fail-fast is
// desired.
func NewCmdWait(f *cmdutil.Factory) *cobra.Command {
	opts := &WaitOptions{}
	cmd := &cobra.Command{
		Use:   "wait <doc-id> [<doc-id>...]",
		Short: "Wait for one or more documents to finish parsing",
		Long: `Block until every given document reaches a terminal parse_status
(completed or failed), the timeout expires, or the user interrupts (Ctrl-C).
Always wait-all: every id must reach a terminal state before returning.

Exit codes:
  0    all completed
  1    any failed
  124  --timeout reached (matches GNU 'timeout' command)
  130  Ctrl-C / SIGINT

Multi-id is polled concurrently (max 5 parallel; use 'xargs -P' for more).
For fail-fast semantics, use shell composition:
  weknora doc wait id1 && weknora doc wait id2 && weknora doc wait id3`,
		Example: `  weknora doc wait doc_abc
  weknora doc wait id1 id2 id3 --timeout 20m
  weknora doc wait id1 id2 --format ndjson`,
		Args: cobra.MinimumNArgs(1),
		RunE: func(c *cobra.Command, args []string) error {
			// Validate flags FIRST so an invalid --format doesn't cost the
			// user a multi-minute poll before erroring out.
			fopts, err := cmdutil.CheckFormatFlag(c)
			if err != nil {
				return err
			}
			fopts.ResolveDefault(iostreams.IO.IsStdoutTTY())

			opts.IDs = args
			cli, err := f.Client()
			if err != nil {
				return err
			}
			res, err := waitForDocs(c.Context(), opts.IDs, cli, *opts)
			if err != nil {
				return err
			}

			if err := emitWaitResult(res, fopts, iostreams.IO.Out); err != nil {
				return err
			}

			switch res.ExitCode() {
			case 0:
				return nil
			case 1:
				return cmdutil.NewError(cmdutil.CodeOperationFailed, fmt.Sprintf("%d doc(s) failed", len(res.Failed)))
			case 124:
				return cmdutil.NewError(cmdutil.CodeOperationTimeout, fmt.Sprintf("wait timed out (%d doc(s) still pending)", len(res.Timeout)))
			}
			return nil
		},
	}
	cmd.Flags().DurationVar(&opts.Timeout, "timeout", 10*time.Minute, "Max wait time before exiting 124")
	cmd.Flags().DurationVar(&opts.Interval, "interval", 2*time.Second, "Initial poll interval; exponential backoff capped at 15s + jitter")
	cmdutil.AddFormatFlag(cmd)
	return cmd
}

// ---------------------------------------------------------------------------
// Core poll loop (B2)
// ---------------------------------------------------------------------------

// WaitService is the narrow SDK surface needed for polling.
type WaitService interface {
	GetKnowledge(ctx context.Context, id string) (*sdk.Knowledge, error)
}

// WaitResult is the terminal-state partition returned by waitForDocs.
type WaitResult struct {
	Completed []string    `json:"completed"`
	Failed    []FailedDoc `json:"failed,omitempty"`
	Timeout   []string    `json:"timeout,omitempty"`
}

// FailedDoc carries the id + reason for a doc that reached parse_status=failed
// or that GetKnowledge returned an error for.
type FailedDoc struct {
	ID      string `json:"id"`
	Code    string `json:"code,omitempty"`
	Message string `json:"message,omitempty"`
}

const (
	maxConcurrentPolls = 5
	maxBackoffInterval = 15 * time.Second
	jitterMax          = 500 * time.Millisecond
)

// waitForDocs polls each id until terminal state or timeout. Concurrency is
// bounded by maxConcurrentPolls. Returns the partitioned terminal state.
// Always waits for every id (wait-all semantics).
//
// Duplicate ids are deduplicated at entry: polling the same id twice would
// produce duplicate result entries and waste poll quota.
//
// Exponential backoff starts at opts.Interval, doubles each tick, caps at
// maxBackoffInterval, with up to jitterMax random jitter added per sleep.
func waitForDocs(ctx context.Context, ids []string, svc WaitService, opts WaitOptions) (*WaitResult, error) {
	// Dedup ids while preserving first-seen order.
	seen := make(map[string]struct{}, len(ids))
	deduped := make([]string, 0, len(ids))
	for _, id := range ids {
		if _, ok := seen[id]; ok {
			continue
		}
		seen[id] = struct{}{}
		deduped = append(deduped, id)
	}
	ids = deduped

	ctx, cancel := context.WithTimeout(ctx, opts.Timeout)
	defer cancel()

	result := &WaitResult{}
	var mu sync.Mutex
	addCompleted := func(id string) {
		mu.Lock()
		defer mu.Unlock()
		result.Completed = append(result.Completed, id)
	}
	addFailed := func(fd FailedDoc) {
		mu.Lock()
		defer mu.Unlock()
		result.Failed = append(result.Failed, fd)
	}
	addTimeout := func(id string) {
		mu.Lock()
		defer mu.Unlock()
		result.Timeout = append(result.Timeout, id)
	}

	sem := make(chan struct{}, maxConcurrentPolls)
	var wg sync.WaitGroup
	for _, id := range ids {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			sem <- struct{}{}
			defer func() { <-sem }()

			interval := opts.Interval
			for {
				select {
				case <-ctx.Done():
					// Distinguish SIGINT/SIGTERM (Canceled) from --timeout
					// (DeadlineExceeded) so a user interrupt does not
					// pollute the timeout list with the in-flight ids.
					if errors.Is(ctx.Err(), context.Canceled) {
						// Signal-driven cancel: root's signal handler
						// exits 130; don't classify these ids as timed out.
						return
					}
					addTimeout(id)
					return
				default:
				}

				doc, err := svc.GetKnowledge(ctx, id)
				if err != nil {
					// Check Canceled first: SIGINT during an in-flight
					// request surfaces as a context-canceled error; it is
					// not a real GetKnowledge failure.
					if errors.Is(ctx.Err(), context.Canceled) {
						return
					}
					if errors.Is(ctx.Err(), context.DeadlineExceeded) {
						addTimeout(id)
						return
					}
					addFailed(FailedDoc{ID: id, Message: err.Error()})
					return
				}

				switch doc.ParseStatus {
				case "completed":
					addCompleted(id)
					return
				case "failed":
					addFailed(FailedDoc{ID: id, Message: doc.ErrorMessage})
					return
				}

				// Not yet terminal: sleep with jitter, then exp-backoff.
				jitter := time.Duration(rand.Int63n(int64(jitterMax)))
				timer := time.NewTimer(interval + jitter)
				select {
				case <-ctx.Done():
					timer.Stop()
					if errors.Is(ctx.Err(), context.Canceled) {
						return
					}
					addTimeout(id)
					return
				case <-timer.C:
				}
				interval *= 2
				if interval > maxBackoffInterval {
					interval = maxBackoffInterval
				}
			}
		}(id)
	}
	wg.Wait()
	return result, nil
}

// ExitCode resolves the compound terminal state to a Unix exit code.
// Priority: 1 > 124 > 0 (failed > timeout > completed). SIGINT (exit 130)
// is handled by the root command's signal handler via context cancellation,
// not here.
func (r *WaitResult) ExitCode() int {
	if len(r.Failed) > 0 {
		return 1
	}
	if len(r.Timeout) > 0 {
		return 124
	}
	return 0
}

// Compile-time assertion: *sdk.Client satisfies WaitService.
var _ WaitService = (*sdk.Client)(nil)

// ---------------------------------------------------------------------------
// B5: output rendering
// ---------------------------------------------------------------------------

// emitWaitResult renders r according to --format. Output writer is
// parametrized for tests; production callers pass iostreams.IO.Out.
func emitWaitResult(r *WaitResult, fopts *cmdutil.FormatOptions, w io.Writer) error {
	switch fopts.Mode {
	case cmdutil.FormatJSON, cmdutil.FormatNDJSON:
		return fopts.Emit(w, r, nil)
	case cmdutil.FormatText, "":
		return writeWaitText(w, r)
	default:
		return fmt.Errorf("unsupported --format %q for doc wait", fopts.Mode)
	}
}

// writeWaitText renders r as human-readable lines:
//
//	✓ <id> completed
//	✗ <id> failed: <message>
//	⏱ <id> timeout
func writeWaitText(w io.Writer, r *WaitResult) error {
	for _, id := range r.Completed {
		if _, err := fmt.Fprintf(w, "✓ %s completed\n", id); err != nil {
			return err
		}
	}
	for _, fd := range r.Failed {
		msg := fd.Message
		if msg == "" {
			msg = "(no message)"
		}
		if _, err := fmt.Fprintf(w, "✗ %s failed: %s\n", fd.ID, msg); err != nil {
			return err
		}
	}
	for _, id := range r.Timeout {
		if _, err := fmt.Fprintf(w, "⏱ %s timeout\n", id); err != nil {
			return err
		}
	}
	return nil
}