mirror of
https://github.com/Tencent/WeKnora.git
synced 2026-06-04 13:30:32 +08:00
The per-image multimodal subspan only captured image_url / enable_ocr /
enable_caption on input and chunk_id on output, so the trace viewer could
not answer "what did THIS image actually produce?" without joining back to
the chunks table.

Adds to the per-image span output:
- vlm_model_id (or "legacy_inline" for inline-config KBs)
- image_bytes (read size)
- ocr_prompt: "default" | "scanned_pdf"
- ocr_chars + ocr_preview (sanitized text, capped at 200 runes)
- caption_chars + caption_preview
- chunks_created (count of OCR/caption child chunks)
- indexed (true after BatchIndex completes)
- per-step error fields (read_error / ocr_error / caption_error /
  skipped reason) when something fails

Also adds parent_chunk_id to the span input so the trace links back to the
text chunk this image hangs off. This is useful when a doc has hundreds of
inline images and you need to know WHERE in the text this one came from.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>