mirror of https://github.com/Tencent/WeKnora.git
synced 2026-06-04 13:30:32 +08:00
feat(openmaic-classroom): add micro-classroom generation from knowledge-graph concepts and update the requirement-building templates
committed by lyingbug
parent 19cd17c0a6
commit db8abd3f9b
@@ -1,6 +1,6 @@
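---
name: openmaic-classroom
-description: Converts RAG retrieval results or document chunks into OpenMAIC interactive courses. Use this skill when the user asks to turn knowledge-base content, retrieved document fragments, or uploaded documents into teaching courseware or an interactive classroom. Supports requirement-only generation and course generation from PDF content.
+description: Converts RAG retrieval results, document chunks, or knowledge-graph concepts into OpenMAIC interactive courses. Use this skill when the user asks to turn knowledge-base content, retrieved document fragments, uploaded documents, or (in batch) concepts from the knowledge graph into teaching courseware or an interactive classroom. Supports requirement-only generation, course generation from PDF content, and batch classroom generation by traversing the concept graph.
---

# OpenMAIC Classroom Generator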
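@@ -12,6 +12,7 @@ description: Converts RAG retrieval results or document chunks into OpenMAIC interactive courses.
1. **RAG → course**: distill knowledge-retrieval results into a teaching requirement (requirement) and generate an interactive course through the OpenMAIC API
2. **PDF → course**: parse the PDF uploaded by the user and generate a course from its content
3. **Document chunks → course set**: organize multiple document chunks or knowledge fragments into a multi-stage course set
4. **Concept-graph traversal → batch micro-classrooms**: traverse every concept page in the knowledge graph and generate one micro-classroom per concept

## Capability boundaries
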
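@@ -62,12 +63,13 @@ OpenMAIC has two usage modes; **choose according to the user's scenario**:
- "Generate a course from the retrieval results"
- "Create an interactive classroom for this knowledge point"
- "Convert knowledge-base content into teaching material"
- "Generate courses in batch" / "Turn all the knowledge-graph concepts into classrooms" / "Generate micro-classrooms from the concept graph"

## Workflow

### Phase 1: Confirm the input source

-Confirm the input source for course generation (choose one of three):
+Confirm the input source for course generation (choose one of four):

1. **Requirement-only generation**: the user describes the teaching topic directly, and no extra documents are needed
   → Use the user's description directly as `requirement`; **no script call is needed**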
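@@ -75,6 +77,8 @@ OpenMAIC has two usage modes; **choose according to the user's scenario**:
   → Use the `scripts/rag-to-requirement.py` script to convert the retrieval results into a structured requirement (see Phase 1.1)
3. **PDF file**: the user provides a PDF file path; parse it first, then call the generation API
   → Extract the PDF text and build the requirement from it; **no script call is needed**
4. **Batch generation by concept-graph traversal**: traverse every concept page in the knowledge graph and generate one micro-classroom per concept
   → Use the `scripts/concept-to-requirement.py` script to convert each concept and its linked entities into a structured requirement (see Phase 1.2)

### Phase 1.1: RAG results → requirement conversion (scenario 2 only)
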
@@ -101,6 +105,82 @@ execute_skill_script(
- **Do not** call this script without any arguments; it exits with an error
- If the script fails, build the requirement manually from the retrieval results

### Phase 1.2: Concept graph → requirement conversion (scenario 4 only)

When scenario 4 calls for batch classroom generation from knowledge-graph concepts, follow these steps:

**Step 1: List all concept pages**

Use the `wiki_search` tool to find every page of type concept:

```
wiki_search("^concept/", limit=50)
```

If there are more than 50 concepts, call it repeatedly and page through the results until all of them are retrieved.

**Step 2: Fetch the details and linked entities of each concept**

For each concept page:

1. Call `wiki_read_page([concept_slug])` to get the page details (including OutLinks and InLinks)
2. From OutLinks and InLinks, keep the slugs that start with `entity/`
3. Determine the link_type of each entity:
   - appears in both OutLinks and InLinks → `bidirectional`
   - appears only in OutLinks → `outlink`
   - appears only in InLinks → `inlink`
4. Call `wiki_read_page([entity_slugs])` to read the linked entities in one batch (take only title + summary, not the full content)
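
The link_type rule in item 3 can be sketched as a small set operation. This is only an illustration: it assumes OutLinks and InLinks arrive as plain lists of slug strings, and `classify_entity_links` is a hypothetical helper name, not part of the skill.

```python
def classify_entity_links(out_links, in_links, prefix="entity/"):
    """Map each linked entity slug to bidirectional / outlink / inlink.

    Assumes both arguments are lists of slug strings (hypothetical shape).
    """
    outs = {slug for slug in out_links if slug.startswith(prefix)}
    ins = {slug for slug in in_links if slug.startswith(prefix)}

    link_types = {}
    for slug in sorted(outs | ins):
        if slug in outs and slug in ins:
            link_types[slug] = "bidirectional"
        elif slug in outs:
            link_types[slug] = "outlink"
        else:
            link_types[slug] = "inlink"
    return link_types


links = classify_entity_links(
    out_links=["entity/vector-db", "entity/embedding", "concept/llm"],
    in_links=["entity/embedding", "entity/reranker"],
)
print(links)
```

Non-entity slugs such as `concept/llm` are dropped before classification, which matches the filtering in item 2.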

**Step 3: Convert to a requirement**

For each concept, call `scripts/concept-to-requirement.py` to convert the concept and its linked entities into a requirement:

```
execute_skill_script(
  skill_name: "openmaic-classroom",
  script_path: "scripts/concept-to-requirement.py",
  input: '{"concept": {"slug": "...", "title": "...", "summary": "...", "content": "..."}, "entities": [{"slug": "...", "title": "...", "summary": "...", "link_type": "..."}], "language": "zh-CN", "depth": "intermediate"}'
)
```

**Format of the input argument (a JSON string; it must be passed through the `input` parameter):**
- `concept` (required): the concept page object, containing `slug`, `title`, `summary`, `content`
- `entities` (optional): array of linked entities, each containing `slug`, `title`, `summary`, `link_type`
- `language` (optional): `zh-CN|en-US`, default `zh-CN`
- `depth` (optional): `beginner|intermediate|advanced`, default `intermediate`
- `audience` (optional): description of the target audience, default "相关领域的学习者" (learners in the relevant field)
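
Because `input` must be a single JSON string, it is safer to serialize it than to write it by hand. The sketch below is one possible way to do that; `build_script_input` and the sample page data are hypothetical, and it assumes the pages are available as Python dicts.

```python
import json

CONCEPT_FIELDS = ("slug", "title", "summary", "content")
ENTITY_FIELDS = ("slug", "title", "summary", "link_type")


def build_script_input(concept, entities, language="zh-CN",
                       depth="intermediate", audience=None):
    """Serialize a concept and its entities into the script's input JSON."""
    payload = {
        # Keep only the documented fields; anything else is dropped.
        "concept": {key: concept.get(key, "") for key in CONCEPT_FIELDS},
        "entities": [
            {key: ent.get(key, "") for key in ENTITY_FIELDS} for ent in entities
        ],
        "language": language,
        "depth": depth,
    }
    if audience:
        payload["audience"] = audience
    # ensure_ascii=False keeps Chinese text readable instead of \uXXXX escapes.
    return json.dumps(payload, ensure_ascii=False)


script_input = build_script_input(
    concept={"slug": "concept/rag", "title": "RAG",
             "summary": "Retrieval-augmented generation.",
             "content": "## Definition\nRAG combines retrieval with generation."},
    entities=[{"slug": "entity/vector-db", "title": "Vector DB",
               "summary": "Stores embeddings.", "link_type": "outlink",
               "content": "full page text that should not be sent"}],
)
print(script_input)
```

Dropping the entity `content` field here mirrors the constraint that entities contribute only title + summary.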

**Step 4: Call the OpenMAIC API sequentially**

For each concept's requirement, call the OpenMAIC generation API **sequentially** (concurrency=1):

- each concept → one micro-classroom
- mark the requirement as `micro-classroom`
- do not run calls in parallel, to avoid quota conflicts

**Step 5: Generate a manifest (resumability)**

Generate a manifest JSON that records the generation status of each concept:

```json
{
  "kb_id": "...",
  "total_concepts": 10,
  "generated": ["concept/rag", "concept/llm"],
  "failed": [{"slug": "concept/embedding", "error": "..."}],
  "pending": ["concept/vector-db"]
}
```

To resume after a failure, skip the concepts in `generated` and start from the first entry in `pending`.
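
A minimal sketch of the sequential, resumable loop described in steps 4 and 5 might look like the following. It rests on assumptions: `run_batch` and `fake_generate` are hypothetical names, the real OpenMAIC call is replaced by a stand-in callable, and a failed concept is recorded in `failed` while the loop moves on (the source does not say whether a failure should stop the batch).

```python
def run_batch(manifest, generate):
    """Process pending concepts one at a time, updating the manifest in place."""
    done = set(manifest["generated"])
    while manifest["pending"]:
        slug = manifest["pending"].pop(0)
        if slug in done:
            continue  # already generated in an earlier run
        try:
            generate(slug)  # one call at a time: concurrency=1
        except Exception as exc:
            manifest["failed"].append({"slug": slug, "error": str(exc)})
        else:
            manifest["generated"].append(slug)
            done.add(slug)
    return manifest


def fake_generate(slug):
    """Stand-in for the real generation call; fails for one concept."""
    if slug == "concept/embedding":
        raise RuntimeError("quota exceeded")


manifest = {
    "kb_id": "demo-kb",
    "total_concepts": 3,
    "generated": ["concept/rag"],
    "failed": [],
    "pending": ["concept/rag", "concept/embedding", "concept/vector-db"],
}
result = run_batch(manifest, fake_generate)
print(result["generated"], result["failed"], result["pending"])
```

Persisting the manifest after every iteration, rather than only at the end, is what would make a crash mid-batch recoverable.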

**Key constraints**:
- wiki reads and script conversion may be batched
- OpenMAIC API generation runs with concurrency=1
- concept.Summary is the core anchor of the requirement
- entities contribute only title + summary, not the full content
- a concept with no linked entities can still produce a classroom (without a practice section)

### Phase 2: Build the Generation Request

Build the request body according to the input source. **Field reference**:
@@ -120,6 +200,7 @@ execute_skill_script(
- **Scenario 1 (requirement only)**: `requirement` is the user's description, used as-is
- **Scenario 2 (RAG results)**: `requirement` is the `requirement` field from the Phase 1.1 script output
- **Scenario 3 (PDF)**: `requirement` is built from the text extracted from the PDF, and `pdfContent` holds the parsed result
- **Scenario 4 (concept-graph traversal)**: `requirement` is the `requirement` field from the Phase 1.2 script output; call the API separately for each concept

### Phase 3: Call the OpenMAIC API

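@@ -261,6 +342,23 @@ Classroom URL:
4. If the MCP tool is unavailable, tell the user to deploy mcp_api_requester first (see the MCP availability check)
5. Collect and return all Classroom URLs

## Concept-graph traversal → batch micro-classrooms

When the user needs batch classroom generation from knowledge-graph concepts (scenario 4), follow the complete Phase 1.2 workflow.

**MVP course orchestration strategy**: one concept → one micro-classroom

**Course type label**: mark the requirement as `micro-classroom`

**Batch resumability**: generate a manifest JSON that records the generation status of each concept, so a failed run can resume from the checkpoint.

**Key constraints**:
- wiki reads and script conversion may be batched
- OpenMAIC API generation runs with concurrency=1
- concept.Summary is the core anchor of the requirement
- entities contribute only title + summary, not the full content
- a concept with no linked entities can still produce a classroom (without a practice section)

## Notes

- Scripts run in a Docker sandbox, and **network access is disabled in the sandbox by default**
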
@@ -70,6 +70,66 @@ OpenMAIC's `requirement` field must be a **structured description of the teaching requirement**,
- Key coverage: [list of key topics]
```

### Template 5: Concept-graph traversal (Concept Graph)

Use this template when generating a micro-classroom from a knowledge-graph concept page and its linked entities. It is produced automatically by `scripts/concept-to-requirement.py`.

**Input structure**:
```json
{
  "concept": { "slug": "concept/rag", "title": "RAG (Retrieval-Augmented Generation)", "summary": "...", "content": "..." },
  "entities": [
    { "slug": "entity/vector-db", "title": "Vector database", "summary": "...", "link_type": "outlink" },
    { "slug": "entity/embedding", "title": "Embedding model", "summary": "...", "link_type": "bidirectional" }
  ],
  "language": "zh-CN",
  "depth": "intermediate",
  "audience": "相关领域的学习者"
}
```

**Output requirement structure**:
```
Based on the knowledge-graph concept "[concept.title]", create a [depth] micro-classroom for [audience].

Teaching anchor: [concept.summary]

Learning objectives:
- Understand [key sentence from concept.summary]

Core knowledge points:
- [definition/mechanism parsed from concept.content]

Linked entities (practice section):
- Examples: [entity.title]: [entity.summary]
- Tools: [entity.title]: [entity.summary]
- Application scenarios: [entity.title]: [entity.summary]
- Prerequisites: [entity.title]

Practice tasks:
- Practice applying [concept.title] through [entity.title]

Common misconception checks:
- [misconception parsed from concept.content]

Assessment prompts:
- Explain the core definition of [concept.title]

Please generate the course content in Chinese.
```

**Entity classification and ranking rules** (pure text operations, no LLM or embedding):
- link_type weight: bidirectional (+3) > outlink (+2) > inlink (+1)
- title token overlap: number of tokens shared between the tokenized concept title and the entity title (+1 per hit, cap +2)
- summary keyword hit: concept-summary keywords that appear in the entity summary (+1 per hit, cap +2)
- slug token hit: a concept slug token appears in the entity slug (+1)
- empty summary: penalty (-2)
- take the top 3-5 entities and sort them into Examples / Tools / Application Scenarios / Prerequisites
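
To make the arithmetic of these rules concrete, here is a compact re-statement of the scoring as a standalone sketch. It is not the script itself: `score_entity` and `tokenize` are illustrative names, and the sample concept and entities are made up.

```python
import re

LINK_WEIGHTS = {"bidirectional": 3, "outlink": 2, "inlink": 1}


def tokenize(text):
    """Split on whitespace, underscores, hyphens and slashes; lowercase."""
    return [tok.lower() for tok in re.split(r"[\s_\-/]+", text) if tok]


def score_entity(entity, concept):
    """Apply the additive ranking rules listed above to one entity."""
    score = LINK_WEIGHTS.get(entity.get("link_type", ""), 0)

    title_hits = set(tokenize(entity.get("title", ""))) & set(tokenize(concept["title"]))
    score += min(len(title_hits), 2)

    summary = (entity.get("summary") or "").lower()
    keywords = [w.lower() for w in re.findall(r"\w+", concept["summary"]) if len(w) > 1]
    score += min(sum(1 for kw in keywords if kw in summary), 2)

    slug_hits = set(tokenize(entity.get("slug", ""))) & set(tokenize(concept["slug"]))
    score += min(len(slug_hits), 1)

    if not entity.get("summary"):
        score -= 2
    return score


concept = {"slug": "concept/vector-search", "title": "Vector Search",
           "summary": "Nearest neighbor retrieval over embeddings"}
vector_db = {"slug": "entity/vector-db", "title": "Vector database",
             "summary": "Stores embeddings for retrieval", "link_type": "bidirectional"}
faiss = {"slug": "entity/faiss", "title": "FAISS", "summary": "", "link_type": "inlink"}

print(score_entity(vector_db, concept))  # 3 (link) + 1 (title) + 2 (summary) + 1 (slug) = 7
print(score_entity(faiss, concept))      # 1 (link) - 2 (empty summary) = -1
```

The empty-summary penalty can push a weakly linked entity below zero, so it sorts after any entity that has at least an inbound link and a summary.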

**Concept content parsing logic**:
- prefer parsing the markdown structure: the list of headings (definition, mechanism, example, and misconception sections)
- fall back to the first N characters

## Examples

### Example 1: Technical documentation → course

@@ -0,0 +1,412 @@
#!/usr/bin/env python3
"""
Concept Graph Material → OpenMAIC Requirement converter

Converts a concept page from the WeKnora wiki knowledge graph, together with its
linked entities, into a structured OpenMAIC course-generation requirement.
The conversion has two stages:
  1. Concept Graph Material → Pedagogical Design JSON
  2. Pedagogical Design JSON → requirement string

This script only transforms data; it makes no network calls.

Usage:
  echo '{"concept": {...}, "entities": [...]}' | python scripts/concept-to-requirement.py
  python scripts/concept-to-requirement.py --file input.json
"""

import json
import re
import sys
from typing import Any


def _tokenize(text: str) -> list[str]:
    """Simple whitespace + punctuation tokenizer for title/slug overlap."""
    return [t.lower() for t in re.split(r"[\s_\-/]+", text) if t]


def _score_entity(
    entity: dict[str, Any],
    concept_title_tokens: list[str],
    concept_summary_keywords: list[str],
    concept_slug_tokens: list[str],
) -> int:
    """Score an entity by relevance to the concept (pure text heuristics)."""
    score = 0

    # link_type scoring
    link_type = entity.get("link_type", "")
    if link_type == "bidirectional":
        score += 3
    elif link_type == "outlink":
        score += 2
    elif link_type == "inlink":
        score += 1

    # title token overlap
    entity_title_tokens = set(_tokenize(entity.get("title", "")))
    overlap = entity_title_tokens & set(concept_title_tokens)
    score += min(len(overlap), 2)

    # summary keyword hit
    entity_summary = (entity.get("summary") or "").lower()
    keyword_hits = sum(1 for kw in concept_summary_keywords if kw in entity_summary)
    score += min(keyword_hits, 2)

    # slug token hit
    entity_slug_tokens = set(_tokenize(entity.get("slug", "")))
    slug_overlap = entity_slug_tokens & set(concept_slug_tokens)
    score += min(len(slug_overlap), 1)

    # penalty for empty summary
    if not entity.get("summary"):
        score -= 2

    return score


def _classify_entity(
    entity: dict[str, Any], concept_title: str
) -> str:
    """Classify an entity into a pedagogical role."""
    title = (entity.get("title") or "").lower()
    summary = (entity.get("summary") or "").lower()
    text = f"{title} {summary}"

    tool_keywords = ["工具", "平台", "框架", "库", "sdk", "api", "tool", "platform", "framework", "library"]
    example_keywords = ["案例", "实例", "示例", "应用", "case", "example", "application", "demo"]
    prereq_keywords = ["前提", "基础", "前置", "先决", "prerequisite", "foundation", "basic"]

    if any(kw in text for kw in tool_keywords):
        return "Tools"
    if any(kw in text for kw in example_keywords):
        return "Examples"
    if any(kw in text for kw in prereq_keywords):
        return "Prerequisites"
    return "Application Scenarios"


def _parse_markdown_sections(content: str) -> dict[str, str]:
    """Parse markdown content into sections keyed by heading."""
    sections: dict[str, str] = {}
    current_heading = ""
    current_lines: list[str] = []

    for line in content.split("\n"):
        heading_match = re.match(r"^(#{1,4})\s+(.+)$", line)
        if heading_match:
            if current_heading:
                sections[current_heading] = "\n".join(current_lines).strip()
            current_heading = heading_match.group(2).strip()
            current_lines = []
        else:
            current_lines.append(line)

    if current_heading:
        sections[current_heading] = "\n".join(current_lines).strip()

    return sections


def _extract_key_points(sections: dict[str, str]) -> list[str]:
    """Extract key points from markdown sections (definitions, mechanisms)."""
    points: list[str] = []
    definition_headings = {"定义", "概念", "概述", "简介", "Definition", "Overview", "Introduction"}
    mechanism_headings = {"机制", "原理", "工作原理", "Mechanism", "How it works", "Principle"}

    for heading, body in sections.items():
        if any(d in heading for d in definition_headings):
            first_para = body.split("\n\n")[0].strip()
            if first_para:
                points.append(first_para[:200])
        elif any(m in heading for m in mechanism_headings):
            bullets = [l.strip().lstrip("-*• ") for l in body.split("\n") if l.strip().startswith(("- ", "* ", "• "))]
            points.extend(bullets[:3])

    return points[:5]


def _extract_examples(sections: dict[str, str]) -> list[str]:
    """Extract examples from markdown sections."""
    example_headings = {"案例", "示例", "实例", "应用场景", "Example", "Use Case", "Application"}
    examples: list[str] = []

    for heading, body in sections.items():
        if any(e in heading for e in example_headings):
            bullets = [l.strip().lstrip("-*• ") for l in body.split("\n") if l.strip().startswith(("- ", "* ", "• "))]
            examples.extend(bullets[:3])

    return examples[:3]


def _extract_misconceptions(sections: dict[str, str]) -> list[str]:
    """Extract common misconceptions from markdown sections."""
    misconception_headings = {"误区", "常见错误", "误解", "Misconception", "Common mistake", "Pitfall"}
    misconceptions: list[str] = []

    for heading, body in sections.items():
        if any(m in heading for m in misconception_headings):
            bullets = [l.strip().lstrip("-*• ") for l in body.split("\n") if l.strip().startswith(("- ", "* ", "• "))]
            misconceptions.extend(bullets[:3])

    return misconceptions[:3]


def build_pedagogical_design(data: dict[str, Any]) -> dict[str, Any]:
    """Stage 1: Concept Graph Material → Pedagogical Design JSON."""
    concept = data["concept"]
    entities = data.get("entities", [])
    language = data.get("language", "zh-CN")
    depth = data.get("depth", "intermediate")
    audience = data.get("audience", "相关领域的学习者")

    concept_title = concept.get("title", "")
    concept_summary = concept.get("summary", "")
    concept_content = concept.get("content", "")
    concept_slug = concept.get("slug", "")

    # Parse markdown sections from content
    sections = _parse_markdown_sections(concept_content) if concept_content else {}

    # Extract pedagogical elements from content
    key_points = _extract_key_points(sections)
    examples_from_content = _extract_examples(sections)
    misconceptions = _extract_misconceptions(sections)

    # Score and rank entities
    concept_title_tokens = _tokenize(concept_title)
    concept_summary_keywords = [w.lower() for w in re.findall(r"\w+", concept_summary) if len(w) > 1]
    concept_slug_tokens = _tokenize(concept_slug)

    scored_entities = []
    for entity in entities:
        score = _score_entity(entity, concept_title_tokens, concept_summary_keywords, concept_slug_tokens)
        scored_entities.append((score, entity))
    scored_entities.sort(key=lambda x: x[0], reverse=True)

    # Select top entities (3-5)
    top_count = min(max(3, len(scored_entities)), 5)
    top_entities = scored_entities[:top_count]

    # Classify entities into pedagogical roles
    classified: dict[str, list[dict[str, Any]]] = {
        "Examples": [],
        "Tools": [],
        "Application Scenarios": [],
        "Prerequisites": [],
    }
    for score, entity in top_entities:
        role = _classify_entity(entity, concept_title)
        classified[role].append({
            "slug": entity.get("slug", ""),
            "title": entity.get("title", ""),
            "summary": entity.get("summary", ""),
            "link_type": entity.get("link_type", ""),
            "relevance_score": score,
        })

    # Build learning objectives from concept summary
    learning_objectives: list[str] = []
    if concept_summary:
        sentences = re.split(r"[。!?.!?]", concept_summary)
        learning_objectives = [f"理解{s.strip()}" for s in sentences if s.strip()][:3]
    if not learning_objectives:
        learning_objectives = [f"掌握 {concept_title} 的核心概念"]

    # Build practice tasks from entity examples
    practice_tasks: list[str] = []
    for ent in classified.get("Examples", []):
        practice_tasks.append(f"通过 {ent['title']} 实践 {concept_title} 的应用")
    for ent in classified.get("Application Scenarios", []):
        practice_tasks.append(f"分析 {ent['title']} 在 {concept_title} 中的作用")
    if not practice_tasks and top_entities:
        _, first_ent = top_entities[0]
        practice_tasks.append(f"结合 {first_ent.get('title', '相关实体')} 理解 {concept_title} 的实际应用")

    # Build prerequisites from entity prerequisites
    prerequisites: list[str] = []
    for ent in classified.get("Prerequisites", []):
        prerequisites.append(ent["title"])

    # Build assessment prompts
    assessment_prompts: list[str] = []
    if concept_summary:
        assessment_prompts.append(f"请解释 {concept_title} 的核心定义")
    if key_points:
        assessment_prompts.append(f"请描述 {concept_title} 的工作机制")
    if classified.get("Examples"):
        assessment_prompts.append(f"请举例说明 {concept_title} 的实际应用")

    # Build warnings from misconceptions
    warnings: list[str] = []
    for m in misconceptions:
        warnings.append(f"常见误区:{m}")

    return {
        "concept_slug": concept_slug,
        "title": concept_title,
        "teaching_anchor": concept_summary or concept_title,
        "learning_objectives": learning_objectives,
        "key_points": key_points,
        "examples": examples_from_content,
        "practice_tasks": practice_tasks,
        "prerequisites": prerequisites,
        "misconception_checks": misconceptions,
        "assessment_prompts": assessment_prompts,
        "warnings": warnings,
        "classified_entities": classified,
    }


def build_requirement(design: dict[str, Any], data: dict[str, Any]) -> str:
    """Stage 2: Pedagogical Design JSON → requirement string."""
    concept = data["concept"]
    depth = data.get("depth", "intermediate")
    audience = data.get("audience", "相关领域的学习者")
    language = data.get("language", "zh-CN")

    depth_map = {"beginner": "入门", "intermediate": "中级", "advanced": "高级"}
    depth_cn = depth_map.get(depth, "中级")

    parts: list[str] = []

    # Header
    parts.append(f"基于知识图谱概念「{design['title']}」,为{audience}创建一个{depth_cn}微课堂(micro-classroom)。")
    parts.append("")

    # Teaching anchor
    parts.append(f"教学锚点:{design['teaching_anchor']}")
    parts.append("")

    # Learning objectives
    if design["learning_objectives"]:
        parts.append("学习目标:")
        for obj in design["learning_objectives"]:
            parts.append(f" - {obj}")
        parts.append("")

    # Key points
    if design["key_points"]:
        parts.append("核心知识点:")
        for kp in design["key_points"]:
            parts.append(f" - {kp}")
        parts.append("")

    # Classified entities as practice context
    classified = design.get("classified_entities", {})
    entity_sections = []
    for role in ("Examples", "Tools", "Application Scenarios", "Prerequisites"):
        ents = classified.get(role, [])
        if ents:
            role_cn = {
                "Examples": "案例",
                "Tools": "工具",
                "Application Scenarios": "应用场景",
                "Prerequisites": "前置知识",
            }[role]
            ent_descs = [f"{e['title']}" + (f":{e['summary'][:80]}" if e.get("summary") else "") for e in ents]
            entity_sections.append(f"{role_cn}:{';'.join(ent_descs)}")

    if entity_sections:
        parts.append("关联实体(实践环节):")
        for section in entity_sections:
            parts.append(f" - {section}")
        parts.append("")

    # Practice tasks
    if design["practice_tasks"]:
        parts.append("实践任务:")
        for task in design["practice_tasks"]:
            parts.append(f" - {task}")
        parts.append("")

    # Misconception checks
    if design["misconception_checks"]:
        parts.append("常见误区检查:")
        for mc in design["misconception_checks"]:
            parts.append(f" - {mc}")
        parts.append("")

    # Assessment
    if design["assessment_prompts"]:
        parts.append("评估提示:")
        for ap in design["assessment_prompts"]:
            parts.append(f" - {ap}")
        parts.append("")

    # Language directive
    if language == "zh-CN":
        parts.append("请使用中文生成课程内容。")

    # Concept content fallback
    concept_content = concept.get("content", "")
    if concept_content and len(concept_content) > 200:
        parts.append("")
        parts.append(f"参考内容(前500字):{concept_content[:500]}")

    return "\n".join(parts)


def process(input_data: dict[str, Any]) -> dict[str, Any]:
    """Main processing: two-stage conversion."""
    concept = input_data.get("concept")
    if not concept:
        return {
            "requirement": "",
            "pedagogical_design": {},
            "metadata": {"error": "Missing 'concept' in input"},
        }

    entities = input_data.get("entities", [])

    # Stage 1: Build pedagogical design
    design = build_pedagogical_design(input_data)

    # Stage 2: Build requirement string
    requirement = build_requirement(design, input_data)

    return {
        "requirement": requirement,
        "pedagogical_design": design,
        "metadata": {
            "concept_slug": concept.get("slug", ""),
            "entity_count": len(entities),
            "depth": input_data.get("depth", "intermediate"),
            "language": input_data.get("language", "zh-CN"),
        },
    }


def main() -> None:
    """Entry point: read from stdin or file, output JSON."""
    import argparse

    parser = argparse.ArgumentParser(description="Concept Graph → OpenMAIC Requirement 转换器")
    parser.add_argument("--file", "-f", help="输入 JSON 文件路径")
    args = parser.parse_args()

    if args.file:
        with open(args.file, "r", encoding="utf-8") as f:
            input_data = json.load(f)
    else:
        input_text = sys.stdin.read()
        if not input_text.strip():
            print(
                "错误: 未提供输入数据。用法:\n"
                ' echo \'{"concept": {...}, "entities": [...]}\' | python concept-to-requirement.py\n'
                " python concept-to-requirement.py --file input.json",
                file=sys.stderr,
            )
            sys.exit(1)
        try:
            input_data = json.loads(input_text)
        except json.JSONDecodeError as e:
            print(f"错误: 输入 JSON 解析失败: {e}", file=sys.stderr)
            sys.exit(1)

    result = process(input_data)
    print(json.dumps(result, ensure_ascii=False, indent=2))


if __name__ == "__main__":
    main()