Stop tracking documentation and unit tests
This commit is contained in:
CLAUDE.md
@@ -1,30 +0,0 @@
# label_ai_service Development Guidelines

Auto-generated from all feature plans. Last updated: 2026-04-10

## Active Technologies

- Python 3.12.13 (conda `label` environment) + FastAPI ≥0.111, uvicorn[standard] ≥0.29, pydantic ≥2.7, zhipuai ≥2.1, boto3 ≥1.34, pdfplumber ≥0.11, python-docx ≥1.1, opencv-python-headless ≥4.9, numpy ≥1.26, httpx ≥0.27, python-dotenv ≥1.0, pyyaml ≥6.0 (001-ai-service-requirements)

## Project Structure

```text
backend/
frontend/
tests/
```

## Commands

cd src; pytest; ruff check .

## Code Style

Python 3.12.13 (conda `label` environment): follow standard conventions

## Recent Changes

- 001-ai-service-requirements: added Python 3.12.13 (conda `label` environment) + FastAPI ≥0.111, uvicorn[standard] ≥0.29, pydantic ≥2.7, zhipuai ≥2.1, boto3 ≥1.34, pdfplumber ≥0.11, python-docx ≥1.1, opencv-python-headless ≥4.9, numpy ≥1.26, httpx ≥0.27, python-dotenv ≥1.0, pyyaml ≥6.0

<!-- MANUAL ADDITIONS START -->
<!-- MANUAL ADDITIONS END -->
File diff suppressed because it is too large
@@ -1,835 +0,0 @@
# Knowledge Graph Intelligent Annotation Platform — AI Service Design Document

> Version: v1.0 | Date: 2026-04-10
> Runtime: Python 3.12.13 (conda `label` environment) | Framework: FastAPI
> Upstream system: label-backend (Java Spring Boot) | Models: ZhipuAI GLM series

---

## 1. Project Positioning

The AI service (`label_ai_service`) is the platform's intelligent computing layer, deployed as a standalone Python FastAPI service. It is invoked by the Java backend and performs the following core tasks:

| Capability | Description |
|------|------|
| Text triple extraction | Extract subject / predicate / object plus source-location info from TXT / PDF / DOCX documents |
| Image quadruple extraction | Call GLM-4V to analyze an image, extract quadruples plus bbox coordinates, and automatically crop the regions |
| Video frame extraction | OpenCV samples frames by interval or keyframe mode; frame images are uploaded to RustFS |
| Video-to-text | GLM-4V interprets a video segment and outputs a structured textual description, reducing video to the text annotation pipeline |
| QA pair generation | Generate candidate QA pairs in GLM fine-tuning format from triples/quadruples plus source-text/image evidence |
| Fine-tuning job management | Submit fine-tuning jobs to ZhipuAI and query their status |

The system has only two annotation pipelines (text and image); video is two preprocessing entry points, not a third pipeline.

---

## 2. Overall Architecture

### 2.1 Position within the Platform

```
              ┌──────────────┐
              │ Nginx proxy  │
              └──────┬───────┘
       ┌─────────────┼─────────────┐
       ▼             ▼             ▼
 ┌──────────┐ ┌───────────┐ ┌───────────┐
 │ Vue3 FE  │ │ Spring    │ │ FastAPI   │
 │ (static) │ │ Boot BE   │ │ AI service│◄── scope of this document
 └──────────┘ └────┬──────┘ └────┬──────┘
                   │             │
       ┌───────────┼─────────────┤
       ▼           ▼             ▼
 ┌──────────┐ ┌────────┐ ┌────────────┐
 │PostgreSQL│ │ Redis  │ │   RustFS   │
 └──────────┘ └────────┘ └────────────┘
```

The AI service **does not access the database directly**; it interacts only through:
- **RustFS S3 API**: read raw files, write processing results
- **ZhipuAI API**: call GLM-series models
- **Java backend callback endpoint**: report results when asynchronous video jobs complete

### 2.2 Directory Layout

```
label_ai_service/
├── app/
│   ├── main.py                 # FastAPI entry point; registers routers and lifespan
│   ├── core/
│   │   ├── config.py           # Layered YAML + .env config, lru_cache singleton
│   │   ├── logging.py          # Unified structured logging setup
│   │   ├── exceptions.py       # Custom exception classes + global exception handlers
│   │   └── dependencies.py     # FastAPI Depends factory functions
│   ├── clients/
│   │   ├── llm/
│   │   │   ├── base.py             # LLMClient ABC (abstract interface)
│   │   │   └── zhipuai_client.py   # ZhipuAI implementation
│   │   └── storage/
│   │       ├── base.py             # StorageClient ABC (abstract interface)
│   │       └── rustfs_client.py    # RustFS S3-compatible implementation (boto3)
│   ├── services/
│   │   ├── text_service.py     # Document parsing + triple extraction
│   │   ├── image_service.py    # Image quadruple extraction + bbox cropping
│   │   ├── video_service.py    # OpenCV frame sampling + video-to-text
│   │   ├── qa_service.py       # Text/image QA pair generation
│   │   └── finetune_service.py # Fine-tuning job submission and status queries
│   ├── routers/
│   │   ├── text.py             # POST /api/v1/text/extract
│   │   ├── image.py            # POST /api/v1/image/extract
│   │   ├── video.py            # POST /api/v1/video/extract-frames
│   │   │                       # POST /api/v1/video/to-text
│   │   ├── qa.py               # POST /api/v1/qa/gen-text
│   │   │                       # POST /api/v1/qa/gen-image
│   │   └── finetune.py         # POST /api/v1/finetune/start
│   │                           # GET  /api/v1/finetune/status/{jobId}
│   └── models/
│       ├── text_models.py      # Triple request/response schemas
│       ├── image_models.py     # Quadruple request/response schemas
│       ├── video_models.py     # Video processing request/response schemas
│       ├── qa_models.py        # QA pair request/response schemas
│       └── finetune_models.py  # Fine-tuning request/response schemas
├── config.yaml                 # Non-sensitive config (committed to git)
├── .env                        # Secrets and per-environment values (committed to git)
├── requirements.txt
├── Dockerfile
└── docker-compose.yml
```

---

## 3. Configuration Design

### 3.1 Layering Principle

| File | Responsibility | In git |
|------|------|----------|
| `config.yaml` | Stable settings: port, path conventions, model names, bucket names, video parameters | ✅ |
| `.env` | Per-environment values: secrets, service addresses | ✅ |

Environment variables take precedence over `config.yaml`. Docker Compose loads `.env` via `env_file`; local development loads it via `python-dotenv`.

### 3.2 `config.yaml`

```yaml
server:
  port: 8000
  log_level: INFO

storage:
  buckets:
    source_data: "source-data"
    finetune_export: "finetune-export"

backend: {}  # callback_url is injected from .env

video:
  frame_sample_count: 8   # number of representative frames sampled evenly for video-to-text
  max_file_size_mb: 200   # video size cap (larger files are rejected to avoid OOM)

models:
  default_text: "glm-4-flash"
  default_vision: "glm-4v-flash"
```

### 3.3 `.env`

```ini
ZHIPUAI_API_KEY=your-zhipuai-api-key
STORAGE_ACCESS_KEY=minioadmin
STORAGE_SECRET_KEY=minioadmin
STORAGE_ENDPOINT=http://rustfs:9000
BACKEND_CALLBACK_URL=http://backend:8080/internal/video-job/callback
# MAX_VIDEO_SIZE_MB=200  # optional, overrides the video size cap in config.yaml
```

### 3.4 The config Module

```python
# core/config.py
import os
import yaml
from functools import lru_cache
from pathlib import Path
from dotenv import load_dotenv

_ROOT = Path(__file__).parent.parent.parent

# environment variable → YAML path mapping
_ENV_OVERRIDES = {
    "ZHIPUAI_API_KEY": ["zhipuai", "api_key"],
    "STORAGE_ACCESS_KEY": ["storage", "access_key"],
    "STORAGE_SECRET_KEY": ["storage", "secret_key"],
    "STORAGE_ENDPOINT": ["storage", "endpoint"],
    "BACKEND_CALLBACK_URL": ["backend", "callback_url"],
    "LOG_LEVEL": ["server", "log_level"],
    "MAX_VIDEO_SIZE_MB": ["video", "max_file_size_mb"],
}

def _set_nested(d: dict, keys: list[str], value: str):
    for k in keys[:-1]:
        d = d.setdefault(k, {})
    d[keys[-1]] = value

@lru_cache(maxsize=1)
def get_config() -> dict:
    load_dotenv(_ROOT / ".env")                        # 1. load .env
    with open(_ROOT / "config.yaml", encoding="utf-8") as f:
        cfg = yaml.safe_load(f)                        # 2. read YAML
    for env_key, yaml_path in _ENV_OVERRIDES.items():  # 3. env-var overrides
        val = os.environ.get(env_key)
        if val:
            _set_nested(cfg, yaml_path, val)
    _validate(cfg)
    return cfg

def _validate(cfg: dict):
    checks = [
        (["zhipuai", "api_key"], "ZHIPUAI_API_KEY"),
        (["storage", "access_key"], "STORAGE_ACCESS_KEY"),
        (["storage", "secret_key"], "STORAGE_SECRET_KEY"),
    ]
    for path, name in checks:
        val = cfg
        for k in path:
            val = (val or {}).get(k, "")
        if not val:
            raise RuntimeError(f"Missing required configuration: {name}")
```

---

## 4. Adapter Layer Design

### 4.1 LLM Adapter

```python
# clients/llm/base.py
from abc import ABC, abstractmethod

class LLMClient(ABC):
    @abstractmethod
    async def chat(self, messages: list[dict], model: str, **kwargs) -> str:
        """Plain-text chat; returns the model's output text."""

    @abstractmethod
    async def chat_vision(self, messages: list[dict], model: str, **kwargs) -> str:
        """Multimodal chat (mixed image/text input); returns the model's output text."""
```

```python
# clients/llm/zhipuai_client.py
import asyncio
from zhipuai import ZhipuAI
from .base import LLMClient

class ZhipuAIClient(LLMClient):
    def __init__(self, api_key: str):
        self._client = ZhipuAI(api_key=api_key)

    async def chat(self, messages: list[dict], model: str, **kwargs) -> str:
        loop = asyncio.get_running_loop()
        resp = await loop.run_in_executor(
            None,
            lambda: self._client.chat.completions.create(
                model=model, messages=messages, **kwargs
            ),
        )
        return resp.choices[0].message.content

    async def chat_vision(self, messages: list[dict], model: str, **kwargs) -> str:
        # GLM-4V uses the same interface as the text endpoint; image messages
        # are distinguished by the image_url content type.
        return await self.chat(messages, model, **kwargs)
```

**Extensibility**: swapping out GLM only requires adding e.g. `class OpenAIClient(LLMClient)` and injecting it in `lifespan`; the services layer needs zero changes.
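As a sketch of that swap, any object honoring the ABC can be injected, e.g. a deterministic fake client for unit tests (the minimal ABC is restated here so the snippet is self-contained; `FakeLLMClient` is illustrative, not from the codebase):

```python
import asyncio
from abc import ABC, abstractmethod

class LLMClient(ABC):  # minimal restatement of clients/llm/base.py
    @abstractmethod
    async def chat(self, messages: list[dict], model: str, **kwargs) -> str: ...

    @abstractmethod
    async def chat_vision(self, messages: list[dict], model: str, **kwargs) -> str: ...

class FakeLLMClient(LLMClient):
    """Deterministic stand-in, useful for testing the services layer offline."""
    def __init__(self, canned: str = "[]"):
        self._canned = canned

    async def chat(self, messages: list[dict], model: str, **kwargs) -> str:
        return self._canned

    async def chat_vision(self, messages: list[dict], model: str, **kwargs) -> str:
        return self._canned

out = asyncio.run(
    FakeLLMClient('{"items": []}').chat([{"role": "user", "content": "hi"}], "glm-4-flash")
)
print(out)  # {"items": []}
```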

### 4.2 Storage Adapter

```python
# clients/storage/base.py
from abc import ABC, abstractmethod

class StorageClient(ABC):
    @abstractmethod
    async def download_bytes(self, bucket: str, path: str) -> bytes: ...

    @abstractmethod
    async def upload_bytes(
        self, bucket: str, path: str, data: bytes,
        content_type: str = "application/octet-stream"
    ) -> None: ...

    @abstractmethod
    def get_presigned_url(self, bucket: str, path: str, expires: int = 3600) -> str: ...
```

```python
# clients/storage/rustfs_client.py
import asyncio
import boto3
from .base import StorageClient

class RustFSClient(StorageClient):
    def __init__(self, endpoint: str, access_key: str, secret_key: str):
        self._s3 = boto3.client(
            "s3",
            endpoint_url=endpoint,
            aws_access_key_id=access_key,
            aws_secret_access_key=secret_key,
        )

    async def download_bytes(self, bucket: str, path: str) -> bytes:
        loop = asyncio.get_running_loop()
        resp = await loop.run_in_executor(
            None, lambda: self._s3.get_object(Bucket=bucket, Key=path)
        )
        return resp["Body"].read()

    async def upload_bytes(self, bucket, path, data, content_type="application/octet-stream"):
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(
            None,
            lambda: self._s3.put_object(
                Bucket=bucket, Key=path, Body=data, ContentType=content_type
            ),
        )

    def get_presigned_url(self, bucket: str, path: str, expires: int = 3600) -> str:
        return self._s3.generate_presigned_url(
            "get_object",
            Params={"Bucket": bucket, "Key": path},
            ExpiresIn=expires,
        )
```

### 4.3 Dependency Injection

```python
# core/dependencies.py
from app.clients.llm.base import LLMClient
from app.clients.storage.base import StorageClient

_llm_client: LLMClient | None = None
_storage_client: StorageClient | None = None

def set_clients(llm: LLMClient, storage: StorageClient):
    global _llm_client, _storage_client
    _llm_client, _storage_client = llm, storage

def get_llm_client() -> LLMClient:
    return _llm_client

def get_storage_client() -> StorageClient:
    return _storage_client
```

```python
# main.py (lifespan initialization)
from contextlib import asynccontextmanager
from fastapi import FastAPI
from app.core.config import get_config
from app.core.dependencies import set_clients
from app.clients.llm.zhipuai_client import ZhipuAIClient
from app.clients.storage.rustfs_client import RustFSClient

@asynccontextmanager
async def lifespan(app: FastAPI):
    cfg = get_config()
    set_clients(
        llm=ZhipuAIClient(api_key=cfg["zhipuai"]["api_key"]),
        storage=RustFSClient(
            endpoint=cfg["storage"]["endpoint"],
            access_key=cfg["storage"]["access_key"],
            secret_key=cfg["storage"]["secret_key"],
        ),
    )
    yield

app = FastAPI(title="Label AI Service", lifespan=lifespan)
```

---

## 5. API Design

Common prefix: `/api/v1`. FastAPI auto-generates Swagger documentation at `/docs`.

### 5.0 Health Check

**`GET /health`**

```json
// response (200 OK)
{"status": "ok"}
```

Used for the Docker healthcheck, Nginx upstream probing, and ops monitoring. No authentication; touches no external dependencies.

### 5.1 Text Triple Extraction

**`POST /api/v1/text/extract`**

```json
// request
{
  "file_path": "text/202404/123.txt",
  "file_name": "设备规范.txt",
  "model": "glm-4-flash",
  "prompt_template": "..."  // optional; defaults to the config template
}

// response
{
  "items": [
    {
      "subject": "变压器",
      "predicate": "额定电压",
      "object": "110kV",
      "source_snippet": "该变压器额定电压为110kV,...",
      "source_offset": {"start": 120, "end": 280}
    }
  ]
}
```

### 5.2 Image Quadruple Extraction

**`POST /api/v1/image/extract`**

```json
// request
{
  "file_path": "image/202404/456.jpg",
  "task_id": 789,
  "model": "glm-4v-flash",
  "prompt_template": "..."
}

// response
{
  "items": [
    {
      "subject": "电缆接头",
      "predicate": "位于",
      "object": "配电箱左侧",
      "qualifier": "2024年检修现场",
      "bbox": {"x": 10, "y": 20, "w": 100, "h": 80},
      "cropped_image_path": "crops/789/0.jpg"
    }
  ]
}
```

Cropping is performed automatically by the AI service and the result uploaded to RustFS; `cropped_image_path` is written directly into the response.

### 5.3 Video Frame Extraction (async)

**`POST /api/v1/video/extract-frames`**

```json
// request
{
  "file_path": "video/202404/001.mp4",
  "source_id": 10,
  "job_id": 42,
  "mode": "interval",     // interval | keyframe
  "frame_interval": 30    // interval mode only; unit: frames
}

// immediate response (202 Accepted)
{
  "message": "任务已接受,后台处理中",
  "job_id": 42
}
```

When the background task completes, the AI service calls the Java backend callback endpoint:

```json
POST {BACKEND_CALLBACK_URL}
{
  "job_id": 42,
  "status": "SUCCESS",
  "frames": [
    {"frame_index": 0, "time_sec": 0.0, "frame_path": "frames/10/0.jpg"},
    {"frame_index": 30, "time_sec": 1.0, "frame_path": "frames/10/1.jpg"}
  ],
  "error_message": null
}
```

### 5.4 Video-to-Text (async)

**`POST /api/v1/video/to-text`**

```json
// request
{
  "file_path": "video/202404/001.mp4",
  "source_id": 10,
  "job_id": 43,
  "start_sec": 0,
  "end_sec": 120,
  "model": "glm-4v-flash",
  "prompt_template": "..."
}

// immediate response (202 Accepted)
{
  "message": "任务已接受,后台处理中",
  "job_id": 43
}
```

Callback on completion:

```json
POST {BACKEND_CALLBACK_URL}
{
  "job_id": 43,
  "status": "SUCCESS",
  "output_path": "video-text/10/1712800000.txt",
  "error_message": null
}
```

### 5.5 Text QA Pair Generation

**`POST /api/v1/qa/gen-text`**

```json
// request
{
  "items": [
    {
      "subject": "变压器",
      "predicate": "额定电压",
      "object": "110kV",
      "source_snippet": "该变压器额定电压为110kV,..."
    }
  ],
  "model": "glm-4-flash",
  "prompt_template": "..."
}

// response
{
  "pairs": [
    {
      "question": "变压器的额定电压是多少?",
      "answer": "该变压器额定电压为110kV。"
    }
  ]
}
```

### 5.6 Image QA Pair Generation

**`POST /api/v1/qa/gen-image`**

```json
// request
{
  "items": [
    {
      "subject": "电缆接头",
      "predicate": "位于",
      "object": "配电箱左侧",
      "qualifier": "2024年检修现场",
      "cropped_image_path": "crops/789/0.jpg"
    }
  ],
  "model": "glm-4v-flash",
  "prompt_template": "..."
}

// response
{
  "pairs": [
    {
      "question": "图中电缆接头位于何处?",
      "answer": "图中电缆接头位于配电箱左侧。",
      "image_path": "crops/789/0.jpg"
    }
  ]
}
```

For image QA generation the AI service re-downloads the cropped image via `storage.download_bytes` and embeds it base64-encoded directly into the multimodal message, avoiding the problem that a presigned URL for the intranet RustFS is unreachable from the cloud-hosted GLM-4V.

### 5.7 Submit Fine-Tuning Job

**`POST /api/v1/finetune/start`**

```json
// request
{
  "jsonl_url": "https://rustfs.example.com/finetune-export/export/xxx.jsonl",
  "base_model": "glm-4-flash",
  "hyperparams": {
    "learning_rate": 1e-4,
    "epochs": 3
  }
}

// response
{
  "job_id": "glm-ft-xxxxxx"
}
```

### 5.8 Query Fine-Tuning Status

**`GET /api/v1/finetune/status/{jobId}`**

```json
// response
{
  "job_id": "glm-ft-xxxxxx",
  "status": "RUNNING",   // RUNNING | SUCCESS | FAILED
  "progress": 45,
  "error_message": null
}
```

---

## 6. Service Layer Design

### 6.1 text_service — Document Parsing + Triple Extraction

```
1. storage.download_bytes("source-data", file_path) → bytes
2. Route to a parser by extension:
   .txt  → decode("utf-8")
   .pdf  → pdfplumber.open(), extract full text
   .docx → python-docx, iterate paragraphs
3. Assemble the prompt (system template + document body)
4. llm.chat(messages, model) → JSON string
5. Parse JSON → validate field completeness → return TripleList
```

Parser registry (eliminates if-else chains):

```python
PARSERS: dict[str, Callable[[bytes], str]] = {
    ".txt": parse_txt,
    ".pdf": parse_pdf,
    ".docx": parse_docx,
}

def extract_text(data: bytes, filename: str) -> str:
    ext = Path(filename).suffix.lower()
    if ext not in PARSERS:
        raise UnsupportedFileTypeError(ext)
    return PARSERS[ext](data)
```
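A self-contained sketch of how the registered parsers might look (the PDF/DOCX bodies are assumptions about the implementation; `pdfplumber` and `python-docx` are imported lazily so the text path carries no heavy dependencies):

```python
import io
from pathlib import Path
from typing import Callable

class UnsupportedFileTypeError(ValueError):
    pass

def parse_txt(data: bytes) -> str:
    return data.decode("utf-8")

def parse_pdf(data: bytes) -> str:
    import pdfplumber  # lazy import: only needed when a PDF actually arrives
    with pdfplumber.open(io.BytesIO(data)) as pdf:
        return "\n".join(page.extract_text() or "" for page in pdf.pages)

def parse_docx(data: bytes) -> str:
    import docx  # lazy import (python-docx)
    document = docx.Document(io.BytesIO(data))
    return "\n".join(p.text for p in document.paragraphs)

PARSERS: dict[str, Callable[[bytes], str]] = {
    ".txt": parse_txt,
    ".pdf": parse_pdf,
    ".docx": parse_docx,
}

def extract_text(data: bytes, filename: str) -> str:
    ext = Path(filename).suffix.lower()
    if ext not in PARSERS:
        raise UnsupportedFileTypeError(ext)
    return PARSERS[ext](data)

print(extract_text("变压器额定电压110kV".encode("utf-8"), "设备规范.txt"))
```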

### 6.2 image_service — Quadruple Extraction + bbox Cropping

```
1. storage.download_bytes("source-data", file_path) → bytes
2. Base64-encode the image bytes and build a GLM-4V image_url message
3. llm.chat_vision(messages, model) → JSON string
4. Parse the quadruples (including bbox)
5. Crop by bbox:
   numpy-decode the bytes → crop the region with cv2 → cv2.imencode(".jpg") → bytes
6. storage.upload_bytes("source-data", f"crops/{task_id}/{i}.jpg", ...)
7. Return QuadrupleList (with cropped_image_path)
```
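Step 5 is plain array slicing once the image is decoded. A sketch of the crop with bounds clamping (NumPy only; in the service the surrounding decode/encode would use `cv2.imdecode`/`cv2.imencode`, and the clamping is an assumption not spelled out above):

```python
import numpy as np

def crop_bbox(img: np.ndarray, bbox: dict) -> np.ndarray:
    """Crop an H×W(×C) image array to bbox {x, y, w, h}, clamped to the image bounds."""
    h, w = img.shape[:2]
    x0 = max(0, bbox["x"])
    y0 = max(0, bbox["y"])
    x1 = min(w, bbox["x"] + bbox["w"])
    y1 = min(h, bbox["y"] + bbox["h"])
    return img[y0:y1, x0:x1]

img = np.zeros((200, 300, 3), dtype=np.uint8)            # stand-in for a decoded photo
crop = crop_bbox(img, {"x": 10, "y": 20, "w": 100, "h": 80})
print(crop.shape)  # (80, 100, 3)
```

Clamping matters because model-reported bboxes occasionally extend past the image edge.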

### 6.3 video_service — OpenCV Frame Sampling + Video-to-Text

**Frame extraction (BackgroundTask)**:

```
0. storage.get_object_size(bucket, file_path) → size in bytes
   Over the video.max_file_size_mb limit → callback FAILED (the router layer
   pre-validates and returns 400)
1. storage.download_bytes → bytes → write to a tempfile
2. Open the temp file with cv2.VideoCapture
3. interval mode: step through frames by frame_interval
   keyframe mode: compute the mean pixel difference against the previous frame
   for every frame; a difference above a threshold marks a scene-change keyframe
   (OpenCV has no native I-frame detection; frame differencing approximates it)
4. Per frame: cv2.imencode(".jpg") → upload_bytes("source-data", f"frames/{source_id}/{i}.jpg")
5. Clean up the temp file
6. httpx.post(BACKEND_CALLBACK_URL, json={job_id, status="SUCCESS", frames=[...]})
   On exception: callback with status="FAILED", error_message=str(e)
```
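The frame-difference heuristic from step 3 can be sketched without OpenCV (frames are assumed to already be decoded uint8 arrays; the threshold value is illustrative, not from the codebase):

```python
import numpy as np

def is_scene_change(prev: np.ndarray, cur: np.ndarray, threshold: float = 30.0) -> bool:
    """Mean absolute pixel difference between consecutive frames vs. a threshold."""
    # widen to int16 so the subtraction of uint8 values cannot wrap around
    diff = np.abs(cur.astype(np.int16) - prev.astype(np.int16))
    return float(diff.mean()) > threshold

black = np.zeros((4, 4), dtype=np.uint8)
white = np.full((4, 4), 255, dtype=np.uint8)
print(is_scene_change(black, black))  # False — identical frames
print(is_scene_change(black, white))  # True — hard cut
```

The int16 widening is the one subtle point: subtracting uint8 arrays directly would wrap around and hide large differences.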

**Video-to-text (BackgroundTask)**:

```
1. download_bytes → tempfile
2. cv2.VideoCapture samples frame_sample_count frames evenly across start_sec~end_sec
3. Each frame is base64-encoded into a multi-image GLM-4V message (with timing notes)
4. llm.chat_vision → textual description
5. upload_bytes("source-data", f"video-text/{source_id}/{timestamp}.txt")
6. Callback to the Java backend: output_path + status="SUCCESS"
```

### 6.4 qa_service — QA Pair Generation

```
Text QA:
  Batch the triples + source_snippet into the prompt
  llm.chat(messages, model) → parse the QA-pair JSON → QAPairList

Image QA:
  Iterate over the quadruple list
  storage.download_bytes(bucket, cropped_image_path) → bytes → base64 encode
  Build a multimodal message (data:image/jpeg;base64,... + question instruction)
  llm.chat_vision → parse → QAPairList with image_path
  (Note: presigned URLs are not used because RustFS is intranet-only and the
  cloud-hosted GLM-4V cannot reach intranet addresses)
```
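A sketch of the base64 multimodal message construction (the message shape follows the OpenAI-style `image_url` content blocks that GLM-4V accepts; the helper name and instruction text are illustrative):

```python
import base64

def build_image_qa_message(image_bytes: bytes, instruction: str) -> list[dict]:
    """Embed a cropped image as a data URL so no public storage address is needed."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            {"type": "text", "text": instruction},
        ],
    }]

msgs = build_image_qa_message(b"\xff\xd8\xff", "基于该裁剪图生成一个问答对")
print(msgs[0]["content"][0]["image_url"]["url"][:23])  # data:image/jpeg;base64,
```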

### 6.5 finetune_service — GLM Fine-Tuning Integration

The fine-tuning API is a ZhipuAI-specific capability and does not need a generic abstraction. `finetune_service` depends on `ZhipuAIClient` directly (obtained via dependency injection and downcast), bypassing the `LLMClient` ABC.

```
Submit:
  zhipuai_client._client.fine_tuning.jobs.create(
      training_file=jsonl_url,
      model=base_model,
      hyperparameters=hyperparams
  ) → job_id

Query:
  zhipuai_client._client.fine_tuning.jobs.retrieve(job_id)
  → map status to the enum RUNNING / SUCCESS / FAILED
```

---

## 7. Logging Design

- Standard-library `logging` with JSON output, integrated with uvicorn
- Per request: `method / path / status_code / duration_ms`
- Per GLM call: `model / prompt_tokens / completion_tokens / duration_ms`
- Per BackgroundTask: `job_id / stage / status / error`
- **Raw file content is never logged** (prevents sensitive-data leakage)
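A minimal sketch of a JSON formatter on the standard library (field names match the request log fields above; this is one possible shape, not the project's actual `core/logging.py`):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # pick up structured extras, e.g. logger.info("req", extra={"path": "/health"})
        for key in ("method", "path", "status_code", "duration_ms"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload, ensure_ascii=False)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("app")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.info("request completed", extra={"path": "/health", "status_code": 200})
```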

---

## 8. Exception Handling

| Exception class | HTTP status | Scenario |
|--------|------------|------|
| `UnsupportedFileTypeError` | 400 | Unsupported file format |
| `StorageDownloadError` | 502 | RustFS unreachable or file missing |
| `LLMResponseParseError` | 502 | GLM returned invalid JSON |
| `LLMCallError` | 503 | GLM API rate limit / timeout |
| Uncaught exception | 500 | Full traceback is logged |

All error responses share one format:

```json
{"code": "ERROR_CODE", "message": "human-readable description"}
```
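The table maps one-to-one onto a handler. A framework-free sketch of the mapping (error codes mirror the API contract's `code` column; the exception classes are stubbed so the snippet is self-contained):

```python
class UnsupportedFileTypeError(Exception): ...
class StorageDownloadError(Exception): ...
class LLMResponseParseError(Exception): ...
class LLMCallError(Exception): ...

# exception class → (HTTP status, error code)
ERROR_MAP = {
    UnsupportedFileTypeError: (400, "UNSUPPORTED_FILE_TYPE"),
    StorageDownloadError: (502, "STORAGE_ERROR"),
    LLMResponseParseError: (502, "LLM_PARSE_ERROR"),
    LLMCallError: (503, "LLM_CALL_ERROR"),
}

def to_error_response(exc: Exception) -> tuple[int, dict]:
    """Anything not in the map falls through to 500 / INTERNAL_ERROR."""
    status, code = ERROR_MAP.get(type(exc), (500, "INTERNAL_ERROR"))
    return status, {"code": code, "message": str(exc)}

print(to_error_response(LLMCallError("rate limited")))
# (503, {'code': 'LLM_CALL_ERROR', 'message': 'rate limited'})
```

In FastAPI this would presumably be wired up via `@app.exception_handler(...)` handlers in `core/exceptions.py` that wrap the returned dict in a `JSONResponse`.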

---

## 9. RustFS Storage Path Conventions

| Resource type | Bucket | Path format |
|----------|--------|----------|
| Uploaded text file | `source-data` | `text/{yyyyMM}/{source_id}.txt` |
| Uploaded image | `source-data` | `image/{yyyyMM}/{source_id}.jpg` |
| Uploaded video | `source-data` | `video/{yyyyMM}/{source_id}.mp4` |
| Frames from video frame extraction | `source-data` | `frames/{source_id}/{frame_index}.jpg` |
| Video segment transcription output | `source-data` | `video-text/{source_id}/{timestamp}.txt` |
| Image/frame bbox crops | `source-data` | `crops/{task_id}/{item_index}.jpg` |
| Exported JSONL file | `finetune-export` | `export/{batchUuid}.jsonl` |

---

## 10. Deployment Design

### 10.1 Dockerfile

```dockerfile
FROM python:3.12-slim

WORKDIR /app

# OpenCV system dependencies
RUN apt-get update && apt-get install -y \
    libgl1 libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY app/ ./app/
COPY config.yaml .
COPY .env .

EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```

### 10.2 docker-compose.yml (ai-service fragment)

```yaml
ai-service:
  build: ./label_ai_service
  ports:
    - "8000:8000"
  env_file:
    - ./label_ai_service/.env
  depends_on:
    - rustfs
    - backend
  networks:
    - label-net
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 10s
```

### 10.3 requirements.txt

```
fastapi>=0.111
uvicorn[standard]>=0.29
pydantic>=2.7
python-dotenv>=1.0
pyyaml>=6.0
zhipuai>=2.1
boto3>=1.34
pdfplumber>=0.11
python-docx>=1.1
opencv-python-headless>=4.9
numpy>=1.26
httpx>=0.27
```

---

## 11. Key Design Decisions

### 11.1 Why LLMClient / StorageClient are ABCs

Only ZhipuAI and RustFS are implemented today, but the model choice and the object store may change as the project evolves. The ABCs pin down the interface contract, so an implementation can be swapped with zero changes to the services layer. The injection point is concentrated in `lifespan`: one change takes effect everywhere.

### 11.2 Why the synchronous ZhipuAI SDK runs in a thread pool

The official ZhipuAI SDK is synchronous and blocking; it cannot simply be `await`ed. Running it via `loop.run_in_executor(None, ...)` moves the call onto a thread pool, so it does not block FastAPI's asyncio event loop and concurrent request handling is preserved.

### 11.3 Why video jobs use BackgroundTasks rather than Celery

The project is modest in scale; video jobs are triggered manually by an ADMIN, so concurrency is bounded. FastAPI's `BackgroundTasks` needs no extra middleware (Redis queue, Celery workers), keeps deployment simple, and job state is handed to the Java backend through the callback endpoint, which fits the overall architecture.

### 11.4 Why image QA uses base64 instead of presigned URLs

RustFS runs on the Docker-internal network (`http://rustfs:9000`), so presigned URLs point at intranet addresses that the cloud-hosted GLM-4V API cannot reach; every image QA request would fail. The cropped image is therefore re-downloaded as bytes and embedded base64-encoded into the multimodal message body, consistent with how `image_service` handles the original image, and RustFS needs no public address.

### 11.5 Why the config.yaml + .env split

`config.yaml` holds structured, stable, non-sensitive settings: it is readable and its change history is worth tracking in git. `.env` holds secrets and per-environment values: the format is trivial, Docker's `env_file` supports it natively, and local development and container startup behave identically without maintaining two sets of configuration files.

---

*Document version: v1.0 | Generated: 2026-04-10*
@@ -1,3 +0,0 @@
[pytest]
asyncio_mode = auto
testpaths = tests
@@ -1,38 +0,0 @@
# Specification Quality Checklist: AI Service Requirements Document

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-04-10
**Feature**: [../spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs) — note: the Technical Environment section is listed separately and explicitly marked as confirmed technical constraints; it does not affect the requirements-level wording
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders (business scenarios are described from the ADMIN / annotator / system perspective)
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous (each FR states explicit behavior and verifiable conditions)
- [x] Success criteria are measurable (SCs carry concrete time, pixel-accuracy, and similar quantified targets)
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined (all 8 User Stories include Acceptance Scenarios)
- [x] Edge cases are identified (6 edge cases covering corrupt files, empty results, concurrency, etc.)
- [x] Scope is clearly bounded (explicitly: no upload handling, no training-resource management, not exposed externally)
- [x] Dependencies and assumptions identified (9 assumptions, including intranet/internet reachability and ZhipuAI hosting)

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria (FR-001~FR-018 each traceable to User Story acceptance scenarios)
- [x] User scenarios cover primary flows (P1: text/image extraction; P2: video/QA; P3: fine-tuning/health check)
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification (the Technical Environment section stands alone and is not mixed into FRs/SCs)

## Notes

- The Technical Environment section goes beyond a traditional requirements document, but the user explicitly asked for the environment constraints (Python 3.12.13, FastAPI, conda `label` environment) to be included; it is kept as its own section with its nature stated.
- SC-009 (test coverage) is an engineering-quality metric rather than a user-perceived requirement, but it materially affects service reliability, so it is retained.
- All [NEEDS CLARIFICATION] items were resolved via reasonable defaults or the design document; no open questions remain for the user.

**VERDICT**: ✅ The spec is ready; proceed to `/speckit.clarify` or `/speckit.plan`
@@ -1,333 +0,0 @@
|
||||
# API Contract: AI 服务接口定义
|
||||
|
||||
**Branch**: `001-ai-service-requirements` | **Date**: 2026-04-10
|
||||
**Base URL**: `http://ai-service:8000`
|
||||
**API Prefix**: `/api/v1`
|
||||
**Swagger**: `/docs`(FastAPI 自动生成)
|
||||
|
||||
---
|
||||
|
||||
## 通用约定
|
||||
|
||||
### 请求格式
|
||||
- 所有请求体:`Content-Type: application/json`
|
||||
- 无认证机制(内网服务,仅 Java 后端调用)
|
||||
|
||||
### 响应格式
|
||||
- 成功:HTTP 2xx,JSON 响应体
|
||||
- 错误:HTTP 4xx/5xx,统一错误格式:
|
||||
```json
|
||||
{"code": "ERROR_CODE", "message": "具体描述"}
|
||||
```
|
||||
|
||||
### 错误码
|
||||
|
||||
| HTTP 状态码 | code | 触发条件 |
|
||||
|------------|------|---------|
|
||||
| 400 | UNSUPPORTED_FILE_TYPE | 文件格式不支持(如 .xlsx) |
|
||||
| 400 | VIDEO_TOO_LARGE | 视频文件超过大小上限 |
|
||||
| 502 | STORAGE_ERROR | RustFS 不可达或文件不存在 |
|
||||
| 502 | LLM_PARSE_ERROR | GLM 返回非合法 JSON |
|
||||
| 503 | LLM_CALL_ERROR | GLM API 限流 / 超时 |
|
||||
| 500 | INTERNAL_ERROR | 未捕获异常 |
|
||||
|
||||
---
|
||||
|
||||
## 端点一览
|
||||
|
||||
| 端点 | 方法 | 功能 | 响应码 |
|
||||
|------|------|------|--------|
|
||||
| `/health` | GET | 健康检查 | 200 |
|
||||
| `/api/v1/text/extract` | POST | 文档三元组提取 | 200 |
|
||||
| `/api/v1/image/extract` | POST | 图像四元组提取 | 200 |
|
||||
| `/api/v1/video/extract-frames` | POST | 视频帧提取(异步) | 202 |
|
||||
| `/api/v1/video/to-text` | POST | 视频转文本(异步) | 202 |
|
||||
| `/api/v1/qa/gen-text` | POST | 文本问答对生成 | 200 |
|
||||
| `/api/v1/qa/gen-image` | POST | 图像问答对生成 | 200 |
|
||||
| `/api/v1/finetune/start` | POST | 提交微调任务 | 200 |
|
||||
| `/api/v1/finetune/status/{jobId}` | GET | 查询微调状态 | 200 |
|
||||
|
||||
---
|
||||
|
||||
## 端点详情
|
||||
|
||||
### GET /health
|
||||
|
||||
健康检查端点,无需认证,无请求体。
|
||||
|
||||
**响应(200 OK)**:
|
||||
```json
|
||||
{"status": "ok"}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### POST /api/v1/text/extract
|
||||
|
||||
从存储中指定路径的文档提取文本三元组。
|
||||
|
||||
**请求体**:
|
||||
```json
|
||||
{
|
||||
"file_path": "text/202404/123.txt",
|
||||
"file_name": "设备规范.txt",
|
||||
"model": "glm-4-flash",
|
||||
"prompt_template": "..."
|
||||
}
|
||||
```
|
||||
|
||||
| 字段 | 类型 | 必填 | 说明 |
|
||||
|------|------|------|------|
|
||||
| file_path | string | 是 | RustFS 中的文件路径 |
|
||||
| file_name | string | 是 | 带扩展名的文件名(用于判断格式) |
|
||||
| model | string | 否 | 模型名,默认使用 config 中的 default_text |
|
||||
| prompt_template | string | 否 | 自定义提示词,不传使用内置模板 |
|
||||
|
||||
**支持格式**: `.txt`, `.pdf`, `.docx`
|
||||
|
||||
**响应(200 OK)**:
|
||||
```json
|
||||
{
|
||||
"items": [
|
||||
{
|
||||
"subject": "变压器",
|
||||
"predicate": "额定电压",
|
||||
"object": "110kV",
|
||||
"source_snippet": "该变压器额定电压为110kV",
|
||||
"source_offset": {"start": 120, "end": 150}
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### POST /api/v1/image/extract
|
||||
|
||||
从存储中指定路径的图片提取知识四元组,并自动裁剪 bbox 区域。
|
||||
|
||||
**请求体**:
|
||||
```json
|
||||
{
|
||||
"file_path": "image/202404/456.jpg",
|
||||
"task_id": 789,
|
||||
"model": "glm-4v-flash",
|
||||
"prompt_template": "..."
|
||||
}
|
||||
```
|
||||
|
||||
| 字段 | 类型 | 必填 | 说明 |
|
||||
|------|------|------|------|
|
||||
| file_path | string | 是 | RustFS 中的图片路径 |
|
||||
| task_id | int | 是 | 标注任务 ID(用于构造裁剪图存储路径) |
|
||||
| model | string | 否 | 默认使用 config 中的 default_vision |
|
||||
| prompt_template | string | 否 | 自定义提示词 |
|
||||
|
||||
**响应(200 OK)**:
|
||||
```json
|
||||
{
|
||||
"items": [
|
||||
{
|
||||
"subject": "电缆接头",
|
||||
"predicate": "位于",
|
||||
"object": "配电箱左侧",
|
||||
"qualifier": "2024年检修现场",
|
||||
"bbox": {"x": 10, "y": 20, "w": 100, "h": 80},
|
||||
"cropped_image_path": "crops/789/0.jpg"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### POST /api/v1/video/extract-frames
|
||||
|
||||
触发视频帧提取后台任务,立即返回。
|
||||
|
||||
**请求体**:
|
||||
```json
|
||||
{
|
||||
"file_path": "video/202404/001.mp4",
|
||||
"source_id": 10,
|
||||
"job_id": 42,
|
||||
"mode": "interval",
|
||||
"frame_interval": 30
|
||||
}
|
||||
```
|
||||
|
||||
| 字段 | 类型 | 必填 | 说明 |
|
||||
|------|------|------|------|
|
||||
| file_path | string | 是 | RustFS 中的视频路径 |
|
||||
| source_id | int | 是 | 原始资料 ID(用于构造帧存储路径) |
|
||||
| job_id | int | 是 | 由 Java 后端分配的任务 ID |
|
||||
| mode | string | 否 | `interval`(默认)或 `keyframe` |
|
||||
| frame_interval | int | 否 | interval 模式专用,按帧数步进,默认 30 |
|
||||
|
||||
**响应(202 Accepted)**:
|
||||
```json
|
||||
{"message": "任务已接受,后台处理中", "job_id": 42}
|
||||
```
|
||||
|
||||
**完成后回调 Java 后端**(POST `{BACKEND_CALLBACK_URL}`):
|
||||
```json
|
||||
{
|
||||
"job_id": 42,
|
||||
"status": "SUCCESS",
|
||||
"frames": [
|
||||
{"frame_index": 0, "time_sec": 0.0, "frame_path": "frames/10/0.jpg"}
|
||||
],
|
||||
"error_message": null
|
||||
}
|
||||
```

---

### POST /api/v1/video/to-text

Triggers a background video-segment-to-text job and returns immediately.

**Request body**:

```json
{
  "file_path": "video/202404/001.mp4",
  "source_id": 10,
  "job_id": 43,
  "start_sec": 0,
  "end_sec": 120,
  "model": "glm-4v-flash",
  "prompt_template": "..."
}
```

| Field | Type | Required | Description |
|------|------|------|------|
| file_path | string | Yes | Video path in RustFS |
| source_id | int | Yes | Source material ID |
| job_id | int | Yes | Job ID assigned by the Java backend |
| start_sec | float | Yes | Analysis start time (seconds) |
| end_sec | float | Yes | Analysis end time (seconds) |
| model | string | No | Defaults to default_vision from config |
| prompt_template | string | No | Custom prompt template |

**Response (202 Accepted)**:

```json
{"message": "Task accepted, processing in background", "job_id": 43}
```

**Callback to the Java backend on completion** (POST `{BACKEND_CALLBACK_URL}`):

```json
{
  "job_id": 43,
  "status": "SUCCESS",
  "output_path": "video-text/10/1712800000.txt",
  "error_message": null
}
```

---

### POST /api/v1/qa/gen-text

Generates candidate question-answer pairs in batch from text triples.

**Request body**:

```json
{
  "items": [
    {
      "subject": "transformer",
      "predicate": "rated voltage",
      "object": "110kV",
      "source_snippet": "The transformer's rated voltage is 110kV"
    }
  ],
  "model": "glm-4-flash",
  "prompt_template": "..."
}
```

**Response (200 OK)**:

```json
{
  "pairs": [
    {"question": "What is the transformer's rated voltage?", "answer": "The transformer's rated voltage is 110kV."}
  ]
}
```

---

### POST /api/v1/qa/gen-image

Generates candidate image-text QA pairs from image quadruples. The AI service fetches the images from storage automatically; callers only supply paths.

**Request body**:

```json
{
  "items": [
    {
      "subject": "cable joint",
      "predicate": "located at",
      "object": "left side of the distribution box",
      "qualifier": "2024 maintenance site",
      "cropped_image_path": "crops/789/0.jpg"
    }
  ],
  "model": "glm-4v-flash",
  "prompt_template": "..."
}
```

**Response (200 OK)**:

```json
{
  "pairs": [
    {
      "question": "Where is the cable joint in the image?",
      "answer": "The cable joint is on the left side of the distribution box.",
      "image_path": "crops/789/0.jpg"
    }
  ]
}
```

---

### POST /api/v1/finetune/start

Submits a fine-tuning job to ZhipuAI.

**Request body**:

```json
{
  "jsonl_url": "https://rustfs.example.com/finetune-export/export/xxx.jsonl",
  "base_model": "glm-4-flash",
  "hyperparams": {"learning_rate": 1e-4, "epochs": 3}
}
```

**Response (200 OK)**:

```json
{"job_id": "glm-ft-xxxxxx"}
```

---

### GET /api/v1/finetune/status/{jobId}

Queries the status of a fine-tuning job.

**Path parameter**: `jobId` — fine-tuning job ID (returned by `/finetune/start`)

**Response (200 OK)**:

```json
{
  "job_id": "glm-ft-xxxxxx",
  "status": "RUNNING",
  "progress": 45,
  "error_message": null
}
```

`status` values: `RUNNING` | `SUCCESS` | `FAILED`
@@ -1,167 +0,0 @@

# Data Model: AI Service

**Branch**: `001-ai-service-requirements` | **Date**: 2026-04-10

---

## Entity Definitions

### TripleItem (text triple)

A single knowledge relation extracted from a document.

| Field | Type | Constraint | Description |
|------|------|------|------|
| subject | string | non-empty | Subject entity |
| predicate | string | non-empty | Predicate/relation |
| object | string | non-empty | Object entity |
| source_snippet | string | non-empty | Evidence snippet quoted verbatim from the source text |
| source_offset.start | int | ≥0 | Start character offset of the snippet in the full text |
| source_offset.end | int | >start | End character offset of the snippet in the full text |

**State transitions**: none (read-only output)

---

### QuadrupleItem (image quadruple)

A single knowledge relation extracted from an image, with image-location info.

| Field | Type | Constraint | Description |
|------|------|------|------|
| subject | string | non-empty | Subject entity |
| predicate | string | non-empty | Relation/attribute |
| object | string | non-empty | Object entity |
| qualifier | string | nullable | Qualifying info (time, condition, scene) |
| bbox.x | int | ≥0 | Top-left x pixel coordinate of the bounding box |
| bbox.y | int | ≥0 | Top-left y pixel coordinate of the bounding box |
| bbox.w | int | >0 | Bounding box width (pixels) |
| bbox.h | int | >0 | Bounding box height (pixels) |
| cropped_image_path | string | non-empty | Storage path of the cropped image in RustFS |

**Derivation rule**: `cropped_image_path = "crops/{task_id}/{item_index}.jpg"`, generated and uploaded automatically by image_service
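
The derivation rule and the bbox constraints can be sketched as two plain helpers. This is an illustrative sketch, not the actual image_service code; the clamping semantics follow the spec's edge-case rule that out-of-bounds boxes are truncated to the valid image area rather than rejected.

```python
def crop_path(task_id: int, item_index: int) -> str:
    """Derived rule: crops are stored under crops/{task_id}/{item_index}.jpg."""
    return f"crops/{task_id}/{item_index}.jpg"

def clamp_bbox(x: int, y: int, w: int, h: int,
               img_w: int, img_h: int) -> tuple[int, int, int, int]:
    """Clamp a bounding box to the valid image area (truncate, never error)."""
    x = max(0, min(x, img_w - 1))
    y = max(0, min(y, img_h - 1))
    w = max(1, min(w, img_w - x))
    h = max(1, min(h, img_h - y))
    return x, y, w, h
```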

---

### QAPair (text QA pair)

A candidate training QA pair generated from a text triple.

| Field | Type | Constraint | Description |
|------|------|------|------|
| question | string | non-empty | Question text |
| answer | string | non-empty | Answer text |

---

### ImageQAPair (image QA pair)

A candidate image-text training QA pair generated from an image quadruple.

| Field | Type | Constraint | Description |
|------|------|------|------|
| question | string | non-empty | Question text |
| answer | string | non-empty | Answer text |
| image_path | string | non-empty | Storage path of the cropped image (from QuadrupleItem.cropped_image_path) |

---

### FrameInfo (video frame info)

Metadata for a single frame in a frame-extraction job.

| Field | Type | Constraint | Description |
|------|------|------|------|
| frame_index | int | ≥0 | Original frame number within the video |
| time_sec | float | ≥0.0 | Timestamp of the frame (seconds) |
| frame_path | string | non-empty | Storage path of the frame image in RustFS |

**Derivation rule**: `frame_path = "frames/{source_id}/{upload_index}.jpg"`

---

### VideoJobCallback (video job callback)

Notification payload sent to the Java backend when an async video job finishes.

| Field | Type | Constraint | Description |
|------|------|------|------|
| job_id | int | non-empty | Job ID assigned by the Java backend |
| status | string | SUCCESS \| FAILED | Final job status |
| frames | FrameInfo[] \| null | non-null only for frame extraction | Extracted frame list (may be empty) |
| output_path | string \| null | non-null only for video-to-text | Storage path of the output text description |
| error_message | string \| null | non-null only when FAILED | Error description |

---

### FinetuneJob (fine-tuning job)

Status snapshot of a fine-tuning job.

| Field | Type | Constraint | Description |
|------|------|------|------|
| job_id | string | non-empty | Job ID assigned by the ZhipuAI platform (e.g. "glm-ft-xxxxxx") |
| status | string | RUNNING \| SUCCESS \| FAILED | Current status |
| progress | int \| null | 0-100 \| null | Completion percentage (when ZhipuAI reports it) |
| error_message | string \| null | non-null only when FAILED | Error description |

**Status mapping**:

```
ZhipuAI "running"   → RUNNING
ZhipuAI "succeeded" → SUCCESS
ZhipuAI "failed"    → FAILED
anything else       → RUNNING (conservative fallback)
```
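
The mapping table reads directly as a small helper. A minimal sketch (function and dict names are illustrative, not the service's actual code):

```python
_STATUS_MAP = {"running": "RUNNING", "succeeded": "SUCCESS", "failed": "FAILED"}

def map_finetune_status(zhipuai_status: str) -> str:
    # Unknown or intermediate states fall back to RUNNING (conservative handling).
    return _STATUS_MAP.get(zhipuai_status.lower(), "RUNNING")
```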

---

## RustFS Storage Path Conventions

| Resource type | Bucket | Path format |
|----------|--------|----------|
| Uploaded text file | `source-data` | `text/{yyyyMM}/{source_id}.txt` |
| Uploaded image | `source-data` | `image/{yyyyMM}/{source_id}.jpg` |
| Uploaded video | `source-data` | `video/{yyyyMM}/{source_id}.mp4` |
| Video frame image | `source-data` | `frames/{source_id}/{upload_index}.jpg` |
| Video transcription text | `source-data` | `video-text/{source_id}/{timestamp}.txt` |
| Image/frame bbox crop | `source-data` | `crops/{task_id}/{item_index}.jpg` |
| Exported JSONL file | `finetune-export` | `export/{batchUuid}.jsonl` |

---

## Configuration Model

### config.yaml (non-sensitive, committed to git)

```yaml
server:
  port: 8000
  log_level: INFO

storage:
  buckets:
    source_data: "source-data"
    finetune_export: "finetune-export"

backend: {}  # callback_url injected via .env

video:
  frame_sample_count: 8   # frames sampled uniformly for video-to-text
  max_file_size_mb: 200   # video size cap (overridable via MAX_VIDEO_SIZE_MB)

models:
  default_text: "glm-4-flash"
  default_vision: "glm-4v-flash"
```

### Environment Variable Override Mapping

| Env var | YAML path | Description |
|----------|-----------|------|
| ZHIPUAI_API_KEY | zhipuai.api_key | Required |
| STORAGE_ACCESS_KEY | storage.access_key | Required |
| STORAGE_SECRET_KEY | storage.secret_key | Required |
| STORAGE_ENDPOINT | storage.endpoint | RustFS address |
| BACKEND_CALLBACK_URL | backend.callback_url | Java backend callback endpoint |
| LOG_LEVEL | server.log_level | Log level |
| MAX_VIDEO_SIZE_MB | video.max_file_size_mb | Video size cap |
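
The override mechanism above can be sketched as a small overlay function — a minimal version of what `app/core/config.py` is described to do. The mapping dict mirrors a subset of the table, and the integer conversion for numeric values is an assumption of this sketch:

```python
import os

# Env var → nested YAML path (subset of the table above, for illustration).
ENV_OVERRIDES = {
    "STORAGE_ENDPOINT": ("storage", "endpoint"),
    "BACKEND_CALLBACK_URL": ("backend", "callback_url"),
    "LOG_LEVEL": ("server", "log_level"),
    "MAX_VIDEO_SIZE_MB": ("video", "max_file_size_mb"),
}

def apply_env_overrides(config: dict) -> dict:
    """Overlay environment variables onto the YAML-loaded config dict (env wins)."""
    for env_name, path in ENV_OVERRIDES.items():
        value = os.environ.get(env_name)
        if value is None:
            continue
        node = config
        for key in path[:-1]:
            node = node.setdefault(key, {})
        # Assumed convention: purely numeric values become ints (e.g. the size cap).
        node[path[-1]] = int(value) if value.isdigit() else value
    return config
```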
@@ -1,120 +0,0 @@

# Implementation Plan: AI Service Requirements

**Branch**: `001-ai-service-requirements` | **Date**: 2026-04-10 | **Spec**: [spec.md](spec.md)
**Input**: Feature specification from `/specs/001-ai-service-requirements/spec.md`

> **Reference implementation plan (master plan)**: `docs/superpowers/plans/2026-04-10-ai-service-impl.md`
> This file is the speckit planning-framework document; the detailed TDD tasks (17 steps with full code) live in the master plan above.

## Summary

Implement an independently deployed Python FastAPI AI service that provides the knowledge-graph annotation platform with text triple extraction, image quadruple extraction, video frame processing, QA-pair generation, and GLM fine-tuning management. The service reads and writes files through the RustFS S3 API, calls large models through the ZhipuAI GLM API, and notifies the Java backend of async task results via a callback endpoint. ABC adapter layers (LLMClient / StorageClient) keep it extensible, FastAPI BackgroundTasks handles long video jobs, and development is fully TDD.

## Technical Context

**Language/Version**: Python 3.12.13 (conda `label` environment)
**Primary Dependencies**: FastAPI ≥0.111, uvicorn[standard] ≥0.29, pydantic ≥2.7, zhipuai ≥2.1, boto3 ≥1.34, pdfplumber ≥0.11, python-docx ≥1.1, opencv-python-headless ≥4.9, numpy ≥1.26, httpx ≥0.27, python-dotenv ≥1.0, pyyaml ≥6.0
**Storage**: RustFS (S3-compatible protocol, accessed via boto3)
**Testing**: pytest ≥8.0 + pytest-asyncio ≥0.23; every service and router has unit tests
**Target Platform**: Linux containers (Docker + Docker Compose)
**Project Type**: web-service
**Performance Goals**: text extraction <60s; image extraction <30s; video task acceptance <1s; health check <1s; QA generation (≤10 items) <90s
**Constraints**: video file size capped at 200MB by default (configurable via the MAX_VIDEO_SIZE_MB env var); no database access; GLM is a cloud API, so images must be sent as base64; the ZhipuAI SDK is synchronous and blocking and must run in a thread pool
**Scale/Scope**: low concurrency (triggered manually by ADMIN), at most 5 concurrent video jobs

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

> The project constitution is an unfilled template with no project-specific rules; the evaluation below uses general engineering principles.

| Principle | Status | Notes |
|------|------|------|
| Test-first (TDD) | ✅ Pass | The implementation plan follows red-green-refactor; every module is test-first |
| Simplicity (YAGNI) | ✅ Pass | BackgroundTasks instead of Celery; no database; adapter layer holds only the current implementations |
| Observability | ✅ Pass | JSON structured logs covering request / GLM / video-job dimensions |
| Error classification | ✅ Pass | 4 exception classes (400/502/503/500) with structured responses |
| Extensibility | ✅ Pass | LLMClient / StorageClient ABC adapter layers |
| Layered configuration | ✅ Pass | config.yaml + .env + env-var overrides |

**GATE RESULT**: ✅ No violations; proceed to Phase 0.

## Project Structure

### Documentation (this feature)

```text
specs/001-ai-service-requirements/
├── plan.md          # this file (/speckit.plan output)
├── research.md      # Phase 0 output
├── data-model.md    # Phase 1 output
├── quickstart.md    # Phase 1 output
├── contracts/       # Phase 1 output
│   └── api.md
└── tasks.md         # Phase 2 output (/speckit.tasks — not created by this command)
```

### Source Code (repository root)

```text
label_ai_service/
├── app/
│   ├── main.py                      # FastAPI entry point, lifespan, /health endpoint
│   ├── core/
│   │   ├── config.py                # Layered YAML + .env config, lru_cache singleton
│   │   ├── logging.py               # JSON structured logging, request-logging middleware
│   │   ├── exceptions.py            # Custom exceptions + global handlers
│   │   ├── json_utils.py            # GLM response JSON parsing (handles Markdown code fences)
│   │   └── dependencies.py          # FastAPI Depends factory functions
│   ├── clients/
│   │   ├── llm/
│   │   │   ├── base.py              # LLMClient ABC (chat / chat_vision)
│   │   │   └── zhipuai_client.py    # ZhipuAI implementation (thread-pool wrapper around sync SDK)
│   │   └── storage/
│   │       ├── base.py              # StorageClient ABC (download/upload/presigned/size)
│   │       └── rustfs_client.py     # RustFS S3-compatible implementation
│   ├── services/
│   │   ├── text_service.py          # TXT/PDF/DOCX parsing + triple extraction
│   │   ├── image_service.py         # Quadruple extraction + bbox cropping
│   │   ├── video_service.py         # Frame extraction + video-to-text (BackgroundTask)
│   │   ├── qa_service.py            # Text/image QA generation (images as base64)
│   │   └── finetune_service.py      # Fine-tuning submission and status query
│   ├── routers/
│   │   ├── text.py                  # POST /api/v1/text/extract
│   │   ├── image.py                 # POST /api/v1/image/extract
│   │   ├── video.py                 # POST /api/v1/video/extract-frames, /to-text
│   │   ├── qa.py                    # POST /api/v1/qa/gen-text, /gen-image
│   │   └── finetune.py              # POST /api/v1/finetune/start, GET /status/{id}
│   └── models/
│       ├── text_models.py
│       ├── image_models.py
│       ├── video_models.py
│       ├── qa_models.py
│       └── finetune_models.py
├── tests/
│   ├── conftest.py                  # mock_llm, mock_storage fixtures
│   ├── test_config.py
│   ├── test_llm_client.py
│   ├── test_storage_client.py
│   ├── test_text_service.py
│   ├── test_text_router.py
│   ├── test_image_service.py
│   ├── test_image_router.py
│   ├── test_video_service.py
│   ├── test_video_router.py
│   ├── test_qa_service.py
│   ├── test_qa_router.py
│   ├── test_finetune_service.py
│   └── test_finetune_router.py
├── config.yaml
├── .env
├── requirements.txt
├── Dockerfile
└── docker-compose.yml
```

**Structure Decision**: single-project layout (Option 1), layered as routers → services → clients, with tests alongside the source.

## Complexity Tracking

> No constitution violations; this section is intentionally empty.
@@ -1,109 +0,0 @@

# Quickstart: AI Service Development Guide

**Branch**: `001-ai-service-requirements` | **Date**: 2026-04-10

---

## Environment Setup

```bash
# Activate the conda environment
conda activate label

# Install dependencies (inside the label_ai_service directory)
pip install -r requirements.txt
```

---

## Local Development

```bash
# 1. Copy and configure .env (a template is committed)
#    Edit .env with the real ZHIPUAI_API_KEY and STORAGE_ENDPOINT

# 2. Start the dev server
conda run -n label uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

# 3. Open the Swagger docs
#    http://localhost:8000/docs
```

---

## Running Tests

```bash
# Run the full suite
conda run -n label pytest tests/ -v

# Run a single module's tests
conda run -n label pytest tests/test_text_service.py -v

# Run with a coverage report
conda run -n label pytest tests/ --cov=app --cov-report=term-missing
```

---

## Docker Deployment

```bash
# Build the image
docker build -t label-ai-service:dev .

# Start with docker-compose (includes RustFS)
docker-compose up -d

# Tail logs
docker-compose logs -f ai-service

# Health check
curl http://localhost:8000/health
```

---

## Key Configuration Notes

### Adjusting the video size cap

No image rebuild needed; add to `.env`:
```ini
MAX_VIDEO_SIZE_MB=500
```

### Switching models

Edit `config.yaml`:
```yaml
models:
  default_text: "glm-4-flash"     # text model
  default_vision: "glm-4v-flash"  # vision model
```

---

## Development Workflow (TDD)

The 17 detailed task steps (with full code) are in the master implementation plan:
`docs/superpowers/plans/2026-04-10-ai-service-impl.md`

Steps for each task:

1. Write a failing test (verify the failure with `pytest ... -v`)
2. Implement the minimal code to make it pass (verify with `pytest ... -v`)
3. Commit

---

## Directory Structure Cheat Sheet

```
app/
├── main.py      # entry point, /health endpoint, router registration
├── core/        # config, logging, exceptions, utilities
├── clients/     # LLM and Storage adapter layers (ABC + implementations)
├── services/    # business logic (text/image/video/qa/finetune)
├── routers/     # HTTP route handlers
└── models/      # Pydantic request/response schemas
```
@@ -1,76 +0,0 @@

# Research: AI Service Implementation Decisions

**Branch**: `001-ai-service-requirements` | **Date**: 2026-04-10
**Status**: Complete (all decisions were settled during design; nothing left to research)

---

## Decision Records

### D-001: Async framework

**Decision**: FastAPI + uvicorn
**Rationale**: Native async/await support, automatic Pydantic validation, auto-generated Swagger docs; the best performance/productivity trade-off in the Python ecosystem.
**Alternatives considered**: Django (too heavy), Flask (no native async), aiohttp (no auto docs or type validation)

---

### D-002: Calling the ZhipuAI SDK

**Decision**: Run the synchronous SDK in a thread pool via `asyncio.get_event_loop().run_in_executor(None, ...)`
**Rationale**: The official ZhipuAI SDK is synchronous and blocking; calling it directly inside an async function would stall the event loop. `run_in_executor` offloads the blocking call to a thread pool, keeping the FastAPI event loop responsive.
**Alternatives considered**: `asyncio.to_thread()` (Python 3.9+ sugar, equivalent; `run_in_executor` chosen for backward compatibility); calling the ZhipuAI HTTP API directly with httpx (bypasses the SDK but adds maintenance burden)
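
A minimal sketch of the chosen pattern, with a stand-in function replacing the blocking SDK call (the real client wraps ZhipuAI SDK methods the same way):

```python
import asyncio
import time
from functools import partial

def blocking_chat(model: str, prompt: str) -> str:
    # Stand-in for the synchronous ZhipuAI SDK call (illustrative only).
    time.sleep(0.01)
    return f"{model}: reply to {prompt}"

async def chat(model: str, prompt: str) -> str:
    # Offload the blocking call to the default thread pool so the
    # event loop stays free to serve other requests meanwhile.
    loop = asyncio.get_event_loop()
    return await loop.run_in_executor(None, partial(blocking_chat, model, prompt))

result = asyncio.run(chat("glm-4-flash", "hello"))
```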

---

### D-003: Image transport for image QA generation

**Decision**: base64-encode images into the message body (`data:image/jpeg;base64,...`)
**Rationale**: RustFS runs on the Docker-internal network (endpoint: `http://rustfs:9000`), so presigned URLs point at internal addresses the cloud-hosted GLM-4V cannot reach. base64 inlines the image content into the API request, independent of network reachability.
**Alternatives considered**: presigned URLs (not viable: internal addresses are unreachable from the cloud); exposing RustFS publicly (added security risk)

---

### D-004: Handling long video jobs

**Decision**: FastAPI BackgroundTasks + HTTP callback to the Java backend
**Rationale**: Video processing time is unbounded (seconds to minutes), so synchronous waits would time out. BackgroundTasks needs no extra middleware (Redis/Celery), deploys simply, and job state is managed by the Java backend through the callback endpoint, matching the overall architecture. With limited concurrency (≤5 simultaneous jobs), BackgroundTasks is entirely sufficient.
**Alternatives considered**: Celery (needs a Redis broker, extra ops burden); asyncio.create_task (jobs lost on process restart)

---

### D-005: Layered configuration

**Decision**: config.yaml (stable, non-sensitive settings) + .env (secrets and per-environment values), with environment variables taking precedence over YAML
**Rationale**: YAML gives structured readability and suits git-tracked, non-sensitive changes; the .env format is natively supported by Docker's `env_file`; the env-var override mechanism lets containers switch configuration without rebuilding the image.
**Alternatives considered**: .env only (unstructured, hard to maintain for complex config); storing config in a database (too heavy)

---

### D-006: OOM protection for large videos

**Decision**: In the video routers (after accepting the request, before starting the background task), check the file size via `storage.get_object_size()` and return HTTP 400 if it exceeds the cap
**Rationale**: Rejecting before download avoids an actual OOM; the cap is runtime-configurable via config.yaml plus the MAX_VIDEO_SIZE_MB env var, with no image rebuild; the implementation is simple and needs no new streaming-download abstraction.
**Alternatives considered**: streaming download (Completeness: 9/10, but YAGNI at current scale); no limit (Completeness: 4/10, OOM risk)
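
The guard reduces to a single comparison. A sketch under two assumptions: `ValueError` stands in for the service's VideoTooLargeError (HTTP 400), and per the spec's edge-case rule a file exactly at the cap is rejected:

```python
MAX_VIDEO_SIZE_MB = 200  # default; overridable via the MAX_VIDEO_SIZE_MB env var

def check_video_size(size_bytes: int, limit_mb: int = MAX_VIDEO_SIZE_MB) -> None:
    """Reject before download; a file exactly at the limit counts as over it."""
    if size_bytes >= limit_mb * 1024 * 1024:
        raise ValueError(f"video exceeds {limit_mb}MB limit")
```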

---

### D-007: Video keyframe detection algorithm

**Decision**: Frame-difference approximation: compute the mean absolute pixel difference between the grayscale current and previous frames; a difference above a threshold (default 30.0) marks a scene change
**Rationale**: OpenCV has no native I-frame detection API (`CAP_PROP_POS_FRAMES` is for frame seeking, not I-frame identification). Frame differencing is simple, effective, accurate for scene-change detection, and needs no low-level decoder support.
**Alternatives considered**: codec-level I-frame detection (needs FFmpeg, an extra dependency); fixed intervals (not adaptive, unsuitable for keyframe mode)
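
The detection logic can be sketched on plain nested lists standing in for grayscale frames; the real implementation would operate on OpenCV arrays (`cv2.absdiff` plus `.mean()`), which this mirrors:

```python
def mean_abs_diff(prev: list[list[int]], cur: list[list[int]]) -> float:
    """Mean absolute per-pixel difference between two grayscale frames."""
    total = sum(abs(a - b)
                for prow, crow in zip(prev, cur)
                for a, b in zip(prow, crow))
    return total / (len(cur) * len(cur[0]))

def detect_keyframes(frames: list[list[list[int]]],
                     threshold: float = 30.0) -> list[int]:
    """Indices of frames whose difference from the previous frame exceeds the threshold."""
    keyframes = [0]  # first frame is always kept
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            keyframes.append(i)
    return keyframes
```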

---

### D-008: Testing strategy

**Decision**: pytest + pytest-asyncio; test the service layer and router layer separately, mocking external dependencies with AsyncMock
**Rationale**: Service-layer tests cover business logic without HTTP; router-layer tests use TestClient to exercise the full request flow. Video service tests use a real tiny video file (generated with OpenCV VideoWriter) to verify frame-extraction correctness.
**Alternatives considered**: integration tests only (require real RustFS and ZhipuAI; expensive in CI); unit tests only (cannot cover router and exception-handler integration)

---

## No Open Items

Every NEEDS CLARIFICATION was resolved during design through user confirmation or sensible defaults. This research.md exists only as a decision archive.
@@ -1,258 +0,0 @@

# Feature Specification: AI Service Requirements

**Feature Branch**: `001-ai-service-requirements`
**Created**: 2026-04-10
**Status**: Draft
**Input**: User description: "@docs/superpowers/specs/2026-04-10-ai-service-design.md — complete the requirements document based on the design document"

---

## Overview

The knowledge-graph annotation platform needs a standalone AI compute service that accepts calls from the Java backend and performs structured document extraction, image analysis, video preprocessing, training-data generation, and model fine-tuning management, embedding large-model capabilities into the annotation workflow and substantially reducing manual annotation cost.

---

## User Scenarios & Testing *(mandatory)*

### User Story 1 - ADMIN extracts knowledge triples from a document (Priority: P1)

An ADMIN selects an uploaded text file (TXT, PDF, or Word document) on the annotation platform and triggers AI-assisted extraction. The AI service reads the document from storage, analyzes its content, identifies subject-predicate-object knowledge relations (triples), annotates each triple with its source snippet and character offsets, and returns structured results for annotators to review and confirm.

**Why this priority**: Text triple extraction is the core entry point of the platform's text annotation pipeline; every text annotation task depends on it. Without it, the platform's main value cannot be delivered.

**Independent Test**: Send the AI service the path of a test document containing known knowledge points and verify the response includes the correct subject/predicate/object and source-location info; this alone proves the feature works end to end.

**Acceptance Scenarios**:

1. **Given** a TXT document in storage, **When** the AI service receives its path and an extraction request, **Then** it returns at least one triple, each with subject, predicate, object, source snippet, and character offsets.
2. **Given** a PDF document in storage, **When** the AI service receives an extraction request, **Then** it parses the PDF correctly and returns triples.
3. **Given** a Word (.docx) document in storage, **When** the AI service receives an extraction request, **Then** it parses the document correctly and returns triples.
4. **Given** a request with an unsupported file format (e.g. .xlsx), **When** the AI service receives it, **Then** it returns a clear unsupported-format error without crashing.
5. **Given** storage is unreachable, **When** the AI service tries to download the file, **Then** it returns a storage-failure error rather than a generic server error.

---

### User Story 2 - ADMIN extracts knowledge quadruples from an image with auto-cropping (Priority: P1)

An ADMIN selects an uploaded image on the annotation platform and triggers AI-assisted extraction. The AI service reads the image, analyzes it with a multimodal model, identifies knowledge relations (quadruples: subject, relation, object, qualifier), returns each knowledge point's bounding box (bbox coordinates) within the image, and automatically crops and saves the corresponding regions for annotators to review against.

**Why this priority**: Image quadruple extraction is the core entry point of the image annotation pipeline, on par with text triple extraction as the start of the platform's two main pipelines.

**Independent Test**: Send the AI service the path of a test image with recognizable object relations and verify the response includes quadruples and cropped-image storage paths.

**Acceptance Scenarios**:

1. **Given** an image in storage, **When** the AI service receives its path and an extraction request, **Then** it returns at least one quadruple, each with subject, predicate, object, qualifier, and bbox coordinates.
2. **Given** the AI service extracted quadruples successfully, **When** processing completes, **Then** each quadruple's image region has been cropped and uploaded to storage, and the response includes the crop's storage path.
3. **Given** a bbox extends beyond the image bounds, **When** cropping, **Then** it is clamped to the valid image area without raising an error.
4. **Given** the model returns a malformed (non-JSON) response, **When** parsing, **Then** a parse-failure error is returned with no partial results.

---

### User Story 3 - ADMIN extracts frames from a video (frame-mode preprocessing) (Priority: P2)

An ADMIN selects an uploaded video, chooses "frame extraction" mode (fixed interval or keyframes), and triggers the AI service. The service extracts frames asynchronously in the background, uploads each frame image to storage, and notifies the Java backend on completion; the backend then creates an image annotation task for each frame, entering the image annotation flow.

**Why this priority**: Frame extraction is the preprocessing step that feeds video into the image annotation pipeline; it depends on the image pipeline (P1) being ready.

**Independent Test**: Send the AI service a test video's storage path and a job_id; the service immediately returns 202 Accepted, and the callback endpoint later receives a success notification with the frame-path list.

**Acceptance Scenarios**:

1. **Given** a video in storage (within the size cap), **When** the AI service receives a frame-extraction request (interval mode), **Then** it immediately returns 202 Accepted with the job_id, without waiting for processing.
2. **Given** the background job succeeds, **When** processing completes, **Then** the AI service sends the Java backend a callback with job_id, status=SUCCESS, and the list of frame storage paths.
3. **Given** keyframe mode, **When** the AI service processes the video, **Then** it extracts only frames with significant visual change, not fixed intervals.
4. **Given** the video exceeds the system cap (default 200MB, configurable), **When** the request arrives, **Then** a 400 error is returned immediately and no background job starts.
5. **Given** an error occurs during frame extraction, **When** the job fails, **Then** the AI service still sends the callback, with status=FAILED and an error description.

---

### User Story 4 - ADMIN converts a video segment to a text description (segment-mode preprocessing) (Priority: P2)

An ADMIN selects a time range of an uploaded video and triggers "video-to-text" preprocessing. In the background the AI service uniformly samples frames from that range, uses a multimodal model to understand the content, generates a structured text description, uploads it to storage, and notifies the Java backend on completion; the backend creates it as a new text source material, entering the text annotation flow.

**Why this priority**: Video-to-text preprocessing lets video content flow through the text annotation pipeline, widening the platform's data sources.

**Independent Test**: Send the AI service a test video path, time range, and job_id; verify the callback's output_path points to a readable text-description file.

**Acceptance Scenarios**:

1. **Given** a video in storage (within the size cap), **When** the AI service receives a video-to-text request, **Then** it immediately returns 202 Accepted with the job_id.
2. **Given** the background job succeeds, **When** processing completes, **Then** the AI service sends the Java backend a callback with job_id, status=SUCCESS, and the text description's storage path.
3. **Given** the request specifies a time range (start_sec, end_sec), **When** processing the video, **Then** only content within that range is analyzed.
4. **Given** the video exceeds the size cap, **When** the request arrives, **Then** a 400 error is returned immediately.
5. **Given** the model call fails, **When** the job errors, **Then** the callback carries status=FAILED with an error description.

---

### User Story 5 - System auto-generates candidate QA pairs for approved triples (Priority: P2)

After an annotator's text triples pass reviewer approval, the system automatically calls the AI service, feeding the triple list and their source snippets to the model in batch to generate candidate QA pairs in the fine-tuning format, as a source of future training data.

**Why this priority**: QA-pair generation is a key step in the platform's training-data output flow; it depends on triple extraction (P1) being complete and approved.

**Independent Test**: Send the AI service a set of test triples (with source snippets) and verify it returns a readable, sensible list of QA pairs.

**Acceptance Scenarios**:

1. **Given** a set of approved text triples, **When** the AI service receives a text QA generation request, **Then** it returns QA pairs with question and answer, at least one per triple.
2. **Given** the model returns valid JSON, **When** parsing the response, **Then** each pair is extracted and returned correctly.
3. **Given** the model returns a malformed response, **When** parsing, **Then** a parse-failure error is returned.
4. **Given** the model service is unavailable, **When** the call fails, **Then** a clear service-unavailable error is returned.

---

### User Story 6 - System auto-generates candidate image QA pairs for approved quadruples (Priority: P2)

After image quadruples pass approval, the system automatically calls the AI service, feeding the quadruple info together with the corresponding cropped images to a multimodal model to generate image-text QA pairs for image training datasets.

**Why this priority**: Image QA generation is the final step of the image pipeline's training-data output; same priority as text QA generation (P2).

**Independent Test**: Send the AI service a set of quadruples (with crop storage paths) and verify the returned QA pairs reference the image paths.

**Acceptance Scenarios**:

1. **Given** a set of approved image quadruples (with crop paths), **When** the AI service receives an image QA generation request, **Then** it returns QA pairs with question, answer, and image_path.
2. **Given** valid crop storage paths, **When** the AI service processes them, **Then** it fetches the image content itself and combines it with the quadruple info to generate QA pairs, with no extra image data sent by the caller.
3. **Given** a crop cannot be fetched from storage, **When** processing the request, **Then** a storage error is returned rather than an empty result.

---

### User Story 7 - ADMIN submits a fine-tuning job and queries progress (Priority: P3)

After exporting a training dataset, an ADMIN chooses to submit a model fine-tuning job. The platform calls the AI service with the training-data file URL, base model, and hyperparameters, and receives a fine-tuning job ID. The ADMIN can then query the job's status (running/success/failed) and completion progress at any time.

**Why this priority**: Fine-tuning management is the key step toward the platform's end goal (a customized model), but it requires all upstream data-preparation flows to be complete, hence P3.

**Independent Test**: Submit a fine-tuning request, obtain a job_id, then call the status endpoint and verify the current status is returned correctly.

**Acceptance Scenarios**:

1. **Given** a training JSONL file ready in storage, **When** the AI service receives a fine-tuning submission (file URL, base model, hyperparameters), **Then** it returns a fine-tuning job ID.
2. **Given** a submitted job, **When** its status is queried, **Then** job_id, current status (RUNNING/SUCCESS/FAILED), and a progress percentage are returned.
3. **Given** a running job, **When** status is queried repeatedly, **Then** the latest status is returned every time, never a cached stale one.
4. **Given** a nonexistent job_id, **When** the query is processed, **Then** a clear error is returned without crashing.

---

### User Story 8 - Operations monitors service health (Priority: P3)

Operators or monitoring systems periodically probe the AI service's health to determine whether it is running normally, enabling timely alerts or automatic restarts on failure.

**Why this priority**: Health checking underpins stable operation but is not a business feature, hence P3.

**Independent Test**: Issue an HTTP GET to the health endpoint and verify a healthy response is received.

**Acceptance Scenarios**:

1. **Given** the AI service is running normally, **When** any system calls the health endpoint, **Then** a healthy response is returned immediately, within 1 second.
2. **Given** a running container, **When** the orchestrator probes health periodically, **Then** only containers passing the probe are marked available and receive traffic.

---

### Edge Cases

- A file exists in storage but its content is corrupt (e.g. empty PDF pages)? → Return an empty parse result without erroring; log a warning.
- Frame extraction yields zero frames (corrupt video or interval too large)? → Callback SUCCESS with an empty frame list; the Java backend decides whether to retry.
- The model returns an unreasonable number of triples/quadruples (hundreds)? → Return them all; the Java backend or annotators filter; the AI service does not truncate.
- Do concurrent video jobs interfere with each other? → Each job uses its own temporary files, cleaned up after processing; no interference.
- A video exactly at the size cap? → Treated as over the cap and rejected, avoiding memory pressure at the boundary.
- The model returns JSON wrapped in a Markdown code fence (\`\`\`json ... \`\`\`)? → Automatically extract the JSON inside the fence; this format is supported.

---

## Requirements *(mandatory)*

### Functional Requirements

**Text processing**

- **FR-001**: The system MUST extract knowledge triples (subject / predicate / object) from TXT, PDF, and DOCX documents, providing each triple's source snippet and character offsets.
- **FR-002**: The system MUST return a clear unsupported-format error (HTTP 400) and refuse the request when the file format is unsupported.

**Image processing**

- **FR-003**: The system MUST extract knowledge quadruples (subject / predicate / object / qualifier) from images, providing each knowledge point's bounding box (bbox: x, y, w, h pixel coordinates).
- **FR-004**: The system MUST, when returning quadruples, automatically crop each knowledge point's image region and save it to storage, including the crop's storage path in the response.

**Video processing**

- **FR-005**: The system MUST support frame extraction in two modes: fixed interval (by frame count) and keyframe (extract on scene change).
- **FR-006**: The system MUST process video jobs asynchronously: acknowledge immediately (HTTP 202) and actively notify the caller once background processing completes.
- **FR-007**: The system MUST support converting a video segment to a text description: given a start and end time, output a structured description of the content and save the text to storage.
- **FR-008**: The system MUST reject videos over the size cap with a clear error; the cap MUST be runtime-configurable (default 200MB) with no service rebuild required.

**QA-pair generation**

- **FR-009**: The system MUST generate candidate QA pairs in batch from text triples (with source snippets), at least one pair per triple.
- **FR-010**: The system MUST generate image-text candidate QA pairs from image quadruples (with crop storage paths); the system fetches image content from storage itself, and callers supply only the storage paths.

**Fine-tuning management**

- **FR-011**: The system MUST submit fine-tuning jobs to the model service, taking a training-data file URL, base model name, and hyperparameters, and returning a fine-tuning job ID.
- **FR-012**: The system MUST support querying a fine-tuning job's current status (RUNNING / SUCCESS / FAILED) and completion progress by job ID.

**Service operations**

- **FR-013**: The system MUST expose a lightweight health-check endpoint usable by container orchestrators, reverse proxies, and monitoring tools, unauthenticated, responding within 1 second.
- **FR-014**: The system MUST log every request in structured form (path, status, latency), every model call (model name, latency), and every background video job (job ID, stage, outcome); logs MUST NOT contain raw file content.
- **FR-015**: The system MUST return distinct structured error responses when the model returns an invalid format (HTTP 502), storage is unreachable (HTTP 502), or the model service is unavailable (HTTP 503), so callers can identify the root cause.
- **FR-016**: The system MUST provide auto-generated Swagger/OpenAPI documentation describing every endpoint's request and response formats.

**Extensibility**

- **FR-017**: The system MUST wrap model calls and storage access in replaceable adapter layers, currently implementing ZhipuAI GLM and RustFS; swapping implementations MUST require no business-logic changes.
- **FR-018**: The system MUST manage all variable parameters (model names, storage address, secrets, video size cap, etc.) through config files and environment variables, supporting environment switches without rebuilding the service image.

### Key Entities

- **Triple**: A knowledge relation extracted from text: subject, predicate, object, source snippet (source_snippet), and character offsets (source_offset: start/end).
- **Quadruple**: A knowledge relation extracted from an image: a triple plus a qualifier and an image bounding box (bbox: x/y/w/h), linked to a cropped-image storage path (cropped_image_path).
- **QA Pair**: question plus answer; text pairs carry triple context, image pairs additionally carry an image storage path (image_path).
- **Video Job Callback**: Async completion notification containing job_id, status (SUCCESS/FAILED), result data (frame-path list or text-description path), and error info.
- **Finetune Job**: Contains the job ID, current status (RUNNING/SUCCESS/FAILED), and progress percentage.

---

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: For documents up to 10,000 characters, triple extraction completes and returns within 60 seconds, matching annotators' expectation of a live wait.
- **SC-002**: For images up to 4K resolution, quadruple extraction and crop upload complete within 30 seconds, with crops matching bbox coordinates to within 2 pixels.
- **SC-003**: Frame-extraction and video-to-text jobs are acknowledged within 1 second of submission; the completion callback arrives within 10 minutes (for videos up to 200MB).
- **SC-004**: Oversized-video requests are rejected within 3 seconds (including the storage size query), with no background processing started.
- **SC-005**: QA generation requests (≤10 triples/quadruples) complete and return all pairs within 90 seconds.
- **SC-006**: The health endpoint responds within 1 second when the service is healthy; container orchestrators use it to judge availability.
- **SC-007**: All error responses carry structured error info (type and description), never a generic server error, so callers can identify root causes without reading logs.
- **SC-008**: Swapping the model provider or storage implementation requires zero business-logic changes, only configuration and adapter-layer changes.
- **SC-009**: All business endpoints are covered by automated unit tests, including happy paths, storage errors, model errors, and parse errors.

---

## Technical Environment *(mandatory)*

> Note: this section records the project's settled technical constraints, confirmed by the team; they are not open as requirements-change points.

- **Runtime**: Python 3.12.13
- **Web framework**: FastAPI (with the uvicorn server)
- **Environment**: conda virtual environment named `label`
- **Models**: ZhipuAI GLM series (text: glm-4-flash, vision: glm-4v-flash), via the official SDK
- **Object storage**: RustFS, accessed via the S3-compatible API (boto3)
- **Document parsing**: TXT (UTF-8 decoding), PDF (pdfplumber), DOCX (python-docx)
- **Video processing**: OpenCV (frame extraction + frame-difference keyframe detection)
- **Containerization**: Docker + Docker Compose, with Dockerfile and docker-compose.yml provided

---

## Assumptions

- The Java backend (label-backend) is the AI service's only caller; the service is not exposed externally and needs no user authentication.
- The model service lives on the public internet (ZhipuAI cloud API) while RustFS lives on the Docker-internal network, so image content must be passed to the model as base64 rather than relying on the cloud service reaching RustFS's internal address.
- Raw documents, images, and videos are uploaded to storage by the Java backend; the AI service only reads by storage path and handles no upload logic.
- After submission, the fine-tuning run is hosted on the ZhipuAI platform; the AI service only submits and queries, and manages no training compute.
- Video jobs are low-frequency (ADMIN-triggered) with limited concurrency (at most 5 simultaneous jobs expected); no dedicated task queue is needed for now.
- Logs go only to stdout, with persistence and archiving handled by the container runtime or log collector; raw file content is never logged, preventing sensitive-data leaks.
- The ZhipuAI SDK is synchronous and blocking; to preserve service concurrency, SDK calls run in a thread pool and never block the main event loop.
- The video size cap defaults to 200MB and can be overridden at container runtime via the MAX_VIDEO_SIZE_MB env var, with no image rebuild.
@@ -1,318 +0,0 @@
|
||||
# Tasks: AI 服务(知识图谱标注平台 AI 计算服务)
|
||||
|
||||
**Input**: Design documents from `/specs/001-ai-service-requirements/`

**Prerequisites**: plan.md ✅, spec.md ✅, research.md ✅, data-model.md ✅, contracts/api.md ✅

**Tests**: Included — spec and plan explicitly mandate TDD (full-TDD development)

**Organization**: Tasks grouped by user story. Each phase is independently implementable and testable.

## Format: `[ID] [P?] [Story?] Description`

- **[P]**: Can run in parallel (different files, no shared dependencies)
- **[Story]**: Which user story this task belongs to (US1–US8)
- All paths are relative to project root `label_ai_service/`

---

## Phase 1: Setup (project initialization)

**Purpose**: Create project skeleton and configuration files before any code is written.

- [ ] T001 Create directory structure: `app/core/`, `app/clients/llm/`, `app/clients/storage/`, `app/services/`, `app/routers/`, `app/models/`, `tests/`
- [ ] T002 Create `requirements.txt` with minimum-version dependencies: fastapi≥0.111, uvicorn[standard]≥0.29, pydantic≥2.7, zhipuai≥2.1, boto3≥1.34, pdfplumber≥0.11, python-docx≥1.1, opencv-python-headless≥4.9, numpy≥1.26, httpx≥0.27, python-dotenv≥1.0, pyyaml≥6.0, pytest≥8.0, pytest-asyncio≥0.23
- [ ] T003 [P] Create `config.yaml` with default server/storage/video/models configuration (port 8000, buckets, max_file_size_mb 200, glm-4-flash / glm-4v-flash)
- [ ] T004 [P] Create `.env` template with required env var keys (ZHIPUAI_API_KEY, STORAGE_ACCESS_KEY, STORAGE_SECRET_KEY, STORAGE_ENDPOINT, BACKEND_CALLBACK_URL, LOG_LEVEL, MAX_VIDEO_SIZE_MB)
- [ ] T005 [P] Create `Dockerfile` (python:3.12-slim base, install requirements, expose 8000, CMD uvicorn)
- [ ] T006 [P] Create `docker-compose.yml` with ai-service and rustfs services, env_file, healthcheck (curl /health every 30s)

---

## Phase 2: Foundational (core infrastructure)

**Purpose**: Core infrastructure that MUST be complete before ANY user story can be implemented.

**⚠️ CRITICAL**: No user story work can begin until this phase is complete.

### Config & Core Utilities

- [ ] T007 Implement `app/core/config.py`: load `config.yaml` with PyYAML + override via `_ENV_OVERRIDES` dict mapping env vars to nested YAML paths (including `MAX_VIDEO_SIZE_MB → video.max_file_size_mb`), expose `get_config()` with `@lru_cache`
- [ ] T008 [P] Implement `app/core/logging.py`: JSON structured logging via `logging` module, `RequestLoggingMiddleware` that logs path/status/latency, helper `get_logger(name)`
- [ ] T009 [P] Implement `app/core/exceptions.py`: custom exception classes `UnsupportedFileTypeError(400)`, `VideoTooLargeError(400)`, `StorageError(502)`, `LLMParseError(502)`, `LLMCallError(503)`, plus global exception handler that returns `{"code": ..., "message": ...}` JSON
- [ ] T010 [P] Implement `app/core/json_utils.py`: `extract_json(text) -> dict` that strips Markdown code fences (` ```json ... ``` `) before `json.loads`, raises `LLMParseError` on invalid JSON
- [ ] T011 Write `tests/test_config.py`: verify YAML defaults load correctly; verify `MAX_VIDEO_SIZE_MB=500` env var overrides `video.max_file_size_mb`; verify missing required env vars surface clear errors
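The fence-stripping helper described in T010 can be sketched as follows. The names `extract_json` and `LLMParseError` come from the task list; the exact regex and error message are assumptions for illustration (T009 defines the real exception class):

```python
import json
import re


class LLMParseError(Exception):
    """Stand-in for the exception defined in T009."""


def extract_json(text: str):
    """Strip an optional ```json ... ``` fence, then parse the remainder."""
    cleaned = text.strip()
    # LLMs often wrap JSON in a Markdown fence; unwrap it if present.
    match = re.match(r"^```(?:json)?\s*(.*?)\s*```$", cleaned, re.DOTALL)
    if match:
        cleaned = match.group(1)
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError as exc:
        raise LLMParseError(f"invalid JSON from LLM: {exc}") from exc
```

Both fenced and bare replies parse the same way; anything else surfaces as the typed `LLMParseError` so the global handler can map it to a 502.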

### LLM Client (LLM adapter layer)

- [ ] T012 [P] Implement `app/clients/llm/base.py`: `LLMClient` ABC with abstract methods `chat(model, messages) -> str` and `chat_vision(model, messages) -> str`
- [ ] T013 Implement `app/clients/llm/zhipuai_client.py`: `ZhipuAIClient(LLMClient)` that wraps synchronous ZhipuAI SDK calls in the default thread pool via `run_in_executor(None, ...)`; raise `LLMCallError` on SDK exceptions
- [ ] T014 [P] Write `tests/test_llm_client.py`: mock ZhipuAI SDK to verify `chat()` and `chat_vision()` call the SDK correctly; verify `LLMCallError` is raised on SDK exception; verify thread-pool wrapping does not block the event loop
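The thread-pool wrapping pattern named in T013 is sketched below with a stand-in function replacing the blocking ZhipuAI SDK call (the stand-in and its return value are illustrative only):

```python
import asyncio


def _sync_sdk_call(prompt: str) -> str:
    # Stand-in for the blocking ZhipuAI SDK call (illustrative only).
    return f"echo: {prompt}"


async def chat(prompt: str) -> str:
    """Run the blocking SDK call in the default thread pool so the
    event loop stays free to serve other requests (T013 pattern)."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, _sync_sdk_call, prompt)
```

Inside a coroutine, `asyncio.get_running_loop()` is the preferred way to obtain the loop; passing `None` as the executor uses the loop's default `ThreadPoolExecutor`.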

### Storage Client (storage adapter layer)

- [ ] T015 [P] Implement `app/clients/storage/base.py`: `StorageClient` ABC with abstract methods `download_bytes(bucket, path) -> bytes`, `upload_bytes(bucket, path, data, content_type) -> None`, `get_presigned_url(bucket, path, expires) -> str`, `get_object_size(bucket, path) -> int`
- [ ] T016 Implement `app/clients/storage/rustfs_client.py`: `RustFSClient(StorageClient)` using boto3 S3 client; all calls wrapped via `run_in_executor`; `get_object_size` uses `head_object`; raise `StorageError` on `ClientError`
- [ ] T017 [P] Write `tests/test_storage_client.py`: mock boto3 S3 client; verify `download_bytes` returns correct bytes; verify `get_object_size` calls `head_object` and returns `ContentLength`; verify `StorageError` raised on S3 exception

### FastAPI Application Entry

- [ ] T018 Implement `app/main.py`: create FastAPI app with lifespan, register `RequestLoggingMiddleware`, register global exception handlers from `exceptions.py`, mount all routers (empty stubs initially), expose `GET /health → {"status": "ok"}`
- [ ] T019 [P] Implement `app/core/dependencies.py`: `get_llm_client() -> LLMClient` and `get_storage_client() -> StorageClient` as `@lru_cache` singletons, instantiated from `get_config()` values
- [ ] T020 Write `tests/conftest.py`: `mock_llm` fixture (AsyncMock implementing LLMClient), `mock_storage` fixture (AsyncMock implementing StorageClient with `get_object_size` returning 10MB), `test_app` fixture overriding Depends, `client` fixture using `TestClient`

**Checkpoint**: Foundation complete — all user story phases can now begin in parallel.

---

## Phase 3: User Story 1 — ADMIN extracts knowledge triples from documents (Priority: P1) 🎯 MVP

**Goal**: `POST /api/v1/text/extract` reads a TXT/PDF/DOCX file from RustFS, calls GLM, returns structured triples with source offsets.

**Independent Test**: Send `{"file_path": "text/test.txt", "file_name": "test.txt"}` to the endpoint; verify response contains `items` with `subject`, `predicate`, `object`, `source_snippet`, `source_offset.start/end`.

### Tests for User Story 1 ⚠️ Write FIRST — verify FAIL before implementing

- [ ] T021 [P] [US1] Write `tests/test_text_service.py`: test TXT parsing returns triples; test PDF parsing (mock pdfplumber); test DOCX parsing (mock python-docx); test unsupported format raises `UnsupportedFileTypeError`; test storage failure raises `StorageError`; test LLM parse error raises `LLMParseError`

### Implementation for User Story 1

- [ ] T022 [P] [US1] Create `app/models/text_models.py`: `SourceOffset(start: int, end: int)`, `TripleItem(subject, predicate, object, source_snippet, source_offset)`, `TextExtractRequest(file_path, file_name, model?, prompt_template?)`, `TextExtractResponse(items: list[TripleItem])`
- [ ] T023 [US1] Implement `app/services/text_service.py`: `extract_triples(req, llm, storage) -> TextExtractResponse`; dispatch to `_parse_txt / _parse_pdf / _parse_docx` by file extension; build prompt from content + optional `prompt_template`; call `llm.chat()`; parse JSON response via `extract_json()`; validate triple fields; raise typed exceptions
- [ ] T024 [US1] Write `tests/test_text_router.py`: POST `/api/v1/text/extract` returns 200 with items; unsupported format returns 400 with `UNSUPPORTED_FILE_TYPE`; storage error returns 502 with `STORAGE_ERROR`; LLM parse error returns 502 with `LLM_PARSE_ERROR`
- [ ] T025 [US1] Implement `app/routers/text.py`: `APIRouter(prefix="/api/v1")` with `POST /text/extract` handler that injects `storage` and `llm` via Depends, calls `text_service.extract_triples()`; register router in `app/main.py`
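The dispatch-by-extension step in T023 can be sketched like this; the parser bodies are hypothetical stand-ins (the real ones use pdfplumber and python-docx), and `UnsupportedFileTypeError` mirrors the class T009 defines:

```python
from pathlib import Path


class UnsupportedFileTypeError(Exception):
    """Stand-in for the exception defined in T009."""


# Hypothetical parser stand-ins keyed by lowercase extension.
_PARSERS = {
    ".txt": lambda data: data.decode("utf-8"),
    ".pdf": lambda data: "<pdf text>",    # real: pdfplumber
    ".docx": lambda data: "<docx text>",  # real: python-docx
}


def parse_document(file_name: str, data: bytes) -> str:
    """Pick a parser by file extension, as text_service's dispatch does."""
    suffix = Path(file_name).suffix.lower()
    try:
        return _PARSERS[suffix](data)
    except KeyError:
        raise UnsupportedFileTypeError(f"unsupported extension: {suffix}") from None
```

Normalizing the suffix to lowercase makes `report.PDF` and `report.pdf` behave identically, which is what the 400 `UNSUPPORTED_FILE_TYPE` test in T024 should exercise for a genuinely unknown extension.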

**Checkpoint**: `POST /api/v1/text/extract` fully functional. Run `pytest tests/test_text_service.py tests/test_text_router.py -v` — all green.

---

## Phase 4: User Story 2 — ADMIN extracts knowledge quadruples from images with automatic cropping (Priority: P1)

**Goal**: `POST /api/v1/image/extract` downloads an image from RustFS, calls GLM-4V, crops bbox regions, uploads crops, returns quads with cropped_image_path.

**Independent Test**: Send `{"file_path": "image/test.jpg", "task_id": 1}` to the endpoint; verify response contains `items` each with `bbox`, `qualifier`, and `cropped_image_path` matching pattern `crops/1/{n}.jpg`.

### Tests for User Story 2 ⚠️ Write FIRST — verify FAIL before implementing

- [ ] T026 [P] [US2] Write `tests/test_image_service.py`: test full quad extraction pipeline with mock LLM returning valid JSON; test bbox crop uses correct pixel coordinates; test out-of-bounds bbox is clamped to image dimensions; test crop upload path follows `crops/{task_id}/{index}.jpg` convention; test LLM parse error raises `LLMParseError`

### Implementation for User Story 2

- [ ] T027 [P] [US2] Create `app/models/image_models.py`: `BBox(x, y, w, h: int)`, `QuadrupleItem(subject, predicate, object, qualifier?, bbox, cropped_image_path)`, `ImageExtractRequest(file_path, task_id, model?, prompt_template?)`, `ImageExtractResponse(items: list[QuadrupleItem])`
- [ ] T028 [US2] Implement `app/services/image_service.py`: `extract_quads(req, llm, storage) -> ImageExtractResponse`; download image bytes → decode with OpenCV (`cv2.imdecode`); base64 encode image for GLM-4V multimodal message; call `llm.chat_vision()`; parse JSON via `extract_json()`; for each quad, clamp bbox to image dimensions, crop with numpy slicing, encode as JPEG, upload to `crops/{task_id}/{index}.jpg`; return quads with paths
- [ ] T029 [US2] Write `tests/test_image_router.py`: POST `/api/v1/image/extract` returns 200 with items; LLM parse error returns 502; storage download failure returns 502
- [ ] T030 [US2] Implement `app/routers/image.py`: `POST /image/extract` handler; register in `app/main.py`
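The bbox-clamping step in T028 is the part most worth getting right, since GLM-4V can report coordinates outside the image. A minimal sketch (the function name and tuple layout are assumptions; the real service works on the `BBox` model from T027):

```python
def clamp_bbox(x: int, y: int, w: int, h: int, img_w: int, img_h: int):
    """Clamp a model-reported bbox to image bounds before cropping (T028).

    Returns (x, y, w, h) fully inside a img_w x img_h image; width/height
    shrink to whatever remains, and collapse to 0 if nothing overlaps.
    """
    x0 = max(0, min(x, img_w))
    y0 = max(0, min(y, img_h))
    x1 = max(x0, min(x + w, img_w))
    y1 = max(y0, min(y + h, img_h))
    return x0, y0, x1 - x0, y1 - y0
```

With a clamped box, the numpy crop mentioned in T028 is simply `img[y0:y0 + h, x0:x0 + w]`, which can never index out of bounds.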

**Checkpoint**: `POST /api/v1/image/extract` fully functional. Run `pytest tests/test_image_service.py tests/test_image_router.py -v` — all green.

---

## Phase 5: User Stories 3 & 4 — video frame extraction + video-to-text (Priority: P2)

**Goal**: `POST /api/v1/video/extract-frames` and `POST /api/v1/video/to-text` immediately return 202, process video in background via FastAPI BackgroundTasks, then POST callback to Java backend with results.

**Independent Test (US3)**: Send extract-frames request; verify immediate 202 with job_id; mock storage and callback URL; verify callback received with `status=SUCCESS` and non-empty `frames` list.

**Independent Test (US4)**: Send to-text request with `start_sec=0, end_sec=10`; verify immediate 202; verify callback received with `status=SUCCESS` and `output_path` pointing to an uploaded text file.

### Tests for User Stories 3 & 4 ⚠️ Write FIRST — verify FAIL before implementing

- [ ] T031 [P] [US3] Write `tests/test_video_service.py` (frame extraction tests): generate small test video via `cv2.VideoWriter`; test interval mode extracts correct frame indices; test keyframe mode only extracts frames exceeding difference threshold; test each extracted frame is uploaded to `frames/{source_id}/{index}.jpg`; test failed extraction triggers FAILED callback with error_message
- [ ] T032 [P] [US4] Append to `tests/test_video_service.py` (to-text tests): test uniform sampling selects `frame_sample_count` frames from `[start_sec, end_sec]` window; test sampled frames are passed as base64 to `llm.chat_vision()`; test output text is uploaded to `video-text/{source_id}/{timestamp}.txt`; test LLM failure triggers FAILED callback

### Implementation for User Stories 3 & 4

- [ ] T033 [US3] Create `app/models/video_models.py`: `ExtractFramesRequest(file_path, source_id, job_id, mode="interval", frame_interval=30)`, `VideoToTextRequest(file_path, source_id, job_id, start_sec, end_sec, model?, prompt_template?)`, `FrameInfo(frame_index, time_sec, frame_path)`, `VideoJobCallback(job_id, status, frames?, output_path?, error_message?)`, `VideoAcceptedResponse(message, job_id)`
- [ ] T034 [US3] Implement frame extraction in `app/services/video_service.py`: `extract_frames_task(req, llm, storage, callback_url)` background function; download video to temp file; open with `cv2.VideoCapture`; interval mode: step by `frame_interval`; keyframe mode: compute grayscale frame diff, extract when diff > threshold (default 30.0); upload each frame JPEG; POST callback with `FrameInfo` list; clean up temp file; catch all exceptions and POST FAILED callback
- [ ] T035 [US4] Implement to-text in `app/services/video_service.py`: `video_to_text_task(req, llm, storage, callback_url)` background function; download video to temp file; sample `frame_sample_count` frames uniformly within `[start_sec, end_sec]`; base64 encode frames; call `llm.chat_vision()` with all frames in one multimodal message; upload text result to `video-text/{source_id}/{timestamp}.txt`; POST callback with `output_path`; clean up temp file
- [ ] T036 [US3] Write `tests/test_video_router.py`: POST `/api/v1/video/extract-frames` returns 202 immediately; video exceeding `max_file_size_mb` returns 400 with `VIDEO_TOO_LARGE`; background task is registered (mock BackgroundTasks)
- [ ] T037 [US4] Append to `tests/test_video_router.py`: POST `/api/v1/video/to-text` returns 202; size limit applies equally
- [ ] T038 [US3] Implement `app/routers/video.py`: `_check_video_size(storage, bucket, file_path, max_mb)` helper that calls `storage.get_object_size()` and raises `VideoTooLargeError`; `POST /video/extract-frames` and `POST /video/to-text` handlers check size then enqueue background task; register router in `app/main.py`
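The uniform-sampling step in T035 reduces to picking evenly spaced timestamps inside the clip window. A minimal sketch (function name is an assumption):

```python
def sample_times(start_sec: float, end_sec: float, count: int) -> list[float]:
    """Pick `count` evenly spaced timestamps in [start_sec, end_sec],
    including both endpoints (T035's uniform sampling)."""
    if count == 1:
        return [(start_sec + end_sec) / 2]
    step = (end_sec - start_sec) / (count - 1)
    return [start_sec + i * step for i in range(count)]
```

Each timestamp can then be seeked with `cap.set(cv2.CAP_PROP_POS_MSEC, t * 1000)` before reading a frame, which avoids decoding the whole clip.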

**Checkpoint**: Both video endpoints fully functional. Run `pytest tests/test_video_service.py tests/test_video_router.py -v` — all green.

---

## Phase 6: User Stories 5 & 6 — text QA generation + image QA generation (Priority: P2)

**Goal**: `POST /api/v1/qa/gen-text` generates QA pairs from text triples; `POST /api/v1/qa/gen-image` generates multimodal QA pairs from image quads (images fetched and base64-encoded internally).

**Independent Test (US5)**: Send `{"items": [{"subject":"变压器","predicate":"额定电压","object":"110kV","source_snippet":"..."}]}` to gen-text; verify response contains `pairs` with non-empty `question` and `answer`.

**Independent Test (US6)**: Send `{"items": [{"subject":"...","cropped_image_path":"crops/1/0.jpg",...}]}` to gen-image; verify response contains `pairs` with `image_path` matching `crops/1/0.jpg`.

### Tests for User Stories 5 & 6 ⚠️ Write FIRST — verify FAIL before implementing

- [ ] T039 [P] [US5] Write `tests/test_qa_service.py` (text QA tests): test triples are formatted into prompt correctly; test LLM response JSON is parsed into `QAPair` list; test `LLMParseError` on malformed LLM response; test `LLMCallError` propagates correctly
- [ ] T040 [P] [US6] Append to `tests/test_qa_service.py` (image QA tests): test storage downloads cropped image and encodes as base64 before LLM call; test multimodal message includes both text (quad info) and inline image data URI; test `StorageError` on failed image download

### Implementation for User Stories 5 & 6

- [ ] T041 [P] [US5] Create `app/models/qa_models.py`: `TextQAItem(subject, predicate, object, source_snippet)`, `GenTextQARequest(items, model?, prompt_template?)`, `QAPair(question, answer)`, `ImageQAItem(subject, predicate, object, qualifier?, cropped_image_path)`, `GenImageQARequest(items, model?, prompt_template?)`, `ImageQAPair(question, answer, image_path)`, `TextQAResponse(pairs)`, `ImageQAResponse(pairs)`
- [ ] T042 [US5] Implement `gen_text_qa(req, llm) -> TextQAResponse` in `app/services/qa_service.py`: format all triples + source snippets into a single batch prompt; call `llm.chat()`; parse JSON array via `extract_json()`; return `QAPair` list
- [ ] T043 [US6] Implement `gen_image_qa(req, llm, storage) -> ImageQAResponse` in `app/services/qa_service.py`: for each `ImageQAItem`, download `cropped_image_path` bytes from `source-data` bucket; base64 encode; build multimodal message with quad text + `data:image/jpeg;base64,...` inline URL; call `llm.chat_vision()`; parse JSON; return `ImageQAPair` with `image_path = item.cropped_image_path`
- [ ] T044 [US5] Write `tests/test_qa_router.py`: POST `/api/v1/qa/gen-text` returns 200 with pairs; POST `/api/v1/qa/gen-image` returns 200 with pairs including image_path; LLM errors return 502/503
- [ ] T045 [US5] Implement `app/routers/qa.py`: `POST /qa/gen-text` and `POST /qa/gen-image` handlers; register router in `app/main.py`
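The multimodal message T043 describes can be sketched as below. The message shape follows the common OpenAI-style content-list format; whether the ZhipuAI SDK accepts exactly this structure is an assumption to verify against its docs, and the function name is illustrative:

```python
import base64


def build_vision_message(quad_text: str, image_bytes: bytes) -> list[dict]:
    """One user message mixing quad text and an inline base64 JPEG
    (assumed OpenAI-style multimodal format, per T043)."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": quad_text},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
                },
            ],
        }
    ]
```

Inlining the crop as a data URI means the model never needs network access to RustFS, which keeps the storage layer fully mockable in the T040 tests.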

**Checkpoint**: Both QA endpoints fully functional. Run `pytest tests/test_qa_service.py tests/test_qa_router.py -v` — all green.

---

## Phase 7: User Stories 7 & 8 — fine-tune job management + health check (Priority: P3)

**Goal**: `POST /api/v1/finetune/start` submits a ZhipuAI fine-tune job; `GET /api/v1/finetune/status/{jobId}` queries its state; `GET /health` returns service liveness.

**Independent Test (US7)**: Call `POST /finetune/start` with mock LLM returning a job ID; then call `GET /finetune/status/{jobId}`; verify `status` is one of `RUNNING/SUCCESS/FAILED` and `progress` is an integer.

**Independent Test (US8)**: `GET /health` returns `{"status": "ok"}` with HTTP 200 in under 1 second.

### Tests for User Stories 7 & 8 ⚠️ Write FIRST — verify FAIL before implementing

- [ ] T046 [P] [US7] Write `tests/test_finetune_service.py`: test `submit_finetune()` calls ZhipuAI finetune API with correct params and returns job_id; test `get_status()` maps ZhipuAI `"running"→RUNNING`, `"succeeded"→SUCCESS`, `"failed"→FAILED`, unknown status→RUNNING (conservative); test `LLMCallError` on SDK failure
- [ ] T047 [P] [US8] Write health check test in `tests/test_finetune_router.py` (or new `tests/test_health.py`): `GET /health` returns 200 with `{"status": "ok"}`

### Implementation for User Stories 7 & 8

- [ ] T048 [P] [US7] Create `app/models/finetune_models.py`: `FinetuneStartRequest(jsonl_url, base_model, hyperparams?)`, `FinetuneStartResponse(job_id)`, `FinetuneStatusResponse(job_id, status, progress?, error_message?)`
- [ ] T049 [US7] Implement `app/services/finetune_service.py`: `submit_finetune(req, llm) -> FinetuneStartResponse` calls ZhipuAI fine-tune create API via `run_in_executor`; `get_finetune_status(job_id, llm) -> FinetuneStatusResponse` calls ZhipuAI fine-tune retrieve API and maps status strings; raise `LLMCallError` on failure
- [ ] T050 [US7] Write `tests/test_finetune_router.py`: `POST /api/v1/finetune/start` returns 200 with job_id; `GET /api/v1/finetune/status/{jobId}` returns 200 with status fields; unknown job_id propagates error response
- [ ] T051 [US7] Implement `app/routers/finetune.py`: `POST /finetune/start` and `GET /finetune/status/{job_id}` handlers; register router in `app/main.py`
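The conservative status mapping T046 tests for is a small lookup with a deliberate fallback; unknown provider states stay `RUNNING` so the Java backend keeps polling instead of prematurely marking the job done. A sketch (names are illustrative):

```python
_STATUS_MAP = {
    "running": "RUNNING",
    "succeeded": "SUCCESS",
    "failed": "FAILED",
}


def map_status(raw: str) -> str:
    """Map a provider status string to the platform's RUNNING/SUCCESS/FAILED,
    falling back to RUNNING for anything unrecognized (T046)."""
    return _STATUS_MAP.get(raw.lower(), "RUNNING")
```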

**Checkpoint**: All 8 user stories complete. Run `pytest tests/ -v` — all green.

---

## Phase 8: Polish & Cross-Cutting Concerns

**Purpose**: Final integration, documentation verification, and deployment readiness.

- [ ] T052 [P] Create `.gitignore` for Python project (`.env`, `__pycache__/`, `*.pyc`, `.pytest_cache/`, `tmp/` for video temp files)
- [ ] T053 Run full test suite `conda run -n label pytest tests/ -v --cov=app --cov-report=term-missing` and fix any remaining failures or coverage gaps
- [ ] T054 [P] Verify Swagger/OpenAPI docs at `http://localhost:8000/docs` show all 9 endpoints with correct request/response schemas
- [ ] T055 Validate quickstart.md end-to-end: `conda activate label && pip install -r requirements.txt && conda run -n label uvicorn app.main:app --reload` starts cleanly; `GET /health` returns 200; `docker-compose up -d` builds and healthcheck passes

---

## Dependencies & Execution Order

### Phase Dependencies

```
Phase 1 (Setup)
  └─→ Phase 2 (Foundational) ← BLOCKS everything
        ├─→ Phase 3 (US1, P1) ─┐
        ├─→ Phase 4 (US2, P1) ─┤ Can run in parallel after Phase 2
        ├─→ Phase 5 (US3+4, P2)─┤
        ├─→ Phase 6 (US5+6, P2)─┤
        └─→ Phase 7 (US7+8, P3)─┘
              └─→ Phase 8 (Polish)
```

### User Story Dependencies

| Story | Priority | Depends On | Blocking |
|-------|----------|------------|----------|
| US1 (text triples) | P1 | Phase 2 only | Nothing |
| US2 (image quadruples) | P1 | Phase 2 only | US6 (shares image downloading pattern) |
| US3 (video frame extraction) | P2 | Phase 2 only | Nothing |
| US4 (video-to-text) | P2 | Phase 2, US3 (shares video_service.py) | Nothing |
| US5 (text QA) | P2 | Phase 2 only | Nothing |
| US6 (image QA) | P2 | Phase 2 only | Nothing |
| US7 (fine-tune management) | P3 | Phase 2 only | Nothing |
| US8 (health check) | P3 | T018 (main.py) | Nothing |

### Within Each User Story

1. Tests MUST be written first and verified to **FAIL** before implementation
2. Models → Services → Routers (in dependency order)
3. Register router in `main.py` after router file is complete
4. Run story-specific tests before marking story done

### Parallel Opportunities

All tasks marked `[P]` within a phase can run concurrently (different files):

- **Phase 2**: T008, T009, T010 (core utilities) + T012, T014 (LLM) + T015, T017 (Storage) + T019 (dependencies)
- **Phase 3**: T021 (tests) and T022 (models) can start together
- **Phase 4**: T026 (tests) and T027 (models) can start together
- **Phase 5**: T031 (US3 tests) and T032 (US4 tests) can start together
- **Phase 6**: T039 (US5 tests) and T040, T041 (US6 tests + models) can start together
- **Phase 7**: T046, T047, T048 can start together

---

## Parallel Example: Phase 2 Foundational

```bash
# Kick off these in parallel (all different files):
[T008] app/core/logging.py
[T009] app/core/exceptions.py
[T010] app/core/json_utils.py
[T012] app/clients/llm/base.py
[T014] tests/test_llm_client.py
[T015] app/clients/storage/base.py
[T017] tests/test_storage_client.py
[T019] app/core/dependencies.py

# Then in sequence (each depends on previous):
[T007] app/core/config.py → [T011] tests/test_config.py
[T013] app/clients/llm/zhipuai_client.py (needs T012)
[T016] app/clients/storage/rustfs_client.py (needs T015)
[T018] app/main.py (needs T009, T008)
[T020] tests/conftest.py (needs T018, T013, T016)
```

---

## Implementation Strategy

### MVP First (US1 + US2 — P1 Stories Only)

1. Complete Phase 1: Setup
2. Complete Phase 2: Foundational (CRITICAL — blocks all stories)
3. Complete Phase 3: US1 (text triple extraction) → validate independently
4. Complete Phase 4: US2 (image quadruple extraction) → validate independently
5. **STOP and DEMO**: Core extraction pipeline is production-ready

### Incremental Delivery

```
Phase 1+2 complete → Foundation ready (commit)
Phase 3 complete   → Text extraction works (commit, demo)
Phase 4 complete   → Image extraction works (commit, demo)
Phase 5 complete   → Video processing works (commit, demo)
Phase 6 complete   → QA generation works (commit, demo)
Phase 7 complete   → Fine-tune management (commit, demo)
Phase 8 complete   → Production-ready (tag release)
```

### Parallel Team Strategy

With two developers after Phase 2 completes:

- **Dev A**: US1 (text) → US5 (text QA) → US7 (finetune)
- **Dev B**: US2 (image) → US6 (image QA) → US3+US4 (video)

---

## Summary

| Phase | Tasks | User Story | Priority |
|-------|-------|-----------|---------|
| Phase 1: Setup | T001–T006 (6) | — | — |
| Phase 2: Foundational | T007–T020 (14) | — | — |
| Phase 3 | T021–T025 (5) | US1 text triples | P1 🎯 MVP |
| Phase 4 | T026–T030 (5) | US2 image quadruples | P1 |
| Phase 5 | T031–T038 (8) | US3+US4 video processing | P2 |
| Phase 6 | T039–T045 (7) | US5+US6 QA generation | P2 |
| Phase 7 | T046–T051 (6) | US7+US8 fine-tune + health check | P3 |
| Phase 8: Polish | T052–T055 (4) | — | — |
| **Total** | **55 tasks** | **8 user stories** | |

---

## Notes

- `[P]` tasks = different files, no shared dependencies within the same phase
- `[US?]` label maps each task to its user story for traceability
- Tests in `tests/conftest.py` (T020) use `AsyncMock` — no real ZhipuAI or RustFS calls in unit tests
- Video tasks use a real small video file generated by `cv2.VideoWriter` in tests — no external media needed
- All config is loaded via `get_config()` — never hardcode model names or bucket names in services
- Commit after each phase checkpoint at minimum; commit after each task for clean git history
- Stop at any checkpoint to validate the story independently before proceeding
# tests/conftest.py
import pytest
from unittest.mock import AsyncMock, MagicMock
from fastapi.testclient import TestClient

from app.clients.llm.base import LLMClient
from app.clients.storage.base import StorageClient
from app.core.dependencies import get_llm_client, get_storage_client


@pytest.fixture
def mock_llm() -> LLMClient:
    client = MagicMock(spec=LLMClient)
    client.chat = AsyncMock(return_value='[]')
    client.chat_vision = AsyncMock(return_value='[]')
    return client


@pytest.fixture
def mock_storage() -> StorageClient:
    client = MagicMock(spec=StorageClient)
    client.download_bytes = AsyncMock(return_value=b"")
    client.upload_bytes = AsyncMock(return_value=None)
    client.get_presigned_url = AsyncMock(return_value="http://example.com/presigned")
    client.get_object_size = AsyncMock(return_value=10 * 1024 * 1024)  # 10 MB default
    return client


@pytest.fixture
def test_app(mock_llm, mock_storage):
    from app.main import app
    app.dependency_overrides[get_llm_client] = lambda: mock_llm
    app.dependency_overrides[get_storage_client] = lambda: mock_storage
    yield app
    app.dependency_overrides.clear()


@pytest.fixture
def client(test_app):
    return TestClient(test_app)
# tests/test_config.py

def test_yaml_defaults_load(monkeypatch):
    # Clear lru_cache so each test gets a fresh load
    from app.core import config as cfg_module
    cfg_module.get_config.cache_clear()

    # Remove env overrides that might bleed in from the shell environment
    for var in ["MAX_VIDEO_SIZE_MB", "LOG_LEVEL", "STORAGE_ENDPOINT"]:
        monkeypatch.delenv(var, raising=False)

    cfg = cfg_module.get_config()

    assert cfg["server"]["port"] == 8000
    assert cfg["video"]["max_file_size_mb"] == 200
    assert cfg["models"]["default_text"] == "glm-4-flash"
    assert cfg["models"]["default_vision"] == "glm-4v-flash"
    assert cfg["storage"]["buckets"]["source_data"] == "source-data"


def test_max_video_size_env_override(monkeypatch):
    from app.core import config as cfg_module
    cfg_module.get_config.cache_clear()

    monkeypatch.setenv("MAX_VIDEO_SIZE_MB", "500")
    cfg = cfg_module.get_config()

    assert cfg["video"]["max_file_size_mb"] == 500


def test_log_level_env_override(monkeypatch):
    from app.core import config as cfg_module
    cfg_module.get_config.cache_clear()

    monkeypatch.setenv("LOG_LEVEL", "DEBUG")
    cfg = cfg_module.get_config()

    assert cfg["server"]["log_level"] == "DEBUG"
# tests/test_finetune_router.py
"""T050: Integration tests for finetune router endpoints."""
from unittest.mock import patch

from app.core.exceptions import LLMCallError
from app.models.finetune_models import FinetuneStartResponse, FinetuneStatusResponse


# ---------------------------------------------------------------------------
# POST /api/v1/finetune/start
# ---------------------------------------------------------------------------

def test_finetune_start_returns_200_with_job_id(client):
    start_resp = FinetuneStartResponse(job_id="glm-ft-router-test")

    with patch("app.routers.finetune.finetune_service.submit_finetune") as mock_submit:
        mock_submit.return_value = start_resp

        resp = client.post(
            "/api/v1/finetune/start",
            json={
                "jsonl_url": "s3://bucket/train.jsonl",
                "base_model": "glm-4",
                "hyperparams": {"n_epochs": 3},
            },
        )

    assert resp.status_code == 200
    data = resp.json()
    assert data["job_id"] == "glm-ft-router-test"


def test_finetune_start_without_hyperparams(client):
    start_resp = FinetuneStartResponse(job_id="glm-ft-nohp")

    with patch("app.routers.finetune.finetune_service.submit_finetune") as mock_submit:
        mock_submit.return_value = start_resp

        resp = client.post(
            "/api/v1/finetune/start",
            json={
                "jsonl_url": "s3://bucket/train.jsonl",
                "base_model": "glm-4",
            },
        )

    assert resp.status_code == 200
    assert resp.json()["job_id"] == "glm-ft-nohp"


def test_finetune_start_llm_call_error_returns_503(client):
    with patch("app.routers.finetune.finetune_service.submit_finetune") as mock_submit:
        mock_submit.side_effect = LLMCallError("SDK failed")

        resp = client.post(
            "/api/v1/finetune/start",
            json={
                "jsonl_url": "s3://bucket/train.jsonl",
                "base_model": "glm-4",
            },
        )

    assert resp.status_code == 503
    assert resp.json()["code"] == "LLM_CALL_ERROR"


# ---------------------------------------------------------------------------
# GET /api/v1/finetune/status/{job_id}
# ---------------------------------------------------------------------------

def test_finetune_status_returns_200_with_fields(client):
    status_resp = FinetuneStatusResponse(
        job_id="glm-ft-router-test",
        status="RUNNING",
        progress=30,
    )

    with patch("app.routers.finetune.finetune_service.get_finetune_status") as mock_status:
        mock_status.return_value = status_resp

        resp = client.get("/api/v1/finetune/status/glm-ft-router-test")

    assert resp.status_code == 200
    data = resp.json()
    assert data["job_id"] == "glm-ft-router-test"
    assert data["status"] == "RUNNING"
    assert data["progress"] == 30


def test_finetune_status_succeeded(client):
    status_resp = FinetuneStatusResponse(
        job_id="glm-ft-done",
        status="SUCCESS",
    )

    with patch("app.routers.finetune.finetune_service.get_finetune_status") as mock_status:
        mock_status.return_value = status_resp

        resp = client.get("/api/v1/finetune/status/glm-ft-done")

    assert resp.status_code == 200
    assert resp.json()["status"] == "SUCCESS"


def test_finetune_status_llm_call_error_returns_503(client):
    with patch("app.routers.finetune.finetune_service.get_finetune_status") as mock_status:
        mock_status.side_effect = LLMCallError("SDK failed")

        resp = client.get("/api/v1/finetune/status/glm-ft-bad")

    assert resp.status_code == 503
    assert resp.json()["code"] == "LLM_CALL_ERROR"
@@ -1,151 +0,0 @@
"""Tests for finetune_service — uses LLMClient interface (no internal SDK access)."""
import pytest
from unittest.mock import MagicMock, AsyncMock

from app.clients.llm.base import LLMClient
from app.core.exceptions import LLMCallError
from app.models.finetune_models import (
    FinetuneStartRequest,
    FinetuneStartResponse,
    FinetuneStatusResponse,
)


# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------

def _make_llm(job_id: str = "glm-ft-test", status: str = "running", progress: int | None = None):
    """Return a MagicMock(spec=LLMClient) with submit_finetune and get_finetune_status as AsyncMocks."""
    llm = MagicMock(spec=LLMClient)
    llm.submit_finetune = AsyncMock(return_value=job_id)
    llm.get_finetune_status = AsyncMock(return_value={
        "job_id": job_id,
        "status": status,
        "progress": progress,
        "error_message": None,
    })
    return llm


# ---------------------------------------------------------------------------
# submit_finetune
# ---------------------------------------------------------------------------

@pytest.mark.asyncio
async def test_submit_finetune_returns_job_id():
    from app.services.finetune_service import submit_finetune

    llm = _make_llm(job_id="glm-ft-abc123")
    req = FinetuneStartRequest(
        jsonl_url="s3://bucket/train.jsonl",
        base_model="glm-4",
        hyperparams={"n_epochs": 3},
    )

    result = await submit_finetune(req, llm)

    assert isinstance(result, FinetuneStartResponse)
    assert result.job_id == "glm-ft-abc123"


@pytest.mark.asyncio
async def test_submit_finetune_calls_interface_with_correct_params():
    from app.services.finetune_service import submit_finetune

    llm = _make_llm(job_id="glm-ft-xyz")
    req = FinetuneStartRequest(
        jsonl_url="s3://bucket/train.jsonl",
        base_model="glm-4",
        hyperparams={"n_epochs": 5},
    )

    await submit_finetune(req, llm)

    llm.submit_finetune.assert_awaited_once_with(
        "s3://bucket/train.jsonl",
        "glm-4",
        {"n_epochs": 5},
    )


@pytest.mark.asyncio
async def test_submit_finetune_none_hyperparams_passes_empty_dict():
    """hyperparams=None should be passed as {} to the interface."""
    from app.services.finetune_service import submit_finetune

    llm = _make_llm(job_id="glm-ft-nohp")
    req = FinetuneStartRequest(
        jsonl_url="s3://bucket/train.jsonl",
        base_model="glm-4",
    )

    await submit_finetune(req, llm)

    llm.submit_finetune.assert_awaited_once_with(
        "s3://bucket/train.jsonl",
        "glm-4",
        {},
    )


@pytest.mark.asyncio
async def test_submit_finetune_raises_llm_call_error_on_failure():
    from app.services.finetune_service import submit_finetune

    llm = MagicMock(spec=LLMClient)
    llm.submit_finetune = AsyncMock(side_effect=LLMCallError("微调任务提交失败: SDK exploded"))

    req = FinetuneStartRequest(
        jsonl_url="s3://bucket/train.jsonl",
        base_model="glm-4",
    )

    with pytest.raises(LLMCallError):
        await submit_finetune(req, llm)


# ---------------------------------------------------------------------------
# get_finetune_status — status mapping
# ---------------------------------------------------------------------------

@pytest.mark.asyncio
@pytest.mark.parametrize("sdk_status,expected", [
    ("running", "RUNNING"),
    ("succeeded", "SUCCESS"),
    ("failed", "FAILED"),
    ("pending", "RUNNING"),    # unknown → conservative RUNNING
    ("queued", "RUNNING"),     # unknown → conservative RUNNING
    ("cancelled", "RUNNING"),  # unknown → conservative RUNNING
])
async def test_get_finetune_status_maps_status(sdk_status, expected):
    from app.services.finetune_service import get_finetune_status

    llm = _make_llm(status=sdk_status)

    result = await get_finetune_status("glm-ft-test", llm)

    assert isinstance(result, FinetuneStatusResponse)
    assert result.status == expected
    assert result.job_id == "glm-ft-test"


@pytest.mark.asyncio
async def test_get_finetune_status_includes_progress():
    from app.services.finetune_service import get_finetune_status

    llm = _make_llm(status="running", progress=42)
    result = await get_finetune_status("glm-ft-test", llm)

    assert result.progress == 42


@pytest.mark.asyncio
async def test_get_finetune_status_raises_llm_call_error_on_failure():
    from app.services.finetune_service import get_finetune_status

    llm = MagicMock(spec=LLMClient)
    llm.get_finetune_status = AsyncMock(side_effect=LLMCallError("查询微调任务失败: SDK exploded"))

    with pytest.raises(LLMCallError):
        await get_finetune_status("glm-ft-bad", llm)
@@ -1,8 +0,0 @@
"""T047: Health check endpoint test — GET /health → 200 {"status": "ok"}"""
from fastapi.testclient import TestClient


def test_health_returns_ok(client: TestClient):
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json() == {"status": "ok"}
@@ -1,63 +0,0 @@
import json
import numpy as np
import cv2
import pytest
from unittest.mock import AsyncMock

from app.core.exceptions import StorageError


def _make_test_image_bytes() -> bytes:
    img = np.zeros((80, 100, 3), dtype=np.uint8)
    _, buf = cv2.imencode(".jpg", img)
    return buf.tobytes()


SAMPLE_QUADS_JSON = json.dumps([
    {
        "subject": "电缆接头",
        "predicate": "位于",
        "object": "配电箱左侧",
        "qualifier": "2024年检修",
        "bbox": {"x": 5, "y": 5, "w": 20, "h": 15},
    }
])


def test_image_extract_returns_200(client, mock_llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=_make_test_image_bytes())
    mock_llm.chat_vision = AsyncMock(return_value=SAMPLE_QUADS_JSON)
    mock_storage.upload_bytes = AsyncMock(return_value=None)

    resp = client.post(
        "/api/v1/image/extract",
        json={"file_path": "image/test.jpg", "task_id": 1},
    )
    assert resp.status_code == 200
    data = resp.json()
    assert "items" in data
    assert data["items"][0]["subject"] == "电缆接头"
    assert data["items"][0]["cropped_image_path"] == "crops/1/0.jpg"


def test_image_extract_llm_parse_error_returns_502(client, mock_llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=_make_test_image_bytes())
    mock_llm.chat_vision = AsyncMock(return_value="not json {{")

    resp = client.post(
        "/api/v1/image/extract",
        json={"file_path": "image/test.jpg", "task_id": 1},
    )
    assert resp.status_code == 502
    assert resp.json()["code"] == "LLM_PARSE_ERROR"


def test_image_extract_storage_error_returns_502(client, mock_storage):
    mock_storage.download_bytes = AsyncMock(side_effect=StorageError("RustFS down"))

    resp = client.post(
        "/api/v1/image/extract",
        json={"file_path": "image/test.jpg", "task_id": 1},
    )
    assert resp.status_code == 502
    assert resp.json()["code"] == "STORAGE_ERROR"
@@ -1,102 +0,0 @@
import io
import json
import pytest
import numpy as np
import cv2
from unittest.mock import AsyncMock

from app.core.exceptions import LLMParseError
from app.models.image_models import ImageExtractRequest


def _make_test_image_bytes(width=100, height=80) -> bytes:
    img = np.zeros((height, width, 3), dtype=np.uint8)
    img[10:50, 10:60] = (255, 0, 0)  # blue rectangle
    _, buf = cv2.imencode(".jpg", img)
    return buf.tobytes()


SAMPLE_QUADS_JSON = json.dumps([
    {
        "subject": "电缆接头",
        "predicate": "位于",
        "object": "配电箱左侧",
        "qualifier": "2024年检修",
        "bbox": {"x": 10, "y": 10, "w": 40, "h": 30},
    }
])


@pytest.fixture
def image_bytes():
    return _make_test_image_bytes()


@pytest.fixture
def req():
    return ImageExtractRequest(file_path="image/test.jpg", task_id=1)


@pytest.mark.asyncio
async def test_extract_quads_returns_items(mock_llm, mock_storage, image_bytes, req):
    mock_storage.download_bytes = AsyncMock(return_value=image_bytes)
    mock_llm.chat_vision = AsyncMock(return_value=SAMPLE_QUADS_JSON)
    mock_storage.upload_bytes = AsyncMock(return_value=None)

    from app.services.image_service import extract_quads
    result = await extract_quads(req, mock_llm, mock_storage)

    assert len(result.items) == 1
    item = result.items[0]
    assert item.subject == "电缆接头"
    assert item.predicate == "位于"
    assert item.bbox.x == 10
    assert item.bbox.y == 10
    assert item.cropped_image_path == "crops/1/0.jpg"


@pytest.mark.asyncio
async def test_crop_is_uploaded(mock_llm, mock_storage, image_bytes, req):
    mock_storage.download_bytes = AsyncMock(return_value=image_bytes)
    mock_llm.chat_vision = AsyncMock(return_value=SAMPLE_QUADS_JSON)
    mock_storage.upload_bytes = AsyncMock(return_value=None)

    from app.services.image_service import extract_quads
    await extract_quads(req, mock_llm, mock_storage)

    # upload_bytes called once for the crop
    mock_storage.upload_bytes.assert_called_once()
    call_args = mock_storage.upload_bytes.call_args
    assert call_args.args[1] == "crops/1/0.jpg"


@pytest.mark.asyncio
async def test_out_of_bounds_bbox_is_clamped(mock_llm, mock_storage, req):
    img = _make_test_image_bytes(width=50, height=40)
    mock_storage.download_bytes = AsyncMock(return_value=img)

    # bbox goes outside image boundary
    oob_json = json.dumps([{
        "subject": "test",
        "predicate": "rel",
        "object": "obj",
        "qualifier": None,
        "bbox": {"x": 30, "y": 20, "w": 100, "h": 100},  # extends beyond 50x40
    }])
    mock_llm.chat_vision = AsyncMock(return_value=oob_json)
    mock_storage.upload_bytes = AsyncMock(return_value=None)

    from app.services.image_service import extract_quads
    # Should not raise; bbox is clamped
    result = await extract_quads(req, mock_llm, mock_storage)
    assert len(result.items) == 1


@pytest.mark.asyncio
async def test_llm_parse_error_raised(mock_llm, mock_storage, image_bytes, req):
    mock_storage.download_bytes = AsyncMock(return_value=image_bytes)
    mock_llm.chat_vision = AsyncMock(return_value="bad json {{")

    from app.services.image_service import extract_quads
    with pytest.raises(LLMParseError):
        await extract_quads(req, mock_llm, mock_storage)
@@ -1,81 +0,0 @@
import pytest
from unittest.mock import MagicMock, patch

from app.clients.llm.zhipuai_client import ZhipuAIClient
from app.core.exceptions import LLMCallError


@pytest.fixture
def mock_sdk_response():
    resp = MagicMock()
    resp.choices[0].message.content = '{"result": "ok"}'
    return resp


@pytest.fixture
def client():
    with patch("app.clients.llm.zhipuai_client.ZhipuAI"):
        c = ZhipuAIClient(api_key="test-key")
    return c


@pytest.mark.asyncio
async def test_chat_returns_content(client, mock_sdk_response):
    client._client.chat.completions.create.return_value = mock_sdk_response
    result = await client.chat("glm-4-flash", [{"role": "user", "content": "hello"}])
    assert result == '{"result": "ok"}'


@pytest.mark.asyncio
async def test_chat_vision_returns_content(client, mock_sdk_response):
    client._client.chat.completions.create.return_value = mock_sdk_response
    result = await client.chat_vision("glm-4v-flash", [{"role": "user", "content": []}])
    assert result == '{"result": "ok"}'


@pytest.mark.asyncio
async def test_llm_call_error_on_sdk_exception(client):
    client._client.chat.completions.create.side_effect = RuntimeError("quota exceeded")
    with pytest.raises(LLMCallError, match="大模型调用失败"):
        await client.chat("glm-4-flash", [{"role": "user", "content": "hi"}])


@pytest.mark.asyncio
async def test_submit_finetune_returns_job_id(client):
    """submit_finetune should call the SDK and return the job id."""
    resp = MagicMock()
    resp.id = "glm-ft-newjob"
    client._client.fine_tuning.jobs.create.return_value = resp

    job_id = await client.submit_finetune(
        jsonl_url="s3://bucket/train.jsonl",
        base_model="glm-4",
        hyperparams={"n_epochs": 2},
    )

    assert job_id == "glm-ft-newjob"
    client._client.fine_tuning.jobs.create.assert_called_once_with(
        training_file="s3://bucket/train.jsonl",
        model="glm-4",
        hyperparameters={"n_epochs": 2},
    )


@pytest.mark.asyncio
async def test_get_finetune_status_returns_correct_dict(client):
    """get_finetune_status should return a normalized dict with progress coerced to int."""
    resp = MagicMock()
    resp.id = "glm-ft-abc"
    resp.status = "running"
    resp.progress = "75"  # SDK may return string; should be coerced to int
    resp.error_message = None
    client._client.fine_tuning.jobs.retrieve.return_value = resp

    result = await client.get_finetune_status("glm-ft-abc")

    assert result == {
        "job_id": "glm-ft-abc",
        "status": "running",
        "progress": 75,
        "error_message": None,
    }
@@ -1,121 +0,0 @@
"""Tests for QA router: /api/v1/qa/gen-text and /api/v1/qa/gen-image."""
import json
import pytest
from unittest.mock import AsyncMock

from app.core.exceptions import LLMCallError, LLMParseError, StorageError


SAMPLE_QA_JSON = json.dumps([
    {"question": "电缆接头位于哪里?", "answer": "配电箱左侧"},
])

FAKE_IMAGE_BYTES = b"\xff\xd8\xff\xe0fake_jpeg_content"

TEXT_QA_PAYLOAD = {
    "items": [
        {
            "subject": "电缆接头",
            "predicate": "位于",
            "object": "配电箱左侧",
            "source_snippet": "电缆接头位于配电箱左侧",
        }
    ]
}

IMAGE_QA_PAYLOAD = {
    "items": [
        {
            "subject": "电缆接头",
            "predicate": "位于",
            "object": "配电箱左侧",
            "cropped_image_path": "crops/1/0.jpg",
        }
    ]
}


# ---------------------------------------------------------------------------
# POST /api/v1/qa/gen-text
# ---------------------------------------------------------------------------


def test_gen_text_qa_returns_200(client, mock_llm):
    mock_llm.chat = AsyncMock(return_value=SAMPLE_QA_JSON)

    resp = client.post("/api/v1/qa/gen-text", json=TEXT_QA_PAYLOAD)

    assert resp.status_code == 200
    data = resp.json()
    assert "pairs" in data
    assert len(data["pairs"]) == 1
    assert data["pairs"][0]["question"] == "电缆接头位于哪里?"
    assert data["pairs"][0]["answer"] == "配电箱左侧"


def test_gen_text_qa_llm_parse_error_returns_502(client, mock_llm):
    mock_llm.chat = AsyncMock(return_value="not valid json {{")

    resp = client.post("/api/v1/qa/gen-text", json=TEXT_QA_PAYLOAD)

    assert resp.status_code == 502
    assert resp.json()["code"] == "LLM_PARSE_ERROR"


def test_gen_text_qa_llm_call_error_returns_503(client, mock_llm):
    mock_llm.chat = AsyncMock(side_effect=LLMCallError("GLM timeout"))

    resp = client.post("/api/v1/qa/gen-text", json=TEXT_QA_PAYLOAD)

    assert resp.status_code == 503
    assert resp.json()["code"] == "LLM_CALL_ERROR"


# ---------------------------------------------------------------------------
# POST /api/v1/qa/gen-image
# ---------------------------------------------------------------------------


def test_gen_image_qa_returns_200(client, mock_llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=FAKE_IMAGE_BYTES)
    mock_llm.chat_vision = AsyncMock(return_value=SAMPLE_QA_JSON)

    resp = client.post("/api/v1/qa/gen-image", json=IMAGE_QA_PAYLOAD)

    assert resp.status_code == 200
    data = resp.json()
    assert "pairs" in data
    assert len(data["pairs"]) == 1
    pair = data["pairs"][0]
    assert pair["question"] == "电缆接头位于哪里?"
    assert pair["answer"] == "配电箱左侧"
    assert pair["image_path"] == "crops/1/0.jpg"


def test_gen_image_qa_llm_parse_error_returns_502(client, mock_llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=FAKE_IMAGE_BYTES)
    mock_llm.chat_vision = AsyncMock(return_value="bad json {{")

    resp = client.post("/api/v1/qa/gen-image", json=IMAGE_QA_PAYLOAD)

    assert resp.status_code == 502
    assert resp.json()["code"] == "LLM_PARSE_ERROR"


def test_gen_image_qa_llm_call_error_returns_503(client, mock_llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=FAKE_IMAGE_BYTES)
    mock_llm.chat_vision = AsyncMock(side_effect=LLMCallError("GLM vision timeout"))

    resp = client.post("/api/v1/qa/gen-image", json=IMAGE_QA_PAYLOAD)

    assert resp.status_code == 503
    assert resp.json()["code"] == "LLM_CALL_ERROR"


def test_gen_image_qa_storage_error_returns_502(client, mock_storage):
    mock_storage.download_bytes = AsyncMock(side_effect=StorageError("RustFS down"))

    resp = client.post("/api/v1/qa/gen-image", json=IMAGE_QA_PAYLOAD)

    assert resp.status_code == 502
    assert resp.json()["code"] == "STORAGE_ERROR"
@@ -1,236 +0,0 @@
"""Tests for qa_service: text QA (US5) and image QA (US6)."""
import base64
import json
import pytest
from unittest.mock import AsyncMock

from app.core.exceptions import LLMCallError, LLMParseError, StorageError


# ---------------------------------------------------------------------------
# Shared fixtures / helpers
# ---------------------------------------------------------------------------

SAMPLE_QA_JSON = json.dumps([
    {"question": "电缆接头位于哪里?", "answer": "配电箱左侧"},
])


# ---------------------------------------------------------------------------
# T039 — Text QA service tests (US5)
# ---------------------------------------------------------------------------


@pytest.mark.asyncio
async def test_gen_text_qa_prompt_contains_triples(mock_llm):
    """Triple fields and source_snippet must appear in the message sent to LLM."""
    from app.models.qa_models import GenTextQARequest, TextQAItem
    from app.services.qa_service import gen_text_qa

    mock_llm.chat = AsyncMock(return_value=SAMPLE_QA_JSON)

    req = GenTextQARequest(items=[
        TextQAItem(
            subject="电缆接头",
            predicate="位于",
            object="配电箱左侧",
            source_snippet="电缆接头位于配电箱左侧",
        )
    ])

    await gen_text_qa(req, mock_llm)

    assert mock_llm.chat.called
    call_args = mock_llm.chat.call_args
    messages = call_args.args[1] if call_args.args else call_args.kwargs["messages"]
    prompt_text = messages[0]["content"]
    assert "电缆接头" in prompt_text
    assert "位于" in prompt_text
    assert "配电箱左侧" in prompt_text
    assert "电缆接头位于配电箱左侧" in prompt_text


@pytest.mark.asyncio
async def test_gen_text_qa_returns_qa_pair_list(mock_llm):
    """Parsed JSON must be returned as QAPair list."""
    from app.models.qa_models import GenTextQARequest, QAPair, TextQAItem
    from app.services.qa_service import gen_text_qa

    mock_llm.chat = AsyncMock(return_value=SAMPLE_QA_JSON)

    req = GenTextQARequest(items=[
        TextQAItem(
            subject="电缆接头",
            predicate="位于",
            object="配电箱左侧",
            source_snippet="电缆接头位于配电箱左侧",
        )
    ])

    result = await gen_text_qa(req, mock_llm)

    assert len(result.pairs) == 1
    pair = result.pairs[0]
    assert isinstance(pair, QAPair)
    assert pair.question == "电缆接头位于哪里?"
    assert pair.answer == "配电箱左侧"


@pytest.mark.asyncio
async def test_gen_text_qa_llm_parse_error_on_malformed_response(mock_llm):
    """LLMParseError must be raised when LLM returns non-JSON."""
    from app.models.qa_models import GenTextQARequest, TextQAItem
    from app.services.qa_service import gen_text_qa

    mock_llm.chat = AsyncMock(return_value="this is not json {{")

    req = GenTextQARequest(items=[
        TextQAItem(subject="s", predicate="p", object="o", source_snippet="snip")
    ])

    with pytest.raises(LLMParseError):
        await gen_text_qa(req, mock_llm)


@pytest.mark.asyncio
async def test_gen_text_qa_llm_call_error_propagates(mock_llm):
    """LLMCallError from LLM client must propagate unchanged."""
    from app.models.qa_models import GenTextQARequest, TextQAItem
    from app.services.qa_service import gen_text_qa

    mock_llm.chat = AsyncMock(side_effect=LLMCallError("GLM timeout"))

    req = GenTextQARequest(items=[
        TextQAItem(subject="s", predicate="p", object="o", source_snippet="snip")
    ])

    with pytest.raises(LLMCallError):
        await gen_text_qa(req, mock_llm)


# ---------------------------------------------------------------------------
# T040 — Image QA service tests (US6)
# ---------------------------------------------------------------------------

FAKE_IMAGE_BYTES = b"\xff\xd8\xff\xe0fake_jpeg_content"


@pytest.mark.asyncio
async def test_gen_image_qa_downloads_image_and_encodes_base64(mock_llm, mock_storage):
    """Storage.download_bytes must be called, result base64-encoded in LLM message."""
    from app.models.qa_models import GenImageQARequest, ImageQAItem
    from app.services.qa_service import gen_image_qa

    mock_storage.download_bytes = AsyncMock(return_value=FAKE_IMAGE_BYTES)
    mock_llm.chat_vision = AsyncMock(return_value=SAMPLE_QA_JSON)

    req = GenImageQARequest(items=[
        ImageQAItem(
            subject="电缆接头",
            predicate="位于",
            object="配电箱左侧",
            cropped_image_path="crops/1/0.jpg",
        )
    ])

    await gen_image_qa(req, mock_llm, mock_storage)

    # Storage download must have been called with the correct path
    mock_storage.download_bytes.assert_called_once()
    call_args = mock_storage.download_bytes.call_args
    path_arg = call_args.args[1] if len(call_args.args) > 1 else call_args.kwargs.get("path", call_args.kwargs.get("key"))
    assert path_arg == "crops/1/0.jpg"


@pytest.mark.asyncio
async def test_gen_image_qa_multimodal_message_format(mock_llm, mock_storage):
    """Multimodal message must contain inline base64 image_url and text."""
    from app.models.qa_models import GenImageQARequest, ImageQAItem
    from app.services.qa_service import gen_image_qa

    mock_storage.download_bytes = AsyncMock(return_value=FAKE_IMAGE_BYTES)
    mock_llm.chat_vision = AsyncMock(return_value=SAMPLE_QA_JSON)

    req = GenImageQARequest(items=[
        ImageQAItem(
            subject="电缆接头",
            predicate="位于",
            object="配电箱左侧",
            qualifier="2024检修",
            cropped_image_path="crops/1/0.jpg",
        )
    ])

    await gen_image_qa(req, mock_llm, mock_storage)

    assert mock_llm.chat_vision.called
    call_args = mock_llm.chat_vision.call_args
    messages = call_args.args[1] if call_args.args else call_args.kwargs["messages"]

    # Find the content list in messages
    content = messages[0]["content"]
    assert isinstance(content, list)

    # Must have an image_url part with inline base64 data URI
    image_parts = [p for p in content if p.get("type") == "image_url"]
    assert len(image_parts) >= 1
    url = image_parts[0]["image_url"]["url"]
    expected_b64 = base64.b64encode(FAKE_IMAGE_BYTES).decode()
    assert url == f"data:image/jpeg;base64,{expected_b64}"

    # Must have a text part containing quad info
    text_parts = [p for p in content if p.get("type") == "text"]
    assert len(text_parts) >= 1
    text = text_parts[0]["text"]
    assert "电缆接头" in text
    assert "位于" in text
    assert "配电箱左侧" in text


@pytest.mark.asyncio
async def test_gen_image_qa_returns_image_qa_pair_with_image_path(mock_llm, mock_storage):
    """Result ImageQAPair must include image_path from the item."""
    from app.models.qa_models import GenImageQARequest, ImageQAItem, ImageQAPair
    from app.services.qa_service import gen_image_qa

    mock_storage.download_bytes = AsyncMock(return_value=FAKE_IMAGE_BYTES)
    mock_llm.chat_vision = AsyncMock(return_value=SAMPLE_QA_JSON)

    req = GenImageQARequest(items=[
        ImageQAItem(
            subject="电缆接头",
            predicate="位于",
            object="配电箱左侧",
            cropped_image_path="crops/1/0.jpg",
        )
    ])

    result = await gen_image_qa(req, mock_llm, mock_storage)

    assert len(result.pairs) == 1
    pair = result.pairs[0]
    assert isinstance(pair, ImageQAPair)
    assert pair.question == "电缆接头位于哪里?"
    assert pair.answer == "配电箱左侧"
    assert pair.image_path == "crops/1/0.jpg"


@pytest.mark.asyncio
async def test_gen_image_qa_storage_error_propagates(mock_llm, mock_storage):
    """StorageError from download must propagate unchanged."""
    from app.models.qa_models import GenImageQARequest, ImageQAItem
    from app.services.qa_service import gen_image_qa

    mock_storage.download_bytes = AsyncMock(side_effect=StorageError("RustFS down"))

    req = GenImageQARequest(items=[
        ImageQAItem(
            subject="s",
            predicate="p",
            object="o",
            cropped_image_path="crops/1/0.jpg",
        )
    ])

    with pytest.raises(StorageError):
        await gen_image_qa(req, mock_llm, mock_storage)
@@ -1,62 +0,0 @@
import pytest
from unittest.mock import MagicMock, patch
from botocore.exceptions import ClientError

from app.clients.storage.rustfs_client import RustFSClient
from app.core.exceptions import StorageError


@pytest.fixture
def client():
    with patch("app.clients.storage.rustfs_client.boto3") as mock_boto3:
        c = RustFSClient(
            endpoint="http://rustfs:9000",
            access_key="key",
            secret_key="secret",
        )
    c._s3 = MagicMock()
    return c


@pytest.mark.asyncio
async def test_download_bytes_returns_bytes(client):
    client._s3.get_object.return_value = {"Body": MagicMock(read=lambda: b"hello")}
    result = await client.download_bytes("source-data", "text/test.txt")
    assert result == b"hello"
    client._s3.get_object.assert_called_once_with(Bucket="source-data", Key="text/test.txt")


@pytest.mark.asyncio
async def test_download_bytes_raises_storage_error(client):
    client._s3.get_object.side_effect = ClientError(
        {"Error": {"Code": "NoSuchKey", "Message": "Not Found"}}, "GetObject"
    )
    with pytest.raises(StorageError, match="存储下载失败"):
        await client.download_bytes("source-data", "missing.txt")


@pytest.mark.asyncio
async def test_get_object_size_returns_content_length(client):
    client._s3.head_object.return_value = {"ContentLength": 1024}
    size = await client.get_object_size("source-data", "video/test.mp4")
    assert size == 1024
    client._s3.head_object.assert_called_once_with(Bucket="source-data", Key="video/test.mp4")


@pytest.mark.asyncio
async def test_get_object_size_raises_storage_error(client):
    client._s3.head_object.side_effect = ClientError(
        {"Error": {"Code": "NoSuchKey", "Message": "Not Found"}}, "HeadObject"
    )
    with pytest.raises(StorageError, match="获取文件大小失败"):
        await client.get_object_size("source-data", "video/missing.mp4")


@pytest.mark.asyncio
async def test_upload_bytes_calls_put_object(client):
    client._s3.put_object.return_value = {}
    await client.upload_bytes("source-data", "frames/1/0.jpg", b"jpeg-data", "image/jpeg")
    client._s3.put_object.assert_called_once()
    call_kwargs = client._s3.put_object.call_args
    assert call_kwargs.kwargs["Bucket"] == "source-data"
    assert call_kwargs.kwargs["Key"] == "frames/1/0.jpg"
@@ -1,63 +0,0 @@
import pytest
from unittest.mock import AsyncMock


SAMPLE_TRIPLES_JSON = '''[
    {
        "subject": "变压器",
        "predicate": "额定电压",
        "object": "110kV",
        "source_snippet": "该变压器额定电压为110kV",
        "source_offset": {"start": 0, "end": 12}
    }
]'''


def test_text_extract_returns_200(client, mock_llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=b"some text content")
    mock_llm.chat = AsyncMock(return_value=SAMPLE_TRIPLES_JSON)

    resp = client.post(
        "/api/v1/text/extract",
        json={"file_path": "text/test.txt", "file_name": "test.txt"},
    )
    assert resp.status_code == 200
    data = resp.json()
    assert "items" in data
    assert data["items"][0]["subject"] == "变压器"
    assert data["items"][0]["source_offset"]["start"] == 0


def test_text_extract_unsupported_format_returns_400(client, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=b"data")

    resp = client.post(
        "/api/v1/text/extract",
        json={"file_path": "text/test.xlsx", "file_name": "data.xlsx"},
    )
    assert resp.status_code == 400
    assert resp.json()["code"] == "UNSUPPORTED_FILE_TYPE"


def test_text_extract_storage_error_returns_502(client, mock_llm, mock_storage):
    from app.core.exceptions import StorageError
    mock_storage.download_bytes = AsyncMock(side_effect=StorageError("RustFS unreachable"))

    resp = client.post(
        "/api/v1/text/extract",
        json={"file_path": "text/test.txt", "file_name": "test.txt"},
    )
    assert resp.status_code == 502
    assert resp.json()["code"] == "STORAGE_ERROR"


def test_text_extract_llm_parse_error_returns_502(client, mock_llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=b"content")
    mock_llm.chat = AsyncMock(return_value="not json {{{{")

    resp = client.post(
        "/api/v1/text/extract",
        json={"file_path": "text/test.txt", "file_name": "test.txt"},
    )
    assert resp.status_code == 502
    assert resp.json()["code"] == "LLM_PARSE_ERROR"
@@ -1,122 +0,0 @@
import pytest
from unittest.mock import AsyncMock, MagicMock

from app.core.exceptions import LLMParseError, StorageError, UnsupportedFileTypeError
from app.models.text_models import TextExtractRequest


SAMPLE_TRIPLES_JSON = '''[
    {
        "subject": "变压器",
        "predicate": "额定电压",
        "object": "110kV",
        "source_snippet": "该变压器额定电压为110kV",
        "source_offset": {"start": 0, "end": 12}
    }
]'''


@pytest.fixture
def req_txt():
    return TextExtractRequest(file_path="text/test.txt", file_name="test.txt")


@pytest.fixture
def req_pdf():
    return TextExtractRequest(file_path="text/test.pdf", file_name="report.pdf")


@pytest.fixture
def req_docx():
    return TextExtractRequest(file_path="text/test.docx", file_name="doc.docx")


@pytest.fixture
def llm(mock_llm):
    mock_llm.chat = AsyncMock(return_value=SAMPLE_TRIPLES_JSON)
    return mock_llm


@pytest.mark.asyncio
async def test_txt_extraction_returns_triples(llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=b"test content")
    from app.services.text_service import extract_triples
    req = TextExtractRequest(file_path="text/test.txt", file_name="test.txt")
    result = await extract_triples(req, llm, mock_storage)
    assert len(result.items) == 1
    assert result.items[0].subject == "变压器"
    assert result.items[0].predicate == "额定电压"
    assert result.items[0].object == "110kV"
    assert result.items[0].source_offset.start == 0
    assert result.items[0].source_offset.end == 12


@pytest.mark.asyncio
async def test_pdf_extraction(llm, mock_storage, tmp_path):
    # We mock download_bytes to return a minimal PDF-like response
    # and mock pdfplumber.open to return pages with text.
    mock_storage.download_bytes = AsyncMock(return_value=b"%PDF fake")

    with pytest.MonkeyPatch().context() as mp:
        mock_page = MagicMock()
        mock_page.extract_text.return_value = "PDF content here"
        mock_pdf = MagicMock()
        mock_pdf.__enter__ = lambda s: s
        mock_pdf.__exit__ = MagicMock(return_value=False)
        mock_pdf.pages = [mock_page]
        mp.setattr("pdfplumber.open", lambda f: mock_pdf)

        from app.services import text_service
        import importlib
        importlib.reload(text_service)
        req = TextExtractRequest(file_path="text/test.pdf", file_name="doc.pdf")
        result = await text_service.extract_triples(req, llm, mock_storage)
        assert len(result.items) == 1


@pytest.mark.asyncio
async def test_docx_extraction(llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=b"PK fake docx bytes")

    with pytest.MonkeyPatch().context() as mp:
        mock_para = MagicMock()
        mock_para.text = "Word paragraph content"
        mock_doc = MagicMock()
        mock_doc.paragraphs = [mock_para]
        mp.setattr("docx.Document", lambda f: mock_doc)

        from app.services import text_service
        import importlib
        importlib.reload(text_service)
        req = TextExtractRequest(file_path="text/test.docx", file_name="doc.docx")
        result = await text_service.extract_triples(req, llm, mock_storage)
        assert len(result.items) == 1


@pytest.mark.asyncio
async def test_unsupported_format_raises_error(llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=b"data")
    from app.services.text_service import extract_triples
    req = TextExtractRequest(file_path="text/test.xlsx", file_name="data.xlsx")
    with pytest.raises(UnsupportedFileTypeError):
        await extract_triples(req, llm, mock_storage)


@pytest.mark.asyncio
async def test_storage_error_propagates(llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(side_effect=StorageError("not found"))
    from app.services.text_service import extract_triples
    req = TextExtractRequest(file_path="text/test.txt", file_name="test.txt")
    with pytest.raises(StorageError):
        await extract_triples(req, llm, mock_storage)


@pytest.mark.asyncio
async def test_llm_parse_error_propagates(mock_llm, mock_storage):
    mock_storage.download_bytes = AsyncMock(return_value=b"content")
    mock_llm.chat = AsyncMock(return_value="not json {{")
    from app.services.text_service import extract_triples
    req = TextExtractRequest(file_path="text/test.txt", file_name="test.txt")
    with pytest.raises(LLMParseError):
        await extract_triples(req, mock_llm, mock_storage)
@@ -1,71 +0,0 @@
import pytest
from unittest.mock import AsyncMock, patch

from app.core.exceptions import VideoTooLargeError


def test_extract_frames_returns_202(client, mock_storage):
    mock_storage.get_object_size = AsyncMock(return_value=10 * 1024 * 1024)  # 10 MB

    with patch("app.routers.video.BackgroundTasks.add_task"):
        resp = client.post(
            "/api/v1/video/extract-frames",
            json={
                "file_path": "video/test.mp4",
                "source_id": 10,
                "job_id": 42,
            },
        )
    assert resp.status_code == 202
    data = resp.json()
    assert data["job_id"] == 42


def test_extract_frames_video_too_large_returns_400(client, mock_storage):
    mock_storage.get_object_size = AsyncMock(return_value=300 * 1024 * 1024)  # 300 MB > 200 MB limit

    resp = client.post(
        "/api/v1/video/extract-frames",
        json={
            "file_path": "video/big.mp4",
            "source_id": 10,
            "job_id": 99,
        },
    )
    assert resp.status_code == 400
    assert resp.json()["code"] == "VIDEO_TOO_LARGE"


def test_video_to_text_returns_202(client, mock_storage):
    mock_storage.get_object_size = AsyncMock(return_value=10 * 1024 * 1024)

    with patch("app.routers.video.BackgroundTasks.add_task"):
        resp = client.post(
            "/api/v1/video/to-text",
            json={
                "file_path": "video/test.mp4",
                "source_id": 10,
                "job_id": 43,
                "start_sec": 0,
                "end_sec": 60,
            },
        )
    assert resp.status_code == 202
    assert resp.json()["job_id"] == 43


def test_video_to_text_too_large_returns_400(client, mock_storage):
    mock_storage.get_object_size = AsyncMock(return_value=300 * 1024 * 1024)

    resp = client.post(
        "/api/v1/video/to-text",
        json={
            "file_path": "video/big.mp4",
            "source_id": 10,
            "job_id": 99,
            "start_sec": 0,
            "end_sec": 60,
        },
    )
    assert resp.status_code == 400
    assert resp.json()["code"] == "VIDEO_TOO_LARGE"
@@ -1,195 +0,0 @@
import pytest
import numpy as np
import cv2
from unittest.mock import AsyncMock, MagicMock, patch

from app.models.video_models import ExtractFramesRequest, VideoToTextRequest


def _make_test_video(path: str, num_frames: int = 10, fps: float = 10.0, width=64, height=64):
    """Write a small test video to `path` using cv2.VideoWriter."""
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    out = cv2.VideoWriter(path, fourcc, fps, (width, height))
    for i in range(num_frames):
        frame = np.full((height, width, 3), (i * 20) % 256, dtype=np.uint8)
        out.write(frame)
    out.release()


# ── US3: Frame Extraction ──────────────────────────────────────────────────────

@pytest.fixture
def frames_req():
    return ExtractFramesRequest(
        file_path="video/test.mp4",
        source_id=10,
        job_id=42,
        mode="interval",
        frame_interval=3,
    )


@pytest.mark.asyncio
async def test_interval_mode_extracts_correct_frames(mock_storage, frames_req, tmp_path):
    video_path = str(tmp_path / "test.mp4")
    _make_test_video(video_path, num_frames=10, fps=10.0)

    with open(video_path, "rb") as f:
        video_bytes = f.read()

    mock_storage.download_bytes = AsyncMock(return_value=video_bytes)
    mock_storage.upload_bytes = AsyncMock(return_value=None)

    callback_payloads = []

    async def fake_callback(url, payload):
        callback_payloads.append(payload)

    with patch("app.services.video_service._post_callback", new=fake_callback):
        from app.services.video_service import extract_frames_task
        await extract_frames_task(frames_req, mock_storage, "http://backend/callback")

    assert len(callback_payloads) == 1
    cb = callback_payloads[0]
    assert cb["status"] == "SUCCESS"
    assert cb["job_id"] == 42
    # With 10 frames and interval=3, we expect frames at indices 0, 3, 6, 9 → 4 frames
    assert len(cb["frames"]) == 4


@pytest.mark.asyncio
async def test_keyframe_mode_extracts_scene_changes(mock_storage, tmp_path):
    video_path = str(tmp_path / "kf.mp4")
    # Create a video with 2 distinct scenes separated by a sudden color change
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    out = cv2.VideoWriter(video_path, fourcc, 10.0, (64, 64))
    for _ in range(5):
        out.write(np.zeros((64, 64, 3), dtype=np.uint8))  # black frames
    for _ in range(5):
        out.write(np.full((64, 64, 3), 200, dtype=np.uint8))  # bright frames
    out.release()

    with open(video_path, "rb") as f:
        video_bytes = f.read()

    mock_storage.download_bytes = AsyncMock(return_value=video_bytes)
    mock_storage.upload_bytes = AsyncMock(return_value=None)

    callback_payloads = []

    async def fake_callback(url, payload):
        callback_payloads.append(payload)

    req = ExtractFramesRequest(
        file_path="video/kf.mp4",
        source_id=10,
        job_id=43,
        mode="keyframe",
    )
    with patch("app.services.video_service._post_callback", new=fake_callback):
        from app.services.video_service import extract_frames_task
        await extract_frames_task(req, mock_storage, "http://backend/callback")

    cb = callback_payloads[0]
    assert cb["status"] == "SUCCESS"
    # Should capture at least the scene-change frame
    assert len(cb["frames"]) >= 1


@pytest.mark.asyncio
async def test_frame_upload_path_convention(mock_storage, frames_req, tmp_path):
    video_path = str(tmp_path / "test.mp4")
    _make_test_video(video_path, num_frames=3, fps=10.0)
    with open(video_path, "rb") as f:
        mock_storage.download_bytes = AsyncMock(return_value=f.read())
    mock_storage.upload_bytes = AsyncMock(return_value=None)

    callback_payloads = []

    async def fake_callback(url, payload):
        callback_payloads.append(payload)

    req = ExtractFramesRequest(
        file_path="video/test.mp4", source_id=10, job_id=99, mode="interval", frame_interval=1
    )
    with patch("app.services.video_service._post_callback", new=fake_callback):
        from app.services.video_service import extract_frames_task
        await extract_frames_task(req, mock_storage, "http://backend/callback")

    uploaded_paths = [call.args[1] for call in mock_storage.upload_bytes.call_args_list]
    for i, path in enumerate(uploaded_paths):
        assert path == f"frames/10/{i}.jpg"


@pytest.mark.asyncio
async def test_failed_extraction_sends_failed_callback(mock_storage, frames_req):
    mock_storage.download_bytes = AsyncMock(side_effect=Exception("storage failure"))

    callback_payloads = []

    async def fake_callback(url, payload):
        callback_payloads.append(payload)

    with patch("app.services.video_service._post_callback", new=fake_callback):
        from app.services.video_service import extract_frames_task
        await extract_frames_task(frames_req, mock_storage, "http://backend/callback")

    assert callback_payloads[0]["status"] == "FAILED"
    assert callback_payloads[0]["error_message"] is not None


# ── US4: Video To Text ─────────────────────────────────────────────────────────

@pytest.fixture
def totext_req():
    return VideoToTextRequest(
        file_path="video/test.mp4",
        source_id=10,
        job_id=44,
        start_sec=0.0,
        end_sec=1.0,
    )


@pytest.mark.asyncio
async def test_video_to_text_samples_frames_and_calls_llm(mock_llm, mock_storage, totext_req, tmp_path):
    video_path = str(tmp_path / "totext.mp4")
    _make_test_video(video_path, num_frames=20, fps=10.0)
    with open(video_path, "rb") as f:
        mock_storage.download_bytes = AsyncMock(return_value=f.read())
    mock_llm.chat_vision = AsyncMock(return_value="视频描述内容")
    mock_storage.upload_bytes = AsyncMock(return_value=None)

    callback_payloads = []

    async def fake_callback(url, payload):
        callback_payloads.append(payload)

    with patch("app.services.video_service._post_callback", new=fake_callback):
        from app.services.video_service import video_to_text_task
        await video_to_text_task(totext_req, mock_llm, mock_storage, "http://backend/callback")

    assert callback_payloads[0]["status"] == "SUCCESS"
    assert "output_path" in callback_payloads[0]
    assert callback_payloads[0]["output_path"].startswith("video-text/10/")
    mock_llm.chat_vision.assert_called_once()


@pytest.mark.asyncio
async def test_video_to_text_llm_failure_sends_failed_callback(mock_llm, mock_storage, totext_req, tmp_path):
    video_path = str(tmp_path / "fail.mp4")
    _make_test_video(video_path, num_frames=5, fps=10.0)
    with open(video_path, "rb") as f:
        mock_storage.download_bytes = AsyncMock(return_value=f.read())
    mock_llm.chat_vision = AsyncMock(side_effect=Exception("LLM unavailable"))

    callback_payloads = []

    async def fake_callback(url, payload):
        callback_payloads.append(payload)

    with patch("app.services.video_service._post_callback", new=fake_callback):
        from app.services.video_service import video_to_text_task
        await video_to_text_task(totext_req, mock_llm, mock_storage, "http://backend/callback")

    assert callback_payloads[0]["status"] == "FAILED"
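The deleted tests above depend on `client`, `mock_llm`, and `mock_storage` fixtures from a `conftest.py` that this diff does not show. A minimal sketch of what the two service mocks presumably provided; the attribute names (`chat`, `chat_vision`, `download_bytes`, `upload_bytes`, `get_object_size`) come from the tests themselves, while the default return values and the helper-function structure are assumptions:

```python
from unittest.mock import AsyncMock, MagicMock


def make_mock_llm() -> MagicMock:
    # Stand-in for the async GLM client wrapper; individual tests
    # override chat / chat_vision with their own return values.
    llm = MagicMock()
    llm.chat = AsyncMock(return_value="[]")
    llm.chat_vision = AsyncMock(return_value="")
    return llm


def make_mock_storage() -> MagicMock:
    # Stand-in for the RustFS storage client, covering the three
    # methods the tests exercise.
    storage = MagicMock()
    storage.download_bytes = AsyncMock(return_value=b"")
    storage.upload_bytes = AsyncMock(return_value=None)
    storage.get_object_size = AsyncMock(return_value=0)
    return storage
```

In the real `conftest.py`, these would be wrapped in `@pytest.fixture` functions, and `client` would be a FastAPI `TestClient` with the LLM and storage dependencies overridden by these mocks.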