Feishu Group Messages in the Daily Report Pipeline: Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

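Goal: Add Feishu groups as a second information source for the existing daily report pipeline. Extraction lives in a new, standalone bot/feishu-bot/ directory, and the final single detailed.md contains both WeChat and Feishu content.

Architecture: Once a day, the Feishu side calls the Open API in batch to pull the previous day's messages and writes them to bot/feishu-bot/data/daily/<date>.feishu.json. The schema is strictly aligned with the WeChat side's <date>.json. The only change to bot/wechat-bot/scripts/generate_report.py is a new _load_daily(date_str) function that concatenates the groups lists of the two JSON files. _display_source, dedupe_highlights, sanitize_highlights, and the prompt need no changes.

Tech Stack: Python 3.9+, depending only on requests and python-dotenv; the official Feishu SDK is not used. Tests use unittest (standard library) with offline JSON fixtures and never hit the live API.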

Reference Spec: docs/superpowers/specs/2026-05-01-feishu-group-extraction-design.md


File Structure

New:

bot/feishu-bot/
├── README.md                      # Usage + steps to configure the Feishu custom app
├── CLAUDE.md                      # Quick context for AI agents
├── .env.example                   # FEISHU_APP_ID / FEISHU_APP_SECRET / FEISHU_CHAT_IDS / FEISHU_GROUP_LABELS
├── .gitignore                     # data/, .env
├── scripts/
│   ├── _feishu.py                 # client + token + decoders (single source of truth)
│   ├── extract_day.py             # CLI, aligned with the WeChat side
│   └── inventory.py               # Lists the groups the bot is in and their message volume over the last 7 days
├── data/
│   └── daily/<date>.feishu.json   # Extraction output, gitignored
└── tests/
    ├── __init__.py
    ├── fixtures/
    │   ├── text_simple.json
    │   ├── post_simple.json
    │   ├── post_with_links.json
    │   ├── share_chat.json
    │   ├── share_user.json
    │   ├── file.json
    │   ├── sticker.json
    │   ├── messages_page1.json
    │   └── messages_page2.json
    ├── test_decoders.py
    ├── test_client.py
    ├── test_pagination.py
    ├── test_extract_day.py
    └── test_load_daily.py

Modified:

  • bot/wechat-bot/scripts/generate_report.py: extract _load_daily(), roughly +25 lines

All paths referenced in each task's "Files" section are relative to the repository root.


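Pre-conditions (user actions, outside the scope of automation)

These steps are performed by a person on the Feishu Open Platform and cannot be executed as part of this plan. The code tasks can progress through unit tests without real credentials, but Task 17 (end-to-end verification) requires the three steps below to be completed first.

  1. Create a custom app: Feishu Open Platform → create an enterprise custom app → enable the bot capability.
  2. Request permission scopes: select im:message:readonly, im:chat:readonly, im:chat.member:read, and contact:user.id:readonly. Submit the request and wait for approval.
  3. Get credentials and add the bot to the group: copy app_id / app_secret into .env; in the Hermes Feishu group, @-mention the bot to add it to the group.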

Task 1: Project directory skeleton

Files:

  • Create: bot/feishu-bot/.gitignore

  • Create: bot/feishu-bot/.env.example

  • Create: bot/feishu-bot/scripts/ (directory marker via .gitkeep-style file is unnecessary; we'll add _feishu.py next task)

  • Create: bot/feishu-bot/tests/__init__.py

  • Create: bot/feishu-bot/tests/fixtures/.gitkeep

  • Step 1: Write .gitignore

Create bot/feishu-bot/.gitignore:

.env
.env.*
!.env.example

# Python
__pycache__/
*.pyc
*.pyo
.venv/
venv/

# Extracted Feishu data — sensitive, never commit
data/

# macOS
.DS_Store
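  • Step 2: Write .env.example

Create bot/feishu-bot/.env.example:

# Feishu custom app credentials
FEISHU_APP_ID=cli_xxxxxxxx
FEISHU_APP_SECRET=xxxxxxxxxxxx

# Comma-separated list of chat_ids of the groups to monitor
FEISHU_CHAT_IDS=oc_xxxxxxxxxxxx

# Optional: explicitly map a chat_id to the group name prefix shown in the daily report
# Separate entries with ";"; within an entry, separate chat_id and label with "="
# Default: "Hermes Agent 中文社區飛書群" followed by the 1-based position in FEISHU_CHAT_IDS
# FEISHU_GROUP_LABELS=oc_xxx=Hermes Agent 中文社區飛書群 1
  • Step 3: Create the test directory skeleton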
mkdir -p bot/feishu-bot/scripts bot/feishu-bot/tests/fixtures bot/feishu-bot/data/daily
touch bot/feishu-bot/tests/__init__.py
touch bot/feishu-bot/tests/fixtures/.gitkeep
  • Step 4: Commit
git add bot/feishu-bot/.gitignore bot/feishu-bot/.env.example \
bot/feishu-bot/tests/__init__.py bot/feishu-bot/tests/fixtures/.gitkeep
git commit -m "feat(feishu-bot): 項目目錄骨架"

Task 2: Message decoder for the text type

Files:

  • Create: bot/feishu-bot/scripts/_feishu.py
  • Create: bot/feishu-bot/tests/fixtures/text_simple.json
  • Create: bot/feishu-bot/tests/test_decoders.py

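Background: In the response from Feishu's im/v1/messages, each message's body.content is a JSON string that in turn wraps a type-specific structure. The text type is the simplest: {"text": "你好"}.

  • Step 1: Write the fixture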

Create bot/feishu-bot/tests/fixtures/text_simple.json:

{
"message_id": "om_aaa111",
"create_time": "1714492800000",
"msg_type": "text",
"sender": {"id": "ou_aaa", "id_type": "open_id"},
"body": {"content": "{\"text\":\"deepseek 怎麼樣\"}"}
}

Note that create_time is a millisecond-precision string (what the Feishu API actually returns), and body.content is also a string.
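
The two-level decoding noted above can be sketched as follows. This is an illustration only: the helper name peek_text_message is hypothetical and is not part of _feishu.py.

```python
import datetime as dt
import json
from zoneinfo import ZoneInfo

# Same shape as the text_simple.json fixture above (trimmed).
RAW = {
    "create_time": "1714492800000",
    "msg_type": "text",
    "body": {"content": "{\"text\":\"deepseek 怎麼樣\"}"},
}


def peek_text_message(raw: dict) -> tuple:
    """Return (seconds, local HH:MM:SS, text) for a text envelope."""
    # body.content is itself a JSON string, so it needs its own json.loads.
    content = json.loads(raw["body"]["content"])
    # create_time is a millisecond string; integer-divide to get seconds.
    seconds = int(raw["create_time"]) // 1000
    local = dt.datetime.fromtimestamp(seconds, ZoneInfo("Asia/Shanghai"))
    return seconds, local.strftime("%H:%M:%S"), content["text"]
```

For the fixture above this yields 1714492800 seconds and the local time 00:00:00, since that instant is midnight on 2024-05-01 in Asia/Shanghai.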

  • Step 2: Write the failing test

Create bot/feishu-bot/tests/test_decoders.py:

import json
import unittest
from pathlib import Path

import sys
sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "scripts"))
from _feishu import decode_message # noqa: E402

FIXTURES = Path(__file__).parent / "fixtures"


def load_fixture(name: str) -> dict:
    return json.loads((FIXTURES / name).read_text(encoding="utf-8"))


class TestDecodeText(unittest.TestCase):
    def test_simple_text(self):
        raw = load_fixture("text_simple.json")
        msg = decode_message(raw)
        self.assertIsNotNone(msg)
        self.assertEqual(msg["type"], "text")
        self.assertEqual(msg["sender_wxid"], "ou_aaa")
        self.assertEqual(msg["sender_name"], "")
        self.assertEqual(msg["text"], "deepseek 怎麼樣")
        # ts is an int in seconds (converted from the millisecond string)
        self.assertEqual(msg["ts"], 1714492800)
        # time is HH:MM:SS in Asia/Shanghai; 1714492800 is 2024-05-01 00:00:00 there
        self.assertEqual(msg["time"], "00:00:00")


if __name__ == "__main__":
    unittest.main()
  • Step 3: Run the tests and confirm they fail
cd bot/feishu-bot
/usr/bin/python3 -m unittest tests.test_decoders -v

Expected: ModuleNotFoundError: No module named '_feishu' (or ImportError: cannot import name 'decode_message' from '_feishu' if the file already exists)

  • Step 4: Write the minimal implementation

Create bot/feishu-bot/scripts/_feishu.py:

"""Feishu Open API client + message decoders.

Single source of truth for everything in this directory.
"""
from __future__ import annotations

import datetime as dt
import json
from zoneinfo import ZoneInfo

TZ = ZoneInfo("Asia/Shanghai")


def _ts_seconds(create_time: str | int) -> int:
    """Feishu API returns create_time as milliseconds (string or int)."""
    return int(create_time) // 1000


def _decode_text(content: dict) -> str:
    return (content.get("text") or "").strip()


def decode_message(raw: dict) -> dict | None:
    """Decode one Feishu message envelope into our internal schema.

    Returns None for unsupported / noise types (image, sticker, audio, etc.).
    """
    msg_type = raw.get("msg_type")
    body_raw = (raw.get("body") or {}).get("content") or "{}"
    try:
        content = json.loads(body_raw) if isinstance(body_raw, str) else body_raw
    except json.JSONDecodeError:
        return None

    if msg_type == "text":
        text = _decode_text(content)
    else:
        return None

    if not text:
        return None

    ts = _ts_seconds(raw.get("create_time") or 0)
    sender_id = ((raw.get("sender") or {}).get("id")) or ""
    return {
        "ts": ts,
        "time": dt.datetime.fromtimestamp(ts, TZ).strftime("%H:%M:%S"),
        "sender_wxid": sender_id,
        "sender_name": "",
        "type": msg_type,
        "text": text,
    }
  • Step 5: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_decoders -v

Expected: PASS

  • Step 6: Commit
git add bot/feishu-bot/scripts/_feishu.py \
bot/feishu-bot/tests/test_decoders.py \
bot/feishu-bot/tests/fixtures/text_simple.json
git commit -m "feat(feishu-bot): 解碼 text 消息(含 ts/time 轉換)"

Task 3: Message decoder for post rich text

Files:

  • Modify: bot/feishu-bot/scripts/_feishu.py
  • Create: bot/feishu-bot/tests/fixtures/post_simple.json
  • Create: bot/feishu-bot/tests/fixtures/post_with_links.json
  • Modify: bot/feishu-bot/tests/test_decoders.py

Background: The content of a Feishu post message looks like:

{"title":"標題","content":[[{"tag":"text","text":"段一"},{"tag":"a","href":"https://x","text":"鏈接"}],[{"tag":"text","text":"段二"}]]}

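content is an array of paragraph arrays: each outer item is one paragraph, and each inner item is an element within that paragraph (text / a / at / img / emotion).

  • Step 1: Write two fixtures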

Create bot/feishu-bot/tests/fixtures/post_simple.json:

{
"message_id": "om_post1",
"create_time": "1714492810000",
"msg_type": "post",
"sender": {"id": "ou_bbb", "id_type": "open_id"},
"body": {"content": "{\"title\":\"週報\",\"content\":[[{\"tag\":\"text\",\"text\":\"本週完成 X\"}],[{\"tag\":\"text\",\"text\":\"下週計劃 Y\"}]]}"}
}

Create bot/feishu-bot/tests/fixtures/post_with_links.json:

{
"message_id": "om_post2",
"create_time": "1714492820000",
"msg_type": "post",
"sender": {"id": "ou_ccc", "id_type": "open_id"},
"body": {"content": "{\"title\":\"\",\"content\":[[{\"tag\":\"text\",\"text\":\"看這個 \"},{\"tag\":\"a\",\"href\":\"https://example.com\",\"text\":\"博客\"},{\"tag\":\"text\",\"text\":\" 還可以\"}],[{\"tag\":\"at\",\"user_id\":\"ou_xxx\",\"user_name\":\"張三\"},{\"tag\":\"text\",\"text\":\" 你怎麼看\"}]]}"}
}
  • Step 2: Add test cases

Append to bot/feishu-bot/tests/test_decoders.py:

class TestDecodePost(unittest.TestCase):
    def test_post_with_title(self):
        raw = load_fixture("post_simple.json")
        msg = decode_message(raw)
        self.assertEqual(msg["type"], "post")
        self.assertEqual(msg["sender_wxid"], "ou_bbb")
        # A blank line separates the title from the body, and paragraphs from each other
        self.assertEqual(msg["text"], "週報\n\n本週完成 X\n\n下週計劃 Y")

    def test_post_with_link_and_at(self):
        raw = load_fixture("post_with_links.json")
        msg = decode_message(raw)
        # Links are kept as [text](url); at-mentions are kept as @name
        self.assertEqual(
            msg["text"],
            "看這個 [博客](https://example.com) 還可以\n\n@張三 你怎麼看",
        )
  • Step 3: Run the tests and confirm they fail
/usr/bin/python3 -m unittest tests.test_decoders.TestDecodePost -v

Expected: FAIL (decode_message does not support post yet and returns None)

  • Step 4: Implement _decode_post

Add to bot/feishu-bot/scripts/_feishu.py:

def _decode_post_element(el: dict) -> str:
    """One inline element inside a post paragraph."""
    tag = el.get("tag")
    if tag == "text":
        return el.get("text") or ""
    if tag == "a":
        text = el.get("text") or ""
        href = el.get("href") or ""
        return f"[{text}]({href})" if href else text
    if tag == "at":
        # user_name may be empty (the mentioned user is not in the group); fall back to user_id
        return "@" + (el.get("user_name") or el.get("user_id") or "")
    if tag == "img":
        return "[圖片]"
    if tag == "emotion":
        return ""
    if tag == "media":
        return "[媒體]"
    if tag == "file":
        return "[文件]"
    return ""


def _decode_post(content: dict) -> str:
    """Flatten Feishu post content into plain text with links and ats."""
    title = (content.get("title") or "").strip()
    paragraphs = content.get("content") or []
    rendered = []
    for para in paragraphs:
        if not isinstance(para, list):
            continue
        line = "".join(_decode_post_element(el) for el in para if isinstance(el, dict))
        line = line.strip()
        if line:
            rendered.append(line)
    body = "\n\n".join(rendered)
    if title and body:
        return f"{title}\n\n{body}"
    return title or body

Modify the decode_message dispatch:

    if msg_type == "text":
        text = _decode_text(content)
    elif msg_type in ("post", "post_v2"):
        text = _decode_post(content)
    else:
        return None
  • Step 5: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_decoders -v

Expected: 3 tests, all PASS

  • Step 6: Commit
git add bot/feishu-bot/scripts/_feishu.py \
bot/feishu-bot/tests/test_decoders.py \
bot/feishu-bot/tests/fixtures/post_simple.json \
bot/feishu-bot/tests/fixtures/post_with_links.json
git commit -m "feat(feishu-bot): 解碼 post 富文本(保留鏈接/@/段落分隔)"

Task 4: Message decoder for share_chat / share_user

Files:

  • Modify: bot/feishu-bot/scripts/_feishu.py
  • Create: bot/feishu-bot/tests/fixtures/share_chat.json
  • Create: bot/feishu-bot/tests/fixtures/share_user.json
  • Modify: bot/feishu-bot/tests/test_decoders.py

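Background: The content of a share_chat message looks like {"chatId":"oc_xxx"}, and that of share_user looks like {"userId":"ou_xxx"}. Neither type carries a name or URL, so the client needs a second lookup. To keep the decoder a pure function, decode_message accepts an optional resolver callback, and the caller decides how to fill in names.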
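
One way a caller could supply that callback is sketched below. It is a hedged illustration: make_resolver and the two lookup callables are hypothetical names, and in practice the lookups would be backed by API calls.

```python
from typing import Callable, Dict, Optional, Tuple

Lookup = Callable[[str], Optional[str]]


def make_resolver(chat_lookup: Lookup, user_lookup: Lookup):
    """Build a resolver(kind, ref_id) that memoizes lookups for one run."""
    cache: Dict[Tuple[str, str], Optional[str]] = {}

    def resolver(kind: str, ref_id: str) -> Optional[str]:
        key = (kind, ref_id)
        if key not in cache:
            if kind == "chat":
                cache[key] = chat_lookup(ref_id)
            elif kind == "user":
                cache[key] = user_lookup(ref_id)
            else:
                # Unknown kind: return None so the decoder falls back to the raw id.
                cache[key] = None
        return cache[key]

    return resolver
```

Returning None for anything the lookups cannot resolve matches the degraded behavior the decoder is expected to have without a resolver.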

  • Step 1: Write two fixtures

Create bot/feishu-bot/tests/fixtures/share_chat.json:

{
"message_id": "om_share1",
"create_time": "1714492830000",
"msg_type": "share_chat",
"sender": {"id": "ou_ddd", "id_type": "open_id"},
"body": {"content": "{\"chatId\":\"oc_target_group\"}"}
}

Create bot/feishu-bot/tests/fixtures/share_user.json:

{
"message_id": "om_share2",
"create_time": "1714492840000",
"msg_type": "share_user",
"sender": {"id": "ou_eee", "id_type": "open_id"},
"body": {"content": "{\"userId\":\"ou_target_user\"}"}
}
  • Step 2: Add test cases

Append to bot/feishu-bot/tests/test_decoders.py:

class TestDecodeShare(unittest.TestCase):
    def test_share_chat_with_resolver(self):
        raw = load_fixture("share_chat.json")
        resolver = lambda kind, ref_id: f"群名(假){ref_id}" if kind == "chat" else None
        msg = decode_message(raw, resolver=resolver)
        self.assertEqual(msg["type"], "share")
        self.assertEqual(msg["text"], "[轉發鏈接] 群名(假)oc_target_group")

    def test_share_chat_without_resolver_degrades(self):
        raw = load_fixture("share_chat.json")
        msg = decode_message(raw)
        self.assertEqual(msg["text"], "[轉發鏈接] oc_target_group")

    def test_share_user_with_resolver(self):
        raw = load_fixture("share_user.json")
        resolver = lambda kind, ref_id: "李四" if kind == "user" else None
        msg = decode_message(raw, resolver=resolver)
        self.assertEqual(msg["type"], "share")
        self.assertEqual(msg["text"], "[轉發名片] 李四")
  • Step 3: Run the tests and confirm they fail
/usr/bin/python3 -m unittest tests.test_decoders.TestDecodeShare -v

Expected: FAIL

  • Step 4: Implement share decoding + the resolver parameter

Modify decode_message in bot/feishu-bot/scripts/_feishu.py:

from typing import Callable, Optional

# Optional[str] rather than "str | None": this alias is evaluated at runtime,
# where the "|" union syntax requires Python 3.10+ (the plan targets 3.9+).
ShareResolver = Callable[[str, str], Optional[str]]


def _decode_share_chat(content: dict, resolver: ShareResolver | None) -> str:
    chat_id = content.get("chatId") or content.get("chat_id") or ""
    name = resolver("chat", chat_id) if (resolver and chat_id) else None
    return f"[轉發鏈接] {name or chat_id}"


def _decode_share_user(content: dict, resolver: ShareResolver | None) -> str:
    user_id = content.get("userId") or content.get("user_id") or ""
    name = resolver("user", user_id) if (resolver and user_id) else None
    return f"[轉發名片] {name or user_id}"


def decode_message(raw: dict, resolver: ShareResolver | None = None) -> dict | None:
    msg_type = raw.get("msg_type")
    body_raw = (raw.get("body") or {}).get("content") or "{}"
    try:
        content = json.loads(body_raw) if isinstance(body_raw, str) else body_raw
    except json.JSONDecodeError:
        return None

    if msg_type == "text":
        text = _decode_text(content)
        out_type = "text"
    elif msg_type in ("post", "post_v2"):
        text = _decode_post(content)
        out_type = "post"
    elif msg_type == "share_chat":
        text = _decode_share_chat(content, resolver)
        out_type = "share"
    elif msg_type == "share_user":
        text = _decode_share_user(content, resolver)
        out_type = "share"
    else:
        return None

    if not text:
        return None

    ts = _ts_seconds(raw.get("create_time") or 0)
    sender_id = ((raw.get("sender") or {}).get("id")) or ""
    return {
        "ts": ts,
        "time": dt.datetime.fromtimestamp(ts, TZ).strftime("%H:%M:%S"),
        "sender_wxid": sender_id,
        "sender_name": "",
        "type": out_type,
        "text": text,
    }

Note: the type field used to be msg_type as-is; it is now set through an explicit out_type mapping, because share_chat and share_user are both normalized to share.
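
The same normalization could also be written as a lookup table, which keeps the supported types in one place. This is only an alternative sketch; OUT_TYPES and normalize_type are hypothetical names that the plan's code does not use.

```python
from typing import Optional

# Feishu msg_type -> type stored in the daily JSON, mirroring the dispatch above.
OUT_TYPES = {
    "text": "text",
    "post": "post",
    "post_v2": "post",
    "share_chat": "share",
    "share_user": "share",
}


def normalize_type(msg_type: Optional[str]) -> Optional[str]:
    """Return the normalized type, or None for unsupported message types."""
    return OUT_TYPES.get(msg_type or "")
```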

  • Step 5: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_decoders -v

Expected: 6 tests, all PASS

  • Step 6: Commit
git add bot/feishu-bot/scripts/_feishu.py \
bot/feishu-bot/tests/test_decoders.py \
bot/feishu-bot/tests/fixtures/share_chat.json \
bot/feishu-bot/tests/fixtures/share_user.json
git commit -m "feat(feishu-bot): 解碼 share_chat / share_user(含 resolver 回調)"

Task 5: Message decoder for file + unknown types

Files:

  • Modify: bot/feishu-bot/scripts/_feishu.py

  • Create: bot/feishu-bot/tests/fixtures/file.json

  • Create: bot/feishu-bot/tests/fixtures/sticker.json

  • Modify: bot/feishu-bot/tests/test_decoders.py

  • Step 1: Write two fixtures

Create bot/feishu-bot/tests/fixtures/file.json:

{
"message_id": "om_file1",
"create_time": "1714492850000",
"msg_type": "file",
"sender": {"id": "ou_fff", "id_type": "open_id"},
"body": {"content": "{\"file_key\":\"file_xxx\",\"file_name\":\"演示文檔.pdf\"}"}
}

Create bot/feishu-bot/tests/fixtures/sticker.json:

{
"message_id": "om_stk1",
"create_time": "1714492860000",
"msg_type": "sticker",
"sender": {"id": "ou_ggg", "id_type": "open_id"},
"body": {"content": "{\"file_key\":\"sticker_xxx\"}"}
}
  • Step 2: Add test cases

Append to bot/feishu-bot/tests/test_decoders.py:

class TestDecodeFile(unittest.TestCase):
    def test_file_keeps_filename(self):
        raw = load_fixture("file.json")
        msg = decode_message(raw)
        self.assertEqual(msg["type"], "file")
        self.assertEqual(msg["text"], "[文件] 演示文檔.pdf")


class TestDecodeUnknown(unittest.TestCase):
    def test_sticker_dropped(self):
        raw = load_fixture("sticker.json")
        msg = decode_message(raw)
        self.assertIsNone(msg)

    def test_garbage_content_dropped(self):
        raw = {
            "msg_type": "text",
            "create_time": "1714492870000",
            "sender": {"id": "ou_x"},
            "body": {"content": "not json"},
        }
        self.assertIsNone(decode_message(raw))
  • Step 3: Run the tests; confirm the file test fails and the unknown-type tests already pass
/usr/bin/python3 -m unittest tests.test_decoders -v

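Expected: the file test FAILs; the sticker and garbage tests already pass (sticker falls through the dispatch to else: return None, and the garbage content fails json.loads, which also returns None)

  • Step 4: Implement _decode_file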

Modify bot/feishu-bot/scripts/_feishu.py:

def _decode_file(content: dict) -> str:
    name = (content.get("file_name") or "").strip()
    return f"[文件] {name}" if name else ""

Add to dispatch in decode_message:

    elif msg_type == "file":
        text = _decode_file(content)
        out_type = "file"
  • Step 5: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_decoders -v

Expected: 9 tests, all PASS

  • Step 6: Commit
git add bot/feishu-bot/scripts/_feishu.py \
bot/feishu-bot/tests/test_decoders.py \
bot/feishu-bot/tests/fixtures/file.json \
bot/feishu-bot/tests/fixtures/sticker.json
git commit -m "feat(feishu-bot): 解碼 file(保留文件名)+ 未知類型 drop"

Task 6: HTTP client skeleton + tenant_access_token

Files:

  • Modify: bot/feishu-bot/scripts/_feishu.py
  • Create: bot/feishu-bot/tests/test_client.py

Background: Feishu authentication flow:

POST https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal
Body: {"app_id":"...","app_secret":"..."}
Resp: {"code":0, "tenant_access_token":"t-xxx", "expire": 7200}

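Subsequent business calls add the header Authorization: Bearer <token>. An expired token returns code == 99991663 and requires a refresh and retry.

FeishuClient accepts an optional transport callable with the signature (method, url, *, headers, params, json) -> (status_code, json_body). Production uses requests; tests use an in-memory fake.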
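
The token response also carries an expire value in seconds. A minimal sketch of using it to refresh proactively follows. It is an optional illustration, not part of the FeishuClient in this plan, which caches the token for the run and relies on the 99991663 retry instead; ExpiringToken, fetch, and safety_margin are hypothetical names.

```python
import time
from typing import Callable, Optional, Tuple


class ExpiringToken:
    """Cache a token and refetch it shortly before it expires."""

    def __init__(self, fetch: Callable[[], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic,
                 safety_margin: float = 60.0):
        self._fetch = fetch            # returns (token, ttl_seconds)
        self._clock = clock            # injectable so tests need no sleeping
        self._safety_margin = safety_margin
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            token, ttl = self._fetch()
            self._token = token
            # Refresh a little early so a request never starts with a stale token.
            self._expires_at = now + max(ttl - self._safety_margin, 0.0)
        return self._token
```

For a once-a-day batch job that finishes well inside the token lifetime, the simpler cache-for-the-run approach in this task is likely sufficient.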

  • Step 1: Write the tests

Create bot/feishu-bot/tests/test_client.py:

import sys
import unittest
from pathlib import Path

sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "scripts"))
from _feishu import FeishuClient # noqa: E402


class FakeTransport:
    """Records calls + returns canned responses by URL match."""

    def __init__(self):
        self.calls: list[dict] = []
        self.responses: dict[str, list[tuple[int, dict]]] = {}

    def respond(self, url_substring: str, status: int, body: dict):
        self.responses.setdefault(url_substring, []).append((status, body))

    def __call__(self, method, url, *, headers=None, params=None, json=None):
        self.calls.append({
            "method": method, "url": url,
            "headers": headers or {}, "params": params or {}, "json": json,
        })
        for sub, queue in self.responses.items():
            if sub in url and queue:
                return queue.pop(0)
        raise AssertionError(f"unexpected request: {method} {url}")


class TestTokenFetch(unittest.TestCase):
    def test_fetch_token_first_call(self):
        t = FakeTransport()
        t.respond("tenant_access_token", 200, {
            "code": 0, "tenant_access_token": "t-abc", "expire": 7200,
        })
        c = FeishuClient("cli_a", "secret_a", transport=t)

        token = c._get_token()

        self.assertEqual(token, "t-abc")
        self.assertEqual(t.calls[0]["json"], {"app_id": "cli_a", "app_secret": "secret_a"})

    def test_token_cached_within_run(self):
        t = FakeTransport()
        t.respond("tenant_access_token", 200, {
            "code": 0, "tenant_access_token": "t-abc", "expire": 7200,
        })
        c = FeishuClient("cli_a", "secret_a", transport=t)

        c._get_token()
        c._get_token()

        self.assertEqual(len(t.calls), 1)

    def test_token_fetch_failure_raises(self):
        t = FakeTransport()
        t.respond("tenant_access_token", 200, {"code": 1234, "msg": "bad app"})
        c = FeishuClient("cli_a", "secret_a", transport=t)

        with self.assertRaises(RuntimeError):
            c._get_token()


if __name__ == "__main__":
    unittest.main()
  • Step 2: Run the tests and confirm they fail
cd bot/feishu-bot
/usr/bin/python3 -m unittest tests.test_client -v

Expected: FAIL, because FeishuClient does not exist yet

  • Step 3: Implement FeishuClient + _get_token

Add to top of bot/feishu-bot/scripts/_feishu.py:

import time

FEISHU_API = "https://open.feishu.cn/open-apis"
TOKEN_URL = f"{FEISHU_API}/auth/v3/tenant_access_token/internal"


def _default_transport(method: str, url: str, *, headers=None, params=None, json=None):
    """Real HTTP transport using `requests`. Imported lazily so unit tests
    don't need the dependency installed.
    """
    import requests  # local import so test env without requests still works
    resp = requests.request(
        method, url, headers=headers, params=params, json=json, timeout=30
    )
    try:
        body = resp.json()
    except ValueError:
        body = {}
    return resp.status_code, body


class FeishuClient:
    def __init__(self, app_id: str, app_secret: str, *, transport=None):
        self.app_id = app_id
        self.app_secret = app_secret
        self._transport = transport or _default_transport
        self._token: str | None = None

    def _get_token(self) -> str:
        if self._token:
            return self._token
        status, body = self._transport(
            "POST", TOKEN_URL,
            json={"app_id": self.app_id, "app_secret": self.app_secret},
        )
        if status != 200 or body.get("code") != 0:
            raise RuntimeError(
                f"Feishu token fetch failed: status={status} body={body}"
            )
        self._token = body.get("tenant_access_token")
        if not self._token:
            raise RuntimeError(f"Feishu token response missing token: {body}")
        return self._token
  • Step 4: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_client -v

Expected: 3 tests PASS

  • Step 5: Commit
git add bot/feishu-bot/scripts/_feishu.py bot/feishu-bot/tests/test_client.py
git commit -m "feat(feishu-bot): FeishuClient 骨架 + tenant_access_token 拉取"

Task 7: Generic request + retry on token expiry

Files:

  • Modify: bot/feishu-bot/scripts/_feishu.py
  • Modify: bot/feishu-bot/tests/test_client.py

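Background: All subsequent business requests must carry the token. Wrap this in _request(method, path, params=None, json=None): attach the token automatically → call the transport → if code == 99991663 (token expired), refresh once and retry → on 5xx or 429 responses, retry with exponential backoff, up to 3 attempts in total. The implementation below does not catch exceptions raised by the transport itself (for example network errors); those propagate to the caller.

  • Step 1: Add tests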

Append to bot/feishu-bot/tests/test_client.py:

class TestRequest(unittest.TestCase):
    def test_token_attached(self):
        t = FakeTransport()
        t.respond("tenant_access_token", 200, {
            "code": 0, "tenant_access_token": "t-abc", "expire": 7200,
        })
        t.respond("im/v1/messages", 200, {"code": 0, "data": {"items": []}})
        c = FeishuClient("cli", "sec", transport=t)

        c._request("GET", "/im/v1/messages", params={"x": 1})

        # second call (the messages one) should have Authorization header
        msg_call = t.calls[1]
        self.assertEqual(msg_call["headers"]["Authorization"], "Bearer t-abc")
        self.assertEqual(msg_call["params"], {"x": 1})

    def test_token_refresh_on_99991663(self):
        t = FakeTransport()
        t.respond("tenant_access_token", 200, {
            "code": 0, "tenant_access_token": "t-old", "expire": 7200,
        })
        # First business call returns expired-token error
        t.respond("im/v1/messages", 200, {"code": 99991663, "msg": "token expired"})
        # After refresh, second token + second business call succeed
        t.respond("tenant_access_token", 200, {
            "code": 0, "tenant_access_token": "t-new", "expire": 7200,
        })
        t.respond("im/v1/messages", 200, {"code": 0, "data": {"items": []}})
        c = FeishuClient("cli", "sec", transport=t)

        body = c._request("GET", "/im/v1/messages")

        self.assertEqual(body, {"code": 0, "data": {"items": []}})
        # 4 calls total: token, messages(fail), token(refresh), messages(retry)
        self.assertEqual(len(t.calls), 4)
        self.assertEqual(t.calls[3]["headers"]["Authorization"], "Bearer t-new")

    def test_5xx_retried_then_succeeds(self):
        t = FakeTransport()
        t.respond("tenant_access_token", 200, {
            "code": 0, "tenant_access_token": "t-abc", "expire": 7200,
        })
        t.respond("im/v1/messages", 503, {})
        t.respond("im/v1/messages", 200, {"code": 0, "data": {}})
        c = FeishuClient("cli", "sec", transport=t)
        # speed up retry in test
        c._retry_base_delay = 0

        body = c._request("GET", "/im/v1/messages")

        self.assertEqual(body["code"], 0)
        self.assertEqual(len(t.calls), 3)

    def test_5xx_exhausted_raises(self):
        t = FakeTransport()
        t.respond("tenant_access_token", 200, {
            "code": 0, "tenant_access_token": "t-abc", "expire": 7200,
        })
        for _ in range(3):
            t.respond("im/v1/messages", 503, {})
        c = FeishuClient("cli", "sec", transport=t)
        c._retry_base_delay = 0

        with self.assertRaises(RuntimeError):
            c._request("GET", "/im/v1/messages")
  • Step 2: Run the tests and confirm they fail
/usr/bin/python3 -m unittest tests.test_client.TestRequest -v

Expected: FAIL, because _request is not implemented yet

  • Step 3: Implement _request

Add to FeishuClient in bot/feishu-bot/scripts/_feishu.py:

class FeishuClient:
    # ... existing code ...

    _max_retries = 3
    _retry_base_delay = 2.0  # tests override to 0

    def _request(self, method: str, path: str, *, params=None, json=None) -> dict:
        url = FEISHU_API + path if path.startswith("/") else f"{FEISHU_API}/{path}"

        for attempt in range(self._max_retries):
            headers = {"Authorization": f"Bearer {self._get_token()}"}
            status, body = self._transport(
                method, url, headers=headers, params=params, json=json,
            )

            # Token expired: clear the cached token so the next iteration
            # refreshes it via _get_token. This consumes one retry slot, which
            # is fine: token refresh is rare and we have _max_retries to spare.
            if status == 200 and isinstance(body, dict) and body.get("code") == 99991663:
                self._token = None
                continue

            if status >= 500:
                if attempt + 1 == self._max_retries:
                    raise RuntimeError(f"Feishu {method} {path} 5xx after {self._max_retries} attempts: status={status}")
                time.sleep(self._retry_base_delay * (2 ** attempt))
                continue

            if status == 429:
                if attempt + 1 == self._max_retries:
                    raise RuntimeError(f"Feishu {method} {path} 429 rate limited after {self._max_retries} attempts")
                time.sleep(self._retry_base_delay * (2 ** attempt))
                continue

            if status != 200 or body.get("code") not in (0, None):
                raise RuntimeError(
                    f"Feishu {method} {path} failed: status={status} body={body}"
                )

            return body

        # Reached only when the final attempt ended in a token-expired response;
        # every other path returns or raises inside the loop.
        raise RuntimeError(f"Feishu {method} {path}: retry loop exhausted unexpectedly")

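Note: in Python, continue inside for attempt in range(...) advances to the next iteration, so a token-expired retry consumes one attempt slot. That is acceptable: token expiry is rare and there are 3 attempts available. If retries turn out to be exhausted this way in practice, consider switching to a while loop.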
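
The while-loop alternative mentioned in the note could look roughly like this. It is a sketch under assumptions, not the plan's _request: send and refresh_token are hypothetical callables standing in for the transport call and the token refresh, and exactly one token refresh is allowed without consuming a retry slot.

```python
import time
from typing import Callable, Tuple

TOKEN_EXPIRED = 99991663


def request_with_retry(send: Callable[[], Tuple[int, dict]],
                       refresh_token: Callable[[], None],
                       max_retries: int = 3,
                       base_delay: float = 2.0) -> dict:
    """Call send() until it succeeds; back off on 5xx/429, one free token refresh."""
    attempt = 0
    refreshed = False
    while attempt < max_retries:
        status, body = send()
        if status == 200 and body.get("code") == TOKEN_EXPIRED and not refreshed:
            refresh_token()
            refreshed = True  # only one free refresh, to avoid looping forever
            continue          # attempt is deliberately not incremented here
        if status >= 500 or status == 429:
            attempt += 1
            if attempt < max_retries:
                time.sleep(base_delay * (2 ** (attempt - 1)))
            continue
        if status != 200 or body.get("code") not in (0, None):
            raise RuntimeError(f"request failed: status={status} body={body}")
        return body
    raise RuntimeError(f"request failed after {max_retries} attempts")
```

With this shape, a sequence of one expired token followed by two 503 responses still leaves a third attempt available, whereas the for-loop version would have used up all three slots.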

  • Step 4: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_client -v

Expected: 7 tests PASS

  • Step 5: Commit
git add bot/feishu-bot/scripts/_feishu.py bot/feishu-bot/tests/test_client.py
git commit -m "feat(feishu-bot): _request 通用請求(含 token 刷新 + 5xx/429 退避重試)"

Task 8: Paginated group message fetch with iter_messages

Files:

  • Modify: bot/feishu-bot/scripts/_feishu.py
  • Create: bot/feishu-bot/tests/test_pagination.py

Background: Parameters of Feishu's GET /open-apis/im/v1/messages:

  • container_id_type=chat
  • container_id=oc_xxx
  • start_time=<unix seconds, as a string>
  • end_time=<unix seconds, as a string>
  • page_size=50
  • page_token=... (returned by the previous response)
  • sort_type=ByCreateTimeDesc or ByCreateTimeAsc

Response:

{"code":0,"data":{"items":[...], "has_more":true, "page_token":"next_xxx"}}

We fetch in ascending time order and read every page.
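
For the start_time and end_time parameters, a one-day window in Asia/Shanghai can be computed as sketched below. shanghai_day_window is a hypothetical helper used only for illustration here, and treating the window as half-open [start, end) is an assumption.

```python
import datetime as dt
from zoneinfo import ZoneInfo

TZ = ZoneInfo("Asia/Shanghai")


def shanghai_day_window(date_str: str) -> tuple:
    """Return (start_ts, end_ts) in unix seconds for one local calendar day."""
    day = dt.date.fromisoformat(date_str)
    # Local midnight at the start of the day, as a timezone-aware datetime.
    start = dt.datetime.combine(day, dt.time.min, tzinfo=TZ)
    end = start + dt.timedelta(days=1)
    return int(start.timestamp()), int(end.timestamp())
```

For example, shanghai_day_window("2024-05-01") gives (1714492800, 1714579200), the same pair of timestamps used as the window in the pagination tests.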

  • Step 1: Write the tests

Create bot/feishu-bot/tests/test_pagination.py:

import sys
import unittest
from pathlib import Path

sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "scripts"))
from _feishu import FeishuClient # noqa: E402
from tests.test_client import FakeTransport # reuse the harness


class TestIterMessages(unittest.TestCase):
    def setUp(self):
        self.t = FakeTransport()
        self.t.respond("tenant_access_token", 200, {
            "code": 0, "tenant_access_token": "t-abc", "expire": 7200,
        })
        self.c = FeishuClient("cli", "sec", transport=self.t)

    def test_single_page(self):
        self.t.respond("im/v1/messages", 200, {
            "code": 0,
            "data": {
                "items": [{"message_id": "m1"}, {"message_id": "m2"}],
                "has_more": False,
                "page_token": "",
            },
        })

        out = list(self.c.iter_messages("oc_x", 1714492800, 1714579200))

        self.assertEqual([m["message_id"] for m in out], ["m1", "m2"])

    def test_two_pages(self):
        self.t.respond("im/v1/messages", 200, {
            "code": 0,
            "data": {
                "items": [{"message_id": "m1"}],
                "has_more": True,
                "page_token": "tok2",
            },
        })
        self.t.respond("im/v1/messages", 200, {
            "code": 0,
            "data": {
                "items": [{"message_id": "m2"}],
                "has_more": False,
                "page_token": "",
            },
        })

        out = list(self.c.iter_messages("oc_x", 1, 100))

        self.assertEqual([m["message_id"] for m in out], ["m1", "m2"])
        # second call must carry page_token=tok2
        msg_calls = [c for c in self.t.calls if "im/v1/messages" in c["url"]]
        self.assertEqual(msg_calls[1]["params"].get("page_token"), "tok2")

    def test_passes_window(self):
        self.t.respond("im/v1/messages", 200, {
            "code": 0,
            "data": {"items": [], "has_more": False, "page_token": ""},
        })
        list(self.c.iter_messages("oc_x", 1714492800, 1714579200))

        first = [c for c in self.t.calls if "im/v1/messages" in c["url"]][0]
        self.assertEqual(first["params"]["container_id"], "oc_x")
        self.assertEqual(first["params"]["container_id_type"], "chat")
        self.assertEqual(first["params"]["start_time"], "1714492800")
        self.assertEqual(first["params"]["end_time"], "1714579200")
        self.assertEqual(first["params"]["sort_type"], "ByCreateTimeAsc")
  • Step 2: Run the tests and confirm they fail
/usr/bin/python3 -m unittest tests.test_pagination -v

Expected: FAIL, because iter_messages is not implemented yet

  • Step 3: Implement iter_messages

Add to FeishuClient in bot/feishu-bot/scripts/_feishu.py:

    def iter_messages(self, chat_id: str, start_ts: int, end_ts: int):
        """Yield raw message envelopes within [start_ts, end_ts), ascending."""
        page_token = ""
        while True:
            params = {
                "container_id_type": "chat",
                "container_id": chat_id,
                "start_time": str(start_ts),
                "end_time": str(end_ts),
                "page_size": "50",
                "sort_type": "ByCreateTimeAsc",
            }
            if page_token:
                params["page_token"] = page_token
            body = self._request("GET", "/im/v1/messages", params=params)
            data = body.get("data") or {}
            for item in data.get("items") or []:
                yield item
            if not data.get("has_more"):
                return
            page_token = data.get("page_token") or ""
            if not page_token:
                return
  • Step 4: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_pagination -v

Expected: 3 tests PASS

  • Step 5: Commit
git add bot/feishu-bot/scripts/_feishu.py bot/feishu-bot/tests/test_pagination.py
git commit -m "feat(feishu-bot): iter_messages 分頁拉取(時間升序)"

Task 9: get_chat_name + chat name cache

Files:

  • Modify: bot/feishu-bot/scripts/_feishu.py
  • Modify: bot/feishu-bot/tests/test_client.py

Background: GET /open-apis/im/v1/chats/:chat_id → {"code":0,"data":{"name":"...","chat_id":"oc_x"}}

The share_chat resolver also uses this. A simple in-memory cache that lives for a single script run is enough.

  • Step 1: Add a test

Append to bot/feishu-bot/tests/test_client.py:

class TestChatName(unittest.TestCase):
    def test_get_chat_name_cached(self):
        t = FakeTransport()
        t.respond("tenant_access_token", 200, {
            "code": 0, "tenant_access_token": "t", "expire": 7200,
        })
        t.respond("im/v1/chats/oc_x", 200, {
            "code": 0, "data": {"name": "Hermes 中文社區"},
        })
        c = FeishuClient("a", "b", transport=t)

        n1 = c.get_chat_name("oc_x")
        n2 = c.get_chat_name("oc_x")

        self.assertEqual(n1, "Hermes 中文社區")
        self.assertEqual(n2, "Hermes 中文社區")
        chat_calls = [x for x in t.calls if "im/v1/chats/oc_x" in x["url"]]
        self.assertEqual(len(chat_calls), 1)  # cached
  • Step 2: Run the tests and confirm they fail
/usr/bin/python3 -m unittest tests.test_client.TestChatName -v

Expected: FAIL, because get_chat_name is not implemented yet

  • Step 3: Implement get_chat_name

Add to FeishuClient __init__:

        self._chat_name_cache: dict[str, str] = {}

Add method:

    def get_chat_name(self, chat_id: str) -> str:
        if chat_id in self._chat_name_cache:
            return self._chat_name_cache[chat_id]
        body = self._request("GET", f"/im/v1/chats/{chat_id}")
        name = ((body.get("data") or {}).get("name") or "").strip()
        self._chat_name_cache[chat_id] = name
        return name
  • Step 4: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_client -v

Expected: 8 tests PASS

  • Step 5: Commit
git add bot/feishu-bot/scripts/_feishu.py bot/feishu-bot/tests/test_client.py
git commit -m "feat(feishu-bot): get_chat_name 帶本次運行內緩存"

Task 10: Group label resolution with resolve_group_label

Files:

  • Modify: bot/feishu-bot/scripts/_feishu.py
  • Modify: bot/feishu-bot/tests/test_client.py

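Background: When the user sets FEISHU_GROUP_LABELS=oc_x=Hermes Agent 中文社區飛書群 1;oc_y=..., the explicit mapping is used. When a chat has no entry, the label falls back to "Hermes Agent 中文社區飛書群 N", where N is the chat's 1-based position in FEISHU_CHAT_IDS. Both helpers are pure functions.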
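
Both FEISHU_CHAT_IDS and FEISHU_GROUP_LABELS arrive as plain strings from the environment. A minimal sketch of splitting the comma-separated id list follows. parse_chat_ids is a hypothetical helper that this plan does not define, and dropping duplicates while keeping first-seen order is an assumption made so that the fallback index N stays stable.

```python
from typing import List, Optional


def parse_chat_ids(raw: Optional[str]) -> List[str]:
    """Split "oc_a, oc_b" into ["oc_a", "oc_b"], skipping blanks and duplicates."""
    seen = set()
    out: List[str] = []
    for part in (raw or "").split(","):
        chat_id = part.strip()
        if chat_id and chat_id not in seen:
            seen.add(chat_id)
            out.append(chat_id)
    return out
```

The resulting list is what the chat_ids argument of resolve_group_label would receive.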

  • Step 1: Add tests

Append to bot/feishu-bot/tests/test_client.py:

from _feishu import resolve_group_label, parse_group_labels


class TestGroupLabels(unittest.TestCase):
    def test_parse_empty(self):
        self.assertEqual(parse_group_labels(""), {})
        self.assertEqual(parse_group_labels(None), {})

    def test_parse_single(self):
        self.assertEqual(
            parse_group_labels("oc_x=Hermes Agent 中文社區飛書群 1"),
            {"oc_x": "Hermes Agent 中文社區飛書群 1"},
        )

    def test_parse_multi(self):
        self.assertEqual(
            parse_group_labels("oc_x=群A;oc_y=群B"),
            {"oc_x": "群A", "oc_y": "群B"},
        )

    def test_parse_strips_whitespace(self):
        self.assertEqual(
            parse_group_labels(" oc_x = 群A ; oc_y = 群B "),
            {"oc_x": "群A", "oc_y": "群B"},
        )

    def test_resolve_explicit(self):
        labels = {"oc_x": "群A"}
        self.assertEqual(
            resolve_group_label("oc_x", chat_ids=["oc_x", "oc_y"], labels=labels),
            "群A",
        )

    def test_resolve_default_by_index(self):
        self.assertEqual(
            resolve_group_label("oc_y", chat_ids=["oc_x", "oc_y"], labels={}),
            "Hermes Agent 中文社區飛書群 2",
        )

    def test_resolve_unknown_chat_falls_back_to_id(self):
        self.assertEqual(
            resolve_group_label("oc_z", chat_ids=["oc_x", "oc_y"], labels={}),
            "Hermes Agent 中文社區飛書群 oc_z",
        )
  • Step 2: Run the tests and confirm they fail
/usr/bin/python3 -m unittest tests.test_client.TestGroupLabels -v

Expected: FAIL, because parse_group_labels / resolve_group_label are not implemented yet.

  • Step 3: Implement the two functions

Add to bot/feishu-bot/scripts/_feishu.py:

# Note: the `str | None` annotation needs `from __future__ import annotations`
# at the top of the module when running on Python 3.9.
def parse_group_labels(raw: str | None) -> dict[str, str]:
    """Parse FEISHU_GROUP_LABELS env: "oc_x=Label A;oc_y=Label B"."""
    out: dict[str, str] = {}
    if not raw:
        return out
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair or "=" not in pair:
            continue
        # Split on the first "=" only, so labels may contain "=".
        k, v = pair.split("=", 1)
        k = k.strip()
        v = v.strip()
        if k and v:
            out[k] = v
    return out


def resolve_group_label(
    chat_id: str, *, chat_ids: list[str], labels: dict[str, str]
) -> str:
    """Resolve display name for a Feishu chat in the daily digest.

    Priority:
    1. Explicit FEISHU_GROUP_LABELS mapping
    2. "Hermes Agent 中文社區飛書群 <index>" using order in FEISHU_CHAT_IDS
    3. "Hermes Agent 中文社區飛書群 <chat_id>" if not in FEISHU_CHAT_IDS
    """
    if chat_id in labels:
        return labels[chat_id]
    try:
        idx = chat_ids.index(chat_id) + 1
        return f"Hermes Agent 中文社區飛書群 {idx}"
    except ValueError:
        return f"Hermes Agent 中文社區飛書群 {chat_id}"
  • Step 4: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_client -v

Expected: 15 tests PASS

  • Step 5: Commit
git add bot/feishu-bot/scripts/_feishu.py bot/feishu-bot/tests/test_client.py
git commit -m "feat(feishu-bot): 群標籤解析(FEISHU_GROUP_LABELS + 順序號兜底)"

Task 11: extract_day.py (day_bounds + weekend backfill)

Files:

  • Create: bot/feishu-bot/scripts/extract_day.py
  • Create: bot/feishu-bot/tests/test_extract_day.py

Background: This task covers only the date-handling logic and makes no API calls. The unit-testable parts are extracted into pure functions.
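
Hard-coded epoch values in date tests are easy to get wrong, so it can help to recompute the window independently before pinning it in an assertion. The sketch below is not part of the plan. It assumes a fixed UTC+8 offset is equivalent to Asia/Shanghai for these dates, and the helper name shanghai_day_window is hypothetical.

```python
import datetime as dt

# Assumption: a fixed +08:00 offset stands in for Asia/Shanghai.
CST = dt.timezone(dt.timedelta(hours=8))


def shanghai_day_window(date_str: str) -> tuple[int, int]:
    """Return [start, end) unix seconds for a YYYY-MM-DD day at UTC+8."""
    start = dt.datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=CST)
    end = start + dt.timedelta(days=1)
    return int(start.timestamp()), int(end.timestamp())


start, end = shanghai_day_window("2026-04-30")
print(start, end)  # 1777478400 1777564800
```

2026-04-30 00:00 at UTC+8 is 2026-04-29 16:00 UTC, which is where the first value comes from; the second is exactly 86400 seconds later.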

  • Step 1: Write tests

Create bot/feishu-bot/tests/test_extract_day.py:

import datetime as dt
import sys
import unittest
from pathlib import Path
from zoneinfo import ZoneInfo

sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "scripts"))
from extract_day import day_bounds, expand_dates # noqa: E402

TZ = ZoneInfo("Asia/Shanghai")


class TestDayBounds(unittest.TestCase):
    def test_one_day_window(self):
        start, end = day_bounds("2026-04-30")
        # 2026-04-30 00:00:00 Asia/Shanghai == 2026-04-29 16:00:00 UTC == 1777478400
        self.assertEqual(start, 1777478400)
        self.assertEqual(end, 1777564800)
        self.assertEqual(end - start, 86400)


class TestExpandDates(unittest.TestCase):
    def test_explicit_date_no_expansion(self):
        # An explicit date is never expanded, even when it falls on a Monday.
        self.assertEqual(
            expand_dates(explicit_date="2026-04-27", today=dt.date(2026, 4, 27)),
            ["2026-04-27"],
        )

    def test_no_explicit_normal_day(self):
        # today=2026-04-30 (Thursday) → just yesterday
        self.assertEqual(
            expand_dates(explicit_date=None, today=dt.date(2026, 4, 30)),
            ["2026-04-29"],
        )

    def test_no_explicit_monday_backfills_weekend(self):
        # 2026-05-04 is a Monday: backfill Sat (05-02), Sun (05-03) and Mon (05-04).
        # This mirrors the behavior of bot/wechat-bot/scripts/extract_day.py.
        self.assertEqual(
            expand_dates(explicit_date=None, today=dt.date(2026, 5, 4)),
            ["2026-05-02", "2026-05-03", "2026-05-04"],
        )
  • Step 2: Run the tests and confirm they fail
/usr/bin/python3 -m unittest tests.test_extract_day -v

Expected: FAIL, because extract_day does not exist yet.

  • Step 3: Write the extract_day.py skeleton

Create bot/feishu-bot/scripts/extract_day.py:

#!/usr/bin/env python3
"""Pull one day of Feishu group messages → bot/feishu-bot/data/daily/<date>.feishu.json.

Schema is aligned with bot/wechat-bot/data/daily/<date>.json so generate_report.py
can merge them by simply concatenating the `groups` list.

Usage:
python3 scripts/extract_day.py # yesterday (Asia/Shanghai)
python3 scripts/extract_day.py 2026-04-30 # explicit date
python3 scripts/extract_day.py --dry-run # don't write file
python3 scripts/extract_day.py --no-overwrite # exit if file exists
"""
from __future__ import annotations

import argparse
import datetime as dt
import json
import os
import sys
from pathlib import Path
from zoneinfo import ZoneInfo

from dotenv import load_dotenv

# Local imports (scripts dir on sys.path via the same trick as tests use)
sys.path.insert(0, str(Path(__file__).resolve().parent))
from _feishu import (  # noqa: E402
    FeishuClient,
    decode_message,
    parse_group_labels,
    resolve_group_label,
)

ROOT = Path(__file__).resolve().parent.parent
OUT_DIR = ROOT / "data/daily"
TZ = ZoneInfo("Asia/Shanghai")


def day_bounds(date_str: str) -> tuple[int, int]:
    """[start, end) unix seconds for the given YYYY-MM-DD in Asia/Shanghai."""
    d = dt.datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=TZ)
    return int(d.timestamp()), int((d + dt.timedelta(days=1)).timestamp())


def expand_dates(*, explicit_date: str | None, today: dt.date) -> list[str]:
    """Decide which dates to extract this run.

    Behavior matches bot/wechat-bot/scripts/extract_day.py:
    - explicit_date given → just that date
    - no explicit + today is Monday → backfill Sat/Sun/Mon
    - otherwise → just yesterday
    """
    if explicit_date:
        return [explicit_date]
    if today.weekday() == 0:  # Monday
        return [
            (today - dt.timedelta(days=2)).strftime("%Y-%m-%d"),
            (today - dt.timedelta(days=1)).strftime("%Y-%m-%d"),
            today.strftime("%Y-%m-%d"),
        ]
    return [(today - dt.timedelta(days=1)).strftime("%Y-%m-%d")]


def main():  # pragma: no cover - CLI; covered by manual smoke
    ap = argparse.ArgumentParser()
    ap.add_argument("date", nargs="?", help="YYYY-MM-DD (default: yesterday)")
    ap.add_argument("--dry-run", action="store_true")
    ap.add_argument("--no-overwrite", action="store_true")
    args = ap.parse_args()

    today = dt.datetime.now(TZ).date()
    dates = expand_dates(explicit_date=args.date, today=today)
    print(f"[*] dates to extract: {dates}")
    # Real implementation in next task.


if __name__ == "__main__":  # pragma: no cover
    main()

Note that main() is only a placeholder here; the next task completes it.

  • Step 4: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_extract_day -v

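Expected: 4 tests PASS. Note that expand_dates returns three days when run on a Monday without a date, matching the WeChat side. Passing an explicit date goes through the explicit branch and is never expanded, even on a Monday.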

  • Step 5: Commit
git add bot/feishu-bot/scripts/extract_day.py bot/feishu-bot/tests/test_extract_day.py
git commit -m "feat(feishu-bot): extract_day 骨架 + day_bounds / expand_dates 單測"

Task 12: extract_day.py (main flow + writing JSON to disk)

Files:

  • Modify: bot/feishu-bot/scripts/extract_day.py

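Background: Wire up the client from _feishu: for each chat_id → iter_messages → decode_message (with the share resolver) → collect into a list → write the file.

This step adds no new unit tests because it is a glue layer. It is verified by the manual end-to-end run in Task 17.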
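
The plan leaves the glue layer to manual testing. If an offline check is wanted anyway, a stub client could stand in for FeishuClient. The following is only a sketch under assumptions: FakeClient, fake_decode and collect_group are hypothetical names, and the method shapes are inferred from how this task calls iter_messages, get_chat_name and decode_message.

```python
class FakeClient:
    """Hypothetical stand-in exposing the two methods the glue layer calls."""

    def __init__(self, messages_by_chat):
        self._messages = messages_by_chat

    def get_chat_name(self, chat_id):
        return f"name-of-{chat_id}"

    def iter_messages(self, chat_id, start, end):
        # Yield only raw messages whose ts falls inside [start, end).
        for raw in self._messages.get(chat_id, []):
            if start <= raw["ts"] < end:
                yield raw


def fake_decode(raw, resolver=None):
    """Hypothetical decoder: unknown types are dropped by returning None."""
    if raw.get("type") != "text":
        return None
    return {"ts": raw["ts"], "text": raw["text"]}


def collect_group(client, chat_id, start, end, decode):
    """Mirror the per-chat loop: iterate, decode, keep non-None results."""
    messages = []
    for raw in client.iter_messages(chat_id, start, end):
        decoded = decode(raw)
        if decoded is not None:
            messages.append(decoded)
    return {
        "group_id": chat_id,
        "platform": "feishu",
        "chat_name": client.get_chat_name(chat_id),
        "message_count": len(messages),
        "messages": messages,
    }


client = FakeClient({"oc_x": [
    {"ts": 10, "type": "text", "text": "hi"},
    {"ts": 20, "type": "sticker"},
    {"ts": 99, "type": "text", "text": "late"},
]})
group = collect_group(client, "oc_x", 0, 50, fake_decode)
print(group["message_count"])  # 1
```

Only the first message survives: the sticker is dropped by the decoder and the third message lies outside the window.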

  • Step 1: Implement _extract_one_day + _write_daily_json

Replace main() and add helpers in bot/feishu-bot/scripts/extract_day.py:

def _make_share_resolver(client: FeishuClient):
    """Build a resolver(kind, ref_id) -> name for share messages."""
    def resolver(kind: str, ref_id: str) -> str | None:
        if kind == "chat":
            try:
                return client.get_chat_name(ref_id) or None
            except Exception:
                return None
        # share_user resolution would need contact API; skip for v1
        return None
    return resolver


def _extract_one_day(
    client: FeishuClient,
    date_str: str,
    chat_ids: list[str],
    labels: dict[str, str],
) -> dict:
    start, end = day_bounds(date_str)
    resolver = _make_share_resolver(client)

    groups = []
    for chat_id in chat_ids:
        chat_name = ""
        try:
            chat_name = client.get_chat_name(chat_id)
        except Exception as e:
            print(f" ⚠️ get_chat_name({chat_id}) failed: {e}", file=sys.stderr)

        messages = []
        for raw in client.iter_messages(chat_id, start, end):
            decoded = decode_message(raw, resolver=resolver)
            if decoded is not None:
                messages.append(decoded)

        groups.append({
            "group_id": chat_id,
            "group_name": resolve_group_label(chat_id, chat_ids=chat_ids, labels=labels),
            "platform": "feishu",
            "chat_name": chat_name,
            "message_count": len(messages),
            "messages": messages,
        })

    return {
        "date": date_str,
        "tz": "Asia/Shanghai",
        "platform": "feishu",
        "window_start": start,
        "window_end": end,
        "groups": groups,
    }


def _write_daily_json(data: dict, out_path: Path, no_overwrite: bool) -> None:
    if out_path.exists() and no_overwrite:
        # Exits the whole run with code 2, so any later dates are not processed.
        print(f"[-] {out_path} exists and --no-overwrite given; exiting.", file=sys.stderr)
        sys.exit(2)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    total = sum(g["message_count"] for g in data["groups"])
    print(f"[{data['date']}] {total} messages across {len(data['groups'])} groups -> {out_path}")


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("date", nargs="?", help="YYYY-MM-DD (default: yesterday)")
    ap.add_argument("--dry-run", action="store_true")
    ap.add_argument("--no-overwrite", action="store_true")
    args = ap.parse_args()

    load_dotenv(ROOT / ".env")
    app_id = os.environ.get("FEISHU_APP_ID")
    app_secret = os.environ.get("FEISHU_APP_SECRET")
    chat_ids_raw = os.environ.get("FEISHU_CHAT_IDS") or ""
    labels = parse_group_labels(os.environ.get("FEISHU_GROUP_LABELS"))

    if not app_id or not app_secret:
        print("[-] FEISHU_APP_ID / FEISHU_APP_SECRET missing in .env", file=sys.stderr)
        sys.exit(1)
    chat_ids = [c.strip() for c in chat_ids_raw.split(",") if c.strip()]
    if not chat_ids:
        print("[-] FEISHU_CHAT_IDS empty in .env", file=sys.stderr)
        sys.exit(1)

    today = dt.datetime.now(TZ).date()
    dates = expand_dates(explicit_date=args.date, today=today)
    print(f"[*] dates to extract: {dates}")
    print(f"[*] chats: {chat_ids}")

    client = FeishuClient(app_id, app_secret)

    for date_str in dates:
        data = _extract_one_day(client, date_str, chat_ids, labels)
        if args.dry_run:
            print(f"[dry-run] would write {OUT_DIR / f'{date_str}.feishu.json'}")
            continue
        _write_daily_json(data, OUT_DIR / f"{date_str}.feishu.json", args.no_overwrite)
  • Step 2: Re-run the full test suite and confirm nothing broke
cd bot/feishu-bot
/usr/bin/python3 -m unittest discover tests -v

Expected: all PASS (~19 tests in total)

  • Step 3: Commit
git add bot/feishu-bot/scripts/extract_day.py
git commit -m "feat(feishu-bot): extract_day 主流程 + JSON 寫盤"

Task 13: inventory.py (troubleshooting tool)

Files:

  • Create: bot/feishu-bot/scripts/inventory.py

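Background: Modeled on bot/wechat-bot/scripts/inventory.py, this script lists the groups the bot is in plus the message count per group over the last 7 days. It is a troubleshooting script, so it gets no unit tests; a manual end-to-end run is enough.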
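
The script in this task requests a single page of chats. If the bot ends up in more groups than one page holds, a cursor loop would be needed. The sketch below assumes the list response carries has_more and page_token inside data, which is the usual shape for Feishu list endpoints but should be checked against the API reference. iter_pages and the PAGES fixture are hypothetical.

```python
from typing import Callable, Iterator, Optional


def iter_pages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield items across pages until the response stops reporting has_more."""
    token: Optional[str] = None
    while True:
        data = fetch_page(token).get("data") or {}
        yield from data.get("items") or []
        token = data.get("page_token")
        if not data.get("has_more") or not token:
            break


# Hypothetical two-page response set, keyed by the incoming page_token.
PAGES = {
    None: {"data": {"items": [{"chat_id": "oc_a"}], "has_more": True, "page_token": "t1"}},
    "t1": {"data": {"items": [{"chat_id": "oc_b"}], "has_more": False}},
}

chat_ids = [c["chat_id"] for c in iter_pages(lambda tok: PAGES[tok])]
print(chat_ids)  # ['oc_a', 'oc_b']
```

In inventory.py, fetch_page could be a small wrapper around the client's request method that adds page_token to the query parameters when it is not None.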

  • Step 1: Write the script

Create bot/feishu-bot/scripts/inventory.py:

#!/usr/bin/env python3
"""List groups the bot belongs to + 7-day message counts.

Usage:
python3 scripts/inventory.py # default: print to stdout
"""
from __future__ import annotations

import datetime as dt
import os
import sys
from pathlib import Path
from zoneinfo import ZoneInfo

from dotenv import load_dotenv

sys.path.insert(0, str(Path(__file__).resolve().parent))
from _feishu import FeishuClient # noqa: E402

ROOT = Path(__file__).resolve().parent.parent
TZ = ZoneInfo("Asia/Shanghai")


def main():
    load_dotenv(ROOT / ".env")
    app_id = os.environ.get("FEISHU_APP_ID")
    app_secret = os.environ.get("FEISHU_APP_SECRET")
    if not app_id or not app_secret:
        print("[-] FEISHU_APP_ID / FEISHU_APP_SECRET missing in .env", file=sys.stderr)
        sys.exit(1)

    client = FeishuClient(app_id, app_secret)

    # List the chats the bot is in.
    # Note: only the first page (page_size=50) is requested here.
    body = client._request("GET", "/im/v1/chats", params={"page_size": "50"})
    chats = (body.get("data") or {}).get("items") or []
    if not chats:
        print("Bot is in 0 chats. 把機器人 @ 拉進群之後再跑。")
        return

    now = dt.datetime.now(TZ)
    seven_days_ago = now - dt.timedelta(days=7)
    start = int(seven_days_ago.timestamp())
    end = int(now.timestamp())

    print(f"機器人在 {len(chats)} 個群中:")
    for c in chats:
        chat_id = c.get("chat_id") or ""
        name = c.get("name") or "(無名)"
        # Simple count: iterate over all 7 days of messages and count them.
        try:
            count = sum(1 for _ in client.iter_messages(chat_id, start, end))
        except Exception as e:
            print(f" - {name} ({chat_id}): error {e}")
            continue
        print(f" - {name} ({chat_id}): {count} 條 / 近 7 天")


if __name__ == "__main__":
    main()
  • Step 2: Run it once to check that the imports resolve
cd bot/feishu-bot
/usr/bin/python3 scripts/inventory.py --help 2>&1 | head -5

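Expected: the script does not use argparse, so --help is ignored and the command goes straight into main(), which exits when credentials are missing. This only verifies that the imports do not fail. Assuming no .env with real credentials is in place yet, expect to see "FEISHU_APP_ID / FEISHU_APP_SECRET missing".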

  • Step 3: Commit
git add bot/feishu-bot/scripts/inventory.py
git commit -m "feat(feishu-bot): inventory 排查腳本(列機器人在的群 + 7 日消息量)"

Task 14: Modify generate_report.py (_load_daily())

Files:

  • Modify: bot/wechat-bot/scripts/generate_report.py
  • Create: bot/wechat-bot/tests/__init__.py
  • Create: bot/wechat-bot/tests/test_load_daily.py

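Background: This is the only change to existing code. At the top of _run_single_day, extract the "read daily JSON" step into _load_daily(date_str) -> dict | None, and have it read both files and concatenate their groups.

  • Step 1: Write tests

Create bot/wechat-bot/tests/__init__.py (empty file).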

Create bot/wechat-bot/tests/test_load_daily.py:

import json
import sys
import tempfile
import unittest
from pathlib import Path
from unittest.mock import patch

sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "scripts"))
import generate_report as gr # noqa: E402


class TestLoadDaily(unittest.TestCase):
    def setUp(self):
        # Build a temp tree mirroring bot/{wechat-bot,feishu-bot}/data/daily
        self.tmp = tempfile.TemporaryDirectory()
        self.bot_dir = Path(self.tmp.name) / "bot"
        self.wechat_daily = self.bot_dir / "wechat-bot" / "data" / "daily"
        self.feishu_daily = self.bot_dir / "feishu-bot" / "data" / "daily"
        self.wechat_daily.mkdir(parents=True)
        self.feishu_daily.mkdir(parents=True)

        # Patch DAILY_DIR + ROOT to point inside the temp tree.
        # ROOT in generate_report is bot/wechat-bot.
        self._patches = [
            patch.object(gr, "ROOT", self.bot_dir / "wechat-bot"),
            patch.object(gr, "DAILY_DIR", self.wechat_daily),
        ]
        for p in self._patches:
            p.start()

    def tearDown(self):
        for p in self._patches:
            p.stop()
        self.tmp.cleanup()

    def _write(self, path: Path, data: dict):
        path.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")

    def test_neither_file_returns_none(self):
        self.assertIsNone(gr._load_daily("2026-04-30"))

    def test_only_wechat(self):
        self._write(self.wechat_daily / "2026-04-30.json", {
            "date": "2026-04-30", "tz": "Asia/Shanghai",
            "groups": [{"group_id": "wx1", "group_name": "Hermes Agent 中文社區 1",
                        "message_count": 10, "messages": []}],
        })
        data = gr._load_daily("2026-04-30")
        self.assertIsNotNone(data)
        self.assertEqual(len(data["groups"]), 1)
        self.assertEqual(data["groups"][0]["group_id"], "wx1")

    def test_only_feishu(self):
        self._write(self.feishu_daily / "2026-04-30.feishu.json", {
            "date": "2026-04-30", "tz": "Asia/Shanghai", "platform": "feishu",
            "groups": [{"group_id": "oc_x", "group_name": "Hermes Agent 中文社區飛書群 1",
                        "platform": "feishu", "message_count": 5, "messages": []}],
        })
        data = gr._load_daily("2026-04-30")
        self.assertIsNotNone(data)
        self.assertEqual(len(data["groups"]), 1)
        self.assertEqual(data["groups"][0]["platform"], "feishu")

    def test_both_files_concatenate_groups(self):
        self._write(self.wechat_daily / "2026-04-30.json", {
            "date": "2026-04-30", "tz": "Asia/Shanghai",
            "groups": [
                {"group_id": "wx1", "group_name": "Hermes Agent 中文社區 1",
                 "message_count": 10, "messages": []},
                {"group_id": "wx2", "group_name": "Hermes Agent 中文社區 2",
                 "message_count": 7, "messages": []},
            ],
        })
        self._write(self.feishu_daily / "2026-04-30.feishu.json", {
            "date": "2026-04-30", "tz": "Asia/Shanghai", "platform": "feishu",
            "groups": [{"group_id": "oc_x", "group_name": "Hermes Agent 中文社區飛書群 1",
                        "platform": "feishu", "message_count": 5, "messages": []}],
        })
        data = gr._load_daily("2026-04-30")
        self.assertEqual(len(data["groups"]), 3)
        ids = [g["group_id"] for g in data["groups"]]
        self.assertEqual(ids, ["wx1", "wx2", "oc_x"])


if __name__ == "__main__":
    unittest.main()
  • Step 2: Run the tests and confirm they fail
cd bot/wechat-bot
/usr/bin/python3 -m unittest tests.test_load_daily -v

Expected: FAIL, because _load_daily does not exist yet.

  • Step 3: Implement _load_daily

Modify bot/wechat-bot/scripts/generate_report.py:

Add after the existing imports + constants block (right after DEFAULT_BASE_URL), new function:

# Note: the `dict | None` annotation needs `from __future__ import annotations`
# at the top of the module when running on Python 3.9.
def _load_daily(date_str: str) -> dict | None:
    """Merge WeChat + Feishu daily extracts for one date into a single payload.

    Reads:
    - bot/wechat-bot/data/daily/<date>.json (WeChat side)
    - bot/feishu-bot/data/daily/<date>.feishu.json (Feishu side)

    Returns None if neither exists. Otherwise concatenates `groups` lists from
    each file in (wechat, feishu) order. Top-level metadata (date, tz, …) comes
    from whichever file is present first.

    Schema is identical between sides: see
    docs/superpowers/specs/2026-05-01-feishu-group-extraction-design.md §5.1.
    """
    wechat_path = DAILY_DIR / f"{date_str}.json"
    feishu_path = ROOT.parent / "feishu-bot" / "data" / "daily" / f"{date_str}.feishu.json"

    parts: list[dict] = []
    for p in (wechat_path, feishu_path):
        if not p.exists():
            continue
        try:
            parts.append(json.loads(p.read_text(encoding="utf-8")))
        except json.JSONDecodeError:
            print(f"[-] {p} is not valid JSON, skipping", file=sys.stderr)

    if not parts:
        return None

    # Shallow-copy the first payload for metadata, then rebuild `groups`.
    base = dict(parts[0])
    base["groups"] = []
    for part in parts:
        base["groups"].extend(part.get("groups", []))
    return base

Modify _run_single_day to use it:

Find:

def _run_single_day(args, date_str: str):
    daily_path = DAILY_DIR / f"{date_str}.json"
    if not daily_path.exists():
        print(f"[-] {daily_path} not found. Run scripts/extract_day.py {date_str} first.", file=sys.stderr)
        return

    data = json.loads(daily_path.read_text(encoding="utf-8"))

Replace with:

def _run_single_day(args, date_str: str):
    data = _load_daily(date_str)
    if data is None:
        wechat_path = DAILY_DIR / f"{date_str}.json"
        feishu_path = ROOT.parent / "feishu-bot" / "data" / "daily" / f"{date_str}.feishu.json"
        print(
            f"[-] No daily data for {date_str}. Looked at:\n"
            f" {wechat_path}\n"
            f" {feishu_path}\n"
            f" Run extract_day.py first.",
            file=sys.stderr,
        )
        return
  • Step 4: Run the tests and confirm they pass
/usr/bin/python3 -m unittest tests.test_load_daily -v

Expected: 4 tests PASS

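  • Step 5: Smoke-test the WeChat side end to end (if vendor/decrypted data is available)
# No LLM calls; only the data-loading branch runs
/usr/bin/python3 scripts/generate_report.py 2026-04-30 --dry-run

Expected: prints "[*] 2026-04-30: N groups pass threshold", with N matching the existing data. If neither file exists for 2026-04-30, the new "Looked at: ..." error message is printed instead. The script returning without a non-zero exit code in that case is not a problem.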

  • Step 6: Commit
git add bot/wechat-bot/scripts/generate_report.py \
bot/wechat-bot/tests/__init__.py \
bot/wechat-bot/tests/test_load_daily.py
git commit -m "feat(generate_report): _load_daily 合併微信 + 飛書每日抽取"

Task 15: README + CLAUDE.md

Files:

  • Create: bot/feishu-bot/README.md

  • Create: bot/feishu-bot/CLAUDE.md

  • Step 1: Write README.md

Create bot/feishu-bot/README.md:

# feishu-bot

把 Hermes Agent 飛書群每日消息批量拉下來,落到 `bot/feishu-bot/data/daily/<date>.feishu.json`。下游 `bot/wechat-bot/scripts/generate_report.py` 會把這份 JSON 與微信側 `<date>.json` 合併,最終生成單份日報。

## 一次性配置

1. 飛書開放平臺 → 創建企業自建應用
2. 啟用機器人能力,申請權限 scope:
- `im:message:readonly`
- `im:chat:readonly`
- `im:chat.member:read`
- `contact:user.id:readonly`
3. 提交審核(僅團隊內)
4. 複製 `app_id` / `app_secret` 到 `bot/feishu-bot/.env`(參考 `.env.example`)
5. 在飛書群裡 `@機器人` 把它拉進群
6. 拿群的 `chat_id` 寫入 `FEISHU_CHAT_IDS`
```bash
/usr/bin/python3 bot/feishu-bot/scripts/inventory.py
```

## 每日運行

```bash
# 默認拉昨天(Asia/Shanghai)
/usr/bin/python3 bot/feishu-bot/scripts/extract_day.py

# 指定日期
/usr/bin/python3 bot/feishu-bot/scripts/extract_day.py 2026-04-30

# 不寫盤,只看會拉哪些
/usr/bin/python3 bot/feishu-bot/scripts/extract_day.py --dry-run

# 已存在則退出
/usr/bin/python3 bot/feishu-bot/scripts/extract_day.py --no-overwrite
```

跑完後接現有 generate_report:

```bash
cd bot/wechat-bot
/usr/bin/python3 scripts/generate_report.py 2026-04-30
```

`generate_report.py` 會同時讀:

- `bot/wechat-bot/data/daily/2026-04-30.json`
- `bot/feishu-bot/data/daily/2026-04-30.feishu.json`

輸出還是單份 `bot/wechat-bot/data/reports/<model>/2026-04-30.detailed.md`

## 週末

跟微信側對齊:週末不出日報。週一不傳日期跑 `extract_day.py`,會自動補 Sat/Sun/Mon 三天。

## 測試

```bash
cd bot/feishu-bot
/usr/bin/python3 -m unittest discover tests -v
```

不打活的飛書 API;都走離線 fixture 和 in-memory transport。
  • Step 2: Write CLAUDE.md

Create bot/feishu-bot/CLAUDE.md:

# CLAUDE.md

This file provides guidance to Claude Code when working in `bot/feishu-bot/`.

## What this is

第二個信息源接入:把飛書群的一天消息批量拉下來,落到 `data/daily/<date>.feishu.json`,schema 與 `bot/wechat-bot/data/daily/<date>.json` 嚴格對齊。下游 `bot/wechat-bot/scripts/generate_report.py` 會合並兩側,**沒有獨立的報告生成、prompt、海報**

設計稿:`docs/superpowers/specs/2026-05-01-feishu-group-extraction-design.md`

## 關鍵文件

- `scripts/_feishu.py` — single source of truth:
- `FeishuClient`:tenant_access_token、5xx/429 退避、token 過期刷新
- `iter_messages(chat_id, start_ts, end_ts)`:分頁升序
- `decode_message(raw, resolver=None)`:text / post / share_chat / share_user / file 解碼;其它類型返 `None`
- `parse_group_labels(raw)` / `resolve_group_label(...)`:按 `FEISHU_GROUP_LABELS` 或順序號補群名
- `scripts/extract_day.py`:CLI;`day_bounds` / `expand_dates` 是純函數,主流程不單測
- `scripts/inventory.py` — 排查:列機器人在的群 + 7 日消息量

## 與微信側的接縫

`generate_report.py:_load_daily(date_str)` 讀兩邊 JSON,串接 `groups` 列表。schema 一致是這步零適配的基礎——任何 schema 漂移都會破壞合併:

- `groups` 是**列表**
- 每條 message 字段名:`ts` / `time` / `sender_wxid` / `sender_name` / `text`
- 飛書消息把 `open_id` 存到 `sender_wxid` key 下;`sender_name` 留空字符串

## 測試邊界

- 解碼器 / 客戶端 / 分頁 / 日期邊界 / 群標籤解析 — 全單測
- `inventory.py` 主流程 + `extract_day.py` 主流程 — 不單測,靠手動冒煙(要打活的 API)

## 風險點

- 飛書群名變更 → 用 `FEISHU_GROUP_LABELS` 顯式映射兜底
- 機器人被踢 → API 403,extract 報錯並退出碼非 0
- token 不寫盤緩存:每次運行重新拿,避免憑證落地
- prune_report.py 的 dedupe key 必須與 generate_report.py 同步(本次未改,但要點記住)
  • Step 3: Commit
git add bot/feishu-bot/README.md bot/feishu-bot/CLAUDE.md
git commit -m "docs(feishu-bot): README + CLAUDE.md"

Task 16: Full test suite + root .gitignore safety net

Files:

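  • Modify: .gitignore (repository root), only if it is confirmed not to cover these paths

Background: The root .gitignore should already cover data/ and similar paths. This step is only a safety check that bot/feishu-bot/data/ and .env cannot be committed.

  • Step 1: Check what the root .gitignore covers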
cd /Users/claw/Documents/GithubProjects/hermes-cn-v1
cat .gitignore | head -40
git check-ignore bot/feishu-bot/data/daily/test.feishu.json bot/feishu-bot/.env

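Expected output (if covered): both paths are printed by git check-ignore.

  • Step 2: If the root .gitignore does not cover them, add rules

Run this only if the previous command printed nothing for bot/feishu-bot/.env:

# Append at the end so existing rules are not disturbed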
cat >> .gitignore <<'EOF'

# Feishu bot
bot/feishu-bot/data/
bot/feishu-bot/.env
bot/feishu-bot/.env.*
!bot/feishu-bot/.env.example
EOF

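bot/feishu-bot/.gitignore (created in Task 1) already covers the same scope. Repeating the rules at the root is a second layer of protection in case someone runs git add . from the repository root. In practice git reads nested .gitignore files, so this step is most likely unnecessary; do it only when Step 1 shows the paths are not covered.

  • Step 3: Run the full unit test suite on both sides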
cd bot/feishu-bot
/usr/bin/python3 -m unittest discover tests -v

cd ../wechat-bot
/usr/bin/python3 -m unittest discover tests -v

Expected: Feishu side ~19 tests PASS; WeChat side 4 tests PASS (test_load_daily)

  • Step 4: Commit (only if Step 2 changed the root .gitignore)
git add .gitignore
git commit -m "chore: 根 .gitignore 兜底覆蓋 bot/feishu-bot/{data,.env}"

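Task 17: Manual end-to-end verification

Prerequisite: the Pre-conditions steps are done (bot created, permissions approved, bot added to the group, and .env filled in with the real app_id / app_secret / chat_id).

This step cannot be automated. It has to hit the live Feishu API, so the user decides when to run it.

  • Step 1: Verify connectivity with inventory
cd bot/feishu-bot
/usr/bin/python3 scripts/inventory.py

Expected: at least 1 group is listed and its name is read correctly. If 0 groups are listed, the bot has not been added to a group. A 401/403 means the app_secret is wrong or the permissions have not been approved.

  • Step 2: Single-day extraction

Use yesterday as the target date (the script's default):

/usr/bin/python3 scripts/extract_day.py

Expected:

  • stdout shows [YYYY-MM-DD] N messages across 1 groups -> /...
  • data/daily/YYYY-MM-DD.feishu.json exists
  • Opening the file shows that groups is a list and that the first group has platform: "feishu" and message_count > 0 (if there were messages that day)

If message_count = 0 although the group had messages, the likely causes are: im:message:readonly is not in effect, the scope has not been approved, or the messages fall outside the time window.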
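
The manual inspection of the output file can also be scripted. A minimal sketch follows, run here against an inline sample payload instead of a real file. The helper name check_daily_payload is hypothetical, and the field names are the ones written by _extract_one_day in this plan.

```python
def check_daily_payload(data: dict) -> list:
    """Return a list of human-readable problems; empty means the payload looks sane."""
    groups = data.get("groups")
    if not isinstance(groups, list):
        return ["groups is not a list"]
    start, end = data.get("window_start"), data.get("window_end")
    if start is None or end is None:
        return ["window_start / window_end missing"]
    problems = []
    for g in groups:
        gid = g.get("group_id")
        if g.get("platform") != "feishu":
            problems.append(f"{gid}: platform is not 'feishu'")
        msgs = g.get("messages") or []
        if g.get("message_count") != len(msgs):
            problems.append(f"{gid}: message_count does not match len(messages)")
        for m in msgs:
            # Every message timestamp should fall inside [window_start, window_end).
            if not (start <= m["ts"] < end):
                problems.append(f"{gid}: message ts {m['ts']} outside window")
    return problems


# Sample payload for 2026-04-30 (Asia/Shanghai); values are illustrative.
sample = {
    "date": "2026-04-30",
    "window_start": 1777478400,
    "window_end": 1777564800,
    "groups": [{
        "group_id": "oc_x",
        "platform": "feishu",
        "message_count": 1,
        "messages": [{"ts": 1777480000, "text": "hello"}],
    }],
}
print(check_daily_payload(sample))  # []
```

For a real run, the payload would be loaded with json.loads from data/daily/YYYY-MM-DD.feishu.json before being passed in.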

  • Step 3: dry-run + no-overwrite
/usr/bin/python3 scripts/extract_day.py --dry-run
/usr/bin/python3 scripts/extract_day.py --no-overwrite # should exit with code 2

Expected: the first command writes nothing and only prints; the second exits with code 2 and stderr contains "exists and --no-overwrite given".

  • Step 4: Run generate_report for one day as a dry run
cd ../wechat-bot
/usr/bin/python3 scripts/generate_report.py YYYY-MM-DD --dry-run

Expected: stdout lists N+M groups (N WeChat + M Feishu), and every Feishu group name starts with "Hermes Agent 中文社區飛書群".

  • Step 5: Full run (consumes LLM quota)
/usr/bin/python3 scripts/generate_report.py YYYY-MM-DD

Expected:

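  • bot/wechat-bot/data/reports/<model>/YYYY-MM-DD.detailed.md contains entries sourced from Feishu groups

  • In the source labels, Feishu groups appear as "Hermes Agent 中文社區飛書群 1" and WeChat groups as "Hermes Agent 中文社區微信群 N"

  • If a topic was discussed on both platforms, the deduped entry reads **來源**:Hermes Agent 中文社區微信群 3 / Hermes Agent 中文社區飛書群 1

  • Step 6: Render the poster to verify the downstream

cd ../..
pnpm wechat-summary:render -- YYYY-MM-DD

Expected: bot/wechat-summary-bot/output/YYYY-MM-DD.png contains the Feishu group entries and the layout is visually intact.

  • Step 7: Acceptance

Once everything above checks out, the end-to-end task can be closed. Nothing is committed in this step; it ends when verification passes.

If something fails, go back to the Task that matches where it failed (decoding error → Task 3-5; pagination error → Task 8; merge error → Task 14), add a unit test that reproduces it, and fix it.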


Self-Review

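Coverage check against each section of the design doc (docs/superpowers/specs/2026-05-01-feishu-group-extraction-design.md):

  • §3 Overall architecture → Task 11-12 (extract_day) + Task 14 (generate_report change)
  • §4 Feishu custom app & permissions → Pre-conditions (manual)
  • §5.1 schema → Task 12 output format + Task 14 test assertions
  • §5.2 Implementation notes → Task 6-9 client + Task 2-5 decoders + Task 11 day_bounds and weekend backfill + Task 12 main flow
  • §6.1 _load_daily → Task 14
  • §6.2 _display_source unchanged → implicit; no task needed (the existing regex is left alone)
  • §6.3 LLM input preprocessing stays as is → no task needed
  • §6.4 dedupe unchanged → no task needed
  • §7 Directory structure → Task 1, 13, 15
  • §8 Configuration & credentials → Task 1 (.env.example) + Task 16 (gitignore safety net)
  • §9 Error handling & idempotency → Task 7 (retries) + Task 12 (--no-overwrite) + Task 4-5 (unknown types dropped)
  • §10 Tests → Task 2-11, Task 14
  • §11 Risks and trade-offs → README/CLAUDE.md (Task 15)

Placeholder scan: no TBD / TODO / "appropriate error handling".

Type consistency: the field names returned by decode_message (ts / time / sender_wxid / sender_name / type / text) are defined in Task 2 and used consistently in Task 3-5 and 12. The FeishuClient method signatures (_get_token / _request / iter_messages / get_chat_name) are defined in Task 6-9 and called consistently in Task 12-13.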
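
Field-name consistency of this kind could also be checked mechanically. The sketch below is illustrative only: REQUIRED_KEYS is taken from the field list in the paragraph above, while missing_keys and the sample messages are hypothetical.

```python
# Field names listed in the type-consistency note above.
REQUIRED_KEYS = {"ts", "time", "sender_wxid", "sender_name", "type", "text"}


def missing_keys(message: dict) -> set:
    """Return the required field names that a decoded message lacks."""
    return REQUIRED_KEYS - message.keys()


# Hypothetical decoded messages; the first is complete, the second is not.
ok = {"ts": 1777480000, "time": "00:26", "sender_wxid": "ou_abc",
      "sender_name": "", "type": "text", "text": "hello"}
drifted = {"ts": 1777480000, "sender_id": "ou_abc", "text": "hello"}

print(missing_keys(ok))               # set()
print(sorted(missing_keys(drifted)))  # ['sender_name', 'sender_wxid', 'time', 'type']
```

Such a check could run over every message in a daily JSON file to catch schema drift between the two sides before the merge step.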

Scope check: this is a single implementation plan. Its three logical phases (decoders, client, CLI/integration) run in sequence, with no independent sub-projects.

Plan ready.