
Guide

Control LLM output with regular expressions and grammars, guarantee valid JSON/XML/code generation, enforce structured formats, and build multi-step workflows with Guidance, the constrained-generation framework from Microsoft Research

Skill Metadata

Source: Optional — install with hermes skills install official/mlops/guidance
Path: optional-skills/mlops/guidance
Version: 1.0.0
Author: Orchestra Research
License: MIT
Dependencies: guidance, transformers
Tags: Prompt Engineering, Guidance, Constrained Generation, Structured Output, JSON Validation, Grammar, Microsoft Research, Format Enforcement, Multi-Step Workflows

Reference: Full SKILL.md

Info

Below is the complete skill definition that Hermes loads when this skill is triggered. These are the instructions the agent sees when the skill is activated.

Guidance: Constrained LLM Generation

When to Use This Skill

Use Guidance when you need to:

  • Control LLM output syntax with regular expressions or grammars
  • Guarantee valid JSON/XML/code generation
  • Reduce latency compared with traditional prompting
  • Enforce structured formats (dates, emails, IDs, etc.)
  • Build multi-step workflows with Pythonic control flow
  • Prevent invalid output through grammar constraints

GitHub Stars: 18,000+ | Source: Microsoft Research

Installation

# Base installation
pip install guidance

# With specific backends
pip install guidance[transformers] # Hugging Face models
pip install guidance[llama_cpp] # llama.cpp models

Quick Start

Basic Example: Structured Generation

from guidance import models, gen

# Load model (supports OpenAI, Transformers, llama.cpp)
lm = models.OpenAI("gpt-4")

# Generate with constraints
result = lm + "The capital of France is " + gen("capital", max_tokens=5)

print(result["capital"]) # "Paris"

Using with Anthropic Claude

from guidance import models, gen, system, user, assistant

# Configure Claude
lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Use context managers for chat format
with system():
    lm += "You are a helpful assistant."

with user():
    lm += "What is the capital of France?"

with assistant():
    lm += gen(max_tokens=20)

Core Concepts

1. Context Managers

Guidance uses Pythonic context managers for chat-style interaction.

from guidance import models, system, user, assistant, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# System message
with system():
    lm += "You are a JSON generation expert."

# User message
with user():
    lm += "Generate a person object with name and age."

# Assistant response
with assistant():
    lm += gen("response", max_tokens=100)

print(lm["response"])

Advantages:

  • Natural chat flow
  • Clean role separation
  • Easy to read and maintain

2. Constrained Generation

Guidance guarantees that output matches patterns specified with regular expressions or grammars.

Regex Constraints

from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to valid email format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# Constrain to date format (YYYY-MM-DD)
lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}")

# Constrain to phone number
lm += "Phone: " + gen("phone", regex=r"\d{3}-\d{3}-\d{4}")

print(lm["email"]) # Guaranteed valid email
print(lm["date"]) # Guaranteed YYYY-MM-DD format

How it works:

  • The regex is compiled into a token-level grammar
  • Invalid tokens are filtered out during generation
  • The model can only emit output that matches the pattern
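
The filtering step can be sketched in plain Python. This is a toy illustration of the idea, not Guidance's actual internals: keep only candidate tokens whose addition could still lead to a full match of the target pattern.

```python
# Toy sketch of token-level constraint enforcement (not Guidance's
# real implementation). Target pattern: \d{3}-\d{4}, e.g. "555-0199".

def is_viable_prefix(text: str) -> bool:
    """True if `text` could still grow into a full match."""
    template = "ddd-dddd"  # shape of the target language
    if len(text) > len(template):
        return False
    for ch, slot in zip(text, template):
        if slot == "d" and not ch.isdigit():
            return False
        if slot == "-" and ch != "-":
            return False
    return True

def allowed_tokens(prefix: str, vocab: list[str]) -> list[str]:
    """The 'mask': keep only tokens that keep the prefix viable."""
    return [tok for tok in vocab if is_viable_prefix(prefix + tok)]

vocab = ["5", "55", "-", "-01", "abc", " 99", "99"]
print(allowed_tokens("555", vocab))  # ['-', '-01']
```

At each decoding step the real engine applies such a mask over the whole vocabulary, so the model physically cannot emit a token that breaks the pattern.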

Choice Constraints

from guidance import models, gen, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Constrain to specific choices
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")

# Multiple-choice selection
lm += "Best answer: " + select(
    ["A) Paris", "B) London", "C) Berlin", "D) Madrid"],
    name="answer"
)

print(lm["sentiment"]) # One of: positive, negative, neutral
print(lm["answer"]) # One of the four option strings

3. Token Healing

Guidance automatically "heals" token boundaries between the prompt and the generated text.

The problem: tokenization creates unnatural boundaries.

# Without token healing
prompt = "The capital of France is "
# Last token: " is "
# First generated token might be " Par" (with leading space)
# Result: "The capital of France is  Paris" (double space!)

The solution: Guidance backs up one token and regenerates across the boundary.

from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# Token healing enabled by default
lm += "The capital of France is " + gen("capital", max_tokens=5)
# Result: "The capital of France is Paris" (correct spacing)

Benefits:

  • Natural text boundaries
  • No awkward spacing issues
  • Better model performance (the model sees natural token sequences)
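
A rough sketch of the healing mechanic (the assumed behaviour, not the library's code): drop the last prompt token, then only accept continuations that begin with the dropped text, so generation re-crosses the boundary naturally.

```python
# Toy token-healing sketch: roll back the final prompt token and pick a
# continuation that starts with the rolled-back text.

def heal(prompt_tokens: list[str], candidates: list[str]) -> str:
    *kept, last = prompt_tokens
    for cand in candidates:
        if cand.startswith(last):
            # The continuation re-covers the rolled-back token naturally
            return "".join(kept) + cand
    # No candidate heals the boundary: plain-concatenation fallback
    return "".join(prompt_tokens) + candidates[0]

prompt_tokens = ["The capital of France", " is", " "]
candidates = [" Paris", "Paris"]  # stand-ins for model proposals
print(heal(prompt_tokens, candidates))
# The capital of France is Paris   (single space, not two)
```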

4. Grammar-Based Generation

Define complex structures with context-free grammars.

from guidance import models, gen

lm = models.Anthropic("claude-sonnet-4-5-20250929")

# JSON grammar (simplified)
json_grammar = """
{
    "name": <gen name regex="[A-Za-z ]+" max_tokens=20>,
    "age": <gen age regex="[0-9]+" max_tokens=3>,
    "email": <gen email regex="[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}" max_tokens=50>
}
"""

# Generate valid JSON
lm += gen("person", grammar=json_grammar)

print(lm["person"]) # Guaranteed valid JSON structure

Use cases:

  • Complex structured output
  • Nested data structures
  • Programming-language syntax
  • Domain-specific languages
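
Nested structures are exactly where grammars beat regexes: a regular expression cannot track arbitrary nesting depth, while a context-free grammar can. The following minimal recursive-descent recognizer illustrates the concept (it is not part of the Guidance API):

```python
# Minimal CFG recognizer for a toy nested language:
#   value := "1" | "[" value ("," value)* "]"
# A regex cannot count matching brackets; a recursive grammar can.

def parse_value(s: str, i: int = 0) -> int:
    """Return the index just past one `value`, or raise ValueError."""
    if i < len(s) and s[i] == "1":
        return i + 1
    if i < len(s) and s[i] == "[":
        i = parse_value(s, i + 1)
        while i < len(s) and s[i] == ",":
            i = parse_value(s, i + 1)
        if i < len(s) and s[i] == "]":
            return i + 1
    raise ValueError(f"parse error at {i}")

def matches(s: str) -> bool:
    try:
        return parse_value(s) == len(s)
    except ValueError:
        return False

print(matches("[1,[1,1]]"))  # True
print(matches("[1,[1]"))     # False (unbalanced brackets)
```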

5. Guidance Functions

Create reusable generation patterns with the @guidance decorator.

from guidance import guidance, gen, models

@guidance
def generate_person(lm):
    """Generate a person with name and age."""
    lm += "Name: " + gen("name", max_tokens=20, stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+", max_tokens=3)
    return lm

# Use the function
lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = generate_person(lm)

print(lm["name"])
print(lm["age"])

Stateful functions:

from guidance import guidance, gen, select

@guidance(stateless=False)
def react_agent(lm, question, tools, max_rounds=5):
    """ReAct agent with tool use."""
    lm += f"Question: {question}\n\n"

    for i in range(max_rounds):
        # Thought
        lm += f"Thought {i+1}: " + gen("thought", stop="\n")

        # Action
        lm += "\nAction: " + select(list(tools.keys()), name="action")

        # Execute tool
        tool_result = tools[lm["action"]]()
        lm += f"\nObservation: {tool_result}\n\n"

        # Check if done
        lm += "Done? " + select(["Yes", "No"], name="done")
        if lm["done"] == "Yes":
            break

    # Final answer
    lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
    return lm

Backend Configuration

Anthropic Claude

from guidance import models

lm = models.Anthropic(
    model="claude-sonnet-4-5-20250929",
    api_key="your-api-key"  # Or set ANTHROPIC_API_KEY env var
)

OpenAI

lm = models.OpenAI(
    model="gpt-4o-mini",
    api_key="your-api-key"  # Or set OPENAI_API_KEY env var
)

Local Models (Transformers)

from guidance.models import Transformers

lm = Transformers(
    "microsoft/Phi-4-mini-instruct",
    device="cuda"  # Or "cpu"
)

Local Models (llama.cpp)

from guidance.models import LlamaCpp

lm = LlamaCpp(
    model_path="/path/to/model.gguf",
    n_ctx=4096,
    n_gpu_layers=35
)

Common Patterns

Pattern 1: JSON Generation

from guidance import models, gen, system, user, assistant

lm = models.Anthropic("claude-sonnet-4-5-20250929")

with system():
    lm += "You generate valid JSON."

with user():
    lm += "Generate a user profile with name, age, and email."

with assistant():
    lm += """{
  "name": """ + gen("name", regex=r'"[A-Za-z ]+"', max_tokens=30) + """,
  "age": """ + gen("age", regex=r"[0-9]+", max_tokens=3) + """,
  "email": """ + gen("email", regex=r'"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"', max_tokens=50) + """
}"""

print(lm) # Valid JSON guaranteed

Pattern 2: Classification

from guidance import models, gen, select

lm = models.Anthropic("claude-sonnet-4-5-20250929")

text = "This product is amazing! I love it."

lm += f"Text: {text}\n"
lm += "Sentiment: " + select(["positive", "negative", "neutral"], name="sentiment")
lm += "\nConfidence: " + gen("confidence", regex=r"[0-9]+", max_tokens=3) + "%"

print(f"Sentiment: {lm['sentiment']}")
print(f"Confidence: {lm['confidence']}%")

Pattern 3: Multi-Step Reasoning

from guidance import models, gen, guidance

@guidance
def chain_of_thought(lm, question):
    """Generate answer with step-by-step reasoning."""
    lm += f"Question: {question}\n\n"

    # Generate multiple reasoning steps
    for i in range(3):
        lm += f"Step {i+1}: " + gen(f"step_{i+1}", stop="\n", max_tokens=100) + "\n"

    # Final answer
    lm += "\nTherefore, the answer is: " + gen("answer", max_tokens=50)

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = chain_of_thought(lm, "What is 15% of 200?")

print(lm["answer"])

Pattern 4: ReAct Agent

from guidance import models, gen, select, guidance

@guidance(stateless=False)
def react_agent(lm, question):
    """ReAct agent with tool use."""
    tools = {
        "calculator": lambda expr: eval(expr),  # Demo only: eval is unsafe on untrusted input
        "search": lambda query: f"Search results for: {query}",
    }

    lm += f"Question: {question}\n\n"

    for step in range(5):
        # Thought
        lm += "Thought: " + gen("thought", stop="\n") + "\n"

        # Action selection
        lm += "Action: " + select(["calculator", "search", "answer"], name="action")

        if lm["action"] == "answer":
            lm += "\nFinal Answer: " + gen("answer", max_tokens=100)
            break

        # Action input
        lm += "\nAction Input: " + gen("action_input", stop="\n") + "\n"

        # Execute tool
        if lm["action"] in tools:
            result = tools[lm["action"]](lm["action_input"])
            lm += f"Observation: {result}\n\n"

    return lm

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = react_agent(lm, "What is 25 * 4 + 10?")
print(lm["answer"])

Pattern 5: Data Extraction

from guidance import models, gen, guidance

@guidance
def extract_entities(lm, text):
    """Extract structured entities from text."""
    lm += f"Text: {text}\n\n"

    # Extract person
    lm += "Person: " + gen("person", stop="\n", max_tokens=30) + "\n"

    # Extract organization
    lm += "Organization: " + gen("organization", stop="\n", max_tokens=30) + "\n"

    # Extract date
    lm += "Date: " + gen("date", regex=r"\d{4}-\d{2}-\d{2}", max_tokens=10) + "\n"

    # Extract location
    lm += "Location: " + gen("location", stop="\n", max_tokens=30) + "\n"

    return lm

text = "Tim Cook announced at Apple Park on 2024-09-15 in Cupertino."

lm = models.Anthropic("claude-sonnet-4-5-20250929")
lm = extract_entities(lm, text)

print(f"Person: {lm['person']}")
print(f"Organization: {lm['organization']}")
print(f"Date: {lm['date']}")
print(f"Location: {lm['location']}")

Best Practices

1. Use Regex for Format Validation

# ✅ Good: Regex ensures valid format
lm += "Email: " + gen("email", regex=r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

# ❌ Bad: Free generation may produce invalid emails
lm += "Email: " + gen("email", max_tokens=50)

2. Use select() for Fixed Categories

# ✅ Good: Guaranteed valid category
lm += "Status: " + select(["pending", "approved", "rejected"], name="status")

# ❌ Bad: May generate typos or invalid values
lm += "Status: " + gen("status", max_tokens=20)

3. Leverage Token Healing

# Token healing is enabled by default
# No special action needed - just concatenate naturally
lm += "The capital is " + gen("capital") # Automatic healing

4. Use Stop Sequences

# ✅ Good: Stop at newline for single-line outputs
lm += "Name: " + gen("name", stop="\n")

# ❌ Bad: May generate multiple lines
lm += "Name: " + gen("name", max_tokens=50)

5. Create Reusable Functions

# ✅ Good: Reusable pattern
@guidance
def generate_person(lm):
    lm += "Name: " + gen("name", stop="\n")
    lm += "\nAge: " + gen("age", regex=r"[0-9]+")
    return lm

# Use multiple times
lm = generate_person(lm)
lm += "\n\n"
lm = generate_person(lm)

6. Balance Constraints

# ✅ Good: Reasonable constraints
lm += gen("name", regex=r"[A-Za-z ]+", max_tokens=30)

# ❌ Too strict: May fail or be very slow
lm += gen("name", regex=r"^(John|Jane)$", max_tokens=10)

Comparison with Alternatives

Feature             | Guidance | Instructor | Outlines   | LMQL
Regex constraints   | ✅ Yes   | ❌ No      | ✅ Yes     | ✅ Yes
Grammar support     | ✅ CFG   | ❌ No      | ✅ CFG     | ✅ CFG
Pydantic validation | ❌ No    | ✅ Yes     | ✅ Yes     | ❌ No
Token healing       | ✅ Yes   | ❌ No      | ✅ Yes     | ❌ No
Local models        | ✅ Yes   | ⚠️ Limited | ✅ Yes     | ✅ Yes
API models          | ✅ Yes   | ✅ Yes     | ⚠️ Limited | ✅ Yes
Pythonic syntax     | ✅ Yes   | ✅ Yes     | ✅ Yes     | ❌ SQL-like
Learning curve      |          |            |            |

When to choose Guidance:

  • You need regex/grammar constraints
  • You want token healing
  • You are building complex workflows with control flow
  • You use local models (Transformers, llama.cpp)
  • You prefer Pythonic syntax

When to choose an alternative:

  • Instructor: you need Pydantic validation with automatic retries
  • Outlines: you need JSON Schema validation
  • LMQL: you prefer a declarative query syntax

Performance Characteristics

Latency reduction:

  • 30-50% faster than traditional prompting for constrained outputs
  • Token healing reduces unnecessary regeneration
  • Grammar constraints prevent invalid tokens from being generated

Memory usage:

  • Minimal overhead compared with unconstrained generation
  • Compiled grammars are cached after first use
  • Efficient token filtering at inference time

Token efficiency:

  • No tokens wasted on invalid output
  • No retry loops needed
  • Valid output is generated directly
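
For contrast, here is the validate-and-retry loop that constrained generation makes unnecessary. This is a sketch: the lambda below stands in for a real model call.

```python
import re

# Post-hoc validation with retries: the pattern constrained generation avoids.
EMAIL = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def email_with_retries(llm, max_retries=3):
    """Call `llm` until it returns a valid email, wasting tokens on each failure."""
    for attempt in range(1, max_retries + 1):
        out = llm()
        if EMAIL.fullmatch(out):
            return out, attempt
    raise ValueError("no valid email after retries")

# Stand-in model that fails twice before succeeding
outputs = iter(["not an email", "alice@", "alice@example.com"])
email, attempts = email_with_retries(lambda: next(outputs))
print(email, attempts)  # alice@example.com 3
```

Every failed attempt in this loop burns a full generation's worth of tokens; a regex constraint makes the first attempt valid by construction.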

Resources

See Also

  • references/constraints.md - Comprehensive regex and grammar patterns
  • references/backends.md - Backend-specific configuration
  • references/examples.md - Production-ready examples