Faiss

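Facebook's library for efficient similarity search and clustering of dense vectors. It supports billion-scale vector collections, GPU acceleration, and several index types (Flat, IVF, HNSW). Use it for fast k-NN search, large-scale vector retrieval, or pure similarity search that needs no metadata. It is best suited to high-performance applications.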

Skill metadata


Source	Optional: install with `hermes skills install official/mlops/faiss`
Path	`optional-skills/mlops/faiss`
Version	`1.0.0`
Author	Orchestra Research
License	MIT
Dependencies	`faiss-cpu`, `faiss-gpu`, `numpy`
Tags	`RAG`, `FAISS`, `Similarity Search`, `Vector Search`, `Facebook AI`, `GPU Acceleration`, `Billion-Scale`, `K-NN`, `HNSW`, `High Performance`, `Large Scale`

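Reference: full SKILL.md

Info

Below is the complete skill definition that Hermes loads when this skill is triggered. These are the instructions the agent sees while the skill is active.

FAISS - Efficient Similarity Search

Facebook AI's library for billion-scale vector similarity search.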

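When to use FAISS

Use FAISS when:

You need fast similarity search over large vector datasets (millions to billions of vectors)
You need GPU acceleration
You need pure vector similarity (no metadata filtering)
High throughput and low latency are critical
You process embeddings offline or in batches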

Metrics:

31,700+ GitHub stars
Developed by Meta/Facebook AI Research
Handles billion-scale vector collections
Written in C++ with Python bindings

Use an alternative instead:

Chroma/Pinecone: you need metadata filtering
Weaviate: you need full database features
Annoy: simpler, with fewer features

Quick start

Installation

# CPU only
pip install faiss-cpu

# GPU support
pip install faiss-gpu

Basic usage

import faiss
import numpy as np

# Create sample data (1000 vectors, 128 dimensions)
d = 128
nb = 1000
vectors = np.random.random((nb, d)).astype('float32')

# Create index
index = faiss.IndexFlatL2(d)  # L2 distance
index.add(vectors)             # Add vectors

# Search
k = 5  # Find 5 nearest neighbors
query = np.random.random((1, d)).astype('float32')
distances, indices = index.search(query, k)

print(f"Nearest neighbors: {indices}")
print(f"Distances: {distances}")
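For intuition about the two arrays that `index.search` returns, the following dependency-free sketch performs the same kind of exhaustive search in plain Python on a tiny hand-made dataset. The helper name `brute_force_l2` and the sample points are illustrative assumptions, not part of FAISS. It reports squared Euclidean distances, which is also what `IndexFlatL2` returns (no square root is taken).

```python
def brute_force_l2(database, query, k):
    """Return (distances, indices) of the k rows nearest to `query`.

    Hypothetical reference helper: distances are squared L2 values,
    sorted in ascending order.
    """
    scored = []
    for i, row in enumerate(database):
        dist = sum((a - b) ** 2 for a, b in zip(row, query))
        scored.append((dist, i))
    scored.sort()
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


database = [
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 2.0],
    [3.0, 3.0],
]
distances, indices = brute_force_l2(database, query=[0.9, 0.1], k=2)
print(indices)
print(distances)
```

Unlike this single-query helper, FAISS takes a 2-D array of queries and returns 2-D arrays with one row of results per query.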

Index types

1. Flat (exact search)

# L2 (Euclidean) distance
index = faiss.IndexFlatL2(d)

# Inner product (cosine similarity if normalized)
index = faiss.IndexFlatIP(d)

# Exhaustive search: slowest at query time, but exact
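The cosine-similarity comment relies on an identity: once both vectors are scaled to unit length, their inner product equals their cosine similarity. The sketch below checks this in plain Python; `l2_normalize` and `inner_product` are illustrative helpers, not FAISS functions. For NumPy float32 matrices, FAISS ships `faiss.normalize_L2`, which normalizes rows in place.

```python
import math


def l2_normalize(vec):
    """Scale `vec` to unit L2 norm (illustrative helper)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def inner_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


a = [3.0, 4.0]
b = [4.0, 3.0]

# Cosine similarity computed directly from its definition.
cosine = inner_product(a, b) / (math.hypot(*a) * math.hypot(*b))

# Inner product of the normalized vectors: the score an inner-product
# index would assign once both sides are normalized.
ip_normalized = inner_product(l2_normalize(a), l2_normalize(b))

print(cosine, ip_normalized)
```

Both the stored vectors and the queries need to be normalized; normalizing only one side gives scores that are no longer cosine similarities.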

2. IVF (inverted file) - fast approximate search

# Create quantizer
quantizer = faiss.IndexFlatL2(d)

# IVF index with 100 clusters
nlist = 100
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# Train on data
index.train(vectors)

# Add vectors
index.add(vectors)

# Search (nprobe = clusters to search)
index.nprobe = 10
distances, indices = index.search(query, k)
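IVF training runs k-means with `nlist` centroids, so the training set has to be large relative to `nlist`. The sketch below is a hypothetical pre-flight check. Its two constants reflect FAISS's default clustering parameters as commonly documented (a warning below 39 training points per centroid, subsampling above 256); treat them as assumptions and verify them against the installed version.

```python
# Assumed FAISS clustering defaults; verify against your installed version.
MIN_POINTS_PER_CENTROID = 39
MAX_POINTS_PER_CENTROID = 256


def training_set_report(n_train, nlist):
    """Classify a training-set size for an IVF index with `nlist` clusters."""
    low = MIN_POINTS_PER_CENTROID * nlist
    high = MAX_POINTS_PER_CENTROID * nlist
    if n_train < nlist:
        status = "too small: fewer training vectors than clusters"
    elif n_train < low:
        status = "works, but expect a warning about too few training points"
    elif n_train <= high:
        status = "ok"
    else:
        status = "ok, extra vectors are subsampled during training"
    return {"status": status, "recommended_min": low, "used_at_most": high}


# The IVF example above: 1000 vectors trained into 100 clusters.
report = training_set_report(n_train=1000, nlist=100)
print(report)
```

With 1,000 training vectors and `nlist = 100`, each cluster sees only about 10 points, so either a smaller `nlist` or a larger training set would be a safer choice.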

3. HNSW (Hierarchical Navigable Small World) - best quality/speed trade-off

# HNSW index
M = 32  # Number of connections per layer
index = faiss.IndexHNSWFlat(d, M)

# No training needed
index.add(vectors)

# Search
distances, indices = index.search(query, k)

4. Product Quantization (PQ) - memory efficient

# PQ stores m * nbits / 8 bytes per vector (8 bytes here vs. 512 bytes as float32)
m = 8   # Number of subquantizers
nbits = 8
index = faiss.IndexPQ(d, m, nbits)

# Train and add
index.train(vectors)
index.add(vectors)
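How much memory PQ saves depends on `d`, `m`, and `nbits`, so it helps to work the arithmetic out for a given configuration. The sketch below uses a hypothetical helper, `bytes_per_vector`, that ignores fixed overhead such as codebooks and ID arrays. The HNSW figure is a rule of thumb (the raw float32 vector plus about `2 * M` four-byte links) and should be read as an estimate.

```python
def bytes_per_vector(d, index_type, m=8, nbits=8, hnsw_m=32):
    """Approximate per-vector storage in bytes, ignoring fixed overhead."""
    if index_type == "flat":
        return d * 4                      # one float32 per dimension
    if index_type == "hnsw":
        return d * 4 + hnsw_m * 2 * 4     # raw vector plus graph links (estimate)
    if index_type == "pq":
        return (m * nbits + 7) // 8       # packed PQ code, rounded up to bytes
    raise ValueError(f"unknown index type: {index_type}")


d = 128
flat = bytes_per_vector(d, "flat")
hnsw = bytes_per_vector(d, "hnsw", hnsw_m=32)
pq = bytes_per_vector(d, "pq", m=8, nbits=8)

print(flat, hnsw, pq)
print("PQ compression ratio:", flat // pq)
```

For `d = 128`, `m = 8`, `nbits = 8`, this gives 8 bytes per vector against 512 bytes uncompressed. At one billion vectors that is roughly 8 GB of codes versus 512 GB of raw float32 data.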

Saving and loading

# Save index
faiss.write_index(index, "large.index")

# Load index
index = faiss.read_index("large.index")

# Continue using
distances, indices = index.search(query, k)

GPU acceleration

# Single GPU
res = faiss.StandardGpuResources()
index_cpu = faiss.IndexFlatL2(d)
index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu)  # GPU 0

# Multi-GPU
index_gpu = faiss.index_cpu_to_all_gpus(index_cpu)

# Often 10-100× faster than CPU, depending on index type, batch size, and hardware

LangChain integration

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

# Create FAISS vector store (docs: a list of LangChain Document objects prepared earlier)
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())

# Save
vectorstore.save_local("faiss_index")

# Load
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    allow_dangerous_deserialization=True  # Loading uses pickle: only load files you trust
)

# Search
results = vectorstore.similarity_search("query", k=5)

LlamaIndex integration

from llama_index.vector_stores.faiss import FaissVectorStore
import faiss

# Create FAISS index
d = 1536  # Must match the embedding model's output dimension
faiss_index = faiss.IndexFlatL2(d)

vector_store = FaissVectorStore(faiss_index=faiss_index)

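Best practices

Choose the right index type - Flat for fewer than 10K vectors, IVF for 10K-1M, HNSW when quality matters
Normalize for cosine similarity - use IndexFlatIP on normalized vectors
Use a GPU for large datasets - often 10-100× faster
Save trained indexes - training is expensive
Tune nprobe (IVF) and efSearch (HNSW) - balance speed against accuracy
Monitor memory - use PQ for large datasets
Batch queries - better GPU utilization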
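The first rule of thumb in the list above can be written down as a small selection helper. This is a sketch only: `suggest_index_factory` is a hypothetical name, and the `nlist` heuristic (about `4 * sqrt(n)`) is an assumption rather than something the list states. The list gives no guidance above 1M vectors, so the helper simply falls back to IVF there.

```python
import math


def suggest_index_factory(n_vectors, prefer_quality=False):
    """Map the rule of thumb above to a FAISS index_factory description.

    Hypothetical helper. Thresholds follow the best-practice list;
    the nlist heuristic is an assumption.
    """
    if n_vectors < 10_000:
        return "Flat"                     # exact search is cheap at this size
    if prefer_quality:
        return "HNSW32"                   # graph index with M = 32
    nlist = int(4 * math.sqrt(n_vectors))
    return f"IVF{nlist},Flat"


print(suggest_index_factory(5_000))
print(suggest_index_factory(250_000))
print(suggest_index_factory(250_000, prefer_quality=True))
```

The returned string is meant to be passed to `faiss.index_factory(d, description)`. IVF indexes built this way still need `train()` before `add()`.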

Performance

Index type	Build time	Search time	Memory	Accuracy
Flat	Fast	Slow	High	100%
IVF	Medium	Fast	Medium	95-99%
HNSW	Slow	Fastest	High	99%
PQ	Medium	Fast	Low	90-95%
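The accuracy column is most naturally read as recall: the share of true nearest neighbors, as returned by a Flat index, that an approximate index also returns. The table does not say how its figures were measured, so the sketch below is only one plausible way to measure recall on your own data. `recall_at_k` is a hypothetical helper, and the ID lists are made-up sample values.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of exact neighbors that the approximate search recovered.

    Both arguments are lists of per-query ID lists, for example the
    `indices` array from `index.search(...)` converted with `.tolist()`.
    """
    hits = 0
    total = 0
    for approx, exact in zip(approx_ids, exact_ids):
        hits += len(set(approx) & set(exact))
        total += len(exact)
    return hits / total if total else 0.0


# Made-up results for two queries with k = 3.
exact = [[4, 7, 1], [0, 2, 9]]     # from an exact (Flat) index
approx = [[4, 1, 8], [0, 2, 9]]    # from an approximate index

print(recall_at_k(approx, exact))
```

Measuring recall while sweeping `nprobe` or `efSearch` on a held-out query set shows where extra search effort stops paying off.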

Resources

GitHub: https://github.com/facebookresearch/faiss ⭐ 31,700+
Wiki: https://github.com/facebookresearch/faiss/wiki
License: MIT
