
RAG

The RAG pipeline (also known as the Extractor) joins a prompt, a context data store and a generative model together to extract knowledge.

The data store can be an embeddings database or a similarity instance with associated input text. The generative model can be a prompt-driven large language model (LLM), an extractive question-answering model or a custom pipeline. This is known as retrieval augmented generation (RAG).

Example

The following shows a simple example using this pipeline.

from txtai import Embeddings, RAG

# Input data
data = [
  "US tops 5 million confirmed virus cases",
  "Canada's last fully intact ice shelf has suddenly collapsed, " +
  "forming a Manhattan-sized iceberg",
  "Beijing mobilises invasion craft along coast as Taiwan tensions escalate",
  "The National Park Service warns against sacrificing slower friends " +
  "in a bear attack",
  "Maine man wins $1M from $25 lottery ticket",
  "Make huge profits without work, earn up to $100,000 a day"
]

# Build embeddings index
embeddings = Embeddings(content=True)
embeddings.index(data)

# Create and run pipeline
rag = RAG(embeddings, "google/flan-t5-base", template="""
  Answer the following question using the provided context.

  Question:
  {question}

  Context:
  {context}
""")

rag("What was won?")

# Instruction tuned models typically require string prompts to
# follow a specific chat template set by the model
rag = RAG(embeddings, "meta-llama/Meta-Llama-3.1-8B-Instruct", template="""
  <|im_start|>system
  You are a friendly assistant.<|im_end|>
  <|im_start|>user
  Answer the following question using the provided context.

  Question:
  {question}

  Context:
  {context}
  <|im_start|>assistant
  """
)
rag("What was won?")

# LLM options can be passed as additional arguments
rag = RAG(embeddings, "meta-llama/Meta-Llama-3.1-8B-Instruct", template="""
  Answer the following question using the provided context.

  Question:
  {question}

  Context:
  {context}
""")

# Set the default role to user and string inputs are converted to chat messages
rag("What was won?", defaultrole="user")

See the Embeddings and LLM pages for additional configuration options.

See the links below for more detailed examples.

| Notebook | Description |
|:---------|:------------|
| Prompt-driven search with LLMs | Embeddings-guided and prompt-driven search with Large Language Models (LLMs) |
| Prompt templates and task chains | Build model prompts and connect tasks together with workflows |
| Build RAG pipelines with txtai | Guide on retrieval augmented generation including how to create citations |
| Integrate LLM frameworks | Integrate llama.cpp, LiteLLM and custom generation frameworks |
| Generate knowledge with Semantic Graphs and RAG | Knowledge exploration and discovery with semantic graphs and RAG |
| Build knowledge graphs with LLMs | Build knowledge graphs with LLM-driven entity extraction |
| Advanced RAG with graph path traversal | Graph path traversal to collect complex sets of data for advanced RAG |
| Advanced RAG with guided generation | Retrieval augmented and guided generation |
| RAG with llama.cpp and external API services | RAG with additional vector and LLM frameworks |
| How RAG with txtai works | Create RAG processes, API services and Docker instances |
| Speech to Speech RAG ▶️ | Full cycle speech to speech workflow with RAG |
| Generative Audio | Storytelling with generative audio workflows |
| Analyzing Hugging Face Posts with Graphs and Agents | Explore a rich dataset with graph analysis and agents |
| Granting autonomy to agents | Agents that iteratively solve problems as they see fit |
| Getting started with LLM APIs | Generate embeddings and run LLMs with OpenAI, Claude, Gemini, Bedrock and more |
| Analyzing LinkedIn Company Posts with Graphs and Agents | Exploring how to improve social media engagement with AI |
| Extractive QA with txtai | Introduction to extractive question-answering with txtai |
| Extractive QA with Elasticsearch | Run extractive question-answering queries with Elasticsearch |
| Extractive QA to build structured data | Build structured datasets using extractive question-answering |
| Parsing the stars with txtai | Explore an astronomical knowledge graph of known stars, planets and galaxies |
| Chunking your data for RAG | Extract, chunk and index content for effective retrieval |

Configuration-driven example

Pipelines can be run with Python or configuration. Pipelines can be instantiated in configuration using the lowercase name of the pipeline. Configuration-driven pipelines are run with workflows or the API.

config.yml

# Allow documents to be indexed
writable: True

# Content is required for extractor pipeline
embeddings:
  content: True

rag:
  path: google/flan-t5-base
  template: |
    Answer the following question using the provided context.

    Question:
    {question}

    Context:
    {context}

workflow:
  search:
    tasks:
      - action: rag

Run with Workflows

Built-in tasks make it easier to use the Extractor (RAG) pipeline.

from txtai import Application

# Create and run pipeline with workflow
app = Application("config.yml")
app.add([
  "US tops 5 million confirmed virus cases",
  "Canada's last fully intact ice shelf has suddenly collapsed, " +
  "forming a Manhattan-sized iceberg",
  "Beijing mobilises invasion craft along coast as Taiwan tensions escalate",
  "The National Park Service warns against sacrificing slower friends " +
  "in a bear attack",
  "Maine man wins $1M from $25 lottery ticket",
  "Make huge profits without work, earn up to $100,000 a day"
])
app.index()

list(app.workflow("search", ["What was won?"]))

Run with API

CONFIG=config.yml uvicorn "txtai.api:app" &

curl \
  -X POST "http://localhost:8000/workflow" \
  -H "Content-Type: application/json" \
  -d '{"name": "search", "elements": ["What was won"]}'

Methods

Python documentation for the pipeline.

__init__(similarity, path, quantize=False, gpu=True, model=None, tokenizer=None, minscore=None, mintokens=None, context=None, task=None, output='default', template=None, separator=' ', system=None, **kwargs)

Builds a new RAG pipeline.

Parameters

| Name | Description | Default |
|:-----|:------------|:--------|
| similarity | similarity instance (embeddings or similarity pipeline) | required |
| path | path to model, supports an LLM, Questions or custom pipeline | required |
| quantize | True if model should be quantized before inference, False otherwise | False |
| gpu | if GPU inference should be used (only works if GPUs are available) | True |
| model | optional existing pipeline model to wrap | None |
| tokenizer | Tokenizer class | None |
| minscore | minimum score to include a context match, defaults to None | None |
| mintokens | minimum number of tokens to include a context match, defaults to None | None |
| context | topn context matches to include, defaults to 3 | None |
| task | model task (language-generation, sequence-sequence or question-answering), defaults to auto-detect | None |
| output | output format, 'default' returns (name, answer), 'flatten' returns answers and 'reference' returns (name, answer, reference) | 'default' |
| template | prompt template, must have parameters for {question} and {context}, defaults to "{question} {context}" | None |
| separator | context separator | ' ' |
| system | system prompt, defaults to None | None |
| kwargs | additional keyword arguments to pass to the pipeline model | {} |
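
For illustration, the sketch below combines a few of these parameters with the embeddings index from the earlier example; the specific values are only examples, not recommended settings.

# Sketch: 'flatten' output returns answers only, context limits retrieval to the top 2 matches
rag = RAG(embeddings, "google/flan-t5-base", output="flatten", context=2, template="""
  Answer the following question using the provided context.

  Question:
  {question}

  Context:
  {context}
""")

print(rag("What was won?"))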
Source code in txtai/pipeline/llm/rag.py
def __init__(
    self,
    similarity,
    path,
    quantize=False,
    gpu=True,
    model=None,
    tokenizer=None,
    minscore=None,
    mintokens=None,
    context=None,
    task=None,
    output="default",
    template=None,
    separator=" ",
    system=None,
    **kwargs,
):
    """
    Builds a new RAG pipeline.

    Args:
        similarity: similarity instance (embeddings or similarity pipeline)
        path: path to model, supports a LLM, Questions or custom pipeline
        quantize: True if model should be quantized before inference, False otherwise.
        gpu: if gpu inference should be used (only works if GPUs are available)
        model: optional existing pipeline model to wrap
        tokenizer: Tokenizer class
        minscore: minimum score to include context match, defaults to None
        mintokens: minimum number of tokens to include context match, defaults to None
        context: topn context matches to include, defaults to 3
        task: model task (language-generation, sequence-sequence or question-answering), defaults to auto-detect
        output: output format, 'default' returns (name, answer), 'flatten' returns answers and 'reference' returns (name, answer, reference)
        template: prompt template, it must have a parameter for {question} and {context}, defaults to "{question} {context}"
        separator: context separator
        system: system prompt, defaults to None
        kwargs: additional keyword arguments to pass to pipeline model
    """

    # Similarity instance
    self.similarity = similarity

    # Model can be a LLM, Questions or custom pipeline
    self.model = self.load(path, quantize, gpu, model, task, **kwargs)

    # Tokenizer class use default method if not set
    self.tokenizer = tokenizer if tokenizer else Tokenizer() if hasattr(self.similarity, "scoring") and self.similarity.isweighted() else None

    # Minimum score to include context match
    self.minscore = minscore if minscore is not None else 0.0

    # Minimum number of tokens to include context match
    self.mintokens = mintokens if mintokens is not None else 0.0

    # Top n context matches to include for context
    self.context = context if context else 3

    # Output format
    self.output = output

    # Prompt template
    self.template = template if template else "{question} {context}"

    # Context separator
    self.separator = separator

    # System prompt template
    self.system = system

__call__(queue, texts=None, **kwargs)

Finds answers to input questions. This method runs queries to find the top n best matches and uses those as the context. A model is then run against the context for each input question, with the answer returned.

Parameters

| Name | Description | Default |
|:-----|:------------|:--------|
| queue | input question queue (name, query, question, snippet), can be a list of tuples/dicts/strings or a single input element | required |
| texts | optional list of texts for context, otherwise runs an embeddings search | None |
| kwargs | additional keyword arguments to pass to the pipeline model | {} |

Returns

A list of answers matching the input format (tuple or dict), containing fields as specified by the output format.
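
As a sketch of the input formats described above, a dictionary input combined with an explicit list of context texts could look like the following; the field values are only illustrative.

# Sketch: dictionary input with explicit context texts
# (field names follow the queue format described above)
results = rag(
    [{"name": "q1", "query": "lottery", "question": "What was won?"}],
    texts=["Maine man wins $1M from $25 lottery ticket"]
)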

Source code in txtai/pipeline/llm/rag.py
def __call__(self, queue, texts=None, **kwargs):
    """
    Finds answers to input questions. This method runs queries to find the top n best matches and uses that as the context.
    A model is then run against the context for each input question, with the answer returned.

    Args:
        queue: input question queue (name, query, question, snippet), can be list of tuples/dicts/strings or a single input element
        texts: optional list of text for context, otherwise runs embeddings search
        kwargs: additional keyword arguments to pass to pipeline model

    Returns:
        list of answers matching input format (tuple or dict) containing fields as specified by output format
    """

    # Save original queue format
    inputs = queue

    # Convert queue to list, if necessary
    queue = queue if isinstance(queue, list) else [queue]

    # Convert dictionary inputs to tuples
    if queue and isinstance(queue[0], dict):
        # Convert dict to tuple
        queue = [tuple(row.get(x) for x in ["name", "query", "question", "snippet"]) for row in queue]

    if queue and isinstance(queue[0], str):
        # Convert string questions to tuple
        queue = [(None, row, row, None) for row in queue]

    # Rank texts by similarity for each query
    results = self.query([query for _, query, _, _ in queue], texts)

    # Build question-context pairs
    names, queries, questions, contexts, topns, snippets = [], [], [], [], [], []
    for x, (name, query, question, snippet) in enumerate(queue):
        # Get top n best matching segments
        topn = sorted(results[x], key=lambda y: y[2], reverse=True)[: self.context]

        # Generate context using ordering from texts, if available, otherwise order by score
        context = self.separator.join(text for _, text, _ in (sorted(topn, key=lambda y: y[0]) if texts else topn))

        names.append(name)
        queries.append(query)
        questions.append(question)
        contexts.append(context)
        topns.append(topn)
        snippets.append(snippet)

    # Run pipeline and return answers
    answers = self.answers(questions, contexts, **kwargs)

    # Apply output formatting to answers and return
    return self.apply(inputs, names, queries, answers, topns, snippets) if isinstance(answers, list) else answers