LLM

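The LLM pipeline runs prompts through a large language model (LLM). This pipeline automatically detects the LLM framework based on the model path.

Example

The following shows a simple example using this pipeline.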

from txtai import LLM

# Create LLM pipeline
llm = LLM()

# Run prompt
llm(
  """
  Answer the following question using the provided context.

  Question:
  What are the applications of txtai?

  Context:
  txtai is an open-source platform for semantic search and
  workflows powered by language models.
  """
)

# Instruction tuned models typically require string prompts to
# follow a specific chat template set by the model
llm(
  """
  <|im_start|>system
  You are a friendly assistant.<|im_end|>
  <|im_start|>user
  Answer the following question...<|im_end|>
  <|im_start|>assistant
  """
)

# Chat messages automatically handle templating
llm([
  {"role": "system", "content": "You are a friendly assistant."},
  {"role": "user", "content": "Answer the following question..."}
])

# Set the default role to user and string inputs are converted to chat messages
llm("Answer the following question...", defaultrole="user")
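To make the relationship between chat messages and string prompts concrete, the sketch below renders a list of role/content messages into the ChatML-style string used in the example above. The `to_chatml` helper is hypothetical and not part of txtai. Each model defines its own chat template, so treat this as an illustration of the idea rather than what the pipeline does internally.

```python
def to_chatml(messages, add_generation_prompt=True):
    """
    Renders role/content messages into a ChatML-style prompt string.

    Illustrative only: real models ship their own chat template,
    and the exact markers differ from model to model.
    """

    # One <|im_start|>role ... <|im_end|> block per message
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]

    # Open an assistant turn so the model continues from there
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")

    return "\n".join(parts)


messages = [
    {"role": "system", "content": "You are a friendly assistant."},
    {"role": "user", "content": "Answer the following question..."},
]

prompt = to_chatml(messages)
print(prompt)
```

Passing the message list directly to the pipeline avoids hand-writing these markers and keeps prompts portable across models with different templates.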

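The LLM pipeline automatically detects the underlying LLM framework. This can also be set manually.

This pipeline supports Hugging Face Transformers, llama.cpp and hosted API models via LiteLLM.

See the LiteLLM documentation for the options available with LiteLLM models. llama.cpp models support both local paths and remote GGUF paths on the HF Hub.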

from txtai import LLM

# Transformers
llm = LLM("meta-llama/Meta-Llama-3.1-8B-Instruct")
llm = LLM("meta-llama/Meta-Llama-3.1-8B-Instruct", method="transformers")

# llama.cpp
llm = LLM("microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf")
llm = LLM("microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
           method="llama.cpp")

# LiteLLM
llm = LLM("ollama/llama3.1")
llm = LLM("ollama/llama3.1", method="litellm")

# Custom Ollama endpoint
llm = LLM("ollama/llama3.1", api_base="http://localhost:11434")

# Custom OpenAI-compatible endpoint
llm = LLM("openai/llama3.1", api_base="http://localhost:4000")

# LLM APIs - must also set API key via environment variable
llm = LLM("gpt-4o")
llm = LLM("claude-3-5-sonnet-20240620")
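The idea of inferring the framework from the model path can be pictured with a small heuristic. The sketch below is a simplified stand-in, not the library's detection logic: `guess_method` and the `API_PREFIXES` list are hypothetical names, and the prefix list is an assumption chosen only to cover the paths shown above.

```python
# Hypothetical, simplified stand-in for framework detection.
# The prefixes below are assumptions for illustration only.
API_PREFIXES = ("ollama/", "openai/", "gpt-", "claude-")


def guess_method(path, method=None):
    """
    Returns a framework name for a model path, honoring an explicit method.
    """

    # An explicit method always wins
    if method:
        return method

    # Non-string inputs, such as a (model, tokenizer) tuple, are assumed
    # to be Transformers objects in this sketch
    if not isinstance(path, str):
        return "transformers"

    # GGUF files are assumed to be llama.cpp models
    if path.lower().endswith(".gguf"):
        return "llama.cpp"

    # Provider-style names are assumed to be hosted API models
    if path.lower().startswith(API_PREFIXES):
        return "litellm"

    # Everything else falls back to Transformers
    return "transformers"


for name in ["meta-llama/Meta-Llama-3.1-8B-Instruct", "ollama/llama3.1", "gpt-4o"]:
    print(name, "->", guess_method(name))
```

When a path is ambiguous, setting `method` explicitly, as in the examples above, removes any dependence on detection.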

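Models can be loaded externally and passed to the pipeline. This is useful for models that are not yet supported by Transformers and/or need special initialization.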

import torch

from transformers import AutoModelForCausalLM, AutoTokenizer
from txtai import LLM

# Load Phi 3.5-mini
path = "microsoft/Phi-3.5-mini-instruct"
model = AutoModelForCausalLM.from_pretrained(
  path,
  torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(path)

llm = LLM((model, tokenizer))

See the links below for more detailed examples.

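Notebook	Description
Prompt-driven search with LLMs	Embeddings-guided and prompt-driven search with Large Language Models (LLMs)
Prompt templates and task chains	Build model prompts and connect tasks together with workflows
Build RAG pipelines with txtai	Guide on retrieval augmented generation including how to create citations
Integrate LLM frameworks	Integrate llama.cpp, LiteLLM and custom generation frameworks
Generate knowledge with Semantic Graphs and RAG	Knowledge exploration and discovery with Semantic Graphs and RAG
Build knowledge graphs with LLMs	Build knowledge graphs with LLM-driven entity extraction
Advanced RAG with graph path traversal	Graph path traversal to collect complex sets of data for advanced RAG
Advanced RAG with guided generation	Retrieval Augmented and Guided Generation
RAG with llama.cpp and external API services	RAG with additional vector and LLM frameworks
How RAG with txtai works	Create RAG processes, API services and Docker instances
Speech to Speech RAG ▶️	Full cycle speech to speech workflow with RAG
Generative Audio	Storytelling with generative audio workflows
Analyzing Hugging Face Posts with Graphs and Agents	Explore a rich dataset with graph analysis and agents
Granting autonomy to agents	Agents that iteratively solve problems as they see fit
Getting started with LLM APIs	Generate embeddings and run LLMs with OpenAI, Claude, Gemini, Bedrock and more
Analyzing LinkedIn Company Posts with Graphs and Agents	Exploring how to improve social media engagement with AI
Parsing the stars with txtai	Explore a knowledge graph of known stars, planets and galaxies
Chunking your data for RAG	Extract, chunk and index content for effective retrieval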

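Configuration-driven example

Pipelines are run with Python or configuration. Pipelines can be instantiated in configuration using the lower case name of the pipeline. Configuration-driven pipelines are run with workflows or the API.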

config.yml

# Create pipeline using lower case class name
llm:

# Run pipeline with workflow
workflow:
  llm:
    tasks:
      - action: llm

Similar to the Python example above, the underlying Hugging Face pipeline parameters and model parameters can be set in the pipeline configuration.

llm:
  path: microsoft/Phi-3.5-mini-instruct
  torch_dtype: torch.bfloat16

Run with Workflows

from txtai import Application

# Create and run pipeline with workflow
app = Application("config.yml")
list(app.workflow("llm", [
  """
  Answer the following question using the provided context.

  Question:
  What are the applications of txtai? 

  Context:
  txtai is an open-source platform for semantic search and
  workflows powered by language models.
  """
]))

Run with API

CONFIG=config.yml uvicorn "txtai.api:app" &

curl \
  -X POST "http://localhost:8000/workflow" \
  -H "Content-Type: application/json" \
  -d '{"name":"llm", "elements": ["Answer the following question..."]}'
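The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl call above; the helper names `build_workflow_request` and `run_workflow` are hypothetical, and the code assumes the API service is listening on `http://localhost:8000` as in the curl example.

```python
import json
import urllib.request


def build_workflow_request(name, elements, base="http://localhost:8000"):
    """
    Builds a POST request for the /workflow endpoint without sending it.
    """

    # Same JSON body as the curl example
    payload = json.dumps({"name": name, "elements": elements}).encode("utf-8")

    return urllib.request.Request(
        f"{base}/workflow",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_workflow(name, elements, base="http://localhost:8000"):
    """
    Sends the request and decodes the JSON response.
    Requires a running API service at the base URL.
    """

    request = build_workflow_request(name, elements, base)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Build (but do not send) the request from the curl example
request = build_workflow_request("llm", ["Answer the following question..."])
print(request.get_method(), request.full_url)
```

With the service running, `run_workflow("llm", ["Answer the following question..."])` sends the request and returns the decoded response.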

Methods

Python documentation for the pipeline.

`__init__(path=None, method=None, **kwargs)`

Creates a new LLM.

Parameters

Name	Description	Default
`path`	model path	`None`
`method`	LLM model framework, inferred from path if not provided	`None`
`kwargs`	model keyword arguments	`{}`

Source code in txtai/pipeline/llm/llm.py

def __init__(self, path=None, method=None, **kwargs):
    """
    Creates a new LLM.

    Args:
        path: model path
        method: llm model framework, infers from path if not provided
        kwargs: model keyword arguments
    """

    # Default LLM if not provided
    path = path if path else "google/flan-t5-base"

    # Generation instance
    self.generator = GenerationFactory.create(path, method, **kwargs)

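`__call__(text, maxlength=512, stream=False, stop=None, defaultrole='prompt', **kwargs)`

Generates text. Supports the following input formats:

- String or list of strings (instruction-tuned models must follow chat templates)
- List of dictionaries with `role` and `content` key-values, or a list of such lists

Parameters

Name	Description	Default
`text`	text\|list	required
`maxlength`	maximum sequence length	`512`
`stream`	stream response if True	`False`
`stop`	list of stop strings	`None`
`defaultrole`	default role to apply to text inputs: `prompt` for raw prompts (default) or `user` for user chat messages	`'prompt'`
`kwargs`	additional generation keyword arguments	`{}`

Returns

Type	Description
	generated text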
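The effect of `defaultrole` on the accepted input formats can be pictured with a small normalization function. This is a hypothetical sketch of the documented behavior, not the pipeline's code: `to_messages` is an invented name, and batches (lists of strings or lists of message lists) are left out for brevity.

```python
def to_messages(text, defaultrole="prompt"):
    """
    Sketch of how a single input could be interpreted.

    - A list of role/content dictionaries is already a chat and passes through
    - A string becomes a user chat message when defaultrole is "user"
    - Otherwise the string is kept as a raw prompt
    """

    # Already chat messages: nothing to convert
    if isinstance(text, list) and text and all(isinstance(x, dict) for x in text):
        return text

    # Wrap the string in a single user message
    if isinstance(text, str) and defaultrole == "user":
        return [{"role": "user", "content": text}]

    # Raw prompt: left unchanged, so any chat template must be written by hand
    return text


print(to_messages("Answer the following question..."))
print(to_messages("Answer the following question...", defaultrole="user"))
```

With the default `prompt` role the string reaches the model unchanged, which is why instruction-tuned models need the chat template written into the string itself.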