HFOnnx

pipeline

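Exports a Hugging Face Transformer model to ONNX. Currently, this works best with classification, pooling and question-answering models. Support for sequence-to-sequence models (summarization, transcription, translation) is a work in progress.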

Example

The following shows a simple example using this pipeline.

from txtai.pipeline import HFOnnx, Labels

# Model path
path = "distilbert-base-uncased-finetuned-sst-2-english"

# Export model to ONNX
onnx = HFOnnx()
model = onnx(path, "text-classification", "model.onnx", True)

# Run inference and validate
labels = Labels((model, path), dynamic=False)
labels("I am happy")

See the link below for a more detailed example.

Notebook	Description
Export and run models with ONNX	Export models with ONNX and run them natively in JavaScript, Java and Rust

Methods

Python documentation for the pipeline.

`__call__(path, task='default', output=None, quantize=False, opset=14)`

Exports a Hugging Face Transformer model to ONNX.

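Parameters

Name	Description	Default
`path`	path to model, accepts a Hugging Face model hub id, a local path or a (model, tokenizer) tuple	required
`task`	optional model task or category, determines the model type and outputs, defaults to exporting the hidden state	`'default'`
`output`	optional output model path, defaults to returning a byte array if None	`None`
`quantize`	if the model should be quantized (requires onnx to be installed), defaults to False	`False`
`opset`	onnx opset, defaults to 14	`14`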

Returns

Type	Description
	path to model output or model as bytes, depending on the output parameter
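
The relationship between the `output` argument and the return value can be illustrated with the standard library alone. The following is a minimal sketch, not the txtai implementation: `fake_export` is a hypothetical stand-in that writes placeholder bytes rather than a real ONNX model, and it only mirrors the "path in, path out; nothing in, bytes out" behavior described above.

```python
from io import BytesIO
from pathlib import Path
import tempfile


def fake_export(output=None):
    """Hypothetical stand-in for an exporter: writes placeholder bytes, not a real ONNX model."""
    payload = b"placeholder-model-bytes"

    # Fall back to an in-memory buffer when no output path is given
    target = output if output else BytesIO()

    if isinstance(target, BytesIO):
        target.write(payload)
        # Reset the stream and return the raw bytes
        target.seek(0)
        return target.read()

    # Otherwise write to the requested path and return that path
    Path(target).write_bytes(payload)
    return target


# No output path: the result is a bytes object
in_memory = fake_export()

# Output path given: the result is that same path
with tempfile.TemporaryDirectory() as tmp:
    on_disk = fake_export(str(Path(tmp) / "model.onnx"))
    size = Path(on_disk).stat().st_size
```

Callers that accept either form can branch on `isinstance(result, bytes)` to decide whether they hold the model itself or a path to it.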

Source code in txtai/pipeline/train/hfonnx.py

def __call__(self, path, task="default", output=None, quantize=False, opset=14):
    """
    Exports a Hugging Face Transformer model to ONNX.

    Args:
        path: path to model, accepts Hugging Face model hub id, local path or (model, tokenizer) tuple
        task: optional model task or category, determines the model type and outputs, defaults to export hidden state
        output: optional output model path, defaults to return byte array if None
        quantize: if model should be quantized (requires onnx to be installed), defaults to False
        opset: onnx opset, defaults to 14

    Returns:
        path to model output or model as bytes depending on output parameter
    """

    inputs, outputs, model = self.parameters(task)

    if isinstance(path, (list, tuple)):
        model, tokenizer = path
        model = model.cpu()
    else:
        model = model(path)
        tokenizer = AutoTokenizer.from_pretrained(path)

    # Generate dummy inputs
    dummy = dict(tokenizer(["test inputs"], return_tensors="pt"))

    # Default to BytesIO if no output file provided
    output = output if output else BytesIO()

    # Export model to ONNX
    export(
        model,
        (dummy,),
        output,
        opset_version=opset,
        do_constant_folding=True,
        input_names=list(inputs.keys()),
        output_names=list(outputs.keys()),
        dynamic_axes=dict(chain(inputs.items(), outputs.items())),
    )

    # Quantize model
    if quantize:
        if not ONNX_RUNTIME:
            raise ImportError('onnxruntime is not available - install "pipeline" extra to enable')

        output = self.quantization(output)

    if isinstance(output, BytesIO):
        # Reset stream and return bytes
        output.seek(0)
        output = output.read()

    return output
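
The export call above derives `input_names`, `output_names` and `dynamic_axes` from two mappings returned by `self.parameters(task)`. The sketch below shows how that merge works in isolation. The tensor names and axis labels are assumptions chosen for illustration (a typical text classification layout); the actual values returned by `self.parameters` are not shown in this section.

```python
from collections import OrderedDict
from itertools import chain

# Assumed axis maps: tensor name -> {axis index: axis label} (illustrative only)
inputs = OrderedDict([
    ("input_ids", {0: "batch", 1: "sequence"}),
    ("attention_mask", {0: "batch", 1: "sequence"}),
])
outputs = OrderedDict([("logits", {0: "batch"})])

# Names are the mapping keys, in insertion order
input_names = list(inputs.keys())
output_names = list(outputs.keys())

# Merge both maps into the single dict passed as dynamic_axes
dynamic_axes = dict(chain(inputs.items(), outputs.items()))

for name, axes in dynamic_axes.items():
    print(name, axes)
```

Marking an axis as dynamic lets the exported graph accept varying sizes along it, such as different batch sizes or sequence lengths, instead of fixing the shape of the dummy input used during export.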
