ONNX
About this page
This is an API reference for ONNX in BentoML. Please refer to the ONNX guide for more information about how to use ONNX in BentoML.
- bentoml.onnx.save_model(name: Tag | str, model: onnx.ModelProto, *, signatures: dict[str, ModelSignatureDict] | dict[str, ModelSignature] | None = None, labels: dict[str, str] | None = None, custom_objects: dict[str, t.Any] | None = None, external_modules: t.List[ModuleType] | None = None, metadata: dict[str, t.Any] | None = None) -> bentoml.Model
Save an ONNX model instance to the BentoML model store.
- Parameters:
name (str) – The name to give to the model in the BentoML store. This must be a valid Tag name.
model (ModelProto) – The ONNX model to be saved.
signatures (dict[str, ModelSignatureDict], optional) – Signatures of predict methods to be used. If not provided, the signatures default to {"run": {"batchable": False}}. See ModelSignature for more details. bentoml.onnx internally uses onnxruntime.InferenceSession to run inference. When the original model is converted to ONNX format and loaded by onnxruntime.InferenceSession, the inference method of the original model is converted to the run method of the onnxruntime.InferenceSession. signatures here refers to the predict method of onnxruntime.InferenceSession, hence the only allowed method name in signatures is run.
labels (dict[str, str], optional) – A default set of management labels to be associated with the model. An example is {"training-set": "data-1"}.
custom_objects (dict[str, Any], optional) – Custom objects to be saved with the model. An example is {"my-normalizer": normalizer}. Custom objects are currently serialized with cloudpickle, but this implementation is subject to change.
external_modules (List[ModuleType], optional, defaults to None) – User-defined additional Python modules to be saved alongside the model or custom objects, e.g. a tokenizer module, preprocessor module, or model configuration module.
metadata (dict[str, Any], optional) – Metadata to be associated with the model. An example is {"bias": 4}. Metadata is intended for display in a model management UI and therefore must be a default Python type, such as str or int.
- Returns:
A BentoML model containing the saved ONNX model instance.
- Return type:
Model
Example:
    import os

    import bentoml
    import onnx
    import torch
    import torch.nn as nn

    class ExtendedModel(nn.Module):
        def __init__(self, D_in, H, D_out):
            # In the constructor we instantiate two nn.Linear modules and assign them as
            # member variables.
            super(ExtendedModel, self).__init__()
            self.linear1 = nn.Linear(D_in, H)
            self.linear2 = nn.Linear(H, D_out)

        def forward(self, x, bias):
            # In the forward function we accept a Tensor of input data and a bias
            h_relu = self.linear1(x).clamp(min=0)
            y_pred = self.linear2(h_relu)
            return y_pred + bias

    N, D_in, H, D_out = 64, 1000, 100, 1
    x = torch.randn(N, D_in)
    model = ExtendedModel(D_in, H, D_out)

    input_names = ["x", "bias"]
    output_names = ["output1"]

    tmpdir = "/tmp/model"
    os.makedirs(tmpdir, exist_ok=True)  # make sure the export directory exists
    model_path = os.path.join(tmpdir, "test_torch.onnx")
    torch.onnx.export(
        model,
        (x, torch.Tensor([1.0])),
        model_path,
        input_names=input_names,
        output_names=output_names,
    )

    # save_model expects an onnx.ModelProto, so load the exported file first
    onnx_model = onnx.load(model_path)
    bento_model = bentoml.onnx.save_model("onnx_model", onnx_model, signatures={"run": {"batchable": True}})
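The rule for the signatures parameter described above (default to a non-batchable run, and accept no method name other than run) can be illustrated with a small standalone sketch. The helper check_onnx_signatures and the constant DEFAULT_SIGNATURES are hypothetical names introduced only for this illustration; they are not part of BentoML, and the library's own validation may differ.

```python
from typing import Any, Dict, Optional

# Default documented for save_model: a single, non-batchable "run" method.
DEFAULT_SIGNATURES: Dict[str, Dict[str, Any]] = {"run": {"batchable": False}}


def check_onnx_signatures(
    signatures: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Hypothetical helper (NOT a BentoML API) mirroring the documented rule."""
    if signatures is None:
        # Return a copy so callers cannot mutate the module-level default.
        return {name: dict(opts) for name, opts in DEFAULT_SIGNATURES.items()}
    # onnxruntime.InferenceSession exposes inference through `run` only.
    unexpected = sorted(set(signatures) - {"run"})
    if unexpected:
        raise ValueError(f"only 'run' is allowed in signatures, got: {unexpected}")
    return signatures


print(check_onnx_signatures())
print(check_onnx_signatures({"run": {"batchable": True}}))
```

Running a check like this before calling save_model would surface a misnamed method (for example "predict") early, rather than at save time.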
- bentoml.onnx.load_model(bento_model: str | Tag | bentoml.Model, *, providers: ProvidersType | None = None, session_options: ort.SessionOptions | None = None) -> ort.InferenceSession
Load the ONNX model with the given tag from the local BentoML model store.
- Parameters:
bento_model (str | Tag | Model) – Either the tag of the model to get from the store, or a BentoML bentoml.Model instance to load the model from.
providers (List[Union[str, Tuple[str, Dict[str, Any]]]], optional, defaults to None) – Execution providers supplied by the user. By default BentoML will use ["CPUExecutionProvider"] when loading a model.
session_options (onnxruntime.SessionOptions, optional, defaults to None) – SessionOptions per use case.
- Returns:
An instance of ONNX Runtime inference session created using ONNX model loaded from the model store.
- Return type:
onnxruntime.InferenceSession
Example:
    import bentoml

    sess = bentoml.onnx.load_model("my_onnx_model")
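The providers argument takes a list whose entries are either a provider name or a (name, options) tuple. As a minimal sketch of building such a list, the helper build_providers below is a hypothetical name introduced for illustration only; "CUDAExecutionProvider" and its "device_id" option are ONNX Runtime names, and whether they are usable depends on the installed onnxruntime build.

```python
from typing import Any, Dict, List, Tuple, Union

# Same shape as the documented `providers` type.
Provider = Union[str, Tuple[str, Dict[str, Any]]]


def build_providers(use_cuda: bool = False, device_id: int = 0) -> List[Provider]:
    """Hypothetical helper (NOT part of BentoML or ONNX Runtime)."""
    providers: List[Provider] = []
    if use_cuda:
        # A (name, options) tuple passes provider-specific options.
        providers.append(("CUDAExecutionProvider", {"device_id": device_id}))
    # Keep the CPU provider last as a fallback.
    providers.append("CPUExecutionProvider")
    return providers


print(build_providers())
print(build_providers(use_cuda=True, device_id=1))

# Intended use (requires bentoml, onnxruntime, and a model in the store):
#   sess = bentoml.onnx.load_model("my_onnx_model", providers=build_providers(use_cuda=True))
```

Listing the CPU provider last is a common choice so that a session can still be created when the preferred provider is unavailable.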
- bentoml.onnx.get(tag_like: str | Tag) -> Model
Get the BentoML model with the given tag.
- Parameters:
tag_like – The tag of the model to retrieve from the model store.
- Returns:
A BentoML Model with the matching tag.
- Return type:
Model
Example:
    import bentoml

    # target model must be from the BentoML model store
    model = bentoml.onnx.get("onnx_resnet50")