ONNX
About this page
This is an API reference for ONNX in BentoML. Please refer to the ONNX guide for more information about how to use ONNX in BentoML.
- bentoml.onnx.save_model(name: Tag | str, model: onnx.ModelProto, *, signatures: dict[str, ModelSignatureDict] | dict[str, ModelSignature] | None = None, labels: dict[str, str] | None = None, custom_objects: dict[str, t.Any] | None = None, external_modules: t.List[ModuleType] | None = None, metadata: dict[str, t.Any] | None = None) → bentoml.Model
Save an ONNX model instance to the BentoML model store.
- Parameters:
    - name (str) – The name to give to the model in the BentoML store. This must be a valid Tag name.
    - model (ModelProto) – The ONNX model to be saved.
    - signatures (dict[str, ModelSignatureDict], optional) – Signatures of predict methods to be used. If not provided, the signatures default to {"run": {"batchable": False}}. See ModelSignature for more details. bentoml.onnx internally uses onnxruntime.InferenceSession to run inference. When the original model is converted to ONNX format and loaded by onnxruntime.InferenceSession, the inference method of the original model becomes the run method of the onnxruntime.InferenceSession. signatures here therefore refers to the predict method of onnxruntime.InferenceSession, and the only allowed method name in signatures is run (see the sketch after the example below).
    - labels (dict[str, str], optional) – A default set of management labels to be associated with the model. An example is {"training-set": "data-1"}.
    - custom_objects (dict[str, Any], optional) – Custom objects to be saved with the model. An example is {"my-normalizer": normalizer}. Custom objects are currently serialized with cloudpickle, but this implementation is subject to change.
    - external_modules (List[ModuleType], optional, default to None) – User-defined additional Python modules to be saved alongside the model or custom objects, e.g. a tokenizer module, preprocessor module, or model configuration module.
    - metadata (dict[str, Any], optional) – Metadata to be associated with the model. An example is {"bias": 4}. Metadata is intended for display in a model management UI and therefore must be a default Python type, such as str or int.
- Returns:
A BentoML Model containing the saved ONNX model instance.
- Return type:
Model
Example:
import os

import bentoml
import onnx
import torch
import torch.nn as nn


class ExtendedModel(nn.Module):
    def __init__(self, D_in, H, D_out):
        # In the constructor we instantiate two nn.Linear modules and assign them as
        # member variables.
        super(ExtendedModel, self).__init__()
        self.linear1 = nn.Linear(D_in, H)
        self.linear2 = nn.Linear(H, D_out)

    def forward(self, x, bias):
        # In the forward function we accept a Tensor of input data and an optional bias
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred + bias


N, D_in, H, D_out = 64, 1000, 100, 1
x = torch.randn(N, D_in)
model = ExtendedModel(D_in, H, D_out)

input_names = ["x", "bias"]
output_names = ["output1"]

tmpdir = "/tmp/model"
os.makedirs(tmpdir, exist_ok=True)
model_path = os.path.join(tmpdir, "test_torch.onnx")
torch.onnx.export(
    model,
    (x, torch.Tensor([1.0])),
    model_path,
    input_names=input_names,
    output_names=output_names,
)

# save_model expects an onnx.ModelProto, so load the exported file first.
onnx_model = onnx.load(model_path)
bento_model = bentoml.onnx.save_model(
    "onnx_model",
    onnx_model,
    signatures={"run": {"batchable": True}},
)
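Because run is the only allowed signature name, inference on the saved model always goes through that method. Below is a minimal sketch of calling it through a runner; it assumes the "onnx_model" saved above, and the runner.init_local() call is for local debugging only (BentoML manages runner lifecycles inside a service):

import bentoml
import numpy as np

# Retrieve the saved model and wrap it in a runner. The "run" signature
# declared in save_model becomes the runner's callable method.
runner = bentoml.onnx.get("onnx_model:latest").to_runner()
runner.init_local()  # local debugging only

# Inputs follow the input_names used at export time ("x" and "bias").
x = np.random.randn(1, 1000).astype(np.float32)
bias = np.array([1.0], dtype=np.float32)

# Dispatches to onnxruntime.InferenceSession.run under the hood.
y_pred = runner.run.run(x, bias)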
- bentoml.onnx.load_model(bento_model: str | Tag | bentoml.Model, *, providers: ProvidersType | None = None, session_options: ort.SessionOptions | None = None) → ort.InferenceSession
Load the ONNX model with the given tag from the local BentoML model store.
- Parameters:
    - bento_model (str | Tag | Model) – Either the tag of the model to get from the store, or a BentoML Model instance to load the model from.
    - providers (List[Union[str, Tuple[str, Dict[str, Any]]]], optional, default to None) – Execution providers to use for inference. By default BentoML will use ["CPUExecutionProvider"] when loading a model (see the sketch after the example below).
    - session_options (onnxruntime.SessionOptions, optional, default to None) – SessionOptions per use case. If not specified, defaults to None.
- Returns:
An ONNX Runtime inference session created from the ONNX model loaded from the model store.
- Return type:
onnxruntime.InferenceSession
Example:
import bentoml

sess = bentoml.onnx.load_model("my_onnx_model")
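Both keyword arguments are forwarded to onnxruntime when the InferenceSession is created. The following is a minimal sketch of tuning the session; it assumes a model saved under "my_onnx_model" and that the listed execution providers are available in your onnxruntime build:

import bentoml
import onnxruntime as ort

# Illustrative session tuning; adjust to your workload.
opts = ort.SessionOptions()
opts.intra_op_num_threads = 2
opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

# Prefer CUDA when available, falling back to the CPU provider.
sess = bentoml.onnx.load_model(
    "my_onnx_model",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    session_options=opts,
)
print(sess.get_providers())  # providers actually in use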
- bentoml.onnx.get(tag_like: str | Tag) → Model
Get the BentoML model with the given tag.
- Parameters:
tag_like β The tag of the model to retrieve from the model store.
- Returns:
A BentoML Model with the matching tag.
- Return type:
Model
Example:
import bentoml

# target model must be from the BentoML model store
model = bentoml.onnx.get("onnx_resnet50")
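The returned Model gives access to everything stored by save_model and is the usual entry point for building a runner. A minimal sketch, assuming the "onnx_resnet50" model from the example above:

import bentoml

model = bentoml.onnx.get("onnx_resnet50")

# Inspect what was stored alongside the model weights.
print(model.tag)
print(model.info.metadata)
print(model.custom_objects)

# Build a runner for use inside a bentoml.Service.
runner = model.to_runner()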