Packaging for deployment
BentoML provides a standardized format called Bentos for packaging all the components needed to run AI/ML services - from source code and Python dependencies to model artifacts and configurations. This packaging system ensures your AI services remain consistent and reproducible across different environments.
Bento build options
Build options refer to a set of configurations defined in a YAML file (typically named bentofile.yaml) for building a BentoML project into a Bento. The following is an example for Hello world.
service: 'service:Summarization'
labels:
  owner: bentoml-team
  project: gallery
include:
  - '*.py'
python:
  packages:
    - torch
    - transformers
Here are the available fields in this file.
service

service is a required field and points to where a Service object resides. It is often defined as service: "service:class-name".
- service: The Python module, namely the service.py file.
- class-name: The class-based Service's name created in service.py, decorated with @bentoml.service. If you have multiple Services in service.py, you can specify the main Service receiving user requests in bentofile.yaml. Other Services will be started together with this main Service.
description

description allows you to annotate your Bento with relevant documentation, which can be written in plain text or Markdown format. You can either provide the description directly in bentofile.yaml or reference an external file through a path.
service: "service:svc"
description: |
    ## Description For My Bento 🍱

    Use **any markdown syntax** here!

    > BentoML is awesome!
include:
...
service: "service:svc"
description: "file: ./README.md"
include:
...
For descriptions sourced from an external file, you can use either an absolute or relative path. Make sure the file exists at the specified path when the bentoml build command is run. For relative paths, the reference point is the build_ctx, which defaults to the directory from which bentoml build is executed.
labels

labels are key-value pairs associated with objects. In BentoML, both Bentos and models can have labels attached to them. These labels can serve various purposes, such as identifying or categorizing Bentos and models in BentoCloud. You can add or modify labels at any time.
labels:
  owner: bentoml-team
  stage: not-ready
include
ΒΆ
You use the include
field to include specific files when building the Bento. It supports wildcard characters and directory pattern matching. For example, setting it to *.py
means every Python file under the existing build_ctx
will be packaged into the Bento.
...
include:
  - "data/"
  - "**/*.py"
  - "config/*.json"
  - "path/to/a/file.csv"
If this field is not specified, BentoML includes all files under the build_ctx by default, excluding those explicitly set in the exclude field.
Note
Both include and exclude fields support gitignore-style pattern matching.
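To get a feel for what patterns like "**/*.py" select, here is a small, self-contained illustration using Python's pathlib glob, which also supports "**" recursion. This is only an approximation for experimentation, not BentoML's internal matcher; the file names are made up.

```python
import tempfile
from pathlib import Path

# Throwaway build context with a few files in it.
ctx = Path(tempfile.mkdtemp())
for rel in ["service.py", "utils/helpers.py", "config/app.json", "data/train.csv"]:
    file = ctx / rel
    file.parent.mkdir(parents=True, exist_ok=True)
    file.touch()

def select(pattern: str) -> list[str]:
    """Return the relative file paths under ctx matched by a glob pattern."""
    return sorted(p.relative_to(ctx).as_posix() for p in ctx.glob(pattern) if p.is_file())

print(select("**/*.py"))       # Python files at any depth
print(select("config/*.json")) # JSON files only under config/
```

Running this prints ['service.py', 'utils/helpers.py'] and ['config/app.json'], mirroring how the include patterns above would narrow down the packaged files.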
exclude

You use the exclude field to exclude specific files when building the Bento. This is useful when you have many files in the working directory, as you only need to specify the files to be ignored. When setting this field, you specify the file pathspecs (similar to .gitignore) that are relative to the build_ctx directory.
...
include:
  - "data/"
  - "**/*.py"
exclude:
  - "tests/"
  - "secrets.key"
Alternatively, create a .bentoignore file in the build_ctx directory as follows:
__pycache__/
*.py[cod]
*$py.class
.ipynb_checkpoints/
training_data/
Note
exclude is always applied after include.
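That ordering can be pictured as two passes over the candidate files: first include selects, then exclude removes. A minimal sketch with plain Python sets (illustrative only, not BentoML internals; the file names are invented):

```python
# All files under the build context.
files = {"service.py", "tests/test_api.py", "data/train.csv", "secrets.key"}

# Pass 1: keep what `include` selects (here: *.py plus the data/ directory).
included = {f for f in files if f.endswith(".py") or f.startswith("data/")}

# Pass 2: drop what `exclude` names (here: tests/ plus secrets.key).
result = {f for f in included if not (f.startswith("tests/") or f == "secrets.key")}
print(sorted(result))  # ['data/train.csv', 'service.py']
```

Note that secrets.key never survives pass 1 here, but listing it under exclude still protects you if a broader include pattern is added later.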
models

You can specify the models to be used for building a Bento using a string model tag or a dictionary. When you start from an existing project, you can download the models defined under the models field from BentoCloud to your local Model Store by running bentoml models pull. See the following example for details. If you don't define models in bentofile.yaml, the model specified in the Service is used to build the Bento.
models:
  - "summarization-model:latest" # A string model tag
  - tag: "summarization-model:version1" # A dictionary
    filter: "label:staging"
    alias: "summarization-model_v1"
- tag: The name and version of the model, separated by a colon.
- filter: This field uses the same filter syntax as in BentoCloud. You use a filter to list specific models, such as the models with the same label. You can add multiple comma-separated filters to a model.
- alias: An alias for the model. If this is specified, you can use it directly in code like bentoml.models.get(alias).
Python packages

You specify the required Python packages for a given Bento using the python.packages field. BentoML allows you to specify the desired version and install a package from a custom PyPI source or from a GitHub repository. If a package lacks a specific version, BentoML locks the versions of all Python packages for the current platform and Python version when building a Bento.
python:
  packages:
    - "numpy"
    - "matplotlib==3.5.1"
    - "package>=0.2,<0.3"
    - "torchvision==0.9.2"
    - "git+https://github.com/username/mylib.git@main"
Note
You don't need to specify bentoml as a dependency in this field since the current version of BentoML will be added to the list by default. However, you can override this by specifying a different BentoML version.
To include a package from a GitHub repository, use the pip requirements file format. You can specify the repository URL, the branch, tag, or commit to install from, and the subdirectory if the Python package is not in the root of the repository.
python:
  packages:
    # Install from a specific branch
    - "git+https://github.com/username/repository.git@branch_name"
    # Install from a specific tag
    - "git+https://github.com/username/repository.git@v1.0.0"
    # Install from a specific commit
    - "git+https://github.com/username/repository.git@abcdef1234567890abcdef1234567890abcdef12"
    # Install from a subdirectory
    - "git+https://github.com/username/repository.git@branch_name#subdirectory=package_dir"
If your project depends on a private GitHub repository, you can include the Python package from the repository via SSH. Make sure that the environment where BentoML is running has the appropriate SSH keys configured and that these keys are added to GitHub. In the following example, git@github.com:username/repository.git is the SSH URL for the repository.
python:
  packages:
    - "git+ssh://git@github.com/username/repository.git@branch_name"
If you already have a requirements.txt file that defines Python packages for your project, you may also supply a path to the requirements.txt file directly:
python:
  requirements_txt: "./project-a/ml-requirements.txt"
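The referenced file uses the standard requirements format: one specifier per line, with blank lines and # comments ignored. A quick sketch of how such a file is read (illustrative only; pip and BentoML handle many more forms, such as editable installs and environment markers):

```python
# A small requirements.txt-style document, inlined for the example.
requirements = """\
numpy
matplotlib==3.5.1
# pinned while we stay on the older CUDA toolchain
torchvision==0.9.2
"""

# Keep non-empty, non-comment lines as package specifiers.
specs = [
    line.strip()
    for line in requirements.splitlines()
    if line.strip() and not line.lstrip().startswith("#")
]
print(specs)  # ['numpy', 'matplotlib==3.5.1', 'torchvision==0.9.2']
```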
Pip install options

You can provide additional pip install arguments in the python field. If provided, these arguments will be applied to all packages defined in python.packages as well as the requirements_txt file.
python:
  requirements_txt: "./requirements.txt"
  index_url: "https://my.mirror.com/simple"
  no_index: False
  trusted_host:
    - "pypi.python.org"
    - "my.mirror.com"
  find_links:
    - "https://download.pytorch.org/whl/cu80/stable.html"
  extra_index_url:
    - "https://<other api token>:@my.mirror.com/pypi/simple"
    - "https://pypi.python.org/simple"
  pip_args: "--pre -U --force-reinstall"
Note
By default, BentoML caches pip artifacts across all local image builds to speed up the build process. If you want to force a re-download instead of using the cache, you can specify the pip_args: "--no-cache-dir" option in your bentofile.yaml file, or use the --no-cache option in the bentoml containerize command. For example:
$ bentoml containerize my_bento:latest --no-cache
PyPI package locking

By default, BentoML automatically locks all package versions, as well as all packages in their dependency graph, and generates a requirements.lock.txt file. This process uses pip-compile under the hood. If you have already specified a version for all packages, you can optionally disable this behavior by setting the lock_packages field to false:
python:
  requirements_txt: "requirements.txt"
  lock_packages: false
When including Python packages from GitHub repositories, use the pack_git_packages option (it defaults to true) to control whether these packages should be cloned and packaged during the build process. This is useful for dependencies that may not be available via standard PyPI sources, or for ensuring consistency with specific versions (for example, tags and commits) of a dependency directly from a Git repository.
python:
  pack_git_packages: true # Enable packaging of Git-based packages
  packages:
    - "git+https://github.com/username/repository.git@abcdef1234567890abcdef1234567890abcdef12"
Note that lock_packages controls whether the versions of all dependencies, not just those from Git, are pinned at the time of building the Bento. Disabling pack_git_packages will also disable package locking (lock_packages) unless it is explicitly set.
Note
BentoML always tries to lock package versions against the Linux x86_64 platform to match the deployment target. If the Bento contains dependencies or transitive dependencies with environment markers, they are resolved against the Linux x86_64 platform. For example, if the Bento requires torch, the nvidia-* packages will also be picked up in the final lock result even though they are only required on Linux x86_64.
If you want to build a Bento for a different platform, you can pass the --platform option to the bentoml build command with the name of the target platform. For example:
$ bentoml build --platform macos
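Environment markers are the mechanism behind this platform-dependent resolution: a requirement line may carry a condition that is evaluated against the target platform. A hand-rolled sketch of the single-comparison case (real resolvers evaluate the full PEP 508 grammar, for example via the packaging library; the package name below is only an illustration):

```python
def marker_applies(marker: str, env: dict[str, str]) -> bool:
    """Evaluate a marker of the form `name == 'value'` against an environment."""
    name, _, expected = marker.partition("==")
    return env.get(name.strip()) == expected.strip().strip("'\"")

# A line like "nvidia-cublas-cu12; platform_system == 'Linux'" only
# applies when the target platform is Linux.
print(marker_applies("platform_system == 'Linux'", {"platform_system": "Linux"}))   # True
print(marker_applies("platform_system == 'Linux'", {"platform_system": "Darwin"}))  # False
```

Because BentoML resolves against Linux x86_64 by default, markers like the one above evaluate to true at lock time even when you build on macOS.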
Python wheels

Python .whl files are also supported as a type of dependency to include in a Bento. Simply provide a path to your .whl files under the wheels field.
python:
  wheels:
    - ./lib/my_package.whl
If the wheel is hosted on a local network without TLS, you can indicate that the domain is safe to pip with the trusted_host field.
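A wheel's filename encodes where it can be installed, which is worth checking before bundling one into a Bento that targets Linux. A sketch of pulling those fields apart, covering only the simple five-field case (the optional build tag from the wheel spec is not handled, and the filename is made up):

```python
def parse_wheel_name(filename: str) -> dict[str, str]:
    """Split a simple wheel filename into the fields defined by the wheel spec."""
    name, version, python_tag, abi_tag, platform_tag = (
        filename.removesuffix(".whl").split("-")
    )
    return {
        "name": name,
        "version": version,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }

info = parse_wheel_name("my_package-1.0.0-py3-none-any.whl")
print(info["platform_tag"])  # 'any' -> pure Python, installable on any platform
```

A platform tag such as manylinux2014_x86_64 instead of any would signal a compiled wheel that only installs on Linux x86_64.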
Python options table

The following table provides a full list of available configurations for the python field.

| Field | Description |
|---|---|
| requirements_txt | The path to a custom requirements.txt file |
| packages | Packages to include in this Bento |
| lock_packages | Whether to lock the packages |
| index_url | Inputs for the --index-url pip argument |
| no_index | Whether to include the --no-index pip argument |
| trusted_host | List of trusted hosts used as inputs using the --trusted-host pip argument |
| find_links | List of links to find as inputs using the --find-links pip argument |
| extra_index_url | List of extra index URLs as inputs using the --extra-index-url pip argument |
| pip_args | Any additional pip arguments that you want to add when installing a package |
| wheels | List of paths to wheels to include in the Bento |
envs

Environment variables are important for managing configuration and secrets in a secure and flexible manner. They allow you to configure BentoML Services without hard-coding sensitive information, such as API keys, database credentials, or configurable parameters that might change between different environments. You set environment variables under the envs key in bentofile.yaml. Each environment variable is defined with name and value keys. For example:
envs:
  - name: "VAR_NAME"
    value: "value"
  - name: "API_KEY"
    value: "your_api_key_here"
The specified environment variables will be injected into the Bento container.
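Inside the container, your Service code reads these like any other process environment variables. A small sketch of reading one with a safe default (FEATURE_X is a made-up variable for this example, not a BentoML name; here we set it in-process to stand in for the injected value):

```python
import os

def env_flag(name: str, default: str = "false") -> bool:
    """Interpret an environment variable as a boolean switch."""
    return os.environ.get(name, default).strip().lower() in ("1", "true", "yes")

os.environ["FEATURE_X"] = "True"   # stand-in for a value injected into the Bento
print(env_flag("FEATURE_X"))       # True
print(env_flag("NOT_SET"))         # False (falls back to the default)
```

Reading configuration this way keeps the same Service code working locally, in the container, and on BentoCloud.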
Note
If you deploy your BentoML Service on BentoCloud, you can either set environment variables by specifying envs in bentofile.yaml or by using the --env flag when running bentoml deploy. See Environment variables for details.
conda

Conda dependencies can be specified under the conda field. For example:
conda:
  channels:
    - default
  dependencies:
    - h2o
  pip:
    - "scikit-learn==1.2.0"
- channels: Custom conda channels to use. If it is not specified, BentoML will use the community-maintained conda-forge channel as the default.
- dependencies: Custom conda dependencies to include in the environment.
- pip: The specific pip packages to include in the conda environment.
Optionally, you can export all dependencies from a pre-existing conda environment to an environment.yml file and provide this file in your bentofile.yaml file. If specified, this file overrides any other conda options.
To export a conda environment:
conda env export > environment.yml
To add it in your bentofile.yaml:
conda:
  environment_yml: "./environment.yml"
Note
Unlike Python packages, BentoML does not support locking conda package versions automatically. We recommend you specify a version in the configuration file.
See also
When conda options are provided, BentoML will select a Docker base image that comes with Miniconda pre-installed in the generated Dockerfile. Note that only the debian and alpine distros support conda. Learn more in the docker section below.
docker

BentoML makes it easy to deploy a Bento to a Docker container. It provides a set of options for customizing the Docker image generated from a Bento. The following docker field contains some basic Docker configurations:
docker:
  distro: debian
  python_version: "3.11"
  system_packages:
    - libblas-dev
    - liblapack-dev
    - gfortran
BentoML uses BuildKit, a cache-efficient builder toolkit, to containerize Bentos. BuildKit has shipped with Docker since version 18.09, so if you are using Docker via Docker Desktop, BuildKit is available by default. If you are using a standalone version of Docker, you can install BuildKit by following the instructions here.
The following sections provide detailed explanations of certain Docker configurations.
OS distros

The following OS distros are currently supported in BentoML:

- debian: The default value, similar to Ubuntu
- alpine: A minimal Docker image based on Alpine Linux
- ubi8: Red Hat Universal Base Image
- amazonlinux: Amazon Linux 2
Some of the distros may not support using conda or specifying CUDA for GPU. Here is the support matrix for all distros:

| Distro | Available Python Versions | Conda Support | CUDA Support (GPU) |
|---|---|---|---|
| debian | 3.7, 3.8, 3.9, 3.10 | Yes | Yes |
| alpine | 3.7, 3.8, 3.9, 3.10 | Yes | No |
| ubi8 | 3.8, 3.9 | No | Yes |
| amazonlinux | 3.7, 3.8 | No | No |
Setup script

For advanced Docker customization, you can also use the setup_script field to inject any script during the image build process. For example, with NLP projects you can pre-download NLTK data in the image by setting the following values.

In the bentofile.yaml file:
...
python:
  packages:
    - nltk
docker:
  setup_script: "./setup.sh"
In the setup.sh file:
#!/bin/bash
set -euxo pipefail
echo "Downloading NLTK data.."
python -m nltk.downloader all
Build a new Bento and then run bentoml containerize MY_BENTO --progress plain to view the Docker image build progress. The newly built Docker image will contain the pre-downloaded NLTK dataset.
Tip
When working with bash scripts, we recommend you add set -euxo pipefail at the beginning. Without set -e in particular, the script can fail silently without raising an exception during bentoml containerize. Learn more about the Bash Set builtin.
It is also possible to provide a Python script for initializing the Docker image. Here's an example:

In the bentofile.yaml file:
...
python:
  packages:
    - nltk
docker:
  setup_script: "./setup.py"
In the setup.py file:
#!/usr/bin/env python
import nltk
print("Downloading NLTK data..")
nltk.download('treebank')
Note
Pay attention to #!/bin/bash and #!/usr/bin/env python in the first line of the example scripts above. They are known as shebangs, and they are required in a setup script provided to BentoML.
Setup scripts are always executed after the specified Python packages, conda dependencies, and system packages are installed. Therefore, you can import and utilize those libraries in your setup script for the initialization process.
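Since a missing shebang is an easy mistake, you could add a quick pre-build check of your own. This helper is not part of BentoML, just a sketch (the demo writes a throwaway script file to check):

```python
import tempfile
from pathlib import Path

def has_shebang(script_path: Path) -> bool:
    """True if the script's first line starts with '#!'."""
    with open(script_path, encoding="utf-8") as f:
        return f.readline().startswith("#!")

# Demo with a throwaway setup script.
script = Path(tempfile.mkdtemp()) / "setup.sh"
script.write_text("#!/bin/bash\nset -euxo pipefail\necho ok\n")
print(has_shebang(script))  # True
```

Running such a check before bentoml build catches the problem earlier than a failed containerization.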
Docker options table

The following table provides a full list of available configurations for the docker field.

| Field | Description |
|---|---|
| distro | The OS distribution on the Docker image. It defaults to debian. |
| python_version | The Python version on the Docker image. It defaults to the Python version in the build environment. |
| cuda_version | Deprecated. The CUDA version on the Docker image for running models that require GPUs. When using PyTorch or TensorFlow to run models on GPUs, we recommend you directly install them along with their respective CUDA dependencies, using python.packages. |
| system_packages | The system packages that will be installed in the container. |
| setup_script | A Python or Shell script that will be executed during the Docker build process. |
| base_image | A user-provided Docker base image. This will override all other custom attributes of the image. |
| dockerfile_template | Customize the generated Dockerfile by providing a Jinja2 template that extends the default Dockerfile. |
Build a Bento

With a bentofile.yaml file, you build a Bento by running bentoml build. Note that this command is part of the bentoml deploy workflow. You should use this command only if you want to build a Bento without deploying it to BentoCloud.
$ bentoml build
Locking PyPI package versions.
Successfully built Bento(tag="summarization:lkpxx2u5o24wpxjr").
Possible next steps:
* Containerize your Bento with `bentoml containerize`:
$ bentoml containerize summarization:lkpxx2u5o24wpxjr [or bentoml build --containerize]
* Push to BentoCloud with `bentoml push`:
$ bentoml push summarization:lkpxx2u5o24wpxjr [or bentoml build --push]
Once built, each Bento is automatically tagged with a unique version. It is also possible to set a specific version using the --version option, but this is generally unnecessary. Only use it if your team has a very specific naming convention for deployable artifacts.
bentoml build --version 1.0.1
Custom build context

For projects that are part of a larger codebase and interact with other local Python modules, or those containing multiple Bentos/Services, it might not be possible to put all Service definition code and bentofile.yaml in the project's root directory. BentoML allows the placement of the Service definition and bentofile.yaml anywhere in the project directory.
In such scenarios, specify the build_ctx and bentofile arguments when running the bentoml build command.

- build_ctx: The build context represents the working directory of your Python project. It will be prepended to the PYTHONPATH during the build process, ensuring the correct import of local Python modules. By default, it's set to the current directory where the bentoml build command is executed.
- bentofile: It defaults to the bentofile.yaml file in the build context.
To customize their values, use the following:
bentoml build -f ./src/my_project_a/bento_fraud_detect.yaml ./src/
Structure

By default, all created Bentos are stored in the BentoML Bento Store, which is essentially a local directory. You can go to a specific Bento directory by running the following command:

cd $(bentoml get BENTO_TAG -o path)

Inside the directory, you might see different files and sub-directories depending on the configurations in bentofile.yaml. A typical Bento contains the following key sub-directories:
- src: Contains files specified in the include field of bentofile.yaml. These files are relative to the user Python code's CWD (current working directory), which makes importing relative modules and file paths inside user code possible.
- apis: Contains API definitions auto-generated from the Service's API specifications.
- env: Contains environment-related files for Bento initialization. These files are generated based on the build options specified in bentofile.yaml.
Warning
We do not recommend you change files in a Bento directly, unless it's for debugging purposes.
Containerize a Bento

To containerize a Bento with Docker, simply run bentoml containerize <bento_tag>. For example:
bentoml containerize summarization:latest
Note
For Mac computers with Apple silicon, you can specify the --platform option to avoid potential compatibility issues with some Python libraries.
bentoml containerize --platform=linux/amd64 summarization:latest
The Docker image's tag is the same as the Bento tag by default. View the created Docker image:
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
summarization lkpxx2u5o24wpxjr 79a06b402644 2 minutes ago 6.66GB
Run the Docker image locally:
docker run -it --rm -p 3000:3000 summarization:lkpxx2u5o24wpxjr serve
With the Docker image, you can run the model in any Docker-compatible environment.
If you prefer a serverless platform to build and operate AI applications, you can deploy Bentos to BentoCloud. It gives AI application developers a collaborative environment and a user-friendly toolkit to ship and iterate AI products.