This chapter introduces the key features of BentoML. We recommend you read Quickstart before diving into this chapter.

Understand the BentoML Service and its key components.

Customize the input and output type of BentoML Services.

Gain a general understanding of BentoCloud deployment.

Create an OCI-compliant image for your BentoML project and deploy it anywhere.

Understand BentoML workers and how to configure them.

Customize the build configurations of a Bento.

Use the BentoML local Model Store to manage your models in a unified way.

Configure GPUs to power your machine learning server with BentoML.

Compose multiple models in your BentoML project.

Create distributed Services for advanced use cases.

Set concurrency to enable your Service to handle multiple requests simultaneously.

Create tests to verify the functionality of your model and the operational aspects of your Service.

Use BentoML clients to interact with your Service.

Enable adaptive batching to batch requests for improved throughput and optimized resource use.

Understand observability in BentoML, including monitoring, logging, tracing, and metrics.

Integrate ASGI frameworks in a BentoML Service to provide additional features to exposed endpoints.

Customize the runtime behaviors of your Service.

Configure hooks to run custom logic at different stages of a Service’s lifecycle.