Unified Model Serving Framework


BentoML is a Python library for building online serving systems optimized for AI applications and model inference.

Start your BentoML journey

The BentoML documentation provides detailed guidance on the project with hands-on tutorials and examples. If you are a first-time user of BentoML, we recommend that you read the following documents in order:

Gain a basic understanding of the BentoML open-source framework, its workflow, installation, and a quickstart example.

Get started

Create different BentoML projects for common machine learning scenarios, like large language models, image generation, embeddings, speech recognition, and more.

Examples

Dive into BentoML’s features and advanced use cases, including GPU support, clients, monitoring, and performance optimization.

Guides

A fully managed platform for deploying and scaling BentoML in the cloud.

Get started

Stay informed

The BentoML team uses the following channels to announce important updates, such as major product releases, and to share tutorials, case studies, and community news.

To receive release notifications, star and watch the BentoML project on GitHub. For release notes and detailed changelogs, see the Releases page.