Understand BentoCloud#

BentoCloud is a fully managed platform for building and operating AI applications. Developed by the original creators of BentoML, BentoCloud accelerates AI application development by providing workflows to deploy and scale everything from large language models (LLMs) to your custom machine learning (ML) models. It addresses deployment, scalability, and collaboration challenges across the AI application delivery lifecycle.


Why BentoCloud?#

BentoCloud provides the following major benefits:

BentoML integration#

Integrate easily with popular ML frameworks and MLOps tools, because BentoCloud is built on the BentoML open-source standard and ecosystem. This gives teams a unified approach to AI application development.


Offer a serverless environment that manages infrastructure complexity, enabling developers to concentrate solely on building and shipping AI applications.


Provide serverless autoscaling to handle varying traffic and minimize wasted resources. Scale-to-zero support reduces resource usage further when a Deployment receives no traffic.


Implement observability with insights into the health and performance of your Bento Deployments. Use built-in dashboards to monitor resource consumption and service statuses. You can also query logs for critical cluster events, and use filtering and breakdowns for troubleshooting and in-depth analysis.

Team collaboration#

Facilitate centralized storage and management of AI models. Team members can share, download, and iterate on models and Bentos, with rollback and roll-forward options for version control.


BentoCloud is available in the following two plans:


The Starter plan is designed for small teams of developers who want to focus on building AI applications without infrastructure management. With the autoscaling feature of BentoCloud, you only pay for the resources you use.


The Enterprise plan includes all the features offered in the Starter plan. It is tailored for teams that want to use BentoCloud in their own cloud or on-premises environment (BYOC), ensuring data security and compliance. If you prefer not to use your own cluster, we can provide a dedicated cloud environment for you. Either way, we take care of managing the infrastructure to ensure a scalable and secure model deployment experience.

Get started#

Getting started with BentoCloud is easy:

  1. Use the BentoML open-source framework to expose an ML model with a BentoML Service and package all the necessary files into a Bento.

  2. Upload and deploy the Bento to BentoCloud.

  3. When your Bento Deployment is up and running, manage the workload using the BentoCloud Console or the BentoML Command Line Interface (CLI).

To deploy your first AI application on BentoCloud, see Quickstart.
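
As a rough illustration of step 1 above, the following is a minimal sketch of a BentoML Service. It assumes BentoML 1.2 or later, where Services are declared with the `@bentoml.service` and `@bentoml.api` decorators; earlier releases use a different Service API. The names `word_count`, `WordCounter`, and `count` are hypothetical, and the word-counting function is only a stand-in for a real model. The import is guarded so that the placeholder logic can still be run where BentoML is not installed.

```python
# Sketch only: assumes BentoML 1.2+ (decorator-based Service API).
# word_count, WordCounter, and count are hypothetical names for illustration.


def word_count(text: str) -> int:
    """Stand-in for real model inference: count whitespace-separated words."""
    return len(text.split())


try:
    import bentoml
except ImportError:
    # BentoML is optional here so the placeholder logic runs on its own.
    bentoml = None

# Only define the Service when a decorator-based BentoML release is available.
if bentoml is not None and hasattr(bentoml, "service") and hasattr(bentoml, "api"):

    @bentoml.service(resources={"cpu": "1"})
    class WordCounter:
        """A Service exposing one API endpoint backed by the stand-in model."""

        @bentoml.api
        def count(self, text: str) -> int:
            return word_count(text)


if __name__ == "__main__":
    # Exercise the stand-in model directly, without starting a server.
    print(word_count("deploy models with BentoCloud"))
```

In a real project, the Service would load and call an actual model instead of `word_count`. Packaging it into a Bento and uploading it (steps 1 and 2) is done with BentoML tooling whose exact commands depend on the installed version, so follow Quickstart for those.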