Guides

Read how-to guides to explore the key features of BentoCloud.

Create Deployments

Create a Bento Deployment on BentoCloud.

Configure Deployments

Customize the configuration of your Deployment, such as the number of replicas, environment variables, and instance types.

Manage Deployments

Manage the Deployment lifecycle using the BentoML CLI or API.

Call Deployment endpoints

Run inference with Deployments.

Autoscaling

Configure concurrency and autoscaling to achieve optimal resource utilization and cost-efficiency for your AI workloads.

Manage access tokens

Create and use API tokens to log in to BentoCloud or access protected Deployments.

Manage secrets

Store sensitive data like credentials in pre-defined secret templates or create custom secrets.

Manage users

Implement custom access control for BentoCloud users.

Batch inference jobs

Run batch inference jobs with BentoML and BentoCloud.

Bring Your Own Cloud

BentoCloud's Bring Your Own Cloud (BYOC) deployment option helps you run AI applications securely and cost-effectively in your own environment.
