BentoML
Deploying your models locally with BentoML.
BentoML is an open-source framework for machine learning model serving. it can be used to deploy models locally, in a cloud environment, or in a Kubernetes environment.
The BentoML Model Deployer is one of the available flavors of the Model Deployer stack component. Provided with the BentoML integration it can be used to deploy and manage BentoML models or Bento on a local running HTTP server.
The BentoML Model Deployer can be used to deploy models for local development and production use cases. While the integration mainly works in a local environment where pipelines are run, the used Bento can be exported and containerized, and deployed in a remote environment. Within the BentoML ecosystem, Yatai and bentoctl
are the tools responsible for deploying the Bentos into the Kubernetes cluster and Cloud Platforms. Full support for these advanced tools is in progress and will be available soon.
When to use it?
You should use the BentoML Model Deployer to:
Standardize the way you deploy your models to production within your organization.
if you are looking to deploy your models in a simple way, while you are still able to transform your model into a production-ready solution when that time comes.
If you are looking to deploy your models with other Kubernetes-based solutions, you can take a look at one of the other Model Deployer Flavors available in ZenML.
BentoML also allows you to deploy your models in a more complex production-grade setting. Bentoctl is one of the tools that can help you get there. Bentoctl takes your built Bento from a ZenML pipeline and deploys it with bentoctl
into a cloud environment such as AWS Lambda, AWS SageMaker, Google Cloud Functions, Google Cloud AI Platform, or Azure Functions. Read more about this in the From Local to Cloud with bentoctl
section.
The bentoctl
integration implementation is still in progress and will be available soon. The integration will allow you to deploy your models to a specific cloud provider with just a few lines of code using ZenML built-in steps.
How do you deploy it?
Within ZenML you can quickly get started with BentoML by simply creating Model Deployer Stack Component with the BentoML flavor. To do so you'll need to install the required Python packages on your local machine to be able to deploy your models:
To register the BentoML model deployer with ZenML you need to run the following command:
The ZenML integration will provision a local HTTP deployment server as a daemon process that will continue to run in the background to serve the latest models and Bentos.
How do you use it?
The recommended flow to use the BentoML model deployer is to first create a BentoML Service, then use the bento_builder_step
to build the model and service into a bento bundle, and finally deploy the bundle with the bentoml_model_deployer_step
.
BentoML Service and Runner
The first step to being able to deploy your models and use BentoML is to create a bento service which is the main logic that defines how your model will be served, and a bento runner which represents a unit of execution for your model on a remote Python worker.
The following example shows how to create a basic bento service and runner that will be used to serve a basic scikit-learn model.
ZenML Bento Builder step
Once you have your bento service and runner defined, we can use the built-in bento builder step to build the bento bundle that will be used to serve the model. The following example shows how can call the built-in bento builder step within a ZenML pipeline.
The Bento Builder step can be used in any orchestration pipeline that you create with ZenML. The step will build the bento bundle and save it to the used artifact store. Which can be used to serve the model in a local setting using the BentoML Model Deployer Step, or in a remote setting using the bentoctl
or Yatai. This gives you the flexibility to package your model in a way that is ready for different deployment scenarios.
ZenML BentoML Deployer step
We have now built our bento bundle, and we can use the built-in bentoml_model_deployer_step
to deploy the bento bundle to our local HTTP server. The following example shows how to call the built-in bento deployer step within a ZenML pipeline.
Note: the bentoml_model_deployer_step
can only be used in a local environment.
ZenML BentoML Pipeline examples
Once all the steps have been defined, we can create a ZenML pipeline and run it. The bento builder step expects to get the trained model as an input, so we need to make sure either we have a previous step that trains the model and outputs it or loads the model from a previous run. Then the deployer step expects to get the bento bundle as an input, so we need to make sure either we have a previous step that builds the bento bundle and outputs it or load the bento bundle from a previous run or external source.
The following example shows how to create a ZenML pipeline that trains a model, builds a bento bundle, and deploys it to a local HTTP server.
In more complex scenarios, you might want to build a pipeline that trains a model and builds a bento bundle in a remote environment. Then creates a new pipeline that retrieves the bento bundle and deploys it to a local http server, or to a cloud provider. The following example shows a pipeline example that does exactly that.
Predicting with the local deployed model
Once the model has been deployed we can use the BentoML client to send requests to the deployed model. ZenML will automatically create a BentoML client for you and you can use it to send requests to the deployed model by simply calling the service to predict the method and passing the input data and the API function name.
The following example shows how to use the BentoML client to send requests to the deployed model.
Deploying and testing locally is a great way to get started and test your model. However, a real-world scenario will most likely require you to deploy your model to a remote environment. The next section will show you how to deploy the Bento you built with ZenML pipelines to a cloud environment using the bentoctl
CLI.
From Local to Cloud with bentoctl
bentoctl
Bentoctl helps deploy any machine learning models as production-ready API endpoints into the cloud. It is a command line tool that provides a simple interface to manage your BentoML bundles.
The bentoctl
CLI provides a list of operators which are plugins that interact with cloud services, some of these operators are:
To deploy your BentoML bundle to the cloud, you need to install the bentoctl
CLI and the operator plugin for the cloud service you want to deploy to.
Once you have the bentoctl
CLI and the operator plugin installed, you can use the bentoctl
CLI to deploy your BentoML bundle to the cloud.
For more information and a full list of configurable attributes of the BentoML Model Deployer, check out the SDK Docs .
Last updated