MLflow
Deploying your models locally with MLflow.
The MLflow Model Deployer is one of the available flavors of the Model Deployer stack component. It is provided with the MLflow integration and can be used to deploy and manage MLflow models on a locally running MLflow server.
The MLflow Model Deployer is not yet available for use in production. This is a work in progress and will be available soon. At the moment it is only available for use in a local development environment.
When to use it?
MLflow is a popular open-source platform for machine learning. It's a great tool for managing the entire lifecycle of your machine learning models. One of the most important features of MLflow is the ability to package your model and its dependencies into a single artifact that can be deployed to a variety of deployment targets.
You should use the MLflow Model Deployer:
if you want to have an easy way to deploy your models locally and perform real-time predictions using the running MLflow prediction server.
if you are looking to deploy your models in a simple way without the need for a dedicated deployment environment like Kubernetes or advanced infrastructure configuration.
If you need a more complex deployment, for example one that runs on Kubernetes or other dedicated infrastructure, you should use one of the other Model Deployer flavors available in ZenML.
How do you deploy it?
The MLflow Model Deployer flavor is provided by the MLflow ZenML integration, so you need to install the integration on your local machine to be able to deploy your models. You can do this by running the following command:
zenml integration install mlflow -y
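To register the MLflow model deployer with ZenML you need to run the following command (mlflow_deployer is an arbitrary component name):
zenml model-deployer register mlflow_deployer --flavor=mlflow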
The ZenML integration will provision a local MLflow deployment server as a daemon process that will continue to run in the background to serve the latest MLflow model.
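Because the server keeps running in the background, any local client can send it prediction requests over HTTP. The following is a minimal sketch using only the Python standard library. It assumes the server exposes MLflow's /invocations scoring endpoint and accepts the JSON "inputs" payload format used by MLflow 2.x; the address in the usage comment and the helper names build_invocation_request and predict are hypothetical.

```python
import json
import urllib.request
from typing import Any, Sequence


def build_invocation_request(
    prediction_url: str, rows: Sequence[Sequence[float]]
) -> urllib.request.Request:
    """Build a POST request carrying `rows` in the JSON "inputs" format.

    Assumption: the server accepts {"inputs": [...]} payloads (MLflow 2.x).
    """
    payload = json.dumps({"inputs": [list(row) for row in rows]}).encode("utf-8")
    return urllib.request.Request(
        prediction_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(
    prediction_url: str, rows: Sequence[Sequence[float]], timeout: float = 10.0
) -> Any:
    """Send `rows` to the running prediction server and return the decoded JSON."""
    request = build_invocation_request(prediction_url, rows)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires a running server; the address below is a placeholder):
# predictions = predict("http://127.0.0.1:8000/invocations", [[1.0, 2.0, 3.0]])
```

Building the request separately from sending it makes the payload easy to inspect before a server is available.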
How do you use it?
Deploy a logged model
Following MLflow's documentation, if we want to deploy a model as a local inference server, we need the model to be logged in the MLflow experiment tracker first. Once the model is logged, we can reference it by a model URI, built either from the artifact path saved with the MLflow run or from the model name and version if the model is registered in the MLflow model registry.
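The two URI forms mentioned above can be sketched as follows. MLflow uses the runs:/ scheme for models logged under a run and the models:/ scheme for registered models; the helper names and the example run ID below are hypothetical and only illustrate how the strings are assembled.

```python
from typing import Union


def run_model_uri(run_id: str, artifact_path: str) -> str:
    """URI of a model logged under a run: runs:/<run_id>/<artifact_path>."""
    if not run_id or not artifact_path:
        raise ValueError("run_id and artifact_path must be non-empty")
    # Strip stray slashes so the path segment is joined exactly once.
    return f"runs:/{run_id}/{artifact_path.strip('/')}"


def registry_model_uri(name: str, version: Union[int, str]) -> str:
    """URI of a registered model version: models:/<name>/<version>."""
    if not name:
        raise ValueError("name must be non-empty")
    return f"models:/{name}/{version}"


# Placeholder identifiers, not taken from a real tracking server.
print(run_model_uri("0123456789abcdef", "model"))
print(registry_model_uri("my_classifier", 3))
```

Either string can then be passed wherever a model URI is expected, such as the model_uri option described under Configuration.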
In the following examples, we will show how to deploy a model using the MLflow Model Deployer, in two different scenarios:
We already know the logged model URI and we want to deploy it as a local inference server.
We don't know the logged model URI, since the model was logged in a previous step, and we want to deploy the model as a local inference server. ZenML provides a set of functionalities that make it easier to get the model URI from the current run and deploy it.
Configuration
Within the MLFlowDeploymentService you can configure:
name: The name of the deployment.
description: The description of the deployment.
pipeline_name: The name of the pipeline that deployed the MLflow prediction server.
pipeline_step_name: The name of the step that deployed the MLflow prediction server.
model_name: The name of the model that is deployed. When a model registry is used, the name must be a valid registered model name.
model_version: The version of the model that is deployed. When a model registry is used, the version must be a valid registered model version.
silent_daemon: Set to True to suppress the output of the daemon (i.e., redirect stdout and stderr to /dev/null). If False, the daemon output will be redirected to a log file.
blocking: Set to True to run the service in the context of the current process and block until the service is stopped, instead of running the service as a daemon process. Useful for operating systems that do not support daemon processes.
model_uri: The URI of the model to be deployed. This can be a local file path, a run ID, or a model name and version.
workers: The number of workers to be used by the MLflow prediction server.
mlserver: If True, the MLflow prediction server will be started as an MLServer instance.
timeout: The timeout in seconds to wait for the MLflow prediction server to start or stop.
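To make the options above concrete, here is a small stand-in that collects them in one place. DeploymentSettingsSketch is a hypothetical dataclass written for illustration, not ZenML's own class; the field names follow the list above, while the types, default values, and validation rules are assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DeploymentSettingsSketch:
    """Hypothetical container mirroring the documented options (not a ZenML class)."""

    name: str
    model_uri: str
    description: str = ""
    pipeline_name: Optional[str] = None
    pipeline_step_name: Optional[str] = None
    model_name: Optional[str] = None
    model_version: Optional[str] = None
    silent_daemon: bool = False  # assumed default: keep daemon output in a log file
    blocking: bool = False       # assumed default: run as a daemon process
    workers: int = 1             # assumed default
    mlserver: bool = False       # assumed default
    timeout: int = 60            # assumed default, in seconds

    def __post_init__(self) -> None:
        # Illustrative sanity checks; the real service may validate differently.
        if self.workers < 1:
            raise ValueError("workers must be at least 1")
        if self.timeout <= 0:
            raise ValueError("timeout must be a positive number of seconds")


# Placeholder values for a model logged under a run.
settings = DeploymentSettingsSketch(
    name="mlflow-model-deployment-example",
    description="An example of deploying a model with the MLflow Model Deployer",
    model_uri="runs:/0123456789abcdef/model",
    workers=2,
    timeout=120,
)
print(asdict(settings))
```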
Run inference on a deployed model
You can load a deployed model in Python and run inference against it in two ways:
Load a prediction service deployed in another pipeline
Within the same pipeline, use the service returned by the previous step to run inference, this time using the pre-built predict method.
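The second pattern, passing the service object to a later step and calling its predict method, can be sketched without any ZenML imports. In the sketch below, PredictionService is a hypothetical protocol that only assumes the service has a predict method as described above, FakeService is a stand-in that returns made-up scores so the example runs without a server, and predictor is an illustrative plain function rather than a real ZenML step.

```python
from typing import List, Protocol, Sequence


class PredictionService(Protocol):
    """Anything with a predict method; stands in for the deployed service."""

    def predict(self, request: Sequence[Sequence[float]]) -> List[List[float]]:
        ...


def predictor(
    service: PredictionService, data: Sequence[Sequence[float]]
) -> List[int]:
    """Run inference through the service and return the top class per row."""
    scores = service.predict(data)
    # Index of the highest score in each row (a plain-Python argmax).
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


class FakeService:
    """Stand-in service producing two made-up class scores per input row."""

    def predict(self, request: Sequence[Sequence[float]]) -> List[List[float]]:
        return [[sum(row), -sum(row)] for row in request]


labels = predictor(FakeService(), [[1.0, 2.0], [-3.0, 0.5]])
print(labels)  # [0, 1]
```

In a real pipeline the service argument would be the object returned by the deployment step, and the post-processing would depend on the model's output format.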
For more information and a full list of configurable attributes of the MLflow Model Deployer, check out the SDK Docs.