How to deploy your models locally with MLflow
This is an older version of the ZenML documentation. To read and view the latest version please visit this up-to-date URL.
The MLflow Model Deployer is one of the available flavors of the Model Deployer stack component. Provided by the MLflow integration, it can be used to deploy and manage MLflow models on a locally running MLflow server.
The MLflow Model Deployer is not yet available for use in production. This is a work in progress and will be available soon. At the moment it is only available for use in a local development environment.

When to use it?

MLflow is a popular open-source platform for machine learning. It's a great tool for managing the entire lifecycle of your machine learning models. One of the most important features of MLflow is the ability to package your model and its dependencies into a single artifact that can be deployed to a variety of deployment targets.
You should use the MLflow Model Deployer:
  • if you want to have an easy way to deploy your models locally and perform real-time predictions using the running MLflow prediction server.
  • if you are looking to deploy your models in a simple way without the need for a dedicated deployment environment like Kubernetes or advanced infrastructure configuration.
If you need a production-grade or more complex deployment, you should use one of the other Model Deployer flavors available in ZenML (e.g. Seldon Core or KServe).

How do you deploy it?

The MLflow Model Deployer flavor is provided by the MLflow ZenML integration. You need to install the integration on your local machine to be able to deploy your models, which you can do by running the following command:
zenml integration install mlflow -y
To register the MLflow model deployer with ZenML you need to run the following command:
zenml model-deployer register mlflow_deployer --flavor=mlflow
The ZenML integration will provision a local MLflow deployment server as a daemon process that will continue to run in the background to serve the latest MLflow model.
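Because the prediction server runs as a background daemon, it can be useful to check from Python whether anything is listening on its port. The sketch below uses only the standard library and is not part of ZenML. The host and port in the usage example are placeholders; the real address depends on your local deployment.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection raises an OSError subclass on refusal or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 127.0.0.1:8000 is a placeholder address, not a value taken from ZenML.
    print(is_port_open("127.0.0.1", 8000))
```

A successful connection only shows that some process accepts TCP connections on that port. It does not confirm that the process is the MLflow server or that a model is loaded.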

How do you use it?

The first step to deploying and using your MLflow model is to create a Service deployment from code. This is done by setting the parameters that the MLflow deployment step requires.
from zenml.steps import BaseStepConfig
from zenml.integrations.mlflow.steps import mlflow_deployer_step
from zenml.integrations.mlflow.steps import MLFlowDeployerConfig

class MLFlowDeploymentLoaderStepConfig(BaseStepConfig):
    """MLflow deployment getter configuration

    pipeline_name: name of the pipeline that deployed the MLflow prediction server
    step_name: the name of the step that deployed the MLflow prediction server
    running: when this flag is set, the step only returns a running service
    """
    pipeline_name: str
    step_name: str
    running: bool = True

model_deployer = mlflow_deployer_step(name="model_deployer")

# Initialize a continuous deployment pipeline run
# (continuous_deployment_pipeline is your own pipeline definition, not shown here)
deployment = continuous_deployment_pipeline(
    ...,  # the other pipeline steps are elided
    # as a last step to our pipeline the model deployer step is run with its config in place
    model_deployer=model_deployer(config=MLFlowDeployerConfig(workers=3)),  # workers=3 is an illustrative value
)
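The `running` flag in the loader configuration is easiest to understand with a small stand-alone sketch. Everything in it (`ServiceRecord`, `find_last_service`) is hypothetical and written only for illustration. It is not ZenML's implementation, and it assumes that records are ordered from oldest to newest.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServiceRecord:
    """Hypothetical stand-in for a deployed prediction service."""

    pipeline_name: str
    step_name: str
    is_running: bool


def find_last_service(
    records: List[ServiceRecord],
    pipeline_name: str,
    step_name: str,
    running: bool = True,
) -> Optional[ServiceRecord]:
    """Return the newest record matching the pipeline and step, or None.

    When `running` is True, records whose service is stopped are skipped.
    """
    for record in reversed(records):  # newest first
        if record.pipeline_name != pipeline_name or record.step_name != step_name:
            continue
        if running and not record.is_running:
            continue
        return record
    return None
```

With `running=True`, a lookup can return an older service that is still running in preference to a newer one that has stopped. It returns `None` when no matching service is running, which is the case a loader step would typically turn into an error.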
You can run predictions on the deployed model with something like:
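import numpy as np
from zenml.integrations.mlflow.services import MLFlowDeploymentService
from zenml.steps import BaseStepConfig, Output, StepContext, step
from zenml.services import load_last_service_from_step

class MLFlowDeploymentLoaderStepConfig(BaseStepConfig):
    # see implementation above
    ...

# Step to retrieve the service associated with the last pipeline run
@step(enable_cache=False)  # caching is disabled so the service is looked up on every run
def prediction_service_loader(
    config: MLFlowDeploymentLoaderStepConfig, context: StepContext
) -> MLFlowDeploymentService:
    """Get the prediction service started by the deployment pipeline"""
    service = load_last_service_from_step(
        pipeline_name=config.pipeline_name,
        step_name=config.step_name,
        running=config.running,
    )
    if not service:
        raise RuntimeError(
            f"No MLflow prediction service deployed by the "
            f"{config.step_name} step in the {config.pipeline_name} pipeline "
            f"is currently running."
        )
    return service

# Use the service for inference
@step
def predictor(
    service: MLFlowDeploymentService,
    data: np.ndarray,
) -> Output(predictions=np.ndarray):
    """Run an inference request against a prediction service"""
    service.start(timeout=10)  # should be a NOP if already started
    prediction = service.predict(data)
    prediction = prediction.argmax(axis=-1)  # index of the highest-scoring class
    return prediction

# Initialize an inference pipeline run
# (inference_pipeline is your own pipeline; the keyword names below are assumed to match its step arguments)
inference = inference_pipeline(
    ...,  # the other pipeline steps are elided
    prediction_service_loader=prediction_service_loader(
        config=MLFlowDeploymentLoaderStepConfig(
            pipeline_name="continuous_deployment_pipeline",
            step_name="model_deployer",
        )
    ),
    predictor=predictor(),
)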
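To make the prediction step more concrete, here is a minimal sketch of the two data transformations around a prediction request: serializing a batch of inputs to JSON, and reducing per-class scores to class indices, as `argmax(axis=-1)` does. It uses only the standard library and makes no network call. The `"instances"` key is an assumption about the request format the MLflow server accepts, and both helper names are hypothetical.

```python
import json
from typing import List, Sequence


def build_invocation_payload(batch: Sequence[Sequence[float]]) -> str:
    """Serialize a batch of feature rows as a JSON body under the "instances" key (assumed format)."""
    return json.dumps({"instances": [list(row) for row in batch]})


def argmax_last_axis(scores: Sequence[Sequence[float]]) -> List[int]:
    """Return the index of the highest score in each row; ties resolve to the first index."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


if __name__ == "__main__":
    print(build_invocation_payload([[0.1, 0.2], [0.3, 0.4]]))
    print(argmax_last_axis([[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]]))
```

In the pipeline itself, `service.predict(data)` handles the request for you; the sketch only illustrates what happens to the data on the way in and out.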
You can check the MLflow deployment example for more details.
For more information and a full list of configurable attributes of the MLflow Model Deployer, check out the API Docs.