Seldon
Deploying models to Kubernetes with Seldon Core.
Seldon Core is a production-grade, source-available model serving platform for Kubernetes. It comes equipped with a set of built-in model server implementations designed to work with standard formats for packaging ML models, which greatly simplify the process of serving models for real-time inference.
The Seldon Core model deployer integration is currently not supported on macOS.
When to use it?
You should use the Seldon Core Model Deployer:
If you are looking to deploy your model on a more advanced infrastructure like Kubernetes.
If you want to handle the lifecycle of the deployed model with no downtime, including updating the runtime graph, scaling, monitoring, and security.
If you are looking for more advanced API endpoints to interact with the deployed model, including REST and gRPC endpoints.
If you want more advanced deployment strategies like A/B testing, canary deployments, and more.
How to deploy it?
ZenML provides a Seldon Core flavor built on top of the Seldon Core integration that allows you to deploy and use your models in a production-grade environment. In order to use the integration, you need to install it on your local machine so that you can register a Seldon Core Model Deployer with ZenML and add it to your stack:
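For example, using the ZenML CLI (the -y flag skips the confirmation prompt):

```bash
zenml integration install seldon -y
```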
To deploy and make use of the Seldon Core integration, you need the following prerequisites: access to a Kubernetes cluster with Seldon Core installed and running on it (see the installation example below).
Since the Seldon Model Deployer interacts with the Seldon Core model server deployed on a Kubernetes cluster, you need to provide a set of configuration parameters. These parameters are:
kubernetes_namespace: the Kubernetes namespace where the Seldon Core deployment servers are provisioned and managed by ZenML. If not specified, the namespace set in the current configuration is used.
base_url: the base URL of the Kubernetes ingress used to expose the Seldon Core deployment servers.
Infrastructure Deployment
The Seldon Model Deployer can be deployed directly from the ZenML CLI:
Seldon Core Installation Example
Configure EKS cluster access locally, e.g.:
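For example, using the AWS CLI (region and cluster name below are placeholders):

```bash
aws eks --region us-east-1 update-kubeconfig \
    --name my-eks-cluster --alias my-eks-cluster
```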
Install Istio 1.5.0 (required for the latest Seldon Core version):
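A sketch following the official Istio release instructions; for the 1.5.x line, istioctl manifest apply is the install command (istioctl install was only added in later versions):

```bash
# download and unpack the Istio 1.5.0 release
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.5.0 sh -
cd istio-1.5.0
export PATH=$PWD/bin:$PATH

# install Istio into the cluster with the demo profile
istioctl manifest apply --set profile=demo
```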
Set up an Istio gateway for Seldon Core:
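For example, the gateway manifest from the Seldon Core documentation, applied with kubectl:

```bash
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: seldon-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
EOF
```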
Install Seldon Core:
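Using the official Helm chart, as described in the Seldon Core installation docs:

```bash
kubectl create namespace seldon-system

helm install seldon-core seldon-core-operator \
    --repo https://storage.googleapis.com/seldon-charts \
    --set usageMetrics.enabled=true \
    --set istio.enabled=true \
    --namespace seldon-system
```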
Test that the installation is functional by deploying a sample model, with iris.yaml defined as follows:
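The manifest below is adapted from Seldon Core's public sklearn iris example (the modelUri version path is an assumption and may differ across Seldon releases):

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: iris-model
  namespace: default
spec:
  name: iris
  predictors:
    - name: default
      replicas: 1
      graph:
        name: classifier
        implementation: SKLEARN_SERVER
        modelUri: gs://seldon-models/v1.14.0-dev/sklearn/iris
```

Apply it to the cluster:

```bash
kubectl apply -f iris.yaml
```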
Then extract the URL where the model server exposes its prediction API:
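A sketch assuming the Istio ingress is backed by a cloud load balancer; on EKS the address is exposed as a hostname (on GKE, use ip instead of hostname in the jsonpath expression):

```bash
export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
export MODEL_URL=http://$INGRESS_HOST/seldon/default/iris-model/api/v1.0/predictions
```

The /seldon/<namespace>/<deployment-name>/api/v1.0/predictions path follows Seldon Core's standard prediction endpoint pattern.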
And use curl to send a test prediction API request to the server:
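For example (the four values match the iris feature vector expected by the sample model):

```bash
curl -X POST $MODEL_URL \
    -H 'Content-Type: application/json' \
    -d '{ "data": { "ndarray": [[5.1, 3.5, 1.4, 0.2]] } }'
```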
Using a Service Connector
Depending on where your target Kubernetes cluster is running, you can use one of the following Service Connectors:
If you don't already have a Service Connector configured in your ZenML deployment, you can register one using the interactive CLI command. You have the option to configure a Service Connector that can be used to access more than one Kubernetes cluster or even more than one type of cloud resource:
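A sketch of the interactive registration; the connector name is a placeholder, and the command prompts for the connector type, authentication method, and credentials:

```bash
zenml service-connector register my_k8s_connector -i
```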
Alternatively, you can configure a Service Connector through the ZenML dashboard.
If you already have one or more Service Connectors configured in your ZenML deployment, you can check which of them can be used to access the Kubernetes cluster that you want to use for your Seldon Core Model Deployer by running e.g.:
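For example, listing the connectors that can access Kubernetes clusters:

```bash
zenml service-connector list-resources --resource-type kubernetes-cluster
```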
After having set up or decided on a Service Connector to use to connect to the target Kubernetes cluster where Seldon Core is installed, you can register the Seldon Core Model Deployer as follows:
A non-interactive version that connects the Seldon Core Model Deployer to a target Kubernetes cluster through a Service Connector:
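A sketch, assuming placeholder names for the deployer, namespace, and connector; INGRESS_HOST is the Istio ingress URL obtained as shown in the installation example above:

```bash
zenml model-deployer register seldon_deployer --flavor=seldon \
    --kubernetes_namespace=zenml-workloads \
    --base_url=http://$INGRESS_HOST

zenml model-deployer connect seldon_deployer --connector my_k8s_connector
```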
A similar experience is available when you configure the Seldon Core Model Deployer through the ZenML dashboard.
Managing Seldon Core Authentication
The Seldon Core Model Deployer requires access to the persistent storage where models are located. In most cases, you will use the Seldon Core model deployer to serve models that are trained through ZenML pipelines and stored in the ZenML Artifact Store, which implies that the Seldon Core model deployer needs to access the Artifact Store.
When explicit credentials are configured in the Artifact Store, the Seldon Core Model Deployer doesn't need any additional configuration and will use those credentials automatically to authenticate to the same persistent storage service used by the Artifact Store. If the Artifact Store doesn't have explicit credentials configured, then Seldon Core will default to using whatever implicit authentication method is available in the Kubernetes cluster where it is running. For example, in AWS this means using the IAM role attached to the EC2 or EKS worker nodes, and in GCP this means using the service account attached to the GKE worker nodes.
If the Artifact Store used in combination with the Seldon Core Model Deployer in the same ZenML stack does not have explicit credentials configured, then the Seldon Core Model Deployer might not be able to authenticate to the Artifact Store which will cause the deployed model servers to fail.
To avoid this, we recommend that you use Artifact Stores with explicit credentials in the same stack as the Seldon Core Model Deployer. Alternatively, if you're running Seldon Core in one of the cloud providers, you should configure implicit authentication for the Kubernetes nodes.
If you want to use a custom persistent storage with Seldon Core, or if you prefer to manually manage the authentication credentials attached to the Seldon Core model servers, you can use the approach described in the next section.
Advanced: Configuring a Custom Seldon Core Secret
Instead of relying on the Artifact Store credentials, you can create a custom Kubernetes secret containing the credentials for your persistent storage service and configure the Seldon Core Model Deployer to attach it to the model servers it manages. This method is not recommended, because it limits the Seldon Core model deployer to a single persistent storage service, whereas using the Artifact Store credentials gives you more flexibility in combining the Seldon Core model deployer with any Artifact Store in the same ZenML stack.
How do you use it?
Requirements
To run pipelines that deploy models to Seldon, you need the following tools installed locally:
Stack Component Registration
For registering the model deployer, we need the URL of the Istio Ingress Gateway deployed on the Kubernetes cluster. We can get this URL by running the following command (assuming that the service name is istio-ingressgateway, deployed in the istio-system namespace):
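A sketch of the lookup; on clusters where the load balancer exposes a hostname rather than an IP (e.g. EKS), use ...ingress[0].hostname instead of ...ingress[0].ip:

```bash
export INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $INGRESS_HOST
```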
Now register the model deployer:
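A sketch using the configuration parameters described earlier; the deployer name, Kubernetes context, and namespace are placeholders:

```bash
zenml model-deployer register seldon_deployer --flavor=seldon \
    --kubernetes_context=my-eks-cluster \
    --kubernetes_namespace=zenml-workloads \
    --base_url=http://$INGRESS_HOST
```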
We can now use the model deployer in our stack.
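For example, registering a stack that includes the deployer; the other component names are placeholders, and -o, -a, and -d select the orchestrator, artifact store, and model deployer:

```bash
zenml stack register seldon_stack \
    -o default \
    -a my_artifact_store \
    -d seldon_deployer \
    --set
```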
Configuration
Within the SeldonDeploymentConfig you can configure:
model_name: the name of the model in the Seldon cluster and in ZenML.
replicas: the number of replicas with which to deploy the model.
implementation: the type of Seldon inference server to use for the model. The implementation type can be one of the following: TENSORFLOW_SERVER, SKLEARN_SERVER, XGBOOST_SERVER, custom.
parameters: an optional list of parameters (SeldonDeploymentPredictorParameter) to pass to the deployment predictor, each in the form of name, type, value.
resources: the resources to be allocated to the model. This can be configured by passing a SeldonResourceRequirements object with the requests and limits properties. The values for these properties can be a dictionary with the cpu and memory keys, whose values are strings specifying the amount of CPU and memory to be allocated to the model.
serviceAccount: the name of the Service Account applied to the deployment.
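A sketch of a deployment configuration; the import paths are assumptions based on the ZenML Seldon integration and may differ across ZenML versions:

```python
from zenml.integrations.seldon.seldon_client import SeldonResourceRequirements
from zenml.integrations.seldon.services import SeldonDeploymentConfig

service_config = SeldonDeploymentConfig(
    model_name="my-model",           # name in the Seldon cluster and in ZenML
    replicas=1,
    implementation="SKLEARN_SERVER",  # one of the built-in inference servers
    resources=SeldonResourceRequirements(
        requests={"cpu": "100m", "memory": "100Mi"},
        limits={"cpu": "200m", "memory": "250Mi"},
    ),
)
```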
Custom Code Deployment
ZenML enables you to deploy your pre- and post-processing code into the deployment environment together with the model by defining a custom predict function that will be wrapped in a Docker container and executed on the model deployment server, e.g.:
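A minimal sketch of such a function; pre_process and post_process are hypothetical user-defined helpers, and the Array_Like alias stands in for the request/response types used by the integration:

```python
from typing import Any, List, Union

import numpy as np

# stand-in for the request/response type accepted by the model server
Array_Like = Union[np.ndarray, List[Any]]


def pre_process(input_: np.ndarray) -> np.ndarray:
    # hypothetical pre-processing: reshape a single instance into a batch of one
    return input_.reshape(1, -1)


def post_process(prediction: np.ndarray) -> List[Any]:
    # hypothetical post-processing: convert predictions to plain Python types
    return prediction.tolist()


def custom_predict(model: Any, request: Array_Like) -> Array_Like:
    """Run pre-processing, the model's predict method, and post-processing."""
    outputs = []
    for instance in request:
        input_ = np.array(instance)
        processed_input = pre_process(input_)
        prediction = model.predict(processed_input)
        outputs.append(post_process(prediction))
    return outputs
```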
The custom predict function should get the model and the input data as arguments and return the model predictions. ZenML will automatically take care of loading the model into memory and starting the seldon-core-microservice that will be responsible for serving the model and running the predict function.
After defining your custom predict function in code, you can use the seldon_custom_model_deployer_step to automatically build your function into a Docker image and deploy it as a model server by setting the predict_function argument to the path of your custom_predict function:
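A sketch of wiring the step into a pipeline; the import paths and the module path passed to predict_function are assumptions:

```python
from zenml.integrations.seldon.services import SeldonDeploymentConfig
from zenml.integrations.seldon.steps import seldon_custom_model_deployer_step

seldon_custom_model_deployer_step(
    model=model,  # the trained model artifact from a previous step
    predict_function="main.custom_predict",  # <module>.<function> path
    service_config=SeldonDeploymentConfig(
        model_name="seldon-custom-model",
        replicas=1,
        implementation="custom",
    ),
)
```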
Advanced Custom Code Deployment with Seldon Core Integration
The built-in Seldon Core custom deployment step is a good starting point for deploying your custom models. However, if you want to deploy more than the trained model, you can create your own custom class and a custom step to achieve this.