Kubernetes Deployer

Deploying your pipelines to Kubernetes clusters.

Kubernetes is the industry-standard container orchestration platform for deploying and managing containerized applications at scale. The Kubernetes deployer is a deployer flavor included in the ZenML Kubernetes integration that deploys your pipelines to any Kubernetes cluster as production-ready services.

When to use it

You should use the Kubernetes deployer if:

  • you already have a Kubernetes cluster (EKS, GKE, AKS, or self-managed).

  • you need fine-grained control over deployment configuration (resources, networking, security).

  • you want production-grade features like autoscaling, health probes, and high availability.

  • you need to deploy to on-premises infrastructure or air-gapped environments.

  • you want to leverage existing Kubernetes expertise and tooling in your organization.

  • you need to integrate with existing Kubernetes resources (Ingress, NetworkPolicies, ServiceMonitors, etc.).

Prerequisites

To use the Kubernetes deployer, you need:

Kubernetes access and permissions

You have two options for providing cluster access to the Kubernetes deployer:

  • use kubectl to authenticate locally with your Kubernetes cluster.

  • (recommended) configure a Kubernetes Service Connector and link it to the Kubernetes deployer stack component.

Kubernetes Permissions

The Kubernetes deployer needs the following permissions in the target namespace:

  • Deployments: create, get, list, watch, update, delete

  • Services: create, get, list, watch, update, delete

  • Secrets: create, get, list, watch, update, delete

  • Pods: get, list, watch (for logs and status)

  • Namespaces: create, get (if creating namespaces)

If using additional resources (Ingress, HPA, NetworkPolicy, etc.), you'll also need permissions for those resource types.

Configuration use-case: local kubectl with context

This configuration assumes you have configured kubectl to authenticate with your cluster (e.g., by running kubectl config use-context <context-name>). This is the easiest way to get started:
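
A minimal sketch (the flag names follow ZenML's Kubernetes integration conventions; verify them with zenml deployer register --help for your version):

```bash
# Install the integration and point kubectl at the target cluster
zenml integration install kubernetes
kubectl config use-context <context-name>

# Register the deployer against that context (flag names assumed
# from ZenML's Kubernetes integration conventions)
zenml deployer register k8s-deployer \
    --flavor=kubernetes \
    --kubernetes_context=<context-name> \
    --kubernetes_namespace=zenml-deployments
```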

Configuration use-case: Kubernetes Service Connector

This is the recommended approach for production and team environments. It makes credentials portable and manageable:
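
A sketch of the flow (the connect subcommand is assumed to follow the same pattern ZenML uses for other stack components):

```bash
# Auto-configure a connector from your local kubeconfig
zenml service-connector register k8s-connector \
    --type kubernetes \
    --auto-configure

# Register the deployer and link it to the connector
zenml deployer register k8s-deployer --flavor=kubernetes
zenml deployer connect k8s-deployer --connector k8s-connector
```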

See the Kubernetes Service Connector documentation for more authentication methods, including:

  • Service account tokens

  • Kubeconfig files

  • Cloud provider authentication (EKS, GKE, AKS)

Configuration use-case: In-cluster deployment

If your ZenML server runs inside the same Kubernetes cluster where you want to deploy pipelines, you can use in-cluster authentication:
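
For example (the incluster flag name is assumed from ZenML's Kubernetes orchestrator convention):

```bash
# Requires that the server pod's service account has the RBAC
# permissions listed above
zenml deployer register k8s-deployer \
    --flavor=kubernetes \
    --incluster=true \
    --kubernetes_namespace=zenml-deployments
```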

This uses the service account token mounted into the pod running ZenML.

Configuring the stack

With the deployer registered, you can use it in your active stack:
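
For example (the -D shorthand for the deployer component is an assumption; check zenml stack register --help for your version):

```bash
zenml stack register k8s-stack \
    -o <orchestrator> \
    -a <artifact-store> \
    -c <container-registry> \
    -D k8s-deployer \
    --set
```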

ZenML will build a Docker image called <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME> and use it to deploy your pipeline as a Kubernetes Deployment with a Service. Check out this page if you want to learn more about how ZenML builds these images and how you can customize them.

You can now deploy any ZenML pipeline using the Kubernetes deployer:
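
For example (a sketch; verify the exact deploy entrypoint with zenml pipeline --help for your version):

```bash
# Deploy a pipeline defined in my_module.py as a long-running service
zenml pipeline deploy my_module.my_pipeline --name my-deployment
```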

Advanced configuration

The Kubernetes deployer follows a progressive complexity model, allowing you to start simple and add configuration as needed:

Level 1: Essential settings (80% of use cases)

Most deployments only need these basic settings:
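
A minimal sketch in Python (the import path and the "deployer" settings key are assumed to mirror ZenML's usual flavor-settings pattern):

```python
from zenml import pipeline
from zenml.integrations.kubernetes.flavors import KubernetesDeployerSettings

# Essential settings: namespace, exposure mode, and port
k8s_settings = KubernetesDeployerSettings(
    namespace="my-team",
    service_type="LoadBalancer",  # or "NodePort" / "ClusterIP"
    service_port=8000,
)

@pipeline(settings={"deployer": k8s_settings})
def my_pipeline():
    ...
```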

Level 2: Production-ready configuration

For production deployments, add health probes, labels, and resource limits:
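
A sketch building on the essential settings (probe fields follow the settings reference below; the resources schema is assumed from KubernetesPodSettings):

```python
from zenml.integrations.kubernetes.flavors import KubernetesDeployerSettings

k8s_settings = KubernetesDeployerSettings(
    namespace="production",
    service_type="ClusterIP",  # expose via Ingress instead of a LoadBalancer
    labels={"team": "ml-platform", "env": "prod"},
    # Health probes
    readiness_probe_path="/api/health",
    readiness_probe_initial_delay=10,
    liveness_probe_path="/api/health",
    liveness_probe_initial_delay=30,
    # Resource requests/limits via pod settings
    pod_settings={
        "resources": {
            "requests": {"cpu": "500m", "memory": "512Mi"},
            "limits": {"cpu": "1", "memory": "1Gi"},
        }
    },
)
```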

Level 3: Additional resources

Deploy additional Kubernetes resources alongside your main Deployment and Service (Ingress, HorizontalPodAutoscaler, PodDisruptionBudget, NetworkPolicy, etc.):

First, create a YAML file (e.g., k8s-resources.yaml) with your additional resources:
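
For instance, an illustrative HorizontalPodAutoscaler that targets the ZenML-created Deployment through the template variables documented below (requires metrics-server in the cluster):

```yaml
# k8s-resources.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{name}}-hpa
  namespace: {{namespace}}
  labels:
    managed-by: {{labels['managed-by']}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{name}}
  minReplicas: {{replicas}}
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```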

Then reference this file in your deployment settings:
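
For example:

```python
from zenml.integrations.kubernetes.flavors import KubernetesDeployerSettings

k8s_settings = KubernetesDeployerSettings(
    additional_resources=["k8s-resources.yaml"],
    # Fail the whole deployment if any extra resource fails to apply
    strict_additional_resources=True,
)
```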

Available template variables for use in your YAML files:

Core objects (access their properties directly):

  • {{deployment}}: Full DeploymentResponse object - access via {{deployment.id}}, {{deployment.name}}, etc.

  • {{settings}}: Full KubernetesDeployerSettings object - access via {{settings.service_port}}, {{settings.service_type}}, etc.

Common values:

  • {{name}}: Deployment/Service resource name (use this for Deployment name, Service name in Ingress, HPA target, etc.)

  • {{namespace}}: Kubernetes namespace

  • {{labels}}: Dict of all labels (includes ZenML-managed labels + custom labels)

    • {{labels['managed-by']}}: Always 'zenml'

    • {{labels['zenml-deployment-id']}}: Deployment UUID

    • {{labels['zenml-deployment-name']}}: Human-readable deployment name

  • {{replicas}}: Configured replica count

  • {{image}}: Container image URI

  • {{command}}: Container command

  • {{args}}: Container args

  • {{env}}: Environment variables dict

  • {{resources}}: Resource requests/limits dict

  • {{pod_settings}}: KubernetesPodSettings object (if configured)

Level 4: Custom Deployment and Service templates

For maximum control, you can completely override the built-in Deployment and Service templates by providing your own Jinja2 templates. This allows you to customize every aspect of the core Kubernetes resources that ZenML creates.

When to use custom templates:

  • You need to add features not supported by the standard settings (init containers, sidecar containers, custom volume types)

  • You want complete control over health probe configuration beyond the provided settings

  • You need specific Kubernetes features for compliance or security requirements

  • You're migrating existing Kubernetes manifests to ZenML and want to maintain the exact structure

How it works:

  1. Create a directory for your custom templates (e.g., ~/.zenml/k8s-templates/)

  2. Add one or both of these files to override the built-in templates:

    • deployment.yaml.j2 - Override the Deployment resource

    • service.yaml.j2 - Override the Service resource

  3. Configure the deployer to use your custom templates:
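
For example (sketch):

```python
from zenml.integrations.kubernetes.flavors import KubernetesDeployerSettings

k8s_settings = KubernetesDeployerSettings(
    custom_templates_dir="~/.zenml/k8s-templates/",
)
```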

Available template variables:

Your custom templates have access to all the same context variables as the built-in templates:

Core objects:

  • deployment: Full DeploymentResponse object

  • settings: Full KubernetesDeployerSettings object (access health probes, ports, etc. via settings.X)

Common values:

  • name: Deployment/Service resource name

  • namespace: Kubernetes namespace

  • image: Container image URI

  • replicas: Number of replicas

  • labels: Dict of all labels

  • command: Container command (list)

  • args: Container args (list)

  • env: Environment variables dict

  • resources: Resource requests/limits dict (with requests and limits keys)

  • pod_settings: KubernetesPodSettings object (access volumes, affinity, tolerations, etc.)

Example: Custom deployment template with init container

Create ~/MyProject/k8s-templates/deployment.yaml.j2:
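
An illustrative sketch (the init container, its image, and the database hostname are hypothetical; check the built-in template referenced below for the exact structure ZenML expects):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ name }}
  namespace: {{ namespace }}
spec:
  replicas: {{ replicas }}
  selector:
    matchLabels:
      zenml-deployment-id: "{{ labels['zenml-deployment-id'] }}"
  template:
    metadata:
      labels:
{% for key, value in labels.items() %}
        {{ key }}: "{{ value }}"
{% endfor %}
    spec:
      # Hypothetical init container that waits for a database
      initContainers:
        - name: wait-for-db
          image: busybox:1.36
          command: ["sh", "-c", "until nc -z my-db 5432; do sleep 2; done"]
      containers:
        - name: main
          image: {{ image }}
{% if command %}
          command: {{ command | tojson }}
{% endif %}
{% if args %}
          args: {{ args | tojson }}
{% endif %}
          ports:
            - containerPort: {{ settings.service_port }}
          env:
{% for key, value in env.items() %}
            - name: {{ key }}
              value: "{{ value }}"
{% endfor %}
```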

Tip: Start with ZenML's built-in templates as a reference. You can find them in the ZenML repository on GitHub; copy and modify them for your needs.

Complete settings reference

For a complete list of all available settings, see the KubernetesDeployerSettings class. Here's a comprehensive overview organized by category:

Basic Settings (common to all Deployers):

  • auth_key: User-defined authentication key for deployment API calls

  • generate_auth_key: Whether to generate a random authentication key

  • lcm_timeout: Maximum time in seconds to wait for deployment lifecycle operations

Essential Settings:

  • namespace: Kubernetes namespace for the deployment (defaults to deployer's kubernetes_namespace)

  • service_type: How to expose the service - LoadBalancer, NodePort, or ClusterIP (default: LoadBalancer)

  • service_port: Port to expose on the service (default: 8000)

  • image_pull_policy: When to pull images - Always, IfNotPresent, or Never (default: IfNotPresent)

  • labels: Additional labels to apply to all resources

  • annotations: Annotations to add to pod resources

Container Configuration:

  • command: Override container command/entrypoint

  • args: Override container args

  • service_account_name: Kubernetes service account for pods

  • image_pull_secrets: List of secret names for pulling private images

Health Probes:

  • readiness_probe_path: HTTP path for readiness probe (default: /api/health)

  • readiness_probe_initial_delay: Initial delay in seconds (default: 10)

  • readiness_probe_period: Probe interval in seconds (default: 10)

  • readiness_probe_timeout: Probe timeout in seconds (default: 5)

  • readiness_probe_failure_threshold: Failures before marking pod not ready (default: 3)

  • liveness_probe_path: HTTP path for liveness probe (default: /api/health)

  • liveness_probe_initial_delay: Initial delay in seconds (default: 30)

  • liveness_probe_period: Probe interval in seconds (default: 10)

  • liveness_probe_timeout: Probe timeout in seconds (default: 5)

  • liveness_probe_failure_threshold: Failures before restarting pod (default: 3)

Advanced Settings:

  • pod_settings: Advanced pod configuration (see KubernetesPodSettings)

  • additional_resources: List of paths to YAML files with additional K8s resources

  • strict_additional_resources: If True, fail deployment if any additional resource fails (default: True)

  • custom_templates_dir: Path to directory with custom Jinja2 templates

Internal Settings:

  • wait_for_load_balancer_timeout: Timeout for LoadBalancer IP assignment (default: 150 seconds, 0 to skip)

  • deployment_ready_check_interval: Interval between readiness checks (default: 2 seconds)

Check out this docs page for more information on how to specify settings.

Troubleshooting

Deployment stuck in pending state

Check pod events and logs:
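
For example, using the ZenML-managed labels described above:

```bash
# Find the pods behind the deployment
kubectl get pods -n <namespace> -l zenml-deployment-name=<deployment-name>

# Inspect scheduling/image-pull events and container logs
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
```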

Common causes:

  • Insufficient cluster resources

  • Image pull errors (check image_pull_secrets)

  • Node selector/affinity constraints not satisfied

  • PersistentVolumeClaim pending

LoadBalancer not getting external IP

If using the LoadBalancer service type and the external IP stays <pending>:

  • Check if your cluster supports LoadBalancer (cloud providers usually do, local clusters usually don't)

  • For local clusters, use NodePort instead

  • For production without LoadBalancer support, use ClusterIP with an Ingress
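
You can watch the external IP assignment directly:

```bash
# EXTERNAL-IP stays <pending> until the cloud provider assigns one
kubectl get svc -n <namespace> --watch
```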

Additional resources failing to apply

If you're using strict_additional_resources=True and the deployment fails:
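
While debugging, you can relax strict mode so the core Deployment and Service still come up (sketch):

```python
from zenml.integrations.kubernetes.flavors import KubernetesDeployerSettings

k8s_settings = KubernetesDeployerSettings(
    additional_resources=["k8s-resources.yaml"],
    # Log-and-continue instead of failing the whole deployment
    strict_additional_resources=False,
)
```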

Common issues:

  • Missing CRDs (install required operators)

  • Missing cluster components (metrics-server for HPA, ingress controller for Ingress)

  • Invalid resource references (check template variables)

Image pull errors

If pods can't pull the container image:

  • Verify image exists in registry: docker pull <image>

  • Check image_pull_secrets are configured correctly

  • Verify service account has access to registry

  • Check image pull policy (IfNotPresent vs Always)
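
For a private registry, a pull secret can be created with standard kubectl and then referenced through the image_pull_secrets setting, for example:

```bash
# Create a registry credential secret in the target namespace
kubectl create secret docker-registry regcred \
    --docker-server=<registry-uri> \
    --docker-username=<username> \
    --docker-password=<password> \
    --namespace <namespace>
```

Then set image_pull_secrets=["regcred"] in your deployer settings.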

Best practices

  1. Use service connectors in production for portable, manageable credentials

  2. Always configure health probes for production deployments

  3. Use Ingress with ClusterIP instead of LoadBalancer for cost and flexibility

  4. Use labels and annotations for organization, monitoring, and cost tracking

  5. Configure resource limits to prevent resource exhaustion

  6. Use HPA for autoscaling based on actual load

  7. Configure PodDisruptionBudget for high availability during cluster updates

  8. Keep additional resources in version control alongside your pipeline code
