Kubernetes Deployer

Deploying your pipelines to Kubernetes clusters.

Kubernetes is the industry-standard container orchestration platform for deploying and managing containerized applications at scale. The Kubernetes deployer is a deployer flavor included in the ZenML Kubernetes integration that deploys your pipelines to any Kubernetes cluster as production-ready services.

When to use it

You should use the Kubernetes deployer if:

  • you already have a Kubernetes cluster (EKS, GKE, AKS, or self-managed).

  • you need fine-grained control over deployment configuration (resources, networking, security).

  • you want production-grade features like autoscaling, health probes, and high availability.

  • you need to deploy to on-premises infrastructure or air-gapped environments.

  • you want to leverage existing Kubernetes expertise and tooling in your organization.

  • you need to integrate with existing Kubernetes resources (Ingress, NetworkPolicies, ServiceMonitors, etc.).

Prerequisites

To use the Kubernetes deployer, you need:

Kubernetes access and permissions

You have two different options to provide cluster access to the Kubernetes deployer:

  • use kubectl to authenticate locally with your Kubernetes cluster

  • (recommended) configure a Kubernetes Service Connector and link it to the Kubernetes deployer stack component.

Kubernetes Permissions

The Kubernetes deployer needs the following permissions in the target namespace:

  • Deployments: create, get, list, watch, update, delete

  • Services: create, get, list, watch, update, delete

  • Secrets: create, get, list, watch, update, delete

  • Pods: get, list, watch (for logs and status)

  • Namespaces: create, get (if creating namespaces)

If using additional resources (Ingress, HPA, NetworkPolicy, etc.), you'll also need permissions for those resource types.
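The permission list above can be captured as a namespaced RBAC Role. The following is a minimal sketch rather than an official ZenML manifest: the role name `zenml-deployer` and the helper function are hypothetical, and the `pods/log` subresource is an addition based on how Kubernetes RBAC models log access. Namespaces are cluster-scoped, so the optional namespace permissions would need a ClusterRole and are left out here.

```python
import json

FULL_ACCESS = ["create", "get", "list", "watch", "update", "delete"]
READ_ONLY = ["get", "list", "watch"]


def build_deployer_role(namespace: str, name: str = "zenml-deployer") -> dict:
    """Build a namespaced Role covering the permissions listed above."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            # Deployments live in the "apps" API group.
            {"apiGroups": ["apps"], "resources": ["deployments"], "verbs": FULL_ACCESS},
            # Services and Secrets live in the core ("") API group.
            {"apiGroups": [""], "resources": ["services", "secrets"], "verbs": FULL_ACCESS},
            # "pods/log" is an assumption beyond the list above: RBAC treats
            # log access as a separate subresource.
            {"apiGroups": [""], "resources": ["pods", "pods/log"], "verbs": READ_ONLY},
        ],
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(build_deployer_role("zenml-deployments"), indent=2))
```

The printed JSON can be saved to a file and applied with kubectl, then bound to the identity the deployer uses through a RoleBinding.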

Configuration use-case: local kubectl with context

This configuration assumes you have configured kubectl to authenticate with your cluster (e.g. by running kubectl config use-context <context-name>). This is the easiest way to get started:

zenml deployer register <DEPLOYER_NAME> \
    --flavor=kubernetes \
    --kubernetes_context=<CONTEXT_NAME> \
    --kubernetes_namespace=zenml-deployments

Configuration use-case: Kubernetes Service Connector

This is the recommended approach for production and team environments. It makes credentials portable and manageable:

# Register a Kubernetes service connector
zenml service-connector register <CONNECTOR_NAME> \
    --type kubernetes \
    --auth-method=token \
    --token=<YOUR_TOKEN> \
    --server=<CLUSTER_URL> \
    --certificate_authority=<CA_CERT_PATH> \
    --resource-type kubernetes-cluster

# Register the deployer and link it to the connector
zenml deployer register <DEPLOYER_NAME> \
    --flavor=kubernetes \
    --kubernetes_namespace=zenml-deployments \
    --connector <CONNECTOR_NAME>

See the Kubernetes Service Connector documentation for more authentication methods including:

  • Service account tokens

  • Kubeconfig files

  • Cloud provider authentication (EKS, GKE, AKS)

Configuration use-case: In-cluster deployment

If your ZenML server runs inside the same Kubernetes cluster where you want to deploy pipelines, you can use in-cluster authentication:

zenml deployer register <DEPLOYER_NAME> \
    --flavor=kubernetes \
    --incluster=True \
    --kubernetes_namespace=zenml-deployments

This uses the service account token mounted into the pod running ZenML.

Configuring the stack

With the deployer registered, you can use it in your active stack:

# Register and activate a stack with the new deployer
zenml stack register <STACK_NAME> -D <DEPLOYER_NAME> ... --set

ZenML will build a Docker image called <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME> and use it to deploy your pipeline as a Kubernetes Deployment with a Service. Check out this page if you want to learn more about how ZenML builds these images and how you can customize them.

You can now deploy any ZenML pipeline using the Kubernetes deployer:

zenml pipeline deploy --name my_deployment my_module.my_pipeline

Advanced configuration

The Kubernetes deployer follows a progressive complexity model, allowing you to start simple and add configuration as needed:

Level 1: Essential settings (80% of use cases)

Most deployments only need these basic settings:

from zenml import pipeline, step
from zenml.integrations.kubernetes.flavors.kubernetes_deployer_flavor import (
    KubernetesDeployerSettings
)
from zenml.config import ResourceSettings

@step
def greet(name: str) -> str:
    return f"Hello {name}!"

# Basic deployment settings
settings = {
    "deployer": KubernetesDeployerSettings(
        namespace="my-namespace",
        service_type="LoadBalancer",  # or "NodePort", "ClusterIP"
        service_port=8000,
    ),
    "resources": ResourceSettings(
        cpu_count=1,
        memory="2GB",
        min_replicas=1,
        max_replicas=3,
    )
}

@pipeline(settings=settings)
def greet_pipeline(name: str = "World"):
    greet(name=name)

Level 2: Production-ready configuration

For production deployments, add health probes, labels, and resource limits:

settings = {
    "deployer": KubernetesDeployerSettings(
        namespace="production",
        service_type="ClusterIP",  # Use with Ingress
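        service_port=8000,

        # Health probes
        readiness_probe_path="/api/health",
        readiness_probe_initial_delay=10,
        liveness_probe_path="/api/health",
        liveness_probe_initial_delay=30,
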
        # Labels for organization
        labels={
            "environment": "production",
            "team": "ml-platform",
            "version": "1.0",
        },
        
        # Prometheus monitoring
        annotations={
            "prometheus.io/scrape": "true",
            "prometheus.io/port": "8000",
            "prometheus.io/path": "/metrics",
        },
    ),
    "resources": ResourceSettings(
        cpu_count=2,
        memory="4GB",
        min_replicas=2,
        max_replicas=10,
    )
}
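One way to move between the Level 1 and Level 2 configurations is to keep the per-environment values in plain dictionaries and expand them into settings objects at pipeline definition time. This is a minimal sketch: the `ENVIRONMENTS` mapping, the `dev` and `production` keys, and the `settings_kwargs` helper are hypothetical names, and the values are copied from the two examples above.

```python
import copy
from typing import Any, Dict

# Values taken from the Level 1 and Level 2 examples; the environment names
# and this helper are illustrative and not part of ZenML.
ENVIRONMENTS: Dict[str, Dict[str, Dict[str, Any]]] = {
    "dev": {
        "deployer": {"namespace": "my-namespace", "service_type": "LoadBalancer", "service_port": 8000},
        "resources": {"cpu_count": 1, "memory": "2GB", "min_replicas": 1, "max_replicas": 3},
    },
    "production": {
        "deployer": {"namespace": "production", "service_type": "ClusterIP", "service_port": 8000},
        "resources": {"cpu_count": 2, "memory": "4GB", "min_replicas": 2, "max_replicas": 10},
    },
}


def settings_kwargs(environment: str) -> Dict[str, Dict[str, Any]]:
    """Return a copy of the keyword arguments for one environment."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"Unknown environment: {environment!r}")
    return copy.deepcopy(ENVIRONMENTS[environment])


kwargs = settings_kwargs("production")
print(kwargs["deployer"]["service_type"])  # ClusterIP

# With ZenML installed, the dictionaries expand into the settings objects:
# settings = {
#     "deployer": KubernetesDeployerSettings(**kwargs["deployer"]),
#     "resources": ResourceSettings(**kwargs["resources"]),
# }
```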

Level 3: Additional resources

Deploy additional Kubernetes resources alongside your main Deployment and Service (Ingress, HorizontalPodAutoscaler, PodDisruptionBudget, NetworkPolicy, etc.).

First, create a YAML file (e.g., k8s-resources.yaml) with your additional resources:

# Ingress for domain-based routing
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  namespace: {{namespace}}  # ZenML fills this
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: myapp.company.com  # Your domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: {{name}}  # ZenML fills this with the Service name
                port:
                  number: {{settings.service_port}}  # ZenML fills this

# HorizontalPodAutoscaler for autoscaling
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
  namespace: {{namespace}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{name}}  # ZenML fills this with the Deployment name
  minReplicas: {{replicas}}  # ZenML fills this
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

# PodDisruptionBudget for high availability
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
  namespace: {{namespace}}
spec:
  minAvailable: 1
  selector:
    matchLabels:
      managed-by: {{labels['managed-by']}}
      zenml-deployment-id: {{labels['zenml-deployment-id']}}

Then reference this file in your deployment settings:

settings = {
    "deployer": KubernetesDeployerSettings(
        additional_resources=[
            "./k8s-resources.yaml"
        ],
        strict_additional_resources=True,  # Fail if any resource fails to apply
        # ... other settings
    )
}

Available template variables for use in your YAML files:

Core objects (access their properties directly):

  • {{deployment}}: Full DeploymentResponse object - access via {{deployment.id}}, {{deployment.name}}, etc.

  • {{settings}}: Full KubernetesDeployerSettings object - access via {{settings.service_port}}, {{settings.service_type}}, etc.

Common values:

  • {{name}}: Deployment/Service resource name (use this for Deployment name, Service name in Ingress, HPA target, etc.)

  • {{namespace}}: Kubernetes namespace

  • {{labels}}: Dict of all labels (includes ZenML-managed labels + custom labels)

    • {{labels['managed-by']}}: Always 'zenml'

    • {{labels['zenml-deployment-id']}}: Deployment UUID

    • {{labels['zenml-deployment-name']}}: Human-readable deployment name

  • {{replicas}}: Configured replica count

  • {{image}}: Container image URI

  • {{command}}: Container command

  • {{args}}: Container args

  • {{env}}: Environment variables dict

  • {{resources}}: Resource requests/limits dict

  • {{pod_settings}}: KubernetesPodSettings object (if configured)
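Because a misspelled placeholder only shows up when the resource is applied, it can help to check a resource file against the variable list above before deploying. The sketch below is a hypothetical helper, not a ZenML feature: it uses a regular expression instead of a real Jinja2 parser and only looks at the first identifier inside each `{{ ... }}` expression.

```python
import re
from typing import List

# Top-level names documented in the list above.
KNOWN_VARIABLES = {
    "deployment", "settings", "name", "namespace", "labels", "replicas",
    "image", "command", "args", "env", "resources", "pod_settings",
}

# Captures the leading identifier of a {{ ... }} expression, so that
# {{settings.service_port}} yields "settings" and {{labels['x']}} yields "labels".
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)")


def unknown_template_variables(text: str) -> List[str]:
    """Return undocumented placeholder names, sorted and de-duplicated."""
    roots = set(PLACEHOLDER.findall(text))
    return sorted(roots - KNOWN_VARIABLES)


sample = """
metadata:
  name: my-hpa
  namespace: {{namespace}}
spec:
  scaleTargetRef:
    name: {{ deployment_name }}
  minReplicas: {{replicas}}
  port: {{settings.service_port}}
  managedBy: {{labels['managed-by']}}
"""

print(unknown_template_variables(sample))  # ['deployment_name']
```

The check is intentionally narrow: it ignores `{% ... %}` blocks and would flag loop-local variables, so it suits additional resource files better than full custom templates.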

Level 4: Custom Deployment and Service templates

For maximum control, you can completely override the built-in Deployment and Service templates by providing your own Jinja2 templates. This allows you to customize every aspect of the core Kubernetes resources that ZenML creates.

When to use custom templates:

  • You need to add features not supported by the standard settings (init containers, sidecar containers, custom volume types)

  • You want complete control over health probe configuration beyond the provided settings

  • You need specific Kubernetes features for compliance or security requirements

  • You're migrating existing Kubernetes manifests to ZenML and want to maintain the exact structure

How it works:

  1. Create a directory for your custom templates (e.g., ~/.zenml/k8s-templates/)

  2. Add one or both of these files to override the built-in templates:

    • deployment.yaml.j2 - Override the Deployment resource

    • service.yaml.j2 - Override the Service resource

  3. Configure the deployer to use your custom templates:

settings = {
    "deployer": KubernetesDeployerSettings(
        custom_templates_dir="~/.zenml/k8s-templates",
        # ... other settings
    )
}

Available template variables:

Your custom templates have access to all the same context variables as the built-in templates:

Core objects:

  • deployment: Full DeploymentResponse object

  • settings: Full KubernetesDeployerSettings object (access health probes, ports, etc. via settings.X)

Common values:

  • name: Deployment/Service resource name

  • namespace: Kubernetes namespace

  • image: Container image URI

  • replicas: Number of replicas

  • labels: Dict of all labels

  • command: Container command (list)

  • args: Container args (list)

  • env: Environment variables dict

  • resources: Resource requests/limits dict (with requests and limits keys)

  • pod_settings: KubernetesPodSettings object (access volumes, affinity, tolerations, etc.)

Example: Custom deployment template with init container

Create ~/.zenml/k8s-templates/deployment.yaml.j2:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ name }}
  namespace: {{ namespace }}
  labels:
    app: {{ name }}
    managed-by: zenml
    {% if labels %}
    {% for key, value in labels.items() %}
    {{ key }}: {{ value | tojson }}
    {% endfor %}
    {% endif %}
spec:
  replicas: {{ replicas | default(1) }}
  selector:
    matchLabels:
      managed-by: zenml
      zenml-deployment-id: {{ labels.get('zenml-deployment-id') | tojson }}
  template:
    metadata:
      labels:
        app: {{ name }}
        managed-by: zenml
        {% if labels %}
        {% for key, value in labels.items() %}
        {{ key }}: {{ value | tojson }}
        {% endfor %}
        {% endif %}
      {% if settings.annotations %}
      annotations:
        {% for key, value in settings.annotations.items() %}
        {{ key }}: {{ value | tojson }}
        {% endfor %}
      {% endif %}
    spec:
      {% if settings.service_account_name %}
      serviceAccountName: {{ settings.service_account_name }}
      {% endif %}
      
      # Custom init container
      initContainers:
      - name: init-setup
        image: busybox:1.28
        command: ['sh', '-c', 'echo "Initializing..." && sleep 5']
      
      containers:
      - name: main
        image: {{ image }}
        imagePullPolicy: {{ settings.image_pull_policy }}
        {% if command %}
        command: {{ command | to_yaml | indent(10, first=True) }}
        {% endif %}
        {% if args %}
        args: {{ args | to_yaml | indent(10, first=True) }}
        {% endif %}
        ports:
        - containerPort: {{ settings.service_port }}
          name: http
        {% if env %}
        env:
        {% for key, value in env.items() %}
          - name: {{ key }}
            value: {{ value | tojson }}
        {% endfor %}
        {% endif %}
        {% if resources %}
        resources:
          {% if "requests" in resources %}
          requests:
          {% for key, value in resources.requests.items() %}
            {{ key }}: {{ value | tojson }}
          {% endfor %}
          {% endif %}
          {% if "limits" in resources %}
          limits:
          {% for key, value in resources.limits.items() %}
            {{ key }}: {{ value | tojson }}
          {% endfor %}
          {% endif %}
        {% endif %}
        readinessProbe:
          httpGet:
            path: {{ settings.readiness_probe_path }}
            port: {{ settings.service_port }}
          initialDelaySeconds: {{ settings.readiness_probe_initial_delay }}
          periodSeconds: {{ settings.readiness_probe_period }}
        livenessProbe:
          httpGet:
            path: {{ settings.liveness_probe_path }}
            port: {{ settings.service_port }}
          initialDelaySeconds: {{ settings.liveness_probe_initial_delay }}
          periodSeconds: {{ settings.liveness_probe_period }}

Tip: Start with ZenML's built-in templates as a reference. You can find them in the ZenML repository on GitHub. Copy and modify them for your needs.

Complete settings reference

For a complete list of all available settings, see the KubernetesDeployerSettings class. Here's a comprehensive overview organized by category:

Basic Settings (common to all Deployers):

  • auth_key: User-defined authentication key for deployment API calls

  • generate_auth_key: Whether to generate a random authentication key

  • lcm_timeout: Maximum time in seconds to wait for deployment lifecycle operations

Essential Settings:

  • namespace: Kubernetes namespace for the deployment (defaults to deployer's kubernetes_namespace)

  • service_type: How to expose the service - LoadBalancer, NodePort, or ClusterIP (default: LoadBalancer)

  • service_port: Port to expose on the service (default: 8000)

  • image_pull_policy: When to pull images - Always, IfNotPresent, or Never (default: IfNotPresent)

  • labels: Additional labels to apply to all resources

  • annotations: Annotations to add to pod resources

Container Configuration:

  • command: Override container command/entrypoint

  • args: Override container args

  • service_account_name: Kubernetes service account for pods

  • image_pull_secrets: List of secret names for pulling private images

Health Probes:

  • readiness_probe_path: HTTP path for readiness probe (default: /api/health)

  • readiness_probe_initial_delay: Initial delay in seconds (default: 10)

  • readiness_probe_period: Probe interval in seconds (default: 10)

  • readiness_probe_timeout: Probe timeout in seconds (default: 5)

  • readiness_probe_failure_threshold: Failures before marking pod not ready (default: 3)

  • liveness_probe_path: HTTP path for liveness probe (default: /api/health)

  • liveness_probe_initial_delay: Initial delay in seconds (default: 30)

  • liveness_probe_period: Probe interval in seconds (default: 10)

  • liveness_probe_timeout: Probe timeout in seconds (default: 5)

  • liveness_probe_failure_threshold: Failures before restarting pod (default: 3)
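To get a feel for how the probe settings interact, the sketch below estimates how long a container that never answers its probe keeps running before Kubernetes reacts. The formula is a rough approximation (the kubelet adds scheduling jitter) and the `ProbeTiming` class is a hypothetical helper. The numbers are the defaults listed above.

```python
from dataclasses import dataclass


@dataclass
class ProbeTiming:
    initial_delay: int      # seconds before the first probe
    period: int             # seconds between probes
    timeout: int            # seconds before a single probe counts as failed
    failure_threshold: int  # consecutive failures before Kubernetes acts

    def seconds_until_action(self) -> int:
        """Rough estimate for a container that never answers the probe."""
        # First probe starts after the initial delay; the last of the
        # consecutive failures starts (threshold - 1) periods later and
        # is only counted once its timeout has elapsed.
        return self.initial_delay + (self.failure_threshold - 1) * self.period + self.timeout


readiness = ProbeTiming(initial_delay=10, period=10, timeout=5, failure_threshold=3)
liveness = ProbeTiming(initial_delay=30, period=10, timeout=5, failure_threshold=3)

print(readiness.seconds_until_action())  # 35
print(liveness.seconds_until_action())   # 55
```

With the defaults, an unresponsive pod is marked not ready after roughly 35 seconds and restarted after roughly 55 seconds. Services with slow startup may need a larger liveness_probe_initial_delay.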

Advanced Settings:

  • pod_settings: Advanced pod configuration (see KubernetesPodSettings)

  • additional_resources: List of paths to YAML files with additional K8s resources

  • strict_additional_resources: If True, fail deployment if any additional resource fails (default: True)

  • custom_templates_dir: Path to directory with custom Jinja2 templates

Internal Settings:

  • wait_for_load_balancer_timeout: Timeout for LoadBalancer IP assignment (default: 150 seconds, 0 to skip)

  • deployment_ready_check_interval: Interval between readiness checks (default: 2 seconds)

Check out this docs page for more information on how to specify settings.

Troubleshooting

Deployment stuck in pending state

Check pod events and logs:

# Get deployment info
zenml deployment describe my-deployment

# Follow logs
zenml deployment logs my-deployment -f

# Check Kubernetes resources directly
kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>

Common causes:

  • Insufficient cluster resources

  • Image pull errors (check image_pull_secrets)

  • Node selector/affinity constraints not satisfied

  • PersistentVolumeClaim pending
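The same information that kubectl describe shows can be summarized programmatically. The following is a minimal sketch, assuming kubectl is on the PATH and authenticated against the cluster; the function names are hypothetical. It reads the JSON output of kubectl get pods and collects the reasons reported for pods in the Pending phase.

```python
import json
import subprocess
from typing import Dict, List


def pending_pod_reasons(pod_list: dict) -> Dict[str, List[str]]:
    """Map each Pending pod name to the reasons found in its status."""
    reasons: Dict[str, List[str]] = {}
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") != "Pending":
            continue
        found: List[str] = []
        # Failed conditions, e.g. PodScheduled=False with reason Unschedulable.
        for cond in status.get("conditions", []):
            if cond.get("status") == "False" and cond.get("reason"):
                found.append(f"{cond['type']}: {cond['reason']}")
        # Waiting containers, e.g. ImagePullBackOff or ErrImagePull.
        for cs in status.get("containerStatuses", []):
            waiting = cs.get("state", {}).get("waiting")
            if waiting and waiting.get("reason"):
                found.append(f"{cs['name']}: {waiting['reason']}")
        reasons[pod["metadata"]["name"]] = found
    return reasons


def fetch_pods(namespace: str) -> dict:
    """Run kubectl and return the parsed pod list for a namespace."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling pending_pod_reasons(fetch_pods("zenml-deployments")) returns a dictionary that points at scheduling problems or image pull errors for each stuck pod.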

LoadBalancer not getting external IP

If you use the LoadBalancer service type and the external IP stays <pending>:

  • Check if your cluster supports LoadBalancer (cloud providers usually do, local clusters usually don't)

  • For local clusters, use NodePort instead

  • For production without LoadBalancer support, use ClusterIP with an Ingress

Additional resources failing to apply

If you set strict_additional_resources=True and the deployment fails:

# Check which resource failed
zenml deployment describe my-deployment

# Validate resources manually
kubectl apply --dry-run=client -f k8s-resources.yaml

Common issues:

  • Missing CRDs (install required operators)

  • Missing cluster components (metrics-server for HPA, ingress controller for Ingress)

  • Invalid resource references (check template variables)

Image pull errors

If pods can't pull the container image:

  • Verify image exists in registry: docker pull <image>

  • Check image_pull_secrets are configured correctly

  • Verify service account has access to registry

  • Check image pull policy (IfNotPresent vs Always)

Best practices

  1. Use service connectors in production for portable, manageable credentials

  2. Always configure health probes for production deployments

  3. Use Ingress with ClusterIP instead of LoadBalancer for cost and flexibility

  4. Use labels and annotations for organization, monitoring, and cost tracking

  5. Configure resource limits to prevent resource exhaustion

  6. Use HPA for autoscaling based on actual load

  7. Configure PodDisruptionBudget for high availability during cluster updates

  8. Keep additional resources in version control alongside your pipeline code
