Kubernetes Deployer

Deploying your pipelines to Kubernetes clusters.

Kubernetes is the industry-standard container orchestration platform for deploying and managing containerized applications at scale. The Kubernetes deployer is a deployer flavor included in the ZenML Kubernetes integration that deploys your pipelines to any Kubernetes cluster as production-ready services.

This component is only meant to be used within the context of a remote ZenML installation. Usage with a local ZenML setup may lead to unexpected behavior!

When to use it

You should use the Kubernetes deployer if:

you already have a Kubernetes cluster (EKS, GKE, AKS, or self-managed).
you need fine-grained control over deployment configuration (resources, networking, security).
you want production-grade features like autoscaling, health probes, and high availability.
you need to deploy to on-premises infrastructure or air-gapped environments.
you want to leverage existing Kubernetes expertise and tooling in your organization.
you need to integrate with existing Kubernetes resources (Ingress, NetworkPolicies, ServiceMonitors, etc.).

Prerequisites

To use the Kubernetes deployer, you need:

The ZenML Kubernetes integration installed:
```
zenml integration install kubernetes
```
Docker installed and running.
A remote artifact store as part of your stack.
A remote container registry as part of your stack.
A running Kubernetes cluster (version 1.21 or higher recommended).
Kubernetes cluster access either via kubectl or a service connector.

Kubernetes access and permissions

You can give the Kubernetes deployer access to your cluster in any of the following ways:

use kubectl to authenticate locally with your Kubernetes cluster.
(recommended) configure a Kubernetes Service Connector and link it to the Kubernetes deployer stack component.
use in-cluster authentication, if your ZenML server runs inside the target cluster.

Kubernetes permissions

The Kubernetes deployer needs the following permissions in the target namespace:

Deployments: create, get, list, watch, update, delete
Services: create, get, list, watch, update, delete
Secrets: create, get, list, watch, update, delete
Pods: get, list, watch (for logs and status)
Namespaces: create, get (if creating namespaces)

If using additional resources (Ingress, HPA, NetworkPolicy, etc.), you'll also need permissions for those resource types.
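
The permissions above map naturally onto a namespaced RBAC Role. The following is a minimal sketch, not an official ZenML manifest: the role name `zenml-deployer` and the helper `build_deployer_role` are hypothetical, and the `pods/log` rule is an addition, because Kubernetes RBAC treats log access as a subresource separate from `pods`. Namespaces are cluster-scoped, so the optional namespace permissions would need a ClusterRole and are left out here.

```python
import json

# Verbs copied from the permissions list above.
FULL_ACCESS = ["create", "get", "list", "watch", "update", "delete"]
READ_ONLY = ["get", "list", "watch"]


def build_deployer_role(namespace: str, name: str = "zenml-deployer") -> dict:
    """Return a namespaced Role manifest (the role name is hypothetical)."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        "rules": [
            # Deployments live in the "apps" API group.
            {"apiGroups": ["apps"], "resources": ["deployments"], "verbs": FULL_ACCESS},
            # Services, Secrets and Pods live in the core ("") API group.
            {"apiGroups": [""], "resources": ["services", "secrets"], "verbs": FULL_ACCESS},
            {"apiGroups": [""], "resources": ["pods"], "verbs": READ_ONLY},
            # Assumption: reading logs needs the pods/log subresource.
            {"apiGroups": [""], "resources": ["pods/log"], "verbs": ["get"]},
        ],
    }


# kubectl accepts JSON manifests, so the result can be piped to `kubectl apply -f -`.
role_json = json.dumps(build_deployer_role("zenml-deployments"), indent=2)
```

The Role still has to be bound to the identity the deployer uses (for example with a RoleBinding to a service account) before it takes effect.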

Configuration use-case: local kubectl with context

This configuration assumes you have configured kubectl to authenticate with your cluster (e.g., by running kubectl config use-context <context-name>). This is the easiest way to get started:

zenml deployer register <DEPLOYER_NAME> \
    --flavor=kubernetes \
    --kubernetes_context=<CONTEXT_NAME> \
    --kubernetes_namespace=zenml-deployments

This setup is not portable to other machines unless they have the same kubectl context configured. For production and team environments, use a service connector instead.

Configuration use-case: Kubernetes Service Connector

This is the recommended approach for production and team environments. It makes credentials portable and manageable:

# Register a Kubernetes service connector
zenml service-connector register <CONNECTOR_NAME> \
    --type kubernetes \
    --auth-method=token \
    --token=<YOUR_TOKEN> \
    --server=<CLUSTER_URL> \
    --certificate_authority=<CA_CERT_PATH> \
    --resource-type kubernetes-cluster

# Register the deployer and link it to the connector
zenml deployer register <DEPLOYER_NAME> \
    --flavor=kubernetes \
    --kubernetes_namespace=zenml-deployments \
    --connector <CONNECTOR_NAME>

See the Kubernetes Service Connector documentation for more authentication methods including:

Service account tokens
Kubeconfig files
Cloud provider authentication (EKS, GKE, AKS)

Configuration use-case: In-cluster deployment

If your ZenML server runs inside the same Kubernetes cluster where you want to deploy pipelines, you can use in-cluster authentication:

zenml deployer register <DEPLOYER_NAME> \
    --flavor=kubernetes \
    --incluster=True \
    --kubernetes_namespace=zenml-deployments

This uses the service account token mounted into the pod running ZenML.
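
The three registration variants above differ only in how cluster access is supplied. As a rough sketch for scripting them, the hypothetical helper below assembles the `zenml deployer register` arguments using only the flags shown on this page. Treating the three access modes as mutually exclusive is a simplifying assumption, not something the CLI is known to enforce.

```python
import shlex
from typing import List, Optional


def deployer_register_args(
    name: str,
    namespace: str = "zenml-deployments",
    context: Optional[str] = None,
    connector: Optional[str] = None,
    incluster: bool = False,
) -> List[str]:
    """Build the argument list for one of the three access modes."""
    chosen = sum([context is not None, connector is not None, incluster])
    if chosen != 1:
        raise ValueError("choose exactly one of: context, connector, incluster")

    args = [
        "zenml", "deployer", "register", name,
        "--flavor=kubernetes",
        f"--kubernetes_namespace={namespace}",
    ]
    if context is not None:
        args.append(f"--kubernetes_context={context}")  # local kubectl context
    elif connector is not None:
        args += ["--connector", connector]  # service connector
    else:
        args.append("--incluster=True")  # in-cluster service account
    return args


# Render a copy-pasteable command line for the service connector variant.
command_line = shlex.join(deployer_register_args("k8s-deployer", connector="k8s-connector"))
```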

Configuring the stack

With the deployer registered, you can use it in your active stack:

# Register and activate a stack with the new deployer
zenml stack register <STACK_NAME> -D <DEPLOYER_NAME> ... --set

ZenML will build a Docker image called <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME> and use it to deploy your pipeline as a Kubernetes Deployment with a Service. Check out this page if you want to learn more about how ZenML builds these images and how you can customize them.

You can now deploy any ZenML pipeline using the Kubernetes deployer:

zenml pipeline deploy --name my_deployment my_module.my_pipeline

Advanced configuration

The Kubernetes deployer follows a progressive complexity model, allowing you to start simple and add configuration as needed:

Level 1: Essential settings (80% of use cases)

Most deployments only need these basic settings:

from zenml import pipeline, step
from zenml.integrations.kubernetes.flavors.kubernetes_deployer_flavor import (
    KubernetesDeployerSettings
)
from zenml.config import ResourceSettings

@step
def greet(name: str) -> str:
    return f"Hello {name}!"

# Basic deployment settings
settings = {
    "deployer": KubernetesDeployerSettings(
        namespace="my-namespace",
        service_type="LoadBalancer",  # or "NodePort", "ClusterIP"
        service_port=8000,
    ),
    "resources": ResourceSettings(
        cpu_count=1,
        memory="2GB",
        min_replicas=1,
        max_replicas=3,
    )
}

@pipeline(settings=settings)
def greet_pipeline(name: str = "World"):
    greet(name=name)

Level 2: Production-ready configuration

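For production deployments, add labels, monitoring annotations, and larger resource settings. Health probes use the defaults listed in the complete settings reference unless you override them: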

settings = {
    "deployer": KubernetesDeployerSettings(
        namespace="production",
        service_type="ClusterIP",  # Use with Ingress
        service_port=8000,

        # Labels for organization
        labels={
            "environment": "production",
            "team": "ml-platform",
            "version": "1.0",
        },
        
        # Prometheus monitoring
        annotations={
            "prometheus.io/scrape": "true",
            "prometheus.io/port": "8000",
            "prometheus.io/path": "/metrics",
        },
    ),
    "resources": ResourceSettings(
        cpu_count=2,
        memory="4GB",
        min_replicas=2,
        max_replicas=10,
    )
}
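
The block above does not set any probe fields, so the probe defaults from the settings reference apply. As a sketch of how overrides could be prepared, the hypothetical helper below validates keyword arguments against the probe field names documented on this page before they are passed to KubernetesDeployerSettings. The example values are illustrative, not recommendations.

```python
# Field names taken from the "Health Probes" part of the settings reference.
PROBE_FIELDS = {
    f"{kind}_probe_{suffix}"
    for kind in ("readiness", "liveness")
    for suffix in ("path", "initial_delay", "period", "timeout", "failure_threshold")
}


def probe_overrides(**overrides):
    """Reject keyword arguments that are not documented probe settings."""
    unknown = sorted(set(overrides) - PROBE_FIELDS)
    if unknown:
        raise ValueError(f"unknown probe settings: {unknown}")
    return overrides


probe_kwargs = probe_overrides(
    readiness_probe_path="/api/health",
    readiness_probe_initial_delay=20,  # give the service longer to warm up
    liveness_probe_initial_delay=60,
    liveness_probe_failure_threshold=5,
)

# Intended use (requires ZenML, so it is left as a comment here):
# KubernetesDeployerSettings(namespace="production", **probe_kwargs)
```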

Level 3: Additional resources

Deploy additional Kubernetes resources (Ingress, HorizontalPodAutoscaler, PodDisruptionBudget, NetworkPolicy, etc.) alongside your main Deployment and Service.

First, create a YAML file (e.g., k8s-resources.yaml) with your additional resources:

# Ingress for domain-based routing
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  namespace: {{namespace}}  # ZenML fills this
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: myapp.company.com  # Your domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: {{name}}  # ZenML fills this with the Service name
                port:
                  number: {{settings.service_port}}  # ZenML fills this

# HorizontalPodAutoscaler for autoscaling
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa
  namespace: {{namespace}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{name}}  # ZenML fills this with the Deployment name
  minReplicas: {{replicas}}  # ZenML fills this
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

# PodDisruptionBudget for high availability
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-pdb
  namespace: {{namespace}}
spec:
  minAvailable: 1
  selector:
    matchLabels:
      managed-by: {{labels['managed-by']}}
      zenml-deployment-id: {{labels['zenml-deployment-id']}}

Then reference this file in your deployment settings:

settings = {
    "deployer": KubernetesDeployerSettings(
        additional_resources=[
            "./k8s-resources.yaml"
        ],
        strict_additional_resources=True,  # Fail if any resource fails to apply
        # ... other settings
    )
}

Available template variables for use in your YAML files:

Core objects (access their properties directly):

{{deployment}}: Full DeploymentResponse object - access via {{deployment.id}}, {{deployment.name}}, etc.
{{settings}}: Full KubernetesDeployerSettings object - access via {{settings.service_port}}, {{settings.service_type}}, etc.

Common values:

{{name}}: Deployment/Service resource name (use this for Deployment name, Service name in Ingress, HPA target, etc.)
{{namespace}}: Kubernetes namespace
{{labels}}: Dict of all labels (includes ZenML-managed labels + custom labels)
- {{labels['managed-by']}}: Always 'zenml'
- {{labels['zenml-deployment-id']}}: Deployment UUID
- {{labels['zenml-deployment-name']}}: Human-readable deployment name
{{replicas}}: Configured replica count
{{image}}: Container image URI
{{command}}: Container command
{{args}}: Container args
{{env}}: Environment variables dict
{{resources}}: Resource requests/limits dict
{{pod_settings}}: KubernetesPodSettings object (if configured)
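
To see what a resource file looks like once the variables are filled in, it can help to preview the substitution locally. The sketch below is a simplified stand-in for real template rendering, not ZenML's implementation: it only handles the three access forms listed above (plain names, `object.attribute`, and `mapping['key']`), and `render_preview` as well as the example context values are hypothetical.

```python
import re
from types import SimpleNamespace

_PLACEHOLDER = re.compile(r"\{\{\s*(.*?)\s*\}\}")
_ITEM_ACCESS = re.compile(r"^(\w+)\['([^']+)'\]$")


def _resolve(expression: str, context: dict):
    """Resolve `name`, `obj.attr` or `mapping['key']` against the context."""
    item = _ITEM_ACCESS.match(expression)
    if item:
        return context[item.group(1)][item.group(2)]
    head, *attributes = expression.split(".")
    value = context[head]
    for attribute in attributes:
        value = getattr(value, attribute)
    return value


def render_preview(template: str, context: dict) -> str:
    """Substitute every {{ ... }} placeholder; unknown names raise an error."""
    return _PLACEHOLDER.sub(lambda m: str(_resolve(m.group(1), context)), template)


# Stand-in values; in a real deployment ZenML supplies the context.
example_context = {
    "name": "zenml-deployment-example",
    "namespace": "production",
    "replicas": 2,
    "settings": SimpleNamespace(service_port=8000, service_type="ClusterIP"),
    "labels": {"managed-by": "zenml"},
}
snippet = "name: {{name}}\nport: {{settings.service_port}}\nmanaged-by: {{labels['managed-by']}}"
preview = render_preview(snippet, example_context)
```

Because unknown names raise an error, a typo in a variable name shows up during the preview rather than at deployment time.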

Important prerequisites:

Ingress: Requires an ingress controller (nginx, traefik, etc.) installed in your cluster
HPA: Requires metrics-server installed (the autoscaling/v2 API used in the example above also requires Kubernetes 1.23 or later)
ServiceMonitor: Requires Prometheus Operator CRDs installed
CRDs: Any custom resources must have their CRDs installed beforehand
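
These prerequisites can be checked against a resource file before deploying. A minimal sketch, assuming each resource declares its `kind:` at the start of a line as in the example above: the hypothetical helper below scans the file text and reports which cluster components the declared kinds depend on. It does not contact the cluster, and kinds without a known prerequisite are ignored.

```python
import re

# Mapping taken from the prerequisites list above.
PREREQUISITES = {
    "Ingress": "an ingress controller (nginx, traefik, etc.)",
    "HorizontalPodAutoscaler": "metrics-server",
    "ServiceMonitor": "Prometheus Operator CRDs",
}

# Only top-level `kind:` lines count; nested ones (e.g. scaleTargetRef) are indented.
_KIND_LINE = re.compile(r"^kind:\s*(\S+)", re.MULTILINE)


def required_components(manifest_text: str) -> dict:
    """Map each declared kind that has a known prerequisite to that prerequisite."""
    kinds = _KIND_LINE.findall(manifest_text)
    return {kind: PREREQUISITES[kind] for kind in kinds if kind in PREREQUISITES}


example_manifest = """\
---
apiVersion: networking.k8s.io/v1
kind: Ingress
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
spec:
  scaleTargetRef:
    kind: Deployment
---
apiVersion: policy/v1
kind: PodDisruptionBudget
"""
needed = required_components(example_manifest)
```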

Level 4: Custom Deployment and Service templates

For maximum control, you can completely override the built-in Deployment and Service templates by providing your own Jinja2 templates. This allows you to customize every aspect of the core Kubernetes resources that ZenML creates.

When to use custom templates:

You need to add features not supported by the standard settings (init containers, sidecar containers, custom volume types)
You want complete control over health probe configuration beyond the provided settings
You need specific Kubernetes features for compliance or security requirements
You're migrating existing Kubernetes manifests to ZenML and want to maintain the exact structure

How it works:

Create a directory for your custom templates (e.g., ~/.zenml/k8s-templates/)
Add one or both of these files to override the built-in templates:
- deployment.yaml.j2 - Override the Deployment resource
- service.yaml.j2 - Override the Service resource
Configure the deployer to use your custom templates:

settings = {
    "deployer": KubernetesDeployerSettings(
        custom_templates_dir="~/.zenml/k8s-templates",
        # ... other settings
    )
}

Available template variables:

Your custom templates have access to all the same context variables as the built-in templates:

Core objects:

deployment: Full DeploymentResponse object
settings: Full KubernetesDeployerSettings object (access health probes, ports, etc. via settings.X)

Common values:

name: Deployment/Service resource name
namespace: Kubernetes namespace
image: Container image URI
replicas: Number of replicas
labels: Dict of all labels
command: Container command (list)
args: Container args (list)
env: Environment variables dict
resources: Resource requests/limits dict (with requests and limits keys)
pod_settings: KubernetesPodSettings object (access volumes, affinity, tolerations, etc.)

Example: Custom deployment template with init container

Create ~/.zenml/k8s-templates/deployment.yaml.j2:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ name }}
  namespace: {{ namespace }}
  labels:
    app: {{ name }}
    managed-by: zenml
    {% if labels %}
    {% for key, value in labels.items() %}
    {{ key }}: {{ value | tojson }}
    {% endfor %}
    {% endif %}
spec:
  replicas: {{ replicas | default(1) }}
  selector:
    matchLabels:
      managed-by: zenml
      zenml-deployment-id: {{ labels.get('zenml-deployment-id') | tojson }}
  template:
    metadata:
      labels:
        app: {{ name }}
        managed-by: zenml
        {% if labels %}
        {% for key, value in labels.items() %}
        {{ key }}: {{ value | tojson }}
        {% endfor %}
        {% endif %}
      {% if settings.annotations %}
      annotations:
        {% for key, value in settings.annotations.items() %}
        {{ key }}: {{ value | tojson }}
        {% endfor %}
      {% endif %}
    spec:
      {% if settings.service_account_name %}
      serviceAccountName: {{ settings.service_account_name }}
      {% endif %}
      
      # Custom init container
      initContainers:
      - name: init-setup
        image: busybox:1.28
        command: ['sh', '-c', 'echo "Initializing..." && sleep 5']
      
      containers:
      - name: main
        image: {{ image }}
        imagePullPolicy: {{ settings.image_pull_policy }}
        {% if command %}
        command: {{ command | to_yaml | indent(10, first=True) }}
        {% endif %}
        {% if args %}
        args: {{ args | to_yaml | indent(10, first=True) }}
        {% endif %}
        ports:
        - containerPort: {{ settings.service_port }}
          name: http
        {% if env %}
        env:
        {% for key, value in env.items() %}
          - name: {{ key }}
            value: {{ value | tojson }}
        {% endfor %}
        {% endif %}
        {% if resources %}
        resources:
          {% if "requests" in resources %}
          requests:
          {% for key, value in resources.requests.items() %}
            {{ key }}: {{ value | tojson }}
          {% endfor %}
          {% endif %}
          {% if "limits" in resources %}
          limits:
          {% for key, value in resources.limits.items() %}
            {{ key }}: {{ value | tojson }}
          {% endfor %}
          {% endif %}
        {% endif %}
        readinessProbe:
          httpGet:
            path: {{ settings.readiness_probe_path }}
            port: {{ settings.service_port }}
          initialDelaySeconds: {{ settings.readiness_probe_initial_delay }}
          periodSeconds: {{ settings.readiness_probe_period }}
        livenessProbe:
          httpGet:
            path: {{ settings.liveness_probe_path }}
            port: {{ settings.service_port }}
          initialDelaySeconds: {{ settings.liveness_probe_initial_delay }}
          periodSeconds: {{ settings.liveness_probe_period }}

Tip: Start with ZenML's built-in templates as a reference. You can find them in the ZenML repository on GitHub. Copy and modify them for your needs.

When using custom templates, you're responsible for maintaining compatibility with ZenML's deployment lifecycle. Ensure your templates:

Use the correct label selectors (zenml-deployment-id, managed-by: zenml) for resource tracking
Expose the correct container port ({{ settings.service_port }})
Include health probes for proper deployment tracking
Pass through environment variables ({{ env }})
Use {{ name }} for consistent resource naming
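
The checklist above can be turned into a quick text-level check to run before deploying. This is a rough sketch with a hypothetical helper, `missing_requirements`: it only looks for the expected strings in the template source, so a passing result does not prove that the rendered manifest is valid.

```python
import re

# One (description, pattern) pair per item in the checklist above.
REQUIREMENTS = [
    ("tracking label zenml-deployment-id", r"zenml-deployment-id"),
    ("tracking label managed-by", r"managed-by"),
    ("container port from settings.service_port", r"settings\.service_port"),
    ("readiness probe", r"readinessProbe"),
    ("liveness probe", r"livenessProbe"),
    ("environment variables from env", r"\benv\b"),
    ("resource name from {{ name }}", r"\{\{\s*name\s*\}\}"),
]


def missing_requirements(template_text: str) -> list:
    """Return the descriptions of checklist items not found in the template."""
    return [
        description
        for description, pattern in REQUIREMENTS
        if not re.search(pattern, template_text)
    ]
```

For example, `missing_requirements(open(path).read())` returns an empty list for a template that mentions every item, and a list of descriptions for whatever is absent.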

Complete settings reference

For a complete list of all available settings, see the KubernetesDeployerSettings class. Here's a comprehensive overview organized by category:

Basic Settings (common to all Deployers):

auth_key: User-defined authentication key for deployment API calls
generate_auth_key: Whether to generate a random authentication key
lcm_timeout: Maximum time in seconds to wait for deployment lifecycle operations

Essential Settings:

namespace: Kubernetes namespace for the deployment (defaults to deployer's kubernetes_namespace)
service_type: How to expose the service - LoadBalancer, NodePort, or ClusterIP (default: LoadBalancer)
service_port: Port to expose on the service (default: 8000)
image_pull_policy: When to pull images - Always, IfNotPresent, or Never (default: IfNotPresent)
labels: Additional labels to apply to all resources
annotations: Annotations to add to pod resources

Container Configuration:

command: Override container command/entrypoint
args: Override container args
service_account_name: Kubernetes service account for pods
image_pull_secrets: List of secret names for pulling private images

Health Probes:

readiness_probe_path: HTTP path for readiness probe (default: /api/health)
readiness_probe_initial_delay: Initial delay in seconds (default: 10)
readiness_probe_period: Probe interval in seconds (default: 10)
readiness_probe_timeout: Probe timeout in seconds (default: 5)
readiness_probe_failure_threshold: Failures before marking pod not ready (default: 3)
liveness_probe_path: HTTP path for liveness probe (default: /api/health)
liveness_probe_initial_delay: Initial delay in seconds (default: 30)
liveness_probe_period: Probe interval in seconds (default: 10)
liveness_probe_timeout: Probe timeout in seconds (default: 5)
liveness_probe_failure_threshold: Failures before restarting pod (default: 3)
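
To get a feel for how these probe values interact, the sketch below estimates how long Kubernetes takes to act on a probe that never succeeds: the first attempt runs after the initial delay, further attempts follow every period, and the last failing attempt is counted once its timeout expires. This is an approximation based on standard Kubernetes probe semantics that ignores scheduling jitter. The `Probe` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    initial_delay: int
    period: int
    timeout: int
    failure_threshold: int

    def seconds_until_action(self) -> int:
        """Approximate time from container start until the threshold is hit."""
        return (
            self.initial_delay
            + (self.failure_threshold - 1) * self.period
            + self.timeout
        )


# Defaults copied from the list above.
READINESS_DEFAULT = Probe(initial_delay=10, period=10, timeout=5, failure_threshold=3)
LIVENESS_DEFAULT = Probe(initial_delay=30, period=10, timeout=5, failure_threshold=3)

print(READINESS_DEFAULT.seconds_until_action())  # 35: pod marked not ready
print(LIVENESS_DEFAULT.seconds_until_action())   # 55: container restarted
```

Under this estimate, a service that needs longer than about 55 seconds to start answering on the liveness path would be restarted with the defaults, so slow-starting pipelines may need a larger liveness_probe_initial_delay.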

Advanced Settings:

pod_settings: Advanced pod configuration (see KubernetesPodSettings)
additional_resources: List of paths to YAML files with additional K8s resources
strict_additional_resources: If True, fail deployment if any additional resource fails (default: True)
custom_templates_dir: Path to directory with custom Jinja2 templates

Internal Settings:

wait_for_load_balancer_timeout: Timeout for LoadBalancer IP assignment (default: 150 seconds, 0 to skip)
deployment_ready_check_interval: Interval between readiness checks (default: 2 seconds)
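
The two internal settings describe a timeout and a polling interval. As an illustration of the semantics they suggest, and not of ZenML's actual implementation, the generic loop below polls a check function until it returns a value or the timeout passes, and skips waiting entirely when the timeout is 0. The helper `wait_for` and its parameters are hypothetical.

```python
import time
from typing import Callable, Optional


def wait_for(
    check: Callable[[], Optional[str]],
    timeout: float = 150.0,
    interval: float = 2.0,
) -> Optional[str]:
    """Poll `check` until it returns a value or `timeout` seconds pass.

    A timeout of 0 skips waiting, mirroring the documented "0 to skip".
    """
    if timeout == 0:
        return None
    deadline = time.monotonic() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        # Stop if another sleep would overshoot the deadline.
        if time.monotonic() + interval > deadline:
            return None
        time.sleep(interval)
```

Here `check` could be any callable that returns the external IP once it is assigned and `None` until then.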

Check out this docs page for more information on how to specify settings.

Troubleshooting

Deployment stuck in pending state

Check pod events and logs:

# Get deployment info
zenml deployment describe my-deployment

# Follow logs
zenml deployment logs my-deployment -f

# Check Kubernetes resources directly
kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>

Common causes:

Insufficient cluster resources
Image pull errors (check image_pull_secrets)
Node selector/affinity constraints not satisfied
PersistentVolumeClaim pending

LoadBalancer not getting external IP

If you use the LoadBalancer service type and the external IP stays <pending>:

Check if your cluster supports LoadBalancer (cloud providers usually do, local clusters usually don't)
For local clusters, use NodePort instead
For production without LoadBalancer support, use ClusterIP with an Ingress

Additional resources failing to apply

If using strict_additional_resources=True and deployment fails:

# Check which resource failed
zenml deployment describe my-deployment

# Validate resources manually (replace any {{...}} template variables with concrete values first)
kubectl apply --dry-run=client -f k8s-resources.yaml

Common issues:

Missing CRDs (install required operators)
Missing cluster components (metrics-server for HPA, ingress controller for Ingress)
Invalid resource references (check template variables)

Image pull errors

If pods can't pull the container image:

Verify image exists in registry: docker pull <image>
Check image_pull_secrets are configured correctly
Verify service account has access to registry
Check image pull policy (IfNotPresent vs Always)

Best practices

Use service connectors in production for portable, manageable credentials
Always configure health probes for production deployments
Use Ingress with ClusterIP instead of LoadBalancer for cost and flexibility
Use labels and annotations for organization, monitoring, and cost tracking
Configure resource limits to prevent resource exhaustion
Use HPA for autoscaling based on actual load
Configure PodDisruptionBudget for high availability during cluster updates
Keep additional resources in version control alongside your pipeline code
