Workspace Server

Configuration reference for the ZenML Workspace Server.

This page covers the Workspace Server's permissions, network requirements, and workload manager, which enables running pipelines from the UI. For an overview of what the Workspace Server does, see System Architecture.

This configuration is relevant for Hybrid and Self-hosted deployments. In SaaS deployments, the Workspace Server is fully managed by ZenML.

Permissions

When running your own Workspace Server, you need full CRUD permissions on a dedicated database. Only MySQL is supported; PostgreSQL is not supported for workspace servers.

Network Requirements

| Direction | Source/Destination | Protocol | Purpose |
|---|---|---|---|
| Ingress | ZenML SDK clients | HTTPS | API requests from developers and CI/CD |
| Ingress | ZenML Pro Dashboard | HTTPS | UI data requests |
| Ingress | Orchestrator pods/tasks | HTTPS | Pipeline status updates, metadata logging |
| Egress | Database | TCP | MySQL persistent storage |
| Egress | Control Plane | HTTPS | Authentication |
| Egress | Secrets backend | HTTPS | AWS Secrets Manager, GCP Secret Manager, etc. |
| Egress | Artifact Store | HTTPS | Artifact visualizations |
| Egress | Kubernetes API | HTTPS | Workload manager pod creation (port 6443) |
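
The egress rows above can be spot-checked from the host or pod that runs the Workspace Server. The following is a minimal sketch, not part of ZenML: the helper names and the hostnames in `EGRESS_TARGETS` are hypothetical placeholders, and port 3306 is assumed only because it is the MySQL default. It tests TCP reachability only; it does not validate TLS or credentials.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_egress(targets: dict[str, tuple[str, int]]) -> dict[str, bool]:
    """Map each named target to whether it is reachable over TCP."""
    return {name: can_connect(host, port) for name, (host, port) in targets.items()}


# Hypothetical endpoints; replace with the hosts used in your deployment.
EGRESS_TARGETS = {
    "database": ("mysql.internal.example", 3306),          # MySQL default port (assumption)
    "kubernetes_api": ("k8s-api.internal.example", 6443),  # port taken from the table above
}
```

Calling `check_egress(EGRESS_TARGETS)` returns a dictionary such as `{"database": True, "kubernetes_api": False}`, which makes a blocked firewall rule easy to spot before debugging the server itself.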

Workload Manager

The Workspace Server includes a workload manager that enables running pipelines directly from the ZenML Pro UI. This requires access to a Kubernetes cluster where ad-hoc runner pods can be created.

Requirements

  • Kubernetes cluster (1.24+) accessible from the workspace server

  • Dedicated namespace for runner pods

  • Service account with RBAC permissions to create/manage pods

Supported Implementations

| Implementation | Platform | Use Case |
|---|---|---|
| KubernetesWorkloadManager | Any Kubernetes (EKS, GKE, AKS, self-managed) | Standard setup, fast minimalistic configuration |
| AWSKubernetesWorkloadManager | EKS | AWS-native with ECR image building and S3 log storage |
| GCPKubernetesWorkloadManager | GKE | GCP-native with GCR support (GCS log storage planned) |

Environment Variables Reference

Required for all implementations:

| Variable | Required | Description |
|---|---|---|
| ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE | Yes | Implementation class (see values below) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE | Yes | Kubernetes namespace for runner jobs (must exist) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT | Yes | Kubernetes service account for runner jobs (must exist) |

Implementation source values:

  • zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager

  • zenml_cloud_plugins.aws_kubernetes_workload_manager.AWSKubernetesWorkloadManager

  • zenml_cloud_plugins.gcp_kubernetes_workload_manager.GCPKubernetesWorkloadManager
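
A pre-deployment check for the required variables might look like the sketch below. It is an illustration only: `validate_required` is a hypothetical helper, not a ZenML function, and it assumes the three implementation source values listed above are the only valid ones.

```python
import os

# Values copied from the list above.
IMPLEMENTATION_SOURCES = {
    "zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager",
    "zenml_cloud_plugins.aws_kubernetes_workload_manager.AWSKubernetesWorkloadManager",
    "zenml_cloud_plugins.gcp_kubernetes_workload_manager.GCPKubernetesWorkloadManager",
}

SOURCE_VAR = "ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE"
REQUIRED_VARS = (
    SOURCE_VAR,
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE",
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT",
)


def validate_required(env) -> list[str]:
    """Return a list of human-readable problems; an empty list means the check passed."""
    errors = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    source = env.get(SOURCE_VAR)
    if source and source not in IMPLEMENTATION_SOURCES:
        errors.append(f"{SOURCE_VAR} has an unknown value: {source}")
    return errors


# Hypothetical usage against the current process environment.
for problem in validate_required(os.environ):
    print(problem)
```

The check only confirms that the variables are present; it cannot confirm that the namespace and service account actually exist in the cluster.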

Runner image configuration:

| Variable | Required | Description |
|---|---|---|
| ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE | No | Whether to build runner images (default: false) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY | Conditional | Registry for runner images (required if building images) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_RUNNER_IMAGE | No | Pre-built runner image (used if not building). Must have all requirements to instantiate the stack. |
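
The interaction between these three variables can be summarized in a few lines. This is a sketch of the rules as described in the table, not the server's actual logic: `runner_image_mode` is a hypothetical helper, and it assumes the build flag is written as the literal string `true`.

```python
BUILD = "ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE"
REGISTRY = "ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY"
IMAGE = "ZENML_KUBERNETES_WORKLOAD_MANAGER_RUNNER_IMAGE"


def runner_image_mode(env) -> str:
    """Classify a configuration as 'build', 'prebuilt', or 'unspecified'."""
    # Assumption: the flag is the string "true"/"false"; the default is false.
    build = env.get(BUILD, "false").strip().lower() == "true"
    if build:
        if not env.get(REGISTRY):
            # The registry is required whenever images are built.
            raise ValueError(f"{REGISTRY} must be set when {BUILD} is true")
        return "build"
    if env.get(IMAGE):
        # A pre-built image is only used when building is disabled.
        return "prebuilt"
    return "unspecified"
```

For example, `runner_image_mode({BUILD: "true"})` raises a `ValueError` because no registry is configured, while a configuration that sets only the runner image is classified as `"prebuilt"`.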

Optional configuration:

| Variable | Description |
|---|---|
| ZENML_KUBERNETES_WORKLOAD_MANAGER_ENABLE_EXTERNAL_LOGS | Store logs externally (default: false, AWS only) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES | Pod resources in JSON format |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_TTL_SECONDS_AFTER_FINISHED | Cleanup time for finished jobs (default: 2 days) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_NODE_SELECTOR | Node selector in JSON format |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_TOLERATIONS | Tolerations in JSON format |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_JOB_BACKOFF_LIMIT | Backoff limit for builder/runner jobs |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_FAILURE_POLICY | Pod failure policy for builder/runner jobs |
| ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS | Max concurrent snapshot runs per pod (default: 2) |
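
Several of these variables take JSON strings, which are easy to get wrong when typed by hand. The sketch below builds them with `json.dumps`. It assumes the JSON mirrors the corresponding Kubernetes pod spec fields (`resources`, `nodeSelector`, `tolerations`) and that the TTL is expressed in seconds, as the variable name suggests; the concrete resource sizes, labels, and taints are illustrative placeholders.

```python
import json

# Assumed shape: same structure as the Kubernetes container `resources` field.
pod_resources = {
    "requests": {"cpu": "100m", "memory": "400Mi"},
    "limits": {"memory": "700Mi"},
}

# Assumed shape: a flat label map, as in the Kubernetes `nodeSelector` field.
node_selector = {"kubernetes.io/arch": "amd64"}

# Assumed shape: a list of Kubernetes toleration objects (hypothetical taint).
tolerations = [
    {"key": "dedicated", "operator": "Equal", "value": "zenml", "effect": "NoSchedule"}
]

optional_env = {
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES": json.dumps(pod_resources),
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_NODE_SELECTOR": json.dumps(node_selector),
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_TOLERATIONS": json.dumps(tolerations),
    # The documented default of 2 days, expressed in seconds.
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_TTL_SECONDS_AFTER_FINISHED": str(2 * 24 * 60 * 60),
}

for name, value in optional_env.items():
    # Single quotes keep the JSON intact when pasted into a shell or env file.
    print(f"{name}='{value}'")
```

Generating the strings programmatically guarantees valid JSON, which avoids quoting mistakes in Helm values or env files.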

AWS-specific variables:

| Variable | Required | Description |
|---|---|---|
| ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_BUCKET | Conditional | S3 bucket for logs (required if external logs enabled) |
| ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_REGION | Conditional | AWS region (required if building images) |

Configuration Examples

Minimal Kubernetes Configuration:
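
A minimal setup needs only the three required variables. The sketch below expresses them as a Python dictionary and prints `KEY=value` lines that can be copied into an env file or translated into your deployment tooling. The namespace and service account names are hypothetical placeholders; both must already exist in the cluster.

```python
minimal_env = {
    "ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE": (
        "zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager"
    ),
    # Hypothetical names; replace with your own namespace and service account.
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE": "zenml-workload-manager",
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT": "zenml-workload-manager",
}

for name, value in minimal_env.items():
    print(f"{name}={value}")
```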

Full AWS Configuration:
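
An AWS setup that builds runner images and stores logs in S3 could combine the variables from the tables above as follows. This is a sketch under stated assumptions: the account ID, region, registry path, bucket, namespace, and service account are all hypothetical placeholders, and boolean flags are assumed to be written as the string `true`.

```python
aws_env = {
    "ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE": (
        "zenml_cloud_plugins.aws_kubernetes_workload_manager.AWSKubernetesWorkloadManager"
    ),
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE": "zenml-workload-manager",        # placeholder
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT": "zenml-workload-manager",  # placeholder
    # Building images requires a registry and the AWS region.
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE": "true",
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY": (
        "123456789012.dkr.ecr.eu-central-1.amazonaws.com"  # placeholder ECR registry
    ),
    "ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_REGION": "eu-central-1",                  # placeholder
    # External logs require a bucket.
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_ENABLE_EXTERNAL_LOGS": "true",
    "ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_BUCKET": "my-zenml-runner-logs",         # placeholder
}

for name, value in aws_env.items():
    print(f"{name}={value}")
```

The two conditional variables are set because their triggers are enabled: the region because images are built, and the bucket because external logs are on.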

Full GCP Configuration:
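
A GCP setup follows the same pattern but omits external log storage, which the optional configuration table marks as AWS only. The sketch below is illustrative: the project, registry path, namespace, and service account are hypothetical placeholders, and the build flag is assumed to be written as the string `true`.

```python
gcp_env = {
    "ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE": (
        "zenml_cloud_plugins.gcp_kubernetes_workload_manager.GCPKubernetesWorkloadManager"
    ),
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE": "zenml-workload-manager",        # placeholder
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT": "zenml-workload-manager",  # placeholder
    # Building images requires a registry; a GCR path is assumed here.
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE": "true",
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY": "gcr.io/my-gcp-project",   # placeholder
}

# ENABLE_EXTERNAL_LOGS is deliberately left unset: it is documented as AWS only.
for name, value in gcp_env.items():
    print(f"{name}={value}")
```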

Kubernetes RBAC

The service account needs these permissions in the workload manager namespace:
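
The exact rule set is not reproduced here. As a starting point, the sketch below generates a namespaced Role and RoleBinding as JSON, which `kubectl apply -f` accepts. The resources and verbs are assumptions inferred from this page (runner pods, their logs, and builder/runner jobs); the object names, namespace, and service account are hypothetical placeholders. Review the rules against your security requirements before applying them.

```python
import json

NAMESPACE = "zenml-workload-manager"        # placeholder
SERVICE_ACCOUNT = "zenml-workload-manager"  # placeholder
ROLE_NAME = "zenml-workload-manager"        # placeholder

role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": ROLE_NAME, "namespace": NAMESPACE},
    "rules": [
        # Assumed: manage runner pods in the namespace.
        {"apiGroups": [""], "resources": ["pods"],
         "verbs": ["get", "list", "watch", "create", "delete"]},
        # Assumed: read pod logs.
        {"apiGroups": [""], "resources": ["pods/log"], "verbs": ["get", "list"]},
        # Assumed: manage builder/runner jobs.
        {"apiGroups": ["batch"], "resources": ["jobs"],
         "verbs": ["get", "list", "watch", "create", "delete"]},
    ],
}

role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": ROLE_NAME, "namespace": NAMESPACE},
    "subjects": [
        {"kind": "ServiceAccount", "name": SERVICE_ACCOUNT, "namespace": NAMESPACE}
    ],
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "Role",
        "name": ROLE_NAME,
    },
}

# A v1 List lets both objects be applied from a single file.
manifest = {"apiVersion": "v1", "kind": "List", "items": [role, role_binding]}
print(json.dumps(manifest, indent=2))
```

Using a Role rather than a ClusterRole keeps the permissions scoped to the workload manager namespace, matching the requirement stated above.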

High Availability

For production deployments, consider multiple replicas (2+) behind a load balancer, database replication with read replicas, liveness/readiness probes, and auto-scaling based on CPU/memory utilization.
