Workspace Server
Configuration reference for the ZenML Workspace Server.
This page provides the configuration reference for the ZenML Workspace Server, including the workload manager that enables running pipelines from the UI. For an overview of what the Workspace Server does, see System Architecture.
Permissions
When running your own Workspace Server, you need full CRUD permissions on a dedicated database. Only MySQL is supported; PostgreSQL is not supported for workspace servers.
Network Requirements
| Direction | Source/Destination | Protocol | Purpose |
|---|---|---|---|
| Ingress | ZenML SDK clients | HTTPS | API requests from developers and CI/CD |
| Ingress | ZenML Pro Dashboard | HTTPS | UI data requests |
| Ingress | Orchestrator pods/tasks | HTTPS | Pipeline status updates, metadata logging |
| Egress | Database | TCP | MySQL persistent storage |
| Egress | Control Plane | HTTPS | Authentication |
| Egress | Secrets backend | HTTPS | AWS Secrets Manager, GCP Secret Manager, etc. |
| Egress | Artifact Store | HTTPS | Artifact visualizations |
| Egress | Kubernetes API | HTTPS | Workload manager pod creation (port 6443) |
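Before deploying, it can help to confirm that the egress paths listed above are actually open from the network where the Workspace Server will run. The following is a minimal sketch using only the Python standard library. The host names are hypothetical placeholders, and port 3306 is assumed as the usual MySQL default (this page only specifies port 6443 for the Kubernetes API). It checks TCP reachability only, not TLS or authentication.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


# Hypothetical endpoints; replace with the hosts used in your deployment.
EGRESS_TARGETS = {
    "database": ("mysql.internal.example", 3306),      # assumed MySQL default port
    "kubernetes-api": ("k8s.internal.example", 6443),  # port named on this page
}


def check_egress(targets):
    """Map each target name to whether a TCP connection succeeded."""
    return {name: can_connect(host, port) for name, (host, port) in targets.items()}
```

Running `check_egress(EGRESS_TARGETS)` from a pod or VM in the same network as the server returns a dictionary of target names to booleans, which makes blocked routes easy to spot.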
Workload Manager
The Workspace Server includes a workload manager that enables running pipelines directly from the ZenML Pro UI. This requires access to a Kubernetes cluster where ad-hoc runner pods can be created.
Snapshots are only available from ZenML workspace server version 0.90.0 onwards.
Requirements
Kubernetes cluster (1.24+) accessible from the workspace server
Dedicated namespace for runner pods
Service account with RBAC permissions to create/manage pods
Supported Implementations
| Implementation | Platform | Description |
|---|---|---|
| KubernetesWorkloadManager | Any Kubernetes (EKS, GKE, AKS, self-managed) | Standard setup, fast minimalistic configuration |
| AWSKubernetesWorkloadManager | EKS | AWS-native with ECR image building and S3 log storage |
| GCPKubernetesWorkloadManager | GKE | GCP-native with GCR support (GCS log storage planned) |
Environment Variables Reference
Required for all implementations:
| Variable | Required | Description |
|---|---|---|
| ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE | Yes | Implementation class (see values below) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE | Yes | Kubernetes namespace for runner jobs (must exist) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT | Yes | Kubernetes service account for runner jobs (must exist) |
Implementation source values:
zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager
zenml_cloud_plugins.aws_kubernetes_workload_manager.AWSKubernetesWorkloadManager
zenml_cloud_plugins.gcp_kubernetes_workload_manager.GCPKubernetesWorkloadManager
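A quick pre-flight check can catch a missing variable or a mistyped implementation source before the server starts. The sketch below is illustrative only and is not part of ZenML: `check_required` is a hypothetical helper that takes the environment as a plain dictionary and reports problems with the three required variables listed above.

```python
REQUIRED_VARS = (
    "ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE",
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE",
    "ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT",
)

IMPLEMENTATION_SOURCES = (
    "zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager",
    "zenml_cloud_plugins.aws_kubernetes_workload_manager.AWSKubernetesWorkloadManager",
    "zenml_cloud_plugins.gcp_kubernetes_workload_manager.GCPKubernetesWorkloadManager",
)


def check_required(env):
    """Return a list of problems; an empty list means the basics look fine."""
    # Treat unset and empty values the same way.
    problems = [f"missing: {name}" for name in REQUIRED_VARS if not env.get(name)]
    source = env.get(REQUIRED_VARS[0])
    if source and source not in IMPLEMENTATION_SOURCES:
        problems.append(f"unknown implementation source: {source}")
    return problems
```

Calling `check_required(dict(os.environ))` (after `import os`) in a deployment script would apply the same check to the real process environment.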
Runner image configuration:
| Variable | Required | Description |
|---|---|---|
| ZENML_KUBERNETES_WORKLOAD_MANAGER_BUILD_RUNNER_IMAGE | No | Whether to build runner images (default: false) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_DOCKER_REGISTRY | Conditional | Registry for runner images (required if building images) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_RUNNER_IMAGE | No | Pre-built runner image (used if not building). Must have all requirements to instantiate the stack. |
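The three runner image variables interact: building images requires a registry, while a pre-built image is only used when building is off. The sketch below spells out that decision as a hypothetical helper, `runner_image_plan`. How the server parses boolean values is not documented here, so the accepted truthy spellings are an assumption.

```python
PREFIX = "ZENML_KUBERNETES_WORKLOAD_MANAGER_"


def is_true(value):
    # Assumption: common truthy spellings; the server's actual parsing may differ.
    return str(value or "").strip().lower() in {"1", "true", "yes"}


def runner_image_plan(env):
    """Return ("build", registry) or ("prebuilt", image_or_None)."""
    if is_true(env.get(PREFIX + "BUILD_RUNNER_IMAGE")):
        registry = env.get(PREFIX + "DOCKER_REGISTRY")
        if not registry:
            raise ValueError(
                PREFIX + "DOCKER_REGISTRY is required when building runner images"
            )
        return ("build", registry)
    # Building is off (the documented default), so a pre-built image is used if set.
    return ("prebuilt", env.get(PREFIX + "RUNNER_IMAGE"))
```

A result of `("prebuilt", None)` means neither option was configured, which is worth flagging in a deployment review.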
Optional configuration:
| Variable | Description |
|---|---|
| ZENML_KUBERNETES_WORKLOAD_MANAGER_ENABLE_EXTERNAL_LOGS | Store logs externally (default: false, AWS only) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_RESOURCES | Pod resources in JSON format |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_TTL_SECONDS_AFTER_FINISHED | Cleanup time for finished jobs (default: 2 days) |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_NODE_SELECTOR | Node selector in JSON format |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_TOLERATIONS | Tolerations in JSON format |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_JOB_BACKOFF_LIMIT | Backoff limit for builder/runner jobs |
| ZENML_KUBERNETES_WORKLOAD_MANAGER_POD_FAILURE_POLICY | Pod failure policy for builder/runner jobs |
| ZENML_SERVER_MAX_CONCURRENT_TEMPLATE_RUNS | Max concurrent snapshot runs per pod (default: 2) |
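Several of the optional variables take JSON strings, which are easy to get wrong when typed by hand. One way to avoid quoting mistakes is to build the values as Python objects and serialize them. In the sketch below, the shape of the pod resources (Kubernetes-style `requests`/`limits`) and of the tolerations is an assumption based on standard Kubernetes fields, since this page does not document the expected schema, and the concrete values are hypothetical. The TTL is assumed to be expressed in seconds, as the variable name suggests, so 2 days corresponds to 172800.

```python
import json
from datetime import timedelta

PREFIX = "ZENML_KUBERNETES_WORKLOAD_MANAGER_"


def optional_settings():
    """Build JSON-encoded values for the optional workload manager variables."""
    # Assumed shape: Kubernetes-style resource requests and limits.
    pod_resources = {
        "requests": {"cpu": "500m", "memory": "512Mi"},
        "limits": {"memory": "1Gi"},
    }
    node_selector = {"kubernetes.io/os": "linux"}
    # Hypothetical taint; fields follow the standard Kubernetes toleration spec.
    tolerations = [
        {"key": "dedicated", "operator": "Equal", "value": "zenml", "effect": "NoSchedule"}
    ]
    ttl_seconds = int(timedelta(days=2).total_seconds())
    return {
        PREFIX + "POD_RESOURCES": json.dumps(pod_resources),
        PREFIX + "NODE_SELECTOR": json.dumps(node_selector),
        PREFIX + "TOLERATIONS": json.dumps(tolerations),
        PREFIX + "TTL_SECONDS_AFTER_FINISHED": str(ttl_seconds),
    }
```

Serializing with `json.dumps` guarantees double-quoted keys and valid syntax, so the only remaining risk is whether the structure matches what the server expects.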
AWS-specific variables:
| Variable | Required | Description |
|---|---|---|
| ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_BUCKET | Conditional | S3 bucket for logs (required if external logs enabled) |
| ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_REGION | Conditional | AWS region (required if building images) |
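The two AWS variables are each required only when a related feature is switched on. A small check like the following can make those conditions explicit. `missing_aws_vars` is a hypothetical helper, not a ZenML function, and it assumes the flags are enabled with the literal string `true`.

```python
K8S = "ZENML_KUBERNETES_WORKLOAD_MANAGER_"
AWS = "ZENML_AWS_KUBERNETES_WORKLOAD_MANAGER_"


def missing_aws_vars(env):
    """Return the AWS variables that are required by the enabled features but unset."""

    def enabled(name):
        # Assumption: the flag is turned on with the string "true".
        return str(env.get(name, "")).strip().lower() == "true"

    missing = []
    if enabled(K8S + "ENABLE_EXTERNAL_LOGS") and not env.get(AWS + "BUCKET"):
        missing.append(AWS + "BUCKET")
    if enabled(K8S + "BUILD_RUNNER_IMAGE") and not env.get(AWS + "REGION"):
        missing.append(AWS + "REGION")
    return missing
```

An empty list means no conditional AWS variable is missing for the features that are currently enabled.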
Configuration Examples
Minimal Kubernetes Configuration:
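As a rough sketch, a minimal setup presumably amounts to the three required variables plus, optionally, a pre-built runner image. The helper below assembles those values and renders them as `KEY=value` lines that can be adapted to a Helm values file, a Kubernetes manifest, or a shell environment. The function names and the sample namespace, service account, and image are hypothetical; only the variable names and the implementation source come from the reference above.

```python
def minimal_workload_manager_env(namespace, service_account, runner_image=None):
    """Assemble the required variables for the generic Kubernetes implementation."""
    env = {
        "ZENML_SERVER_WORKLOAD_MANAGER_IMPLEMENTATION_SOURCE": (
            "zenml_cloud_plugins.kubernetes_workload_manager.KubernetesWorkloadManager"
        ),
        "ZENML_KUBERNETES_WORKLOAD_MANAGER_NAMESPACE": namespace,
        "ZENML_KUBERNETES_WORKLOAD_MANAGER_SERVICE_ACCOUNT": service_account,
    }
    if runner_image:
        # Optional: only set when a pre-built runner image is used.
        env["ZENML_KUBERNETES_WORKLOAD_MANAGER_RUNNER_IMAGE"] = runner_image
    return env


def render_env_lines(env):
    """Render a dictionary as KEY=value lines, one per variable."""
    return "\n".join(f"{key}={value}" for key, value in env.items())


if __name__ == "__main__":
    # Hypothetical namespace and service account names.
    print(render_env_lines(minimal_workload_manager_env("zenml-runners", "zenml-runner-sa")))
```

Remember that the namespace and service account must already exist in the cluster; this sketch only produces the configuration values.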
Full AWS Configuration:
Full GCP Configuration:
Kubernetes RBAC
The service account needs these permissions in the workload manager namespace:
High Availability
For production deployments, consider multiple replicas (2+) behind a load balancer, database replication with read replicas, liveness/readiness probes, and auto-scaling based on CPU/memory utilization.
Related Documentation
System Architecture - How components interact
Control Plane Configuration - Configure the Control Plane
Upgrades - Workspace Server - How to upgrade the Workspace Server