SkyPilot

Use SkyPilot with ZenML.

The ZenML SkyPilot VM Orchestrator lets you provision and manage VMs on any supported cloud provider (AWS, GCP, Azure, Lambda Labs) to run your ML pipelines. It simplifies running ML workloads in the cloud, can lower cost (for example through spot instances), and offers high GPU availability.

Prerequisites

To use the SkyPilot VM Orchestrator, you'll need:

ZenML SkyPilot integration for your cloud provider installed (zenml integration install <PROVIDER> skypilot_<PROVIDER>)
Docker installed and running
A remote artifact store and container registry in your ZenML stack
A remote ZenML deployment
Appropriate permissions to provision VMs on your cloud provider
A service connector configured to authenticate with your cloud provider (not needed for Lambda Labs)
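
Two of these prerequisites can be checked from a local script: whether the `zenml` and `docker` executables are on the PATH, and whether the Docker daemon responds. The following is a minimal sketch under that assumption. `check_local_prerequisites` is a hypothetical helper name, and it does not verify cloud permissions, the remote stack components, or the service connector.

```python
import shutil
import subprocess
from typing import Callable, Dict, Optional


def check_local_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, bool]:
    """Report which locally checkable prerequisites appear to be satisfied."""
    results = {
        "zenml_cli": which("zenml") is not None,
        "docker_cli": which("docker") is not None,
        "docker_daemon": False,
    }
    if results["docker_cli"]:
        try:
            # `docker info` exits non-zero when the daemon is not reachable.
            proc = subprocess.run(
                ["docker", "info"],
                capture_output=True,
                timeout=15,
                check=False,
            )
            results["docker_daemon"] = proc.returncode == 0
        except (OSError, subprocess.TimeoutExpired):
            # The executable could not be started or did not answer in time.
            results["docker_daemon"] = False
    return results
```

Calling `check_local_prerequisites()` returns a dictionary with one boolean per check. The `which` parameter exists only so the lookup can be replaced in tests.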

Configuring the Orchestrator

Configuration steps vary by cloud provider:

AWS, GCP, Azure:

Install the SkyPilot integration and connectors extra for your provider
Register a service connector with credentials that have SkyPilot's required permissions
Register the orchestrator and connect it to the service connector
Register and activate a stack with the new orchestrator

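# Register a service connector, picking up credentials from the local environment
zenml service-connector register <PROVIDER>-skypilot-vm -t <PROVIDER> --auto-configure
# Register the orchestrator, then attach it to the service connector
zenml orchestrator register <ORCHESTRATOR_NAME> --flavor vm_<PROVIDER>
zenml orchestrator connect <ORCHESTRATOR_NAME> --connector <PROVIDER>-skypilot-vm
# Register the stack and set it as the active stack
zenml stack register <STACK_NAME> -o <ORCHESTRATOR_NAME> ... --set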
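
To show how the `<PROVIDER>` and `<ORCHESTRATOR_NAME>` placeholders are filled in, the sketch below renders the install command from the prerequisites and the first three commands above for a concrete provider. It only builds strings and does not call the ZenML CLI. `skypilot_setup_commands` and the orchestrator name `skypilot-aws` are hypothetical, and the stack registration is left out because it depends on the other stack components, which are elided above.

```python
import shlex
from typing import List

# Providers that use the service-connector flow described above.
SUPPORTED_PROVIDERS = ("aws", "gcp", "azure")


def skypilot_setup_commands(provider: str, orchestrator_name: str) -> List[str]:
    """Fill in <PROVIDER> and <ORCHESTRATOR_NAME> in the setup commands."""
    provider = provider.lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(
            f"expected one of {SUPPORTED_PROVIDERS}, got {provider!r}"
        )
    connector = f"{provider}-skypilot-vm"
    commands = [
        ["zenml", "integration", "install", provider, f"skypilot_{provider}"],
        ["zenml", "service-connector", "register", connector,
         "-t", provider, "--auto-configure"],
        ["zenml", "orchestrator", "register", orchestrator_name,
         "--flavor", f"vm_{provider}"],
        ["zenml", "orchestrator", "connect", orchestrator_name,
         "--connector", connector],
    ]
    # shlex.join quotes each argument so the line is safe to paste into a POSIX shell.
    return [shlex.join(cmd) for cmd in commands]


for line in skypilot_setup_commands("aws", "skypilot-aws"):
    print(line)
```

Keeping each command as a list of arguments until the final join means the same lists could be handed to `subprocess.run` directly instead of being printed.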

Lambda Labs:

Install the SkyPilot Lambda integration
Register a secret with your Lambda Labs API key
Register the orchestrator with the API key secret
Register and activate a stack with the new orchestrator

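# Store the Lambda Labs API key as a ZenML secret
zenml secret create lambda_api_key --scope user --api_key=<KEY>
# Reference the secret in the orchestrator config using {{<secret_name>.<key>}}
zenml orchestrator register <ORCHESTRATOR_NAME> --flavor vm_lambda --api_key={{lambda_api_key.api_key}}
# Register the stack and set it as the active stack
zenml stack register <STACK_NAME> -o <ORCHESTRATOR_NAME> ... --set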
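
The orchestrator command above refers to the secret by name and key rather than containing the API key itself. The sketch below shows how the reference string relates to the secret created in the first command. `secret_reference` and `lambda_orchestrator_command` are hypothetical helpers, and the check on forbidden characters is an illustrative guard, not a rule taken from ZenML.

```python
def secret_reference(secret_name: str, key: str) -> str:
    """Build a reference of the form {{<secret_name>.<key>}}."""
    for part in (secret_name, key):
        # Illustrative guard: these characters would make the reference ambiguous.
        if not part or any(ch in part for ch in ".{}"):
            raise ValueError(f"unsuitable name for a secret reference: {part!r}")
    return "{{" + secret_name + "." + key + "}}"


def lambda_orchestrator_command(
    orchestrator_name: str,
    secret_name: str = "lambda_api_key",
    key: str = "api_key",
) -> str:
    """Render the Lambda Labs orchestrator registration command."""
    reference = secret_reference(secret_name, key)
    return (
        f"zenml orchestrator register {orchestrator_name} "
        f"--flavor vm_lambda --api_key={reference}"
    )


print(lambda_orchestrator_command("skypilot-lambda"))
```

The raw API key never appears in the rendered command; only the secret name and key do, which is what makes the command safe to share or commit.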

Running a Pipeline

Once configured, you can run any ZenML pipeline using the SkyPilot VM Orchestrator. Each step will run in a Docker container on a provisioned VM.

Additional Configuration

You can further configure the orchestrator using cloud-specific Settings objects:

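from zenml import pipeline
from zenml.integrations.skypilot_<PROVIDER>.flavors.skypilot_orchestrator_<PROVIDER>_vm_flavor import Skypilot<PROVIDER>OrchestratorSettings

skypilot_settings = Skypilot<PROVIDER>OrchestratorSettings(
    cpus="2",               # number of vCPUs
    memory="16",            # memory in GiB
    accelerators="V100:2",  # accelerator type and count
    use_spot=True,          # request spot instances
    region=<REGION>,
    ...
)

# Attach the settings to the pipeline
@pipeline(
    settings={
        "orchestrator.vm_<PROVIDER>": skypilot_settings
    }
)
def my_pipeline():  # illustrative pipeline name
    ...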

This allows specifying VM size, spot usage, region, and more.
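
The settings snippet above uses `<PROVIDER>` in three places: the module path, the class name, and the settings key. The sketch below spells out what those resolve to for a given provider, following the template shown above. `settings_names` is a hypothetical helper, and the capitalisation of the class-name infix per provider is an assumption that should be checked against the installed integration.

```python
from typing import Dict

# Assumed casing of the class-name infix for each provider.
_CLASS_INFIX = {"aws": "AWS", "gcp": "GCP", "azure": "Azure", "lambda": "Lambda"}


def settings_names(provider: str) -> Dict[str, str]:
    """Resolve the <PROVIDER> placeholders used by the settings snippet."""
    key = provider.lower()
    if key not in _CLASS_INFIX:
        raise ValueError(f"unknown provider: {provider!r}")
    return {
        "module": (
            f"zenml.integrations.skypilot_{key}.flavors."
            f"skypilot_orchestrator_{key}_vm_flavor"
        ),
        "class_name": f"Skypilot{_CLASS_INFIX[key]}OrchestratorSettings",
        "settings_key": f"orchestrator.vm_{key}",
    }


names = settings_names("gcp")
print(f"from {names['module']} import {names['class_name']}")
print(names["settings_key"])
```

The settings key uses the same `vm_<PROVIDER>` suffix as the orchestrator flavor passed to `zenml orchestrator register`.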

You can also configure resources per step:

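from zenml import step

# Settings for a step that needs more resources than the rest of the pipeline
high_resource_settings = Skypilot<PROVIDER>OrchestratorSettings(...)

@step(settings={"orchestrator.vm_<PROVIDER>": high_resource_settings})
def resource_intensive_step():
    ...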

For more details and advanced options, see the full SkyPilot VM Orchestrator documentation.

