Execute Pipelines in the Cloud
Deploy pipelines to the public cloud.
Guide for cloud-specific deployments
Pre-requisites
Orchestrator
If one or more of the deployments are not in the `Running` state, try increasing the number of nodes in your cluster.
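You can check the status with kubectl (assuming Kubeflow Pipelines is installed in the default `kubeflow` namespace):

```bash
# All deployments should report their desired number of READY replicas
kubectl -n kubeflow get deployments
```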
If you're installing Kubeflow Pipelines manually, make sure the Kubernetes service is called exactly `ml-pipeline`. This is a requirement for ZenML to connect to your Kubeflow Pipelines deployment.
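A quick way to verify the service name (again assuming the `kubeflow` namespace):

```bash
# Fails with a NotFound error if the service is named differently
kubectl -n kubeflow get svc ml-pipeline
```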
Container Registry
The path value to register with ZenML should be in the format `ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com`.
Authenticate your local `docker` CLI with your ECR registry using the following command. Replace the capitalized words with your values.
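The following uses the AWS CLI v2 `get-login-password` flow:

```bash
aws ecr get-login-password --region REGION | \
    docker login --username AWS --password-stdin ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com
```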
Artifact Store
Make sure that your EKS cluster is authorized to access the S3 bucket. This can be done, for example, by attaching an IAM policy that grants S3 access to the IAM role used by your cluster's nodes, or by associating an IAM role with the Kubernetes service account (IRSA).
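As a sketch, the node-role approach with the AWS CLI; `EKS_NODE_ROLE_NAME` is a placeholder for your node group's IAM role, and the broad `AmazonS3FullAccess` managed policy is illustrative (a policy scoped to your bucket is preferable):

```bash
# Attach an S3 access policy to the IAM role of the EKS worker nodes
aws iam attach-role-policy \
    --role-name EKS_NODE_ROLE_NAME \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess
```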
The path for your bucket should be in the format `s3://your-bucket`.
Metadata Store
The metadata store requires a SQL-compatible database service to store information about pipelines, pipeline runs, steps, and artifacts. If you want to avoid using a managed cloud SQL database service, ZenML can instead reuse the gRPC metadata service installed and used internally by the Kubeflow Pipelines deployment. This is not recommended for production use.
To use the Kubeflow Pipelines metadata service, use a metadata store of flavor `kubeflow` in your stack configuration and ignore this section.
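For example, with the ZenML CLI (a sketch; `kubeflow_metadata_store` is an illustrative name, and the exact syntax may vary between ZenML versions):

```bash
# Register a metadata store that reuses the Kubeflow Pipelines
# gRPC metadata service, so no external database is needed
zenml metadata-store register kubeflow_metadata_store --flavor=kubeflow
```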
If you decide to use an SQL metadata store backed by a managed cloud SQL database service, you will also need a matching secrets manager to store the SSL credentials (i.e. certificates) required to connect to it.
Find the endpoint (DNS name) and TCP port number for your DB instance. You will need them to register with ZenML. The endpoint should be in the format `INSTANCE-NAME.ID.REGION.rds.amazonaws.com`. You may also have to explicitly reconfigure the VPC security group rules to allow access to the database from outside the cloud as well as from the EKS cluster (if they do not share the same VPC).
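You can look these up with the AWS CLI; `DB_INSTANCE_ID` is a placeholder for your RDS instance identifier:

```bash
# Prints the endpoint address and port of the instance
aws rds describe-db-instances --db-instance-identifier DB_INSTANCE_ID \
    --query 'DBInstances[0].Endpoint.[Address,Port]'
```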
Integrating with ZenML
To run our pipeline on Kubeflow Pipelines deployed to the cloud, we will create a new stack from the components you have just created. A sketch of the corresponding CLI commands follows the list below.
1. Install the cloud provider and the `kubeflow` plugin.
2. Register the metadata store component:
   - If you decided to use the Kubeflow metadata service, you can configure a `kubeflow` flavor metadata-store.
   - Otherwise, configure a `mysql` flavor metadata-store. You will also need a secrets manager component to store the MySQL credentials and a secret (named `mysql_secret` in the example) to be registered after the stack (step 5).
3. Register the other stack components and the stack itself.
4. Activate the newly created stack.
5. Create the secret for the `mysql` metadata store (not necessary if you used a `kubeflow` flavor metadata-store).
6. Do a pipeline run and check your Kubeflow UI to see it running there! 🚀
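A minimal sketch of these steps with the ZenML CLI, assuming a ZenML 0.x release in which metadata stores and the short `-m/-a/-o/-c` stack flags exist; all component and stack names are illustrative, and the exact flags may differ between versions:

```bash
# 1. Install the AWS and Kubeflow integrations
zenml integration install aws kubeflow

# 2. Register the metadata store (kubeflow flavor shown; for the mysql
#    flavor, also register a secrets manager component)
zenml metadata-store register kubeflow_metadata_store --flavor=kubeflow

# 3. Register the other stack components and the stack itself
zenml artifact-store register s3_store --flavor=s3 --path=$PATH_TO_YOUR_BUCKET
zenml container-registry register ecr_registry --flavor=default \
    --uri=$PATH_TO_YOUR_CONTAINER_REGISTRY
zenml orchestrator register kubeflow_orchestrator --flavor=kubeflow
zenml stack register kubeflow_cloud_stack \
    -m kubeflow_metadata_store \
    -a s3_store \
    -o kubeflow_orchestrator \
    -c ecr_registry

# 4. Activate the newly created stack
zenml stack set kubeflow_cloud_stack

# 5. (mysql flavor only) Register the mysql_secret -- see the sketch below

# 6. Run a pipeline (run.py is a placeholder for your own pipeline script)
python run.py
```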
The secrets manager is only required if you're using a metadata store backed by a managed cloud database service.
The user, password, and SSL certificates in the metadata store secret are the ones set up and downloaded during the creation of the managed SQL database service.
The `--ssl_cert` and `--ssl_key` parameters for the `mysql_secret` are only required if you set up client certificates for the MySQL service.
You can choose any name for your stack components apart from the ones used in the script above.
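For step 5, a sketch of registering the `mysql_secret`, assuming a ZenML 0.x release with a `mysql` secret schema and `@file` syntax for reading values from files; the paths and all flag names other than `--ssl_cert` and `--ssl_key` are assumptions:

```bash
# Credentials and certificates downloaded when creating the database service
zenml secret register mysql_secret --schema=mysql \
    --user=DB_USER --password=DB_PASSWORD \
    --ssl_ca=@/path/to/server-ca.pem \
    --ssl_cert=@/path/to/client-cert.pem \
    --ssl_key=@/path/to/client-key.pem
```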
Make sure to replace `$PATH_TO_YOUR_BUCKET` and `$PATH_TO_YOUR_CONTAINER_REGISTRY` with the actual URIs of your bucket and container registry.