How to set up a minimal stack on Google Cloud Platform (GCP)
This is an older version of the ZenML documentation. To read and view the latest version please visit this up-to-date URL.
GCP Setup Guide
To get started using ZenML on the cloud, you need some basic infrastructure up and running that you can then make more complicated depending on your use-case. This guide sets up the easiest MLOps stack that we can run on GCP with ZenML.
This guide represents one of many ways to create a cloud stack on GCP. Every component could be replaced by a different implementation. Feel free to take this as your starting point.
Prerequisites
For this to work you need to have ZenML installed locally with all GCP requirements.
pip install zenml
zenml integration install gcp
Additionally, you will need Docker installed on your system.
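Before going further it can help to confirm that the command-line tools used in this guide are actually on your PATH. The following is a minimal sketch, not part of ZenML; the helper name `missing_tools` is made up for illustration, and `gcloud` and `gsutil` will only be found once the gcloud CLI from the next step is installed.

```shell
# Print the name of every given command that is not found on PATH.
# (missing_tools is a hypothetical helper, not a ZenML or gcloud command.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tools this guide relies on.
missing=$(missing_tools zenml docker gcloud gsutil)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All required tools found."
fi
```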
The cloud stack
A full cloud stack will necessarily contain these five stack components:
An artifact store to save all step output artifacts. In this guide we will use a GCP bucket for this purpose.
A metadata store that keeps track of the relationships between artifacts, runs and parameters. In our case we will opt for a MySQL database on GCP Cloud SQL.
The orchestrator to run the pipelines. Here we will opt for a Vertex AI Pipelines orchestrator. This is a serverless, GCP-specific offering with minimal hassle.
A container registry for pushing and pulling the pipeline image.
Finally, a secrets manager to store passwords and SSL certificates.
Set Up gcloud CLI
Install the gcloud CLI on your machine. Here is a guide on how to install it.
gcloud auth login
Set up a GCP project (Optional)
As a first step it might make sense to create a separate GCP project for your ZenML resources. However, this step is completely optional, and you can also move forward within an existing project. If some resources already exist, feel free to skip their creation step and just note down the relevant information.
For simplicity, open a terminal on the side and save the relevant values as we go along. You will use these when we set up the ZenML stack. ZenML will use your project number at a later stage to connect to some resources, so let's save it. You'll most probably find it right here.
PROJECT_NUMBER=<PROJECT_NUMBER> # for example '492014921912'
GCP_LOCATION=<GCP_LOCATION> # for example 'europe-west3'
PARENT_ORG_ID=<PARENT_ORG_ID> # for example 3928562984638
PROJECT_NAME=<PROJECT_NAME> # for example zenml-vertex-prj
GCP_LOCATION=<GCP_LOCATION> # for example 'europe-west3'

gcloud projects create $PROJECT_NAME --organization=$PARENT_ORG_ID
gcloud config set project $PROJECT_NAME
PROJECT_NUMBER=$(gcloud projects describe $PROJECT_NAME --format="value(projectNumber)")
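Because the commands above pass PROJECT_NAME to gcloud as the project ID, it has to follow the GCP project ID rules: 6 to 30 characters, lowercase letters, digits and hyphens only, starting with a letter and not ending with a hyphen. A small sketch of a check you could run before creating the project; the function name `is_valid_project_id` is an illustrative assumption, not a gcloud feature.

```shell
# Return 0 if $1 looks like a valid GCP project ID, non-zero otherwise.
# Pattern: one lowercase letter, 4 to 28 letters/digits/hyphens,
# then a final letter or digit (6 to 30 characters in total).
is_valid_project_id() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}

if is_valid_project_id "zenml-vertex-prj"; then
    echo "zenml-vertex-prj is a usable project ID"
fi
```

This only checks the format. Project IDs must also be globally unique, which only `gcloud projects create` itself can tell you.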
Enable billing
Before moving on, you'll have to make sure you attach a billing account to your project. In case you do not have the permissions to do so, you'll have to ask an organization administrator.
Enable Vertex AI
Vertex AI Pipelines is at the heart of our GCP stack. As the orchestrator, Vertex AI will run your pipelines and use all the other stack components.
All you'll need to do at this stage is enable Vertex AI here.
gcloud services enable aiplatform.googleapis.com
Enable Secrets Manager
The Secrets Manager will be needed so that the orchestrator will have secure access to the other resources.
Here is where you'll be able to enable the secrets manager.
gcloud services enable secretmanager.googleapis.com
Enable Container Registry
The Vertex AI orchestrator runs your pipelines from Docker images that contain your pipeline code.
For this to work you'll need to enable the GCP Docker registry here.
To use the container registry at a later point you will need to set the container registry URI. This is how it is usually constructed: gcr.io/<PROJECT_ID>.
The container registry has four host options: gcr.io, us.gcr.io, eu.gcr.io, or asia.gcr.io. Choose the one appropriate for you.
CONTAINER_REGISTRY_URI=<CONTAINER_REGISTRY_URI> # for example 'eu.gcr.io/zenml-project'
CONTAINER_REGISTRY_REGION=<CONTAINER_REGISTRY_REGION> # can be 'eu', 'us', 'asia'

gcloud services enable containerregistry.googleapis.com
CONTAINER_REGISTRY_URI=$CONTAINER_REGISTRY_REGION".gcr.io/"$PROJECT_NAME
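The URI construction described above can be wrapped in a small helper that also rejects a mistyped region. This is only a sketch: `registry_uri` is a hypothetical function name, and treating an empty region as the plain gcr.io host is an assumption made here for convenience.

```shell
# Build a Container Registry URI from a region prefix and a project ID.
# Usage: registry_uri <region> <project_id>   (region: "", us, eu or asia)
registry_uri() {
    region=$1
    project=$2
    case $region in
        "")          host="gcr.io" ;;
        us|eu|asia)  host="$region.gcr.io" ;;
        *)           echo "unknown registry region: $region" >&2; return 1 ;;
    esac
    printf '%s/%s\n' "$host" "$project"
}

# Example with the sample values used in this guide.
registry_uri eu zenml-project
```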
Set up Cloud Storage as Artifact Store
Storing step artifacts is an important part of reproducible MLOps. Create a Cloud Storage bucket to hold them.
Within the configuration of the newly created bucket you can find the gsutil URI which you will need at a later point. It's usually going to look like this: gs://<bucket-name>
GSUTIL_URI=<GSUTIL_URI> # for example 'gs://zenml_vertex_storage'
Set up Cloud SQL as Metadata Store
One of the most complex resources that you'll need to manage is the MySQL database.
To start, we create a MySQL database. Once created, it will take some time for the database to be set up.
Once it is set up you can find the IP address on the instance overview. The password you set during creation of the instance is the root password. The default port for MySQL is 3306.
DB_HOST=<DB_HOST> # for example '35.137.24.15'
DB_PWD=<DB_PWD> # for example 'secure_root_pwd'
Time to set up the connections to our database. To do this you'll need to go into the Connections menu. Under the Networking tab you'll need to add 0.0.0.0/0 to the authorized networks, which allows incoming traffic from any address. (Feel free to restrict this to your own outgoing IP address.)
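If you prefer to restrict access to your own address instead of 0.0.0.0/0, the authorized network has to be given in CIDR notation, which for a single IPv4 host means appending /32. A minimal sketch follows; the helper `to_single_host_cidr` and the variable `MY_IP` are hypothetical names introduced for illustration.

```shell
# Print "<ip>/32" for a dotted-quad IPv4 address, fail for anything else.
to_single_host_cidr() {
    ip=$1
    # Four dot-separated groups of one to three digits.
    printf '%s\n' "$ip" | LC_ALL=C grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
    # Each group must be at most 255.
    old_ifs=$IFS; IFS=.
    set -- $ip
    IFS=$old_ifs
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    printf '%s/32\n' "$ip"
}

# Example with the sample address from this guide.
to_single_host_cidr 35.137.24.15

# Possible use with an existing instance (requires gcloud, MY_IP is assumed):
# gcloud sql instances patch $DB_INSTANCE \
#     --authorized-networks="$(to_single_host_cidr "$MY_IP")"
```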
For security reasons, it is also recommended to configure your database to only accept SSL connections. You'll find the relevant setting in the Security tab. Select SSL Connections only in order to encrypt all traffic with your database.
Now click Create Client Certificate and download all three files. Save the paths to these three files as follows.
SSL_CA=<SSL_CA> # for example /home/zen/Downloads/server-ca.pem
SSL_CERT=<SSL_CERT> # for example /home/zen/Downloads/client-cert.pem
SSL_KEY=<SSL_KEY> # for example /home/zen/Downloads/client-key.pem
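When these three variables are later passed to the secrets manager, note the @ sign in front of them. The @ sign tells the secrets manager that the values are file paths and that the secret content should be loaded from those files.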
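Since the @ prefix makes the secrets manager read the file at that path, a missing or unreadable certificate file will only surface when the secret is registered. A small sketch that checks the file first and then produces the @-prefixed value; `secret_file_arg` is a hypothetical helper, not part of the ZenML CLI.

```shell
# Print "@<path>" if the file is readable, otherwise complain and fail.
secret_file_arg() {
    path=$1
    if [ ! -r "$path" ]; then
        echo "cannot read certificate file: $path" >&2
        return 1
    fi
    printf '@%s\n' "$path"
}

# Possible use once SSL_CA, SSL_CERT and SSL_KEY are set:
# secret_file_arg "$SSL_CA" && secret_file_arg "$SSL_CERT" && secret_file_arg "$SSL_KEY"
```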
Finally, head on over to the Databases submenu and create your database and save its name.
DB_NAME=<DB_NAME> # for example zenml_db
We have set some sensible defaults here. Feel free to replace these with names of your own.
DB_INSTANCE=zenml-inst
DB_NAME=zenml_metadata_store_db # make sure this contains no '-'
CERT_NAME=zenml-cert
CLIENT_KEY_PATH=$PROJECT_NAME"client-key.pem"
CLIENT_CERT_PATH=$PROJECT_NAME"client-cert.pem"
SERVER_CERT_PATH=$PROJECT_NAME"server-ca.pem"
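The comment above asks for a database name without '-'. If you derive the name from something that may contain hyphens, such as the project name, a sketch like the following can normalize and check it first. Both function names are hypothetical, and replacing hyphens with underscores is just one possible convention.

```shell
# Replace every '-' in a candidate database name with '_'.
sanitize_db_name() {
    printf '%s\n' "$1" | sed 's/-/_/g'
}

# Fail if the name is empty or still contains a '-'.
check_db_name() {
    case $1 in
        "")  echo "database name is empty" >&2; return 1 ;;
        *-*) echo "database name must not contain '-': $1" >&2; return 1 ;;
    esac
}

# Example: derive a safe name from a hyphenated one.
DB_NAME_CANDIDATE=$(sanitize_db_name "zenml-metadata-store-db")
check_db_name "$DB_NAME_CANDIDATE" && echo "$DB_NAME_CANDIDATE"
```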
# Enable the sql api for database creation
gcloud services enable sqladmin.googleapis.com

# Create the db instance
gcloud sql instances create $DB_INSTANCE --tier=db-f1-micro \
    --region=$GCP_LOCATION --authorized-networks 0.0.0.0/0
Make sure the instance is fully set up before continuing.
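One way to check that the instance is ready is to poll its state until it reports RUNNABLE. The sketch below keeps the polling logic separate from gcloud so it can be tried without a project; `wait_until_runnable` is a hypothetical helper, and the attempt count and sleep interval in the example call are arbitrary assumptions.

```shell
# Poll a state-printing command until it prints RUNNABLE.
# Usage: wait_until_runnable <max_attempts> <sleep_seconds> <command> [args...]
wait_until_runnable() {
    max_attempts=$1
    sleep_seconds=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        state=$("$@" 2>/dev/null)
        if [ "$state" = "RUNNABLE" ]; then
            return 0
        fi
        echo "attempt $attempt: instance state is '${state:-unknown}', waiting..." >&2
        sleep "$sleep_seconds"
        attempt=$((attempt + 1))
    done
    return 1
}

# Example call against the instance created above (requires gcloud):
# wait_until_runnable 60 10 \
#     gcloud sql instances describe "$DB_INSTANCE" --format='value(state)'
```

The function returns a non-zero status if the instance never reports RUNNABLE within the given number of attempts, so it can guard the commands that follow.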
DB_HOST=$(gcloud sql instances describe $DB_INSTANCE --format='get(ipAddresses[0].ipAddress)')
gcloud sql users set-password root --host=% --instance $DB_INSTANCE --password $DB_PASSWORD

# Create Client certificate and download all three files
gcloud sql instances patch $DB_INSTANCE --require-ssl
Set up a Service Account
All the resources are created. Now we need to make sure that the service account running the Vertex AI workloads has the relevant permissions to access the other resources. For this you'll need to go here to create a service account. Give it a relevant name and grant it the following roles:
Vertex AI Custom Code Service Agent
Vertex AI Service Agent
Container Registry Service Agent
Secret Manager Admin
Also give your user access to the service account. This is the service account that will be used by the Vertex AI compute engine.
SERVICE_ACCOUNT=<SERVICE_ACCOUNT> # for example zenml-vertex-sa@zenml-project.iam.gserviceaccount.com
SERVICE_ACCOUNT_ID=zenml-vertex-sa
USER_EMAIL=<USER_EMAIL> # for example user@zenml.io
SERVICE_ACCOUNT=${SERVICE_ACCOUNT_ID}"@"${PROJECT_NAME}".iam.gserviceaccount.com"
gcloud iam service-accounts create $SERVICE_ACCOUNT_ID \
    --display-name="zenml-vertex-sa" \
    --description="Service account for running Vertex Ai workflows from ZenML."

gcloud projects add-iam-policy-binding ${PROJECT_NAME} \
    --role="roles/aiplatform.customCodeServiceAgent" \
    --member="serviceAccount:"${SERVICE_ACCOUNT}
gcloud projects add-iam-policy-binding ${PROJECT_NAME} \
    --role="roles/aiplatform.serviceAgent" \
    --member="serviceAccount:"${SERVICE_ACCOUNT}
gcloud projects add-iam-policy-binding ${PROJECT_NAME} \
    --role="roles/containerregistry.ServiceAgent" \
    --member="serviceAccount:"${SERVICE_ACCOUNT}
gcloud projects add-iam-policy-binding ${PROJECT_NAME} \
    --role="roles/secretmanager.admin" \
    --member="serviceAccount:"${SERVICE_ACCOUNT}

gcloud iam service-accounts add-iam-policy-binding $SERVICE_ACCOUNT \
    --member="user:"${USER_EMAIL} --role="roles/iam.serviceAccountUser"
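The four project-level bindings above differ only in the role, so they can also be expressed as a loop. The following sketch prints the commands by default so you can review them first; `bind_roles`, `ROLES` and the `DRY_RUN` switch are names invented for this example, while the role IDs are the ones used above.

```shell
# Roles granted to the service account in this guide.
ROLES="roles/aiplatform.customCodeServiceAgent
roles/aiplatform.serviceAgent
roles/containerregistry.ServiceAgent
roles/secretmanager.admin"

# Print (DRY_RUN=1, the default) or run one binding command per role.
# Usage: bind_roles <project> <service_account_email>
bind_roles() {
    project=$1
    account=$2
    for role in $ROLES; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "gcloud projects add-iam-policy-binding $project --role=$role --member=serviceAccount:$account"
        else
            gcloud projects add-iam-policy-binding "$project" \
                --role="$role" --member="serviceAccount:$account"
        fi
    done
}

# Review the commands for the sample values used in this guide.
bind_roles zenml-project zenml-vertex-sa@zenml-project.iam.gserviceaccount.com
```

Setting DRY_RUN=0 before calling the function runs the bindings for real, which requires gcloud and sufficient IAM permissions.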
ZenML Stack
Everything on the GCP side is now set up, so you're ready to set up the ZenML stack components.
Copy-paste this into your terminal and press enter.
Your first run might fail, because one of the service accounts is only created when Vertex AI is run for the first time. This service account will need to be given the appropriate rights after the first run fails.
At the end of the logs you should see a link to the Vertex AI dashboard.
On the IAM page you will need to give permissions to the service account of the custom code workers.
For this, head over to your IAM configurations, click on Include Google-provided role grants on the top right and find the <project_number>@gcp-sa-aiplatform-cc.iam.gserviceaccount.com service account.
Now give this one the Container Registry Service Agent role on top of its existing role.
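The same grant can presumably be made from the command line once the account exists. The sketch below only builds the account address and shows a possible binding command in a comment. It assumes that the address carries a `service-` prefix in front of the project number, as Google-managed service agents usually do; copy the exact address from the IAM page if it differs. `custom_code_agent` is a hypothetical helper name.

```shell
# Build the address of the Vertex AI custom code service agent.
# ASSUMPTION: the address has the form service-<project_number>@...
custom_code_agent() {
    case $1 in
        ""|*[!0-9]*) echo "project number must be numeric: $1" >&2; return 1 ;;
    esac
    printf 'service-%s@gcp-sa-aiplatform-cc.iam.gserviceaccount.com\n' "$1"
}

# Example with the sample project number from this guide.
custom_code_agent 492014921912

# Possible binding once the account exists (requires gcloud):
# gcloud projects add-iam-policy-binding $PROJECT_NAME \
#     --role="roles/containerregistry.ServiceAgent" \
#     --member="serviceAccount:$(custom_code_agent "$PROJECT_NUMBER")"
```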
In this guide you have set up and used a stack on GCP using the Vertex AI orchestrator. For more guides on different cloud set-ups, check out the Kubeflow and Kubernetes orchestrators and find out whether they are a better fit for you.
One Shot Setup
Quick setup commands
Set these parameters:
USER_EMAIL=<USER_EMAIL> # for example user@zenml.io
PARENT_ORG_ID=<PARENT_ORG_ID> # for example 294710374920
BILLING_ACC=<BILLING_ACC> # for example 20CIW7-183916-8GA18Z
PROJECT_NAME=<PROJECT_NAME>
CONTAINER_REGISTRY_REGION=eu # can be 'eu', 'us', 'asia'
GCP_LOCATION=<GCP_LOCATION> # for example europe-west3
DB_PASSWORD=<DB_PASSWORD> # for example auk(/194
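Because the setup commands depend on every one of these parameters, it can be worth checking that none of them is empty or still a `<PLACEHOLDER>` before running anything. A minimal sketch; the helper name `unset_vars` is made up for this example.

```shell
# Print the names of variables that are unset, empty, or still a <PLACEHOLDER>.
unset_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        case $value in
            ""|"<"*">") printf '%s\n' "$name" ;;
        esac
    done
}

# Check the parameters listed above.
missing=$(unset_vars USER_EMAIL PARENT_ORG_ID BILLING_ACC PROJECT_NAME \
    CONTAINER_REGISTRY_REGION GCP_LOCATION DB_PASSWORD)
if [ -n "$missing" ]; then
    echo "Set these before running the setup:" $missing
fi
```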
And run this (make sure all the commands worked):
# Create a project and attach it to the specified billing account
gcloud projects create $PROJECT_NAME --organization=$PARENT_ORG_ID
gcloud config set project $PROJECT_NAME
gcloud beta billing projects link $PROJECT_NAME --billing-account $BILLING_ACC
PROJECT_NUMBER=$(gcloud projects describe $PROJECT_NAME --format="value(projectNumber)")

# Enable the three APIs for vertex ai, the container registry and the secrets manager
gcloud services enable aiplatform.googleapis.com
gcloud services enable secretmanager.googleapis.com
gcloud services enable containerregistry.googleapis.com
CONTAINER_REGISTRY_URI=$CONTAINER_REGISTRY_REGION".gcr.io/"$PROJECT_NAME

# Create a storage bucket
GSUTIL_URI=gs://${PROJECT_NAME}-bucket
gsutil mb -p $PROJECT_NAME $GSUTIL_URI

# Enable the sql api for database creation
gcloud services enable sqladmin.googleapis.com

# Create the db instance
DB_INSTANCE=zenml-inst
gcloud sql instances create $DB_INSTANCE --tier=db-f1-micro --region=$GCP_LOCATION --authorized-networks 0.0.0.0/0

DB_HOST=$(gcloud sql instances describe $DB_INSTANCE --format='get(ipAddresses[0].ipAddress)')
gcloud sql users set-password root --host=% --instance $DB_INSTANCE --password $DB_PASSWORD

# Create Client certificate and download all three files
CERT_NAME=zenml-cert
CLIENT_KEY_PATH=$PROJECT_NAME"client-key.pem"
CLIENT_CERT_PATH=$PROJECT_NAME"client-cert.pem"
SERVER_CERT_PATH=$PROJECT_NAME"server-ca.pem"

gcloud sql instances patch $DB_INSTANCE --require-ssl
gcloud sql ssl client-certs create $CERT_NAME $CLIENT_KEY_PATH --instance $DB_INSTANCE
gcloud sql ssl client-certs describe $CERT_NAME --instance=$DB_INSTANCE --format="value(cert)" > $CLIENT_CERT_PATH
gcloud sql instances describe $DB_INSTANCE --format="value(serverCaCert.cert)" > $SERVER_CERT_PATH

DB_NAME=zenml_metadata_store_db # make sure this contains no '-' as this will fail
gcloud sql databases create $DB_NAME --instance=$DB_INSTANCE --collation=utf8_general_ci --charset=utf8

# Configure the service accounts
SERVICE_ACCOUNT_ID=zenml-vertex-sa
SERVICE_ACCOUNT=${SERVICE_ACCOUNT_ID}"@"${PROJECT_NAME}".iam.gserviceaccount.com"

gcloud iam service-accounts create $SERVICE_ACCOUNT_ID --display-name="zenml-vertex-sa" \
    --description="Service account for running Vertex Ai workflows from ZenML."
gcloud projects add-iam-policy-binding ${PROJECT_NAME} --role="roles/aiplatform.customCodeServiceAgent" \
    --member="serviceAccount:"${SERVICE_ACCOUNT}
gcloud projects add-iam-policy-binding ${PROJECT_NAME} --role="roles/aiplatform.serviceAgent" \
    --member="serviceAccount:"${SERVICE_ACCOUNT}
gcloud projects add-iam-policy-binding ${PROJECT_NAME} --role="roles/containerregistry.ServiceAgent" \
    --member="serviceAccount:"${SERVICE_ACCOUNT}
gcloud projects add-iam-policy-binding ${PROJECT_NAME} --role="roles/secretmanager.admin" \
    --member="serviceAccount:"${SERVICE_ACCOUNT}
gcloud iam service-accounts add-iam-policy-binding $SERVICE_ACCOUNT \
    --member="user:"${USER_EMAIL} --role="roles/iam.serviceAccountUser"

# Names of the ZenML stack components
ORCHESTRATOR_NAME=$PROJECT_NAME"-gcp_vo"
ARTIFACT_STORE_NAME=$PROJECT_NAME"-gcp_as"
METADATA_STORE_NAME=$PROJECT_NAME"-gcp_ms"
CONTAINER_REGISTRY_NAME=$PROJECT_NAME"-gcp_cr"
SECRET_MANAGER_NAME=$PROJECT_NAME"-gcp_sm"
STACK_NAME=$PROJECT_NAME"-gcp_stack"

# Delete any previously registered components with the same names
zenml orchestrator delete $ORCHESTRATOR_NAME
zenml artifact-store delete $ARTIFACT_STORE_NAME
zenml metadata-store delete $METADATA_STORE_NAME
zenml container-registry delete $CONTAINER_REGISTRY_NAME
zenml secrets-manager delete $SECRET_MANAGER_NAME

# Register the stack components, the stack and the MySQL secret
zenml orchestrator register $ORCHESTRATOR_NAME --flavor=vertex \
    --project=$PROJECT_NUMBER --location=$GCP_LOCATION \
    --workload_service_account=$SERVICE_ACCOUNT
zenml container-registry register $CONTAINER_REGISTRY_NAME --flavor=gcp \
    --uri=$CONTAINER_REGISTRY_URI
zenml secrets-manager register $SECRET_MANAGER_NAME \
    --flavor=gcp_secrets_manager --project_id=$PROJECT_NUMBER
zenml artifact-store register $ARTIFACT_STORE_NAME --flavor=gcp \
    --path=$GSUTIL_URI
zenml metadata-store register $METADATA_STORE_NAME --flavor=mysql \
    --host=$DB_HOST --port=3306 --database=$DB_NAME \
    --secret=mysql_secret
zenml stack register $STACK_NAME -o $ORCHESTRATOR_NAME \
    -c $CONTAINER_REGISTRY_NAME -x $SECRET_MANAGER_NAME \
    -a $ARTIFACT_STORE_NAME -m $METADATA_STORE_NAME --set
zenml secrets-manager secret register mysql_secret --schema=mysql \
    --user=root --password=$DB_PASSWORD \
    --ssl_ca="@"$SERVER_CERT_PATH --ssl_cert="@"$CLIENT_CERT_PATH --ssl_key="@"$CLIENT_KEY_PATH