Configuring GCP Service Connectors to connect ZenML to GCP resources such as GCS buckets, GKE Kubernetes clusters, and GCR container registries.
The ZenML GCP Service Connector handles authentication and access to managed GCP services and resources, including GCS buckets, GAR and GCR container repositories, and GKE clusters. It supports several authentication methods: GCP user accounts, service accounts, short-lived OAuth 2.0 tokens, and implicit authentication.
This connector serves as a general means of accessing any GCP service by issuing OAuth 2.0 credential objects to clients. Additionally, the connector can handle specialized authentication for GCS, Docker, and Kubernetes Python clients. It also allows for the configuration of local Docker and Kubernetes CLIs.
The GCP Service Connector is part of the GCP ZenML integration. You can either install the entire integration or use a PyPI extra to install it independently of the integration:
pip install "zenml[connectors-gcp]" installs only prerequisites for the GCP Service Connector Type
zenml integration install gcp installs the entire GCP ZenML integration
It is not required to install and set up the GCP CLI on your local machine to use the GCP Service Connector to link Stack Components to GCP resources and services. However, it is recommended to do so if you are looking for a quick setup that includes using the auto-configuration Service Connector features.
The auto-configuration examples in this page rely on the GCP CLI being installed and already configured with valid credentials of one type or another. If you want to avoid installing the GCP CLI, we recommend using the interactive mode of the ZenML CLI to register Service Connectors:
zenml service-connector register -i --type gcp
Resource Types
Generic GCP resource
This resource type allows Stack Components to use the GCP Service Connector to connect to any GCP service or resource. Stack Components that use it are provided with a Python google-auth credentials object populated with a GCP OAuth 2.0 token. This credentials object can then be used to create GCP Python clients for any particular GCP service.
This generic GCP resource type is meant to be used with Stack Components that are not represented by one of the other, more specific resource types like GCS buckets, Kubernetes clusters, or Docker registries. For example, it can be used with the Google Cloud Image Builder stack component, or the Vertex AI Orchestrator and Step Operator. It should be accompanied by a matching set of GCP permissions that allow access to the set of remote resources required by the client and Stack Component (see the documentation of each Stack Component for more details).
The resource name represents the GCP project that the connector is authorized to access.
GCS bucket
Allows Stack Components to connect to GCS buckets. Stack Components that use it are provided with a pre-configured GCS Python client instance.
The configured credentials must have at least the following GCP permissions on the GCS buckets that they can access:
storage.buckets.list
storage.buckets.get
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list
storage.objects.update
For example, the GCP Storage Admin role includes all of the required permissions, but it also includes additional permissions that are not required by the connector.
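As a rough illustration of the permission requirement above, the following sketch compares a set of granted permissions against the seven listed ones and reports what is missing. The helper name `missing_gcs_permissions` is hypothetical and not part of ZenML. In practice the granted set might come from an IAM `testIamPermissions` call, which this sketch does not perform.

```python
from typing import Iterable, List

# The permissions listed above as the minimum for the GCS bucket resource type.
REQUIRED_GCS_PERMISSIONS = frozenset(
    {
        "storage.buckets.list",
        "storage.buckets.get",
        "storage.objects.create",
        "storage.objects.delete",
        "storage.objects.get",
        "storage.objects.list",
        "storage.objects.update",
    }
)


def missing_gcs_permissions(granted: Iterable[str]) -> List[str]:
    """Return the required permissions absent from `granted`, sorted by name.

    Hypothetical helper for illustration only; it is not a ZenML API.
    """
    return sorted(REQUIRED_GCS_PERMISSIONS - set(granted))
```

An empty result means the granted set covers the minimum. Extra permissions, such as those that come with the Storage Admin role, are simply ignored by this check.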
If set, the resource name must identify a GCS bucket using one of the following formats:
GCS bucket URI (canonical resource name): gs://{bucket-name}
GCS bucket name: {bucket-name}
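The two accepted formats can be reduced to the canonical one with a few lines of Python. This is a minimal sketch under the assumption that a resource name refers to a bucket only, not to an object path inside it. The function name `canonical_gcs_resource_name` is made up for this example, and the connector's own validation may differ.

```python
def canonical_gcs_resource_name(resource_name: str) -> str:
    """Normalize `{bucket-name}` or `gs://{bucket-name}` to `gs://{bucket-name}`.

    Hypothetical helper for illustration; not part of the ZenML API.
    """
    name = resource_name.strip()
    if name.startswith("gs://"):
        name = name[len("gs://"):]
    # Tolerate a trailing slash, but reject empty names and object paths.
    name = name.rstrip("/")
    if not name or "/" in name:
        raise ValueError(f"not a GCS bucket name or URI: {resource_name!r}")
    return f"gs://{name}"
```

Both `zenml-bucket-sl` and `gs://zenml-bucket-sl` map to `gs://zenml-bucket-sl`, which is the form shown in the connector's resource listings on this page.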
GKE Kubernetes cluster
Allows Stack Components to access a GKE cluster as a standard Kubernetes cluster resource. Stack Components that use it are provided with a pre-authenticated Python Kubernetes client instance.
The configured credentials must have at least the following GCP permissions on the GKE clusters that they can access:
container.clusters.list
container.clusters.get
In addition to the above permissions, the credentials should include permissions to connect to and use the GKE cluster (i.e. some or all permissions in the Kubernetes Engine Developer role).
If set, the resource name must identify a GKE cluster using one of the following formats:
GKE cluster name: {cluster-name}
GKE cluster names are project scoped. The connector can only be used to access GKE clusters in the GCP project that it is configured to use.
GAR container registry (including legacy GCR support)
Important Notice: Google Container Registry is being replaced by Artifact Registry. Please start using Artifact Registry for your containers. As per Google's documentation, "after May 15, 2024, Artifact Registry will host images for the gcr.io domain in Google Cloud projects without previous Container Registry usage. After March 18, 2025, Container Registry will be shut down."
Support for legacy GCR registries is still included in the GCP service connector. Users that already have GCP service connectors configured to access GCR registries may continue to use them without taking any action. However, it is recommended to transition to Google Artifact Registries as soon as possible by following the GCP guide on this subject and making the following updates to ZenML GCP Service Connectors that are used to access GCR resources:
add the IAM permissions documented here to the GCP Service Connector credentials to enable them to access the Artifact Registries.
users may keep the gcr.io GCR URLs already configured in the GCP Service Connectors as well as those used in linked Container Registry stack components given that these domains are redirected by Google to GAR as covered in the GCR transition guide. Alternatively, users may update the GCP Service Connector configuration and/or the Container Registry stack components to use the replacement Artifact Registry URLs.
The GCP Service Connector will list the legacy GCR registries as accessible for a GCP project even if the GCP Service Connector credentials do not grant access to GCR registries. This is required for backwards-compatibility and will be removed in a future release.
Allows Stack Components to access a Google Artifact Registry as a standard Docker registry resource. Stack Components that use it are provided with a pre-authenticated Python Docker client instance.
The configured credentials must have at least the following GCP permissions:
The Artifact Registry Create-on-Push Writer role includes all of the above permissions.
This resource type also includes legacy GCR container registry support. When used with GCR registries, the configured credentials must have at least the following GCP permissions:
storage.buckets.get
storage.multipartUploads.abort
storage.multipartUploads.create
storage.multipartUploads.list
storage.multipartUploads.listParts
storage.objects.create
storage.objects.delete
storage.objects.list
The Storage Legacy Bucket Writer role includes all of the above permissions while at the same time restricting access to only the GCR buckets.
If set, the resource name must identify a GAR or GCR registry using one of the following formats:
Google Artifact Registry repository URI: [https://]<region>-docker.pkg.dev/<project-id>/<registry-id>[/<repository-name>]
Google Artifact Registry name: projects/<project-id>/locations/<location>/repositories/<repository-id>
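To make the two Artifact Registry formats above concrete, here is a small parsing sketch. It treats the `<region>` and `<location>` placeholders as the same value and the `<registry-id>` and `<repository-id>` placeholders as the same value, which is an assumption of this example. The names `GarRepository` and `parse_gar_resource_name` are hypothetical, and the regular expressions are deliberately loose rather than a faithful copy of the connector's validation.

```python
import re
from typing import NamedTuple


class GarRepository(NamedTuple):
    project_id: str
    location: str
    repository_id: str


# [https://]<region>-docker.pkg.dev/<project-id>/<registry-id>[/<repository-name>]
_GAR_URI = re.compile(
    r"^(?:https://)?(?P<location>[a-z0-9-]+)-docker\.pkg\.dev/"
    r"(?P<project>[^/]+)/(?P<repo>[^/]+)(?:/.*)?$"
)
# projects/<project-id>/locations/<location>/repositories/<repository-id>
_GAR_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)/"
    r"repositories/(?P<repo>[^/]+)$"
)


def parse_gar_resource_name(resource_name: str) -> GarRepository:
    """Split either accepted GAR format into project, location and repository."""
    for pattern in (_GAR_URI, _GAR_NAME):
        match = pattern.match(resource_name)
        if match:
            return GarRepository(
                match["project"], match["location"], match["repo"]
            )
    raise ValueError(f"not a GAR resource name: {resource_name!r}")
```

With this sketch, `europe-west1-docker.pkg.dev/zenml-core/test` and `projects/zenml-core/locations/europe-west1/repositories/test` resolve to the same repository.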
This method may constitute a security risk, because it can give users access to the same cloud resources and services that the ZenML Server itself is configured to access. For this reason, all implicit authentication methods are disabled by default and need to be explicitly enabled by setting the ZENML_ENABLE_IMPLICIT_AUTH_METHODS environment variable or the helm chart enableImplicitAuthMethods configuration option to true in the ZenML deployment.
This authentication method doesn't require any credentials to be explicitly configured. It automatically discovers and uses credentials from one of the following sources:
local ADC credential files set up by running gcloud auth application-default login (e.g. ~/.config/gcloud/application_default_credentials.json).
a GCP service account attached to the resource where the ZenML server is running. Only works when running the ZenML server on a GCP resource with a service account attached to it or when using Workload Identity (e.g. GKE cluster).
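The local part of this discovery can be sketched as follows, assuming the standard Application Default Credentials search order: the `GOOGLE_APPLICATION_CREDENTIALS` environment variable first, then the gcloud well-known file. The function `locate_local_adc_file` is hypothetical and only illustrates the lookup. The path shown is the Linux/macOS location quoted above, and the final fallback to an attached service account is represented by returning `None`.

```python
from pathlib import Path
from typing import Mapping, Optional

# Location of the gcloud ADC file relative to the home directory (Linux/macOS).
WELL_KNOWN_ADC_FILE = (
    Path(".config") / "gcloud" / "application_default_credentials.json"
)


def locate_local_adc_file(env: Mapping[str, str], home: Path) -> Optional[Path]:
    """Return the credentials file that would be picked up locally, if any.

    Illustrative only: real discovery is performed by the google-auth library.
    """
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        # An explicitly configured key file takes precedence.
        return Path(explicit)
    candidate = home / WELL_KNOWN_ADC_FILE
    if candidate.is_file():
        return candidate
    # Nothing local: only an attached service account could still apply.
    return None
```

Calling `locate_local_adc_file(os.environ, Path.home())` (after `import os`) gives a quick indication of which local credentials, if any, an implicitly authenticated connector would end up using on a given machine.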
This is the quickest and easiest way to authenticate to GCP services. However, the results depend on how ZenML is deployed and on the environment where it is used, and are thus not fully reproducible:
when used with the default local ZenML deployment or a local ZenML server, the credentials are those set up on your machine (i.e. by running gcloud auth application-default login or setting the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to a service account key JSON file).
when connected to a ZenML server, this method only works if the ZenML server is deployed in GCP and will use the service account attached to the GCP resource where the ZenML server is running (e.g. a GKE cluster). The service account permissions may need to be adjusted to allow listing and accessing/describing the GCP resources that the connector is configured to access.
Note that the discovered credentials inherit the full set of permissions of the local GCP CLI credentials or service account attached to the ZenML server GCP workload. Depending on the extent of those permissions, this authentication method might not be suitable for production use, as it can lead to accidental privilege escalation. Instead, it is recommended to use the Service Account Key or Service Account Impersonation authentication methods to restrict the permissions that are granted to the connector clients.
A GCP project is required and the connector may only be used to access GCP resources in the specified project. When used remotely in a GCP workload, the configured project has to be the same as the project of the attached service account.
Example configuration
The following assumes the local GCP CLI has already been configured with user account credentials by running the gcloud auth application-default login command:
This method requires GCP user account credentials like those generated by the gcloud auth application-default login command.
By default, the GCP connector generates temporary OAuth 2.0 tokens from the user account credentials and distributes them to clients. The tokens have a limited lifetime of 1 hour. This behavior can be disabled by setting the generate_temporary_tokens configuration option to False, in which case, the connector will distribute the user account credentials JSON to clients instead (not recommended).
This method is preferred during development and testing due to its simplicity and ease of use. It is not recommended as a direct authentication method for production use cases because the clients are granted the full set of permissions of the GCP user account. For production, it is recommended to use the GCP Service Account or GCP Service Account Impersonation authentication methods.
A GCP project is required and the connector may only be used to access GCP resources in the specified project.
If you already have the local GCP CLI set up with these credentials, they will be automatically picked up when auto-configuration is used (see the example below).
Example auto-configuration
The following assumes the local GCP CLI has been configured with GCP user account credentials by running the gcloud auth application-default login command:
By default, the GCP connector generates temporary OAuth 2.0 tokens from the service account credentials and distributes them to clients. The tokens have a limited lifetime of 1 hour. This behavior can be disabled by setting the generate_temporary_tokens configuration option to False, in which case, the connector will distribute the service account credentials JSON to clients instead (not recommended).
A GCP project is required and the connector may only be used to access GCP resources in the specified project.
If you already have the GOOGLE_APPLICATION_CREDENTIALS environment variable configured to point to a service account key JSON file, it will be automatically picked up when auto-configuration is used.
Example configuration
The following assumes a GCP service account was created, granted permissions to access GCS buckets in the target project and a service account key JSON was generated and saved locally in the connectors-devel@zenml-core.json file:
The connector needs to be configured with the email address of the target GCP service account to be impersonated, accompanied by a GCP service account key JSON for the primary service account. The primary service account must have permission to generate tokens for the target service account (i.e. the Service Account Token Creator role). The connector will generate temporary OAuth 2.0 tokens upon request by using GCP direct service account impersonation. The tokens have a configurable limited lifetime of up to 1 hour.
A GCP project is required and the connector may only be used to access GCP resources in the specified project.
If you already have the GOOGLE_APPLICATION_CREDENTIALS environment variable configured to point to the primary service account key JSON file, it will be automatically picked up when auto-configuration is used.
Configuration example
For this example, we have the following set up in GCP:
a primary empty-connectors@zenml-core.iam.gserviceaccount.com GCP service account with no permissions whatsoever aside from the "Service Account Token Creator" role that allows it to impersonate the secondary service account below. We also generate a service account key for this account.
a secondary zenml-bucket-sl@zenml-core.iam.gserviceaccount.com GCP service account that only has permission to access the zenml-bucket-sl GCS bucket
First, let's show that the empty-connectors service account has no permission to access any GCS buckets or any other resources for that matter. We'll register a regular GCP Service Connector that uses the service account key (long-lived credentials) directly:
Error: Service connector 'gcp-empty-sa' verification failed: connector authorization failure: Failed to list GKE clusters:
403 Required "container.clusters.list" permission(s) for "projects/20219041791".
Error: Service connector 'gcp-empty-sa' verification failed: connector authorization failure: failed to list GCS buckets:
403 GET https://storage.googleapis.com/storage/v1/b?project=zenml-core&projection=noAcl&prettyPrint=false:
empty-connectors@zenml-core.iam.gserviceaccount.com does not have storage.buckets.list access to the Google Cloud project.
Permission 'storage.buckets.list' denied on resource (or it may not exist).
Error: Service connector 'gcp-empty-sa' verification failed: connector authorization failure: failed to fetch GCS bucket
zenml-bucket-sl: 403 GET https://storage.googleapis.com/storage/v1/b/zenml-bucket-sl?projection=noAcl&prettyPrint=false:
empty-connectors@zenml-core.iam.gserviceaccount.com does not have storage.buckets.get access to the Google Cloud Storage bucket.
Permission 'storage.buckets.get' denied on resource (or it may not exist).
Next, we'll register a GCP Service Connector that actually uses account impersonation to access the zenml-bucket-sl GCS bucket and verify that it can actually access the bucket:
Expanding argument value service_account_json to contents of file /home/stefan/aspyre/src/zenml/empty-connectors@zenml-core.json.
Successfully registered service connector `gcp-impersonate-sa` with access to the following resources:
┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓
┃ RESOURCE TYPE │ RESOURCE NAMES ┃
┠───────────────┼──────────────────────┨
┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃
┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛
External Account (GCP Workload Identity)
Use GCP workload identity federation to authenticate to GCP services using AWS IAM credentials, Azure Active Directory credentials or generic OIDC tokens.
This authentication method only requires a GCP workload identity external account JSON file that contains the configuration for the external account without any sensitive credentials. It allows implementing a two-layer authentication scheme that keeps the set of permissions associated with implicit credentials down to the bare minimum and grants permissions to the privilege-bearing GCP service account instead.
This authentication method can be used to authenticate to GCP services using credentials from other cloud providers or identity providers. When used with workloads running on AWS or Azure, it involves automatically picking up credentials from the AWS IAM or Azure AD identity associated with the workload and using them to authenticate to GCP services. This means that the result depends on the environment where the ZenML server is deployed and is thus not fully reproducible.
When used with AWS or Azure implicit in-cloud authentication, this method may constitute a security risk, because it can give users access to the identity (e.g. AWS IAM role or Azure AD principal) implicitly associated with the environment where the ZenML server is running. For this reason, all implicit authentication methods are disabled by default and need to be explicitly enabled by setting the ZENML_ENABLE_IMPLICIT_AUTH_METHODS environment variable or the helm chart enableImplicitAuthMethods configuration option to true in the ZenML deployment.
By default, the GCP connector generates temporary OAuth 2.0 tokens from the external account credentials and distributes them to clients. The tokens have a limited lifetime of 1 hour. This behavior can be disabled by setting the generate_temporary_tokens configuration option to False, in which case, the connector will distribute the external account credentials JSON to clients instead (not recommended).
A GCP project is required and the connector may only be used to access GCP resources in the specified project. This project must be the same as the one for which the external account was configured.
If you already have the GOOGLE_APPLICATION_CREDENTIALS environment variable configured to point to an external account key JSON file, it will be automatically picked up when auto-configuration is used.
a GCP service account is configured with permissions to access the target resources and granted the roles/iam.workloadIdentityUser role for the workload identity pool and AWS provider
a GCP external account JSON file is generated for the GCP service account. This is used to configure the GCP connector.
This method has the major limitation that the user must regularly generate new tokens and update the connector configuration as OAuth 2.0 tokens expire. On the other hand, this method is ideal in cases where the connector only needs to be used for a short period of time, such as sharing access temporarily with someone else in your team.
Using any of the other authentication methods will automatically generate and refresh OAuth 2.0 tokens for clients upon request.
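Because a manually supplied token is not refreshed for you, it can help to track when a replacement is due. The sketch below assumes the 1-hour token lifetime quoted elsewhere on this page and a 5-minute safety margin. Both the margin and the function name `token_needs_refresh` are choices made for this example, not values taken from ZenML.

```python
from datetime import datetime, timedelta, timezone

# Lifetime quoted on this page for temporary OAuth 2.0 tokens.
TOKEN_LIFETIME = timedelta(hours=1)


def token_needs_refresh(
    issued_at: datetime,
    now: datetime,
    safety_margin: timedelta = timedelta(minutes=5),
) -> bool:
    """Return True once the token is within `safety_margin` of expiring."""
    return now >= issued_at + TOKEN_LIFETIME - safety_margin


if __name__ == "__main__":
    issued = datetime.now(timezone.utc) - timedelta(minutes=58)
    # A 58-minute-old token is inside the 5-minute margin.
    print(token_needs_refresh(issued, datetime.now(timezone.utc)))
```

Passing `now` explicitly keeps the check easy to test. A real setup would read the actual expiry time reported with the token rather than assume a fixed lifetime.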
A GCP project is required and the connector may only be used to access GCP resources in the specified project.
Example auto-configuration
Fetching OAuth 2.0 tokens from the local GCP CLI is possible if the GCP CLI is already configured with valid credentials (e.g. by running gcloud auth application-default login). We need to force the ZenML CLI to use OAuth 2.0 token authentication by passing the --auth-method oauth2-token option; otherwise, it would automatically pick up long-term credentials:
The following is an example of lifting GCP user credentials granting access to the same set of GCP resources and services that the local GCP CLI is allowed to access. The GCP CLI should already be configured with valid credentials (i.e. by running gcloud auth application-default login). In this case, the GCP user account authentication method is automatically detected:
Service connector 'gcp-auto' of type 'gcp' with id 'fe16f141-7406-437e-a579-acebe618a293' is owned by user 'default' and is 'private'.
'gcp-auto' gcp Service Connector Details
┏━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ PROPERTY │ VALUE ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ ID │ fe16f141-7406-437e-a579-acebe618a293 ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ NAME │ gcp-auto ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ TYPE │ 🔵 gcp ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ AUTH METHOD │ user-account ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ RESOURCE TYPES │ 🔵 gcp-generic, 📦 gcs-bucket, 🌀 kubernetes-cluster, 🐳 docker-registry ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ RESOURCE NAME │ <multiple> ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ SECRET ID │ 5eca8f6e-291f-4958-ae2d-a3e847a1ad8a ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ SESSION DURATION │ N/A ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ EXPIRES IN │ N/A ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ OWNER │ default ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ SHARED │ ➖ ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ CREATED_AT │ 2023-05-19 09:15:12.882929 ┃
┠──────────────────┼──────────────────────────────────────────────────────────────────────────┨
┃ UPDATED_AT │ 2023-05-19 09:15:12.882930 ┃
┗━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
Configuration
┏━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━┓
┃ PROPERTY │ VALUE ┃
┠───────────────────┼────────────┨
┃ project_id │ zenml-core ┃
┠───────────────────┼────────────┨
┃ user_account_json │ [HIDDEN] ┃
┗━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━┛
Local client provisioning
The local gcloud CLI, the Kubernetes kubectl CLI and the Docker CLI can be configured with credentials extracted from or generated by a compatible GCP Service Connector. Please note that unlike the configuration made possible through the GCP CLI, the Kubernetes and Docker credentials issued by the GCP Service Connector have a short lifetime and will need to be regularly refreshed. This is a byproduct of implementing a high-security profile.
Note that the gcloud local client can only be configured with credentials issued by the GCP Service Connector if the connector is configured with the GCP user account authentication method or the GCP service account authentication method and if the generate_temporary_tokens option is set to true in the Service Connector configuration.
Only the gcloud local application default credentials configuration will be updated by the GCP Service Connector configuration. This makes it possible to use libraries and SDKs that use the application default credentials to access GCP resources.
Local CLI configuration examples
The following shows an example of configuring the local Kubernetes CLI to access a GKE cluster reachable through a GCP Service Connector:
Service connector 'gcp-user-account' is correctly configured with valid credentials and has access to the following resources:
┏━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┓
┃ RESOURCE TYPE │ RESOURCE NAMES ┃
┠───────────────────────┼────────────────────┨
┃ 🌀 kubernetes-cluster │ zenml-test-cluster ┃
┗━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┛
Calling the login CLI command will configure the local Kubernetes kubectl CLI to access the Kubernetes cluster through the GCP Service Connector:
⠴ Attempting to configure local client using service connector 'gcp-user-account'...
Context "gke_zenml-core_zenml-test-cluster" modified.
Updated local kubeconfig with the cluster details. The current kubectl context was set to 'gke_zenml-core_zenml-test-cluster'.
The 'gcp-user-account' Kubernetes Service Connector connector was used to successfully configure the local Kubernetes cluster client/SDK.
To verify that the local Kubernetes kubectl CLI is correctly configured, the following command can be used:
kubectl cluster-info
Example Command Output
Kubernetes control plane is running at https://35.185.95.223
GLBCDefaultBackend is running at https://35.185.95.223/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy
KubeDNS is running at https://35.185.95.223/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://35.185.95.223/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
A similar process is possible with GCR container registries:
Service connector 'gcp-user-account' is correctly configured with valid credentials and has access to the following resources:
┏━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ RESOURCE TYPE │ RESOURCE NAMES ┃
┠────────────────────┼─────────────────────────────────────────────┨
┃ 🐳 docker-registry │ europe-west1-docker.pkg.dev/zenml-core/test ┃
┗━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
⠦ Attempting to configure local client using service connector 'gcp-user-account'...
WARNING! Your password will be stored unencrypted in /home/stefan/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
The 'gcp-user-account' Docker Service Connector connector was used to successfully configure the local Docker/OCI container registry client/SDK.
To verify that the local Docker container registry client is correctly configured, the following command can be used:
Updated the local gcloud default application credentials file at '/home/user/.config/gcloud/application_default_credentials.json'
The 'gcp-user-account' GCP Service Connector connector was used to successfully configure the local Generic GCP resource client/SDK.
The GCP Service Connector can also be used with any Orchestrator or Model Deployer stack component flavor that relies on Kubernetes clusters to manage workloads. This allows GKE Kubernetes container workloads to be managed without the need to configure and maintain explicit GCP or Kubernetes kubectl configuration contexts and credentials in the target environment or in the Stack Component itself.
Similarly, Container Registry Stack Components can be connected to a Google Artifact Registry or GCR Container Registry through a GCP Service Connector. This allows container images to be built and published to GAR or GCR container registries without the need to configure explicit GCP credentials in the target environment or the Stack Component.
End-to-end examples
GKE Kubernetes Orchestrator, GCS Artifact Store and GCR Container Registry with a multi-type GCP Service Connector
This is an example of an end-to-end workflow in which a single multi-type GCP Service Connector gives multiple Stack Components access to multiple resources. A complete ZenML Stack is registered and composed of the following Stack Components, all connected through the same Service Connector:
As a last step, a simple pipeline is run on the resulting Stack.
Configure the local GCP CLI with valid user account credentials with a wide range of permissions (i.e. by running gcloud auth application-default login) and install ZenML integration prerequisites:
zenml integration install -y gcp
gcloud auth application-default login
Example Command Output
```text
Credentials saved to file: [/home/stefan/.config/gcloud/application_default_credentials.json]
These credentials will be used by any library that requests Application Default Credentials (ADC).
Quota project "zenml-core" was added to ADC which can be used by Google client libraries for billing
and quota. Note that some services may still bill the project owning the resource.
```
Make sure the GCP Service Connector Type is available
**NOTE**: from this point forward, we don't need the local GCP CLI credentials or the local GCP CLI at all. The steps that follow can be run on any machine regardless of whether it has been configured and authorized to access the GCP project.
Find out which GCS buckets, GAR registries, and GKE Kubernetes clusters we can gain access to. We'll use this information to configure the Stack Components in our minimal GCP stack: a GCS Artifact Store, a Kubernetes Orchestrator, and a GCP Container Registry.
```text
The following 'kubernetes-cluster' resources can be accessed by service connectors that you have configured:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┓
┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃
┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────────────┼────────────────────┨
┃ eeeabc13-9203-463b-aa52-216e629e903c │ gcp-demo-multi │ 🔵 gcp │ 🌀 kubernetes-cluster │ zenml-test-cluster ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┛
```
```text
Running with active stack: 'default' (global)
Successfully connected artifact store `gcs-zenml-bucket-sl` to the following resources:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓
┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃
┠──────────────────────────────────────┼────────────────┼────────────────┼───────────────┼──────────────────────┨
┃ eeeabc13-9203-463b-aa52-216e629e903c │ gcp-demo-multi │ 🔵 gcp │ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛
```
register and connect a Kubernetes Orchestrator Stack Component to a GKE cluster:
```text
Running with active stack: 'default' (global)
Successfully connected container registry `gcr-zenml-core` to the following resources:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃
┠──────────────────────────────────────┼────────────────┼────────────────┼────────────────────┼─────────────────────────────────────────────┨
┃ eeeabc13-9203-463b-aa52-216e629e903c │ gcp-demo-multi │ 🔵 gcp │ 🐳 docker-registry │ europe-west1-docker.pkg.dev/zenml-core/test ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
```
Combine all Stack Components into a Stack and set it as active (also add a local Image Builder for completeness):
zenml image-builder register local --flavor local
Example Command Output
```text
Running with active stack: 'default' (global)
Successfully registered image_builder `local`.
```
```sh
zenml stack register gcp-demo -a gcs-zenml-bucket-sl -o gke-zenml-test-cluster -c gcr-zenml-core -i local --set
```
Example Command Output
```text
Stack 'gcp-demo' successfully registered!
Active global stack set to:'gcp-demo'
```
Finally, run a simple pipeline to prove that everything works as expected. We'll use the simplest possible pipeline for this example:

    from zenml import pipeline, step


    @step
    def step_1() -> str:
        """Returns the `world` string."""
        return "world"


    @step(enable_cache=False)
    def step_2(input_one: str, input_two: str) -> None:
        """Combines the two strings at its input and prints them."""
        combined_str = f"{input_one} {input_two}"
        print(combined_str)


    @pipeline
    def my_pipeline():
        # The output of step_1 is passed to step_2 as its second input.
        output_step_one = step_1()
        step_2(input_one="hello", input_two=output_step_one)


    if __name__ == "__main__":
        my_pipeline()

Saving that to a run.py file and running it gives us:
Example Command Output
```text
$ python run.py
Building Docker image(s) for pipeline simple_pipeline.
Building Docker image europe-west1-docker.pkg.dev/zenml-core/test/zenml:simple_pipeline-orchestrator.
- Including integration requirements: gcsfs, google-cloud-aiplatform>=1.11.0, google-cloud-build>=3.11.0, google-cloud-container>=2.21.0, google-cloud-functions>=1.8.3, google-cloud-scheduler>=2.7.3, google-cloud-secret-manager, google-cloud-storage>=2.9.0, kfp==1.8.16, kubernetes==18.20.0, shapely<2.0
No .dockerignore found, including all files inside build context.
Step 1/8 : FROM zenmldocker/zenml:0.39.1-py3.8
Step 2/8 : WORKDIR /app
Step 3/8 : COPY .zenml_integration_requirements .
Step 4/8 : RUN pip install --default-timeout=60 --no-cache-dir -r .zenml_integration_requirements
Step 5/8 : ENV ZENML_ENABLE_REPO_INIT_WARNINGS=False
Step 6/8 : ENV ZENML_CONFIG_PATH=/app/.zenconfig
Step 7/8 : COPY . .
Step 8/8 : RUN chmod -R a+rw .
Pushing Docker image europe-west1-docker.pkg.dev/zenml-core/test/zenml:simple_pipeline-orchestrator.
Finished pushing Docker image.
Finished building Docker image(s).
Running pipeline simple_pipeline on stack gcp-demo (caching disabled)
Waiting for Kubernetes orchestrator pod...
Kubernetes orchestrator pod started.
Waiting for pod of step step_1 to start...
Step step_1 has started.
Step step_1 has finished in 1.357s.
Pod of step step_1 completed.
Waiting for pod of step simple_step_two to start...
Step step_2 has started.
Hello World!
Step step_2 has finished in 3.136s.
Pod of step step_2 completed.
Orchestration pod completed.
Dashboard URL: http://34.148.132.191/default/pipelines/cec118d1-d90a-44ec-8bd7-d978f726b7aa/runs
```
VertexAI Orchestrator, GCS Artifact Store, Google Artifact Registry and GCP Image Builder with single-instance GCP Service Connectors
This is an example of an end-to-end workflow that uses multiple single-instance GCP Service Connectors, each giving one Stack Component access to one resource. A complete ZenML Stack is registered, composed of a Vertex AI Orchestrator, a GCS Artifact Store, a container registry and a GCP Image Builder, each connected through its own Service Connector.
As a last step, a simple pipeline is run on the resulting Stack.
Configure the local GCP CLI with valid user account credentials with a wide range of permissions (e.g. by running gcloud auth application-default login) and install ZenML integration prerequisites:
```sh
zenml integration install -y gcp
gcloud auth application-default login
```
Example Command Output
```text
Credentials saved to file: [/home/stefan/.config/gcloud/application_default_credentials.json]
These credentials will be used by any library that requests Application Default Credentials (ADC).
Quota project "zenml-core" was added to ADC which can be used by Google client libraries for billing
and quota. Note that some services may still bill the project owning the resource.
```
Make sure the GCP Service Connector Type is available
Register an individual single-instance GCP Service Connector using auto-configuration for each of the resources that will be needed for the Stack Components: a GCS bucket, a GCR registry, and generic GCP access for the VertexAI orchestrator and another one for the GCP Cloud Builder:
```text
Successfully registered service connector `gcs-zenml-bucket-sl` with access to the following resources:
┏━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓
┃ RESOURCE TYPE │ RESOURCE NAMES ┃
┠───────────────┼──────────────────────┨
┃ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃
┗━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛
```
```text
Successfully registered service connector `vertex-ai-zenml-core` with access to the following resources:
┏━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓
┃ RESOURCE TYPE │ RESOURCE NAMES ┃
┠────────────────┼────────────────┨
┃ 🔵 gcp-generic │ zenml-core ┃
┗━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛
```
```text
Successfully registered service connector `gcp-cloud-builder-zenml-core` with access to the following resources:
┏━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓
┃ RESOURCE TYPE │ RESOURCE NAMES ┃
┠────────────────┼────────────────┨
┃ 🔵 gcp-generic │ zenml-core ┃
┗━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛
```
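The three registrations whose output is shown above could be scripted along the following lines. This is a sketch under stated assumptions: the connector names and the bucket are taken from the output above, the flags (`--type`, `--resource-type`, `--resource-id`, `--auto-configure`) should be checked against `zenml service-connector register --help` for your ZenML version, and `register_command` is a hypothetical helper. The commands are only printed, not executed.

```python
import shlex

# Connector name -> (resource type, resource ID or None), from the output above.
# The generic connectors are registered without an explicit resource ID here.
CONNECTORS = {
    "gcs-zenml-bucket-sl": ("gcs-bucket", "gs://zenml-bucket-sl"),
    "vertex-ai-zenml-core": ("gcp-generic", None),
    "gcp-cloud-builder-zenml-core": ("gcp-generic", None),
}


def register_command(name, resource_type, resource_id=None):
    """Build an auto-configured, single-instance connector registration command."""
    cmd = [
        "zenml", "service-connector", "register", name,
        "--type", "gcp",
        "--resource-type", resource_type,
    ]
    if resource_id:
        cmd += ["--resource-id", resource_id]
    cmd.append("--auto-configure")  # pick up credentials from the local GCP CLI
    return cmd


if __name__ == "__main__":
    for name, (resource_type, resource_id) in CONNECTORS.items():
        print(shlex.join(register_command(name, resource_type, resource_id)))
```

Keeping one connector per resource, as in this mapping, is what makes each of them single-instance: a Stack Component connected to it can reach only that resource.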
**NOTE**: from this point forward, we don't need the local GCP CLI credentials or the local GCP CLI at all. The steps that follow can be run on any machine regardless of whether it has been configured and authorized to access the GCP project.
In the end, the service connector list should look like this:
```sh
zenml service-connector list
```
```text
Running with active stack: 'default' (global)
Successfully connected artifact store `gcs-zenml-bucket-sl` to the following resources:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┓
┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃
┠──────────────────────────────────────┼─────────────────────┼────────────────┼───────────────┼──────────────────────┨
┃ 405034fe-5e6e-4d29-ba62-8ae025381d98 │ gcs-zenml-bucket-sl │ 🔵 gcp │ 📦 gcs-bucket │ gs://zenml-bucket-sl ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┛
```
register and connect a Google Cloud Image Builder Stack Component to the target GCP project:
```text
Running with active stack: 'default' (repository)
Successfully connected image builder `gcp-zenml-core` to the following resources:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━┓
┃ CONNECTOR ID │ CONNECTOR NAME │ CONNECTOR TYPE │ RESOURCE TYPE │ RESOURCE NAMES ┃
┠──────────────────────────────────────┼──────────────────────────────┼────────────────┼────────────────┼────────────────┨
┃ 648c1016-76e4-4498-8de7-808fd20f057b │ gcp-cloud-builder-zenml-core │ 🔵 gcp │ 🔵 gcp-generic │ zenml-core ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━┛
```
register and connect a Vertex AI Orchestrator Stack Component to the target GCP project
**NOTE**: If we do not specify a workload service account, the Vertex AI Pipelines Orchestrator uses the Compute Engine default service account in the target project to run pipelines. You must grant this account the Vertex AI Service Agent role, otherwise the pipelines will fail. See the Vertex AI Orchestrator documentation for the other configuration options.
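To make the note above concrete, the following sketch derives the e-mail address of the Compute Engine default service account and assembles a `gcloud` command that grants it a role. Several things here are assumptions to verify before use: the project ID `zenml-core` and project number `20219041791` are simply the values that appear in this walkthrough's example output, `roles/aiplatform.serviceAgent` is assumed to be the role ID behind the "Vertex AI Service Agent" display name, and both helper functions are hypothetical. Nothing is executed; the command is only printed.

```python
import shlex


def default_compute_service_account(project_number: str) -> str:
    """E-mail of the Compute Engine default service account of a project."""
    return f"{project_number}-compute@developer.gserviceaccount.com"


def grant_role_command(project_id, member_email,
                       role="roles/aiplatform.serviceAgent"):
    """Build a `gcloud projects add-iam-policy-binding` argument list.

    The default role ID is an assumption; confirm it in the IAM console.
    """
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=serviceAccount:{member_email}",
        f"--role={role}",
    ]


if __name__ == "__main__":
    account = default_compute_service_account("20219041791")
    print(account)
    print(shlex.join(grant_role_command("zenml-core", account)))
```

If a dedicated workload service account is configured on the orchestrator instead, the same binding would target that account rather than the default one.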
```text
Stack 'gcp-demo' successfully registered!
Active repository stack set to:'gcp-demo'
```
Finally, run a simple pipeline to prove that everything works as expected. We'll use the simplest pipeline possible for this example:
```python
from zenml import pipeline, step


@step
def step_1() -> str:
    """Returns the `world` string."""
    return "world"


@step(enable_cache=False)
def step_2(input_one: str, input_two: str) -> None:
    """Combines the two strings at its input and prints them."""
    combined_str = f"{input_one} {input_two}"
    print(combined_str)


@pipeline
def my_pipeline():
    output_step_one = step_1()
    step_2(input_one="hello", input_two=output_step_one)


if __name__ == "__main__":
    my_pipeline()
```
Saving that to a run.py file and running it gives us:
Example Command Output
```text
$ python run.py
Building Docker image(s) for pipeline simple_pipeline.
Building Docker image gcr.io/zenml-core/zenml:simple_pipeline-orchestrator.
- Including integration requirements: gcsfs, google-cloud-aiplatform>=1.11.0, google-cloud-build>=3.11.0, google-cloud-container>=2.21.0, google-cloud-functions>=1.8.3, google-cloud-scheduler>=2.7.3, google-cloud-secret-manager, google-cloud-storage>=2.9.0, kfp==1.8.16, shapely<2.0
Using Cloud Build to build image gcr.io/zenml-core/zenml:simple_pipeline-orchestrator
No .dockerignore found, including all files inside build context.
Uploading build context to gs://zenml-bucket-sl/cloud-build-contexts/5dda6dbb60e036398bee4974cfe3eb768a138b2e.tar.gz.
Build context located in bucket zenml-bucket-sl and object path cloud-build-contexts/5dda6dbb60e036398bee4974cfe3eb768a138b2e.tar.gz
Using Cloud Builder image gcr.io/cloud-builders/docker to run the steps in the build. Container will be attached to network using option --network=cloudbuild.
Running Cloud Build to build the Docker image. Cloud Build logs: https://console.cloud.google.com/cloud-build/builds/068e77a1-4e6f-427a-bf94-49c52270af7a?project=20219041791
The Docker image has been built successfully. More information can be found in the Cloud Build logs: https://console.cloud.google.com/cloud-build/builds/068e77a1-4e6f-427a-bf94-49c52270af7a?project=20219041791.
Finished building Docker image(s).
Running pipeline simple_pipeline on stack gcp-demo (caching disabled)
The attribute pipeline_root has not been set in the orchestrator configuration. One has been generated automatically based on the path of the GCPArtifactStore artifact store in the stack used to execute the pipeline. The generated pipeline_root is gs://zenml-bucket-sl/vertex_pipeline_root/simple_pipeline/simple_pipeline_default_6e72f3e1.
/home/stefan/aspyre/src/zenml/.venv/lib/python3.8/site-packages/kfp/v2/compiler/compiler.py:1290: FutureWarning: APIs imported from the v1 namespace (e.g. kfp.dsl, kfp.components, etc) will not be supported by the v2 compiler since v2.0.0
warnings.warn(
Writing Vertex workflow definition to /home/stefan/.config/zenml/vertex/8a0b53ee-644a-4fbe-8e91-d4d6ddf79ae8/pipelines/simple_pipeline_default_6e72f3e1.json.
No schedule detected. Creating one-off vertex job...
Submitting pipeline job with job_id simple-pipeline-default-6e72f3e1 to Vertex AI Pipelines service.
The Vertex AI Pipelines job workload will be executed using the connectors-vertex-ai-workload@zenml-core.iam.gserviceaccount.com service account.
Creating PipelineJob
INFO:google.cloud.aiplatform.pipeline_jobs:Creating PipelineJob
PipelineJob created. Resource name: projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1
INFO:google.cloud.aiplatform.pipeline_jobs:PipelineJob created. Resource name: projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1
To use this PipelineJob in another session:
INFO:google.cloud.aiplatform.pipeline_jobs:To use this PipelineJob in another session:
pipeline_job = aiplatform.PipelineJob.get('projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1')
INFO:google.cloud.aiplatform.pipeline_jobs:pipeline_job = aiplatform.PipelineJob.get('projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1')
View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/europe-west1/pipelines/runs/simple-pipeline-default-6e72f3e1?project=20219041791
INFO:google.cloud.aiplatform.pipeline_jobs:View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/europe-west1/pipelines/runs/simple-pipeline-default-6e72f3e1?project=20219041791
View the Vertex AI Pipelines job at https://console.cloud.google.com/vertex-ai/locations/europe-west1/pipelines/runs/simple-pipeline-default-6e72f3e1?project=20219041791
Waiting for the Vertex AI Pipelines job to finish...
PipelineJob projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1 current state:
PipelineState.PIPELINE_STATE_RUNNING
INFO:google.cloud.aiplatform.pipeline_jobs:PipelineJob projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1 current state:
PipelineState.PIPELINE_STATE_RUNNING
...
PipelineJob run completed. Resource name: projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1
INFO:google.cloud.aiplatform.pipeline_jobs:PipelineJob run completed. Resource name: projects/20219041791/locations/europe-west1/pipelineJobs/simple-pipeline-default-6e72f3e1
Dashboard URL: https://34.148.132.191/default/pipelines/17cac6b5-3071-45fa-a2ef-cda4a7965039/runs
```
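The output above identifies the run by a resource name of the form `projects/<project number>/locations/<region>/pipelineJobs/<job id>`, and the console link is built from the same three parts. As a small sketch, the name can be split and the link rebuilt with the standard library alone. The function names are hypothetical, and the URL template is inferred from the links printed in the output above rather than taken from a documented API.

```python
import re

# Pattern for the resource names printed in the output above.
_JOB_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/pipelineJobs/(?P<job_id>[^/]+)$"
)


def parse_pipeline_job(resource_name: str) -> dict:
    """Split a PipelineJob resource name into project, location and job ID."""
    match = _JOB_NAME.match(resource_name)
    if match is None:
        raise ValueError(f"not a PipelineJob resource name: {resource_name!r}")
    return match.groupdict()


def console_url(resource_name: str) -> str:
    """Rebuild the console link in the shape shown in the log output."""
    parts = parse_pipeline_job(resource_name)
    return (
        "https://console.cloud.google.com/vertex-ai/locations/"
        f"{parts['location']}/pipelines/runs/{parts['job_id']}"
        f"?project={parts['project']}"
    )


if __name__ == "__main__":
    name = (
        "projects/20219041791/locations/europe-west1/"
        "pipelineJobs/simple-pipeline-default-6e72f3e1"
    )
    print(parse_pipeline_job(name))
    print(console_url(name))
```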