Connecting ZenML to resources
Understanding the workflow of using Service Connectors to access external resources with ZenML.
Everything around Service Connectors is expressed in terms of resources: a Kubernetes cluster is a resource, an S3 bucket is another resource. Different flavors of Stack Components need different resources to function: the Kubernetes and Tekton Orchestrators need access to a Kubernetes cluster, and the S3 Artifact Store needs access to an S3 bucket. It is still possible to configure Stack Components like these to authenticate and connect directly to the target services they interact with, but doing so is not simple to set up and is hard to reproduce and maintain.
Service Connectors simplify the configuration of ZenML Stack Components by taking over and mediating all concerns related to authentication and access to these resources. Once Service Connectors are set up, anyone can configure Stacks and Stack Components to easily access and utilize external resources in their ML pipelines without worrying about the specifics of authentication and access.
In this section, we walk through a typical workflow to explain conceptually the role that Service Connectors play in connecting ZenML to external resources.
The typical Service Connectors workflow
The first step is finding out what types of resources you can connect ZenML to. Maybe you have already planned out the infrastructure options for your MLOps platform and are looking to find out whether ZenML can accommodate them. Or perhaps you want to use a particular Stack Component flavor in your Stack and are wondering whether you can use a Service Connector to connect it to external resources.
This is where the Service Connector Type concept comes in. For now, it is sufficient to think of Service Connector Types as a way to describe all the different kinds of resources that Service Connectors can mediate access to. The available Service Connector Types can be listed with the ZenML CLI.
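As a minimal sketch, assuming the ZenML CLI is installed and connected to your ZenML deployment, listing the Service Connector Types looks roughly like this. The output depends on the ZenML version and on which integrations are installed, so it is not reproduced here.

```shell
# List every Service Connector Type available in this ZenML installation,
# along with the resource types and authentication methods each one supports.
zenml service-connector list-types
```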
Let's say our cloud provider of choice is AWS and we're looking to hook up an S3 bucket to an S3 Artifact Store stack component and potentially other AWS resources in addition to that. Note that there is an AWS Service Connector type that we can use to gain access to several types of resources, one of which is an S3 bucket. We'll use that in the next steps.
The second step is registering a Service Connector that enables ZenML to authenticate to and access one or more remote resources. This step is best handled by someone with some infrastructure knowledge, but there are sane defaults and auto-detection mechanisms built into the AWS Service Connector that can make this a walk in the park even for the uninitiated. A simple example is registering an AWS Service Connector with AWS credentials automatically picked up from your local host, which gives ZenML access to the same resources that you can access from your local machine through the AWS CLI, such as EKS clusters, ECR repositories or S3 buckets.
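The auto-configured registration described above can be sketched as follows. The connector name aws-auto is a hypothetical placeholder, and the sketch assumes that the AWS CLI is already configured with working credentials on the local machine and that the ZenML AWS integration is installed.

```shell
# Register a Service Connector of type "aws" under the hypothetical name
# "aws-auto". --auto-configure asks ZenML to detect credentials from the
# local environment instead of requiring them to be passed explicitly.
zenml service-connector register aws-auto --type aws --auto-configure
```

Adding --resource-type s3-bucket to the same command should restrict the connector to S3 buckets only, which may be preferable when a narrower scope is wanted.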
The ZenML CLI provides an even easier and more interactive way of registering Service Connectors. Just use the -i command line argument and follow the interactive guide.
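A minimal sketch of the interactive variant, assuming the ZenML CLI is installed. The prompts it shows depend on the ZenML version and the connector type chosen.

```shell
# Start the interactive registration guide; ZenML prompts for the connector
# name, type, resource type, authentication method and credentials.
zenml service-connector register -i
```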
The third step is preparing to configure the Stack Components and Stacks that you will use to run pipelines, the same way you would do it without Service Connectors, but this time you have the option of discovering which remote resources are available for you to use. For example, if you needed an S3 bucket for your S3 Artifact Store, you could run a CLI command that asks "which S3 buckets am I authorized to access through ZenML?". The result is a list of resource names, identifying those S3 buckets and the Service Connectors that facilitate access to them.
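The resource discovery step can be sketched like this, assuming at least one Service Connector with access to S3 has already been registered. The buckets listed depend entirely on your AWS account and permissions.

```shell
# Ask every registered Service Connector which S3 buckets it can reach.
# s3-bucket is the resource type identifier used by the AWS connector type.
zenml service-connector list-resources --resource-type s3-bucket
```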
The next step in this journey is configuring and connecting one or more Stack Components to a remote resource via the Service Connector registered and listed in previous steps. This is as easy as saying "I want this S3 Artifact Store to use the s3://ml-bucket S3 bucket" or "I want this Kubernetes Orchestrator to use the mega-ml-cluster Kubernetes cluster" and doesn't require any knowledge whatsoever about the authentication mechanisms or even the provenance of those resources. For example, you can create an S3 Artifact Store and connect it to an S3 bucket through the connector registered earlier.
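A minimal sketch of creating and connecting the S3 Artifact Store, using the s3://ml-bucket bucket named above. The component name s3-zenfiles and the connector name aws-auto are hypothetical placeholders; substitute the names used in your own setup. The sketch also assumes the ZenML S3 integration is installed.

```shell
# Register an S3 Artifact Store (hypothetical name "s3-zenfiles") that
# stores artifacts under the s3://ml-bucket bucket.
zenml artifact-store register s3-zenfiles --flavor s3 --path=s3://ml-bucket

# Connect the Artifact Store to the bucket through a previously registered
# Service Connector (hypothetical name "aws-auto").
zenml artifact-store connect s3-zenfiles --connector aws-auto
```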
The ZenML CLI provides an even easier and more interactive way of connecting a stack component to an external resource. Just pass the -i command line argument and follow the interactive guide.
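The interactive variant can be sketched as follows, again using the hypothetical component name s3-zenfiles. ZenML should then present the compatible resources and connectors to choose from.

```shell
# Interactively pick the Service Connector and resource to attach to the
# Artifact Store registered under the hypothetical name "s3-zenfiles".
zenml artifact-store connect s3-zenfiles -i
```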
Of course, the stack component we just connected to the infrastructure is not really useful on its own. We need to make it part of a Stack, set the Stack as active, and finally run some pipelines on it. But Service Connectors no longer play any visible role in this part, which is why they're so useful: they do all the heavy lifting in the background so you can focus on what matters.
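As a rough sketch of that final part, the connected Artifact Store could be combined with the default orchestrator into a Stack and activated in one command. The stack name s3-stack and the component name s3-zenfiles are hypothetical placeholders.

```shell
# Register a Stack made of the default orchestrator (-o) and the connected
# S3 Artifact Store (-a), and set it as the active Stack (--set).
zenml stack register s3-stack -o default -a s3-zenfiles --set
```

Any pipeline run after this point uses the active Stack, and the Service Connector handles authentication to the bucket in the background.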