Quickstart
A simple example to get started with ZenML
Our goal here is to help you get your first practical experience with our tool and to give you a brief overview of some basic functionalities of ZenML.
The quickest way to get started is to create a simple pipeline. We'll be using the MNIST dataset of handwritten digits (originally developed by Yann LeCun and others), and later the Fashion-MNIST dataset developed by Zalando.
If you want to run this notebook in an interactive environment, feel free to run it in Google Colab.

Purpose

This quickstart guide is designed to provide a practical introduction to some of the main concepts and paradigms used by the ZenML framework.

Using Google Colab

You will want to use a GPU for this example. If you are following this quickstart in Google Colab, follow these steps:
1. Before running anything, you need to tell Colab that you want to use a GPU. You can do this by clicking on the ‘Runtime’ tab and selecting ‘Change runtime type’. A pop-up window will open with a drop-down menu.
2. Select ‘GPU’ from the menu and click ‘Save’.
3. It may ask if you want to restart the runtime. If so, go ahead and do that.
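After the runtime restarts, you can optionally confirm that TensorFlow (which comes preinstalled in Colab) can see the GPU. This quick check is just a convenience and not required for the rest of the quickstart:

```python
import tensorflow as tf

# Should list at least one GPU device if the runtime type was set correctly
print(tf.config.list_physical_devices("GPU"))
```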

Install libraries

```bash
# Install the ZenML CLI tool
!pip install zenml tensorflow tensorflow_datasets
```
Once the installation is complete, you can go ahead and create your first ZenML repository for your project. As ZenML repositories are built on top of Git repositories, you can create yours in an empty directory of your choice:
```bash
# Initialize a git repository
!git init

# Initialize ZenML's .zen file
!zenml init
```
Now the setup is complete. For the next steps, just make sure that you are executing the code within your ZenML repository.
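If you want to double-check that the initialization worked, one simple way is to list the hidden entries in the directory; you should see .git and .zen alongside your other files:

```bash
# .git and .zen should both show up in the listing
!ls -a
```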

Define ZenML Steps

In the code that follows, you can see that we are defining the various steps of our pipeline. Each step is decorated with @step, the main abstraction that is currently available for creating pipeline steps.
```python
import numpy as np
import tensorflow as tf

from zenml.pipelines import pipeline
from zenml.steps import step
from zenml.steps.base_step_config import BaseStepConfig
from zenml.steps.step_output import Output


class TrainerConfig(BaseStepConfig):
    """Trainer params"""

    epochs: int = 1


@step
def importer_mnist() -> Output(
    X_train=np.ndarray, y_train=np.ndarray, X_test=np.ndarray, y_test=np.ndarray
):
    """Download the MNIST data and store it as an artifact."""
    (X_train, y_train), (
        X_test,
        y_test,
    ) = tf.keras.datasets.mnist.load_data()
    return X_train, y_train, X_test, y_test


@step
def normalizer(
    X_train: np.ndarray, X_test: np.ndarray
) -> Output(X_train_normed=np.ndarray, X_test_normed=np.ndarray):
    """Normalize the values for all the images so they are between 0 and 1."""
    X_train_normed = X_train / 255.0
    X_test_normed = X_test / 255.0
    return X_train_normed, X_test_normed


@step
def trainer(
    config: TrainerConfig,
    X_train: np.ndarray,
    y_train: np.ndarray,
) -> tf.keras.Model:
    """Train a neural net from scratch to recognize MNIST digits and
    return the trained model."""
    model = tf.keras.Sequential(
        [
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(10, activation="relu"),
            tf.keras.layers.Dense(10),
        ]
    )

    model.compile(
        optimizer=tf.keras.optimizers.Adam(0.001),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )

    model.fit(
        X_train,
        y_train,
        epochs=config.epochs,
    )

    # Return the trained model so it gets stored as an artifact
    return model


@step
def evaluator(
    X_test: np.ndarray,
    y_test: np.ndarray,
    model: tf.keras.Model,
) -> np.ndarray:
    """Calculate the test loss and accuracy of the trained model."""
    test_loss, test_acc = model.evaluate(X_test, y_test, verbose=2)
    return np.array([test_loss, test_acc])
```

Define ZenML Pipeline

A pipeline is defined with the @pipeline decorator. The pipeline function connects the various steps and specifies the order in which they will run.
```python
@pipeline
def mnist_pipeline(
    importer,
    normalizer,
    trainer,
    evaluator,
):
    # Link all the steps together via their artifacts
    X_train, y_train, X_test, y_test = importer()
    X_train_normed, X_test_normed = normalizer(X_train=X_train, X_test=X_test)
    model = trainer(X_train=X_train_normed, y_train=y_train)
    evaluator(X_test=X_test_normed, y_test=y_test, model=model)
```

Run the Pipeline with MNIST

Here we initialize an instance of our mnist_pipeline, wiring in the concrete steps, and then run it.
```python
# Initialize the pipeline
p = mnist_pipeline(
    importer=importer_mnist(),
    normalizer=normalizer(),
    trainer=trainer(config=TrainerConfig(epochs=1)),
    evaluator=evaluator(),
)
p.run()
```
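One epoch keeps the example fast, but training for longer usually improves accuracy. Since the epoch count lives in TrainerConfig, you can re-run the same pipeline with a different configuration; the sketch below simply picks five epochs as an illustrative value:

```python
# Re-run the pipeline, this time training for five epochs
p_longer = mnist_pipeline(
    importer=importer_mnist(),
    normalizer=normalizer(),
    trainer=trainer(config=TrainerConfig(epochs=5)),  # example value, adjust as you like
    evaluator=evaluator(),
)
p_longer.run()
```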

From MNIST to Fashion MNIST

We got pretty good results on the MNIST model that we trained, but maybe we want to see how a similar training pipeline would work on a different dataset.
Below you can see how easy it is to switch out the data import step for another one in our pipeline.
```python
# Define a new, modified import step to download the Fashion-MNIST data
@step
def importer_fashion_mnist() -> Output(
    X_train=np.ndarray, y_train=np.ndarray, X_test=np.ndarray, y_test=np.ndarray
):
    """Download the Fashion-MNIST data and store it as an artifact."""
    (X_train, y_train), (
        X_test,
        y_test,
    ) = tf.keras.datasets.fashion_mnist.load_data()
    return X_train, y_train, X_test, y_test


# Initialize a new pipeline
fashion_p = mnist_pipeline(
    importer=importer_fashion_mnist(),
    normalizer=normalizer(),
    trainer=trainer(config=TrainerConfig(epochs=1)),
    evaluator=evaluator(),
)

# Run the new pipeline
fashion_p.run()
```
… and that's it for the quickstart. If you came here without a hiccup, you must have successfully installed ZenML, set up a ZenML repo, configured a training pipeline, executed it, and evaluated the results. And this is just the tip of the iceberg of what ZenML can do.
However, if you had a hiccup or you have some suggestions or questions regarding our framework, you can always check our GitHub repository or, even better, join us on our Slack channel.
For more detailed information on all the components and steps that went into this short example, please continue reading our more detailed documentation pages.
Cheers!