Chapter 3
Train some models.
If you want to see the code for this chapter of the guide, head over to the GitHub repository.

Train and evaluate the model.

Finally we can train and evaluate our model.

Create steps

For this we add two steps: a trainer and an evaluator. We continue to use TensorFlow for both.

Trainer

```python
import numpy as np
import tensorflow as tf

from zenml.steps import step
from zenml.steps.base_step_config import BaseStepConfig


class TrainerConfig(BaseStepConfig):
    """Trainer params"""

    epochs: int = 1
    gamma: float = 0.7
    lr: float = 0.001


@step
def tf_trainer(
    config: TrainerConfig,  # not an artifact; passed in at run-time
    X_train: np.ndarray,
    y_train: np.ndarray,
) -> tf.keras.Model:
    """Train a neural network from scratch to recognize MNIST digits and
    return the trained model."""
    model = tf.keras.Sequential(
        [
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(10, activation="relu"),
            tf.keras.layers.Dense(10),
        ]
    )

    model.compile(
        optimizer=tf.keras.optimizers.Adam(config.lr),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )

    model.fit(
        X_train,
        y_train,
        epochs=config.epochs,
    )

    # return the trained model as a step artifact
    return model
```
A few things of note:
  • This is our first instance of parameterizing a step with a BaseStepConfig. This allows us to specify some parameters at run-time rather than via data artifacts between steps.
  • This time the trainer returns a tf.keras.Model, which ZenML takes care of storing in the artifact store. We will talk about how to 'take over' this storing via Materializers in a later chapter.
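The override mechanics of a step config can be sketched without ZenML at all: defaults live on the class, and any value passed at instantiation replaces the default. Below is a minimal stand-in using a plain Python dataclass, not ZenML's actual `BaseStepConfig`:

```python
from dataclasses import dataclass


# Minimal sketch of the config idea using a plain dataclass
# (a stand-in for ZenML's BaseStepConfig): defaults live on the
# class, and any value passed at instantiation replaces them.
@dataclass
class TrainerConfig:
    epochs: int = 1
    gamma: float = 0.7
    lr: float = 0.001


default_config = TrainerConfig()         # epochs=1, lr=0.001
custom_config = TrainerConfig(epochs=3)  # override only epochs
```

This is exactly what happens later when we write `tf_trainer(config=TrainerConfig(epochs=1))` at pipeline assembly time.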

Evaluator

We also add a simple evaluator:
```python
@step
def tf_evaluator(
    X_test: np.ndarray,
    y_test: np.ndarray,
    model: tf.keras.Model,
) -> float:
    """Calculate the accuracy of the model on the test set."""
    _, test_acc = model.evaluate(X_test, y_test, verbose=2)
    return test_acc
```
This gets the model and test data, and calculates simple model accuracy over the test set.
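Under the hood, "accuracy over the test set" is just the fraction of argmax predictions that match the labels. A minimal numpy sketch with made-up logits (illustrative values only; the real model emits 10 logits per MNIST image):

```python
import numpy as np

# Made-up logits for three samples over three classes
# (illustrative only, not real model output).
logits = np.array([
    [2.0, 0.1, -1.0],
    [0.0, 3.0, 0.5],
    [1.0, 0.2, 0.3],
])
y_test = np.array([0, 1, 2])

# Predicted class = index of the largest logit per row
preds = logits.argmax(axis=1)

# Accuracy = fraction of correct predictions
accuracy = float((preds == y_test).mean())
```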

Pipeline

And now our pipeline looks like this:
```python
@pipeline
def mnist_pipeline(
    importer,
    normalizer,
    trainer,
    evaluator,
):
    # Link all the steps' artifacts together
    X_train, y_train, X_test, y_test = importer()
    X_trained_normed, X_test_normed = normalizer(X_train=X_train, X_test=X_test)
    model = trainer(X_train=X_trained_normed, y_train=y_train)
    evaluator(X_test=X_test_normed, y_test=y_test, model=model)
```
We can run it with the concrete functions:
```python
# Initialize the pipeline with concrete step functions and run it
mnist_pipeline(
    importer=importer_mnist(),
    normalizer=normalize_mnist(),
    trainer=tf_trainer(config=TrainerConfig(epochs=1)),
    evaluator=tf_evaluator(),
).run()
```
Beautiful, now the pipeline is truly doing something. Let's run it!

Run

You can run this as follows:
```shell
python chapter_3.py
```
The output will look as follows (note: this is filtered to highlight the most important logs):
```
Creating pipeline: mnist_pipeline
Cache enabled for pipeline `mnist_pipeline`
Using orchestrator `local_orchestrator` for pipeline `mnist_pipeline`. Running pipeline..
Step `importer_mnist` has started.
Step `importer_mnist` has finished in 1.819s.
Step `normalize_mnist` has started.
Step `normalize_mnist` has finished in 2.036s.
Step `tf_trainer` has started.
Step `tf_trainer` has finished in 4.723s.
Step `tf_evaluator` has started.
Step `tf_evaluator` has finished in 0.742s.
```

Inspect

If you add the following code to fetch the pipeline:
```python
from zenml.core.repo import Repository

repo = Repository()
p = repo.get_pipeline(pipeline_name="mnist_pipeline")
runs = p.runs
print(f"Pipeline `mnist_pipeline` has {len(runs)} run(s)")
run = runs[-1]
print(f"The run you just made has {len(run.steps)} steps.")
step = run.get_step('evaluator')
print(
    f"The `tf_evaluator step` returned an accuracy: {step.output.read()}"
)
```
You get the following output:
```
Pipeline `mnist_pipeline` has 1 run(s)
The first run has 4 steps.
The `tf_evaluator step` returned an accuracy: 0.9100000262260437
```
Wow, we just trained our first model! But we have not stopped yet. What if we did not want to use TensorFlow? Let's swap out our trainers and evaluators for different libraries.