This will open an interactive visualization in your local browser at http://127.0.0.1:8050/, where squares represent your artifacts and circles your pipeline steps.
The nodes are color-coded by state, so if your pipeline ever fails or runs for too long, you can find the responsible step at a glance: a failed step is colored red, and a step that is still running is colored yellow.
Visualizing Caching
In addition to Completed, Running, and Failed, there is also a separate Cached state. You already learned about caching in a previous section on Caching Pipeline Runs. Using the PipelineRunLineageVisualizer, you can see at a glance which steps were cached (green) and which were rerun (blue). See below for a detailed example.
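To keep the color legend in one place, here is a minimal sketch of the state-to-color mapping described above. It is plain Python for illustration only and not part of the ZenML API: the names `STATUS_COLORS` and `color_for` are hypothetical, and the mapping assumes that a step which was rerun successfully is shown in the Completed state.

```python
# Hypothetical legend, taken from the colors described in this section.
# This is NOT a ZenML object; it only summarizes the text.
STATUS_COLORS = {
    "completed": "blue",   # step was (re)run successfully
    "cached": "green",     # step output was reused from a previous run
    "running": "yellow",   # step is still executing
    "failed": "red",       # step raised an error
}


def color_for(status: str) -> str:
    """Return the legend color for a step state (case-insensitive)."""
    try:
        return STATUS_COLORS[status.lower()]
    except KeyError:
        raise ValueError(f"Unknown step state: {status!r}") from None


for state in ("Completed", "Cached", "Running", "Failed"):
    print(f"{state:<10} -> {color_for(state)}")
```

A lookup table like this is handy when you describe a lineage graph in a report or a test, since the visualizer itself only shows the colors.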
Code Example
In the following example we use the PipelineRunLineageVisualizer to visualize the three pipeline runs from the Caching Pipeline Runs Example:
import numpy as np
from sklearn.base import ClassifierMixin
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

from zenml.steps import BaseStepConfig, Output, step
from zenml.pipelines import pipeline
from zenml.integrations.dash.visualizers.pipeline_run_lineage_visualizer import (
    PipelineRunLineageVisualizer,
)
from zenml.repository import Repository


@step
def digits_data_loader() -> Output(
    X_train=np.ndarray, X_test=np.ndarray, y_train=np.ndarray, y_test=np.ndarray
):
    """Loads the digits dataset as a tuple of flattened numpy arrays."""
    digits = load_digits()
    data = digits.images.reshape((len(digits.images), -1))
    X_train, X_test, y_train, y_test = train_test_split(
        data, digits.target, test_size=0.2, shuffle=False
    )
    return X_train, X_test, y_train, y_test


class SVCTrainerStepConfig(BaseStepConfig):
    """Trainer params"""

    gamma: float = 0.001


@step(enable_cache=False)  # never cache this step, always retrain
def svc_trainer(
    config: SVCTrainerStepConfig,
    X_train: np.ndarray,
    y_train: np.ndarray,
) -> ClassifierMixin:
    """Train a sklearn SVC classifier."""
    model = SVC(gamma=config.gamma)
    model.fit(X_train, y_train)
    return model


@pipeline
def first_pipeline(step_1, step_2):
    X_train, X_test, y_train, y_test = step_1()
    step_2(X_train, y_train)


first_pipeline_instance = first_pipeline(
    step_1=digits_data_loader(), step_2=svc_trainer()
)

# The pipeline is executed for the first time, so all steps are run.
first_pipeline_instance.run()
latest_run = first_pipeline_instance.get_runs()[-1]
PipelineRunLineageVisualizer().visualize(latest_run)

# Step one will use cache, step two will rerun due to the decorator config
first_pipeline_instance.run()
latest_run = first_pipeline_instance.get_runs()[-1]
PipelineRunLineageVisualizer().visualize(latest_run)

# The complete pipeline will be rerun
first_pipeline_instance.run(enable_cache=False)
latest_run = first_pipeline_instance.get_runs()[-1]
PipelineRunLineageVisualizer().visualize(latest_run)
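To check your reading of the three lineage graphs, the sketch below models which state each step should end up in for each of the three runs. It is a deliberately simplified toy model, not ZenML's actual caching logic: the function `expected_states` is hypothetical, and it assumes a step is cached only when a previous run exists and caching is enabled both on the step and for the run. Real caching may also depend on whether the step's code, inputs, or configuration have changed.

```python
from typing import Dict


def expected_states(
    step_cache_flags: Dict[str, bool],
    has_previous_run: bool,
    run_enable_cache: bool = True,
) -> Dict[str, str]:
    """Toy model: return 'cached' or 'completed' for each step of a run."""
    states = {}
    for name, step_enable_cache in step_cache_flags.items():
        # Assumption: reuse requires an earlier run plus both cache switches on.
        can_reuse = has_previous_run and run_enable_cache and step_enable_cache
        states[name] = "cached" if can_reuse else "completed"
    return states


# Step-level cache settings, mirroring the decorators in the example above.
steps = {"digits_data_loader": True, "svc_trainer": False}

run_1 = expected_states(steps, has_previous_run=False)
run_2 = expected_states(steps, has_previous_run=True)
run_3 = expected_states(steps, has_previous_run=True, run_enable_cache=False)

for label, states in (("run 1", run_1), ("run 2", run_2), ("run 3", run_3)):
    print(label, states)
```

Under these assumptions, only `digits_data_loader` in the second run is expected to appear as cached; every other step in all three runs is expected to appear as rerun.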