Visualize Data Lineage
How to visualize ZenML pipeline runs
ZenML's Dash integration provides a PipelineRunLineageVisualizer that can be used to visualize pipeline runs in your local browser, as shown below:
[Figure: Pipeline Run Visualization Example]

Requirements

Before you can use the Dash visualizer, you first need to install ZenML's Dash integration:
zenml integration install dash -y
See the Integrations page for more details on ZenML integrations and how to install and use them.

Visualizing Pipelines

After a pipeline run has been started, we can access it using the Repository, as you learned in the last section on Inspecting Finished Pipeline Runs.
We can then visualize a run using the PipelineRunLineageVisualizer class:
from zenml.integrations.dash.visualizers.pipeline_run_lineage_visualizer import (
    PipelineRunLineageVisualizer,
)
from zenml.repository import Repository

# Fetch the most recent run; replace <PIPELINE_NAME> with your pipeline's name.
repo = Repository()
latest_run = repo.get_pipeline(<PIPELINE_NAME>).runs[-1]

# Render the lineage graph of that run in the browser.
PipelineRunLineageVisualizer().visualize(latest_run)
This will open an interactive visualization in your local browser at http://127.0.0.1:8050/, where squares represent your artifacts and circles your pipeline steps.
The nodes are color-coded, so if your pipeline ever fails or runs for too long, you can find the responsible step at a glance: it will be colored red or yellow, respectively.
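Indexing runs[-1] raises an IndexError if the pipeline has not been run yet. A minimal guard could look like the sketch below; the helper name latest_or_none is hypothetical and not part of ZenML, and it only assumes that runs behaves like a sequence ordered from oldest to newest, as the snippet above implies.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def latest_or_none(runs: Sequence[T]) -> Optional[T]:
    """Return the last (most recent) element of `runs`, or None if it is empty.

    Hypothetical helper for illustration; not part of the ZenML API.
    """
    return runs[-1] if runs else None


# Possible usage with the objects from the snippet above:
#   run = latest_or_none(repo.get_pipeline(<PIPELINE_NAME>).runs)
#   if run is not None:
#       PipelineRunLineageVisualizer().visualize(run)
```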

Visualizing Caching

In addition to Completed, Running, and Failed, there is also a separate Cached state. You already learned about caching in a previous section on Caching Pipeline Runs. Using the PipelineRunLineageVisualizer, you can see at a glance which steps were cached (green) and which were rerun (blue). See below for a detailed example.
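For quick reference, the color legend described on this page can be written down as a small lookup table. This is only a summary sketch, not the visualizer's actual implementation: the names StepState, NODE_COLORS, and color_for are hypothetical, and mapping the yellow "runs for too long" case to the Running state and the blue "rerun" case to the Completed state is an assumption based on the text.

```python
from enum import Enum


class StepState(Enum):
    """Step states named on this page (hypothetical enum, not ZenML's own)."""

    COMPLETED = "completed"
    RUNNING = "running"
    FAILED = "failed"
    CACHED = "cached"


# Colors as described in the text above.
NODE_COLORS = {
    StepState.FAILED: "red",      # step failed
    StepState.RUNNING: "yellow",  # assumption: still running / running too long
    StepState.CACHED: "green",    # step output was reused from the cache
    StepState.COMPLETED: "blue",  # assumption: step was (re)run and finished
}


def color_for(state: StepState) -> str:
    """Look up the node color for a given step state."""
    return NODE_COLORS[state]
```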

Code Example

In the following example we use the PipelineRunLineageVisualizer to visualize the three pipeline runs from the Caching Pipeline Runs Example:
Code Example of this Section