Creating custom visualizations

Creating your own visualizations.

You can associate a custom visualization with an artifact in ZenML as long as the visualization is one of the supported visualization types. Currently, the following visualization types are supported:

  • HTML: Embedded HTML visualizations such as data validation reports,

  • Image: Visualizations of image data such as Pillow images (e.g. PIL.Image) or certain numeric numpy arrays,

  • CSV: Tables, such as the pandas DataFrame .describe() output,

  • Markdown: Markdown strings or pages,

  • JSON: JSON strings or objects.

There are three ways to add custom visualizations to the dashboard:

  • If you are already handling HTML, Markdown, CSV or JSON data in one of your steps, you can have them visualized in just a few lines of code by casting them to a special class inside your step.

  • If you want to automatically extract visualizations for all artifacts of a certain data type, you can define type-specific visualization logic by building a custom materializer.

  • If you want to create any other custom visualizations, you can create a custom return type class with a corresponding materializer, then build and return this custom return type from one of your steps.

Visualization via Special Return Types

If you already have HTML, Markdown, CSV or JSON data available as a string inside your step, you can cast it to one of the following types and return it from your step:

  • zenml.types.HTMLString for strings in HTML format, e.g., "<h1>Header</h1>Some text",

  • zenml.types.MarkdownString for strings in Markdown format, e.g., "# Header\nSome text",

  • zenml.types.CSVString for strings in CSV format, e.g., "a,b,c\n1,2,3",

  • zenml.types.JSONString for strings in JSON format, e.g., '{"key": "value"}'.

Example:

from zenml import step
from zenml.types import CSVString

@step
def my_step() -> CSVString:
    some_csv = "a,b,c\n1,2,3"
    return CSVString(some_csv)

In the dashboard, the returned CSV string is rendered as a table.
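The CSV string does not have to be written by hand. As a minimal sketch, assuming the data is available as a header and a list of rows, the standard-library csv module can produce it. The helper name rows_to_csv is hypothetical and not part of ZenML:

```python
import csv
import io
from typing import Iterable, Sequence


def rows_to_csv(header: Sequence[object], rows: Iterable[Sequence[object]]) -> str:
    """Serialize a header and data rows into one CSV-formatted string."""
    buffer = io.StringIO()
    # Use "\n" as line terminator so the result matches the "a,b,c\n1,2,3" style.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(["a", "b", "c"], [[1, 2, 3], [4, 5, 6]])
print(csv_text)
```

Inside a step, the resulting string could then be wrapped as CSVString(csv_text) before it is returned. The csv module also quotes values that contain commas or quotes, which manual string concatenation would not.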

Visualization via Materializers

If you want to automatically extract visualizations for all artifacts of a certain data type, you can do so by overriding the save_visualizations() method of the corresponding materializer. See the materializer docs page for more information on how to create custom materializers that do this.

Or, see a code example on GitHub where we visualize Hugging Face datasets by embedding their preview viewer.
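As a rough illustration of the kind of logic a save_visualizations() override contains, the sketch below uses only the Python standard library: it renders some rows as an HTML table, writes the file into a directory, and returns a mapping from file path to visualization type. The function name write_html_table, the target_dir argument, and the plain "html" string are assumptions made for this example. In an actual materializer the directory would be the materializer's artifact URI, the file would be written through ZenML's file I/O, and the mapping value would be a VisualizationType member.

```python
import html
import os
import tempfile
from typing import Dict, List


def write_html_table(target_dir: str, rows: List[Dict[str, object]]) -> Dict[str, str]:
    """Render rows as an HTML table, save it, and return {file path: type}."""
    # Use the keys of the first row as column names (empty table if no rows).
    columns = list(rows[0].keys()) if rows else []
    head = "".join(f"<th>{html.escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>"
        + "".join(f"<td>{html.escape(str(row.get(c, '')))}</td>" for c in columns)
        + "</tr>"
        for row in rows
    )
    document = f"<table><tr>{head}</tr>{body}</table>"

    # Stand-in for writing into the artifact store.
    path = os.path.join(target_dir, "visualization.html")
    with open(path, "w", encoding="utf-8") as f:
        f.write(document)

    # Stand-in for the visualization type that tells the dashboard how to render the file.
    return {path: "html"}


with tempfile.TemporaryDirectory() as tmp:
    saved = write_html_table(tmp, [{"name": "train", "rows": 100}])
    print(list(saved.values()))
```

Escaping cell values with html.escape keeps arbitrary data from being interpreted as markup when the file is displayed.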

How to think about creating a custom visualization

By combining the ideas behind the above two visualization approaches, you can visualize virtually anything you want inside your ZenML dashboard in three simple steps:

  1. Create a custom class that will hold the data that you want to visualize.

  2. Build a custom materializer for this custom class with the visualization logic implemented in the save_visualizations() method.

  3. Return your custom class from any of your ZenML steps.

Example: Facets Data Skew Visualization

As an example, have a look at the models, materializers, and steps of the Facets Integration, which can be used to visualize the data skew between multiple Pandas DataFrames:

1. Custom Class: The FacetsComparison is the custom class that holds the data required for the visualization.

class FacetsComparison(BaseModel):
    datasets: List[Dict[str, Union[str, pd.DataFrame]]]

2. Materializer: The FacetsMaterializer is a custom materializer that only handles this custom class and contains the corresponding visualization logic.

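import os
from typing import Dict

from zenml.enums import ArtifactType, VisualizationType
from zenml.io import fileio
from zenml.materializers.base_materializer import BaseMaterializer

# Name of the HTML file that is written to the artifact store.
VISUALIZATION_FILENAME = "visualization.html"

class FacetsMaterializer(BaseMaterializer):

    ASSOCIATED_TYPES = (FacetsComparison,)
    ASSOCIATED_ARTIFACT_TYPE = ArtifactType.DATA_ANALYSIS

    def save_visualizations(
        self, data: FacetsComparison
    ) -> Dict[str, VisualizationType]:
        html = ...  # Create a visualization for the custom type
        visualization_path = os.path.join(self.uri, VISUALIZATION_FILENAME)
        with fileio.open(visualization_path, "w") as f:
            f.write(html)
        # Map each saved file to the type the dashboard should render it as.
        return {visualization_path: VisualizationType.HTML}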

3. Step: There are three different steps in the facets integration that can be used to create FacetsComparisons for different sets of inputs. E.g., the facets_visualization_step below takes two DataFrames as inputs and builds a FacetsComparison object out of them:

@step
def facets_visualization_step(
    reference: pd.DataFrame, comparison: pd.DataFrame
) -> FacetsComparison:  # Return the custom type from your step
    return FacetsComparison(
        datasets=[
            {"name": "reference", "table": reference},
            {"name": "comparison", "table": comparison},
        ]
    )

This is what happens under the hood when you add the facets_visualization_step to your pipeline:

  1. The step creates and returns a FacetsComparison.

  2. When the step finishes, ZenML searches for a materializer class that can handle this type, finds the FacetsMaterializer, and calls its save_visualizations() method, which creates the visualization and saves it into your artifact store as an HTML file.

  3. When you open your dashboard and click on the artifact inside the run DAG, the visualization HTML file is loaded from the artifact store and displayed.
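The lookup in the second point can be pictured with a toy model. The sketch below is not ZenML's implementation: the VISUALIZERS registry, the register decorator, the Comparison class, and on_step_finished are all hypothetical names that only mimic the idea of finding the handler for an output's type and asking it for visualizations.

```python
from typing import Any, Callable, Dict, List, Type

# Hypothetical registry: data type -> function returning {file name: visualization type}.
VISUALIZERS: Dict[Type[Any], Callable[[Any], Dict[str, str]]] = {}


def register(data_type: Type[Any]):
    """Associate a visualization function with a data type."""
    def decorator(func: Callable[[Any], Dict[str, str]]):
        VISUALIZERS[data_type] = func
        return func
    return decorator


class Comparison:
    """Toy stand-in for a custom return type such as FacetsComparison."""
    def __init__(self, names: List[str]) -> None:
        self.names = names


@register(Comparison)
def visualize_comparison(data: Comparison) -> Dict[str, str]:
    # A real handler would write a file; here we only report what it would save.
    return {"visualization.html": "html"}


def on_step_finished(output: Any) -> Dict[str, str]:
    """Find the handler registered for the output's type and call it."""
    # Walk the class hierarchy so subclasses fall back to a parent's handler.
    for cls in type(output).__mro__:
        if cls in VISUALIZERS:
            return VISUALIZERS[cls](output)
    return {}  # No handler registered: nothing to visualize.


print(on_step_finished(Comparison(["reference", "comparison"])))
print(on_step_finished(42))
```

Keying the lookup on the output's type is what lets a step stay free of visualization code: the step only returns the custom class, and the logic registered for that class does the rest.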
