LogoLogo
ProductResourcesGitHubStart free
  • Documentation
  • Learn
  • ZenML Pro
  • Stacks
  • API Reference
  • SDK Reference
  • Overview
  • Integrations
  • Stack Components
    • Orchestrators
      • Local Orchestrator
      • Local Docker Orchestrator
      • Kubeflow Orchestrator
      • Kubernetes Orchestrator
      • Google Cloud VertexAI Orchestrator
      • AWS Sagemaker Orchestrator
      • AzureML Orchestrator
      • Databricks Orchestrator
      • Tekton Orchestrator
      • Airflow Orchestrator
      • Skypilot VM Orchestrator
      • HyperAI Orchestrator
      • Lightning AI Orchestrator
      • Develop a custom orchestrator
    • Artifact Stores
      • Local Artifact Store
      • Amazon Simple Cloud Storage (S3)
      • Google Cloud Storage (GCS)
      • Azure Blob Storage
      • Develop a custom artifact store
    • Container Registries
      • Default Container Registry
      • DockerHub
      • Amazon Elastic Container Registry (ECR)
      • Google Cloud Container Registry
      • Azure Container Registry
      • GitHub Container Registry
      • Develop a custom container registry
    • Step Operators
      • Amazon SageMaker
      • AzureML
      • Google Cloud VertexAI
      • Kubernetes
      • Modal
      • Spark
      • Develop a Custom Step Operator
    • Experiment Trackers
      • Comet
      • MLflow
      • Neptune
      • Weights & Biases
      • Google Cloud VertexAI Experiment Tracker
      • Develop a custom experiment tracker
    • Image Builders
      • Local Image Builder
      • Kaniko Image Builder
      • AWS Image Builder
      • Google Cloud Image Builder
      • Develop a Custom Image Builder
    • Alerters
      • Discord Alerter
      • Slack Alerter
      • Develop a Custom Alerter
    • Annotators
      • Argilla
      • Label Studio
      • Pigeon
      • Prodigy
      • Develop a Custom Annotator
    • Data Validators
      • Great Expectations
      • Deepchecks
      • Evidently
      • Whylogs
      • Develop a custom data validator
    • Feature Stores
      • Feast
      • Develop a Custom Feature Store
    • Model Deployers
      • MLflow
      • Seldon
      • BentoML
      • Hugging Face
      • Databricks
      • vLLM
      • Develop a Custom Model Deployer
    • Model Registries
      • MLflow Model Registry
      • Develop a Custom Model Registry
  • Service Connectors
    • Introduction
    • Complete guide
    • Best practices
    • Connector Types
      • Docker Service Connector
      • Kubernetes Service Connector
      • AWS Service Connector
      • GCP Service Connector
      • Azure Service Connector
      • HyperAI Service Connector
  • Popular Stacks
    • AWS
    • Azure
    • GCP
    • Kubernetes
  • Deployment
    • 1-click Deployment
    • Terraform Modules
    • Register a cloud stack
    • Infrastructure as code
  • Contribute
    • Custom Stack Component
    • Custom Integration
Powered by GitBook
On this page

Was this helpful?

Edit on GitHub
  1. Stack Components
  2. Data Validators

Develop a custom data validator

How to develop a custom data validator

PreviousWhylogsNextFeature Stores

Last updated 8 days ago

Was this helpful?

Before diving into the specifics of this component type, it is beneficial to familiarize yourself with our . This guide provides an essential understanding of ZenML's component flavor concepts.

Base abstraction in progress!

We are actively working on the base abstraction for the Data Validators, which will be available soon. As a result, their extension is not recommended at the moment. When you are selecting a data validator for your stack, you can use one of .

If you need to implement your own Data Validator flavor, you can still do so, but keep in mind that you may have to refactor it when the base abstraction is updated.

ZenML comes equipped with that integrate a variety of data logging and validation libraries, frameworks and platforms. However, if you need to use a different library or service as a backend for your ZenML Data Validator, you can extend ZenML to provide your own custom Data Validator implementation.

Build your own custom data validator

If you want to implement your own custom Data Validator, you can follow the following steps:

  1. Create a class which inherits from and override one or more of the abstract methods, depending on the capabilities of the underlying library/service that you want to integrate.

  2. If you need any configuration, you can create a class which inherits from the BaseDataValidatorConfig class.

  3. Bring both of these classes together by inheriting from the BaseDataValidatorFlavor.

  4. (Optional) You should also provide some standard steps that others can easily insert into their pipelines for instant access to data validation features.

Once you are done with the implementation, you can register it through the CLI. Please ensure you point to the flavor class via dot notation:

zenml data-validator flavor register <path.to.MyDataValidatorFlavor>

For example, if your flavor class MyDataValidatorFlavor is defined in flavors/my_flavor.py, you'd register it by doing:

zenml data-validator flavor register flavors.my_flavor.MyDataValidatorFlavor

ZenML resolves the flavor class by taking the path where you initialized zenml (via zenml init) as the starting point of resolution. Therefore, please ensure you follow of initializing zenml at the root of your repository.

If ZenML does not find an initialized ZenML repository in any parent directory, it will default to the current working directory, but usually it's better to not have to rely on this mechanism, and initialize zenml at the root.

Afterwards, you should see the new flavor in the list of available flavors:

zenml data-validator flavor list

It is important to draw attention to when and how these base abstractions are coming into play in a ZenML workflow.

  • The CustomDataValidatorFlavor class is imported and utilized upon the creation of the custom flavor through the CLI.

  • The CustomDataValidatorConfig class is imported when someone tries to register/update a stack component with this custom flavor. Especially, during the registration process of the stack component, the config will be used to validate the values given by the user. As Config object are inherently pydantic objects, you can also add your own custom validators here.

  • The CustomDataValidator only comes into play when the component is ultimately in use.

The design behind this interaction lets us separate the configuration of the flavor from its implementation. This way we can register flavors and components even when the major dependencies behind their implementation are not installed in our local setting (assuming the CustomDataValidatorFlavor and the CustomDataValidatorConfig are implemented in a different module/path than the actual CustomDataValidator).

general guide to writing custom component flavors in ZenML
the BaseDataValidator class
the best practice
ZenML Scarf
the existing flavors
Data Validator implementations