Usage Analytics

What are the usage statistics ZenML collects

To help us better understand how the community uses ZenML, the pip package reports anonymized usage statistics. You can opt out at any time with the CLI command:

zenml config analytics opt-out

Currently, opting in or out of analytics is a global setting that applies to all ZenML repositories on your system.

Why ZenML collects analytics

ZenML is built together with the community at large and is created and maintained by ZenML GmbH, a startup based in Munich, Germany. We're a team of techies who love MLOps and want to build tools that fellow developers will love to use in their daily work. This is us, if you want to put faces to the names!

To improve ZenML, we need an overview of how it is used 'in the wild', and analytics give us that. This not only helps us find bugs but also helps us prioritize features and commands that might be useful in future releases. Without this information, all we would have are pip download statistics and direct conversations with users, which, while valuable, are not enough to seriously improve the tool as a whole.

How ZenML collects these statistics

ZenML uses Segment as the data aggregation library for all our analytics. The code is fully visible in zenml_analytics.py. The main function is the track function, which triggers a Segment Analytics Track event. The event is sent on a background thread, separate from the main thread.

None of the data sent can identify you individually; it only allows us to understand how ZenML is being used as a whole.
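To make the background-thread behaviour described above concrete, here is a minimal sketch of how a non-blocking track call could work. It is not ZenML's actual implementation: the BackgroundTracker class, its flush method, the opt_in flag, and the send callable (a stand-in for the real Segment client) are all hypothetical names chosen for illustration.

```python
import queue
import threading
from typing import Any, Callable, Dict, Optional

# Hypothetical sender type: a real client would forward the event to Segment.
Sender = Callable[[str, Dict[str, Any]], None]


class BackgroundTracker:
    """Queues events and delivers them on a daemon worker thread."""

    def __init__(self, send: Sender, opt_in: bool = True) -> None:
        self._send = send
        self._opt_in = opt_in
        self._events: queue.Queue = queue.Queue()
        # A daemon thread never keeps the interpreter alive on exit.
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def track(self, event: str, metadata: Optional[Dict[str, Any]] = None) -> bool:
        """Enqueue an event without blocking; return False when opted out."""
        if not self._opt_in:
            return False
        self._events.put((event, dict(metadata or {})))
        return True

    def flush(self) -> None:
        """Block until every queued event has been handed to the sender."""
        self._events.join()

    def _drain(self) -> None:
        while True:
            event, metadata = self._events.get()
            try:
                self._send(event, metadata)
            except Exception:
                # An analytics failure should never break the caller's work.
                pass
            finally:
                self._events.task_done()
```

The caller only pays for a queue insertion, so a slow or failing network request cannot stall a pipeline run. Calling flush() is only needed when you want to be sure that pending events were delivered, for example in a test.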

What does ZenML collect?

ZenML triggers an asynchronous Segment Track event for each of the following actions. The same list can be found in the zenml_analytics.py file in the GitHub repository.

    # Pipelines
    RUN_PIPELINE = "Pipeline run"
    GET_PIPELINES = "Pipelines fetched"
    GET_PIPELINE = "Pipeline fetched"

    # Repo
    INITIALIZE_REPO = "ZenML initialized"

    # Profile
    INITIALIZED_PROFILE = "Profile initialized"

    # Components
    REGISTERED_STACK_COMPONENT = "Stack component registered"
    UPDATED_STACK_COMPONENT = "Stack component updated"

    # Stack
    REGISTERED_STACK = "Stack registered"
    SET_STACK = "Stack set"
    UPDATED_STACK = "Stack updated"

    # Analytics opt in and out
    OPT_IN_ANALYTICS = "Analytics opt-in"
    OPT_OUT_ANALYTICS = "Analytics opt-out"

    # Examples
    RUN_EXAMPLE = "Example run"
    PULL_EXAMPLE = "Example pull"

    # Integrations
    INSTALL_INTEGRATION = "Integration installed"

In addition, each Segment Track event collects the following metadata:

  • A unique UUID that is anonymous.

  • The ZenML version.

  • Operating system information, e.g. Ubuntu Linux 16.04
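
The metadata listed above can be gathered with the Python standard library alone. The following is a sketch under stated assumptions, not ZenML's actual code: the build_metadata function and the dictionary keys (user_id, version, os, os_release) are hypothetical, and the version string is passed in by the caller rather than read from the installed package.

```python
import platform
import uuid
from typing import Any, Dict


def build_metadata(user_id: uuid.UUID, zenml_version: str) -> Dict[str, Any]:
    """Assemble the anonymous metadata attached to each tracked event."""
    return {
        # A random UUID carries no personal information.
        "user_id": str(user_id),
        "version": zenml_version,
        # For example "Linux", "Darwin", or "Windows".
        "os": platform.system(),
        "os_release": platform.release(),
    }


# Generated once (hypothetically at first use) and then reused, so events
# from the same installation can be grouped without knowing who the user is.
anonymous_id = uuid.uuid4()
metadata = build_metadata(anonymous_id, zenml_version="0.0.0")
```

A version 4 UUID is generated from random bits, so it does not encode a hostname, username, or hardware address.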
