5-minute Quick Wins
Below is a menu of 5-minute quick wins you can sprinkle into an existing ZenML project with almost no code changes. Each entry explains why it matters, the micro-setup (under 5 minutes), and any tips or gotchas to anticipate.
Track params, metrics, and properties on every run -- foundation for reproducibility and analytics
Visualize and compare runs with parallel plots -- identify patterns and optimize faster
Automatic metric and artifact tracking -- zero-effort experiment tracking
Instant notifications for pipeline events -- stay informed without checking dashboards
Run pipelines automatically on schedule -- promote notebooks to production workflows
Eliminate cold starts in cloud environments -- reduce iteration time from minutes to seconds
Centralize credentials and tokens -- keep sensitive data out of code
Faster iteration on Docker before cloud -- quick feedback without cloud waiting times
Classify and filter ML assets -- find and relate your ML assets with ease
Track code state with every run -- perfect reproducibility and faster builds
Create rich visualizations effortlessly -- beautiful stakeholder-friendly outputs
Track models and their lifecycle -- central hub for model lineage and governance
Pre-configure your dependencies in a base image -- faster builds and consistent environments
1 Log rich metadata on every run
Why -- instant lineage, reproducibility, and the raw material for all other dashboard analytics. Metadata is the foundation for experiment tracking, model governance, and comparative analysis.
Works at multiple levels:
Within steps: Logs automatically attach to the current step
Pipeline runs: Track environment variables or overall run characteristics
Artifacts: Document data characteristics or processing details
Models: Capture hyperparameters, evaluation metrics, or deployment information
Best practices:
Use consistent keys across runs for better comparison
Group related metadata using nested dictionaries
Use ZenML's special metadata types for standardized representation
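A minimal sketch of logging metadata from inside a step, assuming a recent ZenML release where the unified log_metadata helper is available (older releases expose log_artifact_metadata and log_model_metadata instead); the metric value shown is a placeholder:

```python
from zenml import step, log_metadata

@step
def train_model(gamma: float) -> float:
    # ... train your model here ...
    accuracy = 0.92  # placeholder value for illustration

    # Attaches to the current step; nested dicts group related keys,
    # and reusing the same keys across runs makes comparison easy.
    log_metadata(
        metadata={
            "hyperparameters": {"gamma": gamma},
            "metrics": {"accuracy": accuracy},
        }
    )
    return accuracy
```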
2 Activate the Experiment Comparison view (ZenML Pro)
Why -- side-by-side tables + parallel-coordinate plots of any numerical metadata help you quickly identify patterns, trends, and outliers across multiple runs. This visual analysis speeds up debugging and parameter tuning.
Setup -- once you've logged metadata (see quick win #1), there's nothing else to do; open Dashboard → Compare.
Compare experiments at a glance:
Table View: See all runs side-by-side with automatic change highlighting
Parallel Coordinates Plot: Visualize relationships between hyperparameters and metrics
Filter & Sort: Focus on specific runs or metrics that matter most
CSV Export: Download experiment data for further analysis (Pro tier)
Practical uses:
Compare metrics across model architectures or hyperparameter settings
Identify which parameters have the greatest impact on performance
Track how metrics evolve across iterations of your pipeline
3 Drop-in Experiment Tracker Autologging
Why -- stream metrics, system stats, model files, and artifacts without modifying step code. Different experiment trackers offer varying levels of automatic tracking to simplify your MLOps workflows.
Setup
The experiment tracker's autologging capabilities kick in based on your tracker's features:
MLflow
Comprehensive framework-specific autologging for TensorFlow, PyTorch, scikit-learn, XGBoost, LightGBM, Spark, Statsmodels, Fastai, and more. Automatically tracks parameters, metrics, artifacts, and environment details.
Weights & Biases
Out-of-the-box tracking for ML frameworks, media artifacts, system metrics, and hyperparameters.
Neptune
Requires explicit logging for most frameworks but provides automatic tracking of hardware metrics, environment information, and various model artifacts.
Comet
Automatic tracking of hardware metrics, hyperparameters, model artifacts, and source code. Framework-specific autologging similar to MLflow.
Example: Enable autologging in steps
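A hedged sketch of what this can look like with MLflow, assuming the mlflow and sklearn integrations are installed and an MLflow experiment tracker is registered in your active stack under the name mlflow:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from zenml import step

# Assumes an MLflow experiment tracker registered as "mlflow", e.g.
# `zenml experiment-tracker register mlflow --flavor=mlflow`
@step(experiment_tracker="mlflow")
def train() -> float:
    X, y = load_iris(return_X_y=True)
    model = RandomForestClassifier(n_estimators=50)
    # MLflow autologging picks up params, metrics, and the model artifact for
    # supported frameworks; no explicit logging calls are needed here.
    model.fit(X, y)
    return float(model.score(X, y))
```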
Best Practices
Store API keys in ZenML secrets (see quick win #7) to prevent exposure in Git.
Configure the experiment tracker settings in your steps for more granular control.
For MLflow, use @step(experiment_tracker="mlflow") to enable autologging in specific steps only.
Disable MLflow autologging when needed, e.g. via experiment_tracker.disable_autologging().
4 Instant alerter notifications for successes/failures
Why -- get immediate notifications when pipelines succeed or fail, enabling faster response times and improved collaboration. Alerter notifications ensure your team is always aware of critical model training status, data drift alerts, and deployment changes without constantly checking dashboards.
Using in your pipelines
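One lightweight pattern, sketched under the assumption that an alerter (Slack, Discord, etc.) is registered in your active stack; the exact post() signature can vary slightly between alerter flavors and ZenML versions:

```python
from zenml import step
from zenml.client import Client

@step
def notify_on_completion(accuracy: float) -> None:
    # Post through whatever alerter the active stack provides; skip quietly if none.
    alerter = Client().active_stack.alerter
    if alerter:
        alerter.post(f"Training finished with accuracy {accuracy:.3f}")
```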
Key features
Rich message formatting with custom blocks, embedded metadata and pipeline artifacts
Human-in-the-loop approval using alerter ask steps for critical deployment decisions
Flexible targeting to notify different teams with specific alerts
Custom approval options to configure which responses count as approvals/rejections
5 Schedule the pipeline on a cron
Why -- promote "run-by-hand" notebooks to automated, repeatable jobs. Scheduled pipelines ensure consistency, enable overnight training runs, and help maintain regularly updated models.
Setup - Using Python
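A minimal sketch using ZenML's Schedule object; the pipeline name is illustrative, and the name field on Schedule may not exist in older releases:

```python
from zenml import pipeline
from zenml.config.schedule import Schedule

@pipeline
def feature_engineering_pipeline():
    ...

# Run every day at 02:00; the active orchestrator must support scheduling.
schedule = Schedule(
    name="daily-feature-engineering-prod-v1",
    cron_expression="0 2 * * *",
)
feature_engineering_pipeline = feature_engineering_pipeline.with_options(schedule=schedule)
feature_engineering_pipeline()
```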
Key Features
Cron expressions for flexible scheduling (daily, weekly, monthly)
Start/end time controls to limit when schedules are active
Timezone awareness to ensure runs start at your preferred local time
Orchestrator-native scheduling leveraging your infrastructure's capabilities
Best Practices
Use descriptive schedule names like daily-feature-engineering-prod-v1
For critical pipelines, add alert notifications for failures
Verify schedules were created both in ZenML and the orchestrator
When updating schedules, delete the old one before creating a new one
Common troubleshooting
For cloud orchestrators, verify service account permissions
Remember that deleting a schedule from ZenML doesn't remove it from the orchestrator!
6 Kill cold-starts with SageMaker Warm Pools / Vertex Persistent Resources
Why -- eliminate infrastructure initialization delays and reduce model iteration cycle time. Cold starts can add minutes to your workflow, but with warm pools, containers stay ready and model iterations can start in seconds.
Setup for AWS SageMaker
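A sketch of enabling warm pools via the SageMaker orchestrator settings; the keep_alive_period_in_seconds field maps to SageMaker's warm-pool keep-alive and should be verified against your zenml/aws integration version:

```python
from zenml import pipeline
from zenml.integrations.aws.flavors import SagemakerOrchestratorSettings

sagemaker_settings = SagemakerOrchestratorSettings(
    keep_alive_period_in_seconds=300,  # keep provisioned instances warm for 5 minutes
)

@pipeline(settings={"orchestrator.sagemaker": sagemaker_settings})
def training_pipeline():
    ...
```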
Setup for Google Cloud Vertex AI
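And a comparable sketch for Vertex AI, assuming you have already created a persistent resource; the persistent_resource_id field name is an assumption to check against your zenml/gcp integration version:

```python
from zenml import pipeline
from zenml.integrations.gcp.flavors import VertexStepOperatorSettings

vertex_settings = VertexStepOperatorSettings(
    persistent_resource_id="zenml-persistent-resource",  # pre-created Vertex AI persistent resource
)

@pipeline(settings={"step_operator.vertex": vertex_settings})
def training_pipeline():
    ...
```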
Key benefits
Faster iteration cycles - no waiting for VM provisioning and container startup
Cost-effective - share resources across pipeline runs
No code changes - zero modifications to your pipeline code
Significant speedup - reduce startup times from minutes to seconds
Important considerations
SageMaker warm pools incur charges when resources are idle
For Vertex AI, set an appropriate persistent resource name for tracking
Resources need occasional recycling for updates or maintenance
7 Centralize secrets (tokens, DB creds, S3 keys)
Why -- eliminate hardcoded credentials from your code and gain centralized control over sensitive information. Secrets management prevents exposing sensitive information in version control, enables secure credential rotation, and simplifies access management across environments.
Setup - Basic usage
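A minimal sketch using the Python client (the same can be done with zenml secret create on the CLI); the secret name and key are illustrative:

```python
from zenml.client import Client

client = Client()

# Create a secret once...
client.create_secret(name="openai_api", values={"api_key": "<YOUR_KEY>"})

# ...then read it wherever you need it, e.g. inside a step
api_key = client.get_secret("openai_api").secret_values["api_key"]
```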
Setup - Multi-value secrets
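Secrets can also bundle several related values, for example a full set of database credentials; again a sketch with illustrative names:

```python
from zenml.client import Client

Client().create_secret(
    name="postgres_prod",
    values={
        "host": "db.internal.example.com",
        "username": "ml_service",
        "password": "<YOUR_PASSWORD>",
    },
)
```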
Key features
Secure storage - credentials kept in secure backend storage, not in your code
Scoped access - restrict secret visibility based on user permissions
Easy rotation - update credentials in one place when they change
Multiple backends - support for Vault, AWS Secrets Manager, GCP Secret Manager, and more
Templated references - use {{secret_name.key}} syntax in any stack configuration
Best practices
Use a dedicated secret store in production instead of the default file-based store
Set up CI/CD to use service accounts with limited permissions
Regularly rotate sensitive credentials like API keys and access tokens
8 Run smoke tests locally before going to the cloud
Why -- significantly reduce iteration and debugging time by testing your pipelines with a local Docker orchestrator before deploying to remote cloud infrastructure. This approach gives you fast feedback cycles for containerized execution without waiting for cloud provisioning, job scheduling, and data transfer—ideal for development, troubleshooting, and quick feature validation.
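A sketch of the typical setup, assuming a local Docker daemon is running; the stack and component names are just examples:

```bash
# Register a local Docker orchestrator and a stack that uses it
zenml orchestrator register local_docker --flavor=local_docker
zenml stack register local_docker_stack -o local_docker -a default

# Run your smoke test against the local stack, then switch back to the cloud stack
zenml stack set local_docker_stack
python run_pipeline.py          # your own entrypoint script
zenml stack set <your-cloud-stack>
```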
When to switch back to cloud
Key benefits
Fast feedback cycles - Get results in minutes instead of hours
Cost savings - Test on your local machine instead of paying for cloud resources
Simplified debugging - Easier access to logs and containers
Consistent environments - Same Docker containerization as production
Reduced friction - No cloud provisioning delays or permission issues during development
Best practices
Create a small representative dataset for smoke testing
Use configuration parameters to enable smoke-test mode
Keep dependencies identical between smoke tests and production
Run the exact same pipeline code locally and in the cloud
Store sample data in version control for reliable testing
Use print statements or logging to clearly indicate when running in smoke-test mode
This approach works best when you design your pipelines to be configurable from the start, allowing them to run with reduced data size, shorter training cycles, or simplified processing steps during development.
9 Organize with tags
Why -- add flexible, searchable labels to your ML assets that bring order to chaos as your project grows. Tags provide a lightweight organizational system that helps you filter pipelines, artifacts, and models by domain, status, version, or any custom category—making it easy to find what you're looking for in seconds.
Key features
Filter and search - Quickly find all assets related to a specific domain or project
Exclusive tags - Create tags where only one entity can have the tag at a time (perfect for "production" status)
Cascade tags - Apply pipeline tags automatically to all artifacts created during execution
Flexible organization - Create any tagging system that makes sense for your projects
Multiple entity types - Tag pipelines, runs, artifacts, models, and run templates
Common tag operations
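A sketch of the most common operations, assuming a recent ZenML release where @pipeline accepts a tags argument; names are illustrative:

```python
from typing import Annotated

import pandas as pd
from zenml import ArtifactConfig, pipeline, step

@step
def ingest() -> Annotated[pd.DataFrame, ArtifactConfig(name="raw_data", tags=["domain-financial"])]:
    # Tags on ArtifactConfig label the produced artifact version
    return pd.DataFrame({"amount": [1.0, 2.0]})

# Tags on the pipeline label the pipeline and its runs
@pipeline(tags=["status-experimental", "domain-financial"])
def etl_pipeline():
    ingest()
```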
Best practices
Create consistent tag categories (environment, domain, status, version, etc.)
Use a tag registry to standardize tag names across your team
Use exclusive tags for state management (only one "production" model)
Combine prefix patterns for better organization (e.g., "domain-financial", "status-approved")
Update tags as assets progress through your workflow
Document your tagging strategy for team alignment
10 Hook your Git repo to every run
Why -- capture exact code state for reproducibility, automatic model versioning, and faster Docker builds. Connecting your Git repo transforms data science from local experiments to production-ready workflows with minimal effort:
Code reproducibility: All pipelines track their exact commit hash and detect dirty repositories
Docker build acceleration: ZenML avoids rebuilding images when your code hasn't changed
Model provenance: Trace any model back to the exact code that created it
Team collaboration: Share builds across the team for faster iteration
Setup
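A sketch of registering a GitHub repository; the flag names follow the GitHub code repository configuration and are worth double-checking against your ZenML version, and the owner/repository values are placeholders:

```bash
# Install the GitHub integration, then register the repository
zenml integration install github

zenml code-repository register my_repo \
    --type=github \
    --owner=my-org \
    --repository=my-ml-project \
    --token=<GITHUB_PAT>   # better: store this token in a ZenML secret
```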
How it works
When you run a pipeline, ZenML checks if your code is tracked in a registered repository
Your current commit and any uncommitted changes are detected and stored
ZenML can download files from the repository inside containers instead of copying them
Docker builds become highly optimized and are automatically shared across the team
Best practices
Keep a clean repository state when running important pipelines
Store your GitHub/GitLab tokens in ZenML secrets
For CI/CD workflows, this pattern enables automatic versioning with Git SHAs
Consider using zenml pipeline build to pre-build images once, then run multiple times
This simple setup can save hours of engineering time compared to manually tracking code versions and managing Docker builds yourself.
11 Simple HTML reports
Why -- create beautiful, interactive visualizations and reports with minimal effort using ZenML's HTMLString type and LLM assistance. HTML reports are perfect for sharing insights, summarizing pipeline results, and making your ML projects more accessible to stakeholders.
Setup
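A minimal sketch of a step that returns an HTML report; the metric and styling are illustrative:

```python
from zenml import step
from zenml.types import HTMLString

@step
def evaluation_report(accuracy: float) -> HTMLString:
    # Anything returned as HTMLString is rendered in the dashboard
    return HTMLString(f"""
    <div style="font-family: sans-serif; padding: 1em;">
      <h2>Model Evaluation</h2>
      <p>Accuracy: <strong>{accuracy:.2%}</strong></p>
    </div>
    """)
```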
Sample LLM prompt for building reports
Key features
Rich formatting - Full HTML/CSS support for beautiful reports
Interactive elements - Add charts, tables, and responsive design
Easy sharing - Reports appear directly in ZenML dashboard
LLM assistance - Generate complex visualizations with simple prompts
No dependencies - Works out of the box without extra libraries
Advanced use cases
Create comparative reports showing before/after metrics
Build error analysis dashboards with filtering capabilities
Generate PDF-ready reports for stakeholder presentations
Simply return an HTMLString from any step, and your visualization will automatically appear in the ZenML dashboard for that step's artifacts.
12 Register models in the Model Control Plane
Why -- create a central hub for organizing all resources related to a particular ML feature or capability. The Model Control Plane (MCP) treats a "model" as more than just code—it's a namespace that connects pipelines, artifacts, metadata, and workflows for a specific ML solution, providing seamless lineage tracking and governance that's essential for reproducibility, auditability, and collaboration.
Key features
Namespace organization - group related pipelines, artifacts, and resources under a single entity
Version tracking - automatically version your ML solutions with each pipeline run
Lineage management - trace all components back to training pipelines, datasets, and code
Stage promotion - promote solutions through lifecycle stages (dev → staging → production)
Metadata association - attach any metrics or parameters to track performance over time
Workflow integration - connect training, evaluation, and deployment pipelines in a unified view
Common model operations
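A sketch of the two most common operations, assuming a recent ZenML release where the Model class is available (older releases used ModelVersion); the model name, version, and tags are illustrative:

```python
from zenml import Model, pipeline

# Attach every run of this pipeline to a new version of the "fraud_detection" model
@pipeline(model=Model(name="fraud_detection", tags=["domain-financial"]))
def training_pipeline():
    ...

# Later: promote a specific version through the lifecycle stages
model = Model(name="fraud_detection", version="3")
model.set_stage(stage="production", force=True)
```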
Best practices
Create models with meaningful names that reflect the ML capability or business feature they represent
Use consistent metadata keys across versions for better comparison and tracking
Tag models with relevant attributes for easier filtering and organization
Set up model stages to track which ML solutions are in which environments
Use a single model entity to group all iterations of a particular ML capability, even when the underlying technical implementation changes
13 Create a parent Docker image for faster builds
Why -- reduce Docker build times from minutes to seconds and avoid dependency headaches by pre-installing common libraries in a custom parent image. This approach gives you faster iteration cycles, consistent environments across your team, and simplified dependency management—especially valuable for large projects with complex requirements.
Using your parent image in pipelines
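A sketch using DockerSettings; the registry, image tag, and requirements are placeholders:

```python
from zenml import pipeline
from zenml.config import DockerSettings

docker_settings = DockerSettings(
    parent_image="my-registry.io/zenml-parent:0.54.0",  # your pre-built base image
    requirements=["scikit-learn==1.5.0"],               # only project-specific extras
)

@pipeline(settings={"docker": docker_settings})
def training_pipeline():
    ...
```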
Boost team productivity with a shared image
Key benefits
Dramatically faster builds - Only project-specific packages need installation
Consistent environments - Everyone uses the same base libraries
Simplified dependency management - Core dependencies defined once
Reduced cloud costs - Spend less on compute for image building
Lower network usage - Download common large packages just once
Best practices
Include all heavy dependencies and stack component requirements in your parent image
Version your parent image (e.g., zenml-parent:0.54.0) to track changes
Document included packages with a version listing in a requirements.txt
Use multi-stage builds if your parent image needs compiled dependencies
Periodically update the parent image to incorporate security patches
Consider multiple specialized parent images for different types of workloads
For projects with heavy dependencies like deep learning frameworks, this approach can cut build times by 80-90%, turning a 5-minute build into a 30-second one. This is especially valuable in cloud environments where you pay for build time.