BaseMaterializer is the base class: all other materializers inherit from it, and it defines the interface of all materializers.

Each materializer has an artifact object. The most important property of an artifact object is the uri. The uri is created by ZenML at pipeline run time and points to a directory in a file system (the artifact store).

The handle_input
and handle_return functions are important. handle_input is responsible for reading the artifact from the artifact store. handle_return is responsible for writing the artifact to the artifact store.

Each materializer also defines ASSOCIATED_TYPES
and ASSOCIATED_ARTIFACT_TYPES.

ASSOCIATED_TYPES is the data type that is being stored. ZenML uses this information to call the right materializer at the right time. For example, if a ZenML step returns a pd.DataFrame, ZenML will try to find a materializer that has pd.DataFrame (or its subclasses) in its ASSOCIATED_TYPES.

ASSOCIATED_ARTIFACT_TYPES
simply defines which types of artifacts are being stored. This can be DataArtifact, StatisticsArtifact, DriftArtifact, etc. It is simply a tag used to query certain artifact types in the post-execution workflow.

Consider a custom class MyObject
that flows between two steps in a pipeline.

By default, ZenML does not know how to persist MyObj between steps (how could it? We just created this!). Therefore, we have to create our own materializer. To do this, simply extend the BaseMaterializer by subclassing it.

Use the fileio
module to ensure your materialization logic works across artifact stores (both local and remote ones, such as S3 buckets).

To use the materializer, pass a dictionary of the form {OUTPUT_NAME: MATERIALIZER_CLASS} to the with_return_materializers function.

Even if with_return_materializers is only called on step1, all downstream steps will use the same materializer by default.

A non-materialized artifact is a BaseArtifact
(or any of its subclasses) and has a property uri that points to the unique path in the artifact store where the artifact is stored. To use a non-materialized artifact, simply specify it as the type in the step.

The artifact types are available under zenml.artifacts.*
and include ModelArtifact, DataArtifact, etc. Materializers link Pythonic types to these artifact types implicitly. For example, keras.Model and torch.nn.Module are Pythonic types that are both linked to ModelArtifact implicitly via their materializers. When using artifacts directly, you must know which type they are by looking at the previous step's materializer: if the previous step produces a ModelArtifact, then you should specify ModelArtifact in a non-materialized step.

s1
and s2 produce identical artifacts; however, s3 consumes materialized artifacts while s4 consumes non-materialized artifacts. s4 can now use the dict_.uri and list_.uri paths directly rather than their materialized counterparts.
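
The ideas above can be sketched without ZenML installed. The block below is a minimal illustration, not ZenML's actual implementation: Artifact, BaseMaterializer, find_materializer and run_demo are hypothetical stand-ins written for this example, and only the names handle_input, handle_return, ASSOCIATED_TYPES and uri are taken from the text. It uses Python's built-in open for a local directory; a real materializer would go through the fileio module so that remote artifact stores also work.

```python
import json
import os
import tempfile
from typing import Any, List, Tuple, Type


class Artifact:
    """Hypothetical stand-in for an artifact: it only carries a uri."""

    def __init__(self, uri: str) -> None:
        self.uri = uri  # directory inside the (here: local) artifact store


class BaseMaterializer:
    """Hypothetical stand-in for the interface described in the text."""

    ASSOCIATED_TYPES: Tuple[Type[Any], ...] = ()

    def __init__(self, artifact: Artifact) -> None:
        self.artifact = artifact

    def handle_input(self, data_type: Type[Any]) -> Any:
        raise NotImplementedError

    def handle_return(self, data: Any) -> None:
        raise NotImplementedError


class MyObj:
    def __init__(self, name: str) -> None:
        self.name = name


class MyMaterializer(BaseMaterializer):
    ASSOCIATED_TYPES = (MyObj,)

    def _path(self) -> str:
        return os.path.join(self.artifact.uri, "data.json")

    def handle_input(self, data_type: Type[Any]) -> Any:
        # Read the object back from the artifact store.
        with open(self._path()) as f:
            return data_type(json.load(f)["name"])

    def handle_return(self, data: MyObj) -> None:
        # Write the object to the artifact store.
        os.makedirs(self.artifact.uri, exist_ok=True)
        with open(self._path(), "w") as f:
            json.dump({"name": data.name}, f)


def find_materializer(
    data_type: Type[Any], candidates: List[Type[BaseMaterializer]]
) -> Type[BaseMaterializer]:
    """Return the first candidate whose ASSOCIATED_TYPES covers data_type."""
    for cls in candidates:
        if any(issubclass(data_type, t) for t in cls.ASSOCIATED_TYPES):
            return cls
    raise LookupError(f"no materializer for {data_type.__name__}")


def run_demo() -> Tuple[str, str]:
    with tempfile.TemporaryDirectory() as store:
        artifact = Artifact(uri=os.path.join(store, "step1", "output"))
        materializer_cls = find_materializer(MyObj, [MyMaterializer])
        # "step1" returns a MyObj, which gets written to the store.
        materializer_cls(artifact).handle_return(MyObj("my_object"))
        # A materialized consumer receives the rebuilt Python object.
        loaded = materializer_cls(artifact).handle_input(MyObj)
        # A non-materialized consumer only looks at artifact.uri.
        raw_files = sorted(os.listdir(artifact.uri))
        return loaded.name, raw_files[0]


if __name__ == "__main__":
    print(run_demo())
```

The last two steps of run_demo mirror the difference between s3 and s4: one consumer gets the object rebuilt by the materializer, the other works with the files under uri directly.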