Load your Data from Feature Stores
ZenML integrates with Feast so you can access your batch and online data via a feature store.
This is an older version of the ZenML documentation. To check the latest version please visit https://docs.zenml.io
Feature Store
Feature stores allow data teams to serve data via an offline store and an online low-latency store, with data kept in sync between the two. They also offer a centralized registry where features (and feature schemas) are stored for use within a team or wider organization.
As a data scientist working on training your model, your requirements for how you access your batch / 'offline' data will almost certainly be different from how you access that data as part of a real-time or online inference setting. Feast addresses train-serve skew, the problem that arises when those two sources of data diverge from each other.
Feature stores are a relatively recent addition to commonly-used machine learning stacks. Feast is a leading open-source feature store, first developed by Gojek in collaboration with Google.
🗺 Feature Stores & ZenML
There are two core functions that feature stores enable: access to data from an offline / batch store for training and access to online data at inference time. The ZenML Feast integration enables both of these behaviors.
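As a rough illustration of these two access modes, the sketch below calls the Feast Python SDK directly rather than going through the ZenML stack component. The repository path, the feature view name `driver_hourly_stats`, and the feature names are hypothetical placeholders, and the helper functions are illustrative names, not part of ZenML or Feast.

```python
from typing import Any, Dict, List


def feature_refs(view: str, names: List[str]) -> List[str]:
    """Build Feast feature references of the form 'feature_view:feature'."""
    return [f"{view}:{name}" for name in names]


def fetch_training_data(repo_path: str, entity_df: Any, features: List[str]) -> Any:
    """Offline / batch access: historical features for training.

    `entity_df` is expected to be a pandas DataFrame that contains the
    entity keys and an `event_timestamp` column.
    """
    # Imported lazily so this module loads without Feast installed.
    from feast import FeatureStore

    store = FeatureStore(repo_path=repo_path)
    return store.get_historical_features(
        entity_df=entity_df, features=features
    ).to_df()


def fetch_online_features(
    repo_path: str, features: List[str], entity_rows: List[Dict[str, Any]]
) -> Dict[str, List[Any]]:
    """Online access: latest feature values for the given entities."""
    from feast import FeatureStore

    store = FeatureStore(repo_path=repo_path)
    return store.get_online_features(
        features=features, entity_rows=entity_rows
    ).to_dict()


# Hypothetical feature view and feature names, for illustration only.
FEATURES = feature_refs("driver_hourly_stats", ["conv_rate", "acc_rate"])
```

With a configured Feast repository, `fetch_training_data("path/to/feature_repo", entity_df, FEATURES)` would cover the training case and `fetch_online_features("path/to/feature_repo", FEATURES, [{"driver_id": 1001}])` the inference case; both assume the hypothetical repository path and entity key shown here.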
ZenML assumes that users of the integration already have a feature store that they just need to connect with. The ZenML Feast integration currently supports your choice of offline data sources, and a Redis backend for your online feature serving. We encourage users to check out Feast's documentation and guides on how to set up your offline and online data sources via the configuration YAML file.
Online data retrieval is currently possible in a local setting, but we don't yet support online data serving in the context of a deployed model or as part of model deployment. We will update this documentation as we develop this feature.
Get your offline / batch data from a Feature Store
ZenML supports access to your feature store via a stack component that you can configure via the CLI tool. (See here for details on how to do that.)
Getting features from a registered and active feature store is possible by creating your own step that interfaces with the feature store.
Note that ZenML's use of Pydantic to serialize and deserialize inputs stored in the ZenML metadata means that we are limited to basic data types. Pydantic cannot handle Pandas DataFrames, for example, or datetime values, so a step that reads from the feature store has to convert them at various points.
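The kind of conversion mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the integration's actual code: it assumes the entity data is passed to a step as a plain dictionary of lists with timestamps encoded as ISO 8601 strings, and the column names, values, and helper names are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, List

# Hypothetical entity data in a form that contains only basic types
# (ints, strings, lists), so it can be serialized as step configuration.
ENTITY_DICT: Dict[str, List[Any]] = {
    "driver_id": [1001, 1002],
    "event_timestamp": ["2021-04-12T10:59:42", "2021-04-12T08:12:10"],
}


def to_serializable(entities: Dict[str, List[Any]]) -> Dict[str, List[Any]]:
    """Replace datetime values with ISO 8601 strings; leave other values as-is."""
    return {
        column: [v.isoformat() if isinstance(v, datetime) else v for v in values]
        for column, values in entities.items()
    }


def restore_timestamps(
    entities: Dict[str, List[Any]], column: str = "event_timestamp"
) -> Dict[str, List[Any]]:
    """Parse the ISO 8601 strings in `column` back into datetime objects."""
    restored = dict(entities)  # shallow copy; the input dict is not modified
    restored[column] = [datetime.fromisoformat(v) for v in entities[column]]
    return restored


# Inside the step, the timestamps are turned back into datetime objects.
restored = restore_timestamps(ENTITY_DICT)
```

With pandas installed, `pd.DataFrame.from_dict(restored)` would then produce an entity DataFrame inside the step, so that only basic types ever cross the step boundary.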
Get your online data from a Feature Store at inference time
COMING SOON: While the ZenML integration has an interface to access online feature store data, it currently is not usable in production settings with deployed models. We will update the docs when we enable this functionality.