Module core.steps.preprocesser.standard_preprocesser.standard_preprocesser

Functions

infer_schema(schema) : Function to infer a schema from input data.

Args:
    schema: Input dict mapping features to tf.Tensors.

Returns:
    schema_dict: Dict mapping features to their respective data types.
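
The mapping described here can be sketched as follows. This is an illustration, not ZenML's implementation: it assumes each value exposes a `dtype` attribute, as `tf.Tensor` does, and uses a hypothetical `FakeTensor` stand-in so that the example runs without TensorFlow.

```python
from collections import namedtuple

# Hypothetical stand-in for tf.Tensor; only the `dtype` attribute matters here.
FakeTensor = namedtuple("FakeTensor", ["values", "dtype"])


def infer_schema_sketch(schema):
    """Map each feature name to the dtype of its tensor-like value."""
    return {name: tensor.dtype for name, tensor in schema.items()}


# Feature names and dtype strings are made up for illustration.
example = {
    "age": FakeTensor([31.0, 45.0], "float32"),
    "city": FakeTensor(["Berlin", "Munich"], "string"),
}
schema_dict = infer_schema_sketch(example)
print(schema_dict)
```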

transformed_name(key) : Appends a suffix to a feature name, indicating that it has been transformed in a preprocessing step.

Args:
    key: Name of the feature.
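
A minimal sketch of such a naming helper is shown below. The suffix value `_xf` is an assumption chosen for illustration; the documentation above does not state which suffix the module actually uses.

```python
# Assumed suffix for illustration; the actual suffix is not stated above.
TRANSFORM_SUFFIX = "_xf"


def transformed_name_sketch(key: str) -> str:
    """Return the feature name with the transform suffix appended."""
    return key + TRANSFORM_SUFFIX


print(transformed_name_sketch("age"))
```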

Classes

StandardPreprocesser(features: List[str] = None, labels: List[str] = None, overwrite: Dict[str, Any] = None, **unused_kwargs) : Standard Preprocessor step. This step can be used to apply a variety of standard preprocessing techniques from the field of Machine Learning, which are predefined in ZenML, to the data.

Standard preprocessing step for generic preprocessing. Preprocessing
methods are inferred feature by feature: the data type of each feature
determines which standard preprocessing methods are applied to it.

The `overwrite` argument can be used to replace the data-type-specific
standard preprocessing functions with other functions, provided that
they are registered in the default preprocessing method dictionary
defined at the top of the module's source file.

An entry in the `overwrite` dict could look like this:

    {feature_name:
        {"transform": [{"method": "scale_to_z_score",
                        "parameters": {}}],
         "filling": [{"method": "max", "parameters": {}}]}
    }

Args:
    features: List of data features to be preprocessed.
    labels: List of features in the data that are to be predicted.
    overwrite: Dict of dicts, mapping each feature to lists of
     custom transform and filling methods to be used for it.
    **unused_kwargs: Additional unused keyword arguments. Their usage
     might change in the future.
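
As a hedged illustration of the `overwrite` format, the sketch below builds one entry and checks its shape before it would be passed to the step. The helper `make_overwrite_entry` and the feature names `age` and `income` are hypothetical; only the keys `transform`, `filling`, `method` and `parameters` and the method names `scale_to_z_score` and `max` come from the docstring.

```python
from typing import Any, Dict, List


def make_overwrite_entry(transform: List[Dict[str, Any]],
                         filling: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: build one per-feature entry and check its shape."""
    for step in transform + filling:
        # Every step is expected to carry exactly these two keys.
        if set(step) != {"method", "parameters"}:
            raise ValueError(f"Malformed step: {step!r}")
    return {"transform": transform, "filling": filling}


overwrite = {
    "age": make_overwrite_entry(
        transform=[{"method": "scale_to_z_score", "parameters": {}}],
        filling=[{"method": "max", "parameters": {}}],
    )
}

# The dict would then be passed to the step, for example:
# StandardPreprocesser(features=["age"], labels=["income"], overwrite=overwrite)
print(overwrite)
```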

### Ancestors (in MRO)

* zenml.core.steps.preprocesser.base_preprocesser.BasePreprocesserStep
* zenml.core.steps.base_step.BaseStep

### Static methods

`apply_filling(data, filling_list)`
:   Apply a list of fillings to input data.

    Args:
        data: Data to be input into the transform functions.
        filling_list: List of fillings to apply to the data. As of now,
         only the first filling in the list will be applied.
    
    Returns:
        data: Imputed data after the first filling in filling_list
        has been applied.
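
The "first filling only" behaviour can be sketched in plain Python. This is an assumption-laden illustration rather than the library code: it works on lists with `None` as the missing value, and the registry `FILLING_METHODS` is hypothetical, with only the method name `max` taken from the class docstring.

```python
def fill_max(data, **parameters):
    """Replace missing values (None) with the maximum of the observed values."""
    observed = [x for x in data if x is not None]
    fill_value = max(observed)
    return [fill_value if x is None else x for x in data]


# Hypothetical registry; only the name "max" appears in the docstring.
FILLING_METHODS = {"max": fill_max}


def apply_filling_sketch(data, filling_list):
    """Apply only the first filling in the list, ignoring the rest."""
    if not filling_list:
        return data
    first = filling_list[0]
    method = FILLING_METHODS[first["method"]]
    return method(data, **first["parameters"])


filled = apply_filling_sketch(
    [1.0, None, 3.0],
    [{"method": "max", "parameters": {}}],
)
print(filled)  # [1.0, 3.0, 3.0]
```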

`apply_transform(key, data, transform_list)`
:   Apply a list of transformations to input data.

    Args:
        key: Key argument specific to vocabulary computation.
        data: Data to be input into the transform functions.
        transform_list: List of transforms to apply to the data.
    
    Returns:
        data: Transformed data after each of the transforms in
         transform_list has been applied.
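
In contrast to fillings, every transform in the list is applied, each one receiving the output of the previous one. The sketch below shows that chaining on plain lists. It is not the library implementation: the registry `TRANSFORM_METHODS` and the `scale_to_0_1` entry are assumptions added for illustration, and `key` is accepted but unused because vocabulary computation is not modelled here.

```python
import statistics


def scale_to_z_score(data, **parameters):
    """Center to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.mean(data)
    std = statistics.pstdev(data)  # assumes the data is not constant
    return [(x - mean) / std for x in data]


def scale_to_0_1(data, **parameters):
    """Rescale linearly so that the minimum maps to 0 and the maximum to 1."""
    low, high = min(data), max(data)
    return [(x - low) / (high - low) for x in data]


# Hypothetical registry for this sketch.
TRANSFORM_METHODS = {
    "scale_to_z_score": scale_to_z_score,
    "scale_to_0_1": scale_to_0_1,
}


def apply_transform_sketch(key, data, transform_list):
    """Apply every transform in order; `key` is unused in this sketch."""
    for step in transform_list:
        method = TRANSFORM_METHODS[step["method"]]
        data = method(data, **step["parameters"])
    return data


scaled = apply_transform_sketch(
    "age",
    [1.0, 2.0, 3.0],
    [{"method": "scale_to_z_score", "parameters": {}},
     {"method": "scale_to_0_1", "parameters": {}}],
)
print(scaled)
```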

### Methods

`get_preprocessing_fn(self)`
:

`preprocessing_fn(self, inputs: Dict)`
:   Standard preprocessing function.

    Args:
        inputs: Dict mapping features to their respective data.
    
    Returns:
        output: Dict mapping transformed features to their respective
         transformed data.
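
The overall shape of such a function can be sketched as a loop over features. The sketch makes several assumptions that the documentation does not confirm: filling runs before transforming, the output keys use an `_xf` suffix, and the `fill` and `transform` callables are simple stand-ins for the per-feature methods.

```python
def preprocessing_fn_sketch(inputs, fill, transform, suffix="_xf"):
    """Fill, then transform, each feature and store it under a suffixed name."""
    output = {}
    for key, data in inputs.items():
        output[key + suffix] = transform(fill(data))
    return output


def fill_zero(data):
    """Stand-in filling: replace missing values (None) with 0.0."""
    return [0.0 if x is None else x for x in data]


def double(data):
    """Stand-in transform: multiply every value by two."""
    return [2 * x for x in data]


result = preprocessing_fn_sketch({"age": [1.0, None]}, fill_zero, double)
print(result)  # {'age_xf': [2.0, 0.0]}
```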