Module core.steps.sequencer.standard_sequencer.standard_sequencer

Classes

StandardSequencer(timestamp_column: str, category_column: str = None, overwrite: Dict[str, str] = None, resampling_freq: str = '1D', gap_threshold: int = 60000, sequence_length: int = 4, sequence_shift: int = 1, **kwargs) : Base class for all sequencer steps. These steps are used to specify transformation and filling operations on timeseries datasets that occur before the data preprocessing takes place.

Initializes the StandardSequencer step, which is responsible for
extracting sequences from a timeseries dataset.

The main logic behind this step can be summarized as follows:

1. First, we define how to add the corresponding timestamp to
each datapoint by using the function `get_timestamp_do_fn`. In this
implementation, the timestamp is expected to be a unix timestamp.

2. Similarly, we define how to add a categorical key to each
datapoint if a categorical column is provided.

3. Following that, we use the timestamp, the categorical key and
the gap threshold to split the data into so-called 'sessions'.

4. Once the data is split into sessions, we resample the sessions
based on the `resampling_freq` to create equidistant timestamps,
fill the missing values, and extract the finalized sequences based on
the `sequence_length` and `sequence_shift`.
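The session split in step 3 can be illustrated with a minimal, pure-Python sketch. It is not the step's actual Beam implementation: the helper name `split_into_sessions` and the `(key, timestamp, value)` tuple layout are hypothetical, and the sketch assumes that a new session starts whenever the gap between two consecutive timestamps of the same category reaches `gap_threshold` seconds.

```python
from collections import defaultdict

# Hypothetical data layout: (category key, unix timestamp in seconds, value).


def split_into_sessions(points, gap_threshold):
    """Group points by category key, then cut each group into sessions.

    Assumption: a gap of at least `gap_threshold` seconds between two
    consecutive datapoints closes the current session and opens a new one.
    """
    by_key = defaultdict(list)
    for key, ts, value in points:
        by_key[key].append((ts, value))

    sessions = {}
    for key, rows in by_key.items():
        rows.sort(key=lambda row: row[0])  # order by timestamp
        current = [rows[0]]
        result = []
        for prev, row in zip(rows, rows[1:]):
            if row[0] - prev[0] >= gap_threshold:
                result.append(current)  # gap reached: close the session
                current = []
            current.append(row)
        result.append(current)  # close the last open session
        sessions[key] = result
    return sessions


points = [
    ("a", 0, 1.0), ("a", 30, 1.5), ("a", 200, 2.0),
    ("b", 10, 9.0), ("b", 40, 9.5),
]
sessions = split_into_sessions(points, gap_threshold=60)
```

Here category `"a"` ends up with two sessions because its last datapoint is 170 seconds after the previous one, while category `"b"` stays in a single session.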

:param timestamp_column: string, the name of the column in which the
timestamp resides
:param category_column: string, the name of the column of a possible
categorical feature
:param overwrite: dict, used to overwrite any of the default resampling
and filling behaviour
:param resampling_freq: string, the resampling frequency as a pandas
Offset Alias (e.g. '1D')
:param gap_threshold: int, the minimum gap between two sessions in
seconds
:param sequence_length: int, the desired length of a sequence in terms
of datapoints
:param sequence_shift: int, the number of steps to shift before extracting
the next sequence
:param kwargs: additional params
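
How `sequence_length` and `sequence_shift` interact can be sketched as a sliding window over one resampled session. The helper `extract_sequences` below is hypothetical and only mirrors the parameter descriptions above; in particular, dropping sessions shorter than `sequence_length` is an assumption, not confirmed behaviour of the step.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def extract_sequences(
    session: Sequence[T],
    sequence_length: int = 4,
    sequence_shift: int = 1,
) -> List[List[T]]:
    """Slide a window of `sequence_length` datapoints over the session,
    moving `sequence_shift` datapoints between consecutive windows.

    Assumption: incomplete windows at the end of the session are dropped.
    """
    last_start = len(session) - sequence_length
    return [
        list(session[start:start + sequence_length])
        for start in range(0, last_start + 1, sequence_shift)
    ]


session = [10, 11, 12, 13, 14, 15]
default = extract_sequences(session)  # length 4, shift 1
strided = extract_sequences(session, sequence_length=4, sequence_shift=2)
too_short = extract_sequences([10, 11, 12])  # fewer than 4 datapoints
```

With the defaults, six datapoints yield three overlapping sequences; a shift of 2 yields two, and a session with fewer than four datapoints yields none under the stated assumption.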

### Ancestors (in MRO)

* zenml.core.steps.sequencer.base_sequencer.BaseSequencerStep
* zenml.core.steps.base_step.BaseStep

### Methods

`get_category_do_fn(self)`
:   Creates a class which inherits from beam.DoFn to add a categorical key
    to each datapoint
    
    :return: an instance of the beam.DoFn

`get_combine_fn(self)`
:   Creates a class which inherits from beam.CombineFn which processes
    sessions and extracts sequences from them
    
    :return: an instance of the beam.CombineFn

`get_timestamp_do_fn(self)`
:   Creates a class which inherits from beam.DoFn to add the timestamp
    to each datapoint
    
    :return: an instance of the beam.DoFn

`get_window(self)`
:   Returns a selected beam windowing strategy
    
    :return: the selected windowing strategy