Module core.steps.split.utils

Utility functions used by the split step.

Functions

get_categorical_value(example: tensorflow.core.example.example_pb2.Example, cat_col: str) : Helper function to get the categorical value from a tf.train.Example.

Args:
    example: tf.train.Example, data point in proto format.
    cat_col: Name of the categorical feature from which to extract the
    value.

Returns:
    value: The categorical value found in the `cat_col` feature inside
    the tf.train.Example.

Raises:
    AssertionError: If the `cat_col` feature is not present in the
    tf.train.Example.
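The lookup described above can be sketched as follows. This is a minimal illustration, not the module's own implementation: the name `read_categorical_value` and the feature names are hypothetical. Two details are assumptions: that a feature with an empty value list should count as not present, and that byte strings should be decoded to `str`.

```python
import tensorflow as tf


def read_categorical_value(example: tf.train.Example, cat_col: str):
    """Hypothetical stand-in showing one way to read a categorical value."""
    features = example.features.feature
    # Test membership first: indexing a protobuf map with a missing key
    # would insert an empty Feature instead of failing.
    if cat_col not in features:
        raise AssertionError(f"Feature '{cat_col}' not found in the example.")

    feature = features[cat_col]
    # A Feature holds exactly one of bytes_list, float_list, or int64_list.
    kind = feature.WhichOneof("kind")
    values = getattr(feature, kind).value if kind else []
    if not values:
        # Assumption: an empty feature is treated as missing.
        raise AssertionError(f"Feature '{cat_col}' holds no value.")

    value = values[0]
    # Assumption: byte strings are decoded so callers get a plain str.
    return value.decode("utf-8") if isinstance(value, bytes) else value


# Build a small example with one string feature and one integer feature.
example = tf.train.Example(
    features=tf.train.Features(
        feature={
            "color": tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[b"red"])
            ),
            "size": tf.train.Feature(
                int64_list=tf.train.Int64List(value=[3])
            ),
        }
    )
)
color = read_categorical_value(example, "color")
size = read_categorical_value(example, "size")
```

Checking membership before indexing keeps the lookup from modifying the example as a side effect.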

partition_cat_list(cat_list: List[Union[str, int]], c_ratio: Dict[str, float]) : Helper to split a list of categories into folds according to the ratios in a category split dict.

Args:
    cat_list: List of categorical values found in the categorical column.
    c_ratio: Dict {fold: percentage} mapping each split fold to the
     percentage of all categories assigned to it.

Returns:
    cat_dict: Dict {fold: categorical_list} mapping each split fold to
     the list of categorical values in the data assigned to it.
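One way such a partition could work is sketched below. It is an illustration under stated assumptions, not the module's actual logic: the name `split_categories` is hypothetical, the ratios are normalized by their sum, and folds are cut as contiguous slices in the order the dict lists them.

```python
from typing import Dict, List, Union


def split_categories(
    cat_list: List[Union[str, int]], c_ratio: Dict[str, float]
) -> Dict[str, List[Union[str, int]]]:
    """Hypothetical sketch: slice cat_list into folds by cumulative ratio."""
    total = sum(c_ratio.values())
    if total <= 0:
        raise ValueError("Ratios must sum to a positive number.")

    cat_dict = {}
    cumulative = 0.0
    left = 0
    for fold, pct in c_ratio.items():
        # Assumption: ratios are normalized, so {"a": 7, "b": 3} and
        # {"a": 0.7, "b": 0.3} produce the same split.
        cumulative += pct / total
        right = round(len(cat_list) * cumulative)
        cat_dict[fold] = cat_list[left:right]
        left = right
    return cat_dict


categories = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
folds = split_categories(categories, {"train": 0.7, "eval": 0.3})
# folds["train"] holds the first seven categories, folds["eval"] the last three.
```

Rounding the cumulative boundary, rather than each fold's size, means every category lands in exactly one fold even when the ratios do not divide the list evenly.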