Module core.steps.split.utils
Some split step utilities are implemented here.
Functions
get_categorical_value(example: tensorflow.core.example.example_pb2.Example, cat_col: str)
: Helper function to get the categorical value from a tf.train.Example.
Args:
example: tf.train.Example, data point in proto format.
cat_col: Name of the categorical feature whose value should be
extracted.
Returns:
value: The categorical value found in the `cat_col` feature inside
the tf.train.Example.
Raises:
AssertionError: If the `cat_col` feature is not present in the
tf.train.Example.
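The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the name `get_categorical_value_sketch` is hypothetical, and taking the first value of the feature and decoding byte strings to `str` are assumptions that the docstring does not confirm.

```python
import tensorflow as tf


def get_categorical_value_sketch(example: tf.train.Example, cat_col: str):
    """Hypothetical stand-in for get_categorical_value (illustration only)."""
    features = example.features.feature
    # Test membership first: indexing a missing key on a protobuf map
    # would silently create an empty entry instead of failing.
    if cat_col not in features:
        raise AssertionError(f"Feature {cat_col!r} not found in the example")

    feature = features[cat_col]
    # A tf.train.Feature holds at most one of bytes_list, float_list
    # or int64_list; WhichOneof reports which one is set.
    kind = feature.WhichOneof("kind")
    if kind is None:
        raise AssertionError(f"Feature {cat_col!r} holds no value")

    values = getattr(feature, kind).value
    if not values:
        raise AssertionError(f"Feature {cat_col!r} holds no value")

    # Assumption: only the first value matters for a categorical column,
    # and byte strings are returned as decoded text.
    value = values[0]
    return value.decode("utf-8") if isinstance(value, bytes) else value


# Build a small example with one string-valued categorical feature.
example = tf.train.Example(
    features=tf.train.Features(
        feature={
            "color": tf.train.Feature(
                bytes_list=tf.train.BytesList(value=[b"red"])
            ),
        }
    )
)
print(get_categorical_value_sketch(example, "color"))
```

Checking membership before indexing keeps the input proto unmodified when the column is missing, which matters if the same example is passed on to later steps.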
partition_cat_list(cat_list: List[Union[str, int]], c_ratio: Dict[str, float])
: Helper to split a category list by the entries in a category split dict.
Args:
cat_list: List of categorical values found in the categorical column.
c_ratio: Dict {fold: percentage} mapping each split fold to the
percentage of all categories assigned to it.
Returns:
cat_dict: Dict {fold: categorical_list} mapping each split fold to the
list of categorical values assigned to it.
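One way such a partition could work is sketched below. This is an illustration under stated assumptions, not the module's actual code: the name `partition_cat_list_sketch` is hypothetical, the ratios are normalised by their sum, and each fold receives a contiguous slice of `cat_list` in its given order.

```python
from typing import Dict, List, Union


def partition_cat_list_sketch(
    cat_list: List[Union[str, int]], c_ratio: Dict[str, float]
) -> Dict[str, List[Union[str, int]]]:
    """Hypothetical stand-in for partition_cat_list (illustration only)."""
    num_cats = len(cat_list)
    # Assumption: ratios are relative weights, so they are normalised by
    # their sum rather than required to add up to exactly 1.
    total = sum(c_ratio.values())

    cat_dict = {}
    cumulative = 0.0
    left = 0
    for fold, weight in c_ratio.items():
        cumulative += weight
        # Rounding the cumulative boundary (instead of each fold's size)
        # guarantees that every category lands in exactly one fold.
        right = round(num_cats * cumulative / total)
        cat_dict[fold] = cat_list[left:right]
        left = right
    return cat_dict


categories = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
folds = partition_cat_list_sketch(
    categories, {"train": 0.7, "eval": 0.2, "test": 0.1}
)
for fold, members in folds.items():
    print(fold, members)
```

With ten categories and a 0.7 / 0.2 / 0.1 split, the folds receive seven, two and one categories respectively. Because whole categories are assigned to folds, the share of data points per fold can still differ from the requested ratios when categories vary in size.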