Module core.components.data_gen.component
Classes
DataGen(source: str, source_args: Dict[str, Any], instance_name: Union[str, NoneType] = None, examples: Union[tfx.types.component_spec.ChannelParameter, NoneType] = None)
: Base class for a TFX pipeline component.
An instance of a subclass of BaseComponent represents the parameters for a
single execution of that TFX pipeline component.
All subclasses of BaseComponent must override the SPEC_CLASS field with the
ComponentSpec subclass that defines the interface of this component.
Attributes:
SPEC_CLASS: a subclass of types.ComponentSpec used by this component
(required). This is a class level value.
EXECUTOR_SPEC: an instance of executor_spec.ExecutorSpec which describes how
to execute this component (required). This is a class level value.
DRIVER_CLASS: a subclass of base_driver.BaseDriver as a custom driver for
this component (optional, defaults to base_driver.BaseDriver). This is a
class level value.
spec: an instance of `SPEC_CLASS`. See types.ComponentSpec for more details.
platform_config: a protobuf message representing platform config for a
component instance.
Interface for all DataGen components. DataGen is the main component
responsible for reading data and converting it to TFRecords. For now, this
is also how data versioning is handled.
Args:
source:
source_args:
schema:
instance_name:
examples:
enable_cache:
### Ancestors (in MRO)
* tfx.dsl.components.base.base_component.BaseComponent
* tfx.dsl.components.base.base_node.BaseNode
* tfx.utils.json_utils.Jsonable
### Class variables
`EXECUTOR_SPEC`
:
`SPEC_CLASS`
: A specification of the inputs, outputs and parameters for a component.
Components should have a corresponding ComponentSpec inheriting from this
class and must override:
- PARAMETERS (as a dict of string keys and ExecutionParameter values),
- INPUTS (as a dict of string keys and ChannelParameter values) and
- OUTPUTS (also a dict of string keys and ChannelParameter values).
Here is an example of how a ComponentSpec may be defined:

    class MyCustomComponentSpec(ComponentSpec):
        PARAMETERS = {
            'internal_option': ExecutionParameter(type=str),
        }
        INPUTS = {
            'input_examples': ChannelParameter(type=standard_artifacts.Examples),
        }
        OUTPUTS = {
            'output_examples': ChannelParameter(type=standard_artifacts.Examples),
        }

To create an instance of a subclass, call it directly with any execution
parameters / inputs / outputs as kwargs. For example:

    spec = MyCustomComponentSpec(
        internal_option='abc',
        input_examples=input_examples_channel,
        output_examples=output_examples_channel)

Attributes:
PARAMETERS: a dict of string keys and ExecutionParameter values.
INPUTS: a dict of string keys and ChannelParameter values.
OUTPUTS: a dict of string keys and ChannelParameter values.
DataGenSpec(**kwargs)
: A specification of the inputs, outputs and parameters for a component.
Components should have a corresponding ComponentSpec inheriting from this
class and must override:
- PARAMETERS (as a dict of string keys and ExecutionParameter values),
- INPUTS (as a dict of string keys and ChannelParameter values) and
- OUTPUTS (also a dict of string keys and ChannelParameter values).
Here is an example of how a ComponentSpec may be defined:

    class MyCustomComponentSpec(ComponentSpec):
        PARAMETERS = {
            'internal_option': ExecutionParameter(type=str),
        }
        INPUTS = {
            'input_examples': ChannelParameter(type=standard_artifacts.Examples),
        }
        OUTPUTS = {
            'output_examples': ChannelParameter(type=standard_artifacts.Examples),
        }

To create an instance of a subclass, call it directly with any execution
parameters / inputs / outputs as kwargs. For example:

    spec = MyCustomComponentSpec(
        internal_option='abc',
        input_examples=input_examples_channel,
        output_examples=output_examples_channel)

Attributes:
PARAMETERS: a dict of string keys and ExecutionParameter values.
INPUTS: a dict of string keys and ChannelParameter values.
OUTPUTS: a dict of string keys and ChannelParameter values.
Initialize a ComponentSpec.
Args:
**kwargs: Any inputs, outputs and execution parameters for this instance
of the component spec.
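
The keyword-argument pattern described above can be illustrated without TFX installed. The following is a minimal sketch in plain Python, not the TFX implementation: `SketchSpec` and `HypotheticalDataGenSpec` are stand-in names, plain Python types replace `ExecutionParameter` and `ChannelParameter`, and every declared key is treated as required. The keys `source`, `source_args` and `examples` are taken from the `DataGen` signature, but the actual contents of `DataGenSpec.PARAMETERS`, `INPUTS` and `OUTPUTS` are not shown in this reference, so their placement here is an assumption.

```python
from typing import Any, Dict


class SketchSpec:
    """Simplified stand-in for a ComponentSpec-like class (not TFX code)."""

    # Subclasses override these. Values are plain Python types here,
    # standing in for ExecutionParameter / ChannelParameter objects.
    PARAMETERS: Dict[str, type] = {}
    INPUTS: Dict[str, type] = {}
    OUTPUTS: Dict[str, type] = {}

    def __init__(self, **kwargs: Any) -> None:
        declared = {**self.PARAMETERS, **self.INPUTS, **self.OUTPUTS}

        # Reject keyword arguments that the spec does not declare.
        unknown = set(kwargs) - set(declared)
        if unknown:
            raise ValueError(f"Unknown arguments: {sorted(unknown)}")

        # Assumption: every declared key is required in this sketch.
        missing = set(declared) - set(kwargs)
        if missing:
            raise ValueError(f"Missing arguments: {sorted(missing)}")

        # Check each value against its declared type.
        for name, value in kwargs.items():
            if not isinstance(value, declared[name]):
                raise TypeError(
                    f"{name!r} expects {declared[name].__name__}, "
                    f"got {type(value).__name__}"
                )

        # Sort the kwargs into the three groups named by the spec.
        self.exec_properties = {k: kwargs[k] for k in self.PARAMETERS}
        self.inputs = {k: kwargs[k] for k in self.INPUTS}
        self.outputs = {k: kwargs[k] for k in self.OUTPUTS}


class HypotheticalDataGenSpec(SketchSpec):
    """Illustrative only; the real DataGenSpec keys are not documented here."""

    PARAMETERS = {"source": str, "source_args": dict}
    OUTPUTS = {"examples": list}  # a list stands in for an output channel


# Hypothetical values: "csv" and the "path" key are made up for illustration.
spec = HypotheticalDataGenSpec(
    source="csv",
    source_args={"path": "data.csv"},
    examples=[],
)
print(spec.exec_properties)
print(spec.outputs)
```

Declaring the interface as class-level dicts lets the constructor validate and route arbitrary `**kwargs` in one place, which is the design the docstring above describes; the real `ComponentSpec` additionally supports optional parameters and typed channels, which this sketch omits.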
### Ancestors (in MRO)
* tfx.types.component_spec.ComponentSpec
* tfx.utils.json_utils.Jsonable
### Class variables
`INPUTS`
:
`OUTPUTS`
:
`PARAMETERS`
: