Module core.backends.training.training_gcaip_backend

Definition of the GCAIP Training Backend

Classes

`SingleGPUTrainingGCAIPBackend(project: str, job_dir: str, gpu_type: str = 'K80', machine_type: str = 'n1-standard-4', image: str = 'eu.gcr.io/maiot-zenml/zenml:cuda-0.2.0', job_name: str = 'train_1611569655', region: str = 'europe-west1', python_version: str = '3.7', max_running_time: int = 7200, **kwargs)`
:   Runs a TrainerStep on Google Cloud AI Platform.

A training backend can be used to efficiently train a machine learning
model on large amounts of data. This backend triggers a training job on the
Google Cloud AI Platform service: https://cloud.google.com/ai-platform.

This backend is meant for training jobs that use a single GPU only. The user
can choose from the three GPU types defined in the GCPGPUTypes enum.

It is an opinionated wrapper around a GCAIP training job.

Args:
    project: The GCP project in which to run the job.
    job_dir: A GCS bucket path where job metadata is stored during training.
    gpu_type: The type of GPU to use, as defined in the GCPGPUTypes enum.
    machine_type: The type of machine to use. This must conform to the GCP
        compatibility matrix for the chosen gpu_type. See details here:
        https://cloud.google.com/ai-platform/training/docs/using-gpus#compute-engine-machine-types-with-gpu
    image: The Docker image with which to run the job.
    job_name: The name of the job.
    region: The GCP region to run the job in.
    python_version: The Python version for the job.
    max_running_time: The maximum running time of the job in seconds.
    **kwargs:
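
A minimal construction sketch using the documented arguments. The import path follows the module name on this page; the project ID, bucket path, and the commented-out `with_backend` attachment to a trainer step are illustrative assumptions, not API confirmed by this page:

```python
from zenml.core.backends.training.training_gcaip_backend import (
    SingleGPUTrainingGCAIPBackend,
)

# Construct the backend with the documented arguments.
# The project and bucket values below are placeholders.
training_backend = SingleGPUTrainingGCAIPBackend(
    project='my-gcp-project',        # GCP project in which to run the job
    job_dir='gs://my-bucket/jobs',   # bucket path for job metadata
    gpu_type='K80',                  # one of the GCPGPUTypes values
    machine_type='n1-standard-4',    # must be compatible with gpu_type
    region='europe-west1',
    max_running_time=7200,           # seconds
)

# Hypothetical attachment to a trainer step; the exact attachment API
# (e.g. `with_backend`) is an assumption and may differ per ZenML version:
# trainer_step = MyTrainerStep(...).with_backend(training_backend)
```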

### Ancestors (in MRO)

* zenml.core.backends.training.training_local_backend.TrainingLocalBackend
* zenml.core.backends.base_backend.BaseBackend

### Class variables

`BACKEND_TYPE`
:

### Methods

`get_custom_config(self)`
:   Return a dict to be passed as the `custom_config` to the TFX Trainer component.

`get_executor_spec(self)`
:   Return a TFX executor spec for the Trainer component.
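
A hedged sketch of how a pipeline might consume these two methods when assembling the TFX Trainer component. The commented-out Trainer construction follows the generic TFX pattern (`custom_executor_spec`, `custom_config`) and is an assumption, not code taken from this class; the project and bucket values are placeholders:

```python
from zenml.core.backends.training.training_gcaip_backend import (
    SingleGPUTrainingGCAIPBackend,
)

backend = SingleGPUTrainingGCAIPBackend(
    project='my-gcp-project',       # placeholder project ID
    job_dir='gs://my-bucket/jobs',  # placeholder metadata bucket
)

custom_config = backend.get_custom_config()   # dict forwarded to the Trainer
executor_spec = backend.get_executor_spec()   # TFX executor spec override

# Hypothetical wiring into a TFX Trainer component:
# trainer = tfx.components.Trainer(
#     custom_executor_spec=executor_spec,
#     custom_config=custom_config,
#     ...,  # module_file, examples, train_args, eval_args, etc.
# )
```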