Spark
Executing individual steps on Spark
Step Operators: SparkStepOperator
SparkStepOperator

from typing import Optional, Dict, Any

from zenml.step_operators import BaseStepOperatorConfig


class SparkStepOperatorConfig(BaseStepOperatorConfig):
    """Spark step operator config.

    Attributes:
        master: is the master URL for the cluster. You might see different
            schemes for different cluster managers which are supported by Spark
            like Mesos, YARN, or Kubernetes. Within the context of this PR,
            the implementation supports Kubernetes as a cluster manager.
        deploy_mode: can either be 'cluster' (default) or 'client' and it
            decides where the driver node of the application will run.
        submit_kwargs: is the JSON string of a dict, which will be used
            to define additional params if required (Spark has quite a
            lot of different parameters, so including them all in the step
            operator was not implemented).
    """

    master: str
    deploy_mode: str = "cluster"
    submit_kwargs: Optional[Dict[str, Any]] = None

Warning
Stack Component: KubernetesSparkStepOperator
KubernetesSparkStepOperator
When to use it
How to deploy it
Spark EKS Setup Guide
EKS Kubernetes Cluster
Docker image for the Spark drivers and executors
Configuring RBAC
How to use it
Additional configuration
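The config class shown earlier exposes three fields: master, deploy_mode, and submit_kwargs. As a rough illustration of how such values could map onto a Spark submission, here is a minimal sketch. It is not the step operator's actual implementation: SparkSubmitSettings and build_spark_submit_args are hypothetical names, the master URL and application path are placeholders, and treating each submit_kwargs entry as a Spark property passed through --conf is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SparkSubmitSettings:
    """Hypothetical stand-in that mirrors the three config fields."""

    master: str
    deploy_mode: str = "cluster"
    submit_kwargs: Optional[Dict[str, Any]] = None


def build_spark_submit_args(settings: SparkSubmitSettings, app: str) -> List[str]:
    """Build a spark-submit argument list from the settings (illustrative only)."""
    if settings.deploy_mode not in ("cluster", "client"):
        raise ValueError("deploy_mode must be 'cluster' or 'client'")

    args = [
        "spark-submit",
        "--master", settings.master,
        "--deploy-mode", settings.deploy_mode,
    ]
    # Assumption: every extra entry is a Spark property, passed as --conf key=value.
    # Keys are sorted so the resulting command is deterministic.
    for key, value in sorted((settings.submit_kwargs or {}).items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


if __name__ == "__main__":
    settings = SparkSubmitSettings(
        master="k8s://https://example.invalid:443",  # placeholder cluster URL
        submit_kwargs={"spark.executor.instances": 2},
    )
    print(" ".join(build_spark_submit_args(settings, "local:///opt/app/main.py")))
```

Keeping the rarely used parameters in a single dict matches the rationale given in the docstring: Spark has too many options to give each one its own field.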