The exact explanation and defaults for Spark config can be found here; `None` means to use the Spark native defaults.

Generated by `generate_config_docs.py`. Run `python ./generate_config_docs.py` to update this file.

Source code: `sparglim/config/configer.py`

Available environment variables for `SparkEnvConfiger`:

Default config:
- `SPAGLIM_APP_NAME`: `spark.app.name`, default: `Sparglim`.
- `SPAGLIM_DEPLOY_MODE`: `spark.submit.deployMode`, default: `client`.
- `SPARGLIM_SCHEDULER_MODE`: `spark.scheduler.mode`, default: `FAIR`.
- `SPARGLIM_UI_PORT`: `spark.ui.port`, default: `None`.
- `SPARGLIM_DRIVER_JAVA_OPTIONS`: `spark.driver.defaultJavaOptions`, default: `None`.
- `SPARGLIM_EXECUTOR_JAVA_OPTIONS`: `spark.executor.defaultJavaOptions`, default: `None`.
- `SPARGLIM_DRIVER_JAVA_EXTRA_OPTIONS`: `spark.driver.extraJavaOptions`, default: `None`.
- `SPARGLIM_EXECUTOR_JAVA_EXTRA_OPTIONS`: `spark.executor.extraJavaOptions`, default: `None`.
- `S3_ACCESS_KEY` or `AWS_ACCESS_KEY_ID`: `spark.hadoop.fs.s3a.access.key`, default: `None`.
- `S3_SECRET_KEY` or `AWS_SECRET_ACCESS_KEY`: `spark.hadoop.fs.s3a.secret.key`, default: `None`.
- `S3_ENTRY_POINT`: `spark.hadoop.fs.s3a.endpoint`, default: `None`.
- `S3_ENTRY_POINT_REGION` or `AWS_DEFAULT_REGION`: `spark.hadoop.fs.s3a.endpoint.region`, default: `None`.
- `S3_PATH_STYLE_ACCESS`: `spark.hadoop.fs.s3a.path.style.access`, default: `None`.
- `S3_MAGIC_COMMITTER`: `spark.hadoop.fs.s3a.bucket.all.committer.magic.enabled`, default: `None`.
- `SPARGIM_KERBEROS_KEYTAB`: `spark.kerberos.keytab`, default: `None`.
- `SPARGIM_KERBEROS_PRINCIPAL`: `spark.kerberos.principal`, default: `None`.
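The mapping above follows one pattern: each environment variable supplies the value for a single Spark key, with a fallback default, and keys that resolve to `None` are simply omitted so Spark uses its native defaults. A minimal sketch of that lookup logic, using a hypothetical `env_to_config` helper (not part of Sparglim's API):

```python
import os

def env_to_config(mapping: dict) -> dict:
    """Build a Spark config dict from environment variables.

    mapping: {env_var_name: (spark_key, default)}.
    Entries that resolve to None are omitted, so Spark
    falls back to its own native defaults for those keys.
    """
    config = {}
    for env_name, (spark_key, default) in mapping.items():
        value = os.environ.get(env_name, default)
        if value is not None:
            config[spark_key] = value
    return config

# A small subset of the table above:
mapping = {
    "SPAGLIM_APP_NAME": ("spark.app.name", "Sparglim"),
    "SPARGLIM_SCHEDULER_MODE": ("spark.scheduler.mode", "FAIR"),
    "SPARGLIM_UI_PORT": ("spark.ui.port", None),
}
print(env_to_config(mapping))
```

With no relevant variables exported, this yields only the two keys that have non-`None` defaults; exporting `SPARGLIM_UI_PORT` would add `spark.ui.port`.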
`config_basic()` configures the following:

- `SPAGLIM_APP_NAME`: `spark.app.name`, default: `Sparglim`.
- `SPAGLIM_DEPLOY_MODE`: `spark.submit.deployMode`, default: `client`.
- `SPARGLIM_SCHEDULER_MODE`: `spark.scheduler.mode`, default: `FAIR`.
- `SPARGLIM_UI_PORT`: `spark.ui.port`, default: `None`.
- `SPARGLIM_DRIVER_JAVA_OPTIONS`: `spark.driver.defaultJavaOptions`, default: `None`.
- `SPARGLIM_EXECUTOR_JAVA_OPTIONS`: `spark.executor.defaultJavaOptions`, default: `None`.
- `SPARGLIM_DRIVER_JAVA_EXTRA_OPTIONS`: `spark.driver.extraJavaOptions`, default: `None`.
- `SPARGLIM_EXECUTOR_JAVA_EXTRA_OPTIONS`: `spark.executor.extraJavaOptions`, default: `None`.
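For example, to rename the application and change the scheduler, export the variables before the Spark session is created (shown here via `os.environ`; the values are illustrative):

```python
import os

# Must be set before the SparkSession is created.
os.environ["SPAGLIM_APP_NAME"] = "etl-job"      # -> spark.app.name
os.environ["SPAGLIM_DEPLOY_MODE"] = "client"    # -> spark.submit.deployMode
os.environ["SPARGLIM_SCHEDULER_MODE"] = "FIFO"  # -> spark.scheduler.mode
os.environ["SPARGLIM_UI_PORT"] = "4050"         # -> spark.ui.port
```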
`config_s3()` configures the following:

- `S3_ACCESS_KEY` or `AWS_ACCESS_KEY_ID`: `spark.hadoop.fs.s3a.access.key`, default: `None`.
- `S3_SECRET_KEY` or `AWS_SECRET_ACCESS_KEY`: `spark.hadoop.fs.s3a.secret.key`, default: `None`.
- `S3_ENTRY_POINT`: `spark.hadoop.fs.s3a.endpoint`, default: `None`.
- `S3_ENTRY_POINT_REGION` or `AWS_DEFAULT_REGION`: `spark.hadoop.fs.s3a.endpoint.region`, default: `None`.
- `S3_PATH_STYLE_ACCESS`: `spark.hadoop.fs.s3a.path.style.access`, default: `None`.
- `S3_MAGIC_COMMITTER`: `spark.hadoop.fs.s3a.bucket.all.committer.magic.enabled`, default: `None`.
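A common use is pointing `config_s3()` at an S3-compatible store such as MinIO; the endpoint and credentials below are placeholders. Non-AWS stores typically also need path-style access enabled:

```python
import os

os.environ["S3_ACCESS_KEY"] = "minio-access-key"    # -> fs.s3a.access.key
os.environ["S3_SECRET_KEY"] = "minio-secret-key"    # -> fs.s3a.secret.key
os.environ["S3_ENTRY_POINT"] = "http://minio:9000"  # -> fs.s3a.endpoint
os.environ["S3_PATH_STYLE_ACCESS"] = "true"         # path-style URLs for non-AWS stores
os.environ["S3_MAGIC_COMMITTER"] = "true"           # enable the S3A magic committer
```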
`config_kerberos()` configures the following:

- `SPARGIM_KERBEROS_KEYTAB`: `spark.kerberos.keytab`, default: `None`.
- `SPARGIM_KERBEROS_PRINCIPAL`: `spark.kerberos.principal`, default: `None`.
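For example, to authenticate against a kerberized cluster (the keytab path and principal below are placeholders for your realm):

```python
import os

# Placeholders: substitute your own keytab path and principal.
os.environ["SPARGIM_KERBEROS_KEYTAB"] = "/etc/security/keytabs/spark.keytab"
os.environ["SPARGIM_KERBEROS_PRINCIPAL"] = "spark/host@EXAMPLE.COM"
```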
`config_local()` configures the following:

- `SPARGLIM_MASTER`: `spark.master`, default: `local[*]`.
- `SPARGLIM_LOCAL_MEMORY`: `spark.driver.memory`, default: `512m`.
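For instance, to cap a local session at four cores and give the driver more memory than the `512m` default:

```python
import os

os.environ["SPARGLIM_MASTER"] = "local[4]"  # 4 local cores instead of all (*)
os.environ["SPARGLIM_LOCAL_MEMORY"] = "2g"  # -> spark.driver.memory
```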
`config_connect_client()` configures the following:

- `SPARGLIM_REMOTE`: `spark.remote`, default: `sc://localhost:15002`.
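To point the client at a Spark Connect server other than the local default (the hostname below is a placeholder):

```python
import os

# sc://<host>:<port> is the Spark Connect URI scheme.
os.environ["SPARGLIM_REMOTE"] = "sc://spark-connect.internal:15002"
```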
`config_connect_server()` configures the following:

- `SPARGLIM_CONNECT_SERVER_PORT`: `spark.connect.grpc.binding.port`, default: `None`.
- `SPARGLIM_CONNECT_GRPC_ARROW_MAXBS`: `spark.connect.grpc.arrow.maxBatchSize`, default: `None`.
- `SPARGLIM_CONNECT_GRPC_MAXIM`: `spark.connect.grpc.maxInboundMessageSize`, default: `None`.
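For example, to bind the server to an explicit port and raise the gRPC inbound message limit (the values below are illustrative):

```python
import os

os.environ["SPARGLIM_CONNECT_SERVER_PORT"] = "15002"
# 134217728 bytes = 128 MiB maximum inbound gRPC message.
os.environ["SPARGLIM_CONNECT_GRPC_MAXIM"] = "134217728"
```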
`config_k8s()` configures the following:

- `SPARGLIM_MASTER`: `spark.master`, default: `k8s://https://kubernetes.default.svc`.
- `SPARGLIM_K8S_NAMESPACE`: `spark.kubernetes.namespace`, default: `None`.
- `SPARGLIM_K8S_IMAGE`: `spark.kubernetes.container.image`, default: `wh1isper/spark-executor:3.4.1`.
- `SPARGLIM_K8S_IMAGE_PULL_SECRETS`: `spark.kubernetes.container.image.pullSecrets`, default: `None`.
- `SPARGLIM_K8S_IMAGE_PULL_POLICY`: `spark.kubernetes.container.image.pullPolicy`, default: `IfNotPresent`.
- `SPARK_EXECUTOR_NUMS`: `spark.executor.instances`, default: `3`.
- `SPARGLIM_K8S_EXECUTOR_LABEL_LIST`: `spark.kubernetes.executor.label.*`, default: `sparglim-executor`. A comma-separated string will be converted into multiple entries.
- `SPARGLIM_K8S_EXECUTOR_ANNOTATION_LIST`: `spark.kubernetes.executor.annotation.*`, default: `sparglim-executor`. A comma-separated string will be converted into multiple entries.
- `SPARGLIM_DRIVER_HOST`: `spark.driver.host`, default: `None`.
- `SPARGLIM_DRIVER_BINDADDRESS`: `spark.driver.bindAddress`, default: `0.0.0.0`.
- `SPARGLIM_DRIVER_POD_NAME`: `spark.kubernetes.driver.pod.name`, default: `None`.
- `SPARGLIM_K8S_EXECUTOR_REQUEST_CORES`: `spark.kubernetes.executor.cores`, default: `None`.
- `SPARGLIM_K8S_EXECUTOR_LIMIT_CORES`: `spark.kubernetes.executor.limit.cores`, default: `None`.
- `SPARGLIM_EXECUTOR_REQUEST_MEMORY`: `spark.executor.memory`, default: `512m`.
- `SPARGLIM_EXECUTOR_LIMIT_MEMORY`: `spark.executor.memoryOverhead`, default: `None`.
- `SPARGLIM_K8S_GPU_VENDOR`: `spark.executor.resource.gpu.vendor`, default: `nvidia.com`.
- `SPARGLIM_K8S_GPU_DISCOVERY_SCRIPT`: `spark.executor.resource.gpu.discoveryScript`, default: `/opt/spark/examples/src/main/scripts/getGpusResources.sh`.
- `SPARGLIM_K8S_GPU_AMOUNT`: `spark.executor.resource.gpu.amount`, default: `None`.
- `SPARGLIM_RAPIDS_SQL_ENABLED`: `spark.rapids.sql.enabled`, default: `None`.
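A sketch of a driver running in-cluster, sizing its executors and labelling their pods. The namespace, counts, and sizes are illustrative, and the `key=value,key=value` shape of the label list is an assumption about how the comma-separated string is converted:

```python
import os

os.environ["SPARGLIM_K8S_NAMESPACE"] = "spark-jobs"        # namespace for executor pods
os.environ["SPARK_EXECUTOR_NUMS"] = "5"                    # -> spark.executor.instances
os.environ["SPARGLIM_K8S_EXECUTOR_REQUEST_CORES"] = "2"    # CPU request per executor pod
os.environ["SPARGLIM_EXECUTOR_REQUEST_MEMORY"] = "4g"      # -> spark.executor.memory
# Assumed format: each comma-separated item becomes one
# spark.kubernetes.executor.label.* entry.
os.environ["SPARGLIM_K8S_EXECUTOR_LABEL_LIST"] = "team=data,app=sparglim-executor"
```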
S3 secrets, tokens, and the like need only be configured on the Driver or Connect Server; configuration in the Connect client takes no effect.