-
Hi all, we recently tested BatchSession with SessionConfAdvisor. At some point, Spark jobs did not receive some necessary confs that should have been delivered by our SessionConfAdvisor. After some research, we found the line below:

normalizedConf ++ overlayConf.getBatchConf(batchType)

So, if we want to use the same configs for SQL and batch sessions, the SessionConfAdvisor has to produce the same config in two different notations:
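A minimal sketch of the workaround this forces, assuming the SessionConfAdvisor.getConfOverlay(user, sessionConf) plugin signature and that getBatchConf reads keys under the kyuubi.batchConf.<batchType>. prefix (the class name and the hard-coded conf are illustrative only):

import java.util.{Map => JMap}
import scala.jdk.CollectionConverters._
import org.apache.kyuubi.plugin.SessionConfAdvisor

// Emit every Spark conf twice: once as a plain key for interactive
// (SQL) sessions, and once under the batch prefix so that
// overlayConf.getBatchConf(batchType) picks it up for batch sessions.
class DualNotationConfAdvisor extends SessionConfAdvisor {

  private val sparkConfs = Map("spark.submit.deployMode" -> "cluster")

  override def getConfOverlay(
      user: String,
      sessionConf: JMap[String, String]): JMap[String, String] = {
    val batchConfs = sparkConfs.map { case (k, v) =>
      s"kyuubi.batchConf.spark.$k" -> v
    }
    (sparkConfs ++ batchConfs).asJava
  }
}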
Is this intended?
-
This is a good question, and I did have some arguments offline with other committers. The original intention here is: it is desirable to have different default configurations for batch and interactive Spark jobs, and we need a mechanism to isolate those configurations. @turboFei is the author of the batch API and made this decision while I wasn't aware of it at the beginning. I agree this is a simple and crude approach to achieving isolation; the side effect is that it's counterintuitive. I actually get asked many times why configuring

spark.submit.deployMode=cluster

in kyuubi-defaults.conf does not work for batch Spark jobs. So, we may need to introduce a flag to control the behavior, something like:
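Purely for illustration (this key does not exist in Kyuubi; the name is hypothetical), such a flag could be a switch in kyuubi-defaults.conf:

# hypothetical: when false, batch sessions would reuse the shared conf
# template instead of requiring the batch-specific prefix
kyuubi.batchConf.isolation.enabled=true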
-
Yes, it is by design. Batch jobs are normal Spark applications, not long-running/ad-hoc ones, and are just submitted via the Kyuubi gateway, so we cannot share the same config template for INTERACTIVE and BATCH jobs. For example, we limit the max files created per task for interactive jobs, as we consider them ad-hoc queries.
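To illustrate the isolation (again assuming batch-only defaults are read from the kyuubi.batchConf.<batchType>. prefix; the values are made up), the two templates can diverge in kyuubi-defaults.conf like this:

# shared default, applied to interactive (SQL) sessions
spark.sql.shuffle.partitions=200
# batch-only override, picked up via getBatchConf("spark")
kyuubi.batchConf.spark.spark.sql.shuffle.partitions=1000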
-
Yes, it is fine to introduce a flag to control that.