[SPARK-25442][SQL][K8S] Support STS to run in k8s deployments with spark deployment mode as cluster. #22433
Conversation
test this please
Kubernetes integration test starting
Kubernetes integration test status success
Thank you for your first contribution, @suryag10.
And, just out of curiosity, do we need this change?
- exec "${SPARK_HOME}"/sbin/spark-daemon.sh submit $CLASS 1 --name "Thrift JDBC/ODBC Server" "$@"
+ exec "${SPARK_HOME}"/sbin/spark-daemon.sh submit $CLASS 1 --name "Thrift-JDBC-ODBC-Server" "$@"
Without the above change, it fails to start the driver pod as well. Spaces and "/" are not allowed in the "name" in the Kubernetes world.
Does it fail in k8s, or does the Spark k8s code error out?
Following is the error seen without the fix. This is not specific to Kubernetes, but rather a generic DNS (DNS-1123) naming restriction.
Test build #96105 has finished for PR 22433 at commit
It is an implementation detail of the k8s integration that the application name is expected to be DNS compliant. Spark does not have that requirement, and yarn/mesos/standalone/local work without this restriction.
As this script is the common start point for all the resource managers (k8s/yarn/mesos/standalone/local), I guess changing it to fit all the cases adds value, instead of handling it at each resource manager level. Thoughts?
I'm wondering, is there some reason this isn't supported in cluster mode for yarn & mesos? Or put another way, what is the rationale for k8s being added as an exception to this rule?
Your changes to the name handling don’t comply with this, so I agree with @mridulm: you should move this change elsewhere and more broadly support name validation/sanitization for submitted applications in Kubernetes.
Please note that I am specifically referring only to the need for changing the application name.
I don't know the specific reason why this was not supported in yarn and mesos. The initial contributions to Spark on K8S started with cluster mode (with a restriction against client mode), so this PR enhances things such that STS can run in k8s deployments with spark cluster mode. (In the latest Spark code I observed that client mode also works; I need to cross-verify this once more.)
Agreed with @mridulm that the naming restriction is specific to k8s and should be handled in a k8s-specific way, e.g., somewhere around https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala#L208.
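The kind of DNS-1123 sanitization discussed above can be sketched roughly as follows. This is a hypothetical standalone illustration, not the actual Spark implementation; the object and method names are invented, and the real logic in the k8s submission code may differ in details.

```scala
// Hypothetical sketch of DNS-1123-style name sanitization, similar in spirit
// to what the k8s submission path needs. Names here are illustrative only.
object NameSanitizer {
  // Lowercase the name, replace disallowed characters with '-', collapse
  // runs of '-', trim leading/trailing '-', and cap at the 63-character
  // DNS label limit.
  def sanitize(name: String): String =
    name.toLowerCase
      .replaceAll("[^a-z0-9-]", "-")
      .replaceAll("-+", "-")
      .stripPrefix("-")
      .stripSuffix("-")
      .take(63)
}
```

With this sketch, a name like "Thrift JDBC/ODBC Server" would become "thrift-jdbc-odbc-server", which is a valid DNS-1123 label.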
OK, will update the PR with the same.
Hi, handling of this conversion is already present in the Kubernetes submission code, so I have reverted the change in the start-thriftserver.sh file. Please review and merge.
@mridulm @liyinan926 @jacobdr @ifilonenko I have reverted the fix in start-thriftserver.sh. Please review and merge.
Can somebody please review and merge?
In the scenario of a cluster-mode submission, what is the command-line behavior? Does the thrift-server script "block" until the thrift server pod is shut down?
If possible, there should be some basic integration testing, e.g., running a thrift server command against the minishift cluster used by the other tests.
STS is a server, and the best way to deploy it in a K8S cluster is through either a Helm chart or a YAML file. (It can also be done through the method you suggested, but I guess that scenario would be rare, and there would be no HA for the STS server if it is triggered from outside.)
By default the script returns, but it can be made to block by setting the environment variable SPARK_NO_DAEMONIZE. Once this is set, the script blocks until the thrift server pod is shut down.
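As a rough illustration of the blocking behavior described above, the invocation might look like this. The master URL and API server address are placeholders, not values from this PR:

```shell
# Hypothetical example: keep start-thriftserver.sh in the foreground until
# the Thrift server shuts down. <api-server> is a placeholder for your
# Kubernetes API server address.
export SPARK_NO_DAEMONIZE=1
"$SPARK_HOME"/sbin/start-thriftserver.sh \
  --master k8s://https://<api-server>:6443 \
  --deploy-mode cluster
```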
Will add this as a separate PR.
Can somebody please merge this?
I am observing some weird behaviour when trying to respond to the comments inline, hence I am adding the responses here (same as above).
Will add it as a separate PR. Please merge this if you are OK with the responses.
@suryag10 you were probably encountering the GitHub server problems from yesterday.
@suryag10, all things being equal, it is considered preferable to provide testing for new functionality in the same PR. Are there logistical problems with adding testing here?
The bug here should be SPARK-23078; no point in filing duplicate bugs. Also, could anyone answer my question in the bug? It seems we don't need this anymore.
No updates on the bug, so I assume what I wrote is correct. Closing.
What changes were proposed in this pull request?
The code is enhanced to allow STS to run in a Kubernetes deployment with a spark deploy mode of cluster.
How was this patch tested?
Started the STS in cluster mode in a K8S deployment and was able to run some queries using the beeline client.
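For reference, such a beeline session might look like the following. The host is a placeholder (not from this PR); 10000 is the conventional default HiveServer2/Thrift port:

```shell
# Hypothetical: connect to the Thrift JDBC/ODBC server and run a query.
# <sts-host> is a placeholder for the service address of the server pod.
beeline -u "jdbc:hive2://<sts-host>:10000" -e "SHOW TABLES;"
```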