Add Reflection to support custom Spark Implementation at Runtime #1362
Conversation
Signed-off-by: Ahmed Hussein <[email protected]>
Fixes NVIDIA#1360
Adds a workaround to run against open source Spark and custom Spark implementations that override the constructors of Graph objects.
import scala.annotation.StaticAnnotation
import scala.annotation.meta.{beanGetter, beanSetter, field, getter, param, setter}


nit: extra newline
@@ -0,0 +1,32 @@
/*
This feels like it shouldn't be in the stubs directory; it's a critical class and there is no platform-specific version, correct?
moved it to the same package.
  case _: java.lang.NoSuchMethodError =>
    dbRuntimeReflection.constructCluster(id, name, desc, nodes, metrics)
}
GraphReflectionAPIHelper.api.get.constructCluster(id, name, desc, nodes, metrics)
These (GraphReflectionAPIHelper.api) are all Options, and we just call get on them here without a check or an else branch, so is there any reason for them to be Options?
I refactored the code to define the graphBuilder method as a field whose value depends on the initialization of the APIs.
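For reference, a minimal sketch of that refactor in Scala (all names here are illustrative, not the actual tool code): the reflective builder is resolved once into a field at initialization, so the per-call path never touches Option.get.

object GraphBuilderSketch {
  final case class Node(id: Long, name: String)

  // Stands in for a reflection-backed constructor API (hypothetical).
  trait GraphReflectionAPI {
    def constructNode(id: Long, name: String): Node
  }

  // Hypothetical runtime probe: Some(api) when the runtime overrides the
  // stock constructors, None when plain open source Spark is on the classpath.
  private def detectRuntimeAPI(): Option[GraphReflectionAPI] = None

  // Resolved exactly once at initialization; callers invoke a plain function
  // with no Option handling on the hot path.
  val nodeBuilder: (Long, String) => Node = detectRuntimeAPI() match {
    case Some(api) => (id, name) => api.constructNode(id, name)
    case None => (id, name) => Node(id, name)
  }
}

Every allocation then goes through GraphBuilderSketch.nodeBuilder(id, name), so the choice between reflection and the stock constructor is paid only once.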
When using the Python tools CLI, we download open source Spark and provide that to the JAR tool. This happens even if we are running the tools inside a Databricks env.
In that case, do we need a follow-up on the Python side to use the custom runtime jars instead of open source Spark?
Signed-off-by: Ahmed Hussein <[email protected]>
Thanks @tgravescs!
I updated the code and put a code comment pointing to the issue in Scala 2.12
Signed-off-by: Ahmed Hussein <[email protected]>
We have issue #1359 to allow customers to specify their own dependency.
Signed-off-by: Ahmed Hussein <[email protected]>
// For 10.4 there is only one constructor, with 3 arguments:
// final java.lang.String name, final long accumulatorId, final java.lang.String metricType
private val isDB104OrOlder: Boolean = constr.paramLists.flatten.size < 4
Question - if there is any addition or deletion of arguments in future DB versions, should we add a flag or shim it?
Yes, when DB changes its API, we will revisit the implementation and do an extension instead of relying on a flag.
I did not want to change that part since it was working fine and we had no need to change it.
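As a reference for how such an arity probe can work, here is a small self-contained sketch using Scala runtime reflection; the class name in the usage comment is an assumption inferred from the 3-argument signature quoted above, not necessarily the tool's actual target.

import scala.reflect.runtime.{universe => ru}

object ConstructorArityProbe {
  // Counts the parameters of a class's primary constructor at runtime.
  def primaryConstructorArity(fullClassName: String): Int = {
    val mirror = ru.runtimeMirror(getClass.getClassLoader)
    val classSymbol = mirror.staticClass(fullClassName)
    val constr = classSymbol.primaryConstructor.asMethod
    constr.paramLists.flatten.size
  }
}

// Mirroring the isDB104OrOlder check above: a primary constructor with fewer
// than 4 parameters marks the older runtime, e.g. (class name assumed):
// val isDB104OrOlder =
//   ConstructorArityProbe.primaryConstructorArity(
//     "org.apache.spark.sql.execution.ui.SQLPlanMetric") < 4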
Signed-off-by: Ahmed Hussein [email protected]
Fixes #1360
Adds a workaround to run against open source Spark and custom Spark implementations that override the constructors of Graph objects.
This PR should also improve performance in Databricks environments.
Before the change, we used to call the normal Spark API, catch the resulting Java exception, and finally load the constructor method via reflection and call it with the correct arguments.
After the change, the tools detect the runtime once and use the pre-resolved reflective constructor on every call, which reduces the overhead of each object allocation.
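Schematically, the two allocation strategies compare as in the sketch below (illustrative names only; Cluster and the boolean probe stand in for the real Graph classes and runtime detection):

object AllocationStrategies {
  final case class Cluster(id: Long, name: String)

  // Stands in for the reflection-based constructor lookup (hypothetical).
  object dbRuntimeReflection {
    def constructCluster(id: Long, name: String): Cluster = Cluster(id, name)
  }

  // BEFORE: every allocation attempts the stock constructor and falls back to
  // reflection on NoSuchMethodError, so a custom runtime pays for a thrown
  // exception plus a constructor lookup on each call.
  def constructClusterBefore(id: Long, name: String): Cluster =
    try {
      Cluster(id, name) // stands in for the normal Spark API call
    } catch {
      case _: java.lang.NoSuchMethodError =>
        dbRuntimeReflection.constructCluster(id, name)
    }

  // Hypothetical one-time probe of the runtime.
  private def runtimeOverridesConstructors: Boolean = false

  // AFTER: the constructor is bound once, up front; each allocation is a
  // single pre-resolved call with no exception handling.
  private val boundConstructor: (Long, String) => Cluster =
    if (runtimeOverridesConstructors) {
      (id, name) => dbRuntimeReflection.constructCluster(id, name)
    } else {
      (id, name) => Cluster(id, name)
    }

  def constructClusterAfter(id: Long, name: String): Cluster =
    boundConstructor(id, name)
}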