
Add Reflection to support custom Spark Implementation at Runtime #1362

Merged: 4 commits merged into NVIDIA:dev on Oct 7, 2024

Conversation

@amahussein (Collaborator) commented Sep 27, 2024

Signed-off-by: Ahmed Hussein [email protected]

Fixes #1360

Adds a workaround to run against both open source Spark and custom Spark implementations that override the constructors of Graph objects.

This PR should also improve performance in Databricks environments.
Before the change, we would try to call the normal Spark API, catch the resulting Java exception, and only then load the constructor method via reflection and call it with the correct arguments.
After the change, the tool detects the runtime once and uses reflection every time to call the constructor. This reduces the overhead of each object allocation.
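To illustrate the difference, here is a minimal, self-contained Scala sketch of the two strategies. The names (Cluster, ClusterBuilder) are hypothetical stand-ins for the tool's graph types; only the overall pattern reflects this PR:

import java.lang.reflect.Constructor

// Toy stand-in for a graph class whose constructor signature differs per runtime.
class Cluster(val id: Long, val name: String)

object ClusterBuilder {
  // Before: attempt the direct call and fall back to reflection on every allocation.
  def constructOld(id: Long, name: String): Cluster =
    try {
      new Cluster(id, name) // would throw NoSuchMethodError on a custom runtime
    } catch {
      case _: NoSuchMethodError => constructNew(id, name)
    }

  // After: resolve the matching constructor once at initialization...
  private val ctor: Constructor[Cluster] =
    classOf[Cluster].getConstructor(classOf[Long], classOf[String])

  // ...and invoke it directly for every allocation, with no try/catch on the hot path.
  def constructNew(id: Long, name: String): Cluster =
    ctor.newInstance(java.lang.Long.valueOf(id), name)
}

The point of the design is that the reflective lookup happens once, so the per-allocation cost is a single Constructor.newInstance call rather than an exception-driven fallback.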

Signed-off-by: Ahmed Hussein <[email protected]>

Fixes NVIDIA#1360

Adds a workaround to run against open source Spark and custom Spark
implementation that overrides the constructors of Graph objects.
@amahussein added the core_tools (Scope the core module (scala)), build, and dependencies (Pull requests that update a dependency file) labels on Sep 27, 2024
@amahussein self-assigned this on Sep 27, 2024
import scala.annotation.StaticAnnotation
import scala.annotation.meta.{beanGetter, beanSetter, field, getter, param, setter}


Collaborator: nit: extra newline

@@ -0,0 +1,32 @@
/*
Collaborator: this feels like it shouldn't be in the stubs directory; it's a critical class and there is no platform-specific version, correct?

Collaborator Author: moved it to the same package.

case _: java.lang.NoSuchMethodError =>
dbRuntimeReflection.constructCluster(id, name, desc, nodes, metrics)
}
GraphReflectionAPIHelper.api.get.constructCluster(id, name, desc, nodes, metrics)
Collaborator: these (GraphReflectionAPIHelper.api) are all Options, and we just call get on them here without an else branch or a check. Is there any reason for them to be Options?

Collaborator Author: I refactored the code to define the graphBuilder method as a field, depending on the initialization of the APIs.
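A hedged sketch of what such a refactor can look like (all names hypothetical; the env-var check is illustrative only): the builder function is selected once at initialization, so the hot path never touches an Option:

object GraphBuilderSketch {
  // Hypothetical stand-ins for the two construction strategies.
  private def defaultBuilder(id: Long, name: String): String = s"default:$id:$name"
  private def reflectiveBuilder(id: Long, name: String): String = s"reflective:$id:$name"

  // Detected once at initialization; illustrative check only.
  private val isCustomRuntime: Boolean = sys.env.contains("DATABRICKS_RUNTIME_VERSION")

  // Chosen once as a field: call sites invoke it directly,
  // with no Option.get and no per-call branching.
  val constructCluster: (Long, String) => String =
    if (isCustomRuntime) reflectiveBuilder else defaultBuilder
}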

@parthosa (Collaborator) left a comment:

When using the python tools CLI, we download open source Spark and provide it to the JAR tool. This happens even if we are running the tools inside a Databricks env.

In that case, do we need a follow-up on the python side to use custom runtime jars instead of open source Spark?

Signed-off-by: Ahmed Hussein <[email protected]>
@amahussein (Collaborator Author) left a comment:

Thanks @tgravescs!
I updated the code and added a code comment pointing to the issue in Scala 2.12.


Signed-off-by: Ahmed Hussein <[email protected]>
@amahussein (Collaborator Author) commented:

> When using the python tools CLI, we download open source Spark and provide it to the JAR tool. This happens even if we are running the tools inside a Databricks env.
> In that case, do we need a follow-up on the python side to use custom runtime jars instead of open source Spark?

We have issue #1359 to allow customers to specify their own dependencies.
The open source Spark works fine in DB, so we can leave it as it is for now.
The trick with setting the dependencies for each individual CSP is that we would have to redefine the yaml configuration assuming the user is running on a cluster node, and, for each one, set the path of the Spark jars on the local disk.
Currently, we do not need to do all that unless we learn that it significantly affects QualX (which it does not at the moment).

Signed-off-by: Ahmed Hussein <[email protected]>

// for 10.4 it is only one constructor with 3 arguments.
// final java.lang.String name, final long accumulatorId, final java.lang.String metricType
private val isDB104OrOlder: Boolean = constr.paramLists.flatten.size < 4
Collaborator: Question: if there is any addition or deletion of arguments in future DB versions, should we add a flag or shim it?

Collaborator Author: Yes, when DB changes its API, we will revisit the implementation and do an extension instead of relying on a flag.
I did not want to change that part since it was working fine and we had no need to change it.
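For reference, a minimal sketch of that arity check with Scala runtime reflection, using a hypothetical ToyMetric class in place of the real Databricks constructor:

import scala.reflect.runtime.{universe => ru}

// Toy class with the 3-argument shape described in the snippet above.
class ToyMetric(name: String, accumulatorId: Long, metricType: String)

object ArityCheck extends App {
  val constr = ru.typeOf[ToyMetric].decl(ru.termNames.CONSTRUCTOR).asMethod
  // Fewer than 4 constructor parameters signals the older (10.4-style) shape.
  val isOlderShape = constr.paramLists.flatten.size < 4
  println(s"older constructor shape: $isOlderShape") // prints: true
}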

@amahussein merged commit b053680 into NVIDIA:dev on Oct 7, 2024
14 checks passed
@amahussein deleted the rapids-tools-1360 branch on October 7, 2024 14:13
Labels: build, core_tools (Scope the core module (scala)), dependencies (Pull requests that update a dependency file)
Projects: None yet
Development: successfully merging this pull request may close these issues:
[FEA] Running Tools on custom Spark raises runtime exception
4 participants