Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix python tool picking up wrong JAR version in Fat wheel mode #1357

Merged
merged 3 commits into from
Sep 24, 2024

Conversation

parthosa
Copy link
Collaborator

@parthosa parthosa commented Sep 23, 2024

Fixes #1355.

Currently, in fat wheel mode, the python tools uses a glob search to pick the the first JAR file that matches the pattern rapids-4-spark-tools_*.jar. This PR fixes this bug by selecting the exact JAR file that was packaged during the fat build.

Changes

  • Extract the tools JAR file name that was packaged during the fat build and set it as a context variable fatWheelModeJarFileName
  • Use the above the file name to get the correct tools JAR from the cache folder.

Testing

  • Reproduced the case using steps provided in the issue description
    • Build fat wheel for 24.08.2 and then for 24.08.3-SNAPSHOT
    • 24.08.3-SNAPSHOT was picking up 24.08.2 JAR.
  • Manually tested before and after the changes.

Before:

INFO rapids.tools.qualification: Using Spark RAPIDS user tools version 24.08.3
...
INFO rapids.tools.qualification.ctxt: Using jar from wheel file /var/tmp/spark_rapids_user_tools_cache/rapids-4-spark-tools_2.12-24.08.2.jar
INFO rapids.tools.qualification: Downloading the tools jars /var/tmp/spark_rapids_user_tools_cache/rapids-4-spark-tools_2.12-24.08.2.jar
INFO rapids.tools.qualification: RAPIDS accelerator tools jar is downloaded to work_dir /tools_run/qual_20240923210702_cB71064C/work_dir/rapids-4-spark-tools_2.12-24.08.2.jar
INFO rapids.tools.qualification: Using Spark RAPIDS Accelerator Tools jar version 24.08.2

After:

INFO rapids.tools.qualification: Using Spark RAPIDS user tools version 24.08.3
...
INFO rapids.tools.qualification.ctxt: Using jar from wheel file /var/tmp/spark_rapids_user_tools_cache/rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar
INFO rapids.tools.qualification: Downloading the tools jars /var/tmp/spark_rapids_user_tools_cache/rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar
INFO rapids.tools.qualification: RAPIDS accelerator tools jar is downloaded to work_dir /tools_run/qual_20240923222802_18fA0d42/work_dir/rapids-4-spark-tools_2.12-24.08.3-SNAPSHOT.jar
INFO rapids.tools.qualification: Using Spark RAPIDS Accelerator Tools jar version 24.08.3

Notes:

  • We could have simplified this by getting the Tools JAR version from python and using it as a regex in fat wheel mode.
  • However, being explicit with the packaged JAR avoids any confusion.
  • This approach also allows for future enhancements, such as enabling build.sh to accept a custom JAR.

@parthosa parthosa added bug Something isn't working user_tools Scope the wrapper module running CSP, QualX, and reports (python) labels Sep 23, 2024
@parthosa parthosa self-assigned this Sep 23, 2024
@parthosa parthosa marked this pull request as ready for review September 23, 2024 23:35
cindyyuanjiang
cindyyuanjiang previously approved these changes Sep 24, 2024
Copy link
Collaborator

@cindyyuanjiang cindyyuanjiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @parthosa! LGTM.

Copy link
Collaborator

@amahussein amahussein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @parthosa !
A couple of nits

Signed-off-by: Partho Sarthi <[email protected]>
Copy link
Collaborator

@amahussein amahussein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTME!

Copy link
Collaborator

@cindyyuanjiang cindyyuanjiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @parthosa!

@parthosa parthosa merged commit e620d17 into NVIDIA:dev Sep 24, 2024
14 checks passed
@parthosa parthosa deleted the spark-rapids-tools-1355 branch September 24, 2024 20:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working user_tools Scope the wrapper module running CSP, QualX, and reports (python)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] user-tools will pick up any version available for the tools_jar in the cache location
3 participants