Simplify mvn verify cache key [skip ci] #12237

YanxuanLiu · 2025-02-26T16:11:42Z

Currently the cache key ID contains the hash of all pom files plus the sha1/md5 value of the dependencies. Keep only the first 8 chars to avoid making the key too long.

Signed-off-by: Yanxuan Liu <[email protected]>

YanxuanLiu · 2025-02-26T16:12:16Z

test run: https://github.com/YanxuanLiu/spark-rapids/actions/runs/13547756045/job/37863528686?pr=26#step:4:51

Signed-off-by: Yanxuan Liu <[email protected]>

gerashegalov · 2025-02-26T19:01:52Z

Help reviewers understand the need for this change. Did we hit some error on gh? Can you file an issue with the description and reference it in your PR?

gerashegalov · 2025-02-26T19:11:56Z

.github/workflows/mvn-verify-check.yml

@@ -54,7 +54,8 @@ jobs:
        run: |
          set -x
          depsSHA1=$(. .github/workflows/mvn-verify-check/get-deps-sha1.sh 2.12)
-          cacheKey="${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}-${{ github.event.pull_request.base.ref }}-${depsSHA1}"
+          hashfile=$(echo ${{ hashFiles('**/pom.xml') }} | cut -c1-8)


Instead of truncating the hashFiles output, why not feed all the inputs to md5sum, as in https://github.com/NVIDIA/spark-rapids/pull/12237/files#diff-4e6d49cc27e982e1413522f21efeb5632004b1b911cbbdfc8a53af561ee4ad92R53, and then truncate the result only if truncation is really necessary?

Good point. We now use only the first 8 chars of the md5sum of the raw hash value and the deps sha1.

Latest test run: https://github.com/YanxuanLiu/spark-rapids/actions/runs/13558189224/job/37896469348?pr=26#step:4:51

Signed-off-by: Yanxuan Liu <[email protected]>
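
For readers following the thread, here is a minimal sketch of the approach described in the reply above: hash the pom hash and the deps sha1 together with md5sum, then keep the first 8 characters. This is not the code from the PR. The function name `short_digest`, the `POM_HASH`, `DEPS_SHA1` and `BASE_REF` variables, and their placeholder values are assumptions made for illustration; in the workflow the inputs would come from `hashFiles('**/pom.xml')` and `get-deps-sha1.sh`. It also assumes GNU `md5sum` is available, as on a Linux runner.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins for the real workflow inputs (illustration only).
pomHash="${POM_HASH:-placeholder-pom-hash}"
depsSHA1="${DEPS_SHA1:-placeholder-deps-sha1}"
runnerOs="${RUNNER_OS:-Linux}"
baseRef="${BASE_REF:-placeholder-base-ref}"

# Join all arguments with spaces, hash them with md5sum,
# and keep only the first 8 hex characters of the digest.
short_digest() {
  printf '%s' "$*" | md5sum | cut -c1-8
}

keyDigest=$(short_digest "${pomHash}" "${depsSHA1}")
cacheKey="${runnerOs}-maven-${baseRef}-${keyDigest}"
echo "${cacheKey}"
```

Because every input goes through one digest before truncation, a change in either the pom hash or the deps sha1 changes the short suffix, while the key length stays fixed regardless of how many inputs are added later.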

YanxuanLiu · 2025-02-27T03:25:50Z

> Help reviewers understand the need for this change. Did we hit some error on gh? Can you file an issue with the description and reference it in your PR?

It is to prevent the cache key from becoming too long in the future, in which case it may be truncated by GitHub, leading to a conflict.

gerashegalov · 2025-02-27T05:46:21Z

> > Help reviewers understand the need for this change. Did we hit some error on gh? Can you file an issue with the description and reference it in your PR?
>
> It is to prevent the cache key from becoming too long in the future, in which case it may be truncated by GitHub, leading to a conflict.

Where is GH key truncation documented? Or is it something you have seen in the action logs and could provide a link?

Provide a rationale for why manual key truncation is better than key truncation done by GH.


pxLi · 2025-02-27T05:56:27Z

> Where is GH key truncation documented? Or is it something you have seen in the action logs and could provide a link?

There is no actual issue occurring. This is a proactive preventative measure, similar to how Kubernetes would truncate excessively long labels.

> Provide a rationale for why manual key truncation is better than key truncation done by GH

It provides deterministic behavior where we control the key truncation ourselves, rather than encountering unexpected failures similar to k8s pod naming issues I mentioned above. In k8s, pod label segments must be 63 characters or less, but the platform silently accepts longer names and truncates them without error notification. This hidden truncation can cause different workloads to be inadvertently scheduled to the same instance. Similarly, with our cache_key implementation, we're implementing controlled truncation to ensure we have predictable, deterministic short keys that avoid these hidden collision problems.

YanxuanLiu · 2025-02-27T06:01:56Z

> > > Help reviewers understand the need for this change. Did we hit some error on gh? Can you file an issue with the description and reference it in your PR?
> >
> > It is to prevent the cache key from becoming too long in the future, in which case it may be truncated by GitHub, leading to a conflict.
>
> Where is GH key truncation documented? Or is it something you have seen in the action logs and could provide a link?
>
> Provide a rationale for why manual key truncation is better than key truncation done by GH

According to the official GH documentation, a cache key may be at most 512 characters long; a longer key causes the action to fail.
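
As a rough illustration of the limit mentioned above, a workflow step could check the key length explicitly before handing it to the cache action. This is only a sketch and not part of the PR. The function name `check_cache_key` and the `MAX_KEY_LEN` variable are hypothetical, and the 512 figure is taken from the comment above rather than verified here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Limit as cited in the comment above (assumption: 512 characters).
MAX_KEY_LEN=512

# Return 0 if the key fits within the limit, 1 otherwise.
# Hypothetical helper, shown for illustration only.
check_cache_key() {
  local key="$1"
  if [ "${#key}" -gt "${MAX_KEY_LEN}" ]; then
    echo "cache key is ${#key} chars, limit is ${MAX_KEY_LEN}" >&2
    return 1
  fi
  return 0
}

check_cache_key "Linux-maven-placeholder-base-ref-0123abcd"
```

A guard like this fails loudly at a known point, which is the same deterministic behavior the explicit truncation is meant to provide.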

YanxuanLiu added 3 commits February 26, 2025 22:41

intercept sha1 first 8 chars

ea188db

Signed-off-by: Yanxuan Liu <[email protected]>

intercept first 8 of hash and md5

1f9ff07

Signed-off-by: Yanxuan Liu <[email protected]>

213

3baed7d

Signed-off-by: Yanxuan Liu <[email protected]>

YanxuanLiu requested a review from a team as a code owner February 26, 2025 16:11

update license header

dca153f

Signed-off-by: Yanxuan Liu <[email protected]>

gerashegalov reviewed Feb 26, 2025


gerashegalov assigned YanxuanLiu Feb 26, 2025

gerashegalov changed the title ~~Simplify mvn verify cache key~~ Simplify mvn verify cache key [skip ci] Feb 26, 2025

gerashegalov added the build Related to CI / CD or cleanly building label Feb 26, 2025

YanxuanLiu added 2 commits February 27, 2025 11:17

md5sum for pom hash and deps

63ea3bf

Signed-off-by: Yanxuan Liu <[email protected]>

optimize

39439ff

Signed-off-by: Yanxuan Liu <[email protected]>

YanxuanLiu requested review from pxLi, yinqingh and NvTimLiu February 27, 2025 04:29

correct cache key

41cf75a

Signed-off-by: Yanxuan Liu <[email protected]>
