Support GLIBCXX_3.4.20 in aarch64 pytorch native #2919

Closed
rahulsmit opened this issue Jan 5, 2024 · 11 comments
Labels
enhancement New feature or request

Comments

@rahulsmit

Description

Currently, the aarch64 PyTorch native dependency downloads libstdc++.so.6 from this URL:

https://publish.djl.ai/extra/aarch64/libstdc%2B%2B.so.6

That file only supports versions up to GLIBCXX_3.4.19. Although using a BOM is recommended for PyTorch native, the aarch64 native library doesn't support GLIBCXX_3.4.20.

This breaks the OpenSearch k-NN nmslib functionality, which requires GLIBCXX_3.4.20.
Ref: https://github.com/opensearch-project/k-NN/tree/main/jni

Even when LD_LIBRARY_PATH is set, libstdc++.so.6 is intermittently resolved from the DJL cache instead, which throws the exception below:

[ERROR][o.o.b.OpenSearchUncaughtExceptionHandler] [opensearch-data-0] fatal error in thread [opensearch[opensearch-data-0][refresh][T#2]], exiting java.lang.UnsatisfiedLinkError: /usr/share/opensearch/plugins/opensearch-knn/lib/libopensearchknn_nmslib.so: /usr/share/opensearch/data/ml_cache/pytorch/1.13.1-20221220-cpu-precxx11-linux-aarch64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /usr/share/opensearch/plugins/opensearch-knn/lib/libopensearchknn_nmslib.so)
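
To check which GLIBCXX version tags a given libstdc++.so.6 carries, the file can be scanned for the tag strings, much like `strings libstdc++.so.6 | grep GLIBCXX`. The following is a rough sketch, not part of DJL: the class name `GlibcxxVersions` and its `scan` helper are hypothetical, and it only finds tag strings embedded in the file rather than parsing the ELF version tables.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a DJL class.
public final class GlibcxxVersions {

    // Matches tags such as GLIBCXX_3.4 or GLIBCXX_3.4.20.
    private static final Pattern TAG = Pattern.compile("GLIBCXX_\\d+(?:\\.\\d+)+");

    /** Returns the distinct GLIBCXX_x.y[.z] strings found in the file's raw bytes. */
    static Set<String> scan(Path lib) throws IOException {
        // ISO_8859_1 maps each byte to exactly one char, so binary data decodes losslessly.
        String raw = new String(Files.readAllBytes(lib), StandardCharsets.ISO_8859_1);
        Set<String> found = new TreeSet<>();
        Matcher m = TAG.matcher(raw);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of the libstdc++.so.6 to inspect as the first argument.
        Set<String> tags = scan(Paths.get(args[0]));
        tags.forEach(System.out::println);
        System.out.println("GLIBCXX_3.4.20 present: " + tags.contains("GLIBCXX_3.4.20"));
    }
}
```

Running it against the libstdc++.so.6 in the DJL cache directory and against the system copy should show whether GLIBCXX_3.4.20 is missing from one of them.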

Will this change the current api? How?
No
Who will benefit from this enhancement?
Developers who use the BOM with OpenSearch ml-commons and ultimately need to support OpenSearch k-NN nmslib

References

@rahulsmit rahulsmit added the enhancement New feature or request label Jan 5, 2024
@frankfliu
Contributor

A quick workaround on your side could be to manually load your libstdc++.so.6 before DJL loads the PyTorch engine.

@frankfliu
Contributor

I have updated libstdc++.so.6 to GLIBCXX_3.4.24

@rahulsmit
Author

@frankfliu Thank you so much. I will quickly rebuild and try it out.

I was unable to load libstdc++.so.6 manually because DJL loads the files into the cache automatically. It replaces the whole directory and starts using its own libstdc++.

@rahulsmit
Author

@frankfliu Quick follow-up question: does pytorch-native have a nightly build that would generate a new artifact with the updated libstdc++.so.6?
I am using BOM ai.djl:bom:0.21.0.

@frankfliu
Contributor

frankfliu commented Jan 5, 2024

You can load libstdc++.so.6 manually with System.load(PATH). Java won't load the same shared library twice, so the second load from DJL will be ignored and your version will be used.
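
A minimal sketch of this preload workaround, assuming the application controls its own startup code. The class name `LibStdCppPreloader` and the default path `/usr/lib64/libstdc++.so.6` are hypothetical placeholders; only `System.load` is the call named in this thread.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a DJL class.
public final class LibStdCppPreloader {

    /**
     * Loads the given shared library into the JVM process if the file exists.
     * Returns false, without loading anything, when the file is missing.
     */
    static boolean preload(Path lib) {
        if (!Files.isRegularFile(lib)) {
            return false;
        }
        // System.load requires an absolute path name.
        System.load(lib.toAbsolutePath().toString());
        return true;
    }

    public static void main(String[] args) {
        // Assumed location; pass the real path as the first argument.
        Path lib = Paths.get(args.length > 0 ? args[0] : "/usr/lib64/libstdc++.so.6");
        if (!preload(lib)) {
            throw new IllegalStateException("libstdc++ not found at " + lib);
        }
        // Initialize DJL only after this point, so that the PyTorch engine
        // is loaded once the newer libstdc++ is already in the process.
    }
}
```

The ordering is the important part: the preload has to run before anything triggers the DJL PyTorch engine.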

If you are downloading the PyTorch native library at runtime, it should just work. You don't need to rebuild your package. You only need to clean up your local DJL engine cache:

rm -rf ~/.djl.ai/pytorch
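
Where a shell is not available, the same cleanup could be done from Java. This is only a sketch: `DjlCacheCleaner` and `deleteTree` are hypothetical names, and the default directory simply mirrors the `~/.djl.ai/pytorch` path in the command above. A deployment may keep its cache elsewhere, as the `/usr/share/opensearch/data/ml_cache/pytorch` path in the error log suggests.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not a DJL class.
public final class DjlCacheCleaner {

    /** Recursively deletes dir; returns false if it did not exist. */
    static boolean deleteTree(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return false;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directory.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Default mirrors ~/.djl.ai/pytorch; pass another directory to override.
        Path cache = args.length > 0
                ? Paths.get(args[0])
                : Paths.get(System.getProperty("user.home"), ".djl.ai", "pytorch");
        System.out.println(deleteTree(cache) ? "Removed " + cache : "Nothing to remove at " + cache);
    }
}
```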

@rahulsmit
Author

rahulsmit commented Jan 5, 2024

Unfortunately, we don't have network connectivity at runtime, so the native libs are downloaded at compile time. We use the BOM in build.gradle:

implementation platform("ai.djl:bom:0.21.0")

and refer to PyTorch native as shown here:

implementation group: 'ai.djl.pytorch', name: "pytorch-native-${project.pytorchFlavor}", classifier: "${project.platformClassifier}" 

I am assuming that the jar release for 0.21.0 BOM will not be automatically updated and we will need to rebuild ourselves.

@frankfliu
Contributor

The Maven release cannot be updated. You could download the jar, manually update the file, and re-jar it, but I don't recommend that. The easiest way to work around this issue is what I suggested before: manually load your version of the libstdc++.so.6 file.

@rahulsmit
Author

Hi @frankfliu, explicitly loading my libstdc++.so.6 before DJL does fixed the issue. Thank you for the idea.

I was curious: was there any specific reason that there was a different libstdc++.so.6 file for aarch64, and that it didn't have support beyond GLIBCXX_3.4.19?

Should we be cautious about using our own libstdc++.so.6?

@frankfliu
Contributor

The version we bundled was chosen to support centos7. Since most aarch64 users use amazonlinux:2, we can bump the GLIBCXX version to 3.4.24, but that also means aarch64 cannot support centos7 any more.

For x86_64, we will continue to use the lower version of libstdc++.so.6.

@rahulsmit
Author

Thanks for your help throughout.

@frankfliu
Contributor

frankfliu commented Jan 9, 2024

@rahulsmit

I created a PR so you can leverage DJL to load your libstdc++.so.6 file: #2929

2 participants