installation #4

Open
bakachan19 opened this issue Oct 11, 2024 · 4 comments

Comments

@bakachan19

Hi, sorry to bother you.
Would it be possible to provide a container file (e.g. an Apptainer definition) with the installation?
I've been struggling with the installation for the past 2 days and still cannot make it work. I keep running into library compatibility issues, CUDA problems, etc.

Thank you!

@zhang-ziang
Collaborator

@bakachan19 Thank you for your interest in our work. Since it builds on a lot of previous work, configuring the working environment can indeed be somewhat complex. Could you please share more details about the issues you're encountering?

@bakachan19
Author

bakachan19 commented Oct 14, 2024

Dear @zhang-ziang,

Thank you for getting back to me.
Initially I had trouble installing the packages related to KNN_CUDA and ran into problems with ninja and the CUDA runtime (I was using conda). So I decided to build a container (PyTorch and CUDA 12) to have more control over the environment. During the process, I bumped into several package compatibility issues. The requirements specify protobuf==4.23.4, but this caused conflicts, so I had to switch to protobuf 3.20.0 (hopefully this will not impact the performance of the models). I also had to install braceexpand, wget, webdataset, and loguru because the module imports were failing. In addition, I had to set the torch version to 2.1.2 instead of 2.2.2, which is the version listed in the requirements. I used pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 xformers --index-url https://download.pytorch.org/whl/cu121. I installed xformers because I saw that one of the models requires it (although I'm not sure it is actually needed).

So, after some trials I managed to use OmniBind_Base and OmniBind_Large to extract embeddings for text and images. However, I am getting errors when I try to use point clouds.
Moreover, I am not able to use OmniBind_Full because I run out of GPU memory (I used an H100 with 80 GB and it was still not enough). I tried putting OmniBind_Full on the CPU, but I get an error: "NotImplementedError: Please define 'emb_audios' method". It seems the model is not initialized properly in experts.py or somewhere similar. Any idea how to fix this?

Also, the inference instructions in the README are not working, so I used the code in inference.py to perform inference. Furthermore, it would be very useful if you mentioned which operating system to use: when building the container, I tested Ubuntu 20.04 first because the NVIDIA container for it comes with Python 3.8 and PyTorch+CUDA, but then I had to switch to Ubuntu 22.04 and install Python 3.8 there. Knowing the GPU requirements for running inference with these methods would also be really beneficial.

Sorry for the long post and thank you for your help!

@zhang-ziang
Collaborator

The torch version is probably not important here, because OmniBind only uses the most basic operations. The protobuf version does need some switching when configuring the experiment environment, but it does not affect the performance of the model.
The Uni3D environment used in OmniBind may differ from those of the other encoders, so some conflicts are indeed possible. Please tell me more about the error messages, and maybe I can provide some suggestions.
For the GPU memory issue, you can try putting different encoders on different GPUs to spread the load evenly. Because some open-source models are implemented with packages such as xformers, they don't work well on CPUs and can only run on a GPU.
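
One way to picture the suggestion of spreading encoders over several GPUs is a small placement planner. This is only a sketch under stated assumptions: the encoder names and memory figures below are hypothetical placeholders, not measurements of the OmniBind encoders, and `plan_placement` is an illustrative helper that is not part of the repository.

```python
from typing import Dict, List


def plan_placement(encoder_sizes_gb: Dict[str, float], num_gpus: int) -> Dict[str, str]:
    """Greedily assign each encoder to the currently least-loaded GPU."""
    if num_gpus < 1:
        raise ValueError("need at least one GPU")
    load: List[float] = [0.0] * num_gpus
    placement: Dict[str, str] = {}
    # Place the largest encoders first so the final loads stay balanced.
    ordered = sorted(encoder_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        gpu = min(range(num_gpus), key=lambda i: load[i])
        load[gpu] += size
        placement[name] = f"cuda:{gpu}"
    return placement


# Hypothetical encoder names and memory footprints, for illustration only.
example_sizes = {
    "image_encoder": 20.0,
    "audio_encoder": 10.0,
    "text_encoder": 8.0,
    "point_encoder": 6.0,
}
placement = plan_placement(example_sizes, num_gpus=2)
for name, device in placement.items():
    print(f"{name} -> {device}")
```

In PyTorch, each encoder module would then be moved with its `.to(device)` method, and its inputs would need to be on the same device before the forward pass. How the encoders are exposed inside OmniBind is not shown in this thread, so that wiring is left out here.
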
Our system, Python, and CUDA versions: Ubuntu 22.04, Python 3.10, CUDA 11.8.
Please let me know if you have more details about the error. :)
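
When reporting environment details like the ones requested above, a short script can gather the relevant versions in one go. The following is a minimal sketch using only the Python standard library; the package list is taken from the packages named in this thread, and `collect_versions` is a hypothetical helper, not something shipped with the project.

```python
import platform
import sys
from importlib import metadata

# Packages mentioned in this thread; extend the list as needed.
PACKAGES = ["torch", "torchvision", "torchaudio", "xformers", "protobuf"]


def collect_versions(packages):
    """Return a dict with the Python/OS info and each package's installed version."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


report = collect_versions(PACKAGES)
for key, value in report.items():
    print(f"{key}: {value}")
```

If torch is installed, `torch.version.cuda` additionally reports the CUDA version that the PyTorch build was compiled against, which is useful to compare with the CUDA 11.8 setup mentioned above.
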

@zhang-ziang
Collaborator

@bakachan19 The error "NotImplementedError: Please define 'emb_audios' method" that you mentioned is fixed in the latest version. It was caused by a typo in the OmniBind Full config in omni_utils.py. Thank you for pointing it out. :)
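
For readers wondering how a config typo can surface as a NotImplementedError, here is a generic Python illustration. It is not the actual code in omni_utils.py or experts.py: all class names, the config dictionaries, and `embed_audio` are hypothetical, and the real typo may have looked quite different. The sketch only shows the general pattern where a base class raises for every method a subclass is expected to override, so selecting the wrong class through a config entry produces this kind of message.

```python
class BaseExpert:
    """Hypothetical base class: modality methods must be overridden by subclasses."""

    def emb_audios(self, audios):
        raise NotImplementedError("Please define 'emb_audios' method")


class AudioExpert(BaseExpert):
    def emb_audios(self, audios):
        # Placeholder "embedding": one number per input, for illustration only.
        return [len(a) for a in audios]


class ImageExpert(BaseExpert):
    def emb_images(self, images):
        return [len(i) for i in images]


# Hypothetical configs: the first maps the audio modality to the wrong class.
CONFIG_WITH_TYPO = {"audio": ImageExpert}
CONFIG_FIXED = {"audio": AudioExpert}


def embed_audio(config, audios):
    """Instantiate the expert registered for audio and embed the inputs."""
    return config["audio"]().emb_audios(audios)


try:
    embed_audio(CONFIG_WITH_TYPO, ["abc"])
except NotImplementedError as exc:
    error_message = str(exc)

fixed_result = embed_audio(CONFIG_FIXED, ["abc"])
print(error_message)
print(fixed_result)
```
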
