Inference Problem #42
python interleaved_generation.py -i 'Please introduce the city of Gyumri with pictures.' -s "./test/"

It was running for ages, so I stopped it.

I can't generate anything with your example. What did I do wrong?
Thank you for your interest! Inference on Anole-7b requires at least 20GB of GPU memory, so this is likely a memory issue. Would you mind trying another GPU with more memory? Thanks!
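As a rough sanity check of that 20GB figure, here is some back-of-envelope arithmetic (an assumption on my part: half-precision weights at 2 bytes per parameter; the real footprint is larger because activations and the KV cache also live in VRAM):

```python
# Back-of-envelope VRAM estimate for a 7B-parameter model.
# Assumes fp16/bf16 weights (2 bytes/param); activations and the
# KV cache add several more GB on top of this, which is why the
# stated requirement (~20GB) exceeds the weights-only figure.
params = 7e9                 # 7 billion parameters
bytes_per_param = 2          # fp16 / bf16
weights_gb = params * bytes_per_param / 1024**3
print(round(weights_gb, 1))  # -> 13.0 (GB for the weights alone)
```

This is why a 24GB card sits right at the edge: the weights fit, but generation-time overhead can push total usage past the limit.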
Can I not perform model inference with a single 3090 (24GB)?
Hi, quantization might help: https://github.com/GAIR-NLP/anole/pull/21/files
I reinstalled the environment and ran it according to the steps, but this still happens.
I have been installing according to your steps, but I keep running into problems, which is very frustrating.
Use Python 3.10, or change the annotations to the form `rank: Union[int, None] = None`. I recommend using Python 3.10; otherwise I found a lot of places that need to be changed.
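For context, the comment above refers to PEP 604 union syntax (`int | None`), which is only valid in annotations on Python 3.10+. A minimal sketch of the two equivalent spellings (the function name here is illustrative, not from the Anole codebase):

```python
# `rank: int | None = None` raises a TypeError at definition time on
# Python 3.9 and earlier. The Union spelling below works on 3.7+,
# and both forms are valid on 3.10+.
from typing import Union

def set_rank(rank: Union[int, None] = None) -> int:
    # Equivalent on 3.10+: def set_rank(rank: int | None = None) -> int:
    return 0 if rank is None else rank

print(set_rank())   # -> 0
print(set_rank(3))  # -> 3
```

So either upgrade the interpreter to 3.10, or rewrite every `X | None` annotation into `Union[X, None]` / `Optional[X]`.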
After I use the quantization function, the program still raises OutOfMemoryError.

My device is also an RTX 3090, and I don't know how to solve this problem. If you can help me solve it, I would be extremely grateful.
Same device as yours. This issue occurs during model initialization (`unquantized_model = ChameleonInferenceModel()`), which is why quantization has not taken effect: the full-precision model is materialized before any quantization can run.
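The ordering matters because quantizing after a full-precision load does not lower the peak allocation. A toy stdlib-only illustration of that point (the buffer names here are illustrative, not the Anole API; real loaders avoid the peak by writing weights directly into the low-precision buffer):

```python
# Toy illustration: a quantized copy is 4x smaller, but if the fp32
# buffer is allocated first (as in post-load quantization), peak
# memory is still dominated by the fp32 copy. Loading directly into
# the low-precision representation is what avoids the OOM.
from array import array

n = 1_000_000                             # pretend "parameters"
fp32 = array('f', [0.0] * n)              # full-precision load: 4 bytes each
int8 = array('b', bytes(n))               # quantized copy: 1 byte each

fp32_bytes = fp32.itemsize * len(fp32)
int8_bytes = int8.itemsize * len(int8)
print(fp32_bytes // int8_bytes)           # -> 4
```

In other words, the quantization step has to be applied inside (or before) model initialization, not after `ChameleonInferenceModel()` has already built the unquantized model.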
Why does it take so long to infer just two pictures?