Hello,

I know it is very little memory, but it is what I have for now.

By default, the demo code won't run inference because of a CUDA out-of-memory error. I tried reducing the inference batch size to 1, but that is not enough.

Do you know a way to reduce memory consumption when running inference?

I know the best solution is to upgrade the GPU to an RTX 3090/4090/A6000, but before that I would like to try another approach if possible.
Thank you!
David Martin Rius
@deepanwayx I suppose full inference precision is 32-bit, correct? If so, did you run any tests to check whether 16-bit would still deliver acceptable results?