Is there a distributed constraint here, or is this due to a limitation of the CLI? It feels strange that the user has to convert a torch tensor to a numpy array and save it to disk as an .npy file in order to pass it to ModelRunner.generate, only for the generator to load the .npy file from disk and convert it back to a torch tensor again.
- torch tensor saved to disk: https://github.com/NVIDIA/TensorRT-LLM/blob/main/examples/multimodal/run.py#L170
- converted back to a torch tensor: https://github.com/NVIDIA/TensorRT-LLM/blob/main/tensorrt_llm/runtime/model_runner.py#L272
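For context, here is a minimal numpy-only sketch of the round-trip being described (in the actual example the array would come from `tensor.cpu().numpy()` on the run.py side and be restored with `torch.from_numpy(...)` inside the model runner; the shape below is just a stand-in):

```python
import os
import tempfile

import numpy as np

# Stand-in for the visual-feature tensor produced in run.py.
# In the real flow this would be tensor.cpu().numpy().
features = np.random.rand(1, 576, 4096).astype(np.float32)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "visual_features.npy")

    # run.py side: array serialized to an .npy file on disk.
    np.save(path, features)

    # model_runner.py side: the .npy file is read back into an array
    # (and would then be wrapped with torch.from_numpy).
    restored = np.load(path)

# The disk round-trip is lossless, but it adds a serialize/deserialize
# hop that an in-memory tensor hand-off would avoid.
assert np.array_equal(features, restored)
```

This illustrates why the pattern feels redundant when both sides of the hand-off live in the same Python process.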