Does Triton Inference Server do model loading optimization? #5984
Unanswered
sfc-gh-zhwang
asked this question in Q&A
Replies: 0
When loading an ONNX/PyTorch/FasterTransformer model, does Triton first read the model from disk into CPU memory and then copy it to the GPU, or does it load the model directly into GPU memory?
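
One way to investigate this empirically is to watch GPU memory while Triton loads a model on demand. Below is a minimal sketch, assuming a Triton server running locally with `--model-control-mode=explicit` (a real tritonserver flag that lets clients trigger loads via the API) and a model named `my_onnx_model` in its repository; the model name is hypothetical, and the measurement only shows the end state of GPU memory, not the intermediate disk-to-CPU staging path.

```python
# Sketch: measure GPU memory before/after an explicit model load to see how much
# of the model ends up resident on the GPU. Assumes tritonserver was started with
# --model-control-mode=explicit and serves HTTP on localhost:8000.
import time

import pynvml                           # pip install nvidia-ml-py
import tritonclient.http as httpclient  # pip install tritonclient[http]

MODEL_NAME = "my_onnx_model"  # hypothetical model name; replace with yours


def gpu_mem_used_mib(gpu_index: int = 0) -> float:
    """Return currently used memory on one GPU, in MiB, via NVML."""
    handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
    return pynvml.nvmlDeviceGetMemoryInfo(handle).used / (1024 ** 2)


def main() -> None:
    pynvml.nvmlInit()
    client = httpclient.InferenceServerClient(url="localhost:8000")

    before = gpu_mem_used_mib()
    client.load_model(MODEL_NAME)  # ask Triton to load the model now

    # Poll until Triton reports the model as ready to serve.
    while not client.is_model_ready(MODEL_NAME):
        time.sleep(0.5)

    after = gpu_mem_used_mib()
    print(f"GPU memory used: {before:.0f} MiB -> {after:.0f} MiB "
          f"(delta {after - before:.0f} MiB)")

    pynvml.nvmlShutdown()


if __name__ == "__main__":
    main()
```

To see whether the model is staged through host memory first, one could additionally sample the tritonserver process's resident set size (e.g., with `psutil`) during the load and look for a transient spike in CPU memory before the GPU delta appears.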