Inference speed worse on AMD CPU than on Intel CPU #119
Unanswered
CrazyChildren asked this question in Q&A

I tested Chronos with an Intel Core CPU (a Mac Pro), Linux with an Intel CPU (a server), and Linux with an AMD CPU (a server), running the same code on all three. Inference on the AMD CPU seems to be ~30x slower.
On the Intel CPUs it takes approximately 0.7 s with batch_num = 1, predict_len = 1, and context_len = 70; on the AMD CPU the same call takes about 30 s.
I don't know whether this is specific to my setup, but I found a report that turning on AMP on an AMD CPU by autocasting to bfloat16 degrades performance: "Bfloat16 CPU inference speed is too slow on AMD cpu".
I'm quite a newbie with torch, so if someone finds a solution, please post it here. Thanks.
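As a side check, independent of Chronos: one way to see whether bf16 autocast itself is the slow path on a given CPU is to time a plain matmul with and without CPU bf16 autocast. The sketch below is only illustrative; the matrix size and iteration count are arbitrary, and torch.autocast(device_type="cpu", ...) needs a reasonably recent PyTorch.

```python
import contextlib
import time

import torch

def avg_matmul_time(ctx, n=1024, iters=20):
    # Average wall-clock time of a square matmul, optionally under autocast.
    a, b = torch.randn(n, n), torch.randn(n, n)
    with ctx:
        for _ in range(3):  # warm-up so one-time costs don't skew the timing
            _ = a @ b
        start = time.perf_counter()
        for _ in range(iters):
            _ = a @ b
        elapsed = time.perf_counter() - start
    return elapsed / iters

fp32 = avg_matmul_time(contextlib.nullcontext())
bf16 = avg_matmul_time(torch.autocast(device_type="cpu", dtype=torch.bfloat16))
print(f"fp32: {fp32 * 1e3:.1f} ms/iter, bf16 autocast: {bf16 * 1e3:.1f} ms/iter")
```

A large gap in favor of fp32 here would point at the same bf16 slowdown described above rather than at anything Chronos-specific.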
Replies: 1 comment
-
@CrazyChildren one quick check to verify if this is indeed due to bf16 (which is the likely case) is to load the model in fp32. Here's the relevant code:

```python
import pandas as pd  # requires: pip install pandas
import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",
    device_map="cuda",  # use "cpu" for CPU inference and "mps" for Apple Silicon
    torch_dtype=torch.float32,
)
```
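To actually run the check on the slow machine, one could load the pipeline with device_map="cpu" as noted in the comment above and time a single forecast, once with the default load and once with torch_dtype=torch.float32. The snippet below is a rough sketch: the context series is random stand-in data, and it assumes the usual ChronosPipeline.predict(context, prediction_length) call from the Chronos README.

```python
import time

import torch

context = torch.randn(70)  # stand-in for a real series with context_len = 70

start = time.perf_counter()
forecast = pipeline.predict(context, prediction_length=1)
print(f"predict took {time.perf_counter() - start:.2f} s, "
      f"forecast shape {tuple(forecast.shape)}")  # [batch, num_samples, prediction_length]
```

If the fp32 load brings the AMD timing back in line with the Intel numbers, that confirms the bf16 hypothesis.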