
Does not seem to work on older CPUs #11

Open · qwertyuu opened this issue Mar 12, 2023 · 1 comment

@qwertyuu
Hello @MiscellaneousStuff ! Hope you are well.

I recently learned about whisper and was eager to try your CPU-based adaptation for speed increase.

However, not being too rich, I have an older server running a Pentium G3220.

When I went to try quantization on this CPU, the program quit without any message (even when running Python with -vvv).

If I use the model as-is, without weight quantization, the code runs fine (but slowly! ha).

I think a CPU supporting AVX2 is a requirement, since quantization works on my main PC (an i7-6700, which has AVX2). My friend also tried quantization on their server, which does not support AVX2, and it failed there too (it took a while to figure out why, so I thought I'd tell you).

So there you have it. Maybe there's a workaround for older CPUs? I did not find any note about AVX2 being required in the documentation for torch's quantize_dynamic function.
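For anyone else debugging this, here is a quick stdlib-only way to check whether the CPU advertises AVX2 before attempting quantization. This is my own sketch, not something from the project; it assumes a Linux machine, where the kernel exposes the `avx2` flag in `/proc/cpuinfo` (other OSes need other methods):

```python
# Check whether this CPU advertises AVX2 (Linux only: parses /proc/cpuinfo).
def has_avx2(cpuinfo_path="/proc/cpuinfo"):
    try:
        with open(cpuinfo_path) as f:
            for line in f:
                # The "flags" line lists all CPU feature flags.
                if line.startswith("flags"):
                    return "avx2" in line.split()
    except OSError:
        pass  # not Linux, or /proc unavailable
    return False

print("AVX2 supported:", has_avx2())
```

On a Pentium G3220 this should print `False`, on an i7-6700 `True`, matching the behavior described above.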

Thanks!

@MiscellaneousStuff (Owner) commented Mar 12, 2023

Hello, thanks for your interest in the project! This page, https://pytorch.org/docs/stable/quantization.html (section: Backend/Hardware Support), explains that dynamic quantization relies on "fbgemm", a library that accelerates vector operations on the CPU using AVX2 instructions. However, it also seems to indicate that even without AVX2 it should still work, just a bit more slowly. If you're deploying this on your server for a production use case, or are committed to making this work on your CPU-only server, TensorRT may be more appropriate.
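For reference, a minimal sketch of what dynamic quantization and backend selection look like in PyTorch. This is not the project's own code: the guarded import and the toy model are mine, and whether a non-fbgemm engine such as `qnnpack` is compiled into a given torch build (and works on an older x86 CPU) is an assumption you would have to verify:

```python
# Sketch: dynamically quantize a small model and inspect quantized engines.
# Import is guarded so the snippet degrades gracefully where torch is absent.
try:
    import torch
    import torch.nn as nn
    HAVE_TORCH = True
except ImportError:
    HAVE_TORCH = False

if HAVE_TORCH:
    # Engines compiled into this torch build; 'fbgemm' uses AVX2,
    # 'qnnpack' primarily targets ARM/mobile.
    print("Supported engines:", torch.backends.quantized.supported_engines)

    model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 4))
    # Dynamic quantization: weights stored as int8, activations quantized
    # on the fly. Only the nn.Linear layers are converted here.
    qmodel = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )
    print(qmodel(torch.randn(1, 16)).shape)
```

Switching `torch.backends.quantized.engine` to another supported entry before running the quantized model is the usual way to move off fbgemm, though whether that avoids the crash on a non-AVX2 CPU is untested here.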
