Add CPU device-id support #150

Open · wants to merge 1 commit into base: main
Conversation

Tradunsky:

Thank you for an easy-to-use CLI ❤️

Currently, if the library runs on a CPU-only machine, it fails with the following error:

```
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
```

It makes sense to have the library optimised for GPU 👍🏻 However, do you think a CPU option could also be valuable for debugging locally?
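
(As an aside, a minimal fallback sketch assuming plain PyTorch; this is just an illustration of avoiding the missing-driver crash, not part of this PR:)

```python
import torch

# Use the GPU when a CUDA driver is present, otherwise fall back to CPU
# instead of raising the "Found no NVIDIA driver" error above.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
```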

```diff
@@ -91,11 +91,13 @@
 def main():
     args = parser.parse_args()

+    dtype = torch.float32 if args.device_id == "cpu" else torch.float16
```
Tradunsky (Author):
Added this because otherwise it fails on an i5 CPU:

```
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.HalfTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
```
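
For context, a minimal sketch of what the device and dtype selection implies end to end; the `pipeline` call and flag wiring here are assumptions based on the snippet above, not the project's exact code:

```python
import argparse

import torch
from transformers import pipeline

parser = argparse.ArgumentParser()
# Mirrors the flag in the diff: "cpu", or a CUDA device index such as "0".
parser.add_argument("--device-id", dest="device_id", default="0")
args = parser.parse_args()

# Half precision is only practical on GPU; CPU kernels expect float32,
# otherwise torch raises the FloatTensor/HalfTensor mismatch shown above.
dtype = torch.float32 if args.device_id == "cpu" else torch.float16
device = "cpu" if args.device_id == "cpu" else f"cuda:{args.device_id}"

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v2",
    torch_dtype=dtype,
    device=device,
)
```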

@Vaibhavs10 (Owner):

Hi @Tradunsky - Sorry for the delayed update. I've been quite sick and a lil busy with work in the past few weeks.

This is a lovely addition; however, it would be incredibly slow for any audio above a minute. A more fool-proof way would be to use something like an ONNX pipeline (https://huggingface.co/Intel/whisper-large-v2-onnx-int4).

This would require a bit of rework but should still be doable; would you like to update your PR to add support for it?
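
For reference, a rough sketch of one possible ONNX Runtime path via Hugging Face Optimum; the model id and on-the-fly export here are illustrative assumptions, and the linked int4 Intel model may need its own loading steps:

```python
from transformers import AutoProcessor, pipeline
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

model_id = "openai/whisper-large-v2"
processor = AutoProcessor.from_pretrained(model_id)
# export=True converts the PyTorch checkpoint to ONNX at load time.
model = ORTModelForSpeechSeq2Seq.from_pretrained(model_id, export=True)

# ONNX Runtime models plug into the regular transformers pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
)
print(asr("sample.wav")["text"])
```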

@Tradunsky (Author):

Hi @Vaibhavs10,
Hope you feel better now 🙏🏻

That's a great suggestion! I was also thinking of using Intel Neural Compressor to speed this up on Intel CPUs 🤔

One potential drawback is that users would have to switch between their local debugging models and their deployed models, and risk running into model performance differences between the two.

Anyway, I would love to contribute when I have time 🙏🏻
