add NPU support for huggingface.py #1787
Conversation
@haileyschoelkopf @lintangsutawika would you mind having a look?
Force-pushed from 06d4b13 to 0f4fe94
Hi @jiaqiw09!

> - using accelerate to do evaluation on multiple cards
>   The major change is just replacing `f"cuda:{accelerator.local_process_index}"` with `f"{accelerator.device}"`; it does the same thing, and it may help adapt to more devices later if accelerate supports them.
This is a good change. For the rest though:
- could we rename `device_counts` either back to `gpus` or to `device_count`?
- we should be able to handle `"npu:{i}"`, as @LSinev mentions (sketched below).
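A minimal sketch of one way the harness could accept `npu:{i}` strings, assuming the Ascend `torch_npu` plugin (which registers the `torch.npu` namespace) is available; `device_list` mirrors the existing cuda/mps handling, but the exact names here are illustrative, not this PR's diff:

```python
import torch

# Guarded import: torch.npu only exists once the Ascend plugin is loaded,
# so this must not break installs that lack torch_npu.
try:
    import torch_npu  # noqa: F401  (assumption: registers torch.npu)

    npu_list = ["npu"] + [f"npu:{i}" for i in range(torch.npu.device_count())]
except ImportError:
    npu_list = []

device_list = (
    ["cuda", "cpu"]
    + [f"cuda:{i}" for i in range(torch.cuda.device_count())]
    + ["mps", "mps:0"]
    + npu_list
)
```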
Before any merge of this we'd want to be sure we can 1) make certain that this code isn't going to break any existing integrations, and 2) keep this logic minimally invasive -- currently it seems to rely on Accelerate and torch NPU support, which I presume is recent. I don't want these to make the requirements for installing the library more unwieldy.
Actually, would you be able to point me to where NPU support lives in torch, or the easiest path to installing it? My understanding is that as of #1470 this support was not native, in which case I'd be wary of merging code that requires `torch.npu`, if this is still the case.
@haileyschoelkopf @lintangsutawika thanks for your suggestions, and thanks @statelesshz for helping fix the code. I have just tested the code on NPU and GPU; all three methods work. Would you mind having a look again? Best
what this PR does
issue: #1797
This PR adds NPU support for huggingface.py. It just makes some fixes to the existing code to support NPU devices.
what part to fix
Currently, the class `HFLM` supports three different ways to do evaluation: a single card (e.g. `cuda:0`), accelerate on multiple cards, and `device_map = 'auto'`.
how to fix and why
Here is an explanation of my code:
- using a single card (just set `cuda:0`)
  Simply add `"npu"` and `"npu:0"` to `device_list`, the same way `mps` is handled. If users want to use a different card, they can export `ASCEND_RT_VISIBLE_DEVICES=1` (or 2, 3, ...) and set `--device npu` to run the task (see the first sketch after this list).
- using accelerate to do evaluation on multiple cards
  The major change is just replacing `f"cuda:{accelerator.local_process_index}"` with `f"{accelerator.device}"`; it does the same thing, and it may help adapt to more devices later if accelerate supports them (second sketch below).
- using `device_map = 'auto'` to do evaluation on multiple cards
  For `device_map = 'auto'`, things are a little different. If people want to use `device_map = 'auto'` on NPUs, the card info should be set first (third sketch below). I do this because of this issue: I met the same problem on NPUs, and the problem could be solved by setting a specific card, so I think it's better to just set the device.
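For the first method, the description doesn't include a snippet, so here is a minimal sketch of single-card usage, assuming the Ascend `torch_npu` plugin is installed; the card index is illustrative:

```python
import os

# Choose which physical card is visible to this process; this must be set
# before torch_npu initializes (it plays the role CUDA_VISIBLE_DEVICES
# plays for GPUs). The index "1" is illustrative.
os.environ["ASCEND_RT_VISIBLE_DEVICES"] = "1"

import torch
import torch_npu  # assumption: Ascend plugin that registers torch.npu

device = torch.device("npu")         # resolves to the single visible card
x = torch.ones(2, 2, device=device)
print(x.device)                      # index is relative to the visible set
```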
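For the second method, the one-line change the description refers to can be illustrated as follows (a sketch, not the PR's exact diff; `accelerator.device` is the device Accelerate assigns to the current process):

```python
from accelerate import Accelerator

accelerator = Accelerator()

# Before: the per-process device string was hard-wired to CUDA.
# device = f"cuda:{accelerator.local_process_index}"

# After: ask Accelerate which device it assigned to this process. On CUDA
# this still yields "cuda:<local rank>"; on other backends (e.g. Ascend NPU,
# if the installed accelerate supports it) it yields the matching device.
device = f"{accelerator.device}"
```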
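For the third method, the snippet referenced by "the card info should be set" did not survive in this copy; a minimal reconstruction, assuming `torch_npu` exposes `torch.npu.set_device` (mirroring `torch.cuda.set_device`) and using an illustrative checkpoint, might look like:

```python
import torch
import torch_npu  # assumption: Ascend plugin that registers torch.npu
from transformers import AutoModelForCausalLM

# Pin this process to a specific card before loading. Without this, loading
# with device_map="auto" hit the issue referenced above on NPUs; setting the
# device explicitly resolved it.
torch.npu.set_device("npu:0")

model = AutoModelForCausalLM.from_pretrained(
    "gpt2",             # illustrative checkpoint
    device_map="auto",  # shard the model across the visible NPUs
)
```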