
Finetuning in 2:4 sparsity w4a16 example fails with multiple GPUs #911

Open
arunpatala opened this issue Nov 13, 2024 · 4 comments

arunpatala commented Nov 13, 2024

Thanks for this nice repo.

Describe the bug
Finetuning in 2:4 sparsity w4a16 example fails with multiple GPUs

Expected behavior
The finetuning step is expected to run successfully on multiple GPUs.

Environment
Four NVIDIA A10 GPUs on an AWS notebook instance.

To Reproduce
cd examples/quantization_2of4_sparse_w4a16
python llama7b_sparse_w4a16.py

Errors
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:3 and cuda:0!

Additional context

I have seen the example config at examples/finetuning/example_fsdp_config.yaml, but I am not sure how to use it for finetuning in the above example.
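
For what it's worth, here is a minimal sketch of how such an FSDP config is typically passed to a finetuning script via Hugging Face Accelerate; whether llama7b_sparse_w4a16.py supports being launched this way is an assumption, not something confirmed in this thread:

```bash
# Sketch only: assumes the example script can be driven by `accelerate launch`
# and that the repo is checked out locally; adjust paths as needed.
accelerate launch \
  --config_file examples/finetuning/example_fsdp_config.yaml \
  examples/quantization_2of4_sparse_w4a16/llama7b_sparse_w4a16.py
```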

Also, I would like to know whether there is any paper that discusses this approach.
Is the finetuning done on the whole model, and in what precision?
Will further finetuning of quantized models with LoRA adapters be supported?

Thanks
Arun

arunpatala added the bug label Nov 13, 2024
dsikka commented Nov 13, 2024

Hi @arunpatala, this is a known bug that we're tracking.

arunpatala commented:
thanks

dsikka self-assigned this Nov 25, 2024
dsikka commented Nov 26, 2024

Hi @arunpatala, do you mind trying transformers off of main to check if the error persists?
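
For reference, a minimal sketch of one way to install transformers from the main branch (assuming a pip-based environment; the exact revision to test is not specified in this thread):

```bash
# Install transformers directly from the main branch on GitHub
pip install --upgrade git+https://github.com/huggingface/transformers.git
```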

arunpatala commented:
I don't have access to a multi-GPU machine right now; I will test it the next time I have access. Thanks.
