add xpu support #396
I ran the unit tests on XPU but got "Segmentation fault (core dumped)" in one test. It is under investigation.
Which specific Intel GPU did you test on? Also, I just added xpu support to simpo_loss, which was added later and still had "cuda" hard-coded.
I use an "Intel Data Center GPU Max 1550", and I tested your latest code. All tests pass except `pytest -rA test/transformers/test_rms_norm.py::test_correctness[True-BaseRMSNorm-0.0-none-dtype1-0.2-0.02-2-128-512]`, but that one is a known issue and got fixed in the latest
I don't have this issue. It could be because I use nightly
Thanks for the update. This PR looks good to me.
Looks good to me! @mgrabban I just invited you as a collaborator on this repo; can you check your email? After accepting, can you create a new branch in the main repo and open a new PR based on that branch? Our CI currently has issues, so PRs from external contributors cannot run CI. Thanks in advance!
This is done now. See #407.
## Summary

Replica of #396

Adds xpu support so all tests, benchmarks etc. run on XPUs or Intel GPUs.

## Details

The `infer_device()` function is moved to a separate file. In any file where "cuda" was previously needed, `infer_device` is imported and "cuda" is replaced with the return value of a call to `infer_device()`.

## Testing Done

A100 80GB PCIe, RTX 3060, Intel Data Center GPU Max 1550

- Hardware Type: <BLANK>
- [x] run `make test` to ensure correctness
- [ ] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence

---------

Co-authored-by: Shao Tang <[email protected]>
## Summary

Adds xpu support so all tests, benchmarks etc. run on XPUs or Intel GPUs.

## Details

The `infer_device()` function is moved to a separate file. In any file where "cuda" was previously needed, `infer_device` is imported and "cuda" is replaced with the return value of a call to `infer_device()`.
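For readers who have not seen the helper, here is a minimal sketch of what a device-inference function of this kind might look like. It is an illustration, not the code from this PR: the CUDA-before-XPU priority, the "cpu" fallback, the optional `torch_module` parameter, and the `_is_available` helper are all assumptions made so the example can run without a GPU or even without PyTorch installed.

```python
from types import SimpleNamespace


def _is_available(torch_module, backend: str) -> bool:
    """Return True if torch_module.<backend>.is_available() exists and reports True."""
    # Hypothetical helper: getattr guards against builds that lack the backend module.
    backend_module = getattr(torch_module, backend, None)
    is_available = getattr(backend_module, "is_available", None)
    return bool(is_available()) if callable(is_available) else False


def infer_device(torch_module=None) -> str:
    """Pick a device string: "cuda" first, then "xpu", else "cpu" (assumed order)."""
    if torch_module is None:
        try:
            import torch as torch_module  # real PyTorch, if it is installed
        except ImportError:
            return "cpu"
    for backend in ("cuda", "xpu"):
        if _is_available(torch_module, backend):
            return backend
    return "cpu"


# Stand-in for a PyTorch build that only exposes a working XPU backend.
fake_torch = SimpleNamespace(
    cuda=SimpleNamespace(is_available=lambda: False),
    xpu=SimpleNamespace(is_available=lambda: True),
)
print(infer_device(fake_torch))  # xpu
```

With a helper like this, call sites that hard-coded `device="cuda"` would instead compute `device = infer_device()` once and pass that string wherever a device is required.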
## Testing Done

A100 80GB PCIe, RTX 3060, Intel Data Center GPU Max 1550

- `make test` to ensure correctness
- `make checkstyle` to ensure code style
- `make test-convergence` to ensure convergence