-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SYCL: Migrate away from deprecated ggml_tensor->backend #10840
Conversation
Thanks for the PR. FYI most of us are on holiday so we may not be able to review until next month. If this is not urgent please give us some time to review it. |
ggml/src/ggml-sycl/ggml-sycl.cpp
Outdated
@@ -146,27 +145,11 @@ void ggml_backend_sycl_print_sycl_devices() { | |||
} | |||
} | |||
|
|||
static inline int get_sycl_env(const char *env_name, int default_val) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are removing this, is there any other addition where sycl env is called for logging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment.
I suggest keeping this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to PR #9709 , the backend specific logging system was supposed to be replaced with a common logging system. Although I agree that this will enable all debug logs if --log-verbose
cmdline argument is passed or enabled by default in test-backend-ops.
cc @slaren I think best here is to enable them only when something like GGML_BACKEND_DEBUG=1
is set in environment.
Edit: I am restoring GGML_SYCL_DEBUG implementation for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. for platform benchmark purposes it makes sense to have the platform specific debug on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank your for the improvement.
But I don't think it bring benefit to SYCL backend user:
1.
GGML_SYCL_DEBUG
is used to debug the performance and error in SYCL backend only.
It can be opened by environment in user running time without rebuild the source code.
Looks like GGML_LOG_DEBUG
need to rebuild the source code after define macro.
It limits the online trouble shooting.
The GGML_SYCL_DEBUG
function is more powerful than GGML_LOG_DEBUG
.
We can't reduce the function for unify the code.
Customer should be first. :)
Replace it by common GGML_LOG_DEBUG
log function, will mix the logs of common code and SYCL backend.
In performance test, there will be more log info.
This reverts commit 2607b7d. Let's keep the current SYCL specific logging mechanism for now
ggml/src/ggml-sycl/common.hpp
Outdated
@@ -163,8 +161,7 @@ inline dpct::err0 ggml_sycl_set_device(const int device) try { | |||
int current_device_id; | |||
SYCL_CHECK(CHECK_TRY_ERROR(current_device_id = get_current_device_id())); | |||
|
|||
// GGML_SYCL_DEBUG("ggml_sycl_set_device device_id=%d, | |||
// current_device_id=%d\n", device, current_device); | |||
GGML_LOG_DEBUG("ggml_sycl_set_device device_id=%d,current_device_id=%d\n", device, current_device_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to restore code.
This log will appear more times during inference,
Suggest remark it as default.
ggml/src/ggml-sycl/ggml-sycl.cpp
Outdated
id = get_current_device_id())); | ||
// GGML_SYCL_DEBUG("current device index %d\n", id); | ||
src_ptr = (char *) extra->data_device[id]; | ||
} else if (ggml_backend_buffer_is_sycl(src->buffer) || ggml_backend_buffer_is_sycl_split(src->buffer)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use string compare to replace int compare, it will reduce the performance.
Is it necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you already know that the previous backend path which used int is deprecated for a long time:
Line 590 in 152610e
GGML_DEPRECATED(enum ggml_backend_type backend, "use the buffer type to find the storage location of the tensor"); |
I reused the already existed buffer implementation.
Also, I didn't notice any noticible slowdowns. If you do, please share the results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no concern with the changes here but I'll let @NeoZhangJianyu merge it if he's happy with the recent changes.
@Rbiessy Thank you. If only I could get the result of test on a multi GPU/RPC setup before merging, it would be nice since I couldn't able to do that due to lack of hardware. |
We also don't test multi-GPU or RPC setup at Codeplay so far. I tested this with a few models on A100 and found no regression. |
This should have been done way earlier as
ggml_tensor->backend
has been deprecated for a very long time.There are some doubts in this, and thus have added comments which I will remove after discussing with the collaborators.
So far, backend test ops (for single GPU) are passing with this change.
cc: @airMeng @NeoZhangJianyu @abhilash1910 @Rbiessy
Also, integrated with GGML_LOG for debug logs and remove backend specific logging system.
With new log, it will be better to debug, for example:
Wish there was a level to control the verbosity using environmental variables of the debug log levels rather than passing command line arguments. This is reverted for now..