This repository was archived by the owner on May 13, 2022. It is now read-only.
While having each stage run in its own thread was all fun and games when working with CUDA, recent measurements suggest that this approach can degrade performance when the work is done on the host (with OpenMP, for example). This is especially true when the stages have different workloads.
We should therefore be able to run multiple stages in the same thread of execution.
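A minimal sketch of what this could look like, not the project's actual API: `Stage` and `run_fused` are hypothetical names invented for illustration, and items are plain `int`s to keep the example short. Instead of handing every stage its own thread, a group of stages is applied back to back on whichever thread calls the function.

```cpp
#include <functional>
#include <vector>

// Hypothetical stage type (an assumption, not the library's real interface):
// any callable that transforms one item in place.
using Stage = std::function<void(int&)>;

// Runs all stages of a group sequentially on the calling thread.
// Each item passes through every stage before the next item starts,
// so no hand-off between threads is needed inside the group.
std::vector<int> run_fused(const std::vector<Stage>& stages,
                           std::vector<int> items) {
    for (auto& item : items) {
        for (const auto& stage : stages) {
            stage(item);
        }
    }
    return items;
}
```

With something like this, stages with very different workloads could be grouped so that each thread ends up with a comparable amount of work, while a stage that uses OpenMP internally would still be free to parallelise its own loop.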