Add a section to tech note about GPU-based halting #26346

37 changes: 37 additions & 0 deletions doc/rst/technotes/gpu.rst
@@ -599,6 +599,43 @@ See the `asyncTaskComm
benchmark for a full example of a pattern that benefits from oversubscribing
GPUs.

GPU-based Halting
~~~~~~~~~~~~~~~~~

Standard Chapel has a number of features that can cause a program to exit,
or "halt". The 2.3 release of Chapel introduced the ability to execute halting
functions on the GPU, allowing Chapel-generated GPU kernels to halt the
execution of the whole program. This makes it possible both to call
:proc:`~Errors.halt` directly and to call functions that themselves halt.
In prior releases, doing either made code ineligible for GPU execution.

The following program demonstrates this feature, printing "halt reached in
GPU kernel".

.. code-block:: chapel

  on here.gpus[0] {
    @assertOnGpu
    foreach i in 1..10 {
      halt();
    }
  }
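
Calling a function that itself halts might look like the following sketch.
The helper ``checkPositive`` is a hypothetical name introduced only for
illustration, and the sketch assumes Chapel 2.3 or later built with GPU
support.

.. code-block:: chapel

  // Hypothetical helper (not part of any Chapel module): halts on
  // non-positive input using the argument-less halt().
  proc checkPositive(x: int) {
    if x <= 0 then halt();
  }

  on here.gpus[0] {
    @assertOnGpu
    foreach i in 1..10 {
      // i - 5 is non-positive for i <= 5, so the helper halts for
      // those iterations.
      checkPositive(i - 5);
    }
  }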

There are some caveats to the current implementation:

* String manipulation for printing halt messages requires a number of features
  ill-suited for the GPU. As a result, at this time, functions that use
  the string-enabled overloads of ``halt()`` will still not work on the GPU.
  This will be improved in future releases.
* Presently, halting is implemented by setting a flag from the kernel that
  is later read by the host program. As a consequence, kernel execution
  proceeds past the ``halt()`` call; the program exits only once the kernel
  has finished executing.
* There is a race condition when several Chapel tasks use the same device to
  launch kernels, which can interfere with the behavior of ``halt()``.
  This will be fixed in future releases.
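
The first two caveats can be illustrated with a minimal sketch. The array
``A`` and the commented-out message string are assumptions made for
illustration, and the comments describe the behavior expected from the
caveats above rather than verified output.

.. code-block:: chapel

  on here.gpus[0] {
    var A: [1..10] int;
    @assertOnGpu
    foreach i in 1..10 {
      // The argument-less halt() can run in a GPU kernel. A string
      // overload such as halt("bad index") is, per the first caveat,
      // not expected to work on the GPU yet.
      if i == 5 then halt();

      // Because the halt is recorded as a flag that the host checks
      // later, this statement may still run in the kernel for i == 5.
      A[i] = i;
    }
    // Not expected to be reached: the program exits once the kernel
    // has finished executing.
    writeln(A);
  }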

Known Limitations
-----------------
