My proposal is to use the Poisson equation to illustrate the main concepts of programming with OpenMP offload.
It is simple and scientifically relevant.
The example is written in C.
It solves the Poisson equation using a relaxation method.
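As a starting point, the CPU baseline could look like the sketch below. It assumes a 2D square grid, Jacobi relaxation, fixed (Dirichlet) boundary values, and the sign convention laplacian(phi) = -rho. None of these choices are fixed by the proposal, and the name `jacobi_sweep` is hypothetical.

```c
#include <stddef.h>

/* One Jacobi relaxation sweep for the 2D Poisson equation
 * laplacian(phi) = -rho on an n x n grid with spacing h.
 * Assumptions for illustration: boundary values of phi are held fixed,
 * arrays are row-major, so element (i, j) lives at index i * n + j. */
void jacobi_sweep(int n, double h, const double *rho,
                  const double *phi, double *phi_new)
{
    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            size_t c = (size_t)i * (size_t)n + (size_t)j;
            /* Average of the four neighbours plus the source term. */
            phi_new[c] = 0.25 * (phi[c - (size_t)n] + phi[c + (size_t)n]
                               + phi[c - 1] + phi[c + 1]
                               + h * h * rho[c]);
        }
    }
}
```

Repeating the sweep while swapping the roles of `phi` and `phi_new` relaxes the field towards the solution; this double loop is the one that would later be turned into a GPU loop.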
It can illustrate:
- basic parallelisation mechanism: turn the CPU loop into a GPU loop
- data mapping of variables: data transfers are done only once, not at each time step
- custom mappers: map both the charge and the density fields in one go
- register pressure: have the density computed beforehand. The aerofoil problem from the Wee Archie demo is limited by register pressure with CUDA
- latency hiding:
  - performance as the problem size changes
  - use of the collapse clause
- bandwidth:
  - one thread processes a larger block area, leading to fewer loads per thread (performance benefit on A2?)
  - use a shared memory block (statically allocated)
  - memory coalescing: have the loops in the right order
- async:
  - compute the energy from the field (requires a dependency)
  - solve multiple Poisson equations
- interoperability:
  - solve the Poisson equation using FFTW, and using a library to do so
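The first two points, together with the collapse clause, could be sketched roughly as follows. This is a minimal illustration, not the final example: it assumes a 2D Jacobi relaxation with fixed boundary values, and the names `sweep_device` and `poisson_relax` are hypothetical. The `target data` region keeps the fields on the device for the whole relaxation, so transfers happen once on entry and once on exit rather than at every sweep.

```c
#include <stddef.h>
#include <string.h>

/* One Jacobi sweep offloaded to the device. rho, src and dst are expected
 * to be present on the device already, via an enclosing target data region.
 * collapse(2) merges both loops into one iteration space to expose more
 * parallelism. */
static void sweep_device(int n, double h, const double *rho,
                         const double *src, double *dst)
{
    #pragma omp target teams distribute parallel for collapse(2)
    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            size_t c = (size_t)i * (size_t)n + (size_t)j;
            dst[c] = 0.25 * (src[c - (size_t)n] + src[c + (size_t)n]
                           + src[c - 1] + src[c + 1]
                           + h * h * rho[c]);
        }
    }
}

/* Relax phi in place. Each step performs two sweeps (phi -> tmp -> phi),
 * so the result always ends up in phi. tmp is caller-provided scratch
 * space of n * n doubles. */
void poisson_relax(int n, double h, const double *rho,
                   double *phi, double *tmp, int steps)
{
    size_t len = (size_t)n * (size_t)n;

    /* Give tmp the same boundary values as phi before mapping it. */
    memcpy(tmp, phi, len * sizeof *phi);

    /* Data is transferred once here, not at each sweep. */
    #pragma omp target data map(to: rho[0:len], tmp[0:len]) map(tofrom: phi[0:len])
    {
        for (int s = 0; s < steps; s++) {
            sweep_device(n, h, rho, phi, tmp);
            sweep_device(n, h, rho, tmp, phi);
        }
    }
}
```

Built without OpenMP support, the pragmas are ignored and the same code runs serially on the CPU, which gives a convenient reference when checking the offloaded results.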