GPUhack example compilation
Peter Willendrup edited this page Nov 13, 2020 · 14 revisions
Nota bene: since the instructions below were written, compilation has also become possible via
mcrun -c --openacc
The rest of the page remains relevant for debugging purposes.
hpclogin2(username) $ cp $MCSTAS/examples/BNL_H8.instr .
hpclogin2(username) $ mcstas BNL_H8.instr
Warning: 'V_sample' is an obsolete component (not maintained).
CFLAGS=
hpclogin2(username) $ ls
BNL_H8.c BNL_H8.instr
hpclogin2(username) $ mcsplit.py BNL_H8.c
hpclogin2(username) $ ls
BNL_H8.c BNL_H8.instr _BNL_H8.c _mccode-r.c _mccode-r.h _mccode_main.c _mcstas-r.c _mcstas-r.h
--> _BNL_H8.c works "the same" as BNL_H8.c,
--> because it includes the other .c/.h snippets
hpclogin2(username) $ pgcc -ta=tesla,managed -Minfo=accel -DOPENACC BNL_H8.c -o BNL_H8_gpu.out
Meaning of the various switches:
- -ta=tesla: set the target architecture to "tesla", i.e. an NVIDIA graphics card (GPU). A multicore target is also available for CPU use
- cc70: "Compute capability 7.0"; this need not be set explicitly
- managed: use "CUDA managed memory"; we should try to eliminate this requirement
- -Minfo=accel: print extra information on the acceleration achieved
- -DOPENACC: activate the special McStas defines that enable support for the PGI compiler / GPU
Output of the pgcc command above:
PGC-W-0221-Redefinition of symbol MC_RAND_MAX (BNL_H8.c: 495)
PGC-W-0118-Function Table_File_List_store does not contain a return statement (BNL_H8.c: 5986)
noprintf:
1143, Generating acc routine seq
Generating Tesla code
str_comp:
1148, Generating acc routine seq
Generating Tesla code
str_len:
1158, Generating acc routine seq
Generating Tesla code
mcget_ncount:
3299, Generating acc routine seq
Generating Tesla code
mcget_run_num:
3306, Generating acc routine seq
Generating Tesla code
coords_set:
3536, Generating acc routine seq
Generating Tesla code
coords_get:
3548, Generating acc routine seq
Generating Tesla code
coords_add:
3558, Generating acc routine seq
Generating Tesla code
coords_sub:
3571, Generating acc routine seq
Generating Tesla code
coords_neg:
3584, Generating acc routine seq
Generating Tesla code
coords_scale:
3595, Generating acc routine seq
Generating Tesla code
coords_sp:
3606, Generating acc routine seq
Generating Tesla code
coords_xp:
3615, Generating acc routine seq
Generating Tesla code
coords_len:
3626, Generating acc routine seq
Generating Tesla code
coords_mirror:
3632, Generating acc routine seq
Generating Tesla code
coords_print:
3650, Generating acc routine seq
Generating Tesla code
rot_set_rotation:
3698, Generating acc routine seq
Generating Tesla code
rot_test_identity:
3734, Generating acc routine seq
Generating Tesla code
rot_mul:
3746, Generating acc routine seq
Generating Tesla code
rot_copy:
3764, Generating acc routine seq
Generating Tesla code
rot_transpose:
3776, Generating acc routine seq
Generating Tesla code
rot_apply:
3793, Generating acc routine seq
Generating Tesla code
vec_prod_func:
3823, Generating acc routine seq
Generating Tesla code
scalar_prod:
3835, Generating acc routine seq
Generating Tesla code
norm_func:
3840, Generating acc routine seq
Generating Tesla code
mccoordschange:
3950, Generating acc routine seq
Generating Tesla code
mccoordschange_polarisation:
3982, Generating acc routine seq
Generating Tesla code
normal_vec:
4000, Generating acc routine seq
Generating Tesla code
solve_2nd_order:
4060, Generating acc routine seq
Generating Tesla code
_randvec_target_circle:
4109, Generating acc routine seq
Generating Tesla code
_randvec_target_rect_angular:
4176, Generating acc routine seq
Generating Tesla code
_randvec_target_rect_real:
4254, Generating acc routine seq
Generating Tesla code
kiss_srandom:
4527, Generating acc routine seq
Generating Tesla code
kiss_random:
4538, Generating acc routine seq
Generating Tesla code
_hash:
4565, Generating acc routine seq
Generating Tesla code
fast_kiss:
4574, Generating acc routine seq
Generating Tesla code
_randnorm2:
4643, Generating acc routine seq
Generating Tesla code
_gaussian_double:
4653, Generating acc routine seq
Generating Tesla code
_randtriangle:
4667, Generating acc routine seq
Generating Tesla code
_uniform_double:
4674, Generating acc routine seq
Generating Tesla code
_rand01:
4683, Generating acc routine seq
Generating Tesla code
_randpm1:
4692, Generating acc routine seq
Generating Tesla code
_rand0max:
4701, Generating acc routine seq
Generating Tesla code
_randminmax:
4709, Generating acc routine seq
Generating Tesla code
mcsetstate:
5295, Generating acc routine seq
Generating Tesla code
mcgetstate:
5325, Generating acc routine seq
Generating Tesla code
mcgenstate:
5347, Generating acc routine seq
Generating Tesla code
mcrestore_neutron:
5386, Generating acc routine seq
Generating Tesla code
inside_rectangle:
5411, Generating acc routine seq
Generating Tesla code
box_intersect:
5429, Generating acc routine seq
Generating Tesla code
cylinder_intersect:
5545, Generating acc routine seq
Generating Tesla code
sphere_intersect:
5601, Generating acc routine seq
Generating Tesla code
plane_intersect:
5625, Generating acc routine seq
Generating Tesla code
Table_Index:
6527, Generating acc routine seq
Generating Tesla code
Table_Value:
6596, Generating acc routine seq
Generating Tesla code
Table_Value2d:
6676, Generating acc routine seq
Generating Tesla code
Table_Interp1d:
7111, Generating acc routine seq
Generating Tesla code
Table_Interp1d_nearest:
7128, Generating acc routine seq
Generating Tesla code
Table_Interp2d:
7147, Generating acc routine seq
Generating Tesla code
StdReflecFunc:
7244, Generating acc routine seq
Generating Tesla code
TableReflecFunc:
7300, Generating acc routine seq
Generating Tesla code
init:
10107, Generating update device(_Origin_var)
10108, Generating update device(_Source_var)
10109, Generating update device(_D0_Source_var)
10110, Generating update device(_SC1_var)
10111, Generating update device(_D1_SC1_Out_var)
10112, Generating update device(_As1_var)
10113, Generating update device(_As2_var)
10114, Generating update device(_As3_var)
10115, Generating update device(_As4_var)
10116, Generating update device(_D2_A4_var)
10117, Generating update device(_Mono_Cradle_var)
10118, Generating update device(_PG1Xtal_var)
10119, Generating update device(_Mono_Out_var)
10120, Generating update device(_D4_SC2_In_var)
10121, Generating update device(_SC2_var)
10122, Generating update device(_D5_SC2_Out_var)
10123, Generating update device(_Sample_Cradle_var)
10124, Generating update device(_Sample_Out_var)
10125, Generating update device(_Sample_var)
10126, Generating update device(_D7_SC3_In_var)
10127, Generating update device(_SC3_var)
10128, Generating update device(_D8_SC3_Out_var)
10129, Generating update device(_Ana_Cradle_var)
10130, Generating update device(_PG2Xtal_var)
10131, Generating update device(_Ana_Out_var)
10132, Generating update device(_D10_SC4_In_var)
10133, Generating update device(_SC4_var)
10134, Generating update device(_He3H_var)
10138, Generating update device(_instrument_var)
class_Progress_bar_trace:
10175, Generating acc routine seq
Generating Tesla code
class_Source_simple_trace:
10243, Generating acc routine seq
Generating Tesla code
class_PSD_monitor_trace:
10326, Generating acc routine seq
Generating Tesla code
class_Guide_trace:
10384, Generating acc routine seq
Generating Tesla code
class_Slit_trace:
10523, Generating acc routine seq
Generating Tesla code
class_Monochromator_flat_trace:
10552, Generating acc routine seq
Generating Tesla code
class_V_sample_trace:
10723, Generating acc routine seq
Generating Tesla code
raytrace:
10933, Generating acc routine seq
Generating Tesla code
raytrace_all:
11475, Generating Tesla code
11476, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11475, Local memory used for particleN
raytrace_all_funnel:
11520, Generating Tesla code
11521, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11520, Generating implicit copyout(particles[:innerloop])
11534, Generating Tesla code
11535, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11534, Local memory used for _particle_save
Generating implicit copy(.I0001,.I0004)
11554, Generating Tesla code
11555, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11554, Local memory used for _particle_save
Generating implicit copy(.I0001,.I0004)
11574, Generating Tesla code
11575, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11574, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11594, Generating Tesla code
11595, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11594, Local memory used for _particle_save
Generating implicit copy(.I0001,.I0004)
11614, Generating Tesla code
11615, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11614, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11634, Generating Tesla code
11635, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11634, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11654, Generating Tesla code
11655, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11654, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11674, Generating Tesla code
11675, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11674, Local memory used for _particle_save
Generating implicit copy(.I0001,.I0004)
11694, Generating Tesla code
11695, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11694, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11714, Generating Tesla code
11715, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11714, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11734, Generating Tesla code
11735, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11734, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11751, Generating Tesla code
11752, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11751, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11771, Generating Tesla code
11772, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11771, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11788, Generating Tesla code
11789, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11788, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11808, Generating Tesla code
11809, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11808, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11828, Generating Tesla code
11829, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11828, Local memory used for _particle_save
Generating implicit copy(.I0001,.I0004)
11848, Generating Tesla code
11849, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11848, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11865, Generating Tesla code
11866, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11865, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11882, Generating Tesla code
11883, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11882, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11902, Generating Tesla code
11903, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11902, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11922, Generating Tesla code
11923, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11922, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11942, Generating Tesla code
11943, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11942, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11962, Generating Tesla code
11963, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11962, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11979, Generating Tesla code
11980, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11979, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
11999, Generating Tesla code
12000, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
11999, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
12016, Generating Tesla code
12017, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
12016, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
12036, Generating Tesla code
12037, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
12036, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
12056, Generating Tesla code
12057, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
12056, Local memory used for _particle_save
Generating implicit copy(.I0004,.I0001)
12074, Generating Tesla code
12075, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
finally:
12319, Generating update self(_Origin_var)
12320, Generating update self(_Source_var)
12321, Generating update self(_D0_Source_var)
12322, Generating update self(_SC1_var)
12323, Generating update self(_D1_SC1_Out_var)
12324, Generating update self(_As1_var)
12325, Generating update self(_As2_var)
12326, Generating update self(_As3_var)
12327, Generating update self(_As4_var)
12328, Generating update self(_D2_A4_var)
12329, Generating update self(_Mono_Cradle_var)
12330, Generating update self(_PG1Xtal_var)
12331, Generating update self(_Mono_Out_var)
12332, Generating update self(_D4_SC2_In_var)
12333, Generating update self(_SC2_var)
12334, Generating update self(_D5_SC2_Out_var)
12335, Generating update self(_Sample_Cradle_var)
12336, Generating update self(_Sample_Out_var)
12337, Generating update self(_Sample_var)
12338, Generating update self(_D7_SC3_In_var)
12339, Generating update self(_SC3_var)
12340, Generating update self(_D8_SC3_Out_var)
12341, Generating update self(_Ana_Cradle_var)
12342, Generating update self(_PG2Xtal_var)
12343, Generating update self(_Ana_Out_var)
12344, Generating update self(_D10_SC4_In_var)
12345, Generating update self(_SC4_var)
12346, Generating update self(_He3H_var)
12348, Generating update self(_instrument_var)
PGC/x86-64 Linux 19.4-0: compilation completed with warnings
Next topic: GPUhack example run
Please compile and run such binaries on, for example, the generic interactive nodes, which are accessible via
linuxsh
Compile:
host(username) $ mpicc -DUSE_MPI -lmpi -lm -DRNG_ALG=2 BNL_H8.c -o BNL_H8_mpi.out
Run:
host(username) $ mpirun -np 10 ./BNL_H8_mpi.out -dtest_mpi_10 lambda=2.36