Collecting Profiling Information

James Price edited this page Mar 19, 2014 · 2 revisions

Oclgrind can count the instructions executed while a kernel runs and report them as an instruction histogram. To enable this feature, pass the --inst-counts flag to oclgrind. The resulting output (example below) shows the total number of times each SPIR instruction type was executed during the kernel invocation. Counts for memory loads and stores are reported separately for each address space and include the total number of bytes read or written. Calls to OpenCL builtin functions are counted separately for each function.

Instructions executed for kernel 'timestep':
         899,500 - call llvm.fmuladd.f32()
         660,000 - add
         600,316 - br
         599,650 - fsub
         540,000 - getelementptr
         479,900 - fmul
         360,416 - sext
         300,000 - store global (1,200,000 bytes)
         300,000 - load global (1,200,000 bytes)
         299,950 - fadd
         240,000 - phi
         180,416 - icmp
         180,000 - mul
         179,950 - select
         120,832 - trunc
         120,832 - call get_global_id()
         120,000 - fdiv
         120,000 - srem
          60,416 - ret
          60,416 - udiv
          60,416 - urem
          60,000 - fcmp
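
If you want to post-process a histogram like the one above (for example, to compute totals or compare runs), the text can be parsed with a few lines of Python. The following is a minimal sketch, not part of Oclgrind: the names parse_inst_counts, LINE_RE and SAMPLE are hypothetical, and the line format is assumed from the example output above (a comma-grouped count, " - ", the instruction name, and an optional byte total in parentheses).

```python
import re

# Assumed line format, taken from the example output above:
#   "<count> - <instruction>" optionally followed by " (<n> bytes)".
LINE_RE = re.compile(r"^\s*([\d,]+) - (.+?)(?: \(([\d,]+) bytes\))?\s*$")


def parse_inst_counts(text):
    """Map each instruction name to a (count, bytes) tuple.

    bytes is None for instructions that report no byte total.
    Lines that do not look like histogram entries (such as the
    "Instructions executed for kernel ..." header) are skipped.
    """
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue
        count = int(match.group(1).replace(",", ""))
        name = match.group(2)
        nbytes = match.group(3)
        result[name] = (count, int(nbytes.replace(",", "")) if nbytes else None)
    return result


# A few lines copied from the example output above.
SAMPLE = """Instructions executed for kernel 'timestep':
         660,000 - add
         300,000 - load global (1,200,000 bytes)
         120,832 - call get_global_id()
"""

counts = parse_inst_counts(SAMPLE)
total = sum(count for count, _ in counts.values())
for name, (count, nbytes) in counts.items():
    share = 100.0 * count / total
    suffix = f", {nbytes} bytes" if nbytes is not None else ""
    print(f"{name}: {count} ({share:.1f}% of parsed instructions{suffix})")
```

Keeping the byte total as a separate field makes it straightforward to estimate memory traffic per address space alongside the raw instruction counts.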