
Multi Threaded Rebinner

Stuart Campbell edited this page Nov 8, 2016 · 2 revisions

A new mechanism for running the rebinner with multiple threads has been implemented. In many ways, it is just the classic data-parallel design, which has lately been given a catchy new name: map-reduce. Hence, to be cool, the components of our multi-thread launcher in the gmesh toolkit are also called map and reduce.

gmeshrebin2 -0.8 0 0.002 -0.21 0.21 0.001 0.5 2.0 0.002 < ../exp/rotdata_test.inb > /dev/null 

The command above shows how gmeshrebin2 was used. Unfortunately, gmeshrebin2 has a large memory footprint: a 400x400x750 rebin volume requires about 2 GB. Since each rebinner maintains the entire volume itself, we would obviously run into problems when running two or more threads on the same machine.
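As a rough back-of-the-envelope check on that figure, the sketch below computes the voxel count of a 400x400x750 volume and the per-voxel storage that a footprint of about 2 GB would imply. The per-voxel layout of gmeshrebin2 is not described on this page, so "2 GB" is simply taken as 2e9 bytes and the result is only an estimate.

```python
# Back-of-the-envelope estimate only; the actual per-voxel layout of
# gmeshrebin2 is not stated on this page.
nx, ny, nz = 400, 400, 750
voxels = nx * ny * nz                      # 120,000,000 voxels
footprint_bytes = 2 * 10**9                # "about 2 GB", assumed to mean 2e9 bytes
bytes_per_voxel = footprint_bytes / voxels

print(f"voxels          : {voxels:,}")
print(f"bytes per voxel : {bytes_per_voxel:.1f}")
print(f"3 dense copies  : {3 * footprint_bytes / 10**9:.0f} GB")
```

Under these assumptions, every additional rebinner that keeps its own dense volume adds roughly another 2 GB.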

So the first effort here is to refactor gmeshrebin2 to operate in a filter mode. That is, the entire volume is not maintained by the rebinner itself. Instead, it is collected by a subsequent module (to "reduce" everything together). The major change here is how data is stored: we switched to a tree + linear list method, which, as it turns out, works very well.

This refactored rebinner is called gmeshrebin3.
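The tree + linear list storage is not described in detail on this page. The sketch below is one possible reading of the idea, not the actual gmeshrebin3 data structure: a two-level tree indexed by the first two voxel indices, whose leaves are short linear lists of (third index, value) entries. The class name `SparseVolume` and its methods are hypothetical.

```python
from collections import defaultdict


class SparseVolume:
    """Hypothetical sparse accumulator: a two-level tree (ix -> iy) whose
    leaves are short linear lists of [iz, value] entries."""

    def __init__(self):
        self.tree = defaultdict(dict)      # ix -> {iy -> [[iz, value], ...]}

    def add(self, ix, iy, iz, value):
        column = self.tree[ix].setdefault(iy, [])
        for entry in column:               # linear scan of the short leaf list
            if entry[0] == iz:
                entry[1] += value          # voxel already present: accumulate
                return
        column.append([iz, value])         # first contribution to this voxel

    def voxels(self):
        """Yield (ix, iy, iz, value) for every voxel that was touched."""
        for ix in sorted(self.tree):
            for iy in sorted(self.tree[ix]):
                for iz, value in sorted(self.tree[ix][iy]):
                    yield ix, iy, iz, value


vol = SparseVolume()
vol.add(10, 20, 30, 0.5)
vol.add(10, 20, 30, 0.25)   # same voxel: accumulated, not duplicated
vol.add(10, 21, 5, 1.0)
print(list(vol.voxels()))
```

Only voxels that receive a contribution occupy memory, which is the property that lets several rebinners share one machine.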

gmeshrebin3 -0.8 0 0.002 -0.21 0.21 0.001 0.5 2.0 0.002 < ../exp/rotdata_test.inb > /dev/null 

In the command above, gmeshrebin3 operates in exactly the same way as gmeshrebin2: same input, same output, same command-line arguments.

gmeshrebin3 -f -0.8 0 0.002 -0.21 0.21 0.001 0.5 2.0 0.002 < ../exp/rotdata_test.inb > /dev/null 

This one invokes the filter mode. The input and the arguments (after "-f") are still the same, but the output is a list of voxels in a compressed form that represents a sparse volume. This output is not directly viewable using other gmesh toolkit programs.

Now, let's launch it with all the correct plumbing.

map -n 3 gmeshrebin3 -f -0.8 0 0.002 -0.21 0.21 0.001 0.5 2.0 0.002 < ../exp/rotdata_test.inb > /dev/null 

This launches 3 parallel threads of gmeshrebin3. map handles all the inputs; reduce is called implicitly and handles the collection of rebinned voxels and all the output. The good thing is that the output is back in the very format used by gmeshrebin2 and by gmeshrebin3 in non-filter mode.
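The data flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the toolkit's implementation: `rebin_filter` is a made-up stand-in for one filter-mode rebinner, the binning rule (truncating coordinates to integers) is invented, and the round-robin split of the input is only one possible way for map to distribute work. Python threads are used to show the structure; they do not by themselves give a CPU speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def rebin_filter(records):
    """Stand-in for one rebinner in filter mode: returns only the voxels
    it touched, as {(ix, iy, iz): value}. The binning rule is made up."""
    partial = Counter()
    for x, y, z, value in records:
        partial[(int(x), int(y), int(z))] += value
    return partial


def map_reduce(records, n_workers):
    # "map": deal the input records out to n_workers independent rebinners
    chunks = [records[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(rebin_filter, chunks))
    # "reduce": merge the sparse partial results by summing per voxel
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


records = [(0.2, 0.1, 0.9, 1.0), (0.7, 0.3, 0.4, 2.0), (1.5, 0.0, 0.0, 4.0)]
print(map_reduce(records, 3))
```

Because the reduce step only sums per-voxel contributions, the merged result does not depend on how the input was split, which is why the final output can match the single-process format.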

The thread count given to -n (3 in the map command above) is a user setting. It should be set to match the number of real cores/processors available on the machine.
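One way to pick the thread count automatically is sketched below. It assumes `map` and `gmeshrebin3` are on the PATH and only builds the command line without running it. Note that `os.cpu_count()` reports logical CPUs, which can exceed the number of real cores on machines with hyper-threading, so the value may need to be lowered by hand.

```python
import os

# os.cpu_count() counts logical CPUs and may return None; treat the result
# as an upper bound on the number of real cores.
n_threads = os.cpu_count() or 1

# Assumption: `map` and `gmeshrebin3` from the gmesh toolkit are on PATH.
command = ["map", "-n", str(n_threads), "gmeshrebin3", "-f",
           "-0.8", "0", "0.002", "-0.21", "0.21", "0.001", "0.5", "2.0", "0.002"]
print(" ".join(command))
```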

This test file, rotdata_test.inb, happens to have 322560 parallelepipeds/pixels to rebin. Using 1 rebinner thread on a 2.5 GHz Intel Core-2 Duo Q9800, it takes about 6 seconds. Using 2 rebinner threads, it is down to 3 seconds. On outback, using 2 threads, the ballpark runtime is less than 2 seconds.
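For reference, the approximate timings quoted above translate into throughput and speedup as follows. The inputs are the rounded figures from this page, so the derived numbers are equally approximate.

```python
n_items = 322560                 # parallelepipeds/pixels in rotdata_test.inb
t1, t2 = 6.0, 3.0                # approximate runtimes quoted above, in seconds

speedup = t1 / t2
efficiency = speedup / 2         # 2 rebinner threads
print(f"throughput, 1 thread : {n_items / t1:,.0f} items/s")
print(f"throughput, 2 threads: {n_items / t2:,.0f} items/s")
print(f"speedup {speedup:.1f}x, parallel efficiency {efficiency:.0%}")
```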

The benchmark correctness test case created by M.R. still works. It is run as follows:

map -n 2 gmeshrebin3 -f 0.0 2.0 2.0 0.0 6.0 3.0 0.0 6.0 3.0 < ../data/reuter_input.inb > /dev/null

The output must always say:

...
rebinned sum_energy : 8.333333e+00
...

Otherwise, it is not correct.
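This check can be automated with a small script such as the sketch below. It assumes the program's messages have been captured as text; since the command above sends standard output to /dev/null, the `sum_energy` line presumably appears on standard error, but that is an assumption. The function name `check_sum_energy` is hypothetical, and the comparison uses a tolerance matching the printed precision.

```python
import re

EXPECTED = 8.333333e+00


def check_sum_energy(output_text, expected=EXPECTED, rel_tol=1e-6):
    """Return True if the text contains a 'rebinned sum_energy' line whose
    value matches the expected one to the printed precision."""
    match = re.search(r"rebinned sum_energy\s*:\s*(\S+)", output_text)
    if match is None:
        return False                       # line missing: treat as failure
    try:
        value = float(match.group(1))
    except ValueError:
        return False                       # value is not a number
    return abs(value - expected) <= rel_tol * abs(expected)


sample = "...\nrebinned sum_energy : 8.333333e+00\n...\n"
print(check_sum_energy(sample))
```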
