Getting Started with Distributed Version
This manual is intended for more experienced users who want to build the distributed version of OpenPixi from source. Basic knowledge of Maven, or a willingness to learn it, is a great advantage. A short overview of the sections in this document follows:
- Restrictions
- How do I compile the distributed version?
- How do I run the distributed version?
- How do I test the distributed version?
- How do I profile the distributed version?
- How do I trace the particle movement?
Currently, there are a few restrictions on the distributed version: the number of grid cells in the x direction, the number of grid cells in the y direction, and the number of nodes among which the computation is distributed must each be a power of two.
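The power-of-two restriction can be checked before a run is started. The following is a minimal sketch in POSIX shell; the function name is_power_of_two and the sample values are illustrative assumptions and are not part of OpenPixi.

```shell
#!/bin/sh
# Succeeds (exit status 0) when the argument is a positive power of two.
# A power of two has exactly one bit set, so n & (n - 1) is zero.
is_power_of_two() {
  n=$1
  [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

# Hypothetical settings: 64 x 32 grid cells distributed over 4 nodes.
for value in 64 32 4; do
  if is_power_of_two "$value"; then
    echo "$value: ok"
  else
    echo "$value: not a power of two"
  fi
done
```

The same check applies to the total number of processes when several processes are started per computer.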
First of all, you need to install the IBIS framework, which is used for communication. The IBIS framework is available on the download page of the OpenPixi project. To install IBIS, unzip the downloaded package and run the following command from the directory that contains ipl-2.3-standalone.jar:
mvn install:install-file -Dfile=ipl-2.3-standalone.jar -DgroupId=org.openpixi.pixi -DartifactId=ipl -Dversion=2.3 -Dpackaging=jar
If you are developing in Eclipse, you might see errors in the pom file which need to be resolved first. If the errors concern build-helper-maven-plugin (the full error message is: "Plugin execution not covered by lifecycle configuration: org.codehaus.mojo:build-helper-maven-plugin"), click on the error marker in the left pane, which will offer to install the "m2e connector for build-helper-maven-plugin".
By default, compilation of the distributed version is turned off in the pom file. The distributed version is compiled only under the Maven profile distributed. More about Maven profiles can be found on the Maven project site.
To use the distributed profile from the command line, run
mvn compile -P distributed
To use the distributed profile from Eclipse, go to Project -> Properties -> Maven and enter distributed in the Active Maven Profiles field. Afterwards, Eclipse will automatically compile the sources of the distributed version.
To run the distributed version of pixi, you first have to start the IBIS IPL server. This is easily done by running the ipl-server script, which is located in the scripts directory, openpixi/pixi/scripts. (The IPL server utility allows the individual computing nodes to discover each other.)
Once the IPL server is running, start pixi either in several terminals on one computer or on a cluster with several computers. To run distributed pixi, use the following command from the openpixi/pixi directory
run <numOfNodes> <iplServer>
where
- numOfNodes is the number of nodes taking part in the distributed calculation. (This is not necessarily the number of computers, as you can easily start multiple processes on one computer; it is the total number of distributed pixi processes.)
- iplServer is the address of the computer running the IPL server utility
If you would like to try the distributed version on your own computer, you can do so with the following steps
- open three terminals and navigate to the directory openpixi/pixi in each of them
- run ipl-server in one of the terminals
- in the remaining two terminals run the script run 2 localhost
Alternatively, you can run it from your IDE by directly running the main class org.openpixi.pixi.distributed.ui.MainProfile with the arguments -numOfNodes 2 -iplServer localhost
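The three-terminal procedure can also be driven from a single shell. The sketch below is only an illustration: the function start_local_cluster, the DRY_RUN switch, the two-second startup delay, and both default script paths are assumptions, so adjust them to your checkout before using it for real.

```shell
#!/bin/sh
# Sketch only. The two script paths are ASSUMPTIONS about the checkout layout;
# override them through the environment if your copy differs.
IPL_SERVER_SCRIPT="${IPL_SERVER_SCRIPT:-./scripts/ipl-server}"
RUN_SCRIPT="${RUN_SCRIPT:-./scripts/run}"

# start_local_cluster NODES
# Starts the IPL server and NODES pixi processes on localhost.
# With DRY_RUN=1 the commands are printed instead of executed.
start_local_cluster() {
  nodes=$1
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$IPL_SERVER_SCRIPT"
  else
    "$IPL_SERVER_SCRIPT" &
    server_pid=$!
    sleep 2   # give the server a moment to start; the delay is a guess
  fi
  pids=""
  i=0
  while [ "$i" -lt "$nodes" ]; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$RUN_SCRIPT $nodes localhost"
    else
      "$RUN_SCRIPT" "$nodes" localhost &
      pids="$pids $!"
    fi
    i=$((i + 1))
  done
  if [ "${DRY_RUN:-0}" != "1" ]; then
    wait $pids            # wait for all pixi processes to finish
    kill "$server_pid"    # then stop the IPL server
  fi
}

# Print the commands for a two-node run without starting anything.
DRY_RUN=1
start_local_cluster 2
```

Setting DRY_RUN=0 before the call executes the commands instead of printing them. The output of all processes then ends up interleaved in one terminal, which is the main drawback compared with separate terminals.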
Currently, you can only run distributed pixi on the VSC 1 cluster, as VSC 2 does not have the required Java 1.6 on its compute nodes.
First of all, you need to set the Maven and Java home variables in your .bashrc file. You can set them, for example, as follows
# Verify the used paths on your own
export M2_HOME=/opt/sw/maven2
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
export JAVA_HOME=/usr/lib/jvm/java
export PATH=$JAVA_HOME/bin:$PATH
For the variables to take effect, you have to log out and log in again.
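After logging in again, it is worth confirming that the variables above point to real directories. The following is a small sketch; the helper name check_dir_var is an assumption made for illustration.

```shell
#!/bin/sh
# Report whether the variable whose NAME is given holds an existing directory.
check_dir_var() {
  name=$1
  eval "value=\${$name:-}"   # indirect lookup of the variable called $name
  if [ -z "$value" ]; then
    echo "$name is not set"
    return 1
  elif [ ! -d "$value" ]; then
    echo "$name points to a missing directory: $value"
    return 1
  fi
  echo "$name=$value"
}

# Check the two variables set in .bashrc; keep going even if one is wrong.
for var in M2_HOME JAVA_HOME; do
  check_dir_var "$var" || true
done
```

Running java -version and mvn -version afterwards shows which Java and Maven installations are actually picked up from PATH.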
Secondly, download pixi with git clone git://github.com/openpixi/openpixi.git and compile it from the command line.
Finally, to run the application on the VSC cluster, first start the ipl-server script and afterwards run
vsc-run <numOfHosts> <numOfProcesses>
from the openpixi/pixi directory, where
- numOfHosts is the number of computers you would like to use
- numOfProcesses is the number of processes you would like to start on each computer
Following the restrictions stated at the beginning, numOfHosts * numOfProcesses has to be a power of two.
For example, running vsc-run 2 4 would start 8 pixi processes, 4 on each of the 2 host computers.
The results are collected in files named out.JOB_ID.HOSTNAME.PROCESS where
- JOB_ID is the ID of the job assigned by VSC
- HOSTNAME is the name of the computer on which the calculation took place
- PROCESS distinguishes among the multiple processes started on one computer
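When many result files accumulate, the naming scheme above can be split back into its parts with plain shell parameter expansion. This is a sketch under the assumption that JOB_ID and PROCESS contain no dots (HOSTNAME may); the function name and the sample file name are invented for illustration.

```shell
#!/bin/sh
# Split a result file name of the form out.JOB_ID.HOSTNAME.PROCESS.
# Assumes JOB_ID and PROCESS contain no dots; HOSTNAME may contain dots.
parse_result_name() {
  name=${1##*/}         # drop any leading directory
  rest=${name#out.}     # strip the "out." prefix
  job_id=${rest%%.*}    # everything before the first dot
  process=${rest##*.}   # everything after the last dot
  host=${rest#*.}       # drop "JOB_ID."
  host=${host%.*}       # drop ".PROCESS"
  echo "job=$job_id host=$host process=$process"
}

# Hypothetical file name; prints: job=123456 host=node01 process=2
parse_result_name out.123456.node01.2
```

Looping over out.* with this function makes it easy to group the files of one job by host.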
If you want to run OpenPixi on a cluster other than VSC 1, you will most probably need to slightly modify the variables used in the scripts vsc-run and vsc-distribute to match the setup of the cluster in question (this also applies if you are only switching to VSC 2).
All tests of the distributed simulation compare its results with those of the non-distributed simulation. There are two ways to run the tests
- Run the class org.openpixi.pixi.distributed.TrueDistSimTest -numOfNodes NUMBER -iplServer SERVER from NUMBER terminals, or from VSC by modifying the main class in the run script. (You also have to start the IPL server.)
- Run the class org.openpixi.pixi.distributed.ComplexDistSimTest without any parameters, which runs the distributed simulation using threads. The advantage of this test is that you can run it comfortably from your IDE. (You do not need to start the IPL server; the test starts it automatically.)
The above tests exercise the distributed version under a variety of settings. If you would like to test the application under your own settings, modify the settings specified in the class org.openpixi.pixi.distributed.ComplexDistSimTest and then run it.
You can profile the distributed version (collect time measurements) in the same way as the non-distributed version. The only difference is that at the end of the simulation you get the measurements from the class DistributedProfileInfo, which adds timings specific to the distributed version, such as network waiting times. The measurements are displayed automatically if you run the class MainProfile.
As with the non-distributed version, you can get useful particle movement information by compiling the application with the "aspectj-debug" profile. The distributed version adds information about particles that are exchanged among neighboring nodes.