Skip to content

Provides a seamless, unified parallel environment configuration and tools to support most MPI implementations without the hassle of setting a PE up for each one (then documenting it, then getting the users to figure it out.

License

Notifications You must be signed in to change notification settings

brlindblom/gepetools

Repository files navigation

=== About =========================================================

This package contains various scripts and code for creating
"universal" parallel environments (PE).  These PEs are capable
of handling jobs utilizing the following, tested, MPI 
impementations:

 * OpenMPI
 * HPMPI
 * Intel MPI
 * MVAPICH/MVAPICH2
 * MPICH/MPICH2
 * LAM/MPI

Supported applications that use MPI but have their own methods of 
executing parallel jobs:

 * ANSYS 13
 * Comsol 4.x

Other non-MPI message-passing/memory-sharing implementations that
will be supported include

 * Linda (of Gaussian fame)

=== Description of Files ==========================================

startpe.sh   - Called by start_proc_args, sets up necessary 
               environment including machines files in $TMPDIR, 
               mpdboot symlinks, etc.  Available machine files are
             
               machines.mpich
               machines.mvapich
               machines.mpich2
               machines.mvapich2
               machines.intelmpi
               machines.hpmpi

stoppe.sh    - Cleans up the mess created by startpe.sh

pe_env_setup - Only very recent gridengine releases allow for 
               running parallel jobs from within an interactive
               qlogin session.  This script allows that
               functionality by simply doing

               qlogin -pe mpi.4 8
               ...
               user@node:$ source $SGE_ROOT/gepetools/pe_env_setup

               You can now load modules and run tightly-integrated
               MPI jobs using mpirun/mpiexec

mpdboot      - Takes the pain out of launching your MPD daemons
               across the cluster.  Call it prior to mpirun/mpiexec
               with any MPD-enabled MPI implementation.  Uses
               tight-integration.

extJobInfo   - How about some friggin' visibility into what your
               mpi processes are doing???

               Simply do:

               source $SGE_ROOT/gepetools/extJobInfo

               prior to your mpirun/mpiexec in your submit script.
               You'll be blessed with a file 
               
               ${JOB_NAME}.${JOB_ID}.extJobInfo

               which will have process info for your job's (currently, 
               master only) child processes including memory, cpu,
               and state information

=== Installation ==================================================

This package can be extracted anywhere by the final installation 
directory.  Its best if the installation directory is on a shared 
directory.

Just run

./install.sh <install_dir>

=== Example jobs using the JSV code
You should be ready to submit jobs...

Examples below:

1. MPICH Example

#$ ... SGE directives ...
#$ -l nodes=4,ppn=4   # 16 slots total, 4 ppn
#$ ...

# Doesn't everyone use modules?
module purge
module add mpi/mpich/1.2.7

mpirun -np $NSLOTS -machinefile $MPICH_HOSTS myexecutable
###

2. MPICH2/MVAPICH2 pre-hydra Example

#$ ...
#$ -l pcpus=32      # 32 slots, round-robin
#$ ...

module purge
module add mpi/mpich2/1.4

mpdboot
mpirun -np $NSLOTS myexecutable
###

3. MPICH2/MVAPICH2/IntelMPI w/ Hydra

#$ ...
#$ -l nodes=8,ppn=8 # 64 slots, 8 ppn
#$ ...

module purge
module add mpi/intel/1.4

mpiexec -n $NSLOTS myexecutable
###

4. OpenMPI

#$ ...
#$ -l nodes=8,ppn=12 # 96 slots, 12 ppn
#$ ...

module purge
module add mpi/openmpi/1.4.4

mpirun myexecutable
###

5. LAM

#$ ...
#$ -l nodes=2,ppn=4
#$ ...

module purge
module add mpi/lam/7.1.4

lamboot $LAM_HOSTS
lamnodes
mpirun -np $NSLOTS myexecutable
lamclean
###

About

Provides a seamless, unified parallel environment configuration and tools to support most MPI implementations without the hassle of setting a PE up for each one (then documenting it, then getting the users to figure it out.

Resources

License

Stars

Watchers

Forks

Packages

No packages published