Copper is a read-only cooperative caching layer designed to enable scalable data loading across large numbers of compute nodes. It avoids the I/O bottleneck in the storage network by using the compute network for data movement.
The current intended use of copper is to improve the performance of Python imports and dynamic shared library loading on Aurora. However, copper can be used to improve the performance of any type of redundant data loading on a supercomputer.
More documentation can be found here: readthedocs
Supported FUSE operations:
- init
- open
- read
- readdir
- readlink
- getattr
- ioctl
- destroy
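As a rough illustration of where these operations come up, the sketch below reads a path with ordinary read-only tools. The comments note the FUSE callbacks each tool typically triggers when the path lies under a FUSE mount. This reflects general FUSE behavior rather than anything confirmed about copper internals, and `probe_path` is a hypothetical helper, not part of copper.

```shell
# probe_path PATH: inspect PATH using common read-only tools.
# Hypothetical helper; the callback names in the comments describe
# what a FUSE filesystem usually sees for each tool.
probe_path() {
    target=$1
    if [ -L "$target" ]; then          # lstat  -> getattr
        readlink "$target"             # readlink
    elif [ -d "$target" ]; then        # stat   -> getattr
        ls -A "$target"                # opendir + readdir
    elif [ -f "$target" ]; then        # stat   -> getattr
        cat "$target"                  # open + read
    else
        echo "probe_path: cannot access $target" >&2
        return 1
    fi
}
```

For example, `probe_path "${CU_FUSE_MNT_VIEWDIR}/etc"` would list a directory through the mount, assuming the mount is already up.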
module load copper
CUPATH=$COPPER_ROOT/bin/cu_fuse # If you are building copper on your own, set this path to your cu_fuse binary
# LOGDIR, CU_FUSE_MNT_VIEWDIR, and facility_address_book are assumed to be set before this point
mkdir -p ${LOGDIR}
clush --hostfile ${PBS_NODEFILE} "mkdir -p ${CU_FUSE_MNT_VIEWDIR}"
# The trailing comments below document each option. Remove them in a real script:
# once CMD is expanded, the shell on each node treats everything after the first '#' as a comment.
read -r -d '' CMD << EOM
numactl --physcpubind="0-3"
${CUPATH}
-tpath / # / will be mounted under CU_FUSE_MNT_VIEWDIR
-vpath ${CU_FUSE_MNT_VIEWDIR} # Location of the FUSE mount
-log_output_dir ${LOGDIR} # Directory where the copper logs are stored
-log_level 6 # Copper logging level: 6 is the most verbose, 0 the least
-log_type file # Send logs to file / stdout / both
-net_type cxi # Network protocol
-nf ${PBS_NODEFILE} # Host list of the nodes where cu_fuse will be mounted
-trees 1 # Number of trees to form in the overlay network
-max_cacheable_byte_size $((10*1024*1024)) # Maximum size in bytes of an access that goes through copper
-facility_address_book ${facility_address_book} # Path to the facility_address_book file
-s ${CU_FUSE_MNT_VIEWDIR} # Start FUSE in single-threaded mode
EOM
clush --hostfile ${PBS_NODEFILE} $CMD # To start copper on all the compute nodes
# instead of clush you can also use the following to start copper as a background process on all compute nodes
# mpirun --np ${NRANKS} --ppn ${RANKS_PER_NODE} --cpu-bind=list:0-3 sh ./scripts/filesystem/ &
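Since copper is started in the background on each node, it can be useful to confirm that the mount is visible before launching the application. The following is a minimal sketch, assuming the `mountpoint` tool from util-linux is available on the compute nodes. `wait_for_mount` is a hypothetical helper, not part of copper.

```shell
# wait_for_mount DIR [TIMEOUT_SECONDS]: poll once per second until DIR is a
# mount point. Returns 0 when it is mounted, 1 if the timeout expires first.
# Hypothetical helper; the default timeout of 30 seconds is an arbitrary choice.
wait_for_mount() {
    dir=$1
    remaining=${2:-30}
    while ! mountpoint -q "$dir"; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}
```

On a single node this could be called as `wait_for_mount "${CU_FUSE_MNT_VIEWDIR}" 60 || echo "copper mount not ready"`. The same `mountpoint -q` check could be run on every node through clush.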
time mpirun --np ${NRANKS} --ppn ${RANKS_PER_NODE} --cpu-bind=${CPU_BINDING} --genvall \
--genv=PYTHONPATH=${CU_FUSE_MNT_VIEWDIR}/lus/flare/projects/Aurora_deployment/kaushik/copper/july12/copper/run/copper_conda_env \
python3 -c "import torch; print(torch.__file__)"
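Because `-tpath /` mounts the root directory under `CU_FUSE_MNT_VIEWDIR`, a path is routed through copper by prefixing its absolute form with the view directory, as the `PYTHONPATH` above does. A minimal sketch of that mapping follows. `copper_path` is a hypothetical helper, not part of copper, and it assumes cu_fuse was started with `-tpath /`.

```shell
# copper_path ABSOLUTE_PATH: print the same path as seen through the copper mount.
# Hypothetical helper; assumes CU_FUSE_MNT_VIEWDIR holds the mount location
# and that cu_fuse was started with -tpath /.
copper_path() {
    case $1 in
        /*)
            # Strip one trailing slash from the view directory to avoid "//".
            printf '%s%s\n' "${CU_FUSE_MNT_VIEWDIR%/}" "$1"
            ;;
        *)
            echo "copper_path: expected an absolute path, got '$1'" >&2
            return 1
            ;;
    esac
}
```

With this helper, an environment variable can be pointed at the cached view of a directory, for example `--genv=PYTHONPATH=$(copper_path /path/to/env)`, where `/path/to/env` is a placeholder.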
clush --hostfile ${PBS_NODEFILE} "fusermount3 -u ${CU_FUSE_MNT_VIEWDIR}"
clush --hostfile ${PBS_NODEFILE} "rm -rf ${CU_FUSE_MNT_VIEWDIR}"
clush --hostfile ${PBS_NODEFILE} "pkill -9 cu_fuse"
To build copper, we recommend installing the mochi dependencies with spack, using its environment feature. You can find the instructions to install spack here. The required mochi services are margo, mercury, thallium, and cereal. The instructions to install the listed mochi services can be found here.
Assuming your mochi environment is set up correctly, you should now be able to build by running the following commands.
git clone
. copper/gitrepos/git-spack-repo/spack/share/spack/
git clone
cd copper/gitrepos/git-mochi-repos/platform-configurations/ANL/Aurora
[compare with copper/scripts/build_helper/aurora_spack.yaml]
git clone
spack repo add copper/gitrepos/git-mochi-repos/mochi-spack-packages
module load cmake # on aurora
spack env create kaushik_env_1 spack.yaml
spack env activate kaushik_env_1
spack add mochi-margo
spack install
# in case of any issue with the spack environment, completely delete it and start again
spack env remove kaushik_env_1
# from the next time onwards you only need
. copper/gitrepos/git-spack-repo/spack/share/spack/
spack env activate kaushik_env_1
git clone
cd copper/scripts/build_helper/
Set the following variables in the copied ``
spack env activate <MOCHI_ENV>
sh copper/scripts/build_helper/