Skip to content

Commit

Permalink
Merge branch 'release/v0.7.0'
Browse files Browse the repository at this point in the history
  • Loading branch information
paboyle committed May 6, 2017
2 parents c363bdd + 66d819c commit 51bf150
Show file tree
Hide file tree
Showing 453 changed files with 70,600 additions and 13,145 deletions.
25 changes: 23 additions & 2 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
################
*~
*#
*.sublime-*

# Precompiled Headers #
#######################
Expand Down Expand Up @@ -47,7 +48,9 @@ Config.h.in
config.log
config.status
.deps
*.inc
Make.inc
eigen.inc
Eigen.inc

# http://www.gnu.org/software/autoconf #
########################################
Expand Down Expand Up @@ -89,6 +92,7 @@ build*/*
#####################
*.xcodeproj/*
build.sh
.vscode

# Eigen source #
################
Expand All @@ -101,4 +105,21 @@ lib/fftw/*
# libtool macros #
##################
m4/lt*
m4/libtool.m4
m4/libtool.m4

# github pages #
################
gh-pages/

# Buck files #
##############
.buck*
buck-out
BUCK
make-bin-BUCK.sh

# generated sources #
#####################
lib/qcd/spin/gamma-gen/*.h
lib/qcd/spin/gamma-gen/*.cc

28 changes: 18 additions & 10 deletions .travis.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,9 +7,11 @@ cache:
matrix:
include:
- os: osx
osx_image: xcode7.2
osx_image: xcode8.3
compiler: clang
- compiler: gcc
dist: trusty
sudo: required
addons:
apt:
sources:
Expand All @@ -24,6 +26,8 @@ matrix:
- binutils-dev
env: VERSION=-4.9
- compiler: gcc
dist: trusty
sudo: required
addons:
apt:
sources:
Expand All @@ -38,6 +42,7 @@ matrix:
- binutils-dev
env: VERSION=-5
- compiler: clang
dist: trusty
addons:
apt:
sources:
Expand All @@ -52,6 +57,7 @@ matrix:
- binutils-dev
env: CLANG_LINK=http://llvm.org/releases/3.8.0/clang+llvm-3.8.0-x86_64-linux-gnu-ubuntu-14.04.tar.xz
- compiler: clang
dist: trusty
addons:
apt:
sources:
Expand All @@ -73,13 +79,15 @@ before_install:
- if [[ "$TRAVIS_OS_NAME" == "linux" ]] && [[ "$CC" == "clang" ]]; then export LD_LIBRARY_PATH="${GRIDDIR}/clang/lib:${LD_LIBRARY_PATH}"; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew update; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew install libmpc; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew install openmpi; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]] && [[ "$CC" == "gcc" ]]; then brew install gcc5; fi

install:
- export CC=$CC$VERSION
- export CXX=$CXX$VERSION
- echo $PATH
- which autoconf
- autoconf --version
- which automake
- automake --version
- which $CC
- $CC --version
- which $CXX
Expand All @@ -92,15 +100,15 @@ script:
- cd build
- ../configure --enable-precision=single --enable-simd=SSE4 --enable-comms=none
- make -j4
- ./benchmarks/Benchmark_dwf --threads 1
- ./benchmarks/Benchmark_dwf --threads 1 --debug-signals
- echo make clean
- ../configure --enable-precision=double --enable-simd=SSE4 --enable-comms=none
- make -j4
- ./benchmarks/Benchmark_dwf --threads 1
- ./benchmarks/Benchmark_dwf --threads 1 --debug-signals
- make check
- echo make clean
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then export CXXFLAGS='-DMPI_UINT32_T=MPI_UNSIGNED -DMPI_UINT64_T=MPI_UNSIGNED_LONG'; fi
- ../configure --enable-precision=single --enable-simd=SSE4 --enable-comms=mpi-auto
- make -j4
- if [[ "$TRAVIS_OS_NAME" == "linux" ]]; then mpirun.openmpi -n 2 ./benchmarks/Benchmark_dwf --threads 1 --mpi 2.1.1.1; fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then mpirun -n 2 ./benchmarks/Benchmark_dwf --threads 1 --mpi 2.1.1.1; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]] && [[ "$CC" == "clang" ]]; then ../configure --enable-precision=single --enable-simd=SSE4 --enable-comms=mpi-auto ; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]] && [[ "$CC" == "clang" ]]; then make -j4; fi
- if [[ "$TRAVIS_OS_NAME" == "linux" ]] && [[ "$CC" == "clang" ]]; then mpirun.openmpi -n 2 ./benchmarks/Benchmark_dwf --threads 1 --mpi 2.1.1.1; fi


15 changes: 11 additions & 4 deletions Makefile.am
Original file line number Diff line number Diff line change
@@ -1,10 +1,17 @@
# additional include paths necessary to compile the C++ library
SUBDIRS = lib benchmarks tests
SUBDIRS = lib benchmarks tests extras

.PHONY: tests
include $(top_srcdir)/doxygen.inc

tests: all
$(MAKE) -C tests tests
bin_SCRIPTS=grid-config


.PHONY: bench check tests doxygen-run doxygen-doc $(DX_PS_GOAL) $(DX_PDF_GOAL)

tests-local: all
bench-local: all
check-local: all

AM_CXXFLAGS += -I$(top_builddir)/include

ACLOCAL_AMFLAGS = -I m4
34 changes: 28 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,26 @@ Last update Nov 2016.

_Please do not send pull requests to the `master` branch which is reserved for releases._

### Compilers

Intel ICPC v16 and later

Clang v3.5 and later (need 3.8 and later for OpenMP)

GCC v4.9.x (recommended)

GCC v6.3 and later

### Important:

Some versions of GCC appear to have a bug under high optimisation (-O2, -O3).

The safety of these compiler versions cannot be guaranteed at this time. Follow Issue 100 for details and updates.

GCC v5.x

GCC v6.1, v6.2

### Bug report

_To help us tracking and solving more efficiently issues with Grid, please report problems using the issue system of GitHub rather than sending emails to Grid developers._
Expand Down Expand Up @@ -95,10 +115,10 @@ install Grid. Other options are detailed in the next section, you can also use `
`CXX`, `CXXFLAGS`, `LDFLAGS`, ... environment variables can be modified to
customise the build.

Finally, you can build and install Grid:
Finally, you can build, check, and install Grid:

``` bash
make; make install
make; make check; make install
```

To minimise the build time, only the tests at the root of the `tests` directory are built by default. If you want to build tests in the sub-directory `<subdir>` you can execute:
Expand All @@ -116,13 +136,15 @@ If you want to build all the tests at once just use `make tests`.
- `--with-fftw=<path>`: look for FFTW in the UNIX prefix `<path>`
- `--enable-lapack[=<path>]`: enable LAPACK support in Lanczos eigensolver. A UNIX prefix containing the library can be specified (optional).
- `--enable-mkl[=<path>]`: use Intel MKL for FFT (and LAPACK if enabled) routines. A UNIX prefix containing the library can be specified (optional).
- `--enable-numa`: ???
- `--enable-numa`: enable NUMA first touch optimisation
- `--enable-simd=<code>`: setup Grid for the SIMD target `<code>` (default: `GEN`). A list of possible SIMD targets is detailed in a section below.
- `--enable-gen-simd-width=<size>`: select the size (in bytes) of the generic SIMD vector type (default: 32 bytes).
- `--enable-precision={single|double}`: set the default precision (default: `double`).
- `--enable-precision=<comm>`: Use `<comm>` for message passing (default: `none`). A list of possible SIMD targets is detailed in a section below.
- `--enable-rng={ranlux48|mt19937}`: choose the RNG (default: `ranlux48 `).
- `--enable-rng={sitmo|ranlux48|mt19937}`: choose the RNG (default: `sitmo `).
- `--disable-timers`: disable system dependent high-resolution timers.
- `--enable-chroma`: enable Chroma regression tests.
- `--enable-doxygen-doc`: enable the Doxygen documentation generation (build with `make doxygen-doc`)

### Possible communication interfaces

Expand All @@ -136,7 +158,7 @@ The following options can be use with the `--enable-comms=` option to target dif
| `mpi3l[-auto]` | MPI communications using MPI 3 shared memory and leader model |
| `shmem ` | Cray SHMEM communications |

For the MPI interfaces the optional `-auto` suffix instructs the `configure` scripts to determine all the necessary compilation and linking flags. This is done by extracting the informations from the MPI wrapper specified in the environment variable `MPICXX` (if not specified `configure` will scan though a list of default names).
For the MPI interfaces the optional `-auto` suffix instructs the `configure` scripts to determine all the necessary compilation and linking flags. This is done by extracting the informations from the MPI wrapper specified in the environment variable `MPICXX` (if not specified `configure` will scan though a list of default names). The `-auto` suffix is not supported by the Cray environment wrapper scripts. Use the standard versions instead.

### Possible SIMD types

Expand All @@ -157,14 +179,14 @@ Alternatively, some CPU codenames can be directly used:

| `<code>` | Description |
| ----------- | -------------------------------------- |
| `KNC` | [Intel Xeon Phi codename Knights Corner](http://ark.intel.com/products/codename/57721/Knights-Corner) |
| `KNL` | [Intel Xeon Phi codename Knights Landing](http://ark.intel.com/products/codename/48999/Knights-Landing) |
| `BGQ` | Blue Gene/Q |

#### Notes:
- We currently support AVX512 only for the Intel compiler. Support for GCC and clang will appear in future versions of Grid when the AVX512 support within GCC and clang will be more advanced.
- For BG/Q only [bgclang](http://trac.alcf.anl.gov/projects/llvm-bgq) is supported. We do not presently plan to support more compilers for this platform.
- BG/Q performances are currently rather poor. This is being investigated for future versions.
- The vector size for the `GEN` target can be specified with the `configure` script option `--enable-gen-simd-width`.

### Build setup for Intel Knights Landing platform

Expand Down
61 changes: 47 additions & 14 deletions TODO
Original file line number Diff line number Diff line change
@@ -1,6 +1,26 @@
TODO:
---------------

Peter's work list:
2)- Precision conversion and sort out localConvert <--
3)- Remove DenseVector, DenseMatrix; Use Eigen instead. <-- started
4)- Binary I/O speed up & x-strips
-- Profile CG, BlockCG, etc... Flop count/rate -- PARTIAL, time but no flop/s yet
-- Physical propagator interface
-- Conserved currents
-- GaugeFix into central location
-- Multigrid Wilson and DWF, compare to other Multigrid implementations
-- HDCR resume

Recent DONE
-- Cut down the exterior overhead <-- DONE
-- Interior legs from SHM comms <-- DONE
-- Half-precision comms <-- DONE
-- Merge high precision reduction into develop
-- multiRHS DWF; benchmark on Cori/BNL for comms elimination
-- slice* linalg routines for multiRHS, BlockCG

-----
* Forces; the UdSdU term in gauge force term is half of what I think it should
be. This is a consequence of taking ONLY the first term in:

Expand All @@ -21,16 +41,8 @@ TODO:
This means we must double the force in the Test_xxx_force routines, and is the origin of the factor of two.
This 2x is applied by hand in the fermion routines and in the Test_rect_force routine.


Policies:

* Link smearing/boundary conds; Policy class based implementation ; framework more in place

* Support different boundary conditions (finite temp, chem. potential ... )

* Support different fermion representations?
- contained entirely within the integrator presently

- Sign of force term.

- Reversibility test.
Expand All @@ -41,11 +53,6 @@ Policies:

- Audit oIndex usage for cb behaviour

- Rectangle gauge actions.
Iwasaki,
Symanzik,
... etc...

- Prepare multigrid for HMC. - Alternate setup schemes.

- Support for ILDG --- ugly, not done
Expand All @@ -55,9 +62,11 @@ Policies:
- FFTnD ?

- Gparity; hand opt use template specialisation elegance to enable the optimised paths ?

- Gparity force term; Gparity (R)HMC.
- Random number state save restore

- Mobius implementation clean up to rmove #if 0 stale code sequences

- CG -- profile carefully, kernel fusion, whole CG performance measurements.

================================================================
Expand Down Expand Up @@ -90,6 +99,7 @@ Insert/Extract
Not sure of status of this -- reverify. Things are working nicely now though.

* Make the Tensor types and Complex etc... play more nicely.

- TensorRemove is a hack, come up with a long term rationalised approach to Complex vs. Scalar<Scalar<Scalar<Complex > > >
QDP forces use of "toDouble" to get back to non tensor scalar. This role is presently taken TensorRemove, but I
want to introduce a syntax that does not require this.
Expand All @@ -112,6 +122,8 @@ Not sure of status of this -- reverify. Things are working nicely now though.
RECENT
---------------

- Support different fermion representations? -- DONE
- contained entirely within the integrator presently
- Clean up HMC -- DONE
- LorentzScalar<GaugeField> gets Gauge link type (cleaner). -- DONE
- Simplified the integrators a bit. -- DONE
Expand All @@ -123,6 +135,26 @@ RECENT
- Parallel io improvements -- DONE
- Plaquette and link trace checks into nersc reader from the Grid_nersc_io.cc test. -- DONE


DONE:
- MultiArray -- MultiRHS done
- ConjugateGradientMultiShift -- DONE
- MCR -- DONE
- Remez -- Mike or Boost? -- DONE
- Proto (ET) -- DONE
- uBlas -- DONE ; Eigen
- Potentially Useful Boost libraries -- DONE ; Eigen
- Aligned allocator; memory pool -- DONE
- Multiprecision -- DONE
- Serialization -- DONE
- Regex -- Not needed
- Tokenize -- Why?

- Random number state save restore -- DONE
- Rectangle gauge actions. -- DONE
Iwasaki,
Symanzik,
... etc...
Done: Cayley, Partial , ContFrac force terms.

DONE
Expand Down Expand Up @@ -207,6 +239,7 @@ Done
FUNCTIONALITY: it pleases me to keep track of things I have done (keeps me arguably sane)
======================================================================================================

* Link smearing/boundary conds; Policy class based implementation ; framework more in place -- DONE
* Command line args for geometry, simd, etc. layout. Is it necessary to have -- DONE
user pass these? Is this a QCD specific?

Expand Down
9 changes: 4 additions & 5 deletions VERSION
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
Version : 0.6.0
Version : 0.7.0

- AVX512, AVX2, AVX, SSE good
- Clang 3.5 and above, ICPC v16 and above, GCC 4.9 and above
- MPI and MPI3
- HiRep, Smearing, Generic gauge group
- Clang 3.5 and above, ICPC v16 and above, GCC 6.3 and above recommended
- MPI and MPI3 comms optimisations for KNL and OPA finished
- Half precision comms
Loading

0 comments on commit 51bf150

Please sign in to comment.