CG

Dec 8, 2019

9294bc4 · Dec 8, 2019

Name	Last commit message	Last commit date
Makefile	Init	Dec 8, 2019
README.carefully	Init	Dec 8, 2019
cg.c	Init	Dec 8, 2019
globals.h	Init	Dec 8, 2019

README.carefully

Note: the routine conj_grad supplies three implementations of
the sparse matrix-vector multiply.  The default matrix-vector
multiply is not loop-unrolled.  The two alternate implementations
are unrolled to a depth of 2 and to a depth of 8.  Please
experiment with these to find the fastest for your particular
architecture.  When reporting timing results, any of the three may
be used without penalty.
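
To make the difference between the variants concrete, here is a minimal
sketch of a non-unrolled multiply and one unrolled to a depth of 2.  It is
not copied from cg.c: the function names spmv_plain and spmv_unroll2, the
argument order, and the zero-based compressed-row layout (rowstr, colidx, a)
are assumptions made for illustration.

```c
/* Assumed layout: row j owns entries a[rowstr[j]] .. a[rowstr[j+1]-1],
   and colidx[k] is the column of entry a[k].  All indices are zero-based. */

/* Baseline: q = A*p with no loop unrolling. */
void spmv_plain(int nrows, const int *rowstr, const int *colidx,
                const double *a, const double *p, double *q)
{
    for (int j = 0; j < nrows; j++) {
        double sum = 0.0;
        for (int k = rowstr[j]; k < rowstr[j + 1]; k++)
            sum += a[k] * p[colidx[k]];
        q[j] = sum;
    }
}

/* Same product with the inner loop unrolled to a depth of 2. */
void spmv_unroll2(int nrows, const int *rowstr, const int *colidx,
                  const double *a, const double *p, double *q)
{
    for (int j = 0; j < nrows; j++) {
        int k = rowstr[j];
        int end = rowstr[j + 1];
        double sum1 = 0.0, sum2 = 0.0;

        /* Peel one entry when the row has an odd number of nonzeros. */
        if ((end - k) % 2 == 1) {
            sum1 = a[k] * p[colidx[k]];
            k++;
        }
        /* The remaining count is even, so k + 1 stays inside the row. */
        for (; k < end; k += 2) {
            sum1 += a[k] * p[colidx[k]];
            sum2 += a[k + 1] * p[colidx[k + 1]];
        }
        q[j] = sum1 + sum2;
    }
}
```

A depth-8 variant would follow the same pattern with a remainder loop of up
to seven entries.  Because the unrolled version accumulates into two partial
sums, its result can differ from the baseline in the last few bits on general
input.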

Performance examples:
The non-unrolled version of the multiply is actually slightly
(perhaps 5%) faster on the sp2-66MHz-WN on 16 nodes than the
unrolled-by-2 version.  On the Cray t3d, the reverse is true,
i.e., the unrolled-by-2 version is some 10% faster.
The unrolled-by-8 version is significantly faster
on the Cray t3d: the code runs 1.5 times faster overall.
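
Since the fastest variant depends on the machine, one way to compare them
is to time repeated calls to each.  The sketch below is a hypothetical
harness, not part of the benchmark: the spmv_fn typedef, time_spmv, and the
stand-in kernel spmv_baseline are assumed names, and it measures CPU time
with the standard C clock() rather than the benchmark's own timers.

```c
#include <time.h>

/* Hypothetical kernel signature: q = A*p for a zero-based compressed-row matrix. */
typedef void (*spmv_fn)(int nrows, const int *rowstr, const int *colidx,
                        const double *a, const double *p, double *q);

/* Stand-in kernel (non-unrolled) so the harness has something to time. */
void spmv_baseline(int nrows, const int *rowstr, const int *colidx,
                   const double *a, const double *p, double *q)
{
    for (int j = 0; j < nrows; j++) {
        double sum = 0.0;
        for (int k = rowstr[j]; k < rowstr[j + 1]; k++)
            sum += a[k] * p[colidx[k]];
        q[j] = sum;
    }
}

/* Run the kernel reps times and return the elapsed CPU time in seconds. */
double time_spmv(spmv_fn kernel, int reps, int nrows, const int *rowstr,
                 const int *colidx, const double *a, const double *p,
                 double *q)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        kernel(nrows, rowstr, colidx, a, p, q);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Passing each variant to time_spmv with the same matrix and a large enough
reps gives a rough per-architecture comparison.  Small matrices that fit in
cache may rank the variants differently than the full benchmark does.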
