'make test' in examples/ssymm fails: 'tiled' output differs #42

Open
bondhugula opened this issue May 12, 2019 · 3 comments

@bondhugula
Owner

make[1]: Entering directory '/home/uday/git/pluto/examples/ssymm'
touch .test
./orig 2> out_orig
12.579868s
./tiled 2> out_tiled
3.264796s
diff -q out_orig out_tiled
Files out_orig and out_tiled differ
make[1]: *** [../common.mk:106: test] Error 1

@bondhugula
Owner Author

This is the transformation found for 'tiled'.

../../polycc ssymm.c --noparallel --codegen-context=100 -o ssymm.tiled.c
[pluto] compute_deps (isl)
[pluto] Number of statements: 3
[pluto] Total number of loops: 8
[pluto] Number of deps: 15
[pluto] Maximum domain dimensionality: 3
[pluto] Number of parameters: 1
[pluto] Diamond tiling not possible/useful
[pluto] Affine transformations [<iter coeff's> ]

T(S1): (2, i, k, j)
loop types (scalar, loop, loop, loop)

T(S2): (0, i, j, k)
loop types (scalar, loop, loop, loop)

T(S3): (1, i, j, 0)
loop types (scalar, loop, loop, scalar)

[Pluto] After tiling:
T(S1): (2, i/32, k/32, j/32, i, k, j)
loop types (scalar, loop, loop, loop, loop, loop, loop)

T(S2): (0, i/32, j/32, k/32, i, j, k)
loop types (scalar, loop, loop, loop, loop, loop, loop)

T(S3): (1, i/32, j/32, i, j, 0, 0)
loop types (scalar, loop, loop, loop, loop, scalar, scalar)

[Pluto] After intra-tile optimize
T(S1): (2, i/32, k/32, j/32, i, j, k)
loop types (scalar, loop, loop, loop, loop, loop, loop)

T(S2): (0, i/32, j/32, k/32, i, j, k)
loop types (scalar, loop, loop, loop, loop, loop, loop)

T(S3): (1, i/32, j/32, i, j, 0, 0)
loop types (scalar, loop, loop, loop, loop, scalar, scalar)

[pluto] using statement-wise -fs/-ls options: S1(5,7), S2(5,7), S3(4,7),
[Pluto] Output written to ssymm.tiled.c

[pluto] Timing statistics
[pluto] SCoP extraction + dependence analysis time: 0.001345s
[pluto] Auto-transformation time: 0.022889s
[pluto] Total constraint solving time (LP/MIP/ILP) time: 0.002466s
[pluto] Code generation time: 0.041953s
[pluto] Other/Misc time: 0.224972s
[pluto] Total time: 0.291159s
[pluto] All times: 0.001345 0.022889 0.041953 0.224972
gcc -O3 -march=native -mtune=native -ftree-vectorize -DTIME ssymm.tiled.c -o tiled -lm

@bondhugula
Owner Author

The output of 'par' and 'tiled' match, but they both differ from that of 'orig'.
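
Since `diff -q` only reports that the files differ, a tolerance-based comparison can help quantify by how much. The following is a minimal sketch, not part of the Pluto test harness: `count_mismatches`, its parameters, and the relative-tolerance criterion are all hypothetical, and it assumes the values from `out_orig` and `out_tiled` have already been parsed into two arrays.

```c
#include <stddef.h>

/* Absolute value without pulling in libm. */
static double abs_d(double v) { return v < 0 ? -v : v; }

/*
 * Hypothetical helper: count the entries of a[] and b[] whose relative
 * difference exceeds rel_tol. If max_rel_err is non-NULL, the largest
 * relative difference seen is stored there.
 */
size_t count_mismatches(const double *a, const double *b, size_t n,
                        double rel_tol, double *max_rel_err) {
    size_t bad = 0;
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double diff = abs_d(a[i] - b[i]);
        /* Scale by the larger magnitude so the measure is symmetric. */
        double scale = abs_d(a[i]) > abs_d(b[i]) ? abs_d(a[i]) : abs_d(b[i]);
        double rel = scale > 0.0 ? diff / scale : 0.0;
        if (rel > worst) worst = rel;
        if (rel > rel_tol) bad++;
    }
    if (max_rel_err) *max_rel_err = worst;
    return bad;
}
```

Running it once with a very tight tolerance and once with a looser one shows whether the mismatch is a handful of low-order digits or something larger.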

@bondhugula
Owner Author

This is a precision-related issue, likely caused by value-unsafe floating-point optimizations. The mismatch goes away with -O1. Even at -O3, only a small number of values differ, and only at the sixth decimal place.
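
For context on why evaluation order matters here: floating-point addition is not associative, so accumulating the same values in a different order can round differently. The sketch below is a generic illustration, not code from `ssymm.c` or from Pluto's output. The function names and the block size are hypothetical, and it assumes IEEE single-precision arithmetic without excess precision.

```c
#include <stddef.h>

/* Accumulate in plain index order. */
float sum_forward(const float *x, size_t n) {
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/*
 * Accumulate the same values block by block (hypothetical block size),
 * adding each block's partial sum to the running total. Mathematically
 * this is the same sum; in floating point the rounding can differ.
 */
float sum_blocked(const float *x, size_t n, size_t block) {
    float total = 0.0f;
    for (size_t b = 0; b < n; b += block) {
        float partial = 0.0f;
        size_t end = (b + block < n) ? b + block : n;
        for (size_t i = b; i < end; i++)
            partial += x[i];
        total += partial;
    }
    return total;
}
```

With the input `{1e8f, 1.0f, -1e8f, 1.0f}` and a block size of 2, the two functions return different values even though both compute the "same" sum. This input is deliberately extreme; with well-scaled data the disagreement usually shows up only in the last few significant digits, which matches a difference at the sixth decimal place.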
