Re-implement SYCL backend parallel_for to improve bandwidth utilization #535
| Job | Run time |
|---|---|
| | 19m 37s |
| | 16m 50s |
| | 17s |
| | 10m 25s |
| | 15m 30s |
| | 4m 44s |
| | 24m 9s |
| | 12m 21s |
| | 9m 54s |
| | 6m 30s |
| | 9m 52s |
| | 9m 55s |
| | 11m 18s |
| | 8m 21s |
| | 11m 17s |
| | 12m 12s |
| | 5m 17s |
| | 26m 20s |
| | 17m 36s |
| | 18m 9s |
| Total | 4h 10m 34s |