Re-implement SYCL backend parallel_for to improve bandwidth utilization #432

| Job | Run time |
|---|---|
| | 15s |
| | 17m 46s |
| | 15m 38s |
| | 4m 57s |
| | 16m 52s |
| | 24m 27s |
| | 12m 36s |
| | 11m 51s |
| | 10m 18s |
| | 6m 11s |
| | 9m 46s |
| | 9m 43s |
| | 11m 9s |
| | 8m 30s |
| | 11m 10s |
| | 11m 44s |
| | 5m 9s |
| | 26m 7s |
| | 17m 28s |
| | 18m 45s |
| Total | 4h 10m 22s |