Commit

Merge remote-tracking branch 'origin/main'
jussienko committed Nov 27, 2024
2 parents 118255e + c4321cc commit dfd7bd9
Showing 4 changed files with 3 additions and 5 deletions.
2 changes: 1 addition & 1 deletion docs/03-memory-access-hierarchy.md
@@ -30,7 +30,7 @@ lang: en
<div class="column" style=width:68%>
- Accessible by all threads in a grid
- Slow, latency of eg. 600-700 cycles
-- Still, high bandwidth compared to CPU memory (1600 TB/s in AMD MI250X)
+- Still, high bandwidth compared to CPU memory (1600 GB/s for a single GCD of AMD MI250X)
- Can be controlled by host (via pointer operations)
- Lifetime of the program
</div>
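To put the corrected figure in this hunk in perspective: a rough lower bound on the time needed to stream a buffer through global memory is its size divided by the peak bandwidth. The sketch below assumes decimal units (1 GB = 10^9 bytes) and uses a hypothetical helper name, `min_transfer_seconds`. It is illustrative arithmetic only, not a measurement.

```cpp
// Lower bound on the time to move `bytes` at a given peak bandwidth (GB/s).
// Assumption: 1 GB = 1e9 bytes. Real kernels rarely sustain peak bandwidth.
constexpr double min_transfer_seconds(double bytes, double bandwidth_gb_per_s) {
    return bytes / (bandwidth_gb_per_s * 1.0e9);
}
```

For example, `min_transfer_seconds(1.6e9, 1600.0)` is about one millisecond for a 1.6 GB buffer, whereas the old 1600 TB/s figure would have implied about one microsecond.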
4 changes: 0 additions & 4 deletions docs/06-kokkos.md
@@ -53,10 +53,6 @@ lang: en
- Execution units may have distinct memories
</div>

-<div class="column">
-![](img/kokkos-node-doc.png){.center width=70%}
-</div>
-
# Execution and Memory Spaces

- Kokkos uses an execution space model to abstract the details of parallel hardware
@@ -61,6 +61,7 @@ int main(int argc, char** argv)
Kokkos::MDRangePolicy<Kokkos::Rank<2> >({1, 1}, {nx-1, ny-1}),
laplFunctor(A, L, dx, dy));

+Kokkos::fence();
double t1 = timer.seconds();

// Check the result
@@ -50,6 +50,7 @@ int main(int argc, char** argv)
(A(i,j-1) - 2.0*A(i,j) + A(i,j+1)) * inv_dy2;
});

+Kokkos::fence();
double t1 = timer.seconds();

// Check the result
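Why the added `Kokkos::fence()` matters for the timing: on some execution spaces a parallel dispatch can return before the kernel has finished, so reading the timer right after the launch may capture only the launch overhead. The sketch below is an analogy in standard C++, not Kokkos code. `launch_kernel`, `Timings` and `time_launch` are hypothetical names, `std::async` stands in for an asynchronous kernel launch, and `wait()` plays the role of the fence.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for an asynchronous kernel launch: the work starts
// on another thread and the call returns without waiting for it.
inline std::future<void> launch_kernel(std::chrono::milliseconds work) {
    return std::async(std::launch::async,
                      [work] { std::this_thread::sleep_for(work); });
}

// Elapsed seconds read before and after waiting for the work to finish.
struct Timings {
    double before_wait;
    double after_wait;
};

inline Timings time_launch(std::chrono::milliseconds work) {
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();

    std::future<void> done = launch_kernel(work);

    // Read without synchronizing: may miss most of the work.
    const double before =
        std::chrono::duration<double>(clock::now() - t0).count();

    // Analogous to Kokkos::fence(): block until the work has completed.
    done.wait();

    // Read after synchronizing: includes the full duration of the work.
    const double after =
        std::chrono::duration<double>(clock::now() - t0).count();

    return {before, after};
}
```

Calling `time_launch(std::chrono::milliseconds(50))` returns an `after_wait` of at least the work duration, while `before_wait` is typically much smaller. That is the discrepancy the fence before `timer.seconds()` is meant to remove.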