diff --git a/docs/03-memory-access-hierarchy.md b/docs/03-memory-access-hierarchy.md
index 0662141..46809b6 100644
--- a/docs/03-memory-access-hierarchy.md
+++ b/docs/03-memory-access-hierarchy.md
@@ -30,7 +30,7 @@ lang: en
- Accessible by all threads in a grid
- Slow, latency of eg. 600-700 cycles
- - Still, high bandwidth compared to CPU memory (1600 TB/s in AMD MI250X)
+ - Still, high bandwidth compared to CPU memory (1600 GB/s for a single GCD of AMD MI250X)
- Can be controlled by host (via pointer operations)
- Lifetime of the program
diff --git a/docs/06-kokkos.md b/docs/06-kokkos.md
index c9a51b8..b138d5d 100644
--- a/docs/06-kokkos.md
+++ b/docs/06-kokkos.md
@@ -53,10 +53,6 @@ lang: en
- Execution units may have distinct memories
-
-![](img/kokkos-node-doc.png){.center width=70%}
-
-
# Execution and Memory Spaces
- Kokkos uses an execution space model to abstract the details of parallel hardware
diff --git a/exercises/kokkos/05-laplacian/solution-functor/laplacian.cpp b/exercises/kokkos/05-laplacian/solution-functor/laplacian.cpp
index 14b1575..14cbd90 100644
--- a/exercises/kokkos/05-laplacian/solution-functor/laplacian.cpp
+++ b/exercises/kokkos/05-laplacian/solution-functor/laplacian.cpp
@@ -61,6 +61,7 @@ int main(int argc, char** argv)
Kokkos::MDRangePolicy<Kokkos::Rank<2> >({1, 1}, {nx-1, ny-1}),
laplFunctor(A, L, dx, dy));
+ Kokkos::fence();
double t1 = timer.seconds();
// Check the result
diff --git a/exercises/kokkos/05-laplacian/solution-lambda/laplacian.cpp b/exercises/kokkos/05-laplacian/solution-lambda/laplacian.cpp
index 0a9fcd1..f6bf022 100644
--- a/exercises/kokkos/05-laplacian/solution-lambda/laplacian.cpp
+++ b/exercises/kokkos/05-laplacian/solution-lambda/laplacian.cpp
@@ -50,6 +50,7 @@ int main(int argc, char** argv)
(A(i,j-1) - 2.0*A(i,j) + A(i,j+1)) * inv_dy2;
});
+ Kokkos::fence();
double t1 = timer.seconds();
// Check the result