Distributed Regridding
Consider the following dense representation of an example matrix. We can use Julia's sparse matrix representation, `SparseMatrixCSC`, to store it as a small collection of integers and vectors instead:

```julia
using SparseArrays
dense_matrix = [6 0 7 0; 0 0 1 0; -8 3 0 0; 0 0 5 -2]
sparse_matrix = SparseMatrixCSC(dense_matrix)
```
This gives a `SparseMatrixCSC` with the following fields:

- `sparse_matrix.m`: 4
- `sparse_matrix.n`: 4
- `sparse_matrix.nzval`: `[6, -8, 3, 7, 1, 5, -2]`, with length 7
- `sparse_matrix.rowval`: `[1, 3, 3, 1, 2, 4, 4]`, with length 7 (= `length(sparse_matrix.nzval)`)
- `sparse_matrix.colptr`: `[1, 3, 4, 7, 8]`, with length 5 (= `sparse_matrix.n + 1`)
Here, `m` and `n` are the number of rows and columns in `sparse_matrix`, respectively.

`sparse_matrix.nzval` is a vector containing the nonzero values of `dense_matrix`.

`sparse_matrix.rowval` is a vector containing the row indices of the nonzero values of `dense_matrix`. That is, `sparse_matrix.rowval[i]` gives the row index of the value `sparse_matrix.nzval[i]` in the original `dense_matrix`.

`sparse_matrix.colptr` is a vector containing the indices into `sparse_matrix.nzval` of the first nonzero value in each column of `dense_matrix`. It also contains a final bound which is 1 larger than the largest index of `sparse_matrix.nzval` (i.e. `length(sparse_matrix.nzval) + 1`). Values from column `j` of the original `dense_matrix` have indices in `sparse_matrix.nzval` in the range `[sparse_matrix.colptr[j], sparse_matrix.colptr[j+1])`. The number of nonzero values in column `j` is given by `sparse_matrix.colptr[j + 1] - sparse_matrix.colptr[j]`. Because this vector contains one index for each column of the original matrix plus one extra bound value, its length is always `n + 1`.
This is perhaps less intuitive than storing all of the column indices into the original `dense_matrix` (as is done for row indices in `rowval`), but it encodes enough information to recover the dense matrix from the sparse representation while requiring fewer stored numbers than a full list of column indices would.
These fields provide all the necessary information to index into a sparse matrix, or to convert a sparse matrix to the corresponding dense matrix.
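For example, the following sketch uses these fields to recover the nonzero entries of a single column and to reconstruct the dense matrix, continuing from the `sparse_matrix` defined above:

```julia
# Nonzero entries of column 3 occupy indices colptr[3]:(colptr[4] - 1) of nzval.
col = 3
idxs = sparse_matrix.colptr[col]:(sparse_matrix.colptr[col + 1] - 1)
sparse_matrix.nzval[idxs]     # [7, 1, 5], the nonzero values in column 3
sparse_matrix.rowval[idxs]    # [1, 2, 4], their row indices

# The full dense matrix can be recovered directly:
Matrix(sparse_matrix) == dense_matrix    # true
```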
Consider the case where we want to remap from a source space with 1 element in the vertical, 2 elements in the horizontal, and 3 nodes per element in each direction (`nex = 1, ney = 2, nq = 3`), to a target space with 1 element in the vertical, 3 elements in the horizontal, and 3 nodes per element in each direction (`nex = 1, ney = 3, nq = 3`). The source space has 2 * 3^2 = 18 nodes total, and the target space has 3 * 3^2 = 27 nodes total.
This example is represented in the diagram below, where each square is an element and each `*` is a node.
```
    source space              target space
   -------------           -------------------
   |* * *|* * *|           |* * *|* * *|* * *|
   |     |     |           |     |     |     |
   |* * *|* * *|  ------>  |* * *|* * *|* * *|
   |     |     |           |     |     |     |
   |* * *|* * *|           |* * *|* * *|* * *|
   -------------           -------------------
```
The weight matrix used to map between these spaces will have 18 columns (corresponding to the nodes of the source space) and 27 rows (corresponding to the nodes of the target space).
The weight matrix for the mapping can be represented as a sparse matrix, and will have the following properties:

- `weights.m = 27`
- `weights.n = 18`
- `length(weights.colptr) = 19 = weights.n + 1`
- `length(weights.rowval) = length(weights.nzval) = 96`
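As a quick check, the dimensions above follow directly from the node counts of the two spaces (a minimal sketch; the variable names are only for illustration):

```julia
nq = 3                      # nodes per element in each direction
n_source = 1 * 2 * nq^2     # nex * ney * nq^2 = 18 source nodes -> weights.n (columns)
n_target = 1 * 3 * nq^2     # nex * ney * nq^2 = 27 target nodes -> weights.m (rows)
n_source + 1                # 19, the expected length of weights.colptr
```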
See Julia's SparseMatrixCSC documentation for more details.
TempestRemap, which is written in C++, doesn't use Julia's sparse matrix representation. Instead, it stores 3 arrays: one containing the nonzero values of the matrix (`S`), one containing the corresponding row indices in the reconstructed matrix (`row`), and one containing the corresponding column indices in the reconstructed matrix (`col`). That is, the nonzero weight stored in `S[i]` has row index `row[i]` and column index `col[i]` in the normal matrix representation.
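For illustration, triplets in this layout can be assembled into Julia's CSC representation with `SparseArrays.sparse`. The sketch below uses made-up weight values together with the 27 × 18 dimensions of the example above:

```julia
using SparseArrays

# Hypothetical TempestRemap-style output: weights plus their row/column indices.
S   = [0.25, 0.75, 0.5, 0.5]    # nonzero weights (made-up values)
row = [1, 1, 2, 2]              # target (row) index of each weight
col = [1, 2, 1, 3]              # source (column) index of each weight

# Assemble into a SparseMatrixCSC with the dimensions of the example weight matrix.
weights = sparse(row, col, S, 27, 18)
(weights.m, weights.n)          # (27, 18)
length(weights.colptr)          # 19 = weights.n + 1
```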
In our remapping code, we construct a `LinearMap` object, which contains all of the necessary information for the remapping, including the weight matrix. The components related to the sparse matrix are `weights`, `source_local_idxs`, and `target_local_idxs`. `weights` is simply a vector of the nonzero weights, analogous to `sparse_matrix.nzval` above. `source_local_idxs` is a 3-tuple of the form `(i, j, e)`, where these are vectors containing the row index, column index, and element index in the source space of each value in `weights` (I'm not 100% sure whether `i` or `j` is the row vs. the column index). Each of these vectors has the same length as `weights`. `target_local_idxs` stores the same information, but for the target space rather than the source space. Notably, `source_local_idxs` actually stores the unique global indices (we should change this to actually store the local indices ASAP, or change the name), and `target_local_idxs` stores the unique local indices. This is because our current "distributed" implementation is only capable of regridding from a serial source space to a distributed target space, but we should extend this soon (written 20 July 2023).
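Conceptually, applying such a map amounts to one multiply-accumulate per stored weight. The sketch below only illustrates that idea using assumed field names and plain 3D arrays indexed by `(i, j, e)`; it is not the actual ClimaCore/ClimaCoupler implementation:

```julia
# Illustrative stand-in for the LinearMap components described above.
struct SimpleLinearMap
    weights::Vector{Float64}             # nonzero weights (like sparse_matrix.nzval)
    source_idxs::NTuple{3, Vector{Int}}  # (i, j, e) indices into the source data
    target_idxs::NTuple{3, Vector{Int}}  # (i, j, e) indices into the target data
end

# Apply the map: each weight takes one source node value and accumulates it into
# one target node value.
function apply_map!(target, map::SimpleLinearMap, source)
    fill!(target, 0)
    (si, sj, se) = map.source_idxs
    (ti, tj, te) = map.target_idxs
    for k in eachindex(map.weights)
        target[ti[k], tj[k], te[k]] += map.weights[k] * source[si[k], sj[k], se[k]]
    end
    return target
end
```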
ClimaCore uses the following hierarchy to build up a `Space` that fields can be defined on:

- Domain: physical domain (rectangle, sphere, etc)
- Mesh: adds elements on top of the domain, dividing it into a type of grid
- Topology: adds ordering to the elements of the mesh, and contains information about whether the space is distributed
- Space: adds nodes within elements of the topology
Each element of a topology can be identified using several types of indices:

- An integer `gidx` providing element ordering globally across all processes
- An integer `lidx` providing element ordering locally within each process
    - For a serial space, this is the same as the global index
    - When working on a distributed space, each element is fully contained within one process. There is no overlap of a given element across multiple processes.
    - For a distributed space, the number of local indices summed across all processes is equal to the number of global indices.
- A tuple of integers `(elem_x, elem_y)` (the Cartesian index) that provides a unique indexing for each element
The example below shows the global indices (`gidx`), local indices (`lidx`), and Cartesian indices (`CI`) for a mesh of 6 elements. These elements are divided among two processes (`PID`s), which comes into play for the local indexing.
A few notes on this example:

- The global indices (and therefore the local indices) are ordered based on a space-filling curve, as is typically used in ClimaCore.
- The ordering of `lidx` indices follows the ordering of `gidx` indices even across processes. This is because we first specify the ordering of global `gidx` indices (here, using a space-filling curve), then partition elements by process, then assign `lidx` to each element of a process starting at 1 and following the order of `gidx` (see the sketch after the diagram below).
- The Cartesian indices do not use a space-filling curve since they're based on the x and y positions of each element.
- The 3 elements in the bottom row are on `PID 1`. The 3 elements in the top row are on `PID 2`. Because there are 3 elements on each process, the `lidx` indices for each row (process) are in the range `[1, 3]`.
    - Note that the elements in each row will not necessarily all be on the same process. For example, if we had 3 processes instead of 2, this would not be the case.
```
          -------------------------------------------
          |             |             |             |
          | gidx = 6    | gidx = 5    | gidx = 4    |
          | lidx = 3    | lidx = 2    | lidx = 1    |
          | CI = (2, 1) | CI = (2, 2) | CI = (2, 3) |
          |             |             |             |  PID 2
  - - - - ------------------------------------------- - - - -
          |             |             |             |  PID 1
          | gidx = 1    | gidx = 2    | gidx = 3    |
          | lidx = 1    | lidx = 2    | lidx = 3    |
          | CI = (1, 1) | CI = (1, 2) | CI = (1, 3) |
          |             |             |             |
          -------------------------------------------
```
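The `lidx` assignment described in the notes above can be sketched as follows; the arrays and their values are purely illustrative of this 6-element example, not ClimaCore internals:

```julia
# Global element indices, already ordered by the space-filling curve, and the
# process (PID) that owns each element (values taken from the diagram above).
gidx_order = 1:6
elem_to_pid = [1, 1, 1, 2, 2, 2]

# Walk the elements in gidx order and keep a separate counter per process, so
# each process's lidx values start at 1 and follow the gidx ordering.
lidx = zeros(Int, length(gidx_order))
counters = Dict{Int, Int}()
for gidx in gidx_order
    pid = elem_to_pid[gidx]
    counters[pid] = get(counters, pid, 0) + 1
    lidx[gidx] = counters[pid]
end
lidx    # [1, 2, 3, 1, 2, 3], matching the diagram
```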
Redundant node indices:

- Constructed by traversing all nodes while keeping a running node-number counter. Each node is assigned the current node number, which is incremented after each assignment.
- This results in multiple number assignments for nodes which exist at the boundary of more than one element, since these nodes are effectively duplicated for each element they appear within.

Unique node indices:

- Constructed similarly to redundant indices, except that nodes which already have numbers are skipped.
- This results in a single unique index for each node, even those which appear in multiple elements.
The example below shows the global redundant (blue) and unique (black) nodal indices for nodes on the mesh used in the previous example.
The redundant indices are generated using the following algorithm:

1. Initialize `node_number = 1`.
2. Loop over global element indices `gidx` in ascending order. Do the following for each element:
   a. Begin at the leftmost node in the bottom row, and traverse nodes left to right in each row, then traverse rows from the bottom to the top.
   b. At each node, assign this node the current `node_number`, then increment `node_number` by 1.

The unique indices are generated using this algorithm with one change: in step 2b, if a node already has a value, it is not given an additional value and `node_number` is not incremented.
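Below is a minimal sketch of both counting algorithms, assuming two horizontally adjacent elements with `nq = 3` nodes per direction; all names are illustrative rather than ClimaCore API:

```julia
# Global (x, y) position of each node of element `gidx`, traversed left to right
# within each row, with rows from bottom to top. Neighboring elements share a
# column of node positions.
element_nodes(gidx, nq) = [((gidx - 1) * (nq - 1) + i, j) for j in 1:nq for i in 1:nq]

function number_nodes(nelems, nq)
    redundant = Int[]                          # one entry per (element, node) visit
    unique_idx = Dict{Tuple{Int, Int}, Int}()  # node position => unique index
    node_number = 1
    unique_number = 1
    for gidx in 1:nelems, pos in element_nodes(gidx, nq)
        push!(redundant, node_number)          # every visit gets a fresh number
        node_number += 1
        if !haskey(unique_idx, pos)            # only the first visit gets a number
            unique_idx[pos] = unique_number
            unique_number += 1
        end
    end
    return redundant, unique_idx
end

redundant, unique_idx = number_nodes(2, 3)
length(redundant)     # 18 redundant indices (9 per element)
length(unique_idx)    # 15 unique indices (3 boundary nodes are shared)
```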
A few notes on this example:
- We could similarly show the local redundant and unique indices of nodes in elements distributed across different processes. The counting algorithms are the same, with the only change being that both types of indexing reset to 1 at the first node on a new process.
- This example shows nodal indexing where the elements are ordered using a space-filling curve (as was shown in the element indexing example of the previous section).
- Local element indices (usually denoted by `lidx`)
- Redundant node indices
- Typically applies a space-filling curve for element ordering
- Unique global indices, one integer per node
- Does not provide information about element indexing
- Uses the element ordering of the input topology