Using XArray and dask in satpy

Martin Raspaud edited this page Mar 1, 2018 · 24 revisions

XArray

import xarray as xr

XArray's DataArray is now the standard data structure for arrays in satpy. It lets an array carry named dimensions, coordinates, and attributes (which we use for the metadata).

To create such an array, you can do, for example:

import numpy as np

# my_data is a 2D array whose axes are ordered (y, x)
my_dataarray = xr.DataArray(my_data, dims=['y', 'x'],
                            coords={'x': np.arange(...)},
                            attrs={'sensor': 'olci'})

my_data can be a regular numpy array, a numpy memmap, or, if you want to keep things lazy, a dask array (more on dask later).
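As a minimal, self-contained sketch of the two cases, the snippet below builds one DataArray from a numpy array and one from a dask array. The 4x5 shape, the chunk size and the x coordinate values are assumptions made for illustration only; they do not come from satpy.

```python
import numpy as np
import dask.array as da
import xarray as xr

# Hypothetical 4x5 image; real data would normally come from a file reader.
my_data = np.arange(20, dtype=np.float64).reshape(4, 5)

# Eager version: the data lives in memory as a numpy array.
eager = xr.DataArray(my_data, dims=['y', 'x'],
                     coords={'x': np.arange(5)},
                     attrs={'sensor': 'olci'})

# Lazy version: wrapping the data in a dask array defers the computations.
lazy = xr.DataArray(da.from_array(my_data, chunks=(2, 5)),
                    dims=['y', 'x'],
                    coords={'x': np.arange(5)},
                    attrs={'sensor': 'olci'})

# Operations on the lazy array only build a task graph...
doubled = lazy * 2
# ...and the work happens when the result is explicitly requested.
result = doubled.compute()
```

Here `doubled.data` is still a dask array, while `result.data` is a plain numpy array.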

In satpy, the dimensions of the arrays should include:

  • x for the x or pixel dimension
  • y for the y or line dimension
  • bands for composites
  • time can also be provided, but support for it is limited at the moment. For common cases, use the metadata instead (start_time, end_time).

Dimensions are accessible through my_dataarray.dims. To get the size of a given dimension, use sizes:

my_dataarray.sizes['x']
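For instance, with a hypothetical 3-band composite of 4 lines by 5 pixels (the shape and the variable names are illustrative assumptions), the dimension names and sizes can be read back like this:

```python
import numpy as np
import xarray as xr

# Hypothetical composite: 3 bands, 4 lines (y), 5 pixels (x).
composite = xr.DataArray(np.zeros((3, 4, 5)), dims=['bands', 'y', 'x'])

dims = composite.dims              # tuple of dimension names, in axis order
n_pixels = composite.sizes['x']    # length of the x dimension
n_lines = composite.sizes['y']     # length of the y dimension
n_bands = composite.sizes['bands']
```

Looking sizes up by name rather than by axis position keeps the code working if the dimension order changes.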

Coordinates can be defined for those dimensions when it makes sense.
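A minimal sketch of what this could look like is given below. The band names and the x/y coordinate values are assumptions chosen for illustration, not values prescribed by satpy.

```python
import numpy as np
import xarray as xr

# Hypothetical RGB composite of 2 lines by 4 pixels.
data = xr.DataArray(
    np.zeros((3, 2, 4)),
    dims=['bands', 'y', 'x'],
    coords={'bands': ['R', 'G', 'B'],                # assumed band labels
            'y': [1500.0, 500.0],                    # assumed line coordinates
            'x': [-1500.0, -500.0, 500.0, 1500.0]})  # assumed pixel coordinates

# Coordinates allow label-based selection instead of integer indexing.
red = data.sel(bands='R')      # drops the bands dimension
column = data.sel(x=500.0)     # drops the x dimension
```

With coordinates in place, `sel` selects by label, so the code does not need to know the position of a band or a pixel along its axis.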

Dask

import dask.array as da

Helpful functions:

  • map_blocks
  • map_overlap
  • atop (renamed to blockwise in later dask releases)
  • store
  • tokenize
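
The following is a rough sketch of how some of these functions can be used, not a description of how satpy itself calls them. The input array, the chunk sizes, the `smooth_rows` helper and the numpy target are all hypothetical. atop is left out because its name depends on the dask version in use.

```python
import numpy as np
import dask.array as da
from dask.base import tokenize

# Hypothetical 4x4 input split into four 2x2 chunks.
arr = da.from_array(np.arange(16, dtype=np.float64).reshape(4, 4),
                    chunks=(2, 2))

# map_blocks: run a function on every chunk independently.
scaled = arr.map_blocks(lambda block: block * 0.5, dtype=arr.dtype)


def smooth_rows(block):
    """3-point mean along x, computed on a padded chunk."""
    out = block.copy()
    out[:, 1:-1] = (block[:, :-2] + block[:, 1:-1] + block[:, 2:]) / 3.0
    return out


# map_overlap: like map_blocks, but each chunk is first padded with `depth`
# cells from its neighbours, and the padding is trimmed off afterwards.
smoothed = arr.map_overlap(smooth_rows, depth=1, boundary='nearest',
                           dtype=arr.dtype)

# store: compute chunk by chunk and write into a preallocated target
# (a numpy array here; any object supporting slice assignment would do).
target = np.empty(arr.shape, dtype=arr.dtype)
da.store(scaled, target)

# tokenize: deterministic hash of its arguments, handy for building
# unique and reproducible task names.
token = tokenize('scale', 0.5)
```

map_overlap is the one to reach for when a result depends on neighbouring pixels (filters, gradients), since plain map_blocks would produce artifacts at chunk borders.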