
Map algebra reference

Dave Johnson edited this page Dec 6, 2017 · 57 revisions

This page provides information about each of the map algebra functions available in MrGeo, including a description of the function and the parameters passed to it.

Raster Math Operations

Standard math operations in MrGeo allow users to form algebraic expressions on input images, such as adding images together, subtracting one image from another, or applying trigonometric functions to an image. If an input value is nodata, then the output value will also be set to nodata.

The available operations are:

  • Addition

    Produces an image whose data type is the most specific type among the inputs. For example, if one input is int16 and the other is float32, the output will be float32. If the pixel value of one of the inputs is nodata, then the output value of that pixel will be nodata.

# Add two images
[myImage1] + [myImage2]
# Add a constant value to every pixel of an input image
[myImage] + 1
# Add multiple images
[myImage1] + [myImage2] + [myImage3]
  • Subtraction

    Produces an image whose data type is the most specific type among the inputs. For example, if one input is int16 and the other is float32, the output will be float32. If the pixel value of one of the inputs is nodata, then the output value of that pixel will be nodata.

# Subtract one image from another
[myImage1] - [myImage2]
# Subtract a constant value from each pixel of an image
[myImage1] - 1.0
  • Division

    Produces an image whose data type is the most specific type among the inputs. For example, if one input is int16 and the other is float32, the output will be float32. If the pixel value of one of the inputs is nodata, then the output value of that pixel will be nodata.

# Divide one image by another
[myImage1] / [myImage2]
# Divide each pixel value in an image by a constant
[myImage1] / 2.0
  • Multiplication

    Produces an image whose data type is the most specific type among the inputs. For example, if one input is int16 and the other is float32, the output will be float32. If the pixel value of one of the inputs is nodata, then the output value of that pixel will be nodata.

# Multiply two images
[myImage1] * [myImage2]
# Multiply each pixel value in an image by a constant
[myImage1] * 2.0
  • Unary minus (i.e. negative number)

    Produces an image of the same type as the input where each output pixel is the negative value of the input pixel.

-[myImage]
[myImage] * -1.0
  • Absolute value

    Produces an image of the same type as the input where each output pixel is the absolute value of the input pixel.

abs([MyImage])
  • Sine

    Produces an image of type float32 where the output pixel is the sine of the input pixel value. The input pixel value must be an angle in radians.

sin([MyImage])
  • Cosine

    Produces an image of type float32 where the output pixel is the cosine of the input pixel value. The input pixel value must be an angle in radians.

cos([MyImage])
  • Tangent

    Produces an image of type float32 where the output pixel is the tangent of the input pixel value. The input pixel value must be an angle in radians.

tan([MyImage])
  • Logarithm

    Produces an image of type float32 where the output pixel is the logarithm of the input pixel value.

# Compute the natural log of each pixel value
log([MyImage])

# Compute the log of the specified base for each pixel value. The base must be a number (not an image).
log([myImage], base)
  • Power

    For each pixel in input1 and the corresponding pixel in input2, this function produces an image of type float32 where the output pixel is input1 raised to the input2 power.

pow([input1], [input2])
  • isNull

    Produces an image of type byte where each pixel is set to either zero (0) if the corresponding input pixel is the nodata value defined for that image or one (1) otherwise. The output image will not contain any nodata values.

isNull([MyImage])
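
The nodata rule shared by the arithmetic operations above can be illustrated with a small Python sketch. This is not MrGeo code: the list-of-rows raster, the NODATA sentinel, and the add_rasters helper are all assumptions made for illustration, and Python's own int/float promotion merely stands in for the "most specific type" rule.

```python
# Illustrative only: rasters are lists of rows, and NODATA is a stand-in sentinel.
NODATA = None

def add_rasters(a, b):
    """Add two equally sized rasters; nodata in either input yields nodata."""
    result = []
    for row_a, row_b in zip(a, b):
        out_row = []
        for pa, pb in zip(row_a, row_b):
            if pa is NODATA or pb is NODATA:
                out_row.append(NODATA)   # nodata propagates to the output pixel
            else:
                out_row.append(pa + pb)  # int + float gives float, as in the int16/float32 example
        result.append(out_row)
    return result

image1 = [[1, 2], [NODATA, 4]]            # integer-valued raster
image2 = [[10.5, NODATA], [30.0, 40.0]]   # float-valued raster
total = add_rasters(image1, image2)
print(total)  # [[11.5, None], [None, 44.0]]
```

The same per-pixel pattern would apply to subtraction, multiplication and division by swapping the operator.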

Raster Comparison Operations

These operations compare each pixel of one image with either the corresponding pixel of another image or a constant (or another expression). The result is a single band image of type byte where the value is zero (0) if the condition is false, and one (1) if it is true. If an input pixel value is nodata, then the output pixel value will be set to nodata.

  • Less Than
[myImage1] < [myImage2]
[myImage1] lt [myImage2]
[myImage1] < 100.0
[myImage1] lt 100.0
  • Less Equal
[myImage1] <= [myImage2]
[myImage1] le [myImage2]
[myImage1] <= 100.0
[myImage1] le 100.0
  • Greater Than
[myImage1] > [myImage2]
[myImage1] gt [myImage2]
[myImage1] > 100.0
[myImage1] gt 100.0
  • Greater Equal
[myImage1] >= [myImage2]
[myImage1] ge [myImage2]
[myImage1] >= 100.0
[myImage1] ge 100.0
  • Equal
[myImage1] == [myImage2]
[myImage1] eq [myImage2]
[myImage1] == 100.0
[myImage1] eq 100.0
  • Not Equal
[myImage1] != [myImage2]
[myImage1] ^= [myImage2]
[myImage1] <> [myImage2]
[myImage1] ne [myImage2]
[myImage1] != 100.0
[myImage1] ^= 100.0
[myImage1] <> 100.0
[myImage1] ne 100.0
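
As a rough sketch of how a comparison yields a byte image of zeros and ones, the following Python mimics the "less than" operation on plain lists. The SOURCE_NODATA sentinel and the less_than helper are hypothetical, and using 255 as the output nodata value is an assumption carried over from the byte nodata value stated elsewhere on this page.

```python
# Illustrative only: rasters are lists of rows; SOURCE_NODATA is a made-up sentinel.
SOURCE_NODATA = -9999.0
BYTE_NODATA = 255  # assumed nodata value of the byte output image

def less_than(image, other):
    """Evaluate image < other, where other is a raster or a constant."""
    other_is_raster = isinstance(other, list)
    output = []
    for r, row in enumerate(image):
        out_row = []
        for c, value in enumerate(row):
            rhs = other[r][c] if other_is_raster else other
            if value == SOURCE_NODATA or (other_is_raster and rhs == SOURCE_NODATA):
                out_row.append(BYTE_NODATA)            # nodata in, nodata out
            else:
                out_row.append(1 if value < rhs else 0)
        output.append(out_row)
    return output

image1 = [[50.0, 150.0], [SOURCE_NODATA, 100.0]]
image2 = [[60.0, SOURCE_NODATA], [1.0, 200.0]]
print(less_than(image1, 100.0))   # [[1, 0], [255, 0]]
print(less_than(image1, image2))  # [[1, 255], [255, 1]]
```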

Logical Raster Operations

These operations compare each pixel value of one image with either the corresponding pixel value of another image or a constant value (or an expression that evaluates to an image or constant). A value of zero is treated as false, and any other value is treated as true. The output image is of type byte, and the value is zero (0) if the expression is false and one (1) if true.

It is also important to note that if either input contains a nodata value, the result will also be nodata (which is defined as a byte value 255).

  • And
[myImage1] && [myImage2]
[myImage1] & [myImage2]
[myImage1] and [myImage2]
  • Or
[myImage1] || [myImage2]
[myImage1] | [myImage2]
[myImage1] or [myImage2]
  • Xor
[myImage1] xor [myImage2]
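
The truth rules described for the logical operations can be sketched for a single pair of pixels as follows. The logical_pixel helper and its nodata_a/nodata_b parameters are hypothetical names used only for this example; the 255 output value follows the byte nodata definition given above.

```python
BYTE_NODATA = 255  # nodata value of the byte output, as stated above

def logical_pixel(op, a, b, nodata_a=None, nodata_b=None):
    """Combine two pixel values with 'and', 'or' or 'xor'."""
    if a == nodata_a or b == nodata_b:
        return BYTE_NODATA          # nodata on either side gives nodata
    truth_a = a != 0                # zero is false, anything else is true
    truth_b = b != 0
    if op == "and":
        result = truth_a and truth_b
    elif op == "or":
        result = truth_a or truth_b
    elif op == "xor":
        result = truth_a != truth_b
    else:
        raise ValueError("unknown operator: " + op)
    return 1 if result else 0

print(logical_pixel("and", 5, 0))                        # 0
print(logical_pixel("or", 5, 0))                         # 1
print(logical_pixel("xor", 0, -2))                       # 1
print(logical_pixel("or", 1, -9999, nodata_b=-9999))     # 255
```

Note that under this rule "or" returns nodata even when the other operand is true, because nodata always takes precedence.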

Raster Functions

This section describes map algebra functions that produce raster output. The inputs vary by function, so they are described separately for each function.

bandcombine Function

Takes two or more arguments and produces a multi-band output image where the bands are numbered according to the order they are listed in the arguments. The output image will have one band for each input image.

Command Line Syntax

bandcombine(image1, image2, [image3, ...])
Parameters
  • image1

    Becomes band 1 of the output image

  • image2

    Becomes band 2 of the output image

  • image3

    Becomes band 3 of the output image

Examples:
# With landsat bands for a scene ingested, the following will produce a 3-band true color image
bandcombine([landsat8-image-b4], [landsat8-image-b3], [landsat8-image-b2])

Python Syntax

image1.bandcombine(image2, [image3, ...])
Parameters
  • image1

    Becomes band 1 of the output image

  • image2

    Becomes band 2 of the output image

  • image3

    Becomes band 3 of the output image

Examples:
# With landsat bands for a scene ingested, the following will produce a 3-band true color image
b4 = mrgeo.load_image("landsat8-image-b4")
b3 = mrgeo.load_image("landsat8-image-b3")
b2 = mrgeo.load_image("landsat8-image-b2")
b4.bandcombine(b3, b2)

bandextract Function

Extracts a single band from a multi-band image.

Command Line Syntax

bandextract(image, bandNum)
Parameters
  • image

    The multi-band image from which to extract a band

  • bandNum

    The number of the band to extract where the first band is 1

Examples:
# Extract the green band from an RGB image
bandextract([my-rgb-image], 2)

Python Syntax

image.bandextract(band=1)
Parameters
  • image

    The multi-band image from which to extract a band

  • band

    The number of the band to extract where the first band is 1

Examples:
# Extract the green band from an RGB image
image = mrgeo.load_image("my-rgb-image")
image.bandextract(2)

con Function

Sets the output value of each pixel based on a condition and assignment expressions passed to the function. This is roughly equivalent to an if/then/else assignment. It can also be called as an if/then/else if/then/else expression with any number of "else if" expressions. The output type will be the most specific data type of all of the input images used in the expressions passed to this operation.

Command Line Syntax

# Use con like an if/then/else
#
# if (condition)
# then result = trueAssignment
# else result = falseAssignment
con(condition, trueAssignment, falseAssignment)

# Use con like an if/then/elseif/then/else
# if (condition1)
# then result = trueAssignment1
# else if (condition2)
# then result = trueAssignment2
# else result = falseAssignment
con(condition1, trueAssignment1, condition2, trueAssignment2, falseAssignment)
Parameters
  • condition

    Boolean image expression to evaluate

  • trueAssignment

    The value to assign to the output pixel if condition evaluates to true for the input pixel. A raster expression or a constant can be used here

  • falseAssignment

    The value to assign to the output pixel if condition evaluates to false for the input pixel. A raster expression or a constant can be used here.

Examples:
# if the source pixel is less than 100, set the output pixel to 0, otherwise set it to 1.
# Note that if any pixel value in myImage is nodata, the output pixel will be nodata
con([myimage] < 100.0, 0, 1)

# if the source pixel is less than 100, set the output pixel to 0, else if it's less than
# 200, set it to 1, otherwise set it to 2.
# Note that if any pixel value in myImage is nodata, the output pixel will be nodata
con([myimage] < 100.0, 0, [myimage] < 200.0, 1, 2)

Python Syntax

# Use con like an if/then/else with constants for both "if" and "else"
#
# if (condition)
# then result = positiveConst
# else result = negativeConst
condition.con(positiveConst, negativeConst)

# Use con like an if/then/elseif/then/else with a combination of rasters and constants
# if (condition1)
# then result = image
# else if (condition2)
# then result = positiveRaster
# else result = negativeRaster
condition1.con(image, condition2, positiveRaster, negativeRaster)
Parameters
  • condition

    Boolean image expression to evaluate

  • positiveConst

    The constant value to assign to the output pixel if condition evaluates to true for the input pixel. This cannot be used in conjunction with positiveRaster.

  • positiveRaster

    The raster whose pixel value is assigned to the output pixel if condition evaluates to true for the input pixel. This cannot be used in conjunction with positiveConst.

  • negativeConst

    The constant value to assign to the output pixel if condition evaluates to false for the input pixel. This cannot be used in conjunction with negativeRaster.

  • negativeRaster

    The raster whose pixel value is assigned to the output pixel if condition evaluates to false for the input pixel. This cannot be used in conjunction with negativeConst.

The caller is required to specify both a positive value (positiveConst or positiveRaster) and a negative value (negativeConst or negativeRaster).

Examples:
# if the source pixel is less than 100, set the output pixel to 0, otherwise set it to 1.
# Note that if any pixel value in myImage is nodata, the output pixel will be nodata
myimage = mrgeo.load_image("myimage")
condition = (myimage < 100.0)
condition.con(0, 1)

# if the source pixel is less than 100, set the output pixel to 0, else if it's less than
# 200, set it to 1, otherwise set it to 2.
# Note that if any pixel value in myImage is nodata, the output pixel will be nodata
myimage = mrgeo.load_image("myimage")
condition1 = (myimage < 100.0)
condition2 = (myimage < 200.0)
condition1.con(0, condition2, 1, 2)
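
The if/then/else-if chaining that con performs can be sketched for a single pixel in plain Python. This is only a model of the evaluation order, not MrGeo's implementation: the con and classify functions below are hypothetical stand-ins that operate on one value and ignore nodata handling.

```python
def con(*args):
    """args = condition1, value1, [condition2, value2, ...], else_value."""
    if len(args) < 3 or len(args) % 2 == 0:
        raise ValueError("expected condition/value pairs followed by a final else value")
    # Walk the condition/value pairs in order and return the first match.
    for i in range(0, len(args) - 1, 2):
        condition, value = args[i], args[i + 1]
        if condition:
            return value
    return args[-1]  # no condition matched: use the final "else" value

def classify(pixel):
    # Mirrors con([myimage] < 100.0, 0, [myimage] < 200.0, 1, 2) for one pixel
    return con(pixel < 100.0, 0, pixel < 200.0, 1, 2)

print(classify(50.0), classify(150.0), classify(250.0))  # 0 1 2
```

Because the pairs are checked in order, a pixel below 100 never reaches the second condition even though it also satisfies it.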

convert Function

Converts the data type of the input image. When up-converting from a smaller data type to a larger data type, the values are retained as-is (only the "truncate" conversion method is supported). Think of this as a type-cast operation where the actual value does not change. When down-converting to a smaller data type, there are a few conversion methods available:

  • "truncate" - executes "max(min(sourceValue, maxDestValue), minDestValue)"
  • "mod" - executes "sourceValue modulo maxDestValue"
  • "fit" - determines on a scale of 0.0 - 1.0 where the source value resides in the range of min source value through max source value. It then projects that value into the destination min to max range.

If the source value is nodata, then it translates that to the nodata value for the destination data type (shown in the table below).

The following table shows the minimum and maximum allowable values and the nodata value for each data type:

Data Type   min                        max                       nodata
"byte"      0                          254                       255
"short"     -32767                     32767                     -32768
"int"       -2147483647                2147483647                -2147483648
"float32"   -3.4028235E38              3.4028235E38              NaN
"float64"   -1.7976931348623157E308    1.7976931348623157E308    NaN

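To make the three down-conversion methods concrete, here is a minimal Python sketch that applies them to a single value. The convert_value helper and its src_min, src_max and src_nodata parameters are assumptions for illustration. The sketch follows the formulas quoted above literally, does not cast results to integers, and uses Python's % operator, whose handling of negative values may differ from MrGeo's.

```python
import math

# (min, max, nodata) per destination type, copied from the table above
TYPE_INFO = {
    "byte": (0, 254, 255),
    "short": (-32767, 32767, -32768),
    "int": (-2147483647, 2147483647, -2147483648),
}

def convert_value(value, to_type, method="truncate",
                  src_min=None, src_max=None, src_nodata=None):
    """Convert one pixel value to an integer destination type."""
    dest_min, dest_max, dest_nodata = TYPE_INFO[to_type]
    is_nan = isinstance(value, float) and math.isnan(value)
    if is_nan or value == src_nodata:
        return dest_nodata                       # nodata maps to the destination nodata
    if method == "truncate":
        return max(min(value, dest_max), dest_min)
    if method == "mod":
        return value % dest_max
    if method == "fit":
        # Position of the value within the source range, on a 0.0 - 1.0 scale
        fraction = (value - src_min) / (src_max - src_min)
        return dest_min + fraction * (dest_max - dest_min)
    raise ValueError("unknown conversion method: " + method)

print(convert_value(300, "byte"))                               # 254
print(convert_value(300, "byte", "mod"))                        # 46
print(convert_value(500, "byte", "fit", src_min=0, src_max=1000))  # 127.0
print(convert_value(float("nan"), "short"))                     # -32768
```
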
Command Line Syntax

convert(input, toType, conversionMethod)
Parameters
  • input - The source image to be converted to a different data type
  • toType - one of:
    • "byte"
    • "short"
    • "int"
    • "float32"
    • "float64"
  • conversionMethod - one of:
    • "truncate" (default value when this parameter is omitted)
    • "mod"
    • "fit"
Examples:
# convert a float32 image to an int image using modulo on the source values
s = convert([myFloatImage], "int", "mod");

# convert a short image to a float image - defaults to "truncate" which in this case retains
# all of the source values
s = convert([myShortImage], "float32")

Python Syntax

image.convert(toType, conversionMethod="truncate")
Parameters
  • image - The source image to be converted to a different data type
  • toType - one of:
    • "byte"
    • "short"
    • "int"
    • "float32"
    • "float64"
  • conversionMethod - one of:
    • "truncate" (default value when this parameter is omitted)
    • "mod"
    • "fit"
Examples:
# convert a float32 image to an int image using modulo on the source values
s = myFloatImage.convert("int", "mod")

# convert a short image to a float image - defaults to "truncate" which in this case retains
# all of the source values
s = myShortImage.convert("float32")

CostDistance Function

CostDistance calculates the cost starting from one or more source points to all other accessible pixels given the friction surface and a maxCost parameter. The following rules apply:

  • Any null or negative value in the friction surface is considered impassable
  • The friction surface values are assumed to be cost per meter, not cost per pixel (e.g. seconds per meter).

Each of the source points is translated to its corresponding pixel in the friction surface, and the algorithm begins with those source pixels. For all the pixels in the processing queue, it computes the cost to traverse to all eight neighbor pixels. If the new cost computed for any neighbor pixel is smaller than the currently stored cost for that neighbor pixel, then it changes the cost to that pixel and adds that neighbor pixel to the processing queue. This continues until there are no more pixels in the processing queue. Note also that if maxCost >= 0, it will stop processing neighbors once the cost to the neighbor exceeds the maxCost value.

When the provided friction surface is single band, the same friction value is used when computing the cost to all eight neighbors. When using an 8-band friction surface, it uses the value from the appropriate band based on the direction of travel to each neighbor pixel. The first band of the friction surface is the "up" direction, and the bands are numbered clockwise from there. So band 1 is up, band 2 is up and to the right, band 3 is to the right, and so on.

The slope and directionalslope map algebra functions may be useful when computing friction surfaces. For directional friction surfaces, running raster math against it will execute in each of the bands. This makes the map algebra for computing a directional friction surface easy when the same computation is required in each direction. If different calculations are required for each band, use bandextract to split apart a directional slope, then run the required map algebra on each individual band, and lastly recombine the individual bands into a directional friction surface using bandcombine.
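
The neighbor-relaxation loop described above can be sketched in plain Python for a single-band friction surface. This is not MrGeo's implementation: it uses a priority queue (Dijkstra's algorithm) in place of the processing queue, takes source points as (row, column) pixel indices rather than geographic points, and assumes the cost of a move is the travel distance times the average friction of the two pixels. The cost_distance function and its pixel_size_m parameter are hypothetical.

```python
import heapq
import math

def cost_distance(friction, sources, pixel_size_m, max_cost=-1.0):
    """Accumulated cost from the source pixels to every reachable pixel."""
    rows, cols = len(friction), len(friction[0])
    cost = [[math.inf] * cols for _ in range(rows)]
    queue = []
    for r, c in sources:
        if friction[r][c] is None or friction[r][c] < 0:
            continue  # a source on an impassable pixel cannot spread
        cost[r][c] = 0.0
        heapq.heappush(queue, (0.0, r, c))
    while queue:
        current, r, c = heapq.heappop(queue)
        if current > cost[r][c]:
            continue  # stale entry: a cheaper cost was already stored
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                f_to = friction[nr][nc]
                if f_to is None or f_to < 0:
                    continue  # null or negative friction is impassable
                # Diagonal moves cover sqrt(2) times the pixel size
                distance = pixel_size_m * math.hypot(dr, dc)
                new_cost = current + distance * (friction[r][c] + f_to) / 2.0
                if max_cost >= 0 and new_cost > max_cost:
                    continue  # beyond the cost limit
                if new_cost < cost[nr][nc]:
                    cost[nr][nc] = new_cost
                    heapq.heappush(queue, (new_cost, nr, nc))
    return cost

friction = [[1.0, 1.0, 1.0],
            [1.0, None, 1.0],
            [1.0, 1.0, 1.0]]
result = cost_distance(friction, [(0, 0)], pixel_size_m=10.0)
limited = cost_distance(friction, [(0, 0)], pixel_size_m=10.0, max_cost=15.0)
print(result[0][1], result[0][2])  # 10.0 20.0
```

In this example the center pixel stays unreachable, so the far corner is reached by going around it with one diagonal step. An 8-band surface would replace the single friction lookup with a per-direction lookup.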

Command Line Syntax

CostDistance(sourcePoints, [zoomLevel], friction, [maxCost])
Parameters
  • sourcePoints

    Specifies the points from which the cost distance is calculated. This can be any vector data source of points (e.g. InlineCsv, shapefile, PostGIS, GeoWave).

  • zoomLevel

    This optional parameter specifies the zoom level of the output raster. If not specified, the maximum zoom level of the input friction image will be used.

  • friction

    The friction image with pixel values in cost per meter (e.g. seconds per meter)

  • maxCost

    Calculate cost out to a maximum of this value. No cost limit if this parameter is not provided or assigned to a value < 0.

Examples:
# Define the starting point for cost distance using InlineCsv
srcPoint = InlineCsv("GEOMETRY", "'POINT(66.67 34.04)'");
friction = [/mrgeo/images/myfriction];
# Uses the maxzoom level of the friction surface and max cost of 30000
result = CostDistance(srcPoint, friction, 30000);

Python Syntax

friction.costdistance(maxCost, zoom, args)
Parameters
  • friction

    The friction image with pixel values in cost per meter.

  • zoom

    This parameter specifies the zoom level of the output raster. Use -1 to indicate the maximum zoom level of the input friction image will be used.

  • maxCost

    Calculate cost out to a maximum of this value. No cost limit if this parameter is not provided or assigned to a value < 0.

  • args

    A variable argument list of longitude/latitude pairs of the point(s) from which the cost distance is calculated. See examples below.

Examples:
friction = mrgeo.load_image("/mrgeo/images/myfriction")
# Uses the max zoom level of the friction surface and max cost of 30000. It specifies
# a single source point at longitude 66.67 and latitude 34.04
result = friction.costdistance(30000, -1, 66.67, 34.04)

Crop Functions

Crops the size of the input raster either to tile bounds or to exact bounds. An exact crop will fill the tiles at the edges of the cropped area with nodata values for any pixels within those tiles that are outside the specified crop boundaries. A regular crop will not assign nodata values to the pixels outside of the crop boundaries. If the crop bounds are larger than the actual image bounds, the resulting image will be expanded with tiles containing nodata values. Note that the "fill" raster function is more suitable to use in cases where you're not actually cropping the input image on any side, but instead expanding the image.

The caller can either pass the bounds values explicitly using minX, minY, maxX and maxY or pass another raster whose bounds are used for the crop area.
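
The difference between a regular crop and an exact crop can be sketched on a single row of pixels. This is a simplified model, not MrGeo code: the crop_row helper is hypothetical, bounds are given as pixel indices instead of longitude and latitude, and the case where the crop bounds extend beyond the image is not covered.

```python
NODATA = None

def crop_row(row, tile_size, first, last, exact=False):
    """Crop one row of pixels to the inclusive pixel range [first, last]."""
    # A regular crop keeps every whole tile that intersects the crop range.
    start = (first // tile_size) * tile_size
    end = min((last // tile_size + 1) * tile_size, len(row))
    kept = row[start:end]
    if exact:
        # An exact crop also blanks pixels in the edge tiles that fall outside the range.
        kept = [value if first <= start + i <= last else NODATA
                for i, value in enumerate(kept)]
    return kept

row = list(range(12))  # three tiles of four pixels each
print(crop_row(row, 4, 5, 9))              # [4, 5, 6, 7, 8, 9, 10, 11]
print(crop_row(row, 4, 5, 9, exact=True))  # [None, 5, 6, 7, 8, 9, None, None]
```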

Command Line Syntax

crop(image, minX, minY, maxX, maxY)
crop(image, cropToImage)
cropexact(image, minX, minY, maxX, maxY)
cropexact(image, cropToImage)
Parameters
  • image

    MrGeo image to be cropped

  • minX

    The minimum longitude in degrees (left edge of the crop area)

  • minY

    The minimum latitude in degrees (bottom edge of the crop area)

  • maxX

    The maximum longitude in degrees (right edge of the crop area)

  • maxY

    The maximum latitude in degrees (top edge of the crop area)

  • cropToImage

    A MrGeo image whose bounds should be used as the crop bounds. This argument can be used in place of minX, minY, maxX and maxY.

Examples:
crop([myimage], -93.3, 25.0, -91.0, 26.5)

# Crop myimage to the bounds of myOtherImage
crop([myimage], [myOtherImage])

Python Syntax

image.crop(rasterForBounds=None, w=None, s=None, e=None, n=None)
image.cropexact(rasterForBounds=None, w=None, s=None, e=None, n=None)
Parameters

The caller must pass either rasterForBounds (leaving w, s, e, and n set to None) or set w, s, e, and n and set rasterForBounds to None.

  • image

    MrGeo image to be cropped.

  • w

    The minimum longitude in degrees (left edge of the crop area). When setting this, also set s, e, and n and set rasterForBounds = None.

  • s

    The minimum latitude in degrees (bottom edge of the crop area)

  • e

    The maximum longitude in degrees (right edge of the crop area)

  • n

    The maximum latitude in degrees (top edge of the crop area)

  • rasterForBounds

    A MrGeo image whose bounds should be used as the crop bounds. This argument can be used in place of w, s, e and n.

Examples:
myImage = mrgeo.load_image("my-image")
myImage.crop(w=-93.3, s=25.0, e=-91.0, n=26.5)

# Crop myimage to the bounds of myOtherImage
myOtherImage = mrgeo.load_image("my-other-image")
myImage.cropexact(myOtherImage)

Export Function

Exports a MrGeo image to a local image file or files depending on command-line options selected. There are options for exporting the image as a single output file, one output file per MrGeo tile, or one output file for a group of MrGeo tiles. See the set of arguments below for the many available options.

If you wish to provide a value for one of the optional arguments, you must also provide a value for all of the preceding optional arguments.

Command Line Syntax

export(image, output, [singleFile], [zoom], [numTiles], [mosaicSize], [format], [randomTile], [tmsNaming], [colorScale], [tileIds], [bounds], [allLevels])
Parameters
  • image

    MrGeo image to be exported

  • output

    The output template used for naming the output file(s). The template can include the placeholders "$X" and "$Y" which are replaced by the x/y tile coordinates, "$Z" which is replaced by the zoom level, "$LAT" and "$LON" which are replaced by the integer lat/lon values of the min latitude and min longitude of the image bounds.

  • singleFile (optional)

    true if you want to export to a single file, false otherwise. When exporting to a single file, care must be taken to ensure the image you are requesting is not too large to fit into memory. This value is false by default.

  • zoom (optional)

    The zoom level of the MrGeo image to be exported. This will only work if you have built a pyramid for the MrGeo image. The default is -1, meaning the max zoom for the image.

  • numTiles (optional)

    Set a limit on the number of tiles to export. The default is -1, meaning no limit.

  • mosaicSize (optional)

    When specified, it exports multiple files, each of which contains N x N tiles where N is the value specified for this argument. This is useful for a large image that cannot be exported at the zoom level you desire without running out of memory and when you wish to have fewer output files that are larger. The default value is -1, meaning no mosaicing is done.

  • format (optional)

    Must be "tif", "png" or "jpg". The default value is "tif".

  • randomTile (optional)

    When "true", it exports random tile(s) from the input MrGeo image. The numTiles argument must be >= 1 in order to export random tiles. The default value is "false".

  • tmsNaming (optional)

    When "true", the MrGeo tiles are written out in a TMS sub-directory and tile-naming format "output/Z/X/Y.tif" where Z is the zoom level, X is the tile's x coordinate and Y is the tile's y coordinate. The default value is "false".

  • colorScale (optional)

    The name of the MrGeo color scale to use for the output image.

  • tileIds (optional)

    A comma-separated list of TMS tile id's to be exported. This argument is not yet implemented, so it has no effect. The default value is "".

  • bounds (optional)

    The bounds of the MrGeo image to export in the format "minLon, minLat, maxLon, maxLat". The default value is "", meaning the full bounds of the image.

  • allLevels (optional)

    When "true", it exports an image for each zoom of the MrGeo image. Note that the output name should include a "$Z" component to distinguish the various zoom outputs.
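
The placeholder substitution described for the output template can be sketched as a simple string replacement. The expand_output_template helper is hypothetical, and deriving the integer lat/lon with math.floor is an assumption, since this page does not say how MrGeo converts the bounds to integers.

```python
import math

def expand_output_template(template, tile_x, tile_y, zoom, min_lat, min_lon):
    """Fill in the $X, $Y, $Z, $LAT and $LON placeholders of an output name."""
    replacements = {
        "$LAT": str(math.floor(min_lat)),  # assumed: floor of the min latitude
        "$LON": str(math.floor(min_lon)),  # assumed: floor of the min longitude
        "$X": str(tile_x),
        "$Y": str(tile_y),
        "$Z": str(zoom),
    }
    for placeholder, value in replacements.items():
        template = template.replace(placeholder, value)
    return template

print(expand_output_template("/tmp/export/$Z/tile-$X-$Y", 10, 20, 8, 34.04, 66.67))
# /tmp/export/8/tile-10-20
print(expand_output_template("img_$LAT_$LON", 10, 20, 8, 34.04, 66.67))
# img_34_66
```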

Examples:
# Export using default options; "/tmp/myimage" is a placeholder output name
export([myimage], "/tmp/myimage")

Python Syntax

image.export(name, singleFile=False, zoom=-1, numTiles=-1, mosaic=-1, format="tif", randomTiles=False, tms=False, colorscale="", tileids="", bounds="", allLevels=False, overridenodata=float('-inf'))
Parameters
  • image

    MrGeo image to be exported

  • name

    The template used for naming the output file(s). The template can include the placeholders "$X" and "$Y" which are replaced by the x/y tile coordinates, "$Z" which is replaced by the zoom level, "$LAT" and "$LON" which are replaced by the integer lat/lon values of the min latitude and min longitude of the image bounds.

  • singleFile

    True if you want to export to a single file, False otherwise. When exporting to a single file, care must be taken to ensure the image you are requesting is not too large to fit into memory.

  • zoom

    The zoom level of the MrGeo image to be exported. This will only work if you have built a pyramid for the MrGeo image. The default value of -1 means the max zoom for the image.

  • numTiles

    Set a limit on the number of tiles to export. The default is -1, meaning no limit.

  • mosaic

    When specified, it exports multiple files, each of which contains N x N tiles where N is the value specified for this argument. This is useful for a large image that cannot be exported at the zoom level you desire without running out of memory and when you wish to have fewer output files that are larger. The default value is -1, meaning no mosaicing is done.

  • format

    Must be "tif", "png" or "jpg". The default value is "tif".

  • randomTiles

    When True, it exports random tile(s) from the input MrGeo image. The numTiles argument must be >= 1 in order to export random tiles. The default value is False.

  • tms

    When True, the MrGeo tiles are written out in a TMS sub-directory and tile-naming format "output/Z/X/Y.tif" where Z is the zoom level, X is the tile's x coordinate and Y is the tile's y coordinate. The default value is False.

  • colorScale

    The name of the MrGeo color scale to use for the output image. When set to "", a default is selected depending on the number of bands in the image.

  • tileIds

    A comma-separated list of TMS tile id's to be exported. This argument is not yet implemented, so it has no effect. The default value is "".

  • bounds

    The bounds of the MrGeo image to export in the format "minLon, minLat, maxLon, maxLat". The default value is "", meaning the full bounds of the image.

  • allLevels

    When True, it exports an image for each zoom of the MrGeo image. Note that the output name should include a "$Z" component to distinguish the various zoom outputs.

Examples:
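# Export using default options; "/tmp/my-image" is a placeholder output name
mrgeo.load_image("my-image").export("/tmp/my-image")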

Fill Functions

There are two fill functions that will replace nodata pixel values with the specified fill value and generate new tiles containing the fill value for any tiles that were missing from the input image.

When "fill" is called, the output image retains the same bounds as the input image. When "fillbounds" is called, it expands or shrinks the output image accordingly.

When calling "fillbounds", the caller can either pass the bounds values explicitly using minX, minY, maxX and maxY or pass another raster whose bounds are used as the output bounds.

Command Line Syntax

fill(image, fillValue)
fillbounds(image, fillValue, minX, minY, maxX, maxY)
fillbounds(image, fillValue, cropToImage)
Parameters
  • image

    MrGeo image to be filled

  • fillValue

    The value with which to replace nodata pixel values and set all pixels in newly created tiles

  • minX

    The minimum longitude in degrees (left edge of the bounds)

  • minY

    The minimum latitude in degrees (bottom edge of the bounds)

  • maxX

    The maximum longitude in degrees (right edge of the bounds)

  • maxY

    The maximum latitude in degrees (top edge of the bounds)

  • cropToImage

    A MrGeo image whose bounds should be used as the crop bounds. This argument can be used in place of minX, minY, maxX and maxY.

Examples:
# Keep the original input image's bounds and just replace nodata values with 0.0
fill([myimage], 0.0)

# Change the bounds for the new image, and replace nodata values with 0.0. Use FAST fill type
fillbounds([myimage], 0.0, -91.0, 23.0, -88.0, 26.0)

# Change the bounds for the new image to match the bounds of myOtherImage, and replace nodata
# values with 0.0. Use FAST fill type
fillbounds([myimage], 0.0, [myOtherImage])

Python Syntax

image.fill(constFill=None, fillRaster=None)
image.fillbounds(constFill=None, boundsRaster=None, fillRaster=None, w=None, s=None, e=None, n=None)
Parameters
  • image

    MrGeo image to be filled

  • constFill

    The value with which to replace nodata pixel values

  • boundsRaster

    A MrGeo image whose bounds should be used as the crop bounds. This argument can be used in place of w, s, e and n.

  • w

    The minimum longitude in degrees (west edge of the bounds)

  • s

    The minimum latitude in degrees (south edge of the bounds)

  • e

    The maximum longitude in degrees (east edge of the bounds)

  • n

    The maximum latitude in degrees (north edge of the bounds)

Examples:
myImage = mrgeo.load_image("my-image")

# Keep the original input image's bounds and just replace nodata values with 0.0
myImage.fill(constFill=0.0)

# Change the bounds for the new image, and replace nodata values with 0.0. Use FAST fill type
myImage.fillbounds(constFill=0.0, w=-91.0, s=23.0, e=-88.0, n=26.0)

# Change the bounds for the new image to match the bounds of myOtherImage, and replace nodata
# values with 0.0. Use FAST fill type
myOtherImage = mrgeo.load_image("my-other-image")
myImage.fillbounds(constFill=0.0, boundsRaster=myOtherImage)

focalStat Function

The focalStat function performs focal statistical computations on each pixel of an input raster. The caller specifies the statistic to compute, the input raster, the size of the neighborhood, and how to treat nodata values. The end result is that for each pixel in the input raster, it computes the statistic from the pixel values of the specified square neighborhood of pixels around the source pixel and stores the resulting value for that pixel in the output.

If the source image pixel being processed is nodata, the output value for that pixel will be set to nodata.

When the neighborhood is an odd number of pixels, the pixel being processed will be directly centered inside the neighborhood.

When the neighborhood is an even number of pixels, the pixel being processed will be slightly above and left of the center of the neighborhood.
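
The neighborhood placement and nodata handling described for focalStat can be sketched in plain Python. This is an illustration under stated assumptions, not MrGeo's implementation: the helper names are hypothetical, the neighborhood size is given in pixels only, neighbors that fall outside the raster are simply skipped, and "stddev" is left out because this page does not say which variant is computed.

```python
import statistics

NODATA = None

def neighborhood_offsets(size):
    """Offsets along one axis; even sizes put the pixel above/left of center."""
    before = (size - 1) // 2  # pixels above (or left of) the processed pixel
    return range(-before, size - before)

STATS = {
    "min": min, "max": max, "sum": sum, "count": len,
    "mean": statistics.mean, "median": statistics.median,
    "range": lambda values: max(values) - min(values),
}

def focal_stat(stat, image, size, ignore_nodata):
    rows, cols = len(image), len(image[0])
    output = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            if image[r][c] is NODATA:
                out_row.append(NODATA)  # a nodata source pixel stays nodata
                continue
            values, saw_nodata = [], False
            for dr in neighborhood_offsets(size):
                for dc in neighborhood_offsets(size):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        if image[nr][nc] is NODATA:
                            saw_nodata = True
                        else:
                            values.append(image[nr][nc])
            if saw_nodata and not ignore_nodata:
                out_row.append(NODATA)
            else:
                out_row.append(STATS[stat](values))
        output.append(out_row)
    return output

image = [[1, 2, 3],
         [4, NODATA, 6],
         [7, 8, 9]]
print(list(neighborhood_offsets(3)))  # [-1, 0, 1]
print(list(neighborhood_offsets(4)))  # [-1, 0, 1, 2]
print(focal_stat("sum", image, 3, True)[0][0])   # 7
print(focal_stat("sum", image, 3, False)[0][0])  # None
```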

Command Line Syntax

focalStat(stat, image, neighborhoodSize, ignoreNoData)
Parameters
  • stat

    the statistic to compute against the neighborhood pixel values (including the pixel being processed). Must be one of:

    • "min"
    • "max"
    • "range"
    • "count"
    • "sum"
    • "mean"
    • "median"
    • "stddev"
  • image

    MrGeo input image to process

  • neighborhoodSize

    The size of the square neighborhood of pixels to use which can be specified in pixels (e.g. "10p") or meters (e.g. "300m")

  • ignoreNoData

    When "true", the statistical computation will only include neighborhood pixels that are not nodata. The nodata pixel values are ignored. When "false", if any of the pixels in the neighborhood are nodata, then the output pixel value is set to nodata.

Examples:
# Compute the mean value of the neighborhood around each pixel where the neighborhood is 15 pixels wide and tall
focalStat("mean", [myimage], "15p", "true")

# Compute the median value of the neighborhood around each pixel where the neighborhood is 300 meters wide and tall
focalStat("median", [myimage], "300m", "true")

Python Syntax

image.focalStat(stat, neighborhoodSize, ignoreNoData)
Parameters
  • stat

    the statistic to compute against the neighborhood pixel values (including the pixel being processed). Must be one of:

    • "min"
    • "max"
    • "range"
    • "count"
    • "sum"
    • "mean"
    • "median"
    • "stddev"
  • image

    MrGeo input image to process

  • neighborhoodSize

    The size of the square neighborhood of pixels to use which can be specified in pixels (e.g. "10p") or meters (e.g. "300m")

  • ignoreNoData

    When True, the statistical computation will only include neighborhood pixels that are not nodata. The nodata pixel values are ignored. When False, if any of the pixels in the neighborhood are nodata, then the output pixel value is set to nodata.

Examples:
myImage = mrgeo.load_image("my-image")
# Compute the mean value of the neighborhood around each pixel where the neighborhood is 15 pixels wide and tall
myImage.focalStat("mean", "15p", True)

# Compute the median value of the neighborhood around each pixel where the neighborhood is 300 meters wide and tall
myImage.focalStat("median", "300m", True)

Kernel Function

Performs a spatial kernel by doing a weighted average over each pixel. The weighted average is determined by the kernel method used. The following rules also apply:

  • Any input null value is ignored and not included in weighting.
  • Only the first channel of the raster is processed.

Command Line Syntax

kernel(type, image, parameters)
Parameters
  • type

    The kernel type to apply.

    • gaussian - gaussian kernel
    • laplacian - laplacian kernel
  • image

    MrGeo image to be blurred

  • parameters

    Parameters for the specific kernel type used

    • Gaussian & Laplacian
      • sigma

    The sigma to use for the kernel in meters

Examples:
kernel("gaussian", [myimage], 125.0)
kernel("laplacian", [myimage], 250.0)

Python Syntax

image.kernel(type, sigma)
Parameters
  • image

    MrGeo image on which to run the kernel

  • type

    The kernel type to apply.

    • gaussian - gaussian kernel
    • laplacian - laplacian kernel
  • sigma

    The sigma to use for the kernel in meters

Examples:
myImage = mrgeo.load_image("my-image")
myImage.kernel("gaussian", 125.0)
myImage.kernel("laplacian", 250.0)

Mosaic Function

Mosaics multiple images together into a single output image. The order of the inputs is important because each pixel of the output gets its value from the first input that has a non-nodata value for that pixel. If the bounds of the inputs differ, the resulting image covers the smallest rectangle that encloses the bounds of all the inputs.
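The "first non-nodata input wins" rule can be illustrated with a minimal Python sketch. It assumes all inputs are already aligned grids of the same size and that nodata is the hypothetical value -9999; the function names are illustrative and are not part of MrGeo.

```python
NODATA = -9999  # hypothetical nodata value for this sketch

def mosaic_pixel(values, nodata=NODATA):
    """Return the first value that is not nodata, in input order."""
    for value in values:
        if value != nodata:
            return value
    return nodata

def mosaic(images, nodata=NODATA):
    """Mosaic equally sized grids; earlier images take priority."""
    rows, cols = len(images[0]), len(images[0][0])
    return [[mosaic_pixel([img[r][c] for img in images], nodata)
             for c in range(cols)]
            for r in range(rows)]

first = [[1, NODATA]]
second = [[5, 6]]
result_ab = mosaic([first, second])
result_ba = mosaic([second, first])
```

Swapping the input order changes the result wherever both inputs have data, which is why the order of the arguments matters.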

Command Line Syntax

mosaic(input1, input2, [input3, ...])
Parameters
  • inputs - each of the rasters to mosaic together
Examples:
mosaic([myimage1], [myimage2])
mosaic([myimage1], [myimage2], [myimage3], [myimage4])

Python Syntax

input1.mosaic(args)
Parameters
  • input1 - the first image to be included in the mosaic
  • args - a variable argument list for image2 through imageN to be included in the mosaic
Examples:
myImage1 = mrgeo.load_image("image1")
myImage2 = mrgeo.load_image("image2")
myImage3 = mrgeo.load_image("image3")
myImage4 = mrgeo.load_image("image4")

myImage1.mosaic(myImage2)
myImage1.mosaic(myImage2, myImage3, myImage4)

quantiles Function

Computes quantiles for an existing raster and produces a copy of that raster with the quantiles stored in its metadata. Because quantiles are compute-intensive, MrGeo does not compute them automatically for all images as it does other statistics. This map algebra function lets the user control when quantiles are computed and for which images. To speed up the computation, the caller can optionally specify a value for the "fraction" parameter, in which case only a random sample of that fraction of the pixels in each tile is used. For example, specifying 0.5 uses half of the available pixel values to compute the quantiles. A fraction less than 1.0 trades accuracy for speed: the resulting quantile values are estimates.
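As a rough illustration of the trade-off, the following Python sketch computes quantile cut points from a flat list of pixel values, optionally from a random sample drawn without replacement. The name `estimate_quantiles` and the `seed` argument are hypothetical, the interpolation is whatever the standard library's `statistics.quantiles` uses, and MrGeo samples per tile rather than over one list.

```python
import random
import statistics

def estimate_quantiles(pixels, num_quantiles, fraction=1.0, seed=None):
    """Return the num_quantiles - 1 cut points, optionally from a sample."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0.0, 1.0]")
    if fraction < 1.0:
        # Sample without replacement; keep at least two values.
        k = min(len(pixels), max(2, round(len(pixels) * fraction)))
        pixels = random.Random(seed).sample(pixels, k)
    return statistics.quantiles(pixels, n=num_quantiles)

pixels = list(range(1, 101))
quartiles = estimate_quantiles(pixels, 4)                 # 3 cut points
deciles = estimate_quantiles(pixels, 10, fraction=0.3, seed=1)  # 9 estimates
```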

Command Line Syntax

quantiles(inputImage, numQuantiles, [fraction])
Parameters
  • inputImage - The image for which quantiles should be computed
  • numQuantiles - The number of quantiles to compute. Specifying 4 indicates to compute quartiles which will compute the 3 values that define the quartiles. Specifying 10 indicates to compute deciles which will compute the 9 values that define the deciles.
  • fraction - A value between 0.0 and 1.0 that determines the fraction of the pixels from each tile that are randomly sampled without replacement for the quantile computation.
Examples:

# compute exact deciles for the input image
quantiles([myInput], 10);
# compute estimated deciles for the input image by sampling 30 percent of the pixel values of each input tile
# without replacement
quantiles([myInput], 10, 0.3);

Python Syntax

inputImage.quantiles(numQuantiles, fraction=1.0)
Parameters
  • inputImage - The image for which quantiles should be computed
  • numQuantiles - The number of quantiles to compute. Specifying 4 indicates to compute quartiles which will compute the 3 values that define the quartiles. Specifying 10 indicates to compute deciles which will compute the 9 values that define the deciles.
  • fraction - A value between 0.0 and 1.0 that determines the fraction of the pixels from each tile that are randomly sampled without replacement for the quantile computation.
Examples:

myInput = mrgeo.load_image("my-input")
# compute exact deciles for the input image
myInput.quantiles(10)
# compute estimated deciles for the input image by sampling 30 percent of the pixel values of each input tile
# without replacement
myInput.quantiles(10, 0.3)

RasterizeVector and RasterizePoints Functions

Converts the specified vector input into a raster output. The value assigned to output pixels is determined by the aggregation type specified as follows:

  • MASK - Paints a 0 to all pixels that intersect geometries. All other pixels receive the nodata value.
  • SUM - If the optional attribute is specified, sum the value of the specified attribute across all the intersecting geometries. If no attribute is provided, then count the number of overlapping geometries. Pixels that do not overlap any geometries are set to the nodata value.
  • LAST - Use the value from the specified attribute of the last intersecting geometry. All other pixels receive the nodata value. The ordering of the geometries is not defined and may change from one tile to the next.
  • MIN - Use the minimum value of the specified attribute from all intersecting geometries. All pixels that do not intersect a geometry are set to the nodata value.
  • MAX - Use the maximum value of the specified attribute from all intersecting geometries. All pixels that do not intersect a geometry are set to the nodata value.
  • AVERAGE - Use the average value of the specified attribute across all intersecting geometries. All pixels that do not intersect a geometry are set to the nodata value.
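The aggregation rules above can be summarized for a single output pixel with a short Python sketch. It assumes the geometries intersecting the pixel have already been found and are passed in as dictionaries of attributes; the function name and the use of NaN as nodata are illustrative and not taken from MrGeo.

```python
import math

NODATA = math.nan  # assumption: NaN stands in for the nodata value

def aggregate_pixel(aggregation, features, attribute=None):
    """Value of one pixel given the features that intersect it."""
    if not features:
        return NODATA                    # no intersecting geometry
    if aggregation == "MASK":
        return 0
    if aggregation == "SUM" and attribute is None:
        return len(features)             # count of overlapping geometries
    values = [feature[attribute] for feature in features]
    if aggregation == "SUM":
        return sum(values)
    if aggregation == "LAST":
        return values[-1]                # ordering is not defined in MrGeo
    if aggregation == "MIN":
        return min(values)
    if aggregation == "MAX":
        return max(values)
    if aggregation == "AVERAGE":
        return sum(values) / len(values)
    raise ValueError("unknown aggregation type: " + aggregation)

hits = [{"field1": 2.0}, {"field1": 6.0}]
count = aggregate_pixel("SUM", hits)
total = aggregate_pixel("SUM", hits, "field1")
```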

The bounds parameters determine the geographic size of the output image. The bounds parameters are optional, and if not provided, this function will do an initial scan of all features in the input to determine the minimum bounding rectangle. Given the time that requires, if you know the bounds, it is faster to provide them in the call.

RasterizePoints is a performance optimized version of RasterizeVector when the input vector data contains only point data.

Command Line Syntax

RasterizeVector(vectorInput, aggregationType, cellSize, [attribute], [minX, minY, maxX, maxY], [boundsRaster], [linePointWidth])
Parameters
  • vectorInput - This is an HDFS path to a vector source (.tsv, .csv, .shp file)
  • aggregationType - The type of aggregation to use for calculating individual pixel values when those pixels overlap features from the vectorInput. See the description of this function for details.
  • cellSize - Determines the pixel resolution of the output image. Specify a value in:
    • meters, for example "30m"
    • a specific zoom level, for example "15z"
    • a value in degrees, for example "0.0000214577" or "0.0000214577d"
  • attribute - The name of the attribute to use for aggregation types that use the value of a specific attribute (see the description of this function for more information). For aggregation types that don't require an attribute, do not include this parameter.
  • minX (optional) - The left edge of the output image
  • minY (optional) - The bottom edge of the output image
  • maxX (optional) - The right edge of the output image
  • maxY (optional) - The top edge of the output image
  • boundsRaster (optional) - An existing MrGeo image whose bounds should be used for the output image. If the minX, minY, maxX and maxY are specified, then boundsRaster cannot be used and vice versa.
  • linePointWidth (optional) - The width to use when drawing lines and points while rasterizing. Defaults to 1 pixel. Can be specified in pixels (e.g. "1p"), meters (e.g. "10m") or degrees (e.g. 0.0000013411)
Examples:
# The MASK aggregation type does not use an attribute from the features, so it must be left out.
# The bounds of the output image can be specified with min lon, min lat, max lon, max lat
# arguments
RasterizeVector([/mrgeo/vectors/myFeatures.shp], "MASK", "15z", -92.0, 38.0, -89.0, 41.0)

# The bounds of the output image can be obtained from a MrGeo image's metadata. The line point width
# can be specified in meters.
RasterizeVector([/mrgeo/vectors/myFeatures.shp], "MASK", "15z", [/mrgeo/images/image-for-bounds], "10m")

# The bounds are optional, but execution is slower because it has to scan the input features
# and compute the minimum bounding rectangle
RasterizeVector([/mrgeo/vectors/myFeatures.shp], "MASK", "15z")

# You can specify the linePointWidth without specifying bounds
RasterizeVector([/mrgeo/vectors/myFeatures.shp], "MASK", "15z", "2p")

# The SUM aggregation type requires an attribute name
RasterizeVector([/mrgeo/vectors/myFeatures.shp], "SUM", "15z", "field1", -92.0, 38.0, -89.0, 41.0)
OR
RasterizePoints([/mrgeo/vectors/myPoints.shp], "SUM", "15z", "field1", -92.0, 38.0, -89.0, 41.0)

Python Syntax

vectorInput.rasterizevector(aggregator, cellsize, column=None, lineWidth=1.0)
pointsInput.rasterizepoints(aggregator, cellsize, column=None, lineWidth=1.0)
Parameters
  • vectorInput - This is the vector data to be rasterized
  • aggregator - The type of aggregation to use for calculating individual pixel values when those pixels overlap features from the vectorInput. See the description section for more details on the aggregators available.
  • cellsize - Determines the pixel resolution of the output image. Specify a value in:
    • meters, for example "30m"
    • a specific zoom level, for example "15z"
    • a value in degrees, for example "0.0000214577" or "0.0000214577d"
  • column - The name of the column/attribute to use for aggregation types that use the value of a specific attribute (see the description of this function for more information). For aggregation types that don't require an attribute, set this parameter to None.
Examples:
myFeatures = mrgeo.load_vector("myFeatures.shp")
myPoints = mrgeo.load_vector("myPoints.shp")
# The MASK aggregation type does not use an attribute from the features, so it must be left out
myFeatures.rasterizevector("MASK", "15z")

# The SUM aggregation type requires an attribute name
myFeatures.rasterizevector("SUM", "15z", "field1")
myPoints.rasterizepoints("SUM", "15z", "field1")

# Note that the lineWidth argument is always specified in pixels from Python
myFeatures.rasterizevector("SUM", "15z", "field1", lineWidth=2.0)

save Function

Saves the result of a map algebra expression to a named output. If the output already exists, it will be overwritten. This is useful for saving intermediate results during the process of running a map algebra script.

Command Line Syntax

save(input, outputName)
Parameters
  • input - The data to be persisted (typically a previously assigned variable or an expression)
  • outputName - The name where the output is to be persisted. If that name is "myoutput" for example, then you could reference that in a map algebra script as "[myoutput]" if you need to use it later.
Examples:

In the following script, the output of the map algebra script itself is a raster of 0's and 1's based on whether the slope at each pixel is 0 or greater than 0. In order to save the slope computation itself, include the "save" function call.

s = slope([myElevation]);
save(s, "myElevationSlope");
result = con([myElevationSlope] > 0, 1, 0);

Python Syntax

input.save(outputName)
Parameters
  • input - The data to be persisted (typically a previously assigned variable or an expression)
  • outputName - The name where the output is to be persisted. If that name is "myoutput" for example, then you could reference that in a map algebra script by loading it with a call to mrgeo.load_image or mrgeo.load_vector
Examples:

In the following script, the output of the map algebra script itself is a raster of 0's and 1's based on whether the slope at each pixel is 0 or greater than 0. In order to save the slope computation itself, include the "save" function call.

myElevation = mrgeo.load_image("myElevation")
s = myElevation.slope()
s.save("myElevationSlope")
result = con(mrgeo.load_image("myElevationSlope") > 0, 1, 0);

slope Function

Computes the slope at each pixel using elevation input data. Takes a single band raster elevation input and calculates the slope as rise over run and returns that as a single band 32-bit float raster. Slope is calculated using Horn's formula as described here.

If units for the output pixel values are not specified, the default units are radians.
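For readers unfamiliar with Horn's formula, here is a minimal Python sketch for a single 3x3 window of elevations. It assumes square cells of `cell_size` meters, and it assumes that "gradient" means rise over run and "percent" means 100 times rise over run; those unit interpretations and the function names are not confirmed by this page.

```python
import math

def horn_gradient(window, cell_size):
    """Horn's weighted finite differences for a 3x3 window (rows top to bottom)."""
    (a, b, c), (d, _, f), (g, h, i) = window
    dzdx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cell_size)
    dzdy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8.0 * cell_size)
    return dzdx, dzdy

def slope(window, cell_size, units="rad"):
    dzdx, dzdy = horn_gradient(window, cell_size)
    rise_over_run = math.hypot(dzdx, dzdy)
    if units == "gradient":   # assumption: plain rise over run
        return rise_over_run
    if units == "percent":    # assumption: rise over run times 100
        return 100.0 * rise_over_run
    angle = math.atan(rise_over_run)
    return math.degrees(angle) if units == "deg" else angle

# Elevation rises 1 m per 1 m cell toward the east: a 45 degree slope.
plane = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
slope_deg = slope(plane, 1.0, "deg")
```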

Command Line Syntax

slope(elevationInput, [units])
Parameters
  • elevationInput - The elevation data from which to compute slope
  • units (optional) - The units of the pixel values in the output
    • "deg"
    • "rad"
    • "gradient"
    • "percent"
Examples:
slope([myElevation], "deg");

Python Syntax

elevationInput.slope(units="rad")
Parameters
  • elevationInput - The elevation data from which to compute slope
  • units (optional) - The units of the pixel values in the output
    • "deg"
    • "rad" (default value)
    • "gradient"
    • "percent"
Examples:
myElevation = mrgeo.load_image("my-elevation")
myElevation.slope("deg")

directionalslope Function

Computes the slope from each pixel to each of its 8 neighbors using elevation input data and stores the results in eight different bands as follows:

Band 8 Band 1 Band 2
Band 7 X Band 3
Band 6 Band 5 Band 4

The slope value in each direction is calculated as the arctan of rise over run.
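The band layout and the arctan computation can be sketched for one 3x3 window in Python. The sketch assumes that rise is the neighbor elevation minus the center elevation and that the run to a diagonal neighbor is the cell size times the square root of 2; neither detail is stated on this page, and the names are illustrative.

```python
import math

# (row offset, column offset) of the neighbor stored in bands 1 through 8,
# clockwise starting from the pixel directly above the center.
BAND_OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1),
                (1, 0), (1, -1), (0, -1), (-1, -1)]

def directional_slopes(window, cell_size):
    """Return eight slopes in radians, one per band, for a 3x3 window."""
    center = window[1][1]
    slopes = []
    for d_row, d_col in BAND_OFFSETS:
        run = cell_size * math.hypot(d_row, d_col)      # longer on diagonals
        rise = window[1 + d_row][1 + d_col] - center    # assumed sign convention
        slopes.append(math.atan(rise / run))
    return slopes

# Only the neighbor to the right (band 3) is higher than the center.
window = [[0, 0, 0], [0, 0, 10], [0, 0, 0]]
bands = directional_slopes(window, 10.0)
```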

Command Line Syntax

directionalslope(elevationInput, [units])
Parameters
  • elevationInput - The elevation data from which to compute slope
  • units (optional) - The units of the pixel values in the output. The default value is "rad".
    • "deg"
    • "rad"
    • "gradient"
    • "percent"
Examples:
mySlope = directionalslope([my-elevation], "deg");

Python Syntax

elevationInput.directionalslope(units="rad")
Parameters
  • elevationInput - The elevation data from which to compute slope
  • units (optional) - The units of the pixel values in the output
    • "deg"
    • "rad"
    • "gradient"
    • "percent"
Examples:
myElevation = mrgeo.load_image("myElevation")
mySlope = myElevation.directionalslope("deg")

aspect Function

Computes the aspect at each pixel using elevation input data. Takes a single band raster elevation input and calculates the aspect as compass direction where 0 degrees is true north, 90 degrees is east and so on. Aspect is calculated using Horn's formula as described here.

If units for the output pixel values are not specified, the default units are radians.

Aspect assigns the value of the flatValue argument to pixels that are flat. Specify NaN for flatValue if you want flat pixels to have a nodata value.
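A minimal Python sketch of aspect for one 3x3 window follows. It uses Horn's finite differences and assumes the aspect is the downslope direction, measured clockwise from north; that convention, the function name, and the omission of the "percent" unit are assumptions rather than documented MrGeo behavior. The cell size is left out because only the ratio of the two differences matters for square cells.

```python
import math

def aspect(window, units="rad", flat_value=-1.0):
    """Compass direction a 3x3 window faces (rows listed north to south)."""
    (a, b, c), (d, _, f), (g, h, i) = window
    dzdx = ((c + 2 * f + i) - (a + 2 * d + g)) / 8.0
    dzdy = ((g + 2 * h + i) - (a + 2 * b + c)) / 8.0
    if dzdx == 0 and dzdy == 0:
        return flat_value                      # flat pixel
    # Angle of the downslope direction, then converted to a compass bearing.
    angle = math.degrees(math.atan2(dzdy, -dzdx))
    compass = (90.0 - angle) % 360.0
    return compass if units == "deg" else math.radians(compass)

rises_to_east = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]       # faces west
higher_in_north = [[1, 1, 1], [0, 0, 0], [-1, -1, -1]]  # faces south
flat = [[3, 3, 3], [3, 3, 3], [3, 3, 3]]
west_facing = aspect(rises_to_east, "deg")
south_facing = aspect(higher_in_north, "deg")
```

Passing `flat_value=float("nan")` mirrors the NaN option described above for marking flat pixels as nodata.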

Command Line Syntax

aspect(elevationInput, [units], [flatValue])
Parameters
  • elevationInput - The elevation data from which to compute aspect
  • units (optional) - The units of the pixel values in the output
    • "deg"
    • "rad"
    • "percent"
  • flatValue (optional) - the value to assign to pixels that are flat, default value is -1.
Examples:
aspect([my-elevation], "deg");
aspect([my-elevation], "deg", 0);
aspect([my-elevation], "deg", "NaN");

Python Syntax

elevationInput.aspect(units="rad",flatValue=-1.0)
Parameters
  • elevationInput - The elevation data from which to compute aspect
  • units (optional) - The units of the pixel values in the output
    • "deg"
    • "rad" (default value if units = None)
    • "percent"
  • flatValue (optional) - the value to assign to pixels that are flat, default value is -1.
Examples:
myElevation = mrgeo.load_image("my-elevation")
myElevation.aspect(units="deg", flatValue=float("nan"))

statistics Function

For a set of input images, this function computes statistics for each pixel across that set of input images. For example, given three input images, you can compute the average value for each pixel across all three of those images. This is very useful for time series analysis of imagery. The output is an image that encompasses the overall bounds of all of the input images, and each output pixel contains the computed statistic for that pixel across all the input images. For any given pixel, a nodata value for that pixel in any of the inputs is ignored for the computation. For example when computing the "count" statistic across three inputs, if a pixel has nodata in one of the inputs, then the output value for that pixel will be 2.
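The per-pixel computation, including the way nodata is skipped, can be sketched in Python as follows. The sketch assumes aligned grids of equal size, NaN as nodata, and the population form of the standard deviation; the function names are illustrative only.

```python
import math
import statistics

METHODS = {
    "min": min, "max": max, "sum": sum,
    "mean": statistics.fmean, "median": statistics.median,
    "mode": statistics.mode,
    "stddev": statistics.pstdev,  # assumption: population standard deviation
}

def pixel_statistic(method, values):
    """Statistic for one pixel across all inputs, ignoring nodata (NaN)."""
    valid = [v for v in values if not math.isnan(v)]
    if method == "count":
        return len(valid)
    if not valid:
        return math.nan
    return METHODS[method](valid)

def statistics_across(method, images):
    rows, cols = len(images[0]), len(images[0][0])
    return [[pixel_statistic(method, [img[r][c] for img in images])
             for c in range(cols)]
            for r in range(rows)]

images = [[[1.0, math.nan]], [[2.0, 4.0]], [[3.0, 6.0]]]
counts = statistics_across("count", images)
means = statistics_across("mean", images)
```

The second pixel is nodata in the first image, so its count is 2 and its mean is taken over the two remaining values.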

Command Line Syntax

statistics(method, image1, image2 [, imageN ...])
stats(method, image1, image2 [, imageN ...])
Parameters
  • method - the name of the statistic to compute - one of:
    • "count"
    • "min"
    • "max"
    • "mean"
    • "median"
    • "mode"
    • "stddev"
    • "sum"
  • images - input images across which to compute pixel statistics
Examples:
stats("mean", [myImage1], [myImage2], [myImage3]);
statistics("median", [myImage1], [myImage2], [myImage3], [myImage4]);

Python Syntax

image1.statistics(method, image2 [, imageN ...])
image1.stats(method, image2 [, imageN ...])
Parameters
  • method - the name of the statistic to compute - one of:
    • "count"
    • "min"
    • "max"
    • "mean"
    • "median"
    • "mode"
    • "stddev"
    • "sum"
  • images - input images across which to compute pixel statistics
Examples:
myImage1 = mrgeo.load_image("my-image-1")
myImage2 = mrgeo.load_image("my-image-2")
myImage3 = mrgeo.load_image("my-image-3")
myImage4 = mrgeo.load_image("my-image-4")
myImage1.stats("mean", myImage2, myImage3)
myImage1.statistics("median", myImage2, myImage3, myImage4)

zoom Function

The output of this function is the input image pinned to the specified zoom level. What this means is that anytime the output is used as input to another map algebra function, only the specified zoom level of the image is used. An example use case is when running a cost distance, it would be faster to process a lower resolution friction surface if the higher resolution is not needed (maybe for preliminary results for example).

Command Line Syntax

zoom(image, zoom)
Parameters
  • image - input image for which to pin the zoom level
  • zoom - the zoom level of the image to use (1..20)
Examples:
srcPoint = InlineCsv("GEOMETRY", "'POINT(66.67 34.04)'");
friction = zoom([/mrgeo/images/myfriction], 9);
# Uses zoom level 9 of the friction surface (pinned above) and a max cost of 30000
result = CostDistance(srcPoint, friction, 30000);

Python Syntax

image.zoom(zoom)
Parameters
  • zoom - the zoom level to use (1..20)
Examples:
myImage = mrgeo.load_image("my-image")
myImageZoom9 = myImage.zoom(9)

Other Functions

LeastCostPath Function

LeastCostPath calculates the path from source points to destination points on a cost raster (most likely generated by the CostDistance operation). The process follows the least cost pixels starting from the destination locations until reaching the first source location.

The output is a vector RDD that can be used as input to any map algebra function that takes a vector input source. It contains one LINESTRING feature for each destination point corresponding to the least cost path to that destination point. The output features also include the following attributes:

  • VALUE - the path cost
  • DISTANCE - the path distance in meters (the sum of the distances between the points in the linestring)
  • MINSPEED - the minimum speed along the path
  • MAXSPEED - the maximum speed along the path
  • AVGSPEED - the average speed along the path (i.e. DISTANCE / VALUE)
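The tracing step can be illustrated with a small Python sketch that walks downhill on an accumulated cost grid. It assumes source pixels have a cost of 0, considers all eight neighbors, and measures distance in grid units scaled by a hypothetical `cell_size`; it is a simplified picture of the idea, not MrGeo's algorithm.

```python
import math

def least_cost_path(cost, dest, cell_size=1.0):
    """Follow the cheapest neighbor from dest until a source (cost 0) is reached."""
    rows, cols = len(cost), len(cost[0])
    r, c = dest
    path = [(r, c)]
    distance = 0.0
    while cost[r][c] > 0:
        neighbors = [(r + dr, c + dc)
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                     if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols]
        nr, nc = min(neighbors, key=lambda p: cost[p[0]][p[1]])
        if cost[nr][nc] >= cost[r][c]:
            raise ValueError("no cheaper neighbor; cannot reach a source")
        distance += cell_size * math.hypot(nr - r, nc - c)
        r, c = nr, nc
        path.append((r, c))
    value = cost[dest[0]][dest[1]]          # VALUE: the path cost
    return path, value, distance

cost = [[0.0, 1.0, 2.0],
        [1.0, 1.4, 2.4],
        [2.0, 2.4, 2.8]]
path, value, distance = least_cost_path(cost, (2, 2))
avg_speed = distance / value                # AVGSPEED = DISTANCE / VALUE
```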

Command Line Syntax

LeastCostPath(costRaster, destinationPoints)
Parameters
  • costRaster

    The raster populated with cost values for each pixel

  • destinationPoints

    A source of vector points. One least cost path will be generated for each point to the closest source point from the cost raster.

Examples:
dest = InlineCsv("GEOMETRY","'POINT(66.2 34.7)'");
lcp = LeastCostPath([/mrgeo/images/cost], dest);

Python Syntax

destinationPoints.leastcostpath(costRaster)
Parameters
  • costRaster

    The raster populated with cost values for each pixel

  • destinationPoints

    A source of vector points. One least cost path will be generated for each point to the closest source point from the cost raster.

Examples:
costRaster = costdistance(...)
destinationPoints = inlinecsv("GEOMETRY","'POINT(66.2 34.7)'")
lcp = destinationPoints.leastcostpath(costRaster)

Vector Data Support

While MrGeo is a raster processing system, for some of its map algebra functionality, it needs to be able to read vector data. Notable examples are cost distance, least cost path, rasterize vector and rasterize points. MrGeo also needs to support saving vector data since the least cost path function produces a linestring with some attributes.

Vector Input

MrGeo supports reading several types of vector data outlined in the following sections. In map algebra, a vector input source must be enclosed in square brackets (e.g. [roadways.shp]).

It also supports defining vector data directly in map algebra for cases where there is not much data involved - like specifying a point for cost distance or least cost path. This can be done with the InlineCsv map algebra function described below.

Delimited Text

MrGeo supports either comma-separated or tab-separated vector data. To read from a comma-separated format, the referenced filename must have a ".csv" extension, and a tab-separated file must have a ".tsv" extension. A schema must also be provided, and that can be done either in the first line of the delimited file itself or with a ".columns" file stored in the same location with the ".tsv" or ".csv" file.

If the first line of the file contains the schema, it should specify a name for each field. Most field names have no special meaning to MrGeo. For the geometry, however, you must either name two fields "x" and "y" (for point data, holding longitude and latitude in degrees) or name a single field "geometry" whose value is WKT.

In the delimited file itself, fields are separated by the appropriate delimiter (comma or tab based on the extension of the file as noted above). String fields can be enclosed in double quotes. Embedded double quotes inside a field value are not supported. A field value can contain the delimiter as long as the value is enclosed in double quotes.
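To make the file layout concrete, the following Python sketch reads delimited text whose first line is the schema and turns each record into a WKT string plus its remaining attributes. It uses only the standard `csv` module; the function name is hypothetical, and matching the special field names case-insensitively is an assumption.

```python
import csv
import io

def read_delimited(text, extension):
    """Parse .csv or .tsv content whose first line holds the field names."""
    delimiter = {".csv": ",", ".tsv": "\t"}[extension]
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    features = []
    for row in reader:
        lowered = {name.lower(): value for name, value in row.items()}
        if "geometry" in lowered:
            wkt = lowered["geometry"]                    # already WKT
        else:
            # x is longitude and y is latitude, both in degrees
            wkt = "POINT({} {})".format(lowered["x"], lowered["y"])
        attributes = {name: value for name, value in row.items()
                      if name.lower() not in ("x", "y", "geometry")}
        features.append((wkt, attributes))
    return features

sample = 'x,y,name\n-118.19,34.24,"Main St, north"\n'
features = read_delimited(sample, ".csv")
```

The quoted field shows a value that contains the delimiter, which is allowed as long as the value is enclosed in double quotes.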

Columns file

The columns file is an XML file containing the schema for its corresponding delimited file. For example, you could store in HDFS (or S3) a file named my-input.tsv and there must be another file in the same directory named my-input.tsv.columns. The following sample columns file is created when saving the results of least cost path, and it shows the two field types available and the straightforward syntax of the XML file:

<?xml version="1.0" encoding="UTF-8"?>
<AllColumns firstLineHeader="false">
  <Column name="GEOMETRY" type="Nominal"/>
  <Column name="VALUE" type="Numeric"/>
  <Column name="DISTANCE" type="Numeric"/>
  <Column name="MINSPEED" type="Numeric"/>
  <Column name="MAXSPEED" type="Numeric"/>
  <Column name="AVGSPEED" type="Numeric"/>
</AllColumns>
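A columns file like the one above can be read with the Python standard library. This is a small sketch for inspecting such a file outside MrGeo; the function name `read_columns` is hypothetical, and the sample is shortened to two columns.

```python
import xml.etree.ElementTree as ET

COLUMNS_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<AllColumns firstLineHeader="false">
  <Column name="GEOMETRY" type="Nominal"/>
  <Column name="VALUE" type="Numeric"/>
</AllColumns>"""

def read_columns(xml_bytes):
    """Return (first_line_header, [(name, type), ...]) from a columns file."""
    root = ET.fromstring(xml_bytes)
    first_line_header = root.get("firstLineHeader", "false").lower() == "true"
    columns = [(col.get("name"), col.get("type")) for col in root.findall("Column")]
    return first_line_header, columns

has_header, columns = read_columns(COLUMNS_XML)
```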

Shapefile

In order to read data from a shapefile, simply enclose the name of the .shp file in square brackets. Note that if you only specify the file name (e.g. [roadways.shp]), the HDFS data provider will look for that file in the vector base path specified in mrgeo.conf. You can also use the full path to the shapefile within HDFS (or S3).

PostGIS

Reading vector data from PostGIS involves specifying the source database connection properties and query information in the data source definition directly in map algebra. The available properties are described below. The properties are delimited in the data source specification by semi-colons. You can use double quotes around the property values if needed. An example is:

[pg:url=jdbc:postgresql://my.server/planet;username=my-user;password="my-password";query="select highway, ST_AsText(way) AS geom from planet_osm_line where ST_Intersects(way, ST_MakeEnvelope(6, 46, 10, 50, 4326)) AND (highway is not null)";ssl=true;wktField=geom;geometryField=way]

PostGIS data source properties include:

  • url - The JDBC URL for connecting to your Postgres database instance. It looks like "jdbc:postgresql://my.server/db-name", replacing "my.server" with the name or IP address of your server and "db-name" with the name of the database on that Postgres instance to connect to.
  • username - The name of the database user for running the query. Make sure to pick a user who has read access to the table(s) to be queried.
  • password - The password for the selected user.
  • query - The query to execute (in PostGIS syntax). You may select any subset of columns from a single table or a joined set of tables. You may include a WHERE clause or whatever other syntax is needed. One of the returned fields must contain the geometry in WKT format. The PostGIS ST_AsText function returns WKT for a geometry field. It can be an actual field or an aliased field.
  • ssl - true or false depending on whether your Postgres instance requires connecting via SSL.
  • wktField - The name of the returned column containing the WKT for the geometry. MrGeo converts this WKT to a MrGeo geometry object internally for processing.
  • geometryField - The name of the actual column containing the geometry. This is used for computing the bounds of the returned set of features for metadata. This is not needed if you have specified mbrQuery.
  • countQuery - MrGeo will attempt to dynamically generate a SQL statement for counting the total number of features being queried. That count is used for partitioning the data for processing. It does this by using the 'query' field value and replacing everything in between the first "SELECT" and "FROM" keywords with "COUNT(*)". Due to the complexity of SQL, that approach may not always work, in which case the user can explicitly specify the count query using this property.
  • mbrQuery - This property is used only when reporting metadata from the command line. If this property is not specified, MrGeo will try to use the value of 'query', replacing everything between the first "SELECT" and "FROM" keywords with "ST_AsText(ST_Extent(geometryField))" to query the database for the extents (minimum bounding rectangle) of the features. Due to the complexity of SQL syntax, this may not always work, and in that case, the mbrQuery property can be explicitly set here. Note that "geometryField" is replaced with the actual value of the "geometryField" property you define in this data source.
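The substitution described for countQuery and mbrQuery can be approximated with a regular expression. The Python sketch below is an illustration of the described rule, not MrGeo's code, and the function name `derive_query` is hypothetical. It also shows a query for which the rule breaks, which is when you would set countQuery or mbrQuery yourself.

```python
import re

def derive_query(query, replacement):
    """Replace everything between the first SELECT and the first FROM."""
    match = re.search(r"\bselect\b(.*?)\bfrom\b", query,
                      flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("query has no SELECT ... FROM")
    return query[:match.start(1)] + " " + replacement + " " + query[match.end(1):]

query = ("select highway, ST_AsText(way) AS geom "
         "from planet_osm_line where highway is not null")
count_query = derive_query(query, "COUNT(*)")
mbr_query = derive_query(query, "ST_AsText(ST_Extent(way))")

# A FROM keyword inside the select list defeats the simple rule.
broken = derive_query("select extract(year from ts) as y from t", "COUNT(*)")
```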

GeoWave

Reading data from GeoWave requires configuring MrGeo properly to connect to GeoWave. Once that is accomplished, you can read from a GeoWave data source by specifying the name of the GeoWave source you wish to read. In addition, MrGeo allows you to perform filtering on your GeoWave data if desired.

Configuring GeoWave Access

Modify mrgeo.conf to include the following settings:

# Since GeoWave does not support writing vector data, set the preferred provider
# to hdfs which does support writing.
preferred.vector.provider=hdfs

# Specify the GeoWave store names that MrGeo can access
geowave.storenames=mystore

# Include index settings from GeoWave config file. These settings
# are stored by GeoWave in ~/.geowave/0.9.1-config.properties. They
# are written there after running the GeoWave "addindex" command.
# Multiple indexes can be specified.
index.myindex.opts.allTiers=false
index.myindex.opts.numPartitions=32
index.myindex.opts.partitionStrategy=ROUND_ROBIN
index.myindex.opts.pointOnly=false
index.myindex.type=spatial

# Include store settings from GeoWave config file. These settings
# are stored by GeoWave in ~/.geowave/0.9.1-config.properties. They
# are written there after running the GeoWave "addstore" command.
# This will make all of the adapters in the specified data stores
# available to MrGeo. Multiple data stores can be specified.
store.mystore.opts.createTable=true
store.mystore.opts.enableBlockCache=true
store.mystore.opts.gwNamespace=geowave_test
store.mystore.opts.instance=accumulo
store.mystore.opts.password=secret
store.mystore.opts.persistAdapter=true
store.mystore.opts.persistDataStatistics=true
store.mystore.opts.persistIndex=true
store.mystore.opts.useAltIndex=false
store.mystore.opts.useLocalityGroups=true
store.mystore.opts.user=root
store.mystore.opts.zookeeper=localhost\:2181
store.mystore.type=accumulo

GeoWave Map Algebra Syntax

Within map algebra, GeoWave data sources are enclosed in square brackets like all other data sources. To disambiguate them from other sources, it is suggested to use the "geowave" prefix recognized by the GeoWave data provider for MrGeo - e.g. [geowave:mystore.roadways]. In this case, it will read all of the features from the roadways resource within the mystore data store of GeoWave. The syntax also allows for filtering of data from a specified GeoWave resource. One or more of the following filtering mechanisms can be specified (separated by semi-colons).

Timestamp Filtering

Timestamp filtering works when the GeoWave resource is indexed with timestamp information and therefore performs the filtering very quickly. The syntax is as follows:

[mystore.events;startTime=2016-01-01T00:00:00;endTime=2016-02-01T00:00:00]

Spatial Filtering

Spatial filtering works when the GeoWave resource is indexed spatially and therefore performs the filtering very quickly. The syntax is as follows:

[geowave:mystore.events;spatial="POLYGON((-120 30, -120 40, -115 40, -115 30, -120 30))"]

Within the WKT specification of the geometry to filter by, when you surround the WKT with double quotes, you can embed double quotes in the WKT by escaping them with a backslash (\).

CQL Filtering

CQL filtering is meant to filter by attributes stored with the vector data. For fields that are indexed, using the other filters is significantly faster because GeoWave can take advantage of the indexes. With CQL filtering, the indexes are not used. The CQL syntax itself follows the Geoserver CQL syntax. For example:

[geowave:mystore.counties;cql="STATE=\"CA\""]

Combined Filtering

In order to combine multiple types of filtering, use a semi-colon to separate the filters as shown below:

[geowave:mystore.events;spatial="POLYGON((-118.5 33.5, -118.5 34.5, -117.5 34.5, -117.5 33.5, -118.5 33.5))";cql="CRIME_TYPE = \"theft\""]
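When these data source strings are generated from a program, the quoting and escaping is easy to get wrong. The Python helper below is a hypothetical convenience, not part of MrGeo: it assembles the bracketed string from a store name, a resource name and filters, wrapping the spatial and cql values in double quotes and escaping embedded double quotes with a backslash.

```python
def geowave_source(store, resource, **filters):
    """Build a GeoWave data source string for use in map algebra."""
    parts = ["geowave:{}.{}".format(store, resource)]
    for key, value in filters.items():
        if key in ("spatial", "cql"):
            # Quote the value and escape any double quotes inside it.
            value = '"' + value.replace('"', '\\"') + '"'
        parts.append("{}={}".format(key, value))
    return "[" + ";".join(parts) + "]"

by_state = geowave_source("mystore", "counties", cql='STATE="CA"')
by_time = geowave_source("mystore", "events",
                         startTime="2016-01-01T00:00:00",
                         endTime="2016-02-01T00:00:00")
```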

InlineCsv Map Algebra Function

This function is a convenience for the case where there is a small set of features to use as vector input so that the features can be included directly in map algebra instead of having to load them from an external data source.

Each feature can include geometry as well as additional attributes, and multiple features are supported. The entire set of features is specified as strings in map algebra: the schema for the data is specified first, followed by the records/features, which are delimited by semi-colons (;). Fields within a record are delimited by commas.

Command Line Syntax

InlineCsv(columnNames, values)
Parameters
  • columnNames - a delimited string containing the definition for each column

    Column names are delimited by a comma. The feature geometry can be read either as longitude and latitude coordinates for points or as WKT geometry. If longitudes and latitudes are used, those fields must be named "x" (longitude) and "y" (latitude). If a WKT geometry field is used, you must name that field "geometry".

  • values - a delimited string containing one or more records with values for each column

    Multiple features can be specified, separated by semi-colons. Within each record, column values must be separated by commas. Field values can be encapsulated in single quotes if needed. Use of single quotes is required if field values can contain embedded semi-colons or commas. It does not support embedded single quotes. For the "geometry" column, the values must be in WKT format.

Examples:
InlineCsv("x,y,crime_type", "-118.1934, 34.243, theft;-117.845, 33.649, breaking and entering");
InlineCsv("geometry,crime_type", "'POINT(-118.1934 34.243)', theft;'POINT(-117.845 33.649)', breaking and entering");

Python Syntax

inlinecsv(columnNames, values)
Parameters
  • columnNames - a delimited string containing the definition for each column

    Column names are delimited by a comma. The feature geometry can be read either as longitude and latitude coordinates for points or as WKT geometry. If longitudes and latitudes are used, those fields must be named "x" (longitude) and "y" (latitude). If a WKT geometry field is used, you must name that field "geometry".

  • values - a delimited string containing one or more records with values for each column

    Multiple features can be specified, separated by semi-colons. Within each record, column values must be separated by commas. Field values can be encapsulated in single quotes if needed. Use of single quotes is required if field values can contain embedded semi-colons or commas. It does not support embedded single quotes. For the "geometry" column, the values must be in WKT format.

Examples:
inlinecsv("x,y,crime_type", "-118.1934, 34.243, theft;-117.845, 33.649, breaking and entering")
inlinecsv("geometry,crime_type", "'POINT(-118.1934 34.243)', theft;'POINT(-117.845 33.649)', breaking and entering")
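When the features come from Python data rather than hand-written strings, the two arguments could be generated along the following lines. This is a sketch only: build_inline_csv_args and quote_field are hypothetical names, not part of MrGeo, and the quoting logic simply applies the rules described above (single quotes around values containing commas or semi-colons, no embedded single quotes).

```python
def quote_field(value):
    """Single-quote a field value when it contains a delimiter."""
    text = str(value)
    if "'" in text:
        # Embedded single quotes are documented as unsupported.
        raise ValueError("embedded single quotes are not supported: " + text)
    if "," in text or ";" in text:
        return "'" + text + "'"
    return text

def build_inline_csv_args(column_names, records):
    """Return the (columnNames, values) strings for the function call.

    Hypothetical helper: columns and fields are joined with commas,
    records with semi-colons.
    """
    for record in records:
        if len(record) != len(column_names):
            raise ValueError("record length does not match column count")
    columns = ",".join(column_names)
    values = ";".join(
        ",".join(quote_field(v) for v in record) for record in records)
    return columns, values

columns, values = build_inline_csv_args(
    ["geometry", "crime_type"],
    [("POINT(-118.1934 34.243)", "theft"),
     ("POINT(-117.845 33.649)", "breaking and entering")])
```

The returned columns and values strings can then be passed as the two arguments shown in the examples above.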

Vector Output

MrGeo can currently write vector data only to a delimited text file in HDFS or S3. See here for more information on the actual format.
