Merge pull request #446 from databrickslabs/feature/docs
Add new raster capabilities to the docs.
Milos Colic authored Nov 6, 2023
2 parents 67a0cdc + 06ceda9 commit ab7b5c7
Showing 4 changed files with 1,492 additions and 509 deletions.
24 changes: 12 additions & 12 deletions docs/source/api/raster-format-readers.rst
@@ -20,7 +20,7 @@ Mosaic provides spark readers for the following raster formats:
* XPM using .xpm file extension - https://gdal.org/drivers/raster/xpm.html
* GRIB using .grb file extension - https://gdal.org/drivers/raster/grib.html
* Zarr using .zarr file extension - https://gdal.org/drivers/raster/zarr.html
-Other formats supported by GDAL will be added in future releases.
+Other formats are supported if supported by GDAL available drivers.

Mosaic provides two flavors of the readers:
* spark.read.format("gdal") for reading 1 file per spark task
@@ -32,7 +32,7 @@ spark.read.format("gdal")
A base Spark SQL data source for reading GDAL raster data sources.
It reads metadata of the raster and exposes the direct paths for the raster files.
The output of the reader is a DataFrame with the following columns:
-* path - path to the raster file on dbfs (StringType)
+* tile - loaded raster tile (RasterTileType)
* ySize - height of the raster in pixels (IntegerType)
* xSize - width of the raster in pixels (IntegerType)
* bandCount - number of bands in the raster (IntegerType)
@@ -59,23 +59,23 @@ The output of the reader is a DataFrame with the following columns:
.option("driverName", "GTiff")\
.load("dbfs:/path/to/raster.tif")
df.show()
-+--------------------+-----+-----+---------+--------------------+--------------------+----+--------------------+
-| path|ySize|xSize|bandCount| metadata| subdatasets|srid| proj4Str|
-+--------------------+-----+-----+---------+--------------------+--------------------+----+--------------------+
-|dbfs:/path/to/ra...| 100| 100| 1|{AREA_OR_POINT=Po...| null| 4326|+proj=longlat +da...|
-+--------------------+-----+-----+---------+--------------------+--------------------+----+--------------------+
++---------------------------------------------------------------------------------------------------------------+------+------+----------+---------------------+--------------------+-----+----------------------+
+| tile| ySize| xSize| bandCount| metadata| subdatasets| srid| proj4Str|
++---------------------------------------------------------------------------------------------------------------+------+------+----------+---------------------+--------------------+-----+----------------------+
+| {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "GTiff" } | 100 | 100 | 1 | {AREA_OR_POINT=Po...| null| 4326| +proj=longlat +da...|
++---------------------------------------------------------------------------------------------------------------+------+------+----------+---------------------+--------------------+-----+----------------------+

.. code-tab:: scala

val df = spark.read.format("gdal")
.option("driverName", "GTiff")
.load("dbfs:/path/to/raster.tif")
df.show()
-+--------------------+-----+-----+---------+--------------------+--------------------+----+--------------------+
-| path|ySize|xSize|bandCount| metadata| subdatasets|srid| proj4Str|
-+--------------------+-----+-----+---------+--------------------+--------------------+----+--------------------+
-|dbfs:/path/to/ra...| 100| 100| 1|{AREA_OR_POINT=Po...| null| 4326|+proj=longlat +da...|
-+--------------------+-----+-----+---------+--------------------+--------------------+----+--------------------+
++---------------------------------------------------------------------------------------------------------------+------+------+----------+---------------------+--------------------+-----+----------------------+
+| tile| ySize| xSize| bandCount| metadata| subdatasets| srid| proj4Str|
++---------------------------------------------------------------------------------------------------------------+------+------+----------+---------------------+--------------------+-----+----------------------+
+| {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "GTiff" } | 100 | 100 | 1 | {AREA_OR_POINT=Po...| null| 4326| +proj=longlat +da...|
++---------------------------------------------------------------------------------------------------------------+------+------+----------+---------------------+--------------------+-----+----------------------+

.. warning::
Issue 350: https://github.com/databrickslabs/mosaic/issues/350
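
Note on the hunks above: the reader output replaces the `path` column with a `tile` column, so code that selected `path` would need to read the source path from the tile instead. The following is a minimal sketch of that adjustment, not the project's own code. It assumes `tile` is a struct exposing `parentPath` and `driver` fields, as the sample output suggests; the helper names `read_raster` and `tile_overview` are hypothetical and exist only for illustration.

```python
def read_raster(spark, path, driver_name="GTiff"):
    """Load raster file(s) through the "gdal" data source shown in the diff.

    `spark` is an existing SparkSession with Mosaic enabled; `driver_name`
    is passed through as the reader's "driverName" option.
    """
    return (
        spark.read.format("gdal")
        .option("driverName", driver_name)
        .load(path)
    )


def tile_overview(df):
    """Project the source path and driver out of the `tile` column.

    Assumption: `tile` is a struct with `parentPath` and `driver` fields,
    as in the sample output above. Adjust the field names if the actual
    schema differs (check with df.printSchema()).
    """
    return df.select(
        "tile.parentPath", "tile.driver", "xSize", "ySize", "bandCount"
    )


# Usage on a cluster with Mosaic enabled (not executed here):
# df = read_raster(spark, "dbfs:/path/to/raster.tif")
# tile_overview(df).show(truncate=False)
```

Keeping the session as a parameter avoids importing PySpark in the helper itself, which makes the call chain easy to check against a stub before running it on a cluster.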