Commit

Merge branch 'databrickslabs:main' into main
a0x8o authored Jan 25, 2024
2 parents 1d3042f + 9e01d87 commit 25cf059
Showing 77 changed files with 1,724 additions and 1,599 deletions.
@@ -78,8 +78,8 @@ test_that("aggregate vector functions behave as intended", {
  sdf.intersection <- join(sdf.l, sdf.r, sdf.l$left_index == sdf.r$right_index, "inner")
  sdf.intersection <- summarize(
    groupBy(sdf.intersection, sdf.intersection$left_id, sdf.intersection$right_id),
-   agg_intersects = st_intersects_aggregate(column("left_index"), column("right_index")),
-   agg_intersection = st_intersection_aggregate(column("left_index"), column("right_index")),
+   agg_intersects = st_intersects_agg(column("left_index"), column("right_index")),
+   agg_intersection = st_intersection_agg(column("left_index"), column("right_index")),
    left_geom = first(column("left_geom")),
    right_geom = first(column("right_geom"))
  )
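The hunk above only renames the aggregate bindings (`st_intersects_aggregate` becomes `st_intersects_agg`, `st_intersection_aggregate` becomes `st_intersection_agg`); the semantics are unchanged. A toy pure-Python stand-in, using axis-aligned boxes in place of real geometries (the helper names below are hypothetical illustrations, not Mosaic's API), sketches what the intersects aggregate computes per group:

```python
def boxes_intersect(a, b):
    """Intersection test for axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def st_intersects_agg(left, right):
    """Toy stand-in for an intersects aggregate: True if any geometry on the
    left side of the grouped join intersects any geometry on the right side."""
    return any(boxes_intersect(l, r) for l in left for r in right)

# One overlapping pair in the group -> the aggregate is True.
print(st_intersects_agg([(0, 0, 2, 2)], [(1, 1, 3, 3)]))  # -> True
# Disjoint boxes -> False.
print(st_intersects_agg([(0, 0, 2, 2)], [(5, 5, 6, 6)]))  # -> False
```

In the real library the left/right inputs are grid-index/geometry columns and the work is distributed; the sketch only mirrors the per-group any-intersects semantics the tests assert.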
@@ -92,8 +92,8 @@ test_that("aggregate vector functions behave as intended", {
    inner_join(sdf.r, by = c("left_index" = "right_index"), keep = TRUE) %>%
    dplyr::group_by(left_id, right_id) %>%
    dplyr::summarise(
-     agg_intersects = st_intersects_aggregate(left_index, right_index),
-     agg_intersection = st_intersection_aggregate(left_index, right_index),
+     agg_intersects = st_intersects_agg(left_index, right_index),
+     agg_intersection = st_intersection_agg(left_index, right_index),
      left_geom = max(left_geom, 1),
      right_geom = max(right_geom, 1)
    ) %>%
18 changes: 9 additions & 9 deletions README.md
@@ -49,7 +49,7 @@ We recommend using Databricks Runtime version 13.3 LTS with Photon enabled.

> DEPRECATION ERROR: Mosaic v0.4.x series only supports Databricks Runtime 13. You can specify `%pip install 'databricks-mosaic<0.4,>=0.3'` for DBR < 13.
- As of the 0.4.0 release, Mosaic issues the following ERROR when initialized on a cluster that is neither Photon Runtime nor Databricks Runtime ML [[ADB](https://learn.microsoft.com/en-us/azure/databricks/runtime/) | [AWS](https://docs.databricks.com/runtime/index.html) | [GCP](https://docs.gcp.databricks.com/runtime/index.html)]:
+ :warning: **Mosaic 0.4.x series issues the following ERROR on a standard, non-Photon cluster [[ADB](https://learn.microsoft.com/en-us/azure/databricks/runtime/) | [AWS](https://docs.databricks.com/runtime/index.html) | [GCP](https://docs.gcp.databricks.com/runtime/index.html)]:**

> DEPRECATION ERROR: Please use a Databricks Photon-enabled Runtime for performance benefits or Runtime ML for spatial AI benefits; Mosaic 0.4.x series restricts executing this cluster.
@@ -147,14 +147,14 @@ __Note: Mosaic 0.4.x SQL bindings for DBR 13 not yet available in Unity Catalog

Here are some example notebooks, check the language links for latest [[Python](/notebooks/examples/python/) | [Scala](/notebooks/examples/scala/) | [SQL](/notebooks/examples/sql/) | [R](/notebooks/examples/R/)]:

- | Example | Description | Links |
- | --- | --- | --- |
- | __Quick Start__ | Example of performing spatial point-in-polygon joins on the NYC Taxi dataset | [python](/notebooks/examples/python/QuickstartNotebook.ipynb), [scala](notebooks/examples/scala/QuickstartNotebook.ipynb), [R](notebooks/examples/R/QuickstartNotebook.r), [SQL](notebooks/examples/sql/QuickstartNotebook.ipynb) |
- | Shapefiles | Examples of reading multiple shapefiles | [python](notebooks/examples/python/Shapefiles/) |
- | Spatial KNN | Runnable notebook-based example using Mosaic [SpatialKNN](https://databrickslabs.github.io/mosaic/models/spatial-knn.html) model | [python](notebooks/examples/python/SpatialKNN) |
- | NetCDF | Read multiple NetCDFs, process through various data engineering steps before analyzing and rendering | [python](notebooks/examples/python/NetCDF/) |
- | STS Transfers | Detecting Ship-to-Ship transfers at scale by leveraging Mosaic to process AIS data. | [python](notebooks/examples/python/Ship2ShipTransfers), [blog](https://medium.com/@timo.roest/ship-to-ship-transfer-detection-b370dd9d43e8) |
- | EO Gridded STAC | End-to-end Earth Observation series showing downloading Sentinel-2 STAC assets for Alaska from [MSFT Planetary Computer](https://planetarycomputer.microsoft.com/), tiling them to H3 global grid, band stacking, NDVI, merging (mosaicing), clipping, and applying a [Segment Anything Model](https://huggingface.co/facebook/sam-vit-huge) | [python](notebooks/examples/python/EarthObservation/EOGriddedSTAC) |
+ | Example | Description | Links |
+ | --- | --- | --- |
+ | __Quick Start__ | Example of performing spatial point-in-polygon joins on the NYC Taxi dataset | [python](/notebooks/examples/python/Quickstart/QuickstartNotebook.ipynb), [scala](notebooks/examples/scala/QuickstartNotebook.ipynb), [R](notebooks/examples/R/QuickstartNotebook.r), [SQL](notebooks/examples/sql/QuickstartNotebook.ipynb) |
+ | Shapefiles | Examples of reading multiple shapefiles | [python](notebooks/examples/python/Shapefiles/) |
+ | Spatial KNN | Runnable notebook-based example using Mosaic [SpatialKNN](https://databrickslabs.github.io/mosaic/models/spatial-knn.html) model | [python](notebooks/examples/python/SpatialKNN) |
+ | NetCDF | Read multiple NetCDFs, process through various data engineering steps before analyzing and rendering | [python](notebooks/examples/python/NetCDF/) |
+ | STS Transfers | Detecting Ship-to-Ship transfers at scale by leveraging Mosaic to process AIS data. | [python](notebooks/examples/python/Ship2ShipTransfers), [blog](https://medium.com/@timo.roest/ship-to-ship-transfer-detection-b370dd9d43e8) |
+ | EO Gridded STAC | End-to-end Earth Observation series showing downloading Sentinel-2 STAC assets for Alaska from [MSFT Planetary Computer](https://planetarycomputer.microsoft.com/), tiling them to H3 global grid, band stacking, NDVI, merging (mosaicing), clipping, and applying a [Segment Anything Model](https://huggingface.co/facebook/sam-vit-huge) | [python](notebooks/examples/python/EarthObservation/EOGriddedSTAC) |

You can import those examples in Databricks workspace using [these instructions](https://docs.databricks.com/en/notebooks/index.html).

7 changes: 4 additions & 3 deletions docs/source/api/raster-format-readers.rst
@@ -25,8 +25,9 @@ Mosaic provides spark readers for the following raster formats:
Other formats are supported if a corresponding GDAL driver is available.

  Mosaic provides two flavors of the readers:
- * spark.read.format("gdal") for reading 1 file per spark task
- * mos.read().format("raster_to_grid") reader that automatically converts raster to grid.
+
+ * :code:`spark.read.format("gdal")` for reading 1 file per spark task
+ * :code:`mos.read().format("raster_to_grid")` reader that automatically converts raster to grid.


spark.read.format("gdal")
@@ -91,7 +92,7 @@ mos.read().format("raster_to_grid")
***********************************
Reads a GDAL raster file and converts it to a grid.
  It follows the standard spark.read.format(*).option(*).load(*) pattern.
- The only difference is that it uses mos.read() instead of spark.read().
+ The only difference is that it uses :code:`mos.read()` instead of :code:`spark.read()`.
The raster pixels are converted to grid cells using specified combiner operation (default is mean).
If the raster pixels are larger than the grid cells, the cell values can be calculated using interpolation.
The interpolation method used is Inverse Distance Weighting (IDW) where the distance function is a k_ring
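As a rough illustration of that interpolation step, here is a minimal pure-Python IDW sketch; it uses 1-D integer cell indices as a stand-in for Mosaic's k_ring grid distance and is an assumption-laden illustration, not Mosaic's actual implementation:

```python
def idw_interpolate(known, target, power=2):
    """Estimate a value at `target` from (cell, value) pairs by inverse
    distance weighting: weight = 1 / distance**power. The integer cell
    difference below stands in for a k_ring grid distance."""
    num = 0.0
    den = 0.0
    for cell, value in known:
        d = abs(cell - target)  # stand-in for k_ring distance
        if d == 0:
            return value  # the target cell has a known value: no interpolation
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# A cell midway between two known cells gets their average:
print(idw_interpolate([(0, 10.0), (4, 20.0)], 2))  # -> 15.0
```

With `power=2` nearer cells dominate; raising `power` sharpens that falloff, which mirrors how an IDW combiner keeps cell values local to the contributing raster pixels.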
