Python/fix/raster tests #452

Merged on Nov 10, 2023 (51 commits)
Changes from 38 commits
Commits
6446112
added back R to build automation
sllynn Oct 23, 2023
19a71ec
temporarily reduced min coverage
sllynn Oct 23, 2023
0c2fbe5
temporarily reduced min coverage
sllynn Oct 24, 2023
f6bd09c
update R `libPath`
sllynn Oct 24, 2023
af24a7c
install dependencies from posit public binaries
sllynn Oct 24, 2023
05f3c1a
remove references to `RasterAPI`
sllynn Oct 24, 2023
739c89d
added caching for slow download of Spark
sllynn Oct 24, 2023
c53995c
added caching for slow download of Spark
sllynn Oct 24, 2023
cb11c4b
updated user agent for installing from posit public binaries
sllynn Oct 24, 2023
0c9610e
isolate install.packages issue
sllynn Oct 24, 2023
cdf762a
isolate install.packages issue
sllynn Oct 24, 2023
9fcca23
isolate install.packages issue
sllynn Oct 24, 2023
51145fb
isolate install.packages issue
sllynn Oct 24, 2023
318ff64
isolate install.packages issue
sllynn Oct 24, 2023
20c65f7
isolate install.packages issue
sllynn Oct 24, 2023
adfbda3
split out steps to enable caching of dependencies
sllynn Oct 25, 2023
db84dcb
split out steps to enable caching of dependencies
sllynn Oct 25, 2023
bd0f8d5
add back R tests
sllynn Oct 25, 2023
d5f18f4
fixed existing R tests
sllynn Oct 25, 2023
1aa7bc2
testing rlib actions
sllynn Oct 25, 2023
280f003
corrected typo in `build_main.yml`
sllynn Oct 25, 2023
4a9e4c5
testing `use-public-rspm` flag
sllynn Oct 25, 2023
ec256c8
testing `use-public-rspm` flag
sllynn Oct 25, 2023
0c9c598
testing `use-public-rspm` flag
sllynn Oct 25, 2023
a6f2fc0
removed some cruft from the R build files
sllynn Oct 25, 2023
0f60feb
restored `minimum.coverage` in `pom.xml`
sllynn Oct 25, 2023
fc1cc74
skip coverage check if tests skipped
sllynn Oct 26, 2023
f0d3dd0
Merge branch 'feature/grid_tiles' of github.com:databrickslabs/mosaic…
sllynn Oct 30, 2023
03d09c7
Merge branch 'main' of github.com:databrickslabs/mosaic into python/f…
sllynn Nov 8, 2023
2c6e096
added python tests for some raster functions. corrected some issues e…
sllynn Nov 8, 2023
dff0def
change to untar procedure to skip existing files (as we expected a ze…
sllynn Nov 8, 2023
bb2beda
removed a stray scratch file
sllynn Nov 8, 2023
ea13470
added scala project changes
sllynn Nov 8, 2023
ece8ec1
added python tests for all non-aggregation raster functions
sllynn Nov 8, 2023
96040b2
typo in setup.cfg
sllynn Nov 9, 2023
6222dce
added numpy as setup requirement to make GDAL native array type work
sllynn Nov 9, 2023
67f4d98
trying to fix GDAL / numpy dependency
sllynn Nov 9, 2023
aee46b3
fix `on` filters for automation?
sllynn Nov 9, 2023
fd853bb
added explicit numpy installation to GH actions
sllynn Nov 9, 2023
43a7ae8
added explicit gdal deps installation to GH actions, even if tests sk…
sllynn Nov 9, 2023
e7e4192
added explicit gdal deps installation to GH actions, even if tests sk…
sllynn Nov 9, 2023
f8fb537
added explicit gdal deps installation to GH actions, even if tests sk…
sllynn Nov 9, 2023
43342b5
added explicit gdal deps installation to GH actions, even if tests sk…
sllynn Nov 9, 2023
a4ab661
added explicit gdal deps installation to GH actions, even if tests sk…
sllynn Nov 9, 2023
8734597
added explicit gdal deps installation to GH actions, even if tests sk…
sllynn Nov 9, 2023
e8cb0e7
added more gdal extended dependencies
sllynn Nov 9, 2023
ed037c7
facepalm
sllynn Nov 9, 2023
1c38d55
facepalm++
sllynn Nov 9, 2023
943ff63
bump spark ver for 12.2LTS and remove ubuntugis deps (test)
sllynn Nov 9, 2023
f5525c5
added final tests for raster aggregators
sllynn Nov 9, 2023
0461940
removed st_buffer_cap_style
sllynn Nov 9, 2023
8 changes: 4 additions & 4 deletions .github/workflows/build_main.yml
@@ -2,10 +2,10 @@ name: build main
 on:
   push:
     branches-ignore:
-      - "R/*"
-      - "r/*"
-      - "python/*"
-      - "scala/*"
+      - "R/**"
+      - "r/**"
+      - "python/**"
+      - "scala/**"
   pull_request:
     branches:
       - "**"
2 changes: 1 addition & 1 deletion .github/workflows/build_python.yml
@@ -3,7 +3,7 @@ name: build_python
 on:
   push:
     branches:
-      - "python/*"
+      - "python/**"

 jobs:
   build:
4 changes: 2 additions & 2 deletions .github/workflows/build_r.yml
@@ -3,8 +3,8 @@ name: build_R
 on:
   push:
     branches:
-      - 'r/*'
-      - 'R/*'
+      - 'r/**'
+      - 'R/**'

 jobs:
   build:
2 changes: 1 addition & 1 deletion .github/workflows/build_scala.yml
@@ -2,7 +2,7 @@ name: build_scala
 on:
   push:
     branches:
-      - "scala/"
+      - "scala/**"

 jobs:
   build:
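A note on the filter change in the four workflow files above: in GitHub Actions branch filters, `*` does not match a `/`, while `**` does, so a pattern such as `python/*` skips branches whose names contain a second slash. The sketch below is a simplified model of that rule, not GitHub's own matcher. It handles only `*` and `**`; the helper names and the branch `python/fix/example` are hypothetical.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simplified branch filter into a regex.

    Only `*` and `**` are treated as wildcards (an assumption made for
    this sketch); every other character is matched literally.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")  # `**` may cross `/` boundaries
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # `*` stops at the next `/`
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def branch_matches(pattern: str, branch: str) -> bool:
    """Return True when the whole branch name matches the filter."""
    return glob_to_regex(pattern).fullmatch(branch) is not None


# Hypothetical branch name with two path segments after the prefix.
branch = "python/fix/example"
old_filter_hit = branch_matches("python/*", branch)
new_filter_hit = branch_matches("python/**", branch)
```

Under this model the old `"scala/"` filter matches only a branch literally named `scala/`, which is consistent with it being replaced by `"scala/**"`.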
8 changes: 2 additions & 6 deletions python/mosaic/api/aggregators.py
@@ -48,9 +48,7 @@ def st_intersection_aggregate(
     )


-def st_intersection_agg(
-    leftIndex: ColumnOrName, rightIndex: ColumnOrName
-) -> Column:
+def st_intersection_agg(leftIndex: ColumnOrName, rightIndex: ColumnOrName) -> Column:
     """
     Computes the intersection of all `leftIndex` : `rightIndex` pairs
     and unions these to produce a single geometry.
@@ -100,9 +98,7 @@ def st_intersects_aggregate(
     )


-def st_intersects_agg(
-    leftIndex: ColumnOrName, rightIndex: ColumnOrName
-) -> Column:
+def st_intersects_agg(leftIndex: ColumnOrName, rightIndex: ColumnOrName) -> Column:
     """
     Tests if any `leftIndex` : `rightIndex` pairs intersect.

6 changes: 4 additions & 2 deletions python/mosaic/api/functions.py
@@ -208,7 +208,9 @@ def st_bufferloop(
     )


-def st_buffer_cap_style(geom: ColumnOrName, radius: ColumnOrName, cap_style: ColumnOrName) -> Column:
+def st_buffer_cap_style(
+    geom: ColumnOrName, radius: ColumnOrName, cap_style: ColumnOrName
+) -> Column:
     """
     Compute the buffered geometry based on geom and radius.

@@ -231,7 +233,7 @@ def st_buffer_cap_style(geom: ColumnOrName, radius: ColumnOrName, cap_style: Col
         "st_buffer_cap_style",
         pyspark_to_java_column(geom),
         pyspark_to_java_column(radius),
-        pyspark_to_java_column(cap_style)
+        pyspark_to_java_column(cap_style),
     )


7 changes: 2 additions & 5 deletions python/mosaic/api/gdal.py
@@ -25,14 +25,11 @@ def setup_gdal(
     -------

     """
-    sc = spark.sparkContext
-    mosaicContextClass = getattr(
-        sc._jvm.com.databricks.labs.mosaic.functions, "MosaicContext"
+    mosaicGDALObject = getattr(
+        spark.sparkContext._jvm.com.databricks.labs.mosaic.gdal, "MosaicGDAL"
     )
-    mosaicGDALObject = getattr(sc._jvm.com.databricks.labs.mosaic.gdal, "MosaicGDAL")
     mosaicGDALObject.prepareEnvironment(spark._jsparkSession, init_script_path)
     print("GDAL setup complete.\n")
-    print(f"Shared objects (*.so) stored in: {shared_objects_path}.\n")
     print(f"Init script stored in: {init_script_path}.\n")
     print(
         "Please restart the cluster with the generated init script to complete the setup.\n"
53 changes: 32 additions & 21 deletions python/mosaic/api/raster.py
@@ -13,29 +13,29 @@
     "rst_boundingbox",
     "rst_clip",
     "rst_combineavg",
-    "rst_fromfile",
     "rst_frombands",
+    "rst_fromfile",
     "rst_georeference",
-    "ret_getnodata",
+    "rst_getnodata",
     "rst_getsubdataset",
     "rst_height",
-    "rst_isempty",
     "rst_initnodata",
+    "rst_isempty",
     "rst_memsize",
-    "rst_metadata",
     "rst_merge",
-    "rst_numbands",
+    "rst_metadata",
     "rst_ndvi",
+    "rst_numbands",
     "rst_pixelheight",
     "rst_pixelwidth",
     "rst_rastertogridavg",
     "rst_rastertogridcount",
     "rst_rastertogridmax",
-    "rst_rastertogridmin",
     "rst_rastertogridmedian",
-    "rst_rastertoworldcoord",
+    "rst_rastertogridmin",
     "rst_rastertoworldcoordx",
     "rst_rastertoworldcoordy",
+    "rst_rastertoworldcoord",
     "rst_retile",
     "rst_rotation",
     "rst_scalex",
@@ -45,17 +45,17 @@
     "rst_skewy",
     "rst_srid",
     "rst_subdatasets",
-    "rst_summary",
     "rst_subdivide",
+    "rst_summary",
     "rst_tessellate",
     "rst_to_overlapping_tiles",
     "rst_tryopen",
     "rst_upperleftx",
     "rst_upperlefty",
     "rst_width",
-    "rst_worldtorastercoord",
     "rst_worldtorastercoordx",
     "rst_worldtorastercoordy",
+    "rst_worldtorastercoord",
 ]


@@ -172,7 +172,7 @@ def rst_georeference(raster: ColumnOrName) -> Column:
     )


-def ret_getnodata(raster: ColumnOrName) -> Column:
+def rst_getnodata(raster: ColumnOrName) -> Column:
     """
     Returns the nodata value of the band.

@@ -190,7 +190,7 @@ def ret_getnodata(raster: ColumnOrName) -> Column:

     """
     return config.mosaic_context.invoke_function(
-        "ret_getnodata", pyspark_to_java_column(raster)
+        "rst_getnodata", pyspark_to_java_column(raster)
     )
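The `ret_getnodata` typo appeared in three places: the `__all__` entry, the function name, and the string passed to `invoke_function`. A small export check could flag this kind of slip early. The following is only a sketch under stated assumptions: `check_exports` is a hypothetical helper, and the two `SimpleNamespace` objects are stand-ins for the module before and after this change, not the real `mosaic.api.raster`.

```python
import types
from typing import List, Tuple


def check_exports(module, prefix: str = "rst_") -> Tuple[List[str], List[str]]:
    """Return (misnamed, missing) names found in `module.__all__`.

    misnamed: exported names that do not start with `prefix`.
    missing:  exported names with no callable attribute on the module.
    """
    exported = list(getattr(module, "__all__", []))
    misnamed = [name for name in exported if not name.startswith(prefix)]
    missing = [
        name for name in exported if not callable(getattr(module, name, None))
    ]
    return misnamed, missing


# Hypothetical stand-ins for the raster module before and after the rename.
before = types.SimpleNamespace(
    __all__=["rst_height", "ret_getnodata"],
    rst_height=lambda raster: None,
    ret_getnodata=lambda raster: None,
)
after = types.SimpleNamespace(
    __all__=["rst_height", "rst_getnodata"],
    rst_height=lambda raster: None,
    rst_getnodata=lambda raster: None,
)

misnamed_before, missing_before = check_exports(before)
misnamed_after, missing_after = check_exports(after)
```

A check like this only covers naming and presence. It would not catch a wrong function name string passed to the JVM side.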


@@ -253,8 +253,7 @@ def rst_initnodata(raster: ColumnOrName) -> Column:

     """
     return config.mosaic_context.invoke_function(
-        "rst_initnodata",
-        pyspark_to_java_column(raster)
+        "rst_initnodata", pyspark_to_java_column(raster)
     )


@@ -897,13 +896,16 @@ def rst_fromfile(raster: ColumnOrName, sizeInMB: ColumnOrName) -> Column:
     """

     return config.mosaic_context.invoke_function(
-        "rst_fromfile",
-        pyspark_to_java_column(raster),
-        pyspark_to_java_column(sizeInMB)
+        "rst_fromfile", pyspark_to_java_column(raster), pyspark_to_java_column(sizeInMB)
     )


-def rst_to_overlapping_tiles(raster: ColumnOrName, width: ColumnOrName, height: ColumnOrName, overlap: ColumnOrName) -> Column:
+def rst_to_overlapping_tiles(
+    raster: ColumnOrName,
+    width: ColumnOrName,
+    height: ColumnOrName,
+    overlap: ColumnOrName,
+) -> Column:
     """
     Tiles the raster into tiles of the given size.
     :param raster:
@@ -916,7 +918,7 @@ def rst_to_overlapping_tiles(raster: ColumnOrName, width: ColumnOrName, height:
         pyspark_to_java_column(raster),
         pyspark_to_java_column(width),
         pyspark_to_java_column(height),
-        pyspark_to_java_column(overlap)
+        pyspark_to_java_column(overlap),
     )


@@ -1048,7 +1050,10 @@ def rst_worldtorastercoord(

     """
     return config.mosaic_context.invoke_function(
-        "rst_worldtorastercoord", pyspark_to_java_column(raster)
+        "rst_worldtorastercoord",
+        pyspark_to_java_column(raster),
+        pyspark_to_java_column(x),
+        pyspark_to_java_column(y),
     )


@@ -1074,7 +1079,10 @@ def rst_worldtorastercoordx(

     """
     return config.mosaic_context.invoke_function(
-        "rst_worldtorastercoordx", pyspark_to_java_column(raster)
+        "rst_worldtorastercoordx",
+        pyspark_to_java_column(raster),
+        pyspark_to_java_column(x),
+        pyspark_to_java_column(y),
     )


@@ -1100,5 +1108,8 @@ def rst_worldtorastercoordy(

     """
     return config.mosaic_context.invoke_function(
-        "rst_worldtorastercoordy", pyspark_to_java_column(raster)
+        "rst_worldtorastercoordy",
+        pyspark_to_java_column(raster),
+        pyspark_to_java_column(x),
+        pyspark_to_java_column(y),
     )
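The three `rst_worldtorastercoord*` fixes above share one pattern: the wrappers previously forwarded only `raster` and dropped `x` and `y`. The sketch below illustrates that difference without Spark or a JVM. Everything in it is a hypothetical stand-in: `RecordingContext` replaces `config.mosaic_context`, and `to_java_column` replaces `pyspark_to_java_column` with an identity function.

```python
from typing import Any, List, Tuple


class RecordingContext:
    """Hypothetical stand-in for the Mosaic context: records calls only."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, Tuple[Any, ...]]] = []

    def invoke_function(self, name: str, *args: Any) -> str:
        self.calls.append((name, args))
        return name


def to_java_column(col: Any) -> Any:
    """Identity placeholder for `pyspark_to_java_column` (assumption)."""
    return col


def worldtorastercoord_before(ctx: RecordingContext, raster: Any, x: Any, y: Any) -> str:
    # Pre-fix shape: x and y are accepted but never forwarded.
    return ctx.invoke_function("rst_worldtorastercoord", to_java_column(raster))


def worldtorastercoord_after(ctx: RecordingContext, raster: Any, x: Any, y: Any) -> str:
    # Post-fix shape: all three columns reach the underlying function.
    return ctx.invoke_function(
        "rst_worldtorastercoord",
        to_java_column(raster),
        to_java_column(x),
        to_java_column(y),
    )


ctx = RecordingContext()
worldtorastercoord_before(ctx, "tile", "x", "y")
worldtorastercoord_after(ctx, "tile", "x", "y")
args_before = ctx.calls[0][1]
args_after = ctx.calls[1][1]
```

A recording stub of this kind can check argument counts in a unit test, though it says nothing about whether the JVM function accepts those arguments.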
1 change: 1 addition & 0 deletions python/mosaic/config/config.py
@@ -10,3 +10,4 @@
 display_handler: DisplayHandler
 ipython_hook: InteractiveShell
 notebook_utils = None
+default_gdal_init_script_path: str = "/dbfs/FileStore/geospatial/mosaic/gdal/"
4 changes: 1 addition & 3 deletions python/mosaic/core/mosaic_context.py
@@ -51,9 +51,7 @@ def __init__(self, spark: SparkSession):
         IndexSystem = self._indexSystemFactory.getIndexSystem(self._index_system)
         GeometryAPIClass = getattr(self._mosaicPackageObject, self._geometry_api)

-        self._context = self._mosaicContextClass.build(
-            IndexSystem, GeometryAPIClass()
-        )
+        self._context = self._mosaicContextClass.build(IndexSystem, GeometryAPIClass())

     def invoke_function(self, name: str, *args: Any) -> MosaicColumn:
         func = getattr(self._context.functions(), name)
6 changes: 5 additions & 1 deletion python/setup.cfg
@@ -17,10 +17,14 @@ classifiers =
 [options]
 packages = find:
 python_requires = >=3.7.0
+setup_requires =
+    pyspark==3.3.2
+    ipython>=7.22.0
+
 install_requires =
     keplergl==0.3.2
     h3==3.7.3
-    ipython>=7.22.0
+    gdal[numpy]==3.4.3

 [options.package_data]
 mosaic =
Binary file not shown.
20 changes: 20 additions & 0 deletions python/test/test_gdal_install.py
@@ -0,0 +1,20 @@
+from .utils import SparkTestCase, GDALInstaller
+
+
+class TestGDALInstall(SparkTestCase):
+    def test_setup_gdal(self):
+        installer = GDALInstaller(self.spark)
+        try:
+            installer.copy_objects()
+        except Exception:
+            self.fail("Copying objects with `setup_gdal()` raised an exception.")
+
+        try:
+            installer_result = installer.run_init_script()
+        except Exception:
+            self.fail("Execution of GDAL init script raised an exception.")
+
+        self.assertEqual(installer_result, 0)
+
+        gdalinfo_result = installer.test_gdalinfo()
+        self.assertEqual(gdalinfo_result, "GDAL 3.4.3, released 2022/04/22\n")
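The final assertion compares the whole banner string, release date and trailing newline included. If only the version number matters, one possible alternative is to parse the version out of the banner and compare tuples. This is a sketch, not part of the PR: `parse_gdal_version` is a hypothetical helper, and the banner is the literal from the test above.

```python
import re
from typing import Optional, Tuple

# Matches the leading "GDAL <major>.<minor>.<patch>," portion of the banner.
_VERSION_RE = re.compile(r"^GDAL (\d+)\.(\d+)\.(\d+),")


def parse_gdal_version(banner: str) -> Optional[Tuple[int, int, int]]:
    """Extract (major, minor, patch) from a GDAL banner, or None if absent."""
    match = _VERSION_RE.match(banner)
    if match is None:
        return None
    major, minor, patch = (int(group) for group in match.groups())
    return (major, minor, patch)


banner = "GDAL 3.4.3, released 2022/04/22\n"
version = parse_gdal_version(banner)
unparsed = parse_gdal_version("command not found")
```

The trade-off is that a tuple comparison tolerates changes to the date or whitespace, while the exact-string assertion in the PR pins the full output.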