Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement MDArray API #433

Open
wants to merge 19 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 8 additions & 8 deletions Project.toml
Original file line number Diff line number Diff line change
Expand Up @@ -19,13 +19,19 @@ GeoInterfaceRecipes = "0329782f-3d07-4b52-b9f6-d3137cf03c7a"
ImageCore = "a09fc81d-aa75-5fe9-8630-4744c3626534"
Tables = "bd369af6-aec1-5ad0-b16a-f7cc5008161c"

[weakdeps]
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"

[extensions]
ArchGDALMakieExt = "Makie"

[compat]
CEnum = "0.4, 0.5"
ColorTypes = "0.10, 0.11"
Dates = "<0.0.1,1"
Dates = "<0.0.1, 1"
DiskArrays = "0.3, 0.4"
Extents = "0.1"
GDAL = "1.7"
GDAL = "1.8"
GeoFormatTypes = "0.4.2"
GeoInterface = "1"
GeoInterfaceMakie = "0.1"
Expand All @@ -35,11 +41,5 @@ Makie = "0.20, 0.21"
Tables = "1"
julia = "1.6"

[extensions]
ArchGDALMakieExt = "Makie"

[extras]
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"

[weakdeps]
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
8 changes: 8 additions & 0 deletions src/ArchGDAL.jl
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,14 @@ include("ogr/featurelayer.jl")
include("ogr/featuredefn.jl")
include("ogr/fielddefn.jl")
include("ogr/styletable.jl")
include("mdarray/types.jl")
include("mdarray/attribute.jl")
include("mdarray/dimension.jl")
include("mdarray/extendeddatatype.jl")
include("mdarray/global.jl")
include("mdarray/group.jl")
include("mdarray/highlevel.jl")
include("mdarray/mdarray.jl")
include("utilities.jl")
include("context.jl")
include("base/iterators.jl")
Expand Down
42 changes: 22 additions & 20 deletions src/constants.jl
Original file line number Diff line number Diff line change
Expand Up @@ -392,28 +392,30 @@ wkbXDR::OGRwkbByteOrder = 0x00000000

@enum(
GDALOpenFlag,
OF_READONLY = GDAL.GDAL_OF_READONLY, # 0x00
OF_UPDATE = GDAL.GDAL_OF_UPDATE, # 0x01
# OF_All = GDAL.GDAL_OF_ALL, # 0x00
OF_RASTER = GDAL.GDAL_OF_RASTER, # 0x02
OF_VECTOR = GDAL.GDAL_OF_VECTOR, # 0x04
OF_GNM = GDAL.GDAL_OF_GNM, # 0x08
OF_KIND_MASK = GDAL.GDAL_OF_KIND_MASK, # 0x1e
OF_SHARED = GDAL.GDAL_OF_SHARED, # 0x20
OF_VERBOSE_ERROR = GDAL.GDAL_OF_VERBOSE_ERROR, # 0x40
OF_INTERNAL = GDAL.GDAL_OF_INTERNAL, # 0x80
# OF_DEFAULT_BLOCK_ACCESS = GDAL.GDAL_OF_DEFAULT_BLOCK_ACCESS, # 0
OF_ARRAY_BLOCK_ACCESS = GDAL.GDAL_OF_ARRAY_BLOCK_ACCESS, # 0x0100
OF_HASHSET_BLOCK_ACCESS = GDAL.GDAL_OF_HASHSET_BLOCK_ACCESS, # 0x0200
# OF_RESERVED_1 = GDAL.GDAL_OF_RESERVED_1, # 0x0300
OF_BLOCK_ACCESS_MASK = GDAL.GDAL_OF_BLOCK_ACCESS_MASK, # 0x0300
OF_READONLY = GDAL.GDAL_OF_READONLY, # 0x00
OF_UPDATE = GDAL.GDAL_OF_UPDATE, # 0x01
# OF_ALL = GDAL.GDAL_OF_ALL, # 0x00
OF_RASTER = GDAL.GDAL_OF_RASTER, # 0x02
OF_VECTOR = GDAL.GDAL_OF_VECTOR, # 0x04
OF_GNM = GDAL.GDAL_OF_GNM, # 0x08
OF_MULTIDIM_RASTER = GDAL.GDAL_OF_MULTIDIM_RASTER, # 0x10
OF_KIND_MASK = GDAL.GDAL_OF_KIND_MASK, # 0x1e
OF_SHARED = GDAL.GDAL_OF_SHARED, # 0x20
OF_VERBOSE_ERROR = GDAL.GDAL_OF_VERBOSE_ERROR, # 0x40
OF_INTERNAL = GDAL.GDAL_OF_INTERNAL, # 0x80
# OF_DEFAULT_BLOCK_ACCESS = GDAL.GDAL_OF_DEFAULT_BLOCK_ACCESS, # 0
OF_ARRAY_BLOCK_ACCESS = GDAL.GDAL_OF_ARRAY_BLOCK_ACCESS, # 0x0100
OF_HASHSET_BLOCK_ACCESS = GDAL.GDAL_OF_HASHSET_BLOCK_ACCESS, # 0x0200
# OF_RESERVED_1 = GDAL.GDAL_OF_RESERVED_1, # 0x0300
OF_BLOCK_ACCESS_MASK = GDAL.GDAL_OF_BLOCK_ACCESS_MASK, # 0x0300
# OF_FROM_GDALOPEN = GDAL.GDAL_OF_FROM_GDALOPEN, # 0x0400
)

@enum(
FieldValidation,
F_VAL_NULL = GDAL.OGR_F_VAL_NULL, # 0x0001
F_VAL_GEOM_TYPE = GDAL.OGR_F_VAL_GEOM_TYPE, # 0x0002
F_VAL_WIDTH = GDAL.OGR_F_VAL_WIDTH, # 0x0004
F_VAL_ALLOW_NULL_WHEN_DEFAULT = GDAL.OGR_F_VAL_ALLOW_NULL_WHEN_DEFAULT, # 0x0008
F_VAL_ALLOW_DIFFERENT_GEOM_DIM = GDAL.OGR_F_VAL_ALLOW_DIFFERENT_GEOM_DIM, # 0x0010
F_VAL_NULL = GDAL.OGR_F_VAL_NULL, # 0x0001
F_VAL_GEOM_TYPE = GDAL.OGR_F_VAL_GEOM_TYPE, # 0x0002
F_VAL_WIDTH = GDAL.OGR_F_VAL_WIDTH, # 0x0004
F_VAL_ALLOW_NULL_WHEN_DEFAULT = GDAL.OGR_F_VAL_ALLOW_NULL_WHEN_DEFAULT, # 0x0008
F_VAL_ALLOW_DIFFERENT_GEOM_DIM = GDAL.OGR_F_VAL_ALLOW_DIFFERENT_GEOM_DIM, # 0x0010
)
47 changes: 40 additions & 7 deletions src/context.jl
Original file line number Diff line number Diff line change
Expand Up @@ -182,37 +182,46 @@ function writegeomdefn(
end

for gdalfunc in (
:asclassicdataset,
:asmdarray,
:boundary,
:buffer,
:centroid,
:clone,
:convexhull,
:copy,
:create,
:createRAT,
:createattribute,
:createcolortable,
:createcoordtrans,
:copy,
:createdimension,
:createfeaturedefn,
:createfielddefn,
:creategeom,
:creategeomcollection,
:creategeomfieldcollection,
:creategeomdefn,
:creategeomfieldcollection,
:creategroup,
:createlayer,
:createlinearring,
:createlinestring,
:createmdarray,
:createmultidimensional,
:createmultilinestring,
:createmultipoint,
:createmultipolygon,
:createmultipolygon_noholes,
:createpoint,
:createpolygon,
:createRAT,
:createstylemanager,
:createstyletable,
:createstyletool,
:curvegeom,
:delaunaytriangulation,
:difference,
:extendeddatatypecreate,
:extendeddatatypecreatestring,
:forceto,
:fromGML,
:fromJSON,
Expand All @@ -226,40 +235,64 @@ for gdalfunc in (
:gdaltranslate,
:gdalvectortranslate,
:gdalwarp,
:getattribute,
:getattributes,
:getband,
:getcolortable,
:getcomponents,
:getdatatype,
:getdimensions,
:getfeature,
:getgeom,
:getgridded,
:getindex,
:getindexingvariable,
:getlayer,
:getmask,
:getmaskband,
:getoverview,
:getpart,
:getresampled,
:getrootgroup,
:getspatialref,
:gettype,
:getunscaled,
:getview,
:importCRS,
:intersection,
:importEPSG,
:importEPSGA,
:importESRI,
:importPROJ4,
:importURL,
:importUserInput,
:importWKT,
:importXML,
:importUserInput,
:importURL,
:intersection,
:lineargeom,
:newspatialref,
:nextfeature,
:open,
:opendimensionfromfullname,
:opengroup,
:opengroupfromfullname,
:openmdarray,
:openmdarrayfromfullname,
:openvectorlayer,
:pointalongline,
:pointonsurface,
:polygonfromedges,
:polygonize,
:read,
:readraster,
:resolvemdarray,
:sampleoverview,
:simplify,
:simplifypreservetopology,
:subsetdimensionfromselection,
:symdifference,
:transpose,
:union,
:update,
:readraster,
)
eval(quote
function $(gdalfunc)(f::Function, args...; kwargs...)
Expand Down
18 changes: 16 additions & 2 deletions src/dataset.jl
Original file line number Diff line number Diff line change
Expand Up @@ -1013,9 +1013,23 @@ function buildoverviews!(
return dataset
end

function destroy(dataset::AbstractDataset)::Nothing
GDAL.gdalclose(dataset)
# TODO: Wrap `GDAL.CPLErr`
function close(dataset::AbstractDataset)::GDAL.CPLErr
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of OGR, a dataset might have features (from e.g.

function createlayer(;
name::AbstractString = "",
dataset::AbstractDataset = create(getdriver("Memory")),
geom::OGRwkbGeometryType = wkbUnknown,
spatialref::AbstractSpatialRef = SpatialRef(),
options = StringList(C_NULL),
)::IFeatureLayer
return IFeatureLayer(
GDAL.gdaldatasetcreatelayer(dataset, name, spatialref, geom, options),
ownedby = dataset,
spatialref = spatialref,
)
end
). There we took the approach of tracking references using an attribute named .ownedby (so that Julia's GC wouldn't finalize the dataset while there are references to it) and have an implicit convention that users shouldn't be calling destroy themselves (and either rely on Julia's GC, or the do-block approach for managing context).

That said, if you feel strongly about it, I'm sympathetic to the argument for being able to "force close" a dataset that resulted in the .children attribute being introduced. To avoid https://gdal.org/api/python_gotchas.html#python-crashes-or-throws-an-exception-if-you-use-an-object-after-deleting-a-related-object, should we introduce a new type for datasets that tracks their children, and restrict this function to only accept those datasets?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here is this OSGeo/gdal#10490 . When writing a dataset, there is no reliable way to ensure it has been written since one has to wait until all GDAL objects pointing into the dataset have been garbage collected and finalized.

One way out would be not provide an interactive API, and to handle all lifetimes manually or via context managers. That's quite inconvenient in many cases.

Thus I decided to implement a close function that (by default) hard-closes a dataset when the dataset is writable. For this, the dataset needs to hold references that need to be released when the dataset is force-closed.

This has nothing to do with lifetime management. You can open datasets with a hard_close=false option, and no such tracking happens. However, it's then unclear when a dataset will actually be written to disk.

I now think that the dataset should hold weak references (not strong ones) to the objects that should be released. I'll update the code.

If you have a different suggestion for reliably closing or flushing a dataset I'd be happy to change things.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way out would be not provide an interactive API, and to handle all lifetimes manually or via context managers. That's quite inconvenient in many cases.

Yeah, it is inconvenient. I'd still recommend it for this PR though -- to keep the PR scoped to the introduction of the MDArray API and to introduce the hard close option in a separate PR.

For this, the dataset needs to hold references that need to be released when the dataset is force-closed.

Even if that were implemented correctly, it still doesn't address the resulting complexity of making all GDAL objects suspect of potentially corresponding to resources that might have already been released.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have to think about how to proceed without creating very unexpected behaviour.

There is already a function isnull that checks whether a reference is valid. References become invalid when they are released.

dataset.ptr == C_NULL && return GDAL.CE_Failure
if !isnothing(dataset.children)
for child in dataset.children
value = child.value
!isnothing(value) && destroy(value)
end
Base.empty!(dataset.children)
end
err = GDAL.gdalclose(dataset)
dataset.ptr = C_NULL
return err
end

function destroy(dataset::AbstractDataset)::Nothing
close(dataset)
Comment on lines +1031 to +1032
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I would prefer to not change the behavior of destroy() (since it's out for quite a while by now), and instead for it to be a new function being introduced e.g. force_close_mdarray_dataset!(...) (inside the src/mdarray/* subdirectory), and to have it document that it will close all of its children (possibly recursively too)?

We can keep it scoped to mdarray is until we have verifiable claims that it's working robustly for the other types of GDAL datasets (FeatureCollections, etc). Once there is a verifiably general implementation that's working and tested for all GDAL dataset types, we can give it a more general name.

return nothing
end

Expand Down
Loading
Loading