
Implications of supporting scale-factor=1 for static endpoint #190

Open
jerstlouis opened this issue Aug 28, 2024 · 23 comments

Comments

@jerstlouis
Member

Currently, we require implementations to still support scale-factor=1, even if they do not support the Scaling requirement class, so clients have a general way of requesting the original resolution of the data without any resampling applied. This is to allow servers to automatically return downsampled data if a client asks for /coverage and the entire coverage is too large for a response without any subsetting or resampling, and so that the link from the collection to the coverage is not a 4xx error (which gives a bad impression to users following the link, and may negatively affect SEO etc.).

Is this requirement to support scale-factor=1 a significant problem for static cloud storage deployments, or can the parameter simply be ignored?
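For a client, the practical upshot can be sketched as follows. This is a minimal illustration, not normative behavior: the conformance URI string below is an assumption standing in for whatever the draft actually publishes.

```python
# Illustrative sketch of a client deciding whether to pin the original
# resolution. The conformance URI is a placeholder, not the normative one.
SCALING_CONF = "http://www.opengis.net/spec/ogcapi-coverages-1/1.0/conf/scaling"

def coverage_url(base_url: str, conforms_to: list[str],
                 want_full_resolution: bool = True) -> str:
    """Build a /coverage request URL.

    If the server advertises Scaling, adding scale-factor=1 guarantees
    either full-resolution data or an error -- never silent downsampling.
    A static server that does not advertise Scaling is expected to
    gracefully ignore (or simply never see) the parameter, so it is omitted.
    """
    url = f"{base_url}/coverage"
    if want_full_resolution and SCALING_CONF in conforms_to:
        url += "?scale-factor=1"
    return url
```

The point of the sketch: only servers declaring the Scaling class ever need to interpret the parameter, so a static deployment is unaffected by the default path.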

cc @m-mohr

@fmigneault

A neat way to address this would be to make scale-factor=1 the default value. For the case of static storage, where the parameter would be omitted entirely, it would behave the same way as requesting a dynamic endpoint omitting it as well.

@jerstlouis
Member Author

jerstlouis commented Aug 28, 2024

@fmigneault

A neat way to address this would be to make scale-factor=1 the default value.

scale-factor=1 is already the default value for cases where the server accepts to return the whole thing at full resolution.

But that does not address the issue of clients requesting the whole coverage when it is impossible to return, with the server responding with a 4xx error, which we want to avoid. Therefore, we have a permission to return a downsampled version rather than an error if the client does not explicitly state that it wants either the full resolution or an error.

Perhaps there could be a special exception to this for servers supporting e.g., COG and Accept-Ranges?

@m-mohr

m-mohr commented Aug 28, 2024

But that does not address the issue of clients requesting the whole coverage which is impossible to return

I feel like that's a non-issue. Requests should be made in an informed manner, otherwise returning an error is totally reasonable.
If clients provide links for users to click, they should probably add parameters that are sensible and supported.

@jerstlouis
Member Author

jerstlouis commented Aug 28, 2024

@m-mohr

The related issue is #54 and we came to a decision 3 years ago on this permission.

I disagree that it's a non-issue, as I feel it's important that the canonical [ogc-rel:coverage] link never returns an error by default, when specified without any parameter.

At least four of us were in strong agreement with this back when we made this decision.

We then had a second motion on this topic.

Getting back to my question which would help consider a proper solution, what happens with a typical static object storage if a client makes a request with a query parameter?
I imagine this could at least be configured to ignore the query parameters, if that is not the default behavior?

@fmigneault

@jerstlouis
Isn't there a contradiction here?

On one hand "scale-factor=1 is already the default value", but on the other, "A server advertising conformance for the scaling conformance class MAY return a downsampled version of the coverage, if the client does not explicitly request scale-factor=1".

So which is it: is scale-factor=1 the default or not?
It sounds more like a "do whatever you want", which is worrying for a standard definition...

I think #54 (comment) highlights the same consideration:

having endpoints that actually work without having to append parameters to them would be a good thing
[...]
servers that don't support this conformance class would presumably just ignore this parameter

So, scale-factor=1 would indeed be assumed the default by a static endpoint, since it cannot support it, will not indicate it supports the class, and will not include (or will ignore) it anyway.

If anything, it is the "Requirement: Even if the scaling conformance class is not supported, the server SHALL understand requests specifying scale-factor=1" that is just wrong. If the class is not supported, then its contained requirement does not apply... There is no reason for this requirement to even exist, since it applies only to cases supporting the class.

Instead, there should be a requirement that indicates that, if a server wants to use an alternate default than scale-factor=1, it must provide this information "somehow" in the collection description.

@jerstlouis
Member Author

So, scale-factor=1 would indeed be assumed the default by a static endpoint, since it cannot support it, will not indicate it supports the class, and will not include (or will ignore) it anyway.

By "support scale-factor=1", this particular current requirement means "gracefully ignore it" if it doesn't support the Scaling requirement class.

Instead, there should be a requirement that indicates that, if a server wants to use an alternate default than scale-factor=1, it must provide this information "somehow" in the collection description.

We could make that compromise, as there was a previous suggestion to do so. Then we had second thoughts about that.

If this really helps the static use case while preserving the permission to downsample if necessary to avoid returning an error, then I'm happy to re-introduce a property like "defaultCoverageMayBeDownsampled": true.

@fmigneault

Yes. I think that would at least help advertise what would happen, without having to download the data to figure out which scaling would be applied by default.

I think it would be better to have an actual value at the collection level rather than a somewhat meaningless boolean. Something like defaultScaleFactor: 2 if you want requests omitting scale-factor to scale by 2.

Omitting defaultScaleFactor in the collection metadata would default to defaultScaleFactor: 1, which in turn defaults to scale-factor=1 for requests at that collection.

@jerstlouis
Member Author

@fmigneault

I think it would be better to have an actual value at the collection level rather than a somewhat meaningless boolean. Something like defaultScaleFactor: 2 if you want requests omitting scale-factor to scale by 2.

That would not work, because the scale factor can depend on the subset requested. Requesting a small spatial extent would return the full resolution, while requesting the whole coverage could return a downsampled version.

@fmigneault

I see. Makes sense.
So practically, how is the alternate default scale triggered in current implementations?
A maximumMegabytes threshold, or something more complicated like max-dimensions/bands?

@jerstlouis
Member Author

@fmigneault Recommendation 1 mentions maxCells, maxWidth and maxHeight limits.
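As a sketch of how such limits might drive the downsampling permission: the limit names follow Recommendation 1 (maxCells, maxWidth, maxHeight), but the choice of the smallest factor that fits, and the square-root handling of the cell-count limit, are illustrative assumptions, not part of the draft.

```python
import math

def downsample_factor(width: int, height: int,
                      max_width: int, max_height: int,
                      max_cells: int) -> float:
    """Smallest scale factor >= 1 that brings the requested grid within
    the advertised limits. Assumption: a factor f reduces each axis by f,
    so the cell count shrinks by f**2, hence the square root.
    The exact rounding/selection policy is an implementation choice."""
    return max(
        1.0,
        width / max_width,
        height / max_height,
        math.sqrt((width * height) / max_cells),
    )
```

For example, a request spanning 8192 x 4096 cells against maxWidth = maxHeight = 4096 and a generous maxCells would yield a factor of 2, i.e. the server could return the coverage downsampled to half resolution instead of a 4xx error.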

@jerstlouis
Member Author

jerstlouis commented Sep 25, 2024

We discussed the possibility to add a response header, which could also be returned as part of a HEAD request.
e.g., Content-Scale-Factor or Content-Resolution.

In general, it seems that the scale-factor=1 approach to enforce a particular scale is not a major issue, and should mostly just be ignored by static servers (which is the expected behavior).

@m-mohr

m-mohr commented Oct 24, 2024

So we are running into related issues as well for two different APIs. It is not quite clear to us right now what is allowed in Coverages and what isn't. Is it allowed to return a downsampled version (i.e. non-native resolution) if I submit a Coverages request without any parameters / all parameters being default?

I think default values should work such that you can usually return a response, which means scale-factor should never ever have the default value 1, which would in many cases overload servers and flood users with huge amounts of data that they did not explicitly request. If users diverge from the defaults, they hopefully know what they are doing, so returning larger amounts of data is fine.

It may also be that the downsampled version can not be described upfront. Servers may choose limits depending on their abilities to load and process different data, the provided parameters, etc. There may not always be specific values for that.

I'd propose that for all default values, the server chooses what is best to return. This leads to different results across implementations of course. If you want predictable results you need to specify specific values for the parameters. In this case you may return errors whenever the users requested too much information.

cc @jankovicgd

@jerstlouis
Member Author

Is it allowed to return a downsampled version (i.e. non-native resolution) if I submit a Coverages request without any parameters / all parameters being default?

Yes, that's what the downsampling permission is about.

which means scale-factor should never ever have the default value 1,

In implementations that do not support downsampling, then scale-factor=1 is the only possible default.

Servers may choose limits depending on their abilities to load and process different data, the provided parameters, etc.

Exactly, this is why the server limits are expressed in terms of total number of cells, width, and height (Recommendation 1 about limits).

In this case you may return errors whenever the users requested too much information.

That's exactly what the draft says! :)

@m-mohr

m-mohr commented Oct 24, 2024

Thanks.

Yes, that's what the downsampling permission is about.

Strictly speaking that should be in the Core conformance class (additionally?). Just having that in the scaling conformance class means I can't downscale in core?!

In implementations that do not support downsampling, then scale-factor=1 is the only possible default.

Default values should not depend on other criteria, at least from a specification point of view. It seems the default value for scale-factor must be undefined, i.e. the server decides. Which leads to the downsampling permission, which should be in Core.

@jerstlouis
Member Author

I can't downscale in core?!

Correct. If the server supports downscaling, then it should implement the Scaling requirements class.

default value for scale-factor must be undefined,

Yes, the default value is "none specified" / "varying" / "undefined".

the downsampling permission, which should be in Core.

Why would the server return a downsampled version when it does not support downsampling?

Are you thinking of a use case where there are only a limited set of pre-defined scales, and the server can either return the low-resolution or higher-resolution version?

Coverages doesn't handle that "in-between" use case.

Either the server does not support downsampling, in which case it can return an error if the user asks for too big of an area or the whole thing, or it supports downsampling and is happy to downsample to a particular requested resolution resulting in not too much data output.

Potentially, if it supports subsetting but not downsampling, and the client didn't request a specific subset, the server could be allowed in that case to return a subset instead of a downsampled version to avoid returning an exception -- not sure this is currently allowed in Coverages, but that is allowed in OGC API - Maps (the default for the subset/bbox is undefined) -- see https://docs.ogc.org/is/20-058/20-058.html#_56e245b6-53bf-4996-b579-062598191edd Tables 8 & 9.

@m-mohr

m-mohr commented Oct 24, 2024

Correct. If the server supports downscaling, then it should implement the Scaling requirements class.

No! Servers may downscale by default but still can't support the full possibilities of the Scaling requirement class.
I want to downscale by default in Core regardless of other Conformance classes. I heard that from @jankovicgd, too.

Yes, the default value is "none specified" / "varying" / "undefined".

Okay, so 7.2.4 Req. 4J is to be understood as follows:

  • scale-factor by default is undefined; servers return a scale-factor of their choice (i.e. may downscale, so the permission above should be moved to Core).
  • scale-factor=1 is required to be accepted, too
  • All other values for the parameter lead to an error

Why would the server return a downsampled version when it does not support downsampling?

I support downsampling but may not want to support Scaling, or not all options that Scaling requires may be possible to implement. As such I can't advertise it as supported. I requested splitting it into easier/smaller chunks, but you also don't want that?! So I'm out of options. See #194

Are you thinking of a use case where there are only a limited set of pre-defined scales, and the server can either return the low-resolution or higher-resolution version?

No, but that would be another possibility, indeed.

@jerstlouis
Member Author

jerstlouis commented Oct 24, 2024

Okay, so 7.2.4 Req. 4J is to be understood as follows:

At the moment, if Scaling is NOT supported, scale-factor is understood to be 1.
I guess we could possibly promote the permission to Core without any significant breakage, considering that clients can still enforce scale-factor=1 with the parameter as they do for a Scaling implementation, and that could give a bit more meaning to that otherwise odd parameter that does not do anything for non-Scaling implementations. I don't know what the opinion of others in the SWG would be on this -- some were strongly opposed to the permission from the start.

not all options that Scaling requires are possible to be implemented

We clarified some of this in #187. There should not be anything impossible to implement.

I feel like the solution is to clarify this, perhaps providing guidance how these different options map to the same thing?

There should be a relatively simple way to map all required parameters to existing capabilities, if the backend does support some kind of downsampling for all dimensions. Could we discuss your use case in more detail? (in #187 if it is related to the temporal dimension).

See #194

I am much less opposed to splitting temporal vs. spatial chunks (whether subsetting or scaling) than convenience vs. a more general way to do the same thing, because from an implementation perspective implementing a convenience parameter is simply parsing a parameter and re-routing it to the same underlying functionality -- not a big ask for server developers, which gives more freedom to clients and direct-in-the-browser API users, while maintaining interoperability.

@m-mohr

m-mohr commented Oct 24, 2024

At the moment, if Scaling is NOT supported, scale-factor is understood to be 1.

That makes no sense to me, so I'd appreciate discussing this:

I guess we could possibly promote the permission to Core without any significant breakage, considering that clients can still enforce scale-factor=1 with the parameter as they do for a Scaling implementation, and that could give a bit more meaning to that otherwise odd parameter that does not do anything for non-Scaling implementations. I don't know what the opinion of others in the SWG would be on this -- some were strongly opposed to the permission from the start.

not all options that Scaling requires are possible to be implemented

We clarified some of this in #187. There should not be anything impossible to implement.

That's one of the issues, but doesn't cover all potential issues.

I feel like the solution is to clarify this, perhaps providing guidance how these different options map to the same thing?

That would probably help, not sure it solves all blockers though.

There should be a relatively simple way to map all required parameters to existing capabilities, if the backend does support some kind of downsampling for all dimensions. Could we discuss your use case in more detail?

Just trying to implement spatial scaling on top of GEE right now. Not sure about @jankovicgd's use case.

So I can set width + height + bbox or a scale (in CRS resolution). I don't always know what the native resolution is, and I don't know the number of sample values, I think. How can I map all the parameters that scaling provides?

  • width + height has a clear mapping
  • scale-size I understand as an alias for width + height (why two ways again? - more a rhetorical question)
  • scale-axes not sure
  • scale-factor not sure

I guess the issue is that scale-axes and scale-factor are both ratios, but there's no way to just set a specific resolution value (e.g. 20m)?!
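A backend that only accepts width + height could map the ratio-style parameters roughly as follows. This is a sketch under two stated assumptions: that a scale factor of 2 means half the resolution (as the defaultScaleFactor discussion above implies), and that the native grid size is known; the function names, a hypothetical resolution convenience, and the rounding policy are all illustrative, not from the draft.

```python
def scaled_size(native_width: int, native_height: int,
                scale_factor: float) -> tuple[int, int]:
    """Map a WCS-style scale-factor onto the width/height pair a backend
    such as GEE expects. Assumes factor > 1 downsamples, so the output
    grid shrinks. Rounding policy is an implementation choice."""
    return (max(1, round(native_width / scale_factor)),
            max(1, round(native_height / scale_factor)))

def target_resolution(native_resolution: float, scale_factor: float) -> float:
    """Hypothetical 'resolution' convenience: a scale-factor of 2 applied
    to 10 m data is equivalent to asking for 20 m cells."""
    return native_resolution * scale_factor
```

Under these assumptions, scale-axes is the same mapping applied per axis, and scale-size is the width/height pair directly, so all four parameters funnel into one resize call.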

Additionally, I can only work in a meaningful way on the spatial dimensions as discussed in #187.

I am much less opposed to splitting temporal vs. spatial chunks (whether subsetting or scaling) than convenience vs. a more general way to do the same thing, because from an implementation perspective implementing a convenience parameter is simply parsing a parameter and re-routing it to the same underlying functionality -- not a big ask for server developers, which gives more freedom to clients and direct-in-the-browser API users, while maintaining interoperability.

That's not how it is defined. If I have bbox + datetime implemented, I can not just redirect subset to those parameters and have a compliant server. Subset allows much more, so it's not a simple redirect.

(Sorry, we are drifting into off-topic here, I think.)

@jerstlouis
Member Author

jerstlouis commented Oct 24, 2024

not sure it solves all blockers though.

If there are any blockers left, that could justify moving the permission to Core, but I think we should first address the possibility to implement Scaling.

there's no way to just set a specific resolution value

We did discuss the possibility to add a resolution parameter as an additional convenience parameter, but that would not remove / address the requirement to support scale-factor / scale-axes.

Again all these scale-* parameters come directly from WCS and are there to maintain a sense of continuity, while the other parameters are for convenience / familiarity / consistency with the other OGC APIs.

I don't always know what the native resolution is and I don't know the number of sample values, I think. How can I map all parameters that scaling provides?

That certainly would be a problem, but that's already a problem in terms of describing the coverage in the spatial extent of the collection description. A regularly gridded coverage has a resolution (Requirement 27 in the OGC Abstract Specification Topic 6 - Part 1).

Isn't there something like https://developers.google.com/earth-engine/apidocs/ee-projection-nominalscale that can always be queried?

In cases where you really cannot know, I guess you would have to assume a detailed value that's likely to be large enough, and you would report that as the resolution in the collection description.

If I have bbox + datetime implemented, I can not just redirect subset to those parameters and have a compliant server. Subset allows much more, so it's not a simple redirect.

It would normally be the other way around - the redirect should be from bbox & datetime to the more general subset.

But if the collection only has spatial and temporal dimension, the only extra thing that subset supports I think is slicing (reducing dimensionality) instead of trimming (preserving dimensionality). datetime already supports both trimming (interval syntax) and slicing (single time instant), so it should still be possible to re-route to your underlying implementation assuming it already supports spatial slicing. In the worst case the slicing subset could turn into a point or line bbox, and the lack of dimensionality reduction would only apply to nD output formats like netCDF (GeoTIFF is always 2D).
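A minimal sketch of such a re-route, covering only the trim/slice distinction described above. The real subset grammar (open-ended ranges with `*`, quoted temporal values that themselves contain colons) is richer than this; the parser below is a deliberately simplified illustration for unquoted values only.

```python
def parse_subset(value: str) -> tuple[str, tuple]:
    """Parse one axis of a subset parameter, e.g. 'Lat(10:20)' (trim)
    or 'Lat(15)' (slice). Simplified sketch: does not handle quoted
    temporal values (whose colons would defeat the ':' test) or
    open-ended ranges like 'Lat(10:*)'."""
    axis, _, rest = value.partition("(")
    body = rest.rstrip(")")
    if ":" in body:
        lo, hi = body.split(":", 1)
        return axis, ("trim", lo, hi)   # maps onto a bbox interval
    return axis, ("slice", body)        # degenerate (point/line) bbox
```

In the re-routing scheme discussed here, a trim feeds two bbox bounds, while a slice becomes a zero-width interval on that axis, with dimensionality reduction applied only for nD output formats.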

Additionally, I can only work in a meaningful way on the spatial dimensions as discussed in #187.

We clarified how things should be done for the temporal dimension on irregular grids -- I need to update the requirements to reflect that. If you can support something like resolution=time(P1M) (for monthly), that would be a good start.

@m-mohr

m-mohr commented Oct 24, 2024

If there are any blocker left, that could justify moving the permission to Core, but I think first we should first address the possibility to implement Scaling.

For me this request has nothing to do with my specific implementation details. I think this should just generally be the case from a UX perspective. Default requests should always return something reasonable, and usually not overload servers or users. This usually means some kind of downscaling by default (or often just using a different level of the pyramids without actual downscaling).

We did discuss the possibility to add a resolution parameter as an additional convenience parameter, but that would not remove / address the requirement to support scale-factor / scale-axes.

Yeah, I said before I don't like the 100 ways of doing things and requiring 100 things to implement a conformance class.
KISS for conformance classes, I think that's the way to success.

Again all these scale-* parameters come directly from WCS and are there to maintain a sense of continuity, while the other parameters are for convenience / familiarity / consistency with the other OGC APIs.

Not asking to remove anything, just let people pick. But sometimes it's also a question of refreshing things and not keeping all the old legacy stuff around forever, which makes things clunky the longer things evolve. Are all these parameters really needed? Can't clients just do the computation? Yeah, I'm a client fan, and OGC historically has a heavy focus on servers (which I think is pretty bad and must change, but just my 2ct.)

That certainly would be a problem, but that's already a problem in terms of describing the coverage in the spatial extent of the collection description. A regularly gridded coverage has a resolution (Requirement 27 in the OGC Abstract Specification Topic 6 - Part 1).

I might be able to get the resolution, but not the number of cells. I also noticed that that's an issue for the UMD and thought I had opened an issue. Maybe I forgot about it.
But back to resolution -- for S2: which band? 10m? 20m? 60m? Or do I need to split that also into separate collections, like for the UTM zones? User friendliness: bye-bye -- 100s of collections just for S2!

It would normally be the other way around - the redirect should be from bbox & datetime to the more general subset.

Well, I have an unusual case then. But as OGC always tries to cater for everyone: here we go...

But if the collection only has spatial and temporal dimension, the only extra thing that subset supports I think is slicing (reducing dimensionality) instead of trimming (preserving dimensionality).

I was thinking more about the other dimensions, if any...

@jerstlouis
Member Author

had opened an issue. Maybe forgot about it.
But back to resolution - for S2: Which band? 10m? 20m? 60m? Or do I need to split that also into separate collections like for the UTM zones? User friendlyness: byebye - 100s of Collections just for S2!

The most detailed one (10m). See also #142 where I proposed a new x-ogc- field in the schema to allow specifying a different resolution per band.

I might be able to get the resolution, but not the number of cells.

The spatial extent divided by the resolution should give the number of cells (+1 cell for regular point coverage).
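That arithmetic, sketched below. The Sentinel-2 figures in the example (a 109,800 m tile width at 10 m resolution, giving 10,980 cells) are used for illustration since S2 came up above.

```python
def cell_count(extent_min: float, extent_max: float,
               resolution: float, point_coverage: bool = False) -> int:
    """Number of cells along one axis: extent divided by resolution,
    plus one sample for a regular point coverage (where samples sit on
    grid nodes rather than at cell centers)."""
    n = round((extent_max - extent_min) / resolution)
    return n + 1 if point_coverage else n
```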

I was thinking more about the other dimensions, if any...

Is that a difficulty with applying the underlying data store to dimensions other than space and time (even though additional dimensions are supported)?

@jerstlouis
Member Author

SWG 2024-11-27: Here are the actions that we decide to take to address this issue:

  • The requirement to gracefully ignore scale-factor=1 for servers that do not support scaling will remain, which hopefully can be ignored for static servers. Clients which prefer a potential error over retrieving downsampled data can therefore always include scale-factor=1 in the request. If the client checks /conformance and it does not include scaling, then it does not need to include this.
  • The use case of supporting scale-factor=1 plus only one other resolution is quite particular, but could be partially addressed by supporting the COG req. class at /coverage, allowing clients to use range requests to retrieve overviews at predefined scales. Otherwise this is out of scope for this API, which handles either fully static (no scaling) or fully dynamic (arbitrary scaling specified by the client).
  • In Split subset and datetime/bbox #194 we agreed to split the subsetting requirement class, but not along the convenience / general line -- rather along the spatial / temporal / other dimensions
  • The GEE-related questions seem more a case of gathering the necessary information from the underlying backing store. We will however try to provide guidance on how to implement the different required parameters to specify scaling, and we also agreed to add resolution as a new way for clients to request downsampled data (see Clarify Scaling for time dimension (or exclude non-spatial dimensions?) #187)

@jerstlouis jerstlouis moved this from Next Discussions to Agreed; to be applied in OGC API - Coverages - Part 1: Core Nov 27, 2024
@joanma747

I'd suggest adding a permission in Core saying something like: servers that do not implement the Scaling class are allowed to declare a scale-factor parameter with the only possible value of 1, to allow clients to request scale-factor=1 to be sure that the server is not rescaling the data.

This permission is compatible with OWS Common and the need to return 4xx for any parameter that is not declared in the API definition.
