Use Case 1 : Max number of dimensions question #1

brianthomas · 2014-10-10T01:07:50Z

We need to determine a maximum, if any, in the number of allowed dimensions in a data cube.

migueldvb · 2014-10-16T15:11:16Z

I think that the maximum number of dimensions in HDF5 is 32, defined by H5S_MAX_RANK (edited this number in the use case).

brianthomas · 2014-10-16T15:16:18Z

"640K ought to be enough for anybody." -Bill Gates

I supply the quote to wonder if building limits into the data format is wise. What appears to be a good limit today may look inadequate in the future.

timj · 2014-10-16T15:40:24Z

A fixed number simplifies some coding in C and similar languages, and having it as a compile-time parameter makes it easy to change. Are there any datasets in astronomy that even come close to 32, though?

brianthomas · 2014-10-16T16:19:55Z

Radio guys are probably the ones pushing this (the number of dimensions needed) more than anyone else.

juandesant · 2014-10-16T16:25:47Z

For the SKA data products the number of elements per dimension will be very large, but the dimensions will typically be 2 spatial axes (on the order of 100 Mpixel), 1 polarization axis (4 values), and a frequency axis (with up to 256k channels). An optional RFI axis could be added, where one plane holds the actual measurement and the other holds the RFI map, along with other such maps, but I cannot envision anything beyond 32 for a data product. A velocity axis is typically a second representation of the frequency axis, as is a wavelength one.

For raw data I can imagine more dimensions, including baseline, but I find
it difficult to go beyond 32 dimensions.
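To put rough numbers on the data product described above, here is a minimal sketch of the arithmetic. The square 10,000 x 10,000 image, the reading of 256k as 256 * 1024, and the 4-byte sample size are all assumptions made for illustration, not SKA specifications.

```python
from math import prod

# Axis sizes taken from the comment above. The square image layout and
# 256k = 256 * 1024 are assumptions made for this illustration.
ska_cube_shape = {
    "spatial_x": 10_000,        # two spatial axes, ~100 Mpixel in total
    "spatial_y": 10_000,
    "polarization": 4,
    "frequency": 256 * 1024,
}
HDF5_MAX_RANK = 32      # value of H5S_MAX_RANK cited earlier in this thread
BYTES_PER_ELEMENT = 4   # assumption: 32-bit samples

rank = len(ska_cube_shape)
n_elements = prod(ska_cube_shape.values())
n_bytes = n_elements * BYTES_PER_ELEMENT

print(f"rank: {rank} (limit {HDF5_MAX_RANK})")
print(f"elements: {n_elements:.3e}")
print(f"size: {n_bytes / 1e12:.1f} TB")
```

Under these assumptions the cube has rank 4 and roughly 1e14 elements, so the pressure is on the size of each axis rather than on the number of axes.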

Juande Santander-Vela
System Engineer (Science Data Processor/Telescope Manager)
Square Kilometre Array/SKA Organisation
Jodrell Bank Observatory, Lower Withington
Macclesfield SK11 9DL, United Kingdom

migueldvb · 2014-10-16T16:54:18Z

It looks like the maximum number of dimensions is defined in the header file H5Spublic.h in HDF5. The standard library limits dataspace objects to a maximum rank of 32 (H5S_MAX_RANK), but it should be possible to raise this value, within whatever the system allows, and recompile the library if necessary. I think this is a good approach, and I agree that a larger value is unlikely to be needed.

embray · 2014-10-17T12:08:38Z

I believe it would be foolish to bake in any absolute upper limit, though it might make sense to define a minimum number of dimensions that software readers must be able to support somehow. Even for very large numbers of dimensions readers should probably at least be able to return slices along a subset of those dimensions--after all it's still just bytes.

I believe NumPy has a baked-in limit of 32 axes for ndarrays (NPY_MAXDIMS), but that can be changed at compile time if needed. So data with more than 32 dimensions may not be readable into a typical NumPy array, and software should be able to detect that.

I guess what I'm trying to say is, I don't feel like all data needs to be readable by all readers (at least in extreme cases) as long as it's clear where the limitations are, and that it's at least possible to find a way to read the data in those files in the preferred format.
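The idea that a rank-limited reader can still return slices ("it's still just bytes") can be sketched as follows. This is only an illustration: `plan_subarray_read` and its `max_rank` parameter are hypothetical names, and the sketch assumes the data is stored contiguously in C (row-major) order with no padding, which a real format would have to guarantee.

```python
from math import prod


def plan_subarray_read(shape, itemsize, leading_index, max_rank):
    """Locate the contiguous block left after fixing the leading axes.

    Hypothetical helper. Assumes C (row-major) order with no padding.
    Returns (byte_offset, byte_length, remaining_shape).
    """
    if len(leading_index) > len(shape):
        raise ValueError("more indices than axes")
    for i, n in zip(leading_index, shape):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for axis of size {n}")

    remaining_shape = tuple(shape[len(leading_index):])
    if len(remaining_shape) > max_rank:
        # The reader detects that it cannot hold this many axes at once.
        raise ValueError(
            f"sub-array rank {len(remaining_shape)} exceeds reader limit {max_rank}"
        )

    # Row-major position of the fixed leading indices, counted in blocks.
    block_number = 0
    for i, n in zip(leading_index, shape):
        block_number = block_number * n + i

    block_elements = prod(remaining_shape)  # prod(()) == 1 for a single element
    byte_offset = block_number * block_elements * itemsize
    byte_length = block_elements * itemsize
    return byte_offset, byte_length, remaining_shape


# A rank-4 array read by a reader that supports at most rank 2.
offset, length, rest = plan_subarray_read((2, 3, 4, 5), 4, (1, 2), max_rank=2)
print(offset, length, rest)
```

The reader only needs the shape and element size from the file's metadata to find the block, so a rank limit in a particular library restricts how much can be loaded at once, not whether the file can be read.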

brianthomas · 2014-10-17T16:29:52Z

Yes, I agree with Erik; this is what I was alluding to earlier. Further, I'd expect that various data models have different limits. I'm not sure what the minimum for all images in the format might be, but it's at least 3 axes.

juandesant · 2014-10-23T14:04:19Z

OK, so we should characterize some typical dimensions that can be orthogonal (spatial, polarization, frequency/wavelength, time, intensity), and perhaps hope for the referees to suggest a few more. We can then use the typical dimension limits in modern formats to show that there is a lot of headroom, and that more can be achieved.

brianthomas · 2014-10-23T15:25:03Z

@juandesant I think that would be a good starting point for justifying any derived requirement(s).

brianthomas self-assigned this Oct 10, 2014

brianthomas added this to the Collect Usecases milestone Oct 16, 2014

brianthomas changed the title ~~Use case 1 max number of dimensions~~ Use case 1 : Max number of dimensions question Oct 16, 2014

brianthomas changed the title ~~Use case 1 : Max number of dimensions question~~ Use Case 1 : Max number of dimensions question Oct 16, 2014
