Fix: 1381 change to the axes attribute meaning #1396
Also add to
NIAC Jul 2024 comments
@rayosborn As for your comment about stating that "AXISNAME_indices are not allowed to contradict the position of AXISNAME in @axes". So this is the canonical example
Then this should be invalid
But then is this invalid as well?
From the position of the string "x" in @axes we get
Edit: maybe it should say that "if @AXISNAME_indices is provided, then the position index of "AXISNAME" in @axes must be one of the indices"?
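To make the proposed wording concrete, here is a minimal sketch of the check it implies. It models @axes as a plain Python list and the @AXISNAME_indices attributes as a dict; the function name and this data model are hypothetical and not part of any NeXus library.

```python
def indices_consistent(axes, axis_indices):
    """Check the proposed rule for every AXISNAME listed in @axes.

    axes: axis names as stored in @axes, e.g. ["x", "y"].
    axis_indices: dict mapping AXISNAME to the value of @AXISNAME_indices,
        either a single integer or a list of integers (hypothetical model).
    """
    for position, name in enumerate(axes):
        if name == ".":
            # "." marks a signal dimension without a default axis
            continue
        indices = axis_indices.get(name)
        if indices is None:
            # no @AXISNAME_indices provided, so there is nothing to contradict
            continue
        if isinstance(indices, int):
            indices = [indices]
        if position not in indices:
            return False
    return True
```

Under this reading, axes ["x", "y"] with x_indices [0, 1] passes because position 0 is one of the indices, while x_indices [1] fails.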
@woutdenolf requested some evidence that NIAC had previously allowed the dimension size of an axis array to be one greater than the associated data dimension. It turns out that this was approved at the very first NIAC meeting in Pasadena in September, 2003. According to the minutes
If you look at the current PDF version of the manual, you will find examples of axes implementing this rule on page 16/17. An example NeXus file has the following two entries:
Finally, it is also implied in the current NXdetector definition. Here are two of the field definitions:
The |
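The "one greater" rule can be expressed as a small check. This is only a sketch under the assumption that an axis either gives one value per data point or one value per bin boundary; the function name and the returned labels are made up for illustration.

```python
def classify_axis_length(axis_len, dim_len):
    """Classify an axis array against the signal dimension it spans.

    Returns "points" when the axis has one value per data point,
    "edges" when it has one extra value (histogram bin boundaries),
    and "invalid" otherwise. The labels are illustrative only.
    """
    if axis_len == dim_len:
        return "points"
    if axis_len == dim_len + 1:
        return "edges"
    return "invalid"
```

For a signal dimension of length 100, an axis of length 100 is read as point positions and an axis of length 101 as bin boundaries.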
Refactor again to take NIAC comments into account:
The structure of the NXdata doc section at the top is now
One example per section is enough. The example in the axes section covers all axes use cases supported by NXdata (histogram edges +1, alternative axes, multi-dimensional axes, "." in axes, missing indices). I made this example very concrete by referring to a specific scientific case and included an image. Edit: I keep discovering new things about the axes. I added this statement which did not seem to be in the current definition but I think is implied?
ff696cd
to
edba426
Compare
b6cf257
to
76979cd
Compare
Sphinx-Gallery is meant to generate a gallery of plots from Python code (see examples). That is very different from code examples with different implementations, which is what sphinx-tabs or sphinxcontrib-osexample does well.
Concerning histograms (axis has one value more than the signal): we do not define whether the signal contains the bin height or the bin area. I'm assuming it is the area. Should we add this to "The fields and attributes have the following constraints ..."?
Edit: often histograms contain photon counts integrated over the bin width. For example in powder diffraction the bin value unit is photons, not photons per degree (assuming the X axis is in degrees). So it is the integrated bin area.
Edit: however when plotting a histogram, the value is usually the height. I'm confused now.
@woutdenolf, you have opened up another can of worms here. What is stored could well depend on the conventions of the community writing the data, and we have probably been lax in NeXus in providing a mechanism to define which convention is being used.
In time-of-flight neutron scattering (at least when I was writing code), the convention was to divide the counts by the bin width when storing the data, so that we were effectively plotting in units of 'neutrons/microsecond', i.e., bin heights. When we wanted to convert to other units, e.g., to energy, we simply computed the energy values at all the bin boundaries and then multiplied by (dt/dE), a numerical Jacobian, to switch to units of 'neutrons/meV'. Because I come from that background, NeXpy just assumes the values are bin heights when plotting.
However, another community might well choose to store the original counts, i.e., bin integrals, and there is no way of telling with the current NeXus standard. We probably need to define an attribute that says which convention is being used, with one of them being the default. Of course, when the bin widths are all constant, as they nearly always are, it doesn't really matter, but we nevertheless shouldn't assume that.
I will contact some friendly Mantid developers to make sure that 'neutrons/microsecond' is still the convention at neutron sources, and perhaps you could check with your powder diffraction colleagues to see if anyone is storing bin integrals.
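The two conventions and the numerical Jacobian can be sketched in a few lines of plain Python. The function names are hypothetical, and the unit change in the usage note is a simple linear rescaling chosen for illustration, not the actual time-of-flight to energy conversion.

```python
def counts_to_heights(counts, edges):
    """Convert bin integrals (raw counts) to bin heights (counts per unit axis)."""
    if len(edges) != len(counts) + 1:
        raise ValueError("edges must have one more value than counts")
    return [c / (hi - lo) for c, lo, hi in zip(counts, edges, edges[1:])]


def convert_heights(heights, old_edges, new_edges):
    """Re-express bin heights on a new axis using a numerical Jacobian.

    new_edges holds the old bin boundaries converted to the new unit.
    Each height is multiplied by |d(old)/d(new)| computed per bin, so the
    integral over every bin is preserved. If the conversion reverses the
    axis direction, the caller still has to reorder the bins.
    """
    jacobians = [
        abs((o_hi - o_lo) / (n_hi - n_lo))
        for o_lo, o_hi, n_lo, n_hi in zip(
            old_edges, old_edges[1:], new_edges, new_edges[1:]
        )
    ]
    return [h * j for h, j in zip(heights, jacobians)]
```

With counts [10, 20] on edges [0, 1, 3], the heights are [10.0, 10.0] even though the counts differ, which is why the two conventions only coincide for constant bin widths. Converting to a new axis with edges [0, 2, 6] gives heights [5.0, 5.0], and the total 5 × 2 + 5 × 4 = 30 still equals the original counts.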
76979cd
to
6fa27e0
Compare
@PeterC-DLS has mentioned this PR in an issue I raised. I think this is all good, pictorial examples will be very useful, but there are some things that need to be explicit (some are raised in the issue and some raised by the current solution):
Raised by above discussion in this PR:
I think there's inspiration to be taken from
There's also https://root.cern.ch/doc/master/classTH1.html (a very complete histogram class for ROOT in C++) and it might be interesting that they have separate plot types for histograms and graphs (point-like data), which might help (or provide motivation for) divergence of NXdata trying to add too much information and essentially start plotting data (which, as far as I am aware, is explicitly not the aim of the class).
Anecdotally, it's been very difficult to get started with NXdata, so I'm encouraged by this discussion :)
Whilst we are talking about AXISNAME, it would be very useful to add some clarity to
since it seems that |
@ggoneiESS, thanks for raising the issue. I think this should be the subject of a separate PR. @woutdenolf has made significant progress in improving the NXdata documentation but has, in the process, uncovered some limitations in the current standard including, coincidentally, the one you raised in #1527. I think the discussion will get too confusing if we mix documentation fixes with substantive changes to the standard, although they are obviously linked. In my view, it would be best to reopen #1527 to complete the discussion with a view to preparing a separate PR. It shouldn't take too much work to update the documentation if we resolve your issue.
@woutdenolf, for the purposes of this PR, do you think it's sufficient just to state that histograms are plotted using the bin heights for now? Perhaps #1527 can be mentioned in the documentation as an ongoing discussion, but I would prefer that this PR is not further delayed. The vast majority of histogram plotting is currently in the context of time-of-flight neutron scattering, so I think this represents the current practice, even if it could be modified in the future.
I reopened my issue, but I'll hold off submitting the PR until this issue is implemented. I still wonder whether the extra information about the bin plotting (leading/trailing edge or centre) and the errors info should form part of this, or whether it should all be pushed into my new PR (which, having compared the two, will also change the standard slightly rather than just the documentation).
f230f04
to
25f39a5
Compare
typo
Co-authored-by: Peter Chang
Summary of what this MR provides:
HTML RENDERING of the NXdata after this MR. Especially expand the "description".
Hi all, as discussed in the Telco, we can now move this PR to an online vote. NIAC committee members please vote on this PR using emojis on this comment. 👍 for yes, 👎 for no, anything else (for example 👀) to abstain. We need 14 votes to hit quorum so please review and vote!
A technical detail: we see orphan: in the HTML rendering. @tacaswell Do you have a workaround since sphinx-gallery/sphinx-gallery#854 is not addressed?
Voters: consider the two open discussions
#1396 (comment) (can be handled separately in #1544)
#1396 (comment) (potential extra condition to allow axis_indices to be omitted)
NXdata: the gift that keeps on giving ;-).
Closes #1381
HTML RENDERING OF NXDATA FIX <-> current nxdata for comparison
When reviewing this PR, please keep in mind that the purpose is to rectify NXdata, not improve it. Suggestions for improvements can go in #1381.
For reference: multi-dimensional axes were introduced here https://www.nexusformat.org/2014_axes_and_uncertainties.html
Context
The NXdata definition got a makeover in PR #1213 to make it more understandable. In this effort, I assumed the @axes attribute was supposed to say what all the axes of the NXdata group are. It didn't occur to me that this attribute is not about the axes, it is about the signal: it defines what the default axis is for each signal dimension. The unintended change does not make existing files invalid, but it does introduce more flexibility, which changes things for existing readers, as pointed out in issue #1381 by @jacobfilik.
Purpose of this PR
The sole purpose of this PR is to rectify PR #1213 and NOT introduce anything new. The alternative PR #1392 by @PeterC-DLS fixes the situation by carefully modifying some sentences here and there. I would argue, however, that the entire "axes" section in the introduction has been structured with the less restrictive @axes in mind. In this PR I refactor the entire "axes" section to better reflect the restrictive @axes.
Details on @axes rectification
This is the current less restrictive @axes attribute definition which I'm rectifying
This definition is much simpler and more concise than the definition in the "axes" section in this PR. However, it removes the restriction that length(axes) == rank(signal), so we need to put it back in, with @axes being the default axes, not all axes.
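A reader relying on the restrictive @axes could apply the rule roughly as follows. This is a sketch only: the function name is hypothetical, and @axes is modelled as a plain Python list rather than read from an HDF5 file.

```python
def default_axes(axes, signal_rank):
    """Map each signal dimension to its default axis name.

    Enforces length(axes) == rank(signal). A "." entry means the
    dimension has no default axis and is mapped to None.
    """
    if len(axes) != signal_rank:
        raise ValueError(
            f"@axes has {len(axes)} entries but the signal has rank {signal_rank}"
        )
    return {dim: (None if name == "." else name) for dim, name in enumerate(axes)}
```

For a rank-3 signal with axes ["x", ".", "z"], dimension 0 defaults to "x", dimension 1 has no default axis, and dimension 2 defaults to "z". Any other axis fields in the group would be alternatives, not defaults.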