Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggested improvements to the NXdata base class definition #602

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

rayosborn
Copy link
Contributor

As suggested by @prjemian, I have created a PR that contains a number of my suggested changes to the NXdata definitions. I have created a number of smaller commits along with longer descriptions of my rationale for proposing each one. If they are agreed to, it may be necessary to make corresponding changes in the manual.

Data sets can mean a multitude of things in different contexts. For example, in many data catalogs, data sets refer to collections of data objects. Even within this file, some might construe 'dataset' to refer to the plottable data. However, there are many uses of the word 'field', which is less ambiguous, so I would propose that this is used wherever we are referring to an item in the group.
The use of 'VARIABLE' was presumably inspired by the term 'independent variable', but this is scarcely used in favor of 'dimension scale'. Since we are always referring to an axis, I suggest changing 'VARIABLE' to 'AXIS'.  Since both refer to an arbitrary name, the use of 'AXISNAME' is redundant. I also made a couple of references upper case for consistency.
The 'x', 'y', and 'z fields seem to be specific names for the 'AXIS' fields. I'm not sure why these should be included and not a hundred other possible fields.
I object to the phrase "file readers are encouraged to make their best efforts to plot the data" since it implies that the axis definitions become ambiguous. Since I have implemented a generic plotter that has never had a problem in identifying axes unambiguously, I know this is not true. I would suggest that the whole text is unnecessary, and will just confuse any outsiders not familiar with the controversy. All that is needed is a statement that the attributes should be provided, assuming that is still the NIAC recommendation.
It is my understanding that we agreed that the requirement for at least one NXdata group was not mandatory, but strongly encouraged.
The current version still assumes that the signal array has an associated array called 'errors', rather than 'DATA_errors' or by using the 'uncertainties' attributes. The changes add these other options, and define 'DATA_errors' - at the moment, only 'AXIS_errors' is defined.
The dimension scale could have a size of one greater than the corresponding signal array if the values are bin boundaries of a data histogram.
The remaining changes are my attempt at improving the understandability of some of the descriptions. I don't think they change the meaning substantively, although I shorten some of the deprecation warnings to make them, I believe, less confusing to the newcomer.
Client is responsible for defining the dimensions of the data.
The name of this field may be changed to fit the circumstances
but is matched with the *DATA*
field with ``_errors`` appended.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we deprecate the DATA_errors field?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2018-01-16 telco summary was Yes, but indicate in the documentation that the name for an uncertainty might be DATA_errors

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the wording. This is the only amendment the telco would need to discuss (apart from the original submission, of course).

@prjemian
Copy link
Contributor

Thanks for the contributions. Once approved, we should "squash-merge" (see the pull-down to the right of the Merge pull request button.

@prjemian prjemian added the telco label Nov 28, 2017
I have changed it to show that naming the field this way is no longer the recommended way of identifying the uncertainties, but is still a valid naming pattern.
@prjemian
Copy link
Contributor

@rayosborn : travis-ci fails due to indentation on this line

Also, change ..note:: to .. note:: (insert a space as shown)

Suggest you indent the two lines following by 4 spaces as well

@prjemian prjemian added this to the NXDL 2018.3 milestone Jan 30, 2018

It is recommended (as of NIAC2014) to use this attribute
rather than adding a signal attribute to the dataset.
rather than adding a signal attribute to the field.
See http://wiki.nexusformat.org/2014_How_to_find_default_data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that link goes nowhere

array named ``time`` while the other two dimensions have no fields
to be used as dimension scales. If the dimension scale does not
exist, it is assumed the data will be plotted against the
corresponding array index.

See examples provided on the NeXus wiki:
http://wiki.nexusformat.org/2014_axes_and_uncertainties
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link dead

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prjemian
Copy link
Contributor

Given the suggestion that this be split into two edits, how should this PR be handled? Are there issues created?

@prjemian
Copy link
Contributor

@rayosborn This PR languishes. @zjttoefs made a suggestion. How do wish to proceed? Do you want it assigned to a milestone?

@prjemian
Copy link
Contributor

This issue needs to be rebased.

@prjemian prjemian added this to the NXDL 2020.10 milestone Oct 19, 2020
Copy link
Contributor

@prjemian prjemian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use field instead of dataset. Also see #863.

@prjemian
Copy link
Contributor

prjemian commented Nov 3, 2020

Should this PR be moved to the next milestone?

@prjemian
Copy link
Contributor

prjemian commented Nov 6, 2020

Moving to next milestone.

@prjemian prjemian modified the milestones: NXDL 2020.10, NXDL 2021.10 Nov 6, 2020
Base automatically changed from master to main February 16, 2021 04:18
@benajamin
Copy link
Contributor

@rayosborn Would you please make the requested changes and perhaps provide a summary of the changes in nexusformat/NIAC#48 so that the NIAC can consider approval?

@prjemian prjemian modified the milestones: NXDL 2022.06, NXDL 2023.06 Jun 24, 2022
@prjemian prjemian self-requested a review September 16, 2022 15:45
@prjemian
Copy link
Contributor

In commit 81634f1, I resolved merge conflicts.

@prjemian
Copy link
Contributor

The change from AXISNAME, I believe is a good one. There are other places in the NXDL files and manual which also sue AXISNAME. This same change should be applied to keep terms consistent.

(bluesky_2022_3) zorinvm@zorin22:~/.../NeXus/definitions$ git grep AXISNAME
base_classes/NXtransformations.nxdl.xml:                the values change with time. The all-caps name ``AXISNAME`` designates the
base_classes/NXtransformations.nxdl.xml:        <field name="AXISNAME" nameType="any" units="NX_TRANSFORMATION" type="NX_NUMBER" maxOccurs="unbounded">
base_classes/NXtransformations.nxdl.xml:                        that does not cause confusion.  When using more than one ``AXISNAME`` field,
base_classes/NXtransformations.nxdl.xml:                        frames.  The end points should be given in ``AXISNAME_end``.
base_classes/NXtransformations.nxdl.xml:        <field name="AXISNAME_end" units="NX_TRANSFORMATION" nameType="any" type="NX_NUMBER" minOccurs="0">
base_classes/NXtransformations.nxdl.xml:                        ``AXISNAME_end`` is a placeholder for a name constructed from the actual
base_classes/NXtransformations.nxdl.xml:                        at the corresponding positions given in the ``AXISNAME`` field.
base_classes/NXtransformations.nxdl.xml:        <field name="AXISNAME_increment_set" units="NX_TRANSFORMATION"  nameType="any" type="NX_NUMBER" minOccurs="0">
base_classes/NXtransformations.nxdl.xml:                        ``AXISNAME_increment_set`` is a placeholder for a name constructed from the actual
base_classes/NXtransformations.nxdl.xml:                        value of this field added to each value of ``AXISNAME`` would agree with the
base_classes/NXtransformations.nxdl.xml:                        corresponding values of ``AXISNAME_end``, but there is a possibility of significant
base_classes/NXtransformations.nxdl.xml:                        differences.  Use of ``AXISNAME_end`` is recommended.
impatient-guide/index.rst:is used as the first dimension of ``data``.  The ``AXISNAME_indices`` 
impatient-guide/index.rst:   #. Associate dimension scales with plottable data dimensions (the ``AXISNAME_indices`` attributes).
impatient-guide/index.rst:#. Plot the *signal* data, given *axes* and *AXISNAME_indices*.
manual/source/datarules.rst:      For each field (its name is *AXISNAME*) in ``axes`` that
manual/source/datarules.rst:      an ``NXdata`` group attribute ``AXISNAME_indices`` which
manual/source/datarules.rst:      If no ``AXISNAME_indices`` attribute is provided, a programmer is encouraged
manual/source/datarules.rst:      The ``AXISNAME_indices`` attribute is only required when necessary to
manual/source/datarules.rst:      It is possible there may be more than one ``AXISNAME_indices`` attribute
manual/source/datarules.rst:#. Plot the *signal* data, given *axes* and *AXISNAME_indices*.
manual/source/datarules.rst:   ``AXISNAME_indices`` as described below (where the text shown here
manual/source/datarules.rst:   as ``AXISNAME`` is to be replaced by the actual field name).
manual/source/datarules.rst:.. AXISNAME_indices documentation will be repeated in NXdata/@AXISNAME_indices
manual/source/datarules.rst::``AXISNAME_indices``:
manual/source/datarules.rst:   Each ``AXISNAME_indices`` attribute indicates the dependency
manual/source/datarules.rst:   relationship of the ``AXISNAME`` field (where ``AXISNAME``
manual/source/datarules.rst:   which need to be used in the ``AXISNAME`` field in
manual/source/datarules.rst:   Here, *AXISNAME* is to be replaced by the name of each
manual/source/datarules.rst:   ``AXISNAME_indices`` attribute is based on the model of

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants