Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated column_description.csv with short descriptions #196

Conversation

stephjuneau
Copy link
Contributor

This new version of column_description.csv includes an additional column called FullDescription which is the original description that is often too long for FITS files but will work OK for the data model documentation. The column called Description is a shortened version that is limited to up to 46 characters as an estimate for the length compatible with FITS file headers.

So far, the file was changed but none of the code. Based on reserving the column Description for the FITS file functionality, we anticipate that no changes should be needed for desiutil.

However, there is still a TO DO item to adapt and test: desidatamodel/bin/update_column_descriptions to use the FullDescription column to preserve the longer descriptions. I have not yet attempted this with perlmutter being down so someone can feel free to try and contribute to this PR.

@stephjuneau stephjuneau linked an issue Sep 12, 2024 that may be closed by this pull request
@stephjuneau
Copy link
Contributor Author

@sbailey I made this draft PR because I'm going to be unavailable for part of tomorrow and may not have time to change the code so this is changing ONLY the csv file as a starting point. That said, I don't understand why changing only one csv file and nothing else results in so many of the CI tests failing. Is that expected? I just cloned, updated the csv and pushed to this branch. Feel free to commit more changes or let me know if you have specific requests.

@weaverba137
Copy link
Member

I can update update_column_descriptions and fix the tests. One set of tests involves some stringent QA on the CSV file itself, so it's not surprising that adding a new column would require updating the tests.

@akremin
Copy link
Member

akremin commented Sep 12, 2024

Posting what I put on Slack with regards to DESINAME:

I might suggest for consideration:
“Name with truncated decimal TARGET_{RA,DEC}”

I am not opposed to Stephanie’s version but it loses some important information while repeating some things that could be gleaned from the names themselves. If someone doesn’t explicitly look at a DESINAME entry then my explanation is less obvious, but if they do look at the description + entry then I think mine conveys more. So it depends on the most common use case we expect to have.

I’m very happy with the long description.

@coveralls
Copy link

coveralls commented Sep 12, 2024

Coverage Status

coverage: 100.0%. remained the same
when pulling 547accc on 195-column_descriptions-with-shorter-descriptions-for-fits-headers
into 6937e98 on main.

@sbailey
Copy link
Contributor

sbailey commented Sep 12, 2024

Unfortunately desiutil/bin/annotate_fits takes the first column with case-insensitive "description" anywhere in the name, not just the column called case-insensitive "description" as we had previously expected. That means that annotate_fits is picking up the "FullDescription" column instead of the intended "Description" column. Options:

  • Reorder column_descriptions.csv to have "Name,Type,Units,Description,FullDescription" instead of "Name,Type,Units,FullDescription,Description"
  • Rename "FullDescription" to something that doesn't have "description" as a subset of the name
  • Update desiutil.annotate.find_column_name so that the prefix parameter actually mean prefix, and not "contained in", e.g. changing line 52 from if s in column.lower() to if column.lower().startswith(s). I don't know if there are other places that might be replying upon the current behavior, e.g. using a file with a column name "ColumnDescription".

@weaverba137 @stephjuneau opinions?

@weaverba137
Copy link
Member

Again, my preference would be to not modify desiutil. Any solution that does not involve that is fine with me.

@sbailey
Copy link
Contributor

sbailey commented Sep 12, 2024

  • I updated column_descriptions.csv to use the order "Name,Type,Units,Description,FullDescription". This lets desiutil/bin/annotate_fits work with the short Description column without requiring any code changes there.
  • I updated the desidatamodel.stub code to use FullDescription instead of Description when updating a datamodel file.
  • I updated the short version of DESINAME to Anthony's suggestion "Name with truncated decimal TARGET_{RA,DEC}" i.e. emphasizing the truncated aspect that would otherwise be tricky to know.
  • I cross checked the new column_descriptions.csv FullDescription with the current main Description. I fixed one quote typo induced by the csv -> spreadsheet -> csv round trip, and verified that the other changes are purposeful:
    • DESINAME: more detailed FullDescription than before
    • SHAPE*_*: fixed ellipticity typo (thanks Stephanie!)
    • SURVEY: more detailed FullDescription than before

@sbailey sbailey merged commit 37f22ff into main Sep 12, 2024
32 checks passed
@sbailey sbailey deleted the 195-column_descriptions-with-shorter-descriptions-for-fits-headers branch September 12, 2024 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

column_descriptions with shorter descriptions for FITS headers
5 participants