Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding all of the text of the original word file for the Solarnet recommendations #5

Merged
merged 26 commits into from
Nov 6, 2024

Conversation

ehsteve
Copy link
Member

@ehsteve ehsteve commented Jul 11, 2024

This merge request should include all of the original text from the solarnet document with a few small changes. The original text can be found here.

To see a rendered version of the documentation visit readthedocs. The PDF version can be found here.

This pull request should lead to the first official release.

@ehsteve ehsteve requested a review from tfredvik July 11, 2024 12:20
@tfredvik tfredvik marked this pull request as ready for review July 11, 2024 12:24
Copy link
Collaborator

@tfredvik tfredvik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Steve, Stein and I think it would be useful to skip our reviewing until the document is more ready for release. Is that ok with you guys?

@ehsteve
Copy link
Member Author

ehsteve commented Jul 11, 2024

@tfredvik yes! that is the point. I labelled the pull request as draft and then added you as reviewer which probably notified you. I assumed that you would have gotten a notification only once I removed the draft label. Sorry about that! Ignore this until it is not a draft and is ready for review. I'll re-add you both as reviewers once it is ready.

@ehsteve ehsteve marked this pull request as draft July 11, 2024 12:45
@ehsteve
Copy link
Member Author

ehsteve commented Jul 11, 2024

Oh sorry, actually i had not marked it as draft but now i have!

@ehsteve
Copy link
Member Author

ehsteve commented Sep 9, 2024

@steinhh and @tfredvik, I think we are ready for your review. All of the content should now be there (thanks to a lot of help from @Alrobbertz!). There are a few open issues that we wanted to check with you on.

  1. Section numbering is now done automatically but this does not play well with existing section numbers including the highest sections Part A, Part B, etc. Appendices also have their own automatic section numbers which differ from the hard-coded appendix numbers which is confusing. How should we proceed?
  2. Original section 19 and 20 are now potentially obsolete because indexing is now done automatically. Do we keep them? Do we keep a separate list of the solarnet keywords still?
  3. Review the keyword descriptions in Section 13. Is there anything else we want to be added to this table?

@ehsteve ehsteve marked this pull request as ready for review September 10, 2024 17:56
@ehsteve ehsteve requested a review from steinhh September 10, 2024 17:56
@steinhh
Copy link
Collaborator

steinhh commented Sep 11, 2024

Hi @ehsteve! So far we've only glanced at it, but here's Terje's comments:

  • Could all appendices be put into a common super-section, either the existing “1. Descriptions of FITS keywords”, or a new “Appendices”? That would also make Section 12 more visible, I actually thought is was missing because it was drowned by all the appendices.
  • The navigation bar includes Section 1.2.4 Comments, but this section doesn’t exist.
  • Section 13: there are no section references. The keywords are not printed in fixed width.
  • In some (but not all) blocks with keyword examples the values of keywords have been coloured, e.g. 1.8.1 vs 1.8.2. Also, the colouring affects keyword comments containing an apostrophe which looks a bit strange, e.g. 12.4.3.1. Finally, the colouring of “for”, “from”, “is” etc is even stranger, e.g. 12.4.2
  • Some places the units in the keyword comments have additional backslashes, e.g. 12.4.3.3 (\[m\] -> [m]) and Appendix IV (\[Angstrom\] -> [Angstrom]). The extra backslashes have also crept into some keyword values, e.g. PIXLISTS of Appendix II.
  • Section 15 should be deleted

I don’t understand Steven’s question 2. What automatic indexing is making the keyword lists obsolete? I don’t understand what his 3rd question mean.

--

To which I'll add that it is essential to preserve the original section numbering in such a way that existing references in other documents (papers, metadata standards, etc) are still valid and correct. Not least the companion document regarding simulations, where section numbers have been "synchronised" and quite a few sections are simply "See Section X in the other document".

At worst, sections should be kept even if they end up containing nothing other than "See Section X". If it ends up with a double redirect from the simulation paper, that's just a funny side effect ;-)

@ehsteve
Copy link
Member Author

ehsteve commented Sep 12, 2024

@steinhh thanks for the feedback!

  1. Fixed
  2. Please check again. I see the section but i may have fixed it without realizing it.
  3. Sorry, but i flew through and updated all of the sections and so now I am not sure which one was Section 13!
  4. Not sure what you mean. See below for providing pointers to where you see individual issues.
  5. Those were probably escape characters introduced by the original docx to markdown translation. Thanks for pointing them out. I think I fixed all them all with a project wide search.
  6. Deleted. Sorry, we left that in by mistake.

I understand your concerns about keeping the original section numbering. I had tried to match it with automatic numbering with sphinx but could not match it exactly. I've now removed that automatic numbering and have manually added all of the section numbers to their original values.

Regarding question 2 and Part C, we wanted to talk to you about the content there. For one, the index is here. It is generated automatically by sphinx and it just looks at the list of keywords in this file. We can easily add more files to add elements to the index (like the other FITS keywords). This seems to be doing exactly what you are doing in Part C. You have two sections here 19 and 20 but from an index perspective, I don't see a reader wanting to have to figure out whether a keyword they want more information about is a solarnet keyword or not. A single index makes more sense to me.

Regarding question 3, the table in Section 19 is something i had suggested to you by email before we met! The idea is to create a quick dictionary description of all keywords so that users can easily look up a keyword to see what it means. This is also generated from the csv file. We can therefore easily add columns to this table. For example, we can add a column that notes whether it is a solarnet keyword or not.

To provide pointers to specific problems, if you click on Files Changed you'll see every single file. You can then click on a line number in a specific file to add a comment at that specific spot.

@ehsteve
Copy link
Member Author

ehsteve commented Sep 12, 2024

Have either of you had the chance to look at the automatically generated PDF?

@ehsteve ehsteve requested a review from tfredvik September 12, 2024 19:22
@tfredvik
Copy link
Collaborator

tfredvik commented Sep 16, 2024 via email

@steinhh
Copy link
Collaborator

steinhh commented Sep 16, 2024

Regarding indexing/reference, for sure it's a bit overkill (and confusing) to have three sections. Ideally, we would like to have them all combined into one. I.e., an alphabetical list where SOLARNET keywords are highlighted (e.g., red text), with the brief explanations and section references as links. Would take a bit of scripting, yes... but doable? (We think those SOLARNET keywords should stand, in particular for folks who are already familiar with fits files and who want to know "what's new"). I guess the non-essential elements are keyword sections as links, followed by the section numbers themselves, and the coloring. But it would be really useful to have at least the descriptions + links together.

@ehsteve
Copy link
Member Author

ehsteve commented Sep 17, 2024

Thank you both for the feedback. We will work on it. @Alrobbertz, I won't be able to get to this this week. Do you have time?

@Alrobbertz
Copy link
Member

Thank you both for the feedback. We will work on it. @Alrobbertz, I won't be able to get to this this week. Do you have time?

Yes, I can work on this later this week.

@ehsteve
Copy link
Member Author

ehsteve commented Sep 26, 2024

@Alrobbertz I realized that i can't switch the readthedocs to this repo (instead of mine) because it does not contain readthedocs.yaml file yet (because it is in this pull request!). We will have to switch it over once this PR is merged.

@Alrobbertz
Copy link
Member

Regarding Terje's comments:

  1. I Fixed the section numbering in the PDF so neither the PDF nor the HTML use auto-generated section numbers.
  2. This was just a manual section number fix.
  3. This was a more challenging fix. I had to add a custom latex parser since we we're nesting index nodes inside of code nodes and sphinx wasn't handling this well.
  4. Making the code blocks support more text without wrapping is a trade-off. I tried messing around with margins, and changing the font size.
    • I could go down to font size 10pt to keep the 1in margins.
    • I went down to 0.5in margins to keep the 12pt font size. (This is what I ended up keeping)
  5. Not sure what was happening here, but its fixed now...

Regarding a combined index, I created a combined section 19 that merges together the keywords, their origins, descriptions, and hyperlinked section references.

Other things I fixed along the way...

  1. The coordinate calculate flowchart in Appendix VI was displaying properly in the HTML but it was not displaying at all in the PDF. I got it to display in the PDF by changing the image format from .svg to .png. (since LaTeX doesn't handle .svg images well).

Other Issues I'm still working on, or @ehsteve if you have time to look at these

  1. Text in tables is not wrapping properly in the PDF. For example, the second table in Appendix II has text wrapping in the HTML, and looks funky, but the PDF has the text cut off and not wrapping at all.
  2. There are extra blank pages in the PDF that I can't seem to get rid of. I've tried a few things, but nothing has worked so far.
  3. The "References" section appears after the appendics in the original document, and is listed as section 11. However, in our conversion we list the references after Part C. and do not give it a section number. This creates a strange transition where Part A now ends with section 10 and Part B picks up with section 12.

@ehsteve
Copy link
Member Author

ehsteve commented Oct 1, 2024

This is looking great @Alrobbertz IMHO!

Regarding the SVG figure, we could potentially use svglib to convert the file to a PDF. Latex can generally include those without too much trouble.

For your other issues

  1. Text wrapping in tables is a known problem. There are a few known solutions though I'm not sure if any of them are available to us.
  2. The blank pages may be caused by this.
  3. We need input from @steinhh and @tfredvik on this one please!

Also, could the two of you have a look at the new table with section references? I think it would be good to add more explicit descriptions of each of the keywords. I'm wondering whether either of you are willing to help with this?

@tfredvik
Copy link
Collaborator

tfredvik commented Oct 8, 2024 via email

@ehsteve
Copy link
Member Author

ehsteve commented Oct 9, 2024

@tfredvik if you are okay with the document as it stands please click approve and we can do a first release and fix those cosmetic glitches in a bug fix release. Thanks!

@Alrobbertz
Copy link
Member

@ehsteve I fixed the index and blank pages. It looks like the table wrapping is going to be a bigger fix since it requires converting all the markdown tables into latex tables. Is this something you want to do as a part of this release or a future release?

@ehsteve
Copy link
Member Author

ehsteve commented Oct 10, 2024

@Alrobbertz i just want through the pdf and didn't see anything that bad with the tables so i'm okay with doing tweaking in a future release. We are just waiting for approval from @steinhh.

@steinhh
Copy link
Collaborator

steinhh commented Oct 12, 2024

Hi guys! First of all, thank you for an amazing job!

I don't mind approving the PR as such, things look quite good! The big (and real) step, though, is making it official by telling everyone on my distribution list about it and changing the doc on our web servers to point at the github/readthedocs/PDF versions. I.e., "to find the latest version, from now on go to https://.../".

The announcement should be accompanied by instructions on how to automatically be alerted about changes - preferably only when we think it would be worth-while, as we've done up until now whilst actually updating the on-line version as we go, which is why we call it the "live" version (most updates lately have been very minor, and often just edits for clarity).

Do we do that by having a release branch which people can follow/watch, while we do updates continuously on a develop branch and/or with minor pull requests to that one. With people actually using the develop branch on a daily basis, the release branch being just a way to get alerts (and to get distinct versions to use in references)?

Do we point people to https://.../.../latest/? I really like the TOC there, and I miss it sorely in the https://.../.../latest/generated/ version. The side panel navigation is not as good by far. The side panel is also not sticky (to the viewport) so it does not add anything compared to the very clear TOC on the https://.../...latest/ page. Any way to carry the TOC over into the https://.../.../latest/generated/ page, or vice versa to put the contents onto the https://.../.../latest/ page?

Then, I have to expose more of my ignorance w.r.t. github:

Where's the meat? Sorry, the "code" ;-). Going to the repo home page (https://github.com/IHDE-Alliance/solarnet_metadata/) it's basically empty, and cloning the repo doesn't give me anything more. I suppose this is b/c all the content is in this pull request? If I want to do tweaks, is maneuvering to specific files in the PR the only way, I can't edit on my laptop? That would only be possible after this PR is completed, am I right?

Speaking of tweaks:

  1. The .../latest/generated/ page and PDF document title is just "Solarnet", it should of course be "Solarnet Recommendations for ..." as on the latest/ page.
  2. The second paragraph says "If you want to find the latest released or past version, see the releases". Mhm... where? I guess it's difficult to fix this before we have a release though ;-)
  3. Upon release/before announcement the readme should no longer point to arXiv and the sdc.uio.no sites, but rather the readthedocs page(s)

I will approve the PR unless you think the above comments warrant otherwise. Then that will be the first release, and we should have someone here play guniea pig w.r.t. new releases/alerts, then announce it to the world with clear instructions :)

@steinhh
Copy link
Collaborator

steinhh commented Oct 14, 2024

Terje didn't see what my "TOC problem" was (but then on the other hand he had not even noticed the full TOC in the latest/ version either). If you guys don't get what I meant, tell me and I'll try to be more explicit. As mentioned before, this is not a showstopper in any way, just a (strong) wish from my side. I may be a bit brain-damaged from working with the document for so many years, but I'm used to "scroll to the top for a complete, clear TOC". Now it's "scroll to the top for a less readable nav bar". I realise that the browser's back button solves the problem to some extent, but if you've clicked more than once on the nav bar, you'll need more than one back button press.

Alternative things that would help would be

  1. Indentation of the section titles in the nav bar, would make it much easier to identify the sections
  2. Expansion of the sections (always) in the nav bar. Improves readability (especially now that there is no indentation for main section titles), but it's also easier to find the right place to go to (the subsection titles provide more context for the main section titles).
  3. Having a sticky nav bar - no scroll to the top needed. Though... well, it's more than one page's worth ain't it... not sure how that would work ;)

@Alrobbertz
Copy link
Member

Hi @steinhh,

I think we'd like to merge these changes into the main branch as-is and then we can fix any of the follow-on issues in separate pull requests.

The GitHub release process uses Git Tags to create a unique version number associated with a specific commit on the main branch. This alleviates the need to have two branches for "release" and "development". The latest commits on the main branch can be considered the dev/latest/live version, and we can tag specific commits when we want to generate an official released version. This will eventually populate all the versions on the GitHub repository's release page, although its empty until we create our first release.

This release process will allow users in the web page to view the documentation for each release as well. For example, the Sunpy project uses the same code tools that we are using here, so you can view the Sunpy documentation page as an example. In the bottom right-hand corner there will be a menu that allows you to select a version of the webpage documentation to view. On the sunpy page the default is below:
image

Once you click on this menu it will give you the choice of different versions of the documentation to look at on the web page, including released versions (6.0.1, 6.0.2, 6.0.3 as of the time I write this) as well as the latest/live version which is at the head of their main branch.
image

Right now the meat and the code only exist in Steven's fork of this repository and in this Pull Request. Once the pull request is merged, then the meat will be in the "/docs/source" folder from the root of the repository. Other files in the "/docs" folder are mainly code used to generate the webpage and PDF versions of the documentation. I'd be happy to go through some examples of how to edit and modify the content with you once we have this PR merged.

Unfortunately the TOC issue might be complicated to solve, and require some more changes to how the code builds the webpage. I'm happy to experiment with this in the future, but I think we'd like to get this first version merged in if possible.

Final Questions:

  1. Can you approve this Pull Request if you agree with path forward and the GitHub release process?
  2. When we do generate a release, what version would you like to use? I see the latest on you server is "2.2-live". Would you like our first release here to be v2.3 or v2.2.1? None of the main content has changed compared to the "live" version, only formatting and our indexing. My recommendation would be to use a v2.3 but I am open to suggestions.

@steinhh steinhh merged commit bb53155 into IHDE-Alliance:main Nov 6, 2024
@steinhh
Copy link
Collaborator

steinhh commented Nov 6, 2024

Hi Steve!

I approved the merge now, but I'm still not clear about the release process - what I'm looking for is a way for end users to get notified when and only when we decide it's time for a new release. Is that possible?

@ehsteve
Copy link
Member Author

ehsteve commented Nov 12, 2024

@steinhh, if they have a github account they can follow this repo and will automatically get notifications but i assume you want something broader and automatic (besides you sending your own email?). I would actually suggest that you keep sending your own emails to notify people regardless of us adding an automatic notification scheme for that personal touch. People usually start ignoring automated notifications (i know i do!). Let us look into creating something automatic as well. I'll create an issue. Once we create the first release you'll see how it looks on github as well.

Btw, did you answer @Alrobbertz question about the version number you want to use? I recommend you consider Semantic versioning (see https://semver.org). To quote

Given a version number MAJOR.MINOR.PATCH, increment the:

MAJOR version when you make incompatible API changes
MINOR version when you add functionality in a backward compatible manner
PATCH version when you make backward compatible bug fixes
Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.

It is designed for code but people in the community understand what it means. Given that this release has not changed to the actual text, i would consider it a MINOR or PATCH version but given that it is a major change in format and distribution perhaps it deserves a MAJOR version. What do you think?

@steinhh
Copy link
Collaborator

steinhh commented Nov 12, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants