Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(JATS): Add footnote #896

Merged
merged 4 commits into from
Apr 11, 2021
Merged

Conversation

rgieseke
Copy link
Contributor

@rgieseke rgieseke commented Apr 6, 2021

A first attempt at adding footnote (fn) support to the JATS decoder (as discussed in #880 and stencila/schema#251) .

Considering the following JATS example:

<?xml version="1.0"?>
<article>
<body>
<p>Further<fn id="note-1"><p id="Ch0.note-1">An <bold>important</bold> comment.</p></fn> thoughts below.</p>
</body>
</article>

I get this JSON:

{
  "type": "Article",
  "content": [
    {
      "type": "Paragraph",
      "content": [
        "Further"
      ]
    },
    {
      "type": "Note",
      "id": "note-1",
      "content": [
        {
          "type": "Paragraph",
          "id": "Ch0.note-1",
          "content": [
            "An ",
            {
              "type": "Strong",
              "content": [
                "important"
              ]
            },
            " comment."
          ]
        }
      ]
    },
    {
      "type": "Paragraph",
      "content": [
        " thoughts below."
      ]
    }
  ]
}

Ideally, the footnote would appear as an inline element. Is this due to how paragraphs are handled?

@nokome
Copy link
Member

nokome commented Apr 7, 2021

Thanks for this @rgieseke !

Ideally, the footnote would appear as an inline element. Is this due to how paragraphs are handled?

I decided in Schema that the content of a Note should allow for BlockContent only and inline nodes could get wrapped into a paragraph as you have done. We could expand the schema for content to include inline nodes, so that the wrapping is not necessary. But when encoding to another format e.g HTML it makes the code a little more cumbersome. Generally, I'm leaning towards making schemas as narrow as possible, while still be useful - mostly because it requires less type checking code when it comes to implementations. Ref https://github.com/stencila/schema/blob/9217c1435f92844f91cade18efefbf75b1854297/schema/Note.schema.yaml#L27

Could you please add noteType: 'Footnote' (that should probably be the default in the schema).

@rgieseke
Copy link
Contributor Author

rgieseke commented Apr 7, 2021

What i was expecting is something like

{
  "type": "Article",
  "content": [
    {
      "type": "Paragraph",
      "content": [
        "Further",
        {
          "type": "Note",
          "id": "note-1",
          "content": [
            {
              "type": "Paragraph",
              "id": "Ch0.note-1",
              "content": [
                "A footnote",
              ]
            }
          ]
        },
        " thinking about it."
      ]
    }
  ]
}

I though a footnote is like something in parens which can also be read inline. Is this not possible because a block element cannot contain further block elements? The note itself could be an inline element, couldn't it?

@codecov-io
Copy link

Codecov Report

Merging #896 (fd8241a) into master (9414e46) will increase coverage by 0.04%.
The diff coverage is 75.00%.

❗ Current head fd8241a differs from pull request most recent head 365a55c. Consider uploading reports for the commit 365a55c to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #896      +/-   ##
==========================================
+ Coverage   87.02%   87.06%   +0.04%     
==========================================
  Files          88       88              
  Lines        7900     7919      +19     
  Branches     2511     2524      +13     
==========================================
+ Hits         6875     6895      +20     
+ Misses       1003     1002       -1     
  Partials       22       22              
Impacted Files Coverage Δ
src/codecs/csl/index.ts 87.15% <42.85%> (-0.42%) ⬇️
src/codecs/jats/index.ts 95.87% <81.81%> (+0.43%) ⬆️
src/util/schemas.ts 80.76% <0.00%> (-2.57%) ⬇️
src/util/xml.ts 98.82% <0.00%> (+1.17%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f26e6f...365a55c. Read the comment docs.

@rgieseke
Copy link
Contributor Author

rgieseke commented Apr 7, 2021

Could you please add noteType: 'Footnote' (that should probably be the default in the schema).

Done! I was reluctant in my first commit to add this but this probably makes sense.
In JATS endnotes seem to be encoded as fn within fn-group but could also be just table notes etc.
For the article usecase this default is probably perfectly fine.

@nokome
Copy link
Member

nokome commented Apr 7, 2021

Oh sorry, I misunderstood your point about the note being in it's own paragraph. Now I see what you mean...

The note itself could be an inline element, couldn't it?

Yes, that was what I envisioned too. I think this may have been caused by the fact that we forgot to add Note to the list of InlineContent nodes. I have just released a new version of Schema with that fixed: stencila/schema@aa370b6. Can you try with the new version please.

Also it would be good to add a test for this using the above JATS and expected JSON.

@rgieseke
Copy link
Contributor Author

rgieseke commented Apr 8, 2021

Looks good!

I had to introduce a workaround related to stencila/schema@ebc4560

This also affects other decoders (HTML, Markdown, Gdoc, Ipynb) and i'm not sure if my change is the proper solution.

{
  "type": "Article",
  "content": [
    {
      "type": "Paragraph",
      "content": [
        "Further",
        {
          "type": "Note",
          "id": "note-1",
          "noteType": "Footnote",
          "content": [
            {
              "type": "Paragraph",
              "id": "Ch0.note-1",
              "content": [
                "An ",
                {
                  "type": "Strong",
                  "content": [
                    "important"
                  ]
                },
                " comment."
              ]
            }
          ]
        },
        " thoughts below."
      ]
    }
  ]
}

@rgieseke
Copy link
Contributor Author

rgieseke commented Apr 8, 2021

I tried adding a test file, but right now i can't get to run the tests (npx jest src/codecs/jats) without failure even if i revert to earlier schema versions.
Any advice on how to debug that?

@nokome
Copy link
Member

nokome commented Apr 11, 2021

Any advice on how to debug that?

Sorry for the late reply, been offline for a few days.

Given the fixes that need to be made due to stencila/schema@ebc4560, I think the most efficient approach is for me to merge this PR and then work on those fixes for JATS and the other codecs on master.

@nokome nokome merged commit e792291 into stencila:master Apr 11, 2021
@stencila-ci
Copy link
Collaborator

🎉 This PR is included in version 0.114.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@rgieseke rgieseke deleted the feat/jats-footnotes branch April 13, 2021 06:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants