Add LaTeX export for ACM-BCB #943
panflute upgrade required (not yet on conda-forge, so switch to PyPI)
includes fix to manubot/rootstock#386 (comment): Element "MetaList" received "CSL_Item" but expected <class 'panflute.base.MetaValue'>
From pandoc 2.11 (2020-10-11) changelog: > Add CSS to default HTML template (#6601, Mauro Bieg). This greatly improves the default typography in pandoc’s HTML output. The CSS is sensitive to a number of variables (e.g. mainfont, fontsize, linestretch): see the manual for details. To restore the earlier, more spartan output, you can disable this with -M document-css=false.
Initialize methods authors for testing
Disable individual docx outputs
AppVeyor build 1.0.4002 |
Worked locally but failed in CI
AppVeyor build 1.0.4004 for commit d97e5f6 is now complete. Found 15 potential spelling error(s). Preview: content/09.evolution.md:83:nonsynonymous content/09.evolution.md:139:LVNA content/23.vaccines-app.md:15:IgGs content/23.vaccines-app.md:387:IgGs content/60.methods.md:8:CCS content/60.methods.md:64:ECRs content/60.methods.md:80:Manubot's content/60.methods.md:110:scite content/60.methods.md:142:scite content/60.methods.md:159:ECR content/60.methods.md:160:ECRs content/60.methods.md:164:docx content/60.methods.md:166:scite... |
Thank you so much for working on this @agitter! I think as long as we are getting the markdown document into something that is vaguely compatible with the template, it will be a lot easier to clean up the template. I'm not sure how well I can review this PR because it is really complex. @mprobson has also said he can take a look, but I imagine he'll have the same issue with not quite understanding the under-the-hood of Manubot well enough. I can definitely help with reviewing the outputs, though. |
I'm happy to walk through some of the build script changes if you comment on anything you're curious about. The main issue is that the newer version of Pandoc does make the HTML and PDF outputs look different. A short-term workaround would be to create two conda environments and manually toggle between them when we're preparing LaTeX export versus making general builds the rest of the time. It would be better to fix the formatting issues, but we have so little time. For review, you can ignore all the |
@agitter this is an amazing effort! I looked through the committed files and I think I understand most of what they're doing. I'll try to chime in with what I can parse but apologies if I missed something. As far as I can tell the generated .bib file looks correct. We should be able to generate a rough pdf in the file stage of the
or
I use the first command locally to build my finished PDFs (and clean the outputs; everything after the && is just housekeeping), and the second set of commands is from this blog post, which links to bibtex's instructions. We may even be able to use pandoc, e.g. Also, I don't know that all the The majority of my observations are about the generated .tex file.
The linked Pandoc-to-LaTeX template does have functionality to generate this but I don't see it being employed even though I see the
|
Thanks for reviewing. You have a good understanding of what's going on.
This should be possible, but I anticipate we'll run into a lot of minor details when trying to fully automate pdf generation that will slow us down. manubot/rootstock#249 and manubot/rootstock#256 have additional context about building the pdf in the continuous integration environment with pandoc (the old Travis CI one, which was less flexible). The workflow I had in mind was
What do you think about that?
I can delete these and keep only the .cls file. My goal was to provide what is needed for someone building locally.
Yes, Manubot has a different convention for setting authors, but we can modify the metadata in the yaml stub or pass metadata to pandoc as a command line argument. I'm passing the title via the command line for individual manuscripts to override the title in the yaml block. Author metadata is probably worth automating before we merge.
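As a rough illustration of the command-line override described above, here is a minimal Python sketch that assembles a pandoc call. The helper name `build_pandoc_latex_command` and the file names are hypothetical and are not taken from this PR's build script; only the pandoc options `--to`, `--standalone`, `--output`, `--metadata`, and `--template` are real. Pandoc documents that a `--metadata` value given on the command line overrides the same key in a YAML metadata block.

```python
from typing import List, Optional


def build_pandoc_latex_command(
    source: str,
    output: str,
    title: str,
    template: Optional[str] = None,
) -> List[str]:
    """Assemble (but do not run) a pandoc call that overrides the title.

    Hypothetical helper for illustration; the real build script may differ.
    """
    command = [
        "pandoc",
        source,
        "--to=latex",
        "--standalone",
        f"--output={output}",
        # A --metadata value on the command line takes precedence over the
        # same key set in a YAML metadata block inside the document.
        "--metadata",
        f"title={title}",
    ]
    if template is not None:
        # Optional custom Pandoc LaTeX template (path is an assumption).
        command.append(f"--template={template}")
    return command


if __name__ == "__main__":
    print(build_pandoc_latex_command("manuscript.md", "manuscript.tex", "Vaccines"))
```

The list can be handed to `subprocess.run(command, check=True)`. Author metadata could presumably be passed the same way or through the YAML stub, but the structure the template expects would need to be checked first.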
To run this locally
If I'm forgetting a step, I can help debug. I agree that we would need to fork a template if we want to perfect the automated build. If we can instead settle for a .tex file whose content can be copied into the ACM template, we can live with a lot of errors in the template. |
That's not a problem for me. I'm editing these files regularly anyway. We can integrate it into this repo after we merge this pull request. I'll work on the conflicts statement after pushing some text edits to #947. |
This all looks good (pending conflicts integration). I'm not in a place to comment on build/update-latex-metadata.py but it seems to work.
Minor / Unimportant notes (please feel free to ignore - not blocking merge):
- Can remove acmart.bib and sample-sigconf.tex (unused)
- Can add (back) Makefile, bbx, cbx, dbx (but I'm not sure if we need?)
Just pushed a change to the gist enabling |
I noticed we need the ACM copyright info. I can hardcode but the template could ingest the following (I'm just not sure where to check it in):
|
Outstanding things on my TODO list (I think this all needs to be done manually?):
|
@mprobson I usually manually change the acknowledgements, do you want me to send you the correct ones for this paper? Edited to add: have to go into a meeting so doing this here just in case-- We are grateful to Josh Nicholson and Milo Mordaunt for their support with the scite plugin, and to David Nicholson for the suggestion and feedback to enable the reporting of the locations of spelling errors in the spell-checker tool. We thank Nick DeVito for assistance with the Evidence-Based Medicine Data Lab COVID-19 TrialsTracker data. |
I'll update to add that block into the YAML I generate.
I can fix those. |
AppVeyor build 1.0.4065 for commit f7780b8 is now complete. Found 16 potential spelling error(s). Preview: content/09.evolution.md:83:nonsynonymous content/09.evolution.md:139:LVNA content/23.vaccines-app.md:15:IgGs content/23.vaccines-app.md:387:IgGs content/60.methods.md:8:CCS content/60.methods.md:65:ECRs content/60.methods.md:81:Manubot's content/60.methods.md:111:scite content/60.methods.md:111:Scite content/60.methods.md:147:scite content/60.methods.md:164:ECR content/60.methods.md:165:ECRs content/60.methods.md:169:docx... |
That's great, thanks! I think the bigger issue is that we need to use \section*{Acknowledgments} and I'm not sure how to automate that. |
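One possible way to automate the unnumbered heading is to post-process the generated .tex file. The sketch below is only an assumption about how that could be done and is not the approach taken in this PR; `unnumber_section` is a hypothetical helper, and it assumes the heading is emitted literally as `\section{Acknowledgments}`.

```python
import re


def unnumber_section(tex: str, title: str = "Acknowledgments") -> str:
    """Rewrite \\section{<title>} as \\section*{<title>} in LaTeX source.

    Only an exact, numbered heading with this title is changed; headings
    that are already starred are left alone, so the call is idempotent.
    """
    pattern = re.compile(r"\\section\{" + re.escape(title) + r"\}")
    # Use a function as the replacement so backslashes are inserted verbatim.
    return pattern.sub(lambda match: "\\section*{" + title + "}", tex)


if __name__ == "__main__":
    sample = "\\section{Methods}\n\\section{Acknowledgments}\\label{acknowledgments}\n"
    print(unnumber_section(sample))
```

An alternative that avoids post-processing is Pandoc's `{.unnumbered}` (or `{-}`) heading attribute in the Markdown source, which makes the LaTeX writer emit a starred section. Whether that interacts cleanly with the rest of the Manubot build has not been checked here.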
I pushed changes that resolve most/all of the issues above:
The acknowledgments section now shows up as
@rando2 you can edit the content in I'm happy with this tex output. It's a lot cleaner than I would have guessed. From here, should we create a new issue with a single checklist of manual changes needed before submitting? Latest version: |
I just realized we don't have a funding statement anywhere. Should we add that to |
Wow, that's amazing! I agree re: funding statement. I usually put my sources in the acknowledgements. I'm currently getting a weird error trying to build the latest commits:
I suspect that something has gone wrong with my python but wanted to confirm it's working for you. |
Strange. It is working for me. Is that error coming from Manubot or my script? |
It's definitely a system configuration issue on my end, although I am not sure what changed... I have some weird combination of Tmux, Conda, and MacOS... I re-ran on a known good commit and it still fails. conda deactivate/activate doesn't fix it nor does destroying and recreating the environment. I imagine it's just something I need to debug on my own. Here's the command plus error:
|
I think I figured out what's happening... I seem to be dying on this line:
@rando2 and I share an IP address currently and she's getting rate limited by the GitHub API so I suspect I am now being throttled too... Update: Connecting to my VPN resolves the issue! |
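For anyone hitting the same symptom, a quick way to confirm throttling is to query GitHub's `rate_limit` endpoint, which reports the remaining quota for the calling IP address (unauthenticated requests share a small hourly quota per IP). The sketch below is a minimal example using only the standard library; the function names are hypothetical, and it reads only the `resources.core` fields `limit`, `remaining`, and `reset`.

```python
import json
import time
import urllib.request
from typing import Any, Dict, Optional

RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def fetch_rate_limit(url: str = RATE_LIMIT_URL) -> Dict[str, Any]:
    """Download the rate limit status for the current IP (no token sent)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def summarize_core_limit(payload: Dict[str, Any], now: Optional[float] = None) -> str:
    """Describe the remaining core API quota and the seconds until it resets."""
    core = payload["resources"]["core"]
    current = time.time() if now is None else now
    # "reset" is a Unix timestamp; clamp at zero if it is already in the past.
    wait_seconds = max(0, int(core["reset"] - current))
    return (
        f"{core['remaining']}/{core['limit']} requests left; "
        f"resets in {wait_seconds} s"
    )


if __name__ == "__main__":
    print(summarize_core_limit(fetch_rate_limit()))
```

If `remaining` is 0, waiting for the reset, switching networks (as with the VPN above), or authenticating the requests should each get the build going again.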
AppVeyor build 1.0.4083 for commit 3baa06f is now complete. Found 16 potential spelling error(s). Preview: content/09.evolution.md:83:nonsynonymous content/09.evolution.md:139:LVNA content/23.vaccines-app.md:15:IgGs content/23.vaccines-app.md:387:IgGs content/60.methods.md:65:ECRs content/60.methods.md:81:Manubot's content/60.methods.md:111:scite content/60.methods.md:111:Scite content/60.methods.md:147:scite content/60.methods.md:164:ECR content/60.methods.md:165:ECRs content/60.methods.md:169:docx content/60.methods.md:171:s... |
[ci skip] This build is based on d393fe9. This commit was created by the following CI build and job: https://github.com/greenelab/covid19-review/commit/d393fe908979b74420b267e6c5433c58a2d7602d/checks https://github.com/greenelab/covid19-review/runs/801070962
This adds preliminary LaTeX export for the ACM-BCB submission. The big picture is that we can automatically generate a base document for the submission, but it will require a fair amount of manual editing before it will build with the ACM sigconf proceedings template. Because of our tight timeline, we'll need to decide how much more to automate versus prioritizing manuscript content.
There is now a list, individual-docx-manuscripts.txt, of manuscripts to export as docx and another list, individual-latex-manuscripts.txt, of manuscripts to export for LaTeX. Exporting for LaTeX generates a .tex file and a .bib file that contains reference metadata from Manubot. Hopefully those two outputs work together. I haven't tried building a PDF yet.
This builds on two experimental rootstock pull requests: manubot/rootstock#384 and manubot/rootstock#386. One of those upgrades the version of Pandoc. That will cause some changes to our HTML and PDF outputs, so we'll need to check those carefully before merging. I also had to further update the environment to resolve new package incompatibilities. However, the newer version of Pandoc is needed to extract the .bib file.
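To make the .bib step more concrete: Pandoc 2.11 added a `csljson` reader and a `bibtex` writer, so CSL JSON reference metadata can be converted to BibTeX directly. The sketch below is an assumption about what such a conversion could look like and is not necessarily the mechanism used in the rootstock pull requests; the helper names and file paths are hypothetical.

```python
import subprocess
from typing import List


def build_bib_export_command(csl_json_path: str, bib_path: str) -> List[str]:
    """Assemble a pandoc call that converts CSL JSON references to BibTeX.

    Requires pandoc >= 2.11, which introduced the csljson reader and the
    bibtex writer.
    """
    return [
        "pandoc",
        csl_json_path,
        "--from=csljson",
        "--to=bibtex",
        f"--output={bib_path}",
    ]


def export_bib(csl_json_path: str, bib_path: str) -> None:
    """Run the conversion; raises CalledProcessError if pandoc fails."""
    subprocess.run(build_bib_export_command(csl_json_path, bib_path), check=True)


if __name__ == "__main__":
    # Paths are placeholders for wherever the build writes its references.
    print(build_bib_export_command("output/references.json", "output/manuscript.bib"))
```

Keeping command construction separate from execution makes it easy to log or inspect the exact call when a conversion fails in CI.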
When outputting .tex files, Pandoc uses a LaTeX template. This is different from what ACM refers to as a template, which is an example .tex file. The Pandoc version of a template can access metadata variables, like our Markdown template for the front matter. This strategy isn't compatible with how Manubot writes author information and some other metadata. I can resolve some of these incompatibilities, but we'll need to decide what missing metadata (authors? affiliations?) is high priority.
I didn't test images. Those will probably not work immediately.
@misc URL citations currently don't include the URL in the bib file.
Here are the current outputs (.txt added so GitHub will allow the attachment):
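Regarding the @misc entries noted above, a small check like the following could list which entries in the generated .bib file lack a `url` field, so they can be fixed by hand. This is a rough sketch, not part of the PR: `misc_entries_missing_url` is a hypothetical helper, and the regular expression assumes simple entries with one field per line and a closing brace on its own line. It is not a full BibTeX parser.

```python
import re
from typing import List

# Matches "@type{key," followed by the entry body up to a "}" on its own line.
ENTRY_PATTERN = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
URL_FIELD_PATTERN = re.compile(r"^\s*url\s*=", re.MULTILINE | re.IGNORECASE)


def misc_entries_missing_url(bib_text: str) -> List[str]:
    """Return citation keys of @misc entries that have no url field."""
    missing = []
    for entry_type, key, body in ENTRY_PATTERN.findall(bib_text):
        if entry_type.lower() != "misc":
            continue
        if not URL_FIELD_PATTERN.search(body):
            missing.append(key)
    return missing


if __name__ == "__main__":
    sample = (
        "@misc{url:example,\n"
        "  title = {Example page},\n"
        "  year = {2020}\n"
        "}\n"
    )
    print(misc_entries_missing_url(sample))
```

The returned keys could feed a manual checklist or a later script that fills in the URL from the reference metadata.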