feat(tex): support page parameter for includegraphics with multi-page pdf #1922
🦋 Changeset detected
Latest commit: 56921a5
The changes in this PR will be included in the next version bump. This PR includes changesets to release 5 packages.
Nice, looks great. Left a few minor things.
Could you also add page with a comment here:
https://github.com/executablebooks/mystmd/blob/main/packages/myst-spec-ext/src/types.ts#L117
e.g.
export type Image = SpecImage & {
  urlSource?: string;
  urlOptimized?: string;
  height?: string;
  placeholder?: boolean;
  /** Optional page number for PDF images; this ensures the correct page is extracted when converting to web and translated to LaTeX */
  page?: number;
};
  } ${output}`;
- session.log.debug(`Executing: ${executable}`);
+ session.log.info(`Executing: ${executable}`);
Suggested change:
- session.log.info(`Executing: ${executable}`);
+ session.log.debug(`Executing: ${executable}`);
done!
packages/tex-to-myst/src/figures.ts
Outdated
  import type { Handler, ITexParser } from './types.js';
- import { getArguments, texToText } from './utils.js';
+ import { getArguments, extractParams, texToText } from './utils.js';
+ import { group } from 'console';
I don't think this is used?
Removed
packages/tex-to-myst/src/figures.ts
Outdated
  }
  if (params.page) {
    if (typeof params.page === 'number') {
      params.page = Number(params.page) - 1; // Convert to 0-based for imagemagick
Do we need to round or parse this or have something similar to Number.isFinite(Number.parseFloat(params.page))?
You're right! Unless I'm mistaken, I think the extractParams should already have done the parsing, but I've added the missing Number.isFinite and the rounding.
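For reference, the check discussed above could look roughly like the following. This is only a sketch: `toZeroBasedPage` is a hypothetical name, not the code in this PR, and it assumes the raw value may arrive as either a string or a number and that pages below 1 should be rejected.

```typescript
// Hypothetical helper, not the PR's actual code: normalizes a raw `page`
// parameter (1-based, as written in LaTeX) to a 0-based integer index.
function toZeroBasedPage(raw: string | number | undefined): number | undefined {
  if (raw === undefined) return undefined;
  // Accept both already-parsed numbers and numeric strings.
  const parsed = typeof raw === 'number' ? raw : Number.parseFloat(raw);
  // Reject NaN and +/-Infinity.
  if (!Number.isFinite(parsed)) return undefined;
  // Round fractional values to the nearest whole page.
  const rounded = Math.round(parsed);
  // Assumption: LaTeX pages start at 1, so anything lower is invalid.
  if (rounded < 1) return undefined;
  return rounded - 1;
}
```

Returning `undefined` for invalid input means the caller can simply omit `page` from the node rather than store a bad value.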
I just looked at this PR now for the first time - looks great, and I think we should work on getting it landed.
First, just to summarize my understanding, this has two parts: (1) On parsing a tex file, this looks at image params and adds page to the image node (alongside width, which was previously supported). (2) On image conversion, if page is present, imagemagick extracts the correct page.
A couple points:
- @agoose77 - I think your concern was around adding page to the image node. It feels a little clutter-y since it only applies in a few specific cases. Thinking through the alternative a bit: If we do not store page on the node, we would need to do initial image processing at parse time, pulling out the specific page as a separate file. This separate file would still need final conversion to correct format. This means we have extra intermediate files stored on users' machines.
- I noticed if this page-specific PDF image is selected as the "implicit" thumbnail, it also causes problems: thumbnail is just a file, not an entire image node. That means thumbnail does not get the page value, and is stuck trying to process/convert the full PDF file. There are certainly ways around this where we could get these thumbnails working... but I also think we might not need to worry about this edge case. An "explicit" thumbnail can always be set if there are errors with the "implicit" thumbnail.
My inclination is to keep this as-is. It works nicely and it's relatively simple. The only downside is an extra page field on image nodes that will usually be undefined.
  );
  }

  export function extractParams(args: { content: string }[]): Record<string, string | number> {
This seems like it has potential for wider use across other macros/parameter types. 👍
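To illustrate what a reusable key/value parser for macro options might look like, here is a minimal sketch. It is not the `extractParams` from this PR: `parseKeyValueParams` is a hypothetical name, it takes a plain string rather than parser nodes, and it assumes options are comma-separated `key=value` pairs with no commas nested inside braces.

```typescript
type ParamValue = string | number;

// Hypothetical sketch: parses an optional-argument string such as
// "width=0.5\textwidth,page=3" into a key/value record.
// Assumption: no commas appear inside braces in the option string.
function parseKeyValueParams(raw: string): Record<string, ParamValue> {
  const params: Record<string, ParamValue> = {};
  for (const entry of raw.split(',')) {
    const idx = entry.indexOf('=');
    // Bare flags without "=" (e.g. keepaspectratio) are skipped here.
    if (idx < 0) continue;
    const name = entry.slice(0, idx).trim();
    const value = entry.slice(idx + 1).trim();
    if (!name || !value) continue;
    // Store purely numeric values as numbers, everything else as strings.
    params[name] = /^-?\d+(\.\d+)?$/.test(value) ? Number(value) : value;
  }
  return params;
}
```

Keeping dimension values such as `0.5\textwidth` as strings leaves unit handling to whichever macro consumes them.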
Add support for selecting specific pages in multi-page PDFs
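In LaTeX, when including a multi-page PDF as a graphic, it's possible to specify a page number through the page option of \includegraphics, for example \includegraphics[page=2]{figure.pdf} (the file name here is illustrative).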
However, the page parameter is currently ignored, causing all the links to point only to the first page, even though ImageMagick extracts and converts all pages.
This PR adds support for the page parameter, ensuring that the correct page is selected when converting multi-page PDFs.
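As a rough illustration of the page selection, ImageMagick accepts a 0-based frame index in square brackets after the input file name. The sketch below builds such an input specifier; `pdfInputSpec` and the file name are hypothetical and are not taken from this PR's implementation.

```typescript
// Hypothetical helper (not from the PR): appends ImageMagick's frame
// selector to a PDF path so that only one page is read.
function pdfInputSpec(pdfPath: string, zeroBasedPage?: number): string {
  // Without a page, ImageMagick reads every page of the PDF.
  if (zeroBasedPage === undefined) return pdfPath;
  return `${pdfPath}[${zeroBasedPage}]`;
}

// LaTeX page=3 corresponds to ImageMagick index 2.
const spec = pdfInputSpec('figure.pdf', 3 - 1);
console.log(spec); // figure.pdf[2]
```

The off-by-one conversion matters because LaTeX counts pages from 1 while ImageMagick counts frames from 0.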