Tools=>Illustration Fixup #595
Comments
Thanks for the suggestions, Rick. Notes (partly for whoever looks at this)
It may never be possible to catch them all, but catching more would be helpful. Relying on the asterisk alone misses far too many and may leave the PPer surprised later. I was astonished when only one was found in the entire file (already corrected before I zipped it). Perhaps the better solution would be to provide a manual tool similar to the GG1
I composed a reply, but obviously never clicked the final "Comment" button to post it.
So the algorithm would be:
I think this would catch a lot of the cases where the formatter hasn't marked it as a mid-para illo, and shouldn't produce false positives.
I concur. While it might miss a few instances, this approach should significantly improve the situation. One potential limitation is the program's ability to accurately identify instances where multiple illustrations occur consecutively. I'll leave the decision regarding the feasibility of implementing a longer lookahead mechanism to your discretion.
@okrick - I've changed the code to cope with the above situation of multiple illos & blank lines, and also with illos that span more than one line, and even ones that have blank lines within the caption, so in the next release you would get the following illos all reported as being MID-PARAGRAPH:
It now looks forward to find the first "normal" line, i.e. one that is not an empty line, a `[Blank Page]`, another illo/SN, or a page separator line. When it finds a normal line, it checks whether the line above it is blank, which would mean it is the start of a paragraph. If not, then the illo/SN is mid-paragraph. Fixes DistributedProofreaders#595
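For whoever looks at this later, the lookahead described above could be sketched roughly as follows. This is not the actual GG2 code: the function names, the `-----File:` page separator pattern, and the bracket-counting way of finding the end of a multi-line illo/SN are all assumptions made for illustration.

```python
import re

# Assumed patterns; the real tool may recognise these lines differently.
PAGE_SEP_RE = re.compile(r"^-----File: ")
BLANK_PAGE_RE = re.compile(r"^\[Blank Page\]$")
BLOCK_START_RE = re.compile(r"^\*?\[(Illustration|Sidenote)")


def find_block_end(lines, start):
    """Return the index of the line that closes the bracketed block at `start`.

    Brackets are counted so that captions spanning several lines, or
    containing blank lines, are treated as one block (hypothetical helper).
    """
    depth = 0
    for i in range(start, len(lines)):
        depth += lines[i].count("[") - lines[i].count("]")
        if depth <= 0:
            return i
    return len(lines) - 1


def is_mid_paragraph(lines, start):
    """Return True if the illo/SN starting at `start` interrupts a paragraph."""
    i = find_block_end(lines, start) + 1
    while i < len(lines):
        line = lines[i].strip()
        if line == "" or BLANK_PAGE_RE.match(line) or PAGE_SEP_RE.match(line):
            i += 1  # skip blank lines, blank pages and page separators
        elif BLOCK_START_RE.match(line):
            i = find_block_end(lines, i) + 1  # skip a following illo/SN
        else:
            # First "normal" line: a blank line above it means a new
            # paragraph starts here; otherwise the paragraph was interrupted.
            return lines[i - 1].strip() != ""
    return False  # no normal line follows, so nothing is interrupted
```

Calling `is_mid_paragraph(lines, n)` for each line `n` that starts an illo/SN would give the list of candidates to report as MID-PARAGRAPH.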
Wow, that's quite an accomplishment. Thanks
The current GG2 Illustration Fixup tool has limitations:
Limited Detection of Mid-Paragraph Illustrations: It only reliably identifies mid-paragraph Illustrations if they are explicitly marked with an asterisk before the Illustration tag (e.g., `*[Illustration...`).
Failure to Detect Page Break Interruptions: The tool fails to recognize instances where a paragraph is interrupted by a page break, followed by an Illustration, and potentially more page breaks before the paragraph resumes.
Dependence on Manual Asterisk Placement: Proofreaders often omit the necessary asterisk when the Illustration occupies an entire page, making these instances difficult for the tool to detect.
Addressing these limitations would require enhancements to the GG2 Illustration Fixup tool:
Improved Contextual Analysis: The tool could be enhanced to analyze paragraph flow across page breaks, considering the presence of Illustrations as potential interruptions.
Current Workaround:
To manually identify potential paragraph interruptions caused by Illustrations, I currently use the following search term:
^-+[^\n+]+-\n+\*?\[(Illustration|Music)
This search term helps locate images that might be breaking paragraphs by targeting images at the top of pages.
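Outside the editor, the same search could be run over a saved text file with a few lines of Python. This is only a sketch of the workaround, not part of GG2; the function name `find_top_of_page_images` is made up, and the sample in the usage note assumes page separator lines that begin and end with dashes.

```python
import re

# The search term from above, compiled so that ^ matches at every line start.
TOP_OF_PAGE_IMAGE_RE = re.compile(
    r"^-+[^\n+]+-\n+\*?\[(Illustration|Music)", re.MULTILINE
)


def find_top_of_page_images(text):
    """Return (line_number, tag) for each image tag directly after a page separator."""
    hits = []
    for match in TOP_OF_PAGE_IMAGE_RE.finditer(text):
        # 1-based number of the line holding the [Illustration or [Music tag
        line_number = text.count("\n", 0, match.start(1)) + 1
        hits.append((line_number, match.group(1)))
    return hits
```

Each hit still has to be checked by eye, since an image at the top of a page does not necessarily break a paragraph.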
Note: I realize a complete solution might not be feasible. I only ask that the problem be given some consideration.
henry.txt
henry.txt.json