This repository has been archived by the owner on Apr 17, 2023. It is now read-only.
This appears to be an issue with the transformation from rawtei to toktei.
In both the DjVu and the rawtei, "page" 56 (really an image offset) contains "o X". The actual page (https://archive.org/stream/cu31924020438929#page/n56/mode/1up) is an illustration, and the next page is blank.
The Phokas program pulled the contents of that "page" into the previous page, so offset 56 was removed, followed (correctly) by a blank offset 57.
The page index was built using toktei (the list of documents for that index is on sydney at /mnt/nfs/work3/michaelz/data/caribbean-via-grep.list). Proteus expects to see "o X" for offset 56 (the illustration), but that page does not exist in the toktei, which produces the off-by-one error.
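The shift described above can be reproduced with a small sketch. This is not the Phokas or Proteus code: `merge_into_previous` and `align_by_position` are hypothetical helpers, and the page texts other than "o X" are placeholders. It also assumes that offsets 55 and 58 hold book pages 32 and 33, and that the index pairs pages with offsets by position, which may not be exactly how Proteus does it.

```python
# Illustrative only: hypothetical helpers, not the actual Phokas/Proteus code.

def merge_into_previous(pages, offset):
    """Fold one page's text into its predecessor and drop the page."""
    merged = dict(pages)
    dropped = merged.pop(offset)
    merged[offset - 1] = (merged[offset - 1] + " " + dropped).strip()
    return merged

def align_by_position(offsets, texts):
    """Pair each expected image offset with the text at the same list position."""
    return dict(zip(offsets, texts))

# rawtei-like pages keyed by image offset (placeholder text except "o X").
raw_pages = {
    55: "text of page 32",
    56: "o X",   # OCR noise from the full-page illustration
    57: "",      # blank page
    58: "text of page 33",
}

# toktei-like pages after the illustration page is pulled into the previous one.
tok_pages = merge_into_previous(raw_pages, 56)

# An index that assumes one entry per image offset is now shifted by one.
expected_offsets = sorted(raw_pages)
tok_texts = [tok_pages[k] for k in sorted(tok_pages)]
positional = align_by_position(expected_offsets, tok_texts)

for offset in expected_offsets:
    print(offset, repr(positional.get(offset)))
```

In this sketch, every offset after the dropped page receives the text of the following page, which matches the symptom reported for this document. Building the index from pages that keep one entry per image offset (as the rawtei does) avoids the shift.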
Attached are the rawtei and toktei files. Search for "" in the rawtei, and in the toktei to see the issue.
Ultimately the solution is to either fix Phokas or build the index using rawtei files. My experience has been that building from the rawtei files is the best way to proceed.
See document cu31924020438929
The actual book has, in order: page 32, a full-page image, a blank page, and page 33.
In Proteus, book page number 33 is associated with the text for page 34.
https://archive.org/stream/cu31924020438929#page/n55/mode/2up
http://laguna.cs.umass.edu:2333/view.html?kind=ia-pages&action=view&id=cu31924020438929_55