v2: incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError) #8
Comments
Thank you for such an outstanding bug report! The script made it really easy to reproduce the problem. I was too confident in my specs protecting me from this kind of issue, but it seems I have insufficiently tested binary data. I managed to distill your example down into a short string that triggers the issue. I think the issue can mostly be solved by spamming Still, I will shortly push a commit doing just that, and it at least makes the example pass -- but I'm not 100% sure if I managed to catch every instance of the problem. I probably won't be able to work on this again before Wednesday; I'll try to see if I can uncover more edge cases that can lead to problems then.
Sounds good. There's no urgency from my perspective: released versions of pdf-reader are locked to v1.x, so they're continuing to work fine. I can see a fix has been pushed to main, so I gave it a go (https://github.com/yob/pdf-reader/compare/ascii85-2-0?expand=1). The pdf-reader spec suite is green (some jobs failed, but for unrelated reasons). Here's a passing example on Ruby 3.3: https://buildkite.com/yob-opensource/pdf-reader/builds/630#0191b783-766d-4fa6-b7e4-b9583a832f1e
- Bump PORTREVISION for package change
Obtained from: yob/pdf-reader@cb6f8ed
Reference: yob/pdf-reader#538
DataWraith/ascii85gem#8
yob/pdf-reader@main...ascii85-2-0
DataWraith/ascii85gem@b7480db
Thank you for testing the changes! I went through the code again on the weekend and made sure that all String literals are unfrozen and encoded as ASCII_8BIT before use; that should take care of the encoding errors. The new version has also managed to correctly encode and then decode a few gigabytes of random binary data without raising an exception, so I hope that it works properly now. Unless something else crops up, I'll probably release version 2.0.1 this weekend.
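For illustration, the idea of working only with unfrozen, binary-encoded strings might look roughly like the following in plain Ruby. This is a minimal sketch, not the gem's actual code: the helpers `binary_buffer` and `append_bytes` are hypothetical names introduced here.

```ruby
# Hypothetical helper: returns an unfrozen ASCII-8BIT (binary) copy of a
# string. String.new copies its argument, so a frozen literal is fine here.
def binary_buffer(initial = "")
  String.new(initial, encoding: Encoding::BINARY)
end

# Hypothetical helper: packs integer byte values and appends them to a buffer.
# Array#pack("C*") produces an ASCII-8BIT string, so both sides are binary.
def append_bytes(buffer, bytes)
  buffer << bytes.pack("C*")
end

out = binary_buffer
append_bytes(out, [0xFF, 0x00, 0x80])

puts out.encoding  # the buffer stays binary after appending high bytes
puts out.frozen?   # the buffer can be appended to
puts out.bytesize
```

Because every buffer starts out binary, appending high bytes never has to reconcile two different encodings.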
But not 2.0.0, it has some encoding issues with binary data DataWraith/ascii85gem#8
Thanks for your help here! I've released pdf-reader with a relaxed Ascii85 dependency and all our tests are green ❤️
Thanks for maintaining this library ❤️
I noticed that #7 helped to prompt a v2 release, and over in yob/pdf-reader#538 I've had a suggestion to relax the pdf-reader dependencies to allow v2 to be used.
I gave it a go, but the CI build on ruby versions that installed Ascii85 v2 failed, for example: https://buildkite.com/yob-opensource/pdf-reader/builds/629#0191ac89-884b-4fc1-ace3-3f1a7b11258a
The input data was pulled from a test PDF and is hard to work with for a reproduction, so I trimmed the sample down and put together a short script:
If I flip the Ascii85 version between v1 and v2, the input data works on v1.1.1 and raises an exception on v2.0.0:
The output data is expected to be binary and not valid UTF-8. I assume I might be able to work around it by using the new v2 API to pass in a binary encoded output buffer; however, pdf-reader still supports rubies < 2.7, so I'm aiming to use the v1-compatible parts of Ascii85's API.
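For readers unfamiliar with this error class, here is a minimal sketch in core Ruby (it does not use the Ascii85 gem) of how two strings in different encodings can fail to combine. The helper `try_concat` is a hypothetical name; this only illustrates the general mechanism and makes no claim about where the gem runs into it.

```ruby
# Hypothetical helper: concatenates two strings and returns the resulting
# encoding, or the exception class if the encodings are incompatible.
def try_concat(left, right)
  (left + right).encoding
rescue Encoding::CompatibilityError => e
  e.class
end

ascii_text = "cafe"        # UTF-8, but only 7-bit ASCII characters
utf8_text  = "caf\u00E9"   # UTF-8 with a non-ASCII character
raw_bytes  = "\xFF\xD8".b  # ASCII-8BIT (binary) with high bytes

# ASCII-only text is compatible with binary data.
p try_concat(ascii_text, raw_bytes)

# Non-ASCII UTF-8 text and high binary bytes cannot be combined.
p try_concat(utf8_text, raw_bytes)

# Arbitrary binary output is generally not valid UTF-8.
p raw_bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
```

A caller that expects binary output can therefore check `String#encoding` on the result rather than assume it is UTF-8.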