Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Catch file extraction errors in parseComicMetadata #3422

Merged
merged 1 commit into from
Sep 15, 2024

Conversation

mikiher
Copy link
Contributor

@mikiher mikiher commented Sep 15, 2024

This partially fixes #3406, fixes #3399, fixes #2526 by catching file-extraction related exceptions in parseComicMetadata functions and preventing scan failure.

This is just adding straightforward try-catch block around the code.

However, there's the more general question of libarchive. It seems we have multiple reports of it throwing exceptions while trying to extract files from cbr/cbz files. Some observations:

  • It looks like it is only used for comic file extraction on the server.
  • While libarchive throws exceptions on some cbr files, it looks like 7zip has no issues reading and extracting those files.
  • Another potential issue is that we need to read the whole archive file into memory and pass it to libarchive. This is not how modern archivers should work, and can potentially get us into memory issues.

On the client, I'm even more confused regarding what's happening with libarchive - we have some mix of an installed libarchive npm package (v1.3.0), and some libarchive code in client/static, both of which (!) seem to be used in ComicReader.vue.

Also after fixing the server failures, the offending CBR book is successfully added to library, but then crashes with the same libarchive error on the client when trying to read the book.

Any insights or suggestions on how to proceed?

@mikiher mikiher marked this pull request as ready for review September 15, 2024 09:31
@mikiher mikiher changed the title catch file extraction errors in parseComicMetadata Catch file extraction errors in parseComicMetadata Sep 15, 2024
@advplyr
Copy link
Owner

advplyr commented Sep 15, 2024

For the client I had to put the WASM file in static so it could be passed in to Archive.init function. It's not the entire package stored in static.
When I added the comic metadata parser on the server side the libarchive package wasn't available for node so I had to make manual updates. Shortly after they made it available for node. https://github.com/nika-begiashvili/libarchivejs/releases/tag/v2.0.0

I believe the main issue with the CBR files is the libarchive WASM file was built a while ago and is not up-to-date with the latest RAR file changes.

Being unable to stream is reason enough to find an alternative to libarchive.js

I was looking at another e-reader that appears to have dropped support for CBR and moved away from using libarchive. Now using https://github.com/gildas-lormeau/zip.js which has no plans to support RAR.

It may be best for us to also drop support of RAR

@advplyr advplyr merged commit fa0c90d into advplyr:master Sep 15, 2024
5 checks passed
@mikiher
Copy link
Contributor Author

mikiher commented Sep 15, 2024

It may be best for us to also drop support of RAR

That would be quite unfortunate and probably very unpopular...

Let's try to look for some alternatives for CBR handling before we throw support for it. We should also separate the problem of extracting metadata on the server from providing a CBR Reader.

@mikiher mikiher deleted the parse-comic-metadata-try-catch branch November 18, 2024 09:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants