Restrict NGINX (public?) access to "large" files only (e.g. .mp3, .mp4, .webm) in /library/calibre-web — for PR #51 OOM workaround #57
base: master
I smoke-tested and functionally tested this PR on Ubuntu 22.04. Waiting for your review, @holta.
Explaining the wider context
FYI this (interim?!) PR might not be a perfect solution to allow for Calibre-Web playback of large videos. But it might be a "good enough" interim workaround for now — by building on:
|
Co-authored-by: A Holt <[email protected]>
Smoke-tested together with #51 on Ubuntu 24.04 (Noble Numbat)
@deldesir, 4 questions below if you can help shine a light on where this is going:
|
The implementation of these pull requests does not alter the fundamental user experience in consuming content through Calibre-Web. Instead, their primary focus is on enabling the direct playback of video webm and mp4 files. Whether users have an account or not, the ability to browse and read books has always been available. The specific issue addressed pertained to users being served a "copy" of the book, which, when loaded into memory, resulted in an out-of-memory problem, especially with larger video files. While the current fix does not comprehensively cover the download aspect, a workaround exists for videos. Users can click on the video to obtain its URL and subsequently save/download it. However, this workaround may not be as practical for books, where the conventional download button remains the more suitable option. It's worth noting that significant issues related to reading larger books have not been encountered thus far. Therefore, in terms of viewability and playability, the pull requests have proven effective. The downloadability aspect, however, is contingent on the specific content type and the workaround mentioned.
The mention of "Another PR will be made" specifically alludes to an additional pull request that will be crafted to address the download aspect, particularly for media files that tend to be larger in size, such as videos and occasionally audio. While PR #51 ([Workaround for] OOM / memory "502 Bad Gateway" crashes w/ large videos' playback or download-to-client [SEE ALSO #37 & #53]) primarily focuses on resolving out-of-memory issues related to video playback, this forthcoming pull request aims to further optimize and enhance the download functionality for large media files.
Calibre-Web has been designed with a primary focus on handling books. The issue of "502 Bad Gateway" out-of-memory crashes, as outlined in #37 (Some videos download as overweight (giant) files that (1) are not appropriate for schools, and (2) crash Calibre-Web with "502 Bad Gateway") and #53 (Give everyone access to all books, intentionally ignoring Calibre-Web ACL accounts/authorization?), is primarily associated with specific scenarios involving videos, rather than the core functionality of serving books and possibly audiobooks. Because books and audiobooks/podcasts typically have modest file sizes, the likelihood of out-of-memory issues with them is minimal. The current architecture handles their storage, retrieval, and display efficiently, ensuring a stable and secure experience for users, even on devices with constrained resources.
In terms of tightening it up, the commented-out lines (# fancyindex on; and # autoindex on;) were removed as they were deemed unnecessary for the current configuration. The regex patterns for file types and directories are combined for better readability and maintenance while ensuring the temporary security measures remain intact.
|
@deldesir thanks much for explaining as we work this out. Can we please keep things extremely simple at this time? Ideally something like:
Or if absolutely necessary something like:
|
Absolutely, simplicity is key! I'll just make a small adjustment to it to ensure the intended access restriction.
|
We need to keep in mind this is a temporary fix. We'll need to align with Calibre-Web architecture eventually.
scripts/calibre-web-nginx.conf
Outdated
index off; # Disable automatic index file processing

# Allow serving of webm and mp4 files
location ~* \.(webm|mp4)$ { }

# Deny access to all other content within /books-direct/
location /books-direct/ {deny all;}
- `index off;` appears erroneous: `autoindex off;` is what you mean, I suspect? Specifically: https://nginx.org/en/docs/http/ngx_http_autoindex_module.html in contrast to https://nginx.org/en/docs/http/ngx_http_index_module.html, which is really used for `index index.html;` etc. (While we investigate preserving Calibre-Web's default behaviors, as much as possible anyway!)
- Many NGINX folks seem to prefer `location /books-direct/ {deny all;}` on top of the file, if only as a reminder, to make clear that NGINX examines all prefixes first (before regexes, which are typically put on the bottom of the file, to make that clear).
- Does `location / { deny all; }` also work? (Or is the longer-form `location /books-direct/ { deny all; }` truly required?)
- index off; # Disable automatic index file processing
- # Allow serving of webm and mp4 files
- location ~* \.(webm|mp4)$ { }
- # Deny access to all other content within /books-direct/
- location /books-direct/ {deny all;}
+ #autoindex off; # Directory listings (should be off by default?)
+ # Deny access to all other content: https://serverfault.com/a/222928
+ location /books-direct/ { deny all; }
+ # Allow serving of webm and mp4 files
+ location ~* \.(webm|mp4)$ { }
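For reference, a minimal sketch of the two points raised in this review, assuming the rules sit directly at server level (the surrounding server block and any root/alias directive are not shown in this thread, so they are left out here). Per the nginx documentation, `autoindex` controls directory listings while `index` takes file names, and nginx first remembers the longest matching prefix location, then tries regex locations in order; a matching regex is used instead of the remembered prefix.

```nginx
# Sketch only: placement at server level is an assumption, not taken from the PR.

# ngx_http_autoindex_module: directory listings ("off" is already the default).
autoindex off;

# ngx_http_index_module expects file names, for example:
#   index index.html;
# so "index off;" would be read as a file name rather than a switch.

# Prefix location: remembered as the longest prefix match for /books-direct/...
location /books-direct/ { deny all; }

# Regex location: tried after the prefix search. When it matches, it is used
# instead of the remembered prefix location, so the relative order of the
# prefix and regex blocks in the file does not change the outcome.
# Note: at server level this pattern is not limited to /books-direct/.
location ~* \.(webm|mp4)$ { }
```

If a prefix location should win over regexes, the documented way is the `^~` modifier on that prefix, which is not what this PR wants, since the regex must take over for video files.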
1- I meant `autoindex off;`, thanks for catching this. And yes, it's off by default.
2- Regarding the placement of `location /books-direct/ {deny all;}`, it won't work for me if it's on top. I would get a "404 not found" error.
3- Adding `location / { deny all; }` will result in the following error:
root@box:~# nginx -t
2023/12/05 15:19:23 [emerg] 2387#2387: location "/" is outside location "/books-direct/" in /etc/nginx/conf.d/calibre-web-nginx.conf:36
nginx: configuration file /etc/nginx/nginx.conf test failed
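The `[emerg]` message suggests that these rules are nested inside an outer `location /books-direct/` block, since nginx requires a nested prefix location to start with its parent's prefix. A sketch of that assumed layout follows; the outer block's real contents are not shown in this thread, so they are only indicated by a placeholder comment.

```nginx
# Assumed layout, inferred from the error message; not copied from the PR.
location /books-direct/ {
    # ... existing root/alias and other directives (not shown in this thread) ...

    # Rejected by "nginx -t": "/" does not start with the parent
    # prefix "/books-direct/", hence the "is outside location" error.
    # location / { deny all; }

    # Accepted: the nested prefix stays within the parent prefix.
    location /books-direct/ { deny all; }

    # Regex locations are exempt from the prefix check and can be nested here.
    location ~* \.(webm|mp4)$ { }
}
```

Under this assumption, the longer form is needed only because of the nesting; a `location / { deny all; }` placed at server level would be valid syntax but would affect every other path on the server.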
> 2- Regarding the placement of `location /books-direct/ {deny all;}`, it won't work for me if it's on top. I would get a "404 not found" error.

"404 not found" error occurred with which URL(s)?
@deldesir I'm not sure what you tested. Please explain.
This PR works correctly as is:
(1) I tested http://box/books-direct/ == http://10.8.0.50/books-direct/ which correctly shows 403 Forbidden.
(2) The following URL correctly plays its (tiny 256x144 rectangle!) video...
( After I clicked "Download to IIAB" and entered URL https://www.youtube.com/watch?v=fDB4TpZIgzQ )
Co-authored-by: A Holt <[email protected]>
@deldesir How will Raspberry Pi Zero 2 W (which has just 512MB RAM) serve multi-hour audiobooks and multi-hour symphonies[*] to many people at the same time? [*] Both of which are common on YouTube and other places, for background listening!
Ok. This was smoke-tested on Ubuntu 24.04.
This ensures that access is restricted to only video files with "webm" or "mp4" extensions within the /library/calibre-web/ path. Access to any other files or directories is denied. This PR doesn't cover the download of videos: they are still loaded into memory first, like the other file formats. Another PR will be made to address this issue, which affects access to huge files.