Support accent-insensitive search using SQLean unicode sqlite3 extension #3199
I tested with the debian pkg on ubuntu 23 and it is working. I removed the ffmpeg install from the debian preinstall script. It will be much better to have all the install methods using the same ffmpeg source. I'll mention in the release notes that the old location of the ffmpeg/ffprobe binaries at Thanks for figuring all that out with sqlite extensions. Great stuff! |
I realize that the docker images are going to download the binaries on every image start since they aren't getting mapped outside the container. Also the Dockerfile is already including ffmpeg/ffprobe but the latest version. We can either:
It is nice to have the binaries already included in the docker image but it is also helpful to have everything use the same method of getting binaries. I'm leaning towards 2 unless there is a better reason to package the binaries |
Option 1 is the accepted practice, unless the file is optional. |
I am obviously somewhat partial here, but let me try to list the pros and cons for 2 as I see them. Pros:
Cons:
Also, if we go with option 1, I think we still need to obtain it from GitHub in the Dockerfile: sqlean is not in the apk repository, and I don't think we want to keep a copy of it in our repository (same goes for the other build scripts). |
I haven't gotten to look into this much yet but there have been a lot of reports of sqlite db corruption and the server becoming unresponsive #3251. I hope to be able to look into this later this afternoon (it's morning for me) but let me know if you have any thoughts |
I saw some of these reports. It is of course possible that this is caused
by the extension, but as you know, this extension only ever gets used in
search queries triggered from the UI, and only runs inside SELECT queries -
it's not involved in any update of the data.
But if you suspect this is due to the extension, or even if you want to
check if without it the issue goes away, please go ahead and revert.
On Thu, Aug 8, 2024 at 4:55 PM advplyr wrote:
I haven't gotten to look into this much yet but there have been a lot of
reports of sqlite db corruption and the server becoming unresponsive #3251.
I hope to be able to look into this later this afternoon (it's morning for me) but let
me know if you have any thoughts
|
The other possibility I can think of is the switch to using MemoryStore for sessions |
At this point, since we don't seem to have a good direction, and since this
seems to be causing quite a bit of damage, I would revert these changes one
at a time.
On Thu, Aug 8, 2024 at 8:02 PM advplyr wrote:
The other possibility I can think of is the switch to using MemoryStore
for sessions
|
I was able to reproduce the A user on Discord sent me their corrupt DB and narrowed it down to crashing when the titles of specific books are updated. It doesn't seem to be related to the actual title ("Console Wars"); it just so happens that this edit triggers the SQLITE_CORRUPT. When not loading the unicode extension the db did not crash, so I'm putting out a patch now with the extension disabled. I couldn't find anything useful to track down the root cause of the crash. The sqlite db used in testing is only 2MB so I can send it to you if you want to test it. |
Oh and also the other issue happening with the server hanging up was the new MemoryStore that I patched out in v2.12.2. |
Thanks for tracking this, and sorry for causing this mess. Yes, please send me the db and the edit that caused the corrupt error. So, do you think the db was corrupted before the edit, or was the edit causing the corruption? Also, please send me the log up to the crash. Which system was this running on? |
After reading #3241 again, and the discussion on Discord, and re-reading the extension docs, and looking at the extension source code, I think I might have some idea of what's been happening... I initially thought that the extension only added a couple of functions which I only used in SELECT queries, but then I read your comment that this happens specifically when updating the title of some book, and something clicked. I went back to the extension documentation, and saw the following line, which at the time I didn't think was relevant to what I was trying to achieve:
But now that you mentioned the title, I remembered that the What I now suspect is that the new NOCASE collation doesn't work well (a euphemism for crashes) on indices built with the original collation. The issue probably manifests itself when you have some/many titles with non-ASCII characters, where indexing would likely be different due to the new collation. And so when you try to do things that change the Looking at the bug, I also note that the original crash reported happened when trying to delete a If all of the assumptions above are true, the solution would be to reindex the In the meantime, while the extension is disabled, it would probably be a good idea to reindex all I'm waiting for you to send me the db so I can run some tests on it to verify the above. |
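To illustrate the suspicion above, here is a toy sketch in JavaScript. It is not SQLite's B-tree code: a sorted array stands in for an index, and `asciiNocase` / `unicodeNocase` are hypothetical comparators standing in for the original and the overriding NOCASE collations. The index is built with one comparator and searched with the other, so a lookup can miss a key that is actually present; rebuilding the index under the new comparator makes the lookup work again.

```javascript
// Toy model only: a sorted array plays the role of an index.
// Both comparators are illustrative stand-ins, not SQLite or SQLean code.

// Folds only A-Z, like an ASCII-only case-insensitive collation.
const foldAscii = (s) => s.replace(/[A-Z]/g, (c) => c.toLowerCase())
// Folds all cased characters, like a Unicode-aware case-insensitive collation.
const foldUnicode = (s) => s.toLowerCase()

// Builds a comparator that orders strings by their folded form.
const compareWith = (fold) => (a, b) => {
  const x = fold(a)
  const y = fold(b)
  return x < y ? -1 : x > y ? 1 : 0
}
const asciiNocase = compareWith(foldAscii)
const unicodeNocase = compareWith(foldUnicode)

// Binary search that trusts `keys` to be sorted according to `cmp`.
function indexLookup(keys, target, cmp) {
  let lo = 0
  let hi = keys.length - 1
  while (lo <= hi) {
    const mid = (lo + hi) >> 1
    const c = cmp(keys[mid], target)
    if (c === 0) return mid
    if (c < 0) lo = mid + 1
    else hi = mid - 1
  }
  return -1
}

// '\u00C9cole' is "École", '\u00E9clair' is "éclair".
const titles = ['Zebra', '\u00C9cole', 'apple', '\u00E9clair']

// Index built under the old collation.
const oldIndex = [...titles].sort(asciiNocase)
// Lookup under the new collation misses a key that is present.
const staleResult = indexLookup(oldIndex, '\u00E9clair', unicodeNocase)

// "Reindexing": rebuild the index under the new collation.
const newIndex = [...titles].sort(unicodeNocase)
const freshResult = indexLookup(newIndex, '\u00E9clair', unicodeNocase)

console.log(staleResult, freshResult) // -1 2
```

In SQLite itself the analogous rebuild is the REINDEX statement; which tables or indices would need it here is not something this sketch settles.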
Ah yeah that makes sense. I just sent you the broken db on discord. |
const StreamZip = require('../nodeStreamZip')
const { finished } = require('stream/promises')

var API_URL = 'https://ffbinaries.com/api/v1'
The rate limit is pretty strict.
https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28
Probably best to revert back to ffbinaries or we could host the binaries ourselves.
Docker builds should also not be installing ffmpeg/ffprobe since they are packaged in the image. Only reason they are is because the image includes ffmpeg/ffprobe v6.1 but the BinaryManager is expecting 5.1.
I updated the BinaryManager to 5.1 because of an error with ffprobe on 6.1 that I was able to reproduce on Windows #2689
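One way a downloader could notice that it has hit the limit is to inspect the rate-limit headers that GitHub's REST API returns with each response. The sketch below is an assumption about how that might be handled, not code from this PR: `describeRateLimit` is a hypothetical helper, and `headers` is assumed to be a plain object with lower-cased header names.

```javascript
// Hypothetical helper: interprets GitHub REST API rate-limit response headers.
// `headers` is assumed to be a plain object keyed by lower-cased header name.
function describeRateLimit(headers, nowMs = Date.now()) {
  const remaining = Number(headers['x-ratelimit-remaining'])
  const resetEpochSec = Number(headers['x-ratelimit-reset'])

  // Headers absent or unparsable: nothing can be concluded.
  if (Number.isNaN(remaining) || Number.isNaN(resetEpochSec)) {
    return { limited: false, retryAfterMs: 0 }
  }

  const limited = remaining === 0
  // x-ratelimit-reset is a UTC epoch time in seconds.
  const retryAfterMs = limited ? Math.max(0, resetEpochSec * 1000 - nowMs) : 0
  return { limited, retryAfterMs }
}
```

A caller could use the result to fall back to another source or to report a clear error instead of failing at startup.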
IMO for Docker the dependencies should always be bundled inside. This is IMO one of the main points of using a Docker image: make running containers more predictable.
Yeah I agree. I'm testing ffmpeg 5.1 in the Dockerfile now
The ffprobe 6.1 issue with those mp3 files mentioned in #2689 is specific to Windows in my testing. Since it is not easy to include an older version of a package with alpine linux I think I'll just add a skip for the BinaryManager for docker installs for now.
I just wanted to note, though, that even in the current docker setup, binaries aren't really bundled, but installed through apk (which probably doesn't have the rate-limiting issue, but is still downloaded over the network).
This only happens when the image is built though. Once it is built the user's system doesn't run it.
Docker build can stay with ffmpeg/ffprobe v6.1. It is the Windows build that had the issue with 6.1.
It would be nice to have them all use the same version but it is not easy to install older alpine linux packages that I could see so I was thinking of just having the binary manager skipped for docker.
Sorry for speaking out of ignorance.
Then I concede that BinaryManager is not suitable for docker installations (at least not as part of server init).
I set it up to skip the binary manager if the source is "docker" that is set in index.js. That should do it
@mikiher I apologize for the tone of my first comment, I reworded it.
No worries, I'm using untone() so I'm tone-insensitive...😁
Thanks @advplyr. Do you also have the log of the SQLite failure? On which system were you able to reproduce it? |
It is the same SQLITE_CORRUPT error in #3241. I reproduced it on Windows running locally. First I opened the db and NULLed the root user password so I could login. Once logging into root with no password I search for "Console Wars". There are 2 books in the books library starting with "Console Wars" and one book in the audiobook library. If you edit the title in Abs of any of those books it will crash with SQLITE_CORRUPT. |
OK, I can confirm that I was able to reproduce the crash locally using the DB you sent. I can also confirm that after adding the |
I'm not sure if it's too late to weigh in on the discussion about the ffmpeg binaries as a non-contributor (thank you for all the work you folks do), but I'd prefer they be packaged with the container image. I always lock my dependencies to a version at work, and in this case I had to downgrade after my image failed to start due to the unauthenticated rate-limiting after upgrade. |
This is already done. On Docker we no longer try to download binaries at runtime. |
This fixes #2678.
This PR has two main parts:
1. Refactoring and modifications in BinaryManager to support new capabilities
These BinaryManager changes allowed me to:
I also made BinaryManager run for all platforms, so it's able to download the unicode library (which is needed for the second part). Up until now it was only running for dev and Windows platforms, but I believe it's safe to run it on all supported platforms, with no changes or side-effects (if ffmpeg and ffprobe already exist on the system and are identified using the findRequiredBinaries() method, those will be used and nothing will be downloaded).
As a reminder, if BinaryManager is not able to find a binary it needs, it downloads it and puts it in one of two locations:
- mainInstallDir: global.appRoot (or, in the case of a pkg-ed binary, the directory where that binary is located)
- altInstallDir: (if mainInstallDir is not writable) the Audiobookshelf config directory (which should always be writable)
I believe the BinaryManager is a good mechanism to make sure all required binaries are available (whether they're obtained externally or by BinaryManager itself), and it removes the hassle of deploying those binaries in the various supported platforms.
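The fallback between the two locations could look roughly like the following sketch. `isWritable` and `pickInstallDir` are hypothetical names, not the BinaryManager's actual methods; the sketch only assumes Node's built-in fs module.

```javascript
const fs = require('fs')

// Returns true if the current process can write to `dir`.
// A missing directory counts as not writable.
function isWritable(dir) {
  try {
    fs.accessSync(dir, fs.constants.W_OK)
    return true
  } catch {
    return false
  }
}

// Prefer the main install dir; fall back to the alternative when it is not writable.
// Both arguments are illustrative: the description above maps them to
// global.appRoot and the Audiobookshelf config directory.
function pickInstallDir(mainInstallDir, altInstallDir) {
  return isWritable(mainInstallDir) ? mainInstallDir : altInstallDir
}
```

Checking writability up front keeps the decision in one place, although a real download can still fail later, so the write itself would need its own error handling.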
2. Support for accent-insensitive search by using functions from the SQLean unicode sqlite3 extension
SQLEAN_UNICODE_PATH environment variable)
Note regarding pkg-ed binaries: the unicode extension cannot be packaged into a pkg-ed binary, since it is loaded by the sqlite3 native code, which doesn't have access to the pkg virtual file system, so it has to be downloaded as a dependency for pkg-ed binaries as well.
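For readers unfamiliar with the term, accent-insensitive matching can be sketched in plain JavaScript. This only illustrates the idea and is not how the SQLean extension implements it: `foldForSearch` and `matchesQuery` are hypothetical helpers that decompose characters with Unicode NFD normalization, drop combining marks, and lower-case the result before comparing.

```javascript
// Hypothetical helpers for illustration; not part of Audiobookshelf or SQLean.
function foldForSearch(text) {
  return text
    .normalize('NFD') // split "é" into "e" plus a combining acute accent
    .replace(/\p{M}/gu, '') // drop the combining marks
    .toLowerCase()
}

// True when the folded title contains the folded query.
function matchesQuery(title, query) {
  return foldForSearch(title).includes(foldForSearch(query))
}

console.log(matchesQuery('Les Mis\u00E9rables', 'miserables')) // true
```

Doing the folding inside the database, as this PR does, lets the comparison happen in the query itself rather than after the rows are loaded.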