Understanding the `disableChunked` option (#1200)
An explanation here may help me think more about #1201.
I am not behind a computer at the moment, and it is a long time ago that I wrote this code. I believe it is the following: by default this tokenizer reads files in chunks, which means it only reads the parts it really needs. These reads are driven by the parsers, so ideally they fetch just the metadata and not the audio itself. If you disable that mode, you essentially read from a stream. Which one is better depends on the scenario: chunked reading may be faster, or it may be slower, depending on the file type and network delay.
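To make the trade-off above concrete, here is a minimal sketch contrasting the two strategies. The `RemoteFile` class and its method names are illustrative assumptions, not the actual tokenizer API: chunked mode fetches only the byte ranges a parser asks for, while stream mode transfers everything up to the furthest byte needed.

```typescript
// Stand-in for an HTTP resource; names are hypothetical, not the library's API.
class RemoteFile {
  requests = 0;          // how many (range) requests were issued
  bytesTransferred = 0;  // how many bytes went over the wire
  constructor(private data: Uint8Array) {}

  // Chunked mode: fetch only the byte range a parser asks for.
  readRange(offset: number, length: number): Uint8Array {
    this.requests++;
    this.bytesTransferred += length;
    return this.data.subarray(offset, offset + length);
  }

  // Stream mode: one request; bytes arrive sequentially up to `upTo`.
  readStreamUpTo(upTo: number): Uint8Array {
    this.requests++;
    this.bytesTransferred += upTo;
    return this.data.subarray(0, upTo);
  }
}

const SIZE = 10_000_000; // a ~10 MB "remote file"

// A metadata parser that needs a 100-byte header and a 128-byte trailer:
const chunked = new RemoteFile(new Uint8Array(SIZE));
chunked.readRange(0, 100);
chunked.readRange(SIZE - 128, 128);
console.log(chunked.requests, chunked.bytesTransferred); // 2 requests, 228 bytes

// In stream mode the trailer forces transferring the whole file:
const streamed = new RemoteFile(new Uint8Array(SIZE));
streamed.readStreamUpTo(SIZE);
console.log(streamed.requests, streamed.bytesTransferred); // 1 request, 10000000 bytes
```

This is why chunked mode shines when parsers only need small, scattered regions (metadata), while stream mode wins when many small reads would each pay a network round trip.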
Thanks for the answer, @Borewit! I would like to know a bit more about the stream version. I see that the code calls
Hi there 👋

This is more of a question than a bug report or feature request. There is an undocumented option called `disableChunked`. I can see from the code that if it is set to `true`, the tokenization is performed in a different manner. However, is there any recommendation on when to use this flag, and why?

I've seen some file type detection for zip files take 9+ seconds in production from time to time. I've noticed that in such cases, up to 100 byte-range requests are made. Then I tried setting the `disableChunked` option to `true`, and that seemed to solve the problem. I'm thinking of always disabling the chunked tokenization, but I'd like to know the trade-offs, if any exist 🙏