This is probably not a mutagen issue, but something which may be of interest anyway. I did not try to reproduce the thing in other contexts, so it may be quite specific.
During bulk tag extraction from an NFS-mounted file system, passing buffering=4096 to the open() call in _utils.py
yields a large performance improvement (around 5x in my configuration).
Details:
Client system: "Ubuntu 22.04.1 LTS" Linux 5.15.0-56-generic Python 3.10.6
NFS server: Odroid hc4 : ARM running "Ubuntu 22.04.1 LTS" Linux 5.19.17-meson64
The volume is a 4TB spinning disk on the ARM system.
Without the buffering parameter, extracting tags from 3000 FLAC and MP3 files takes around 100 ms per file. With the buffering argument, this drops to around 22 ms per file.
I also did a quick test on a local SSD, on which the buffering does not appear to make a difference one way or another.
I ran these tests while trying to determine why recoll was slow at indexing NFS-mounted audio files. The workaround for the application is to open the file with a buffering argument before building the mutagen object.
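For reference, the application-side workaround can be sketched roughly as follows. This is a minimal sketch, not recoll's actual code: `open_buffered` and `read_tags` are hypothetical helper names, and it assumes a mutagen version whose `mutagen.File()` accepts an already-open file object.

```python
def open_buffered(path, buffer_size=4096):
    """Open `path` for binary reading with an explicit buffer size.

    Passing `buffering` bypasses Python's block-size heuristic.
    """
    return open(path, "rb", buffering=buffer_size)


def read_tags(path, buffer_size=4096):
    """Hypothetical helper: open the file ourselves, then hand it to mutagen."""
    import mutagen  # third-party; imported lazily so open_buffered works without it

    with open_buffered(path, buffer_size) as fileobj:
        # mutagen.File() takes a file object in place of a file name,
        # so mutagen never calls open() on the path itself.
        return mutagen.File(fileobj)
```

Calling `read_tags("/mnt/nfs/music/track.flac")` (the path is only an example) would then read the tags through the 4096-byte buffer.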
This actually appears to be a Python bug, since the Python manual's open() documentation says:
> Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device’s “block size” and falling back on [io.DEFAULT_BUFFER_SIZE](https://docs.python.org/3/library/io.html#io.DEFAULT_BUFFER_SIZE). On many systems, the buffer will typically be 4096 or 8192 bytes long.
So specifying buffering=4096 should be close to a no-op, and doing it as a precautionary default in mutagen should be innocuous enough.
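To see what the heuristic has to work with on a given mount, something like the following could be run against a file on the NFS volume and a file on the local SSD. It is only a diagnostic sketch: `buffer_size_hints` is a hypothetical helper, and it assumes the heuristic is driven by the `st_blksize` value the file system reports (as in CPython). Whether the NFS mount reports a much larger block size than 4096 is a guess here, not something I measured.

```python
import io
import os


def buffer_size_hints(path):
    """Return the values Python's default-buffering heuristic can draw on."""
    st = os.stat(path)
    return {
        # Preferred I/O block size reported by the file system;
        # the attribute is missing on some platforms, hence getattr.
        "st_blksize": getattr(st, "st_blksize", None),
        # Fallback used when no block size is available.
        "default_buffer_size": io.DEFAULT_BUFFER_SIZE,
    }
```

Comparing the `st_blksize` values for the two mounts should show whether the default buffer size differs between them.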