Hey, this is something I've wanted to do for a while, but I haven't come up with a good solution yet. It has been brought up a few times and there is a request for it in #528. I like the idea of using grok syntax.
There are a lot of edge cases to cover, and for that reason the current implementation is split up into multiple functions with regular expressions (see https://github.com/advplyr/audiobookshelf/blob/master/server/utils/scandir.js#L209).
For example, the series sequence can be taken from the title folder, but only if there is also a series folder. You could specify a publish year or a series sequence # as the first part of a book title folder name. If server setting parse subtitles is enabled then the last part of the book title folder if separated by
Audio file names only try to parse out a track and disc number but also support being separated into CD folders i.e.
I think it is possible to break this down into a list of grok strings and I think it would clean things up a lot.
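To make the edge cases above concrete, here is a minimal sketch of parsing a book title folder name. The accepted formats ("1994 - Title", "Vol 2 - Title", "Book 2 - Title", "2. Title") and the function name parseTitleFolder are assumptions for illustration, not the actual rules in scandir.js. The one rule taken from the comment is that a sequence is only read when a series folder exists.

```javascript
// Hypothetical sketch: the folder-name formats below are assumptions,
// not the actual behavior of audiobookshelf's scandir.js.
function parseTitleFolder(folderName, hasSeriesFolder) {
  const result = { title: folderName.trim(), publishYear: null, sequence: null }

  // Assumed format: "1994 - Title" (four-digit year, then " - ")
  const yearMatch = result.title.match(/^(\d{4}) - (.+)$/)
  if (yearMatch) {
    result.publishYear = yearMatch[1]
    result.title = yearMatch[2]
    return result
  }

  // Assumed formats: "Vol 2 - Title", "Book 2 - Title", "2. Title".
  // The sequence is only taken when there is also a series folder.
  if (hasSeriesFolder) {
    const seqMatch = result.title.match(
      /^(?:(?:vol|volume|book)\.? *)?(\d+(?:\.\d+)?)(?: - |\. )(.+)$/i
    )
    if (seqMatch) {
      result.sequence = seqMatch[1]
      result.title = seqMatch[2]
    }
  }
  return result
}
```

Even this small sketch needs an ordering decision (year before sequence), which is the kind of branching a list of grok strings could express declaratively.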
-
Great :-)
Of course... audiobooks are named in so many different ways that it is nearly impossible to catch all edge cases with a simple regex-based system. Some edge cases that I came across:
Yeah, I also think that. I also added shorthands (%s) and custom patterns for
That way you can specify multiple path patterns very easily, e.g. my personal structure:
BTW: I would LOVE to see support for
-
Oh, I forgot - since I don't know exactly how you extract the metadata from files, I assume you use
If you are willing to add an optional static 20MB dependency with
You could go for:
I could provide the following fields (and more):
Where the
-
Yes... very powerful and easy. Although I must say that I only tested it myself and there may be some issues atm... but nothing that is hard to fix.
Yes... also very powerful. You can embed multiple images in different formats (e.g.
Yes, that is one reason why I developed
You could also use
-
I'm not sure what you mean by this :-) What would you like to see? BTW: With
$ tone dump --format=json christmasmiscellany2018_01_various_64kb.mp3
{
  "meta": {
    "album": "A Christmas Miscellany 2018",
    "albumArtist": "",
    "artist": "Lucy Maud Montgomery",
    "chaptersTableDescription": "",
    "composer": "",
    "comment": "https://archive.org/details/a_christmas_miscellany_2018_1807_librivox",
    "conductor": "",
    "copyright": "",
    "description": "",
    "discNumber": 0,
    "discTotal": 0,
    "genre": "speech",
    "lyrics": null,
    "originalAlbum": "",
    "originalArtist": "",
    "popularity": 0.0,
    "publisher": "",
    "publishingDate": "0001-01-01T00:00:00",
    "recordingDate": "0001-01-01T00:00:00",
    "title": "01 - A Christmas Of Long Ago (1906)",
    "trackNumber": 1,
    "trackTotal": 0,
    "chapters": [],
    "embeddedPictures": [],
    "additionalFields": {
      "tlen": "437.63"
    }
  }
}
As you see, there are some rough edges regarding the default values, which should be prohibited (e.g.
You can also query specific property values by
$ tone dump --format=json christmasmiscellany2018_01_various_64kb.mp3 --query='$.meta.album'
A Christmas Miscellany 2018
And it is also possible to IMPORT metadata in this format:
tone tag --meta-tone-json-file="tone.json" my-audio-file.m4b
It is also planned to provide more data on the upper level, e.g.:
$ tone dump --format=json christmasmiscellany2018_01_various_64kb.mp3
{
  "meta": {
  },
  "audio": {
    "codec": "...",
    "duration": 23456
  },
  "file": {
    "size": 234666,
    "name": "my-audiobook.m4b",
    "created": "2022-02-12T12:34:56Z"
  }
}
But this is not fully implemented yet.
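As a rough illustration of how a Node.js consumer could deal with the default values in the first dump, the sketch below parses the JSON and drops fields that look unset. The function name normalizeToneMeta is hypothetical, and treating "", 0, null, empty arrays and "0001-01-01T00:00:00" as "unset" is an assumption based only on the sample output above, not documented tone behavior.

```javascript
// Placeholder date seen in the sample dump; treating it as "unset"
// is an assumption of this sketch.
const EMPTY_DATE = '0001-01-01T00:00:00'

// Parse the output of `tone dump --format=json` and keep only the
// fields in "meta" that carry an actual value.
function normalizeToneMeta(dumpJson) {
  const meta = JSON.parse(dumpJson).meta || {}
  const cleaned = {}
  for (const [key, value] of Object.entries(meta)) {
    const isEmpty =
      value === null ||
      value === '' ||
      value === 0 ||
      value === EMPTY_DATE ||
      (Array.isArray(value) && value.length === 0)
    if (!isEmpty) cleaned[key] = value
  }
  return cleaned
}
```

The JSON string could come from running the tone dump command shown above through Node's child_process.execFile. Note that this filter cannot tell a real track number 0 from a missing one, which is why fixing the defaults at the source would be the better solution.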
-
That would be the way to go. You could take a look at
Oh, and be aware that there is a bug in the tag library for
For
Example: https://github.com/sandreas/dockerhub-builds/blob/main/m4b-tool/latest/Dockerfile
-
I use atldotnet. Unfortunately there is no such mapping table, and I'm not really sure I understand what you are asking for... There is no such thing as a standardized mapping of metadata to different formats or specifications, just best practices. I also thought about publishing a best-practice guide, but this would be a lot of work. The best sources for mapping I found were these:
Some other resources I like to use as reference:
Based on these mapping tables and information, I created my personal mapping table of tags that are not supported natively in atldotnet (although you can always use
Of special interest is the field
BTW:
Well, this is easy. For ID3, visit the official spec: https://id3.org/id3v2.3.0. I recommend using 2.3.0 wherever possible, but there are also links to v2.4.0 (latest) and v2.2.0 (the one before). I think that previous versions are no longer relevant. ID3v1, which uses a fixed-size 128-byte TRAILER (at the end of the file), is often seen as a "fallback" and is still around, but it is not really useful in most cases.
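For reference, the fixed 128-byte ID3v1 trailer mentioned above is simple enough to read by hand. The sketch below follows the well-known ID3v1 layout ("TAG", then title, artist and album at 30 bytes each, year at 4 bytes, comment at 30 bytes, genre at 1 byte). The function name readId3v1 is made up for this example, and it is not how atldotnet or tone read tags.

```javascript
// ID3v1 layout (128 bytes at the very end of the file):
// "TAG"(3) + title(30) + artist(30) + album(30) + year(4) + comment(30) + genre(1)
function readId3v1(fileBuffer) {
  if (fileBuffer.length < 128) return null
  const tag = fileBuffer.subarray(fileBuffer.length - 128)
  if (tag.toString('latin1', 0, 3) !== 'TAG') return null

  // Fields are fixed-width and padded with NUL bytes or spaces.
  const text = (start, length) =>
    tag.toString('latin1', start, start + length).split('\0')[0].trimEnd()

  return {
    title: text(3, 30),
    artist: text(33, 30),
    album: text(63, 30),
    year: text(93, 4),
    comment: text(97, 30),
    genre: tag[127] // numeric genre index
  }
}
```

The 30-character limit per field is one reason the trailer is of little use for audiobooks, where titles and series names are often longer.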
I'm surprised that ID3 is your preferred format :-) I would never use mp3 for audiobooks again since I found
Please note that latest
The frames information might be of special interest - this shows the window of raw binary data (with metadata fields). So this is the core part of the audio file. A hash built over the data between these offsets would never change, even if the metadata changes. This is interesting for finding duplicates or keeping the metadata when files get moved around. Just in case you need this somewhere.
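The hashing idea can be sketched in a few lines of Node.js. Here audioStart and audioEnd are hypothetical inputs standing in for the frame offsets a tool reports, and hashAudioPayload is a made-up name. Only the bytes between the two offsets go into the hash, so changes to tags before or after that window do not affect it.

```javascript
const crypto = require('crypto')

// audioStart/audioEnd are hypothetical inputs: the byte offsets that
// delimit the audio frames, as reported by whatever inspected the file.
function hashAudioPayload(fileBuffer, audioStart, audioEnd) {
  if (audioStart < 0 || audioEnd > fileBuffer.length || audioStart >= audioEnd) {
    throw new RangeError('invalid audio offsets')
  }
  // Hash only the audio window, ignoring metadata around it.
  return crypto
    .createHash('sha256')
    .update(fileBuffer.subarray(audioStart, audioEnd))
    .digest('hex')
}
```

For large audiobook files a streaming read of the byte range would be preferable to loading the whole file into memory; the buffer version is only meant to show the principle.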
-
Just to mention something: https://github.com/Borewit/music-metadata
I don't know if you are aware of this... :-)
-
Hello @advplyr,
as you may know, I'm the author of m4b-tool and tone and I really love audiobookshelf.
The recommended directory structure implies having a directory for each audiobook, e.g.:
Currently, I'm struggling to get the series recognized with my existing directory structure, which is the following:
This results in audiobookshelf stacking the parts of a series into a single audiobook with multiple files. It is mainly because the current metadata extraction does not support movement-name (series) and movement (series-part) for m4b files (ffmpeg really sucks for mp4 metadata extraction). Sure, I could reorganize my whole library to match the audiobookshelf recommendation, but unfortunately that would result in a ton of work.
Instead, I had an idea: how about providing a SETTING for how the audiobook directory structure is?
In m4b-tool I use --batch-pattern, in tone it is --path-pattern (which I would prefer as name), so here is my feature description:
path-pattern for the Library, where you can put multiple path patterns that use grok-js-like syntax to match part of the path with metadata fields
path-pattern is gonna match first
The grok-js library is pretty old and has not been updated for a long time, but I think there are other implementations... it should also not be too hard to implement (here is the C# implementation I use for tone: https://github.com/Marusyk/grok.net)
What do you think?
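To give a feel for how small such an implementation could be, here is a minimal sketch of grok-like path patterns in JavaScript. The %{NAME:field} placeholder style follows general grok syntax. The building blocks SEGMENT and NUMBER, the function names, and the first-match-wins ordering are assumptions for illustration, not the syntax of tone, m4b-tool or grok-js.

```javascript
// Hypothetical building blocks; real grok libraries ship a much larger set.
const BASE_PATTERNS = {
  SEGMENT: '[^/]+',          // one path segment
  NUMBER: '\\d+(?:\\.\\d+)?' // e.g. "2" or "2.5"
}

// Turn "%{SEGMENT:author}/%{SEGMENT:title}" into a RegExp with named groups.
function compilePathPattern(pattern) {
  const source = pattern
    .split(/(%\{\w+:\w+\})/) // keep the placeholders as separate parts
    .map((part) => {
      const m = part.match(/^%\{(\w+):(\w+)\}$/)
      // Anything that is not a placeholder is literal text: escape it.
      if (!m) return part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
      const base = BASE_PATTERNS[m[1]]
      if (!base) throw new Error('Unknown pattern: ' + m[1])
      return '(?<' + m[2] + '>' + base + ')'
    })
    .join('')
  return new RegExp('^' + source + '$')
}

// Try each pattern in order; the first one that matches wins.
function matchPath(relPath, patterns) {
  for (const pattern of patterns) {
    const match = compilePathPattern(pattern).exec(relPath)
    if (match) return { ...match.groups }
  }
  return null
}
```

With the patterns '%{SEGMENT:author}/%{SEGMENT:series}/%{NUMBER:part} - %{SEGMENT:title}' and '%{SEGMENT:author}/%{SEGMENT:title}', a three-level path with a numbered title folder yields author, series, part and title, while a two-level path falls through to the second pattern. Because the first match wins, more specific patterns have to be listed first.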