-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Normalize the database #345
Comments
While the import process might take a little longer the processed import batch will most probably contain multiple tracks with the same band & album. Might I also suggest adding a boolean to album to be able to distinguish compilations from ordinary albums? |
Thanks @lGuillaume124 for letting us in the loop, that's really thoughtful! I like this new database proposition. There are clear flaws in the current one that this one addresses. Onto the review now:
And that's pretty much it! I can't wait to see the new DB design, as you said, it will allows us to do very cool new features! @gs11 :
I don't see what issue it addresses 😕 |
I think the genre column should be moved to album as well as you'll probably want to browse albums by genre. An issue is of course the quality of the tags. This would require the first track to be imported from an album to actually contain year/genre for them to be stored for the album.
Yeah, I realize this is may be less useful and more complex to implement. It's a nice feature in Logitech Media Server (LMS) to browse for compilations but I think LMS actually looks at albums where the artists differ for tracks and not the (kind of ugly) tag ITUNESCOMPILATION . |
So the year field of album will correspond to the year of the first track imported for this album, right? |
Yep, that's it. And the same goes for the genre, as @gs11 mentioned. I don't think it's too problematic to have some heuristic like that (e.g. considering that some metadata in the first track in valid for all the tracks in the album). In a later milestone, it could even be possible to display when there are some descrepancies in the metadata of tracks of the same album, so that the admin would be able to fix them and re-upload the tracks, and thus helping him or her to keep a nice and clean library. |
Hello everyone, The first thing I want to propose is a table for artists. Also, the genre is not only a property of a track, but also of an album and even an artist. I would propose the following scheme (ignoring for now how to populate it from tags): CREATE TABLE `artists` (
`gid` INT NOT NULL AUTO_INCREMENT,
`name` varchar NOT NULL UNIQUE,
PRIMARY KEY (`gid`)
);
CREATE TABLE `genres` (
`gid` INT NOT NULL AUTO_INCREMENT,
`name` varchar NOT NULL,
`description` varchar,
PRIMARY KEY (`gid`)
);
CREATE TABLE `bands` (
`gid` INT NOT NULL AUTO_INCREMENT,
`name` varchar NOT NULL,
`inherits_genres` BOOLEAN NOT NULL DEFAULT '0',
PRIMARY KEY (`gid`)
);
CREATE TABLE `albums` (
`gid` INT NOT NULL AUTO_INCREMENT,
`name` varchar NOT NULL UNIQUE,
`year` INT,
`band_id` INT,
`ndiscs` INT NOT NULL DEFAULT '0',
`inherits_genres` BOOLEAN NOT NULL DEFAULT '0',
`inherits_artists` BOOLEAN NOT NULL DEFAULT '1',
PRIMARY KEY (`gid`)
);
CREATE TABLE `discs` (
`gid` INT NOT NULL AUTO_INCREMENT,
`disc_number` INT,
`album_id` INT NOT NULL,
`ntracks` INT,
PRIMARY KEY (`gid`)
);
CREATE TABLE `tracks` (
`gid` INT NOT NULL AUTO_INCREMENT,
`title` varchar,
`source_path` varchar NOT NULL UNIQUE,
`track_number` INT,
`year` INT,
`updated` TIMESTAMP NOT NULL,
`disc_id` INT NOT NULL,
`inherits_genres` BOOLEAN NOT NULL DEFAULT '0',
`inherits_artists` BOOLEAN NOT NULL DEFAULT '1',
`inherits_year` BOOLEAN NOT NULL DEFAULT '1',
PRIMARY KEY (`gid`)
);
CREATE TABLE `artist_band` (`artist_id` INT NOT NULL, `band_id` INT NOT NULL);
CREATE TABLE `artist_album` (`artist_id` INT NOT NULL, `album_id` INT NOT NULL);
CREATE TABLE `artist_track` (`artist_id` INT NOT NULL, `track_id` INT NOT NULL);
CREATE TABLE `genre_artist` (`genre_id` INT NOT NULL, `artist_id` INT NOT NULL);
CREATE TABLE `genre_band` (`genre_id` INT NOT NULL, `band_id` INT NOT NULL);
CREATE TABLE `genre_album` (`genre_id` INT NOT NULL, `album_id` INT NOT NULL);
CREATE TABLE `genre_track` (`genre_id` INT NOT NULL, `track_id` INT NOT NULL);
ALTER TABLE `albums` ADD CONSTRAINT `albums_fk0` FOREIGN KEY (`band_id`) REFERENCES `bands`(`gid`);
ALTER TABLE `discs` ADD CONSTRAINT `discs_fk0` FOREIGN KEY (`album_id`) REFERENCES `albums`(`gid`);
ALTER TABLE `tracks` ADD CONSTRAINT `tracks_fk0` FOREIGN KEY (`disc_id`) REFERENCES `discs`(`gid`);
ALTER TABLE `artist_band` ADD CONSTRAINT `artist_band_fk0` FOREIGN KEY (`artist_id`) REFERENCES `artists`(`gid`);
ALTER TABLE `artist_band` ADD CONSTRAINT `artist_band_fk1` FOREIGN KEY (`band_id`) REFERENCES `bands`(`gid`);
ALTER TABLE `artist_album` ADD CONSTRAINT `artist_album_fk0` FOREIGN KEY (`artist_id`) REFERENCES `artists`(`gid`);
ALTER TABLE `artist_album` ADD CONSTRAINT `artist_album_fk1` FOREIGN KEY (`album_id`) REFERENCES `albums`(`gid`);
ALTER TABLE `artist_track` ADD CONSTRAINT `artist_track_fk0` FOREIGN KEY (`artist_id`) REFERENCES `artists`(`gid`);
ALTER TABLE `artist_track` ADD CONSTRAINT `artist_track_fk1` FOREIGN KEY (`track_id`) REFERENCES `tracks`(`gid`);
ALTER TABLE `genre_artist` ADD CONSTRAINT `genre_artist_fk0` FOREIGN KEY (`genre_id`) REFERENCES `genres`(`gid`);
ALTER TABLE `genre_artist` ADD CONSTRAINT `genre_artist_fk1` FOREIGN KEY (`artist_id`) REFERENCES `artists`(`gid`);
ALTER TABLE `genre_band` ADD CONSTRAINT `genre_band_fk0` FOREIGN KEY (`genre_id`) REFERENCES `genres`(`gid`);
ALTER TABLE `genre_band` ADD CONSTRAINT `genre_band_fk1` FOREIGN KEY (`band_id`) REFERENCES `bands`(`gid`);
ALTER TABLE `genre_album` ADD CONSTRAINT `genre_album_fk0` FOREIGN KEY (`genre_id`) REFERENCES `genres`(`gid`);
ALTER TABLE `genre_album` ADD CONSTRAINT `genre_album_fk1` FOREIGN KEY (`album_id`) REFERENCES `albums`(`gid`);
ALTER TABLE `genre_track` ADD CONSTRAINT `genre_track_fk0` FOREIGN KEY (`genre_id`) REFERENCES `genres`(`gid`);
ALTER TABLE `genre_track` ADD CONSTRAINT `genre_track_fk1` FOREIGN KEY (`track_id`) REFERENCES `tracks`(`gid`); To explain a bit:
This "recursive" design allows powerful combination of genres and artists and is very versatile. I hope this can inspire you in some way ;) |
I don't know how difficult it would be to implement this model, but I like it. I like it especially because it helps to have a nice search algorithm (we could filter by genre for instance). That being said, I'd like to be a bit pragmatic in this case. If it's too much hassle to do this design, I'd rather have the simpler design than nothing. So it's up to @IGuillaume124 to decide in the end 😉 |
I don't think it is difficult to implement, except for the automatic population from file tags. But I agree, that's up to @lGuillaume124 in the end ^^ |
Thank you @csdt I didn't expect such contribution! Your proposition is very interesting, but I'm afraid it take too much time to me to implement it. I don't want to start something and abort in a few months because I'm stuck. I'm already struggling with the new simple schema I proposed earlier 😕 As @MightyCreak said:
I prefer starting to work on something smaller, and, why not, go to something like your proposition after several more iteration. Your work stay here, the issue will remain open :) I hope you understand. So, does everyone agree with the following schema? CREATE TABLE `albums` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`cover` varchar(37) DEFAULT NULL,
`band_id` int(11) DEFAULT NULL,
`year` int(4) DEFAULT NULL,
`created` timestamp NOT NULL DEFAULT current_timestamp(),
`updated` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `albums_bands_id_fk` (`band_id`),
KEY `albums_name_index` (`name`),
CONSTRAINT `albums_bands_id_fk` FOREIGN KEY (`band_id`) REFERENCES `bands` (`id`)
)
CREATE TABLE `bands` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`created` timestamp NOT NULL DEFAULT current_timestamp(),
`updated` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `bands_name_index` (`name`)
)
CREATE TABLE `tracks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) DEFAULT NULL,
`source_path` varchar(1024) NOT NULL,
`playtime` varchar(9) DEFAULT NULL,
`track_number` int(5) DEFAULT NULL,
`disc` varchar(7) DEFAULT NULL,
`year` int(4) DEFAULT NULL,
`genre` varchar(255) DEFAULT NULL,
`artist` varchar(255) DEFAULT NULL,
`created` timestamp NOT NULL DEFAULT current_timestamp(),
`album_id` int(11) DEFAULT NULL,
`imported` tinyint(1) NOT NULL DEFAULT 1,
`updated` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `tracks_albums_id_fk` (`album_id`),
KEY `tracks_title_index` (`title`),
KEY `tracks_artist_index` (`artist`),
CONSTRAINT `tracks_albums_id_fk` FOREIGN KEY (`album_id`) REFERENCES `albums` (`id`)
) |
I understand, and I'm completely fine with this. Just a quick question about your scheme: CREATE TABLE `tracks` (
...
`track_number` int(5) DEFAULT NULL,
`max_track_number` int(5) DEFAULT NULL,
`disc_number` int(5) DEFAULT NULL,
`max_disc_number` int(5) DEFAULT NULL,
...
) Also, I would recommend a |
Completely agree with @csdt's comment. I'm also wondering what is the purpose of And finally, I would really like to have the total time of an album displayed in the UI. It could be done in CPU ( |
Personally, I prefer to compute the sum every single time: like that, you are always sure your database is in a consistent state. Performance wise, if you have an index on |
I completely trust you on that one 😉 |
I've added your proposition @csdt
Used during my tests, it shouldn't be here sorry. So here we are! CREATE TABLE `albums` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`cover` varchar(37) DEFAULT NULL,
`band_id` int(11) DEFAULT NULL,
`year` int(4) DEFAULT NULL,
`created` timestamp NOT NULL DEFAULT current_timestamp(),
`updated` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `albums_bands_id_fk` (`band_id`),
KEY `albums_name_index` (`name`),
CONSTRAINT `albums_bands_id_fk` FOREIGN KEY (`band_id`) REFERENCES `bands` (`id`)
)
CREATE TABLE `bands` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`created` timestamp NOT NULL DEFAULT current_timestamp(),
`updated` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `bands_name_index` (`name`)
)
CREATE TABLE `tracks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) DEFAULT NULL,
`source_path` varchar(4096) NOT NULL,
`playtime` varchar(9) DEFAULT NULL,
`track_number` int(5) DEFAULT NULL,
`max_track_number` int(5) DEFAULT NULL,
`disc_number` int(5) DEFAULT NULL,
`max_disc_number` int(5) DEFAULT NULL,
`year` int(4) DEFAULT NULL,
`genre` varchar(255) DEFAULT NULL,
`artist` varchar(255) DEFAULT NULL,
`created` timestamp NOT NULL DEFAULT current_timestamp(),
`album_id` int(11) DEFAULT NULL,
`updated` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `tracks_albums_id_fk` (`album_id`),
KEY `tracks_title_index` (`title`),
KEY `tracks_artist_index` (`artist`),
CONSTRAINT `tracks_albums_id_fk` FOREIGN KEY (`album_id`) REFERENCES `albums` (`id`)
) Thank you everyone! |
You seem to have forgotten to change Otherwise, I'm fine with it. |
Hi, Some news of the project. I almost finished to rewrite the synchronization system (i.e the code to import / updated and clean the database) (#348). It's been a long and hard way, but I learned a lot of new things and the most difficult part of the refactoring is now done. The new import page looks like this (sorry for the bad quality, click to enlarge): Errors and bad files are properly handled. Now I focus on the CLI to allow basic, but reliable automation, then I will write the new controllers and views to use the new database schema, and Sonerezh 2.0.0 will be ready \o/ |
I take a quick look on the current database and on the new proposition. I do not know how are your files on disks, but mine are usually one directory/one album. |
@Nizhile For albums with various artists, setting only the artist tag on each track is not enough, you also need to set the band (a.k.a. album artist). I don't think using string values as DB keys is a good practice. All in all, since the folder tree of a music library may vary from user to user, I think it's better to rely solely on the songs ID3 tags. It's more explicit and easier to fix: if there's a problem, look at the tags. |
@lGuillaume124 I just realized that most of my songs metadata need review. MusicBrainz provides covers that trigger timeout on import due to image too large, and most of my old compilations do not provide any album artist (Various or whatever). Thanks for the work anyway. |
I encourage you to import your library progressively so you can detect which album is problematic (and remove it for the time being). In the end you'll only have (hopefully) a small list of music to re-tag. |
Yes, I noticed that, but unfortunately I can't do anything on this third-party library…
I'm using http://beets.io/ to manage my collection. It's the best tool I know to do such a work! |
@Nizhile here's what a properly tagged compilation should look like:
|
@MightyCreak , I'll write some script to review the files 😁 @lGuillaume124, I usually use mp3tag for all tags update. Original files created using CDDB information 😕. Anyway, for more recent files, the current database schema is sufficient and rendering is great. |
The digest for thumbnails should probably be changed in a next release. The example in #345 (comment) shows an album with correct albumartist set. Anyway, thumbnails are computed using |
I changed once more this to hash directly image content as the thumbnail name. My collection is still a test one (8 albums) but even using very big covers, import does not trigger problem using php -S. I take a look on 2.0 branch and I notice that there is no general table. Usually, you can use such a table to store the database schema version. Then application can trigger database migrations according to supported schema version. |
I didn't open any issue about this yet, but I was thinking to include such information in a new create table settings
(
id int auto_increment,
name varchar(255) not null,
value tinytext null,
constraint settings_pk
primary key (id)
);
create unique index settings_name_uindex
on settings (name); |
@gs11 @MightyCreak you contributed a lot to Sonerezh and I would like to have your opinion about this important new step.
As you know by now the database has a big table
songs
in which all the Sonerezh's stuff is stored. This lead to performance issues on views, duplicate information in the database, no flexibility, etc.I started to think about the new schema, and to normalize it. Here is what I have, instead of one single table:
bands
with 2 columns:id
andname
albums
with 4 columns:id
,name
,cover
andband_id
tracks
with all the stuff related to the metadata plus a columnalbum_id
We have relationships between these tables:
Let's illustrate it:
With this schema, I'll be able to create very good new views like one dedicated to a band or an album with custom images / banner from fanart.tv or whatever, just like Spotify and others does. The interface should be more reactive and the database shouldn't be overloaded anymore when you visit the
/albums
page.But, the import process should be far much longer because for each file imported we must:
band
table, or create it otherwisealbum
table, or create it otherwiseAny help from anyone will be very appreciated :)
The text was updated successfully, but these errors were encountered: