Add 'small' subset #13

andimarafioti · 2019-06-27T12:42:06Z

Hi! Thanks for the dataset. It would be useful for me if you provided a 'small' subset like FMA's: 8,000 tracks of 30 s each across 8 balanced genres (GTZAN-like), 7.2 GiB in total. I know I could make a subset myself with the script cited in the README, but I would need to download 100x the amount of data I want and then process it. If you think it's worth it and are willing to host it, I can also make the subset myself and upload it somewhere. Thanks!

dbogdanov · 2019-06-27T13:42:15Z

Hi @andimarafioti, yes we are working on that ;-) Will update soon.

abugler · 2021-02-22T01:02:17Z

Hi! Is there an update on the small subset? Thank you so much.

dbogdanov · 2023-03-06T15:39:21Z

Note that we have included lower-bitrate mono audio downloads that reduce the download size of the full dataset from 508 GB to 156 GB. I assume this is still not small enough for a "small" dataset...

We lack a specific proposal for what the small subset should include. Should it cover all tags in MTG-Jamendo or a subset of tags?

Another alternative is to create a version of the full dataset with audio fragments instead of full tracks. Using 2-minute or 30-second fragments for each track reduces the total dataset duration from ~3778 hours to 1856.7 or 464 hours, respectively. The low-bitrate mono 30-second fragment version would take ~19 GB, which is very reasonable.
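For anyone wanting to check these numbers, here is a rough back-of-the-envelope sketch. It only uses the figures quoted in this thread (3778 h, 1856.7 h, 464 h, 156 GB) and assumes that download size scales linearly with audio duration, which is an approximation. The function names are made up for illustration, and the track count it derives is inferred from the quoted hours, not an official figure.

```python
# Figures quoted in this thread.
FULL_HOURS = 3778.0           # approximate duration of the full-track dataset
LOW_BITRATE_MONO_GB = 156.0   # size of the full low-bitrate mono download


def scaled_size_gb(full_size_gb, full_hours, subset_hours):
    """Estimate the size of a shorter version of the dataset.

    Assumption: size is proportional to duration (roughly constant bitrate).
    """
    return full_size_gb * subset_hours / full_hours


def implied_track_count(total_hours, fragment_seconds):
    """Number of tracks implied by a total duration of fixed-length fragments.

    Assumption: every track contributes exactly one full-length fragment.
    """
    return round(total_hours * 3600 / fragment_seconds)


if __name__ == "__main__":
    for label, hours, seconds in [("2 min", 1856.7, 120), ("30 s", 464.0, 30)]:
        size = scaled_size_gb(LOW_BITRATE_MONO_GB, FULL_HOURS, hours)
        tracks = implied_track_count(hours, seconds)
        print(f"{label} fragments: ~{size:.1f} GB, ~{tracks} tracks implied")
```

Under these assumptions the 30-second estimate comes out close to the ~19 GB mentioned above.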

dbogdanov · 2023-03-08T16:52:38Z

Related to this, @philtgun previously created subsets of MTG-Jamendo with one random track per artist (5 random trials) and one random track per album, to look at the statistics (autotagging_toy_0..4 and autotagging_toy_album_0). Leaving this here for reference.
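To illustrate the idea of "one random track per artist" with several random trials, here is a minimal sketch. It is not the script that produced the autotagging_toy splits. The `track_id`, `artist_id`, and `album_id` field names, the toy records, and the `one_track_per_group` helper are all hypothetical.

```python
import random
from collections import defaultdict


def one_track_per_group(tracks, key, seed):
    """Pick one random track for each distinct value of `key`.

    `tracks` is a list of dicts; `key` is a field name such as "artist_id".
    A fixed seed makes each trial reproducible.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for track in tracks:
        groups[track[key]].append(track)
    # Iterate groups in sorted order so the result depends only on the seed.
    return [rng.choice(groups[group]) for group in sorted(groups)]


# Hypothetical metadata records, for illustration only.
toy_tracks = [
    {"track_id": 1, "artist_id": "a1", "album_id": "b1"},
    {"track_id": 2, "artist_id": "a1", "album_id": "b2"},
    {"track_id": 3, "artist_id": "a2", "album_id": "b3"},
    {"track_id": 4, "artist_id": "a2", "album_id": "b3"},
    {"track_id": 5, "artist_id": "a3", "album_id": "b4"},
]

# Five per-artist trials and one per-album trial, mirroring the setup above.
artist_trials = [one_track_per_group(toy_tracks, "artist_id", s) for s in range(5)]
album_trial = one_track_per_group(toy_tracks, "album_id", 0)
```

Each trial contains exactly one track per artist (or per album), so it is much smaller than the full set while still covering every artist.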

dbogdanov assigned philtgun Jun 27, 2019

philtgun added the enhancement (New feature or request) label Jul 29, 2019
