unreadable noise signals in the fma_small subset #1

Closed
ChangLee0903 opened this issue Oct 6, 2022 · 1 comment

ChangLee0903 commented Oct 6, 2022

Dear MultiSV members,

I just noticed that some unused noise signals cannot be loaded with either torchaudio or librosa, such as 'fma_small/108/108925', 'fma_small/099/099134', and 'fma_small/133/133297'. Could there be some broken files in the noise dataset?

best regards,
Chi-Chang Lee

Collaborator

Lamomal commented Dec 16, 2022

Dear Chi-Chang,

Thank you for bringing this up. Indeed, the noise files you mentioned do not contain any audio. This is a known issue with the FMA dataset; for reference, please see the FMA wiki and a related issue. In line with your comment, MultiSV users may see warnings about these files during data preparation, and these can be safely ignored. Even if the FMA small dataset is downloaded correctly (as indicated by the create_training_data.sh script, which compares checksums), the erroneous files will trigger the following warnings during the conversion to wav:
WARNING: conversion failed for: <output_dir>/noises_training/fma_small/108/108925.mp3
WARNING: conversion failed for: <output_dir>/noises_training/fma_small/099/099134.mp3
WARNING: conversion failed for: <output_dir>/noises_training/fma_small/133/133297.mp3

Since MultiSV does not use these three files, they do not pose an issue for the corpus.
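
For anyone who would rather handle these files explicitly than rely on ignoring the warnings, here is a minimal Python sketch. It is not part of MultiSV: the names `KNOWN_EMPTY_TRACKS`, `split_known_empty`, and `find_unreadable` are hypothetical, and the only facts it relies on are the three track IDs listed above. The `load` argument is assumed to be any callable that raises on an unreadable file, for example `librosa.load` or `torchaudio.load`.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

# Track IDs reported in this issue as containing no audio.
KNOWN_EMPTY_TRACKS = {"108925", "099134", "133297"}


def split_known_empty(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate usable paths from the three known-empty FMA tracks."""
    usable: List[str] = []
    skipped: List[str] = []
    for p in paths:
        # Path.stem drops the directories and the extension, if any:
        # ".../108/108925.mp3" -> "108925"
        if Path(p).stem in KNOWN_EMPTY_TRACKS:
            skipped.append(p)
        else:
            usable.append(p)
    return usable, skipped


def find_unreadable(paths: Iterable[str], load: Callable[[str], object]) -> List[str]:
    """Return every path for which the given loader raises an exception."""
    failed: List[str] = []
    for p in paths:
        try:
            load(p)
        except Exception:
            # Any loader error is treated as "unreadable" in this sketch.
            failed.append(p)
    return failed
```

A typical use would be to call `split_known_empty` on a custom file list before loading, and to run `find_unreadable` once over the dataset to check that nothing beyond the three known files fails.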

Best,
Ladislav

Lamomal closed this as completed May 12, 2023