This repository has been archived by the owner on Feb 1, 2024. It is now read-only.

tfrecord features #52

Open
gladys0313 opened this issue Jun 6, 2022 · 3 comments



gladys0313 commented Jun 6, 2022

Hello, I'm very interested in your amazing work. After taking a deep look at the URMP tfrecord datasets, I found a few things a bit confusing. Could you be so kind as to help?

  1. I checked some of the urmp_tfrecords data and found that some files contain the following features: {"audio", "f0_confidence", "f0_hz", "f0_time", "id", "instrument_id", "loudness_db", "note_active_frame_indices", "note_active_velocities", "note_offsets", "note_onsets", "orig_f0_hz", "orig_f0_time", "power_db", "recording_id", "sequence"}. However, others don't include {"orig_f0_hz", "orig_f0_time"}. Why is that, and does this inconsistency affect model training?
  2. I want to include piano music when I train my own model. To do this, I think I need to generate tfrecords with the same content as the URMP ones you used in your model. I plan to use the MAESTRO dataset. Could you be so kind as to point to tfrecord generation code we could take as a reference, like the one you used to generate tfrecords for the MIDI-DDSP model?
  3. What is the difference between the "batched" and "unbatched" datasets?

Thank you very much for your help in advance.
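
For anyone hitting the same key inconsistency, a quick way to check which feature keys a given TFRecord file actually contains is to parse its first serialized example (the file path here is a hypothetical example, not a real URMP shard name):

```python
import tensorflow as tf

def feature_keys(path):
    """Return the sorted feature keys of the first example in a TFRecord file."""
    for raw in tf.data.TFRecordDataset(path).take(1):
        example = tf.train.Example()
        example.ParseFromString(raw.numpy())
        return sorted(example.features.feature.keys())
    return []

# Hypothetical usage:
# feature_keys("urmp_vn_solo.tfrecord-00000-of-00001")
```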


lukewys commented Aug 1, 2022

Hi! Thanks so much for your interest! Sorry for the late reply.

  1. Only the keys used here (https://github.com/magenta/ddsp/blob/d1e9b555bf7ef6541d6c9a820b2e3941777c35c8/ddsp/training/data.py#L495) are useful.
  2. Unfortunately I also do not have the dataset generation code. It was written in Google's internal codebase. However, I would not recommend training DDSP-related models on a piano dataset, as DDSP is not particularly good at synthesizing polyphonic instruments. Also, some features in MIDI-DDSP are designed for monophonic instruments, such as vibrato.
  3. "batched" means the data is chunked into 4-second samples, used to train DDSP inference and the synthesis generator. The other one, "unbatched", has one sample per audio recording and is used to train the expression generator.
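
To make point 1 concrete: since the data provider only reads a subset of the keys, the extra keys in some records can simply be dropped. The key list below is illustrative only; the authoritative list is the one in the linked data.py:

```python
# Illustrative subset of keys the data provider reads; take the real list
# from ddsp/training/data.py (linked above), not from this sketch.
USED_KEYS = frozenset({
    "audio", "f0_hz", "f0_confidence", "loudness_db",
    "note_active_frame_indices", "note_active_velocities",
    "instrument_id", "recording_id", "power_db",
})

def keep_used_features(example):
    """Drop keys the model never reads, e.g. orig_f0_hz / orig_f0_time."""
    return {k: v for k, v in example.items() if k in USED_KEYS}
```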

@vivienseguy

I am also trying to make tfrecords from guitar solos. Occasionally they're polyphonic, but that's rare. Did anyone get tfrecord generation code working, or find more info about the data format?


lukewys commented Sep 12, 2023

Hi @vivienseguy, unfortunately I don't have the code for generating the tfrecords. You could look into the dataset format details explained in this issue and in the issue that references it (#59), and then generate your tfrecord dataset in the same format with the same keys.
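
As a rough sketch of that suggestion, records with matching keys can be built with tf.train.Example. The key set and shapes here are illustrative, not the exact spec; the full set of required keys and their framing is discussed earlier in this issue:

```python
import tensorflow as tf

def _floats(values):
    return tf.train.Feature(float_list=tf.train.FloatList(value=list(values)))

def _bytes(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def make_example(audio, f0_hz, loudness_db, recording_id):
    """Build one tf.train.Example with a URMP-style (illustrative) key subset."""
    feature = {
        "audio": _floats(audio),
        "f0_hz": _floats(f0_hz),
        "loudness_db": _floats(loudness_db),
        "recording_id": _bytes(recording_id.encode("utf-8")),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

def write_tfrecord(path, examples):
    """Serialize a list of tf.train.Example protos into one TFRecord file."""
    with tf.io.TFRecordWriter(path) as writer:
        for ex in examples:
            writer.write(ex.SerializeToString())
```

Hypothetical usage: `write_tfrecord("guitar_solos.tfrecord", [make_example(audio, f0, loud, "solo_001")])`, with each array framed at whatever sample/frame rates the data provider expects.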
