I'm training a diffusion model on a dataset of songs using this package. After about 10 epochs (1000 samples each) the loss appears to converge, but when I sample from the model I get pure noise. My intuition is that even if the model has converged to a local minimum, or I haven't trained for long enough, it should still produce some structured output (garbage in, garbage out should still yield something other than pure noise). That leads me to believe the problem is in how I'm generating the audio. I've attached my code below.
Any suggestions? Is there anything more I need to provide?
import torch
import torchaudio


def generate_samples(model, num_samples, sample_rate, audio_length_seconds, device):
    """
    Generate audio samples from the trained diffusion model.
    :param model: The trained diffusion model.
    :param num_samples: The number of audio samples to generate.
    :param sample_rate: The sample rate of the audio.
    :param audio_length_seconds: The length of the audio to generate, in seconds.
    :param device: The device ('cpu' or 'cuda') to run the sampling on.
    :return: A tensor containing the generated audio samples.
    """
    # torch.randn needs an integer length, so cast in case either factor is a float
    audio_length = int(sample_rate * audio_length_seconds)
    # Initialize with random noise, shaped (batch, channels, time)
    noise = torch.randn(num_samples, 1, audio_length, device=device)
    model.eval()
    with torch.no_grad():
        samples = model.sample(noise, num_steps=100)
    return samples


# audio_path is defined elsewhere in my script; only sample_rate is used below
waveform, sample_rate = torchaudio.load(audio_path)

# Example usage after training the model:
num_samples = 1  # Number of samples to generate
#sample_rate = dataset.sample_rate  # Sample rate of the audio
audio_length_seconds = 20  # Length of the audio to generate, in seconds
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Generate samples
generated_audio = generate_samples(model, num_samples, sample_rate, audio_length_seconds, device)

# Save the generated samples as FLAC files
for i, audio_tensor in enumerate(generated_audio):
    filename = f"generated_sample_{i+1}.flac"
    # Each audio_tensor has shape (channels, time), which is what torchaudio.save expects
    torchaudio.save(filename, audio_tensor.cpu(), sample_rate)
    print(f"Saved: {filename}")
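
To tell whether the output is really white noise rather than just loud or clipped audio, a rough numerical check along these lines might help. This is only a minimal sketch: `waveform_stats` is a hypothetical helper, not part of the package, and it works on a plain list of floats so it does not depend on torch. The idea is that white noise has a lag-1 autocorrelation near 0, while music or any smooth signal sits much closer to 1. The synthetic noise and 220 Hz tone are made-up reference signals for comparison.

```python
import math
import random


def waveform_stats(samples):
    """Summarise a mono waveform given as a sequence of floats (hypothetical helper)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    energy = sum(c * c for c in centered)
    # Lag-1 autocorrelation: near 0 for white noise, near 1 for smooth signals
    if energy == 0.0:
        lag1 = 0.0
    else:
        lag1 = sum(a * b for a, b in zip(centered, centered[1:])) / energy
    return {
        "peak": max(abs(s) for s in samples),
        "rms": math.sqrt(sum(s * s for s in samples) / n),
        # Fraction of samples outside [-1, 1], the usual range for float audio
        "clipped_fraction": sum(1 for s in samples if abs(s) > 1.0) / n,
        "lag1_autocorr": lag1,
    }


# Synthetic reference signals: unit-variance Gaussian noise and a 220 Hz sine tone
random.seed(0)
rate = 8000
noise = [random.gauss(0.0, 1.0) for _ in range(rate)]
tone = [0.5 * math.sin(2 * math.pi * 220 * t / rate) for t in range(rate)]

print("noise:", waveform_stats(noise))
print("tone: ", waveform_stats(tone))
```

For the generated tensor, something like `waveform_stats(generated_audio[0, 0].cpu().tolist())` should give comparable numbers. If the result looks like the noise reference (lag-1 autocorrelation near 0, a large clipped fraction), the sampler output is probably still close to the initial Gaussian noise.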