Signal-Processing

Signal Processing with Python and Librosa

#1) Voice Reconstruction Using VQ-VAE

This notebook proposes a method for reconstructing speech with a VQ-VAE, a model first introduced by van den Oord et al. [2].

#2) VQ-VAE vs VAE

The main difference between a VQ-VAE and a VAE is that a VAE learns a continuous latent representation of a given dataset, whereas a VQ-VAE learns a discrete latent representation.
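
To make the distinction concrete, here is a minimal NumPy sketch. All names (`mu`, `log_var`, `codebook`) and all sizes are illustrative assumptions and are not taken from the notebooks. It shows a VAE-style latent as a real-valued sample, and a VQ-VAE-style latent as an integer index into a finite codebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- VAE: continuous latent, sampled with the reparameterization trick ---
# Hypothetical encoder outputs for one input: mean and log-variance.
mu = np.array([0.2, -1.0, 0.5])
log_var = np.array([-0.5, 0.1, -1.2])
eps = rng.standard_normal(mu.shape)
z_continuous = mu + np.exp(0.5 * log_var) * eps  # can be any real-valued vector

# --- VQ-VAE: discrete latent, chosen as the nearest codebook entry ---
# Hypothetical codebook with k = 4 entries of dimension d = 3.
codebook = rng.standard_normal((4, 3))
z_e = np.array([0.2, -1.0, 0.5])                 # hypothetical encoder output
distances = np.linalg.norm(codebook - z_e, axis=1)
index = int(np.argmin(distances))                # an integer in {0, ..., k-1}
z_discrete = codebook[index]                     # one of only k possible vectors
```

In this sketch the VQ-VAE latent is fully described by `index`, whereas the VAE latent needs all three floating-point values.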

#3) Architecture

First, the encoder takes a batch of images with input shape $X : (n, h, w, c)$ and outputs $Z_{e} : (n, h, w, d)$.
Then the vector quantization layer takes $Z_{e}$ and, for each vector in $Z_{e}$, selects the nearest vector from the codebook under the $L_{2}$ norm, producing $Z_{q}$.
Finally, the decoder takes $Z_{q}$ and reconstructs the input $X$.
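
As a compact restatement of the quantization step, the rule can be written per vector. The symbols $Z_{e}^{(i)}$ (the $i$-th $d$-dimensional vector of $Z_{e}$), $e_{j}$ (the $j$-th codebook vector) and $k$ (the number of codebook vectors) are notation assumed here for illustration:

$$
j^{*} = \arg\min_{j \in \{1, \dots, k\}} \lVert Z_{e}^{(i)} - e_{j} \rVert_{2}, \qquad Z_{q}^{(i)} = e_{j^{*}}
$$

Each vector of $Z_{q}$ is therefore an exact copy of one codebook vector, which is what makes the latent representation discrete.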

#4) Detailed View on VQ-VAE Architecture

Reshaping: First, we reshape the input from $(n, h, w, d)$ to $(nhw, d)$.
Calculating Distances: For each of the $nhw$ $d$-dimensional vectors, we calculate its distance to each of the $k$ $d$-dimensional vectors in the codebook, which gives a matrix of shape $(nhw, k)$.
Argmin: Next, for each row of this matrix, we apply the argmin function to get the index of the nearest codebook vector, and one-hot encode each row (the entry for the nearest vector is 1 and the rest are 0).
Index from Codebook: After that, we multiply the one-hot matrix by the whole codebook and get a matrix of shape $(nhw, d)$.
Finally, we reshape $(nhw, d)$ back to $(n, h, w, d)$ and pass it to the decoder to reconstruct the input data.
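
The steps above can be sketched in NumPy as follows. This is a minimal illustration of the forward pass only, not the notebooks' actual implementation: the function name `vector_quantize`, the random inputs, and the sizes are assumptions made for the example. It uses squared $L_{2}$ distances, which give the same argmin as the $L_{2}$ norm.

```python
import numpy as np


def vector_quantize(z_e, codebook):
    """Quantize z_e of shape (n, h, w, d) against a codebook of shape (k, d)."""
    n, h, w, d = z_e.shape
    k = codebook.shape[0]

    # 1) Reshaping: (n, h, w, d) -> (nhw, d)
    flat = z_e.reshape(-1, d)

    # 2) Calculating distances: squared L2 distance to every codebook vector,
    #    using ||a - b||^2 = ||a||^2 - 2 a.b + ||b||^2. Result: (nhw, k)
    distances = (
        np.sum(flat ** 2, axis=1, keepdims=True)
        - 2.0 * flat @ codebook.T
        + np.sum(codebook ** 2, axis=1)
    )

    # 3) Argmin: index of the nearest codebook vector, then one-hot per row
    indices = np.argmin(distances, axis=1)           # (nhw,)
    one_hot = np.eye(k, dtype=codebook.dtype)[indices]  # (nhw, k)

    # 4) Index from codebook: one-hot matrix times codebook -> (nhw, d)
    quantized = one_hot @ codebook

    # 5) Reshape back to (n, h, w, d) for the decoder
    z_q = quantized.reshape(n, h, w, d)
    return z_q, indices.reshape(n, h, w)


# Usage with hypothetical sizes: n=2, h=4, w=5, d=8, k=16
rng = np.random.default_rng(0)
z_e = rng.standard_normal((2, 4, 5, 8))
codebook = rng.standard_normal((16, 8))
z_q, indices = vector_quantize(z_e, codebook)
print(z_q.shape, indices.shape)
```

The one-hot multiplication mirrors the description above; indexing the codebook directly with `codebook[indices]` gives the same result without building the $(nhw, k)$ one-hot matrix.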

#5) Some High Resolution Constructed Images

References

[1] https://shashank7-iitd.medium.com/understanding-vector-quantized-variational-autoencoders-vq-vae-323d710a888a

[2] https://arxiv.org/pdf/1711.00937.pdf

Latest commit: 3a6f0eb, "CBHG Module added to Encoder", by mehdihosseinimoghadam on Mar 14, 2022 (30 commits).

| Name | Last commit message | Last commit date |
| --- | --- | --- |
| `_downloads/7303ce3181f4dbc9a50bc1ed5bb3218f` | Torchaudio Tutorial | Dec 29, 2021 |
| `Basics.ipynb` | GriffinLim | Dec 29, 2021 |
| `CBHG_vq_VAE_for_Melspectrogram.ipynb` | CBHG Module added to Encoder | Mar 14, 2022 |
| `Deep_Complex_U_net.ipynb` | one | Feb 11, 2022 |
| `Inception_Like_for_mel_spectrogram.ipynb` | Created using Colaboratory | Jan 4, 2022 |
| `README.md` | Update README.md | Feb 10, 2022 |
| `VAE_for_Melspectrogram.ipynb` | updated | Jan 28, 2022 |
| `VAE_for_Wav_Reconstruction.ipynb` | Created using Colaboratory | Jan 28, 2022 |
| `WaveNet.ipynb` | Four | Feb 7, 2022 |
| `vq_VAE_for_Melspectrogram.ipynb` | Check Points added | Feb 11, 2022 |
