Recipe for speaker diarization using Kaldi.
Check the clustering/ dir.
Based on the pyannote-audio lib.
- The dataset used in this recipe consists of 5 audio files from the English portion of the CallHome dataset.
- CallHome is 2-channel, μ-law, 8 kHz telephone speech. It is converted on the fly via sox, through the wav.scp file, to mono-channel, 16 kHz PCM (see fblocal/prep_data.sh). If your data is already in PCM format, you will need to edit the script.
- No model is trained. Instead, the scripts download a pre-trained SRE16 model.
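
The on-the-fly conversion mentioned above relies on Kaldi accepting a piped command in place of a file path in wav.scp. The sketch below is only an illustration of that idea, not a copy of fblocal/prep_data.sh: the helper name `wav_scp_line`, the `already_pcm` flag, the recording ID and the file paths are all hypothetical, and the exact sox options used by the recipe may differ.

```python
import shlex


def wav_scp_line(rec_id: str, path: str, rate: int = 16000,
                 already_pcm: bool = False) -> str:
    """Build one Kaldi wav.scp entry (hypothetical helper, for illustration).

    By default the entry is a sox pipe that resamples to `rate` Hz, mixes
    down to one channel and writes 16-bit signed PCM WAV to stdout.
    With already_pcm=True the file is referenced directly, assuming it
    already has the sample rate and channel count the model expects.
    """
    if already_pcm:
        # Kaldi also accepts a plain path as the second wav.scp field.
        return f"{rec_id} {path}"
    cmd = (
        f"sox {shlex.quote(path)} "
        f"-t wav -r {rate} -c 1 -b 16 -e signed-integer - |"
    )
    return f"{rec_id} {cmd}"


if __name__ == "__main__":
    # Hypothetical recording ID and path.
    print(wav_scp_line("rec001", "corpus/rec001.wav"))
    print(wav_scp_line("rec002", "corpus/rec002.wav", already_pcm=True))
```

The trailing `|` tells Kaldi to run the command and read the audio from its standard output, so no converted copy of the corpus is stored on disk.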
- Speaker Diarization with Kaldi by Yoav Ramon (Towards Data Science blog)
- "Speakers in the Wild" informal documentation by David Ryan Snyder
- NIST SRE 2016 Xvector Recipe by David Ryan Snyder
- Kaldi's callhome_diarization v2 recipe on egs/.
Grupo FalaBrasil (2020) - https://ufpafalabrasil.gitlab.io/
Universidade Federal do Pará (UFPA) - https://portal.ufpa.br/
Cassio Batista - https://cassota.gitlab.io/