Speaker diarization

Recipe for speaker diarization using Kaldi.

Clustering (standalone)

See the clustering/ directory, which is based on the pyannote-audio library.

Notes

  • The dataset used in the recipe consists of five recordings from the English portion of the CallHome dataset.
  • CallHome is two-channel, 8 kHz μ-law telephone speech. It is converted on the fly to mono, 16 kHz PCM by a sox command in the wav.scp file (see fblocal/prep_data.sh). If your data is already in PCM format, you will need to edit that script.
  • No model is trained. Instead, the scripts download a pre-trained SRE16 model.
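
To illustrate the on-the-fly conversion mentioned in the notes, here is a minimal sketch of how a wav.scp entry with a sox pipe could be generated. It is not taken from fblocal/prep_data.sh: the function name `wav_scp_line`, the `needs_conversion` flag, the example paths, and the exact sox options are assumptions chosen for illustration. Kaldi reads a wav.scp entry that ends in `|` as a command whose standard output is the audio.

```python
import shlex


def wav_scp_line(rec_id, path, needs_conversion=True):
    """Build one wav.scp entry (hypothetical helper, not part of the recipe).

    With needs_conversion=True the entry is a sox pipe that resamples to
    16 kHz, mixes down to one channel and writes 16-bit signed PCM to stdout.
    With needs_conversion=False the entry points straight at the file, which
    is what you would want for data that is already mono 16 kHz PCM.
    """
    if not needs_conversion:
        return f"{rec_id} {path}"
    # Assumed sox options; the actual script may select a channel differently.
    cmd = (
        f"sox {shlex.quote(path)} "
        "-t wav -r 16000 -c 1 -b 16 -e signed-integer - |"
    )
    return f"{rec_id} {cmd}"


if __name__ == "__main__":
    # Example recording IDs and paths are placeholders.
    print(wav_scp_line("rec1", "/data/callhome/rec1.wav"))
    print(wav_scp_line("rec2", "/data/pcm/rec2.wav", needs_conversion=False))
```

For data that is already PCM, the second call shows the simpler form: the entry is just the recording ID followed by the file path, with no pipe.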

References

FalaBrasil UFPA

Grupo FalaBrasil (2020) - https://ufpafalabrasil.gitlab.io/
Universidade Federal do Pará (UFPA) - https://portal.ufpa.br/
Cassio Batista - https://cassota.gitlab.io/