Speaker Diarization goes haywire due to small segments of audio #9523

Open
AatikaNazneen opened this issue Jun 24, 2024 · 0 comments
Labels: bug (Something isn't working)

Comments

@AatikaNazneen

Describe the bug

I have a long recording of around 3 hours that contains multiple speakers. When I pass the whole recording to speaker diarization, everything is labeled as a single speaker. When I split the recording into parts and pass each part separately, some parts are assigned speakers correctly, but the rest show the same bug. I identified some 1-minute chunks that trigger this behavior when they are included in the audio. I'm looking for possible explanations or solutions, since I believe the model should be resilient to this.
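The splitting step described above can be sketched as follows. This is a minimal illustration, assuming the recording is an uncompressed WAV file; `split_wav`, the output naming scheme, and the 60-second default are hypothetical choices for this example, not part of NeMo.

```python
import wave
from pathlib import Path


def split_wav(src_path, out_dir, chunk_seconds=60):
    """Write consecutive chunk_seconds-long pieces of a WAV file into out_dir.

    Hypothetical helper for isolating problem chunks; returns the chunk paths
    in playback order. The last chunk may be shorter than chunk_seconds.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    with wave.open(str(src_path), "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            chunk_path = out_dir / f"chunk_{index:04d}.wav"
            with wave.open(str(chunk_path), "wb") as dst:
                # Copy channel count, sample width and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            paths.append(chunk_path)
            index += 1
    return paths
```

Each chunk can then be diarized on its own, or removed from the full recording one at a time, to narrow down which 1-minute pieces trigger the single-speaker output.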

Steps/Code to reproduce bug

Run speaker diarization on the audio.
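To check a diarization result for the single-speaker collapse, something like the sketch below could be used. It assumes the predictions are available as RTTM text (the standard `SPEAKER <file> <chan> <onset> <dur> <NA> <NA> <speaker> ...` layout); the function names and the 2% threshold are hypothetical and only illustrate the check.

```python
from collections import defaultdict


def speaker_durations(rttm_text):
    """Sum speech duration in seconds per speaker label from RTTM text."""
    totals = defaultdict(float)
    for line in rttm_text.splitlines():
        fields = line.split()
        # Only SPEAKER records carry a duration (field 4) and a label (field 7).
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        totals[fields[7]] += float(fields[4])
    return dict(totals)


def looks_collapsed(rttm_text, min_share=0.02):
    """Return True if at most one speaker holds a meaningful share of speech.

    min_share is an arbitrary illustrative threshold, not a NeMo setting.
    """
    totals = speaker_durations(rttm_text)
    overall = sum(totals.values())
    if overall == 0:
        return False
    active = [s for s, d in totals.items() if d / overall >= min_share]
    return len(active) <= 1
```

Running this on the output for the full recording and for each part separately would show which inputs collapse to one label.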

Expected behavior

A clear and concise description of what you expected to happen.

Environment overview (please complete the following information)

  • Environment location: AWS
  • Method of NeMo install: pip install

Environment details

  • AWS Linux 2
  • PyTorch version: 2.3.1
  • Python version: 3.10

Additional context

GPU model

AatikaNazneen added the bug label Jun 24, 2024