How to calculate the overlap rate #10

yhy929 · 2024-12-01T08:22:49Z

Hi fnlandini, I'm having trouble calculating the overlap rate. In a previous answer you indicated that we can use https://github.com/BUTSpeechFIT/diarization_utils/blob/main/compute_stats.py to calculate it. Can you tell me how to run this code? It seems to iterate over one RTTM per file, but the resulting dataset has only one overall RTTM. How do I get the input files, such as in-rttm-dir, lengths and txt-list? Please help. Thank you very much!

fnlandini · 2024-12-01T14:22:25Z

Hi,
The arguments in that script are:

parser.add_argument('--in-rttm-dir', type=str, required=True, help='directory with rttm files')
parser.add_argument('--out-file', type=str, required=True, help='output file where results are written')
parser.add_argument('--lengths', type=str, required=True, help='file containing list of lengths per file')
parser.add_argument('--txt-list', type=str, required=True, help='list of files to process')

The idea is to have a separate RTTM file for each recording so that one can get separate statistics for each one; --in-rttm-dir is the directory that contains those files. Then you will need a file with the length of each recording for --lengths (this is needed because the recording might be longer than the end of the last segment in the RTTM). Finally, --txt-list is a list of the recordings to process.

If you want to create separate RTTM files, you can iterate over the recordings, grep your single RTTM for each one, and write the output to separate files. You will then need to calculate the lengths; sox can be helpful for that.
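As a rough illustration of those two steps, here is a minimal Python sketch. It relies on the recording ID being the second field of each RTTM SPEAKER line. Everything else is an assumption and not taken from compute_stats.py: the helper names `split_rttm` and `wav_length_seconds` are hypothetical, the output naming `<recording_id>.rttm` is a guess, and the audio is assumed to be PCM WAV that Python's `wave` module can read. Please check the script itself for the exact file naming and the line format it expects in the lengths and list files.

```python
import wave
from collections import defaultdict
from pathlib import Path


def split_rttm(rttm_path, out_dir):
    """Write one <recording_id>.rttm per recording; return the sorted IDs.

    The <recording_id>.rttm naming is an assumption, not confirmed
    against compute_stats.py.
    """
    lines_by_rec = defaultdict(list)
    with open(rttm_path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            # Skip blank lines and anything that is not a SPEAKER entry.
            if len(fields) < 2 or fields[0] != "SPEAKER":
                continue
            # The second RTTM field is the recording (file) ID.
            lines_by_rec[fields[1]].append(line.rstrip("\n"))

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for rec_id, lines in lines_by_rec.items():
        out_path = out_dir / f"{rec_id}.rttm"
        out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return sorted(lines_by_rec)


def wav_length_seconds(wav_path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(wav_path), "rb") as w:
        return w.getnframes() / w.getframerate()
```

The list returned by `split_rttm` can be written one ID per line to serve as a starting point for the recordings list, and `wav_length_seconds` gives the value to record for each recording. If the audio is not WAV, `soxi -D <file>` prints the duration in seconds instead.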

I hope this helps.
