yunitate segmentation outside audio duration #153

gobbios · 2019-10-20T13:07:46Z

yunitate.sh seems to produce rttm files with segments that go beyond (or even are completely outside) the duration of the source wave file.

The audio I'm using is this:
vagrant ssh -c "sox --i '/vagrant/data/0513.wav'"

Input File : '/vagrant/data/0513.wav'
Channels : 1
Sample Rate : 44100
Precision : 16-bit
Duration : 00:10:04.12 = 26641575 samples = 45308.8 CDDA sectors
File Size : 53.3M
Bit Rate : 706k
Sample Encoding: 16-bit Signed Integer PCM

So that amounts to 604.12 seconds duration.
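For anyone who wants to double-check the duration without sox, it can be read from the WAV header with Python's standard `wave` module. This is only a sketch; `wav_duration` is a name I made up for illustration, and it assumes a plain PCM WAV file like the one above.

```python
import wave


def wav_duration(path):
    """Return the duration of a PCM WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# The same arithmetic on the numbers sox reports for 0513.wav:
# 26641575 samples / 44100 Hz is roughly 604.117 s, which sox displays as 00:10:04.12.
```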

After running vagrant ssh -c "yunitate.sh data/", I get the following rttm (only last few lines shown):

SPEAKER 0513.rttm 1 601.4 0.1 CHI
SPEAKER 0513.rttm 1 601.5 1.2 FEM
SPEAKER 0513.rttm 1 602.7 2.1 CHI

where the last segment starts inside the source wave file's duration, but goes beyond the end (602.7 + 2.1 = 604.8).

When running vagrant ssh -c "yunitate.sh data/ english" things become even stranger:

SPEAKER 0513.rttm 1 601.6 0.6 FEM
SPEAKER 0513.rttm 1 603.3 0.1 CHI
SPEAKER 0513.rttm 1 603.6 0.1 CHI
SPEAKER 0513.rttm 1 603.9 0.3 CHI
SPEAKER 0513.rttm 1 604.2 0.1 FEM

Here the last segment starts at 604.2, which is after the end of the original source (604.12).
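A check along these lines could flag the offending segments automatically. It is a minimal sketch, assuming the six-column layout shown above (onset in the fourth field, duration in the fifth, label in the sixth); `out_of_range` is a hypothetical helper, not something that ships with the tools.

```python
def out_of_range(rttm_lines, duration):
    """Return (onset, end, label) for every segment that ends after `duration` seconds."""
    bad = []
    for line in rttm_lines:
        fields = line.split()
        # Skip anything that does not look like the SPEAKER lines shown above.
        if len(fields) < 6 or fields[0] != "SPEAKER":
            continue
        onset, length = float(fields[3]), float(fields[4])
        end = round(onset + length, 3)  # rounding hides float noise such as 604.1999999
        if end > duration:
            bad.append((onset, end, fields[5]))
    return bad


english_tail = [
    "SPEAKER 0513.rttm 1 603.6 0.1 CHI",
    "SPEAKER 0513.rttm 1 603.9 0.3 CHI",
    "SPEAKER 0513.rttm 1 604.2 0.1 FEM",
]
print(out_of_range(english_tail, 604.12))
```

With the 604.12 second duration from above, this reports both the 603.9 segment (which ends at 604.2) and the 604.2 segment.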

This becomes problematic when using the latter file for vagrant ssh -c "~/launcher/WCE_from_SAD_outputs.sh /vagrant/data/ yunitator_english". Here, the tool finishes without an error message but doesn't produce the word count output. The wav_tmp folder is still present and contains this empty (corrupt?) wav file:

Input File : '/vagrant/data/wav_tmp/yunitator_english_0513_00604200-00000100.wav'
Channels : 1
Sample Rate : 44100
Precision : 16-bit
Sample Encoding: 16-bit Signed Integer PCM

And finally, if I use this file in the analyze.sh pipeline, I get the following message:

(MSG) [2] in SMILExtract : openSMILE starting!
(MSG) [2] in SMILExtract : config file is: MED_2s_100ms_htk.conf
(MSG) [2] in cComponentManager : successfully registered 96 component types.
(MSG) [2] in cComponentManager : successfully finished createInstances
(19 component instances were finalised, 1 data memories were finalised)
(MSG) [2] in cComponentManager : starting single thread processing loop
(MSG) [2] in cComponentManager : Processing finished! System ran for 60436 ticks.
sox WARN trim: End position is after expected end of audio.
sox WARN trim: Last 1 position(s) not reached.
/home/vagrant/utils/analyze.sh: line 40: /vagrant/data//detailed_outputs/WCE_yunitator_english_0513.rttm: No such file or directory
paste: /vagrant/data//wce.temp: No such file or directory

vcm_0513.rttm and yunitator_english_0513.rttm are present in detailed_outputs, but the corresponding wce_0513.rttm is missing.

One hackish solution might be to append a second or two of silence to the end of the source wave file, I suppose. I haven't tried that yet.
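Another possible workaround, which leaves the audio untouched, would be to clip the rttm to the audio duration before handing it to the downstream tools. The following is an untested-in-the-pipeline sketch under the same six-column assumption as the listings above; `clamp_rttm` is a hypothetical helper name, and whether the downstream scripts are happy with the very short segments this can leave behind is something I have not verified.

```python
def clamp_rttm(rttm_lines, duration):
    """Drop segments starting at or after `duration`; shorten those that run past it."""
    out = []
    for line in rttm_lines:
        fields = line.split()
        # Pass through anything that is not a SPEAKER line in the layout shown above.
        if len(fields) < 6 or fields[0] != "SPEAKER":
            out.append(line)
            continue
        onset, length = float(fields[3]), float(fields[4])
        if onset >= duration:
            continue  # segment lies entirely outside the audio
        if onset + length > duration:
            # Shorten the segment so that it ends exactly at the end of the audio.
            fields[4] = f"{duration - onset:.2f}"
            line = " ".join(fields)
        out.append(line)
    return out


tail = [
    "SPEAKER 0513.rttm 1 603.6 0.1 CHI",
    "SPEAKER 0513.rttm 1 603.9 0.3 CHI",
    "SPEAKER 0513.rttm 1 604.2 0.1 FEM",
]
for line in clamp_rttm(tail, 604.12):
    print(line)
```

For the english output above, this keeps the 603.6 segment as it is, shortens the 603.9 segment to 0.22 seconds, and drops the 604.2 segment.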


gobbios · 2019-10-20T13:33:27Z

I tried the silence approach with partial success:

vagrant ssh -c "sox /vagrant/data/0513.wav /vagrant/data/0513padded.wav pad 0 5"
vagrant ssh -c "sox --i '/vagrant/data/0513padded.wav'"

Input File : '/vagrant/data/0513padded.wav'
Channels : 1
Sample Rate : 44100
Precision : 16-bit
Duration : 00:10:09.12 = 26862075 samples = 45683.8 CDDA sectors
File Size : 53.7M
Bit Rate : 706k
Sample Encoding: 16-bit Signed Integer PCM

vagrant ssh -c "yunitate.sh data/"

SPEAKER 0513padded.rttm 1 600.0 4.2 CHI
SPEAKER 0513padded.rttm 1 604.2 2.0 FEM
SPEAKER 0513padded.rttm 1 606.2 3.8 MAL
SPEAKER 0513padded.rttm 1 610.2 0.8 MAL

This still goes over the end time of the padded audio (610.2 + 0.8 = 611.0 vs. 609.12), though only slightly.

while vagrant ssh -c "yunitate.sh data/ english"

SPEAKER 0513padded.rttm 1 596.2 0.3 CHI
SPEAKER 0513padded.rttm 1 597.2 0.7 CHI
SPEAKER 0513padded.rttm 1 598.0 6.1 CHI

works fine now. Incidentally, the last segment now ends at the original file duration (598.0 + 6.1 = 604.1).

WCE_from_SAD_outputs.sh /vagrant/data/ yunitator_english and analyze.sh data/ work as expected with the latter.
