Stuck at finding different segments #1

Open
Johndirr opened this issue Nov 30, 2019 · 13 comments

Comments

@Johndirr

Running

$ ./resync -o lowres.wav -m highres.wav -s blub.srt

I get the following output. But it won't go past this state and never finishes resynchronizing.

[11-30 18:22:40][Debug] Extracting audio fingerprints from: lowres.wav
[11-30 18:22:40][Debug] Begin reading WAV file
[11-30 18:22:40][Debug] Opening mode chosen: readFile, proceeding
[11-30 18:22:40][Debug] Trying to read from file : lowres.wav
[11-30 18:22:40][Debug] Reading file data
[11-30 18:22:44][Debug] File data read and stored in buffer
[11-30 18:22:44][Debug] Processing data and extracting samples
[11-30 18:22:44][Debug] Checking chunkID, should be RIFF
[11-30 18:22:44][Debug] Wave File chunkID verification successful
[11-30 18:22:44][Debug] Begin decoding wave file
[11-30 18:22:44][Debug] File format is identified as WAV
[11-30 18:22:44][Debug] Finding FMT and DATA subchunks
[11-30 18:22:44][Debug] FMT index : 12 , DATA index : 70
[11-30 18:22:44][Debug] PCM : True
[11-30 18:22:44][Debug] MONO : True
[11-30 18:22:44][Debug] Sample Rate 16KHz : True
[11-30 18:22:44][Debug] BitRate 16 bits/sec : True
[11-30 18:22:44][Debug] Number of samples : 65917611
[11-30 18:22:44][Debug] Reading samples
[11-30 18:22:46][Debug] Successfully decoded
[11-30 18:22:46][Debug] File decoded successfully
[11-30 18:22:53][Debug] The audio is 4119.85 seconds in length.
[11-30 18:22:53][Debug] Audio fingerprints successfully generated.

[11-30 18:22:53][Debug] Extracting audio fingerprints from: highres.wav
[11-30 18:22:53][Debug] Begin reading WAV file
[11-30 18:22:53][Debug] Opening mode chosen: readFile, proceeding
[11-30 18:22:53][Debug] Trying to read from file : highres.wav
[11-30 18:22:53][Debug] Reading file data
[11-30 18:22:57][Debug] File data read and stored in buffer
[11-30 18:22:57][Debug] Processing data and extracting samples
[11-30 18:22:57][Debug] Checking chunkID, should be RIFF
[11-30 18:22:57][Debug] Wave File chunkID verification successful
[11-30 18:22:57][Debug] Begin decoding wave file
[11-30 18:22:57][Debug] File format is identified as WAV
[11-30 18:22:57][Debug] Finding FMT and DATA subchunks
[11-30 18:22:57][Debug] FMT index : 12 , DATA index : 70
[11-30 18:22:57][Debug] PCM : True
[11-30 18:22:57][Debug] MONO : True
[11-30 18:22:57][Debug] Sample Rate 16KHz : True
[11-30 18:22:57][Debug] BitRate 16 bits/sec : True
[11-30 18:22:57][Debug] Number of samples : 66148352
[11-30 18:22:57][Debug] Reading samples
[11-30 18:22:59][Debug] Successfully decoded
[11-30 18:22:59][Debug] File decoded successfully
[11-30 18:23:06][Debug] The audio is 4134.27 seconds in length.
[11-30 18:23:06][Debug] Audio fingerprints successfully generated.

[11-30 18:23:06][Debug]
Finding the different segments...

Any idea what's wrong? The audio files were created as described in the docs.

@sypai
Collaborator

sypai commented Dec 9, 2019

Hey @Johndirr, sorry for the delay.
Can you provide a link to the audio files?

@sypai
Collaborator

sypai commented Dec 16, 2019

Ah! I'm unable to download the files.
I'd suggest you extract the fingerprints for both files and send them over to me.

You can get fpcalc static binaries from here: chromaprint

It's easy, just run:
fpcalc -raw -json -overlap /path/to/audio/file
I'll look into what's causing the program to stop.

@Johndirr
Author

Dumped the output into these files. Hope this will help.
lowres.txt
highres.txt

I can also upload the wav files to a different host if you can suggest one.

@sypai
Collaborator

sypai commented Dec 16, 2019

Hey!
What I can tell after running the code on the two files is that they are almost the same.
The case that resync currently handles is two files whose timings differ because one contains extra portions (mainly commercials). As far as I can see, these files have the same content, possibly with different frame rates.
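
The sample counts in the debug log above are consistent with this. As a rough sketch (the helper names `duration_seconds` and `duration_gap_seconds` are made up for illustration and are not part of resync), the reported durations follow from dividing the sample count by the 16 kHz sample rate shown in the log:

```cpp
#include <cmath>
#include <cstdint>

// Sample rate reported in the log ("Sample Rate 16KHz : True").
constexpr double kSampleRateHz = 16000.0;

// Duration in seconds of a mono PCM stream with the given number of samples.
// Hypothetical helper, not taken from the resync sources.
inline double duration_seconds(std::uint64_t samples) {
    return static_cast<double>(samples) / kSampleRateHz;
}

// Absolute difference in duration between two streams, in seconds.
inline double duration_gap_seconds(std::uint64_t a, std::uint64_t b) {
    return std::fabs(duration_seconds(a) - duration_seconds(b));
}
```

With the logged values, `duration_seconds(65917611)` is about 4119.85 and `duration_seconds(66148352)` is about 4134.27, so the two files differ by only about 14.4 seconds over more than an hour of audio.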

Maybe I can try something but for that, I'll need the files(media and subtitle).

@sypai
Collaborator

sypai commented Dec 16, 2019

The google drive links seem to be broken.

[Screenshot: can't download audio files]

@Johndirr
Author

Johndirr commented Dec 16, 2019

Hi, you are right. The two files are the same with a different framerate. Uploaded the wavs to another site:
https://filebin.net/bz6ihecdnhp1v1gs/files.zip?t=mswa7vao
But I don't care that much about the difference for these files in particular. I just want to get the program to work :)

Reuploaded files

@sypai sypai closed this as completed Mar 27, 2020
@sypai sypai reopened this Mar 27, 2020
@Johndirr
Author

OK, I gave this another try. This time it ended in a segmentation fault:

./resync -o original.wav -m new.wav -s orignalsub.srt
Subtitle-Resync 0.00, CCExtractor Development.

[05-11 11:20:18][Debug] Extracting audio fingerprints from: original.wav
[05-11 11:20:18][Debug] Begin reading WAV file
[05-11 11:20:18][Debug] Opening mode chosen: readFile, proceeding
[05-11 11:20:18][Debug] Trying to read from file : original.wav
[05-11 11:20:18][Debug] Reading file data
[05-11 11:20:25][Debug] File data read and stored in buffer
[05-11 11:20:25][Debug] Processing data and extracting samples
[05-11 11:20:25][Debug] Checking chunkID, should be RIFF
[05-11 11:20:25][Debug] Wave File chunkID verification successful
[05-11 11:20:25][Debug] Begin decoding wave file
[05-11 11:20:25][Debug] File format is identified as WAV
[05-11 11:20:25][Debug] Finding FMT and DATA subchunks
[05-11 11:20:25][Debug] FMT index : 12 , DATA index : 70
[05-11 11:20:25][Debug] PCM : True
[05-11 11:20:25][Debug] MONO : True
[05-11 11:20:25][Debug] Sample Rate 16KHz : True
[05-11 11:20:25][Debug] BitRate 16 bits/sec : True
[05-11 11:20:25][Debug] Number of samples : 105009152
[05-11 11:20:25][Debug] Reading samples
[05-11 11:20:27][Debug] Successfully decoded
[05-11 11:20:27][Debug] File decoded successfully
[05-11 11:20:39][Debug] The audio is 6563.07 seconds in length.
[05-11 11:20:39][Debug] Audio fingerprints successfully generated.

[05-11 11:20:39][Debug] Extracting audio fingerprints from: new.wav
[05-11 11:20:39][Debug] Begin reading WAV file
[05-11 11:20:39][Debug] Opening mode chosen: readFile, proceeding
[05-11 11:20:39][Debug] Trying to read from file : new.wav
[05-11 11:20:39][Debug] Reading file data
[05-11 11:20:45][Debug] File data read and stored in buffer
[05-11 11:20:45][Debug] Processing data and extracting samples
[05-11 11:20:45][Debug] Checking chunkID, should be RIFF
[05-11 11:20:45][Debug] Wave File chunkID verification successful
[05-11 11:20:45][Debug] Begin decoding wave file
[05-11 11:20:45][Debug] File format is identified as WAV
[05-11 11:20:45][Debug] Finding FMT and DATA subchunks
[05-11 11:20:45][Debug] FMT index : 12 , DATA index : 70
[05-11 11:20:45][Debug] PCM : True
[05-11 11:20:45][Debug] MONO : True
[05-11 11:20:45][Debug] Sample Rate 16KHz : True
[05-11 11:20:45][Debug] BitRate 16 bits/sec : True
[05-11 11:20:45][Debug] Number of samples : 85014528
[05-11 11:20:45][Debug] Reading samples
[05-11 11:20:47][Debug] Successfully decoded
[05-11 11:20:47][Debug] File decoded successfully
[05-11 11:20:56][Debug] The audio is 5313.41 seconds in length.
[05-11 11:20:56][Debug] Audio fingerprints successfully generated.

[05-11 11:20:56][Debug]
Finding the different segments...

Segmentation fault

@sypai If you would be so kind as to have a look at this: I uploaded the files to mega:
https://mega.nz/folder/KUgjmCpB#ailTQWAXfl_bpCBu-LnP5g
If you can't download from there please tell me a host and I'll reupload.

@sypai
Collaborator

sypai commented May 18, 2020

Hello, @Johndirr
Apologies for the delay.

mega doesn't seem to work :(
Can we try G Drive once more?

@Johndirr
Author

Johndirr commented May 19, 2020

Here you go:
https://drive.google.com/open?id=15EwO6LTaSWBm8HLtt6CwhoDvB3WkCxiz
EDIT: Accidentally closed the issue...

@Johndirr Johndirr reopened this May 19, 2020
@interlark

I tried resync and got the same segmentation fault.

@interlark

interlark commented Feb 20, 2021

The problem is somewhere in the slice function:

std::vector<uint32_t > Subtitle_Resync::AlignFP::slice(int lb, int ub)

According to the backtrace, the call chain is:
main -> Tool::sync -> AlignFP::brum_brum -> AlignFP::segment_it -> AlignFP::next, after which AlignFP::slice is called recursively roughly 70k times until the program dies.
UPD: Funnily enough, every recursive call of slice seems to be made with the same arguments: lb ("lower bound") = -20 and ub ("upper bound") = 24883. Is it legal for lb to be negative? If not, it is wrong for lb to be compared with a value of type size_t here:

I think this signed/unsigned comparison is the cause of the segmentation fault: with a negative lb the comparison goes wrong and forces infinite recursive calls.
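
The snippet referenced above is not reproduced in this thread, so the following is only a minimal sketch of the general pitfall, not the actual resync code. The function names `less_naive`, `less_signed_aware`, and `safe_slice` are hypothetical. It shows how a negative `int` such as lb = -20 is converted to a huge unsigned value when compared with a `size_t`, and one possible way to clamp slice bounds before using them:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mimics what `lb < n` does when lb is int and n is size_t:
// lb is converted to size_t first, so -20 becomes a huge positive value.
inline bool less_naive(int lb, std::size_t n) {
    return static_cast<std::size_t>(lb) < n;
}

// Signed-aware comparison: any negative lb is smaller than every size_t.
// (C++20 offers std::cmp_less in <utility> for the same purpose.)
inline bool less_signed_aware(int lb, std::size_t n) {
    return lb < 0 || static_cast<std::size_t>(lb) < n;
}

// Hypothetical bounds-safe slice: clamps [lb, ub) into [0, fp.size()]
// instead of trusting the caller's bounds.
inline std::vector<std::uint32_t> safe_slice(const std::vector<std::uint32_t>& fp,
                                             int lb, int ub) {
    const long long size = static_cast<long long>(fp.size());
    const long long lo = std::max<long long>(0, std::min<long long>(lb, size));
    const long long hi = std::max<long long>(lo, std::min<long long>(ub, size));
    return std::vector<std::uint32_t>(fp.begin() + lo, fp.begin() + hi);
}
```

With the values from the backtrace, `less_naive(-20, 24883)` is false even though -20 is mathematically smaller, while `less_signed_aware(-20, 24883)` is true. Whether clamping is the right fix for resync depends on why lb became negative in the first place, which this sketch does not address.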

@shootlazer

Same thing here. I tried different audio files and different subtitles, with the same result. It's a shame, because I haven't found any other tool that does exactly this.


4 participants