Markup for Headings, Sentences, Paragraphs, etc. #7
Comments
I thought WebVTT was supposed to respect newlines?
Oh, but a blank line indicates the end of the cue. Have you thought about embedding HTML in the WebVTT file?
I apologize for missing your replies for so long. I've definitely considered embedding HTML in the WebVTT file, but I have no examples of others doing so to follow. Since I'm a newbie, I'd be more comfortable if I could find someone who knows what they're doing practicing this so that I could learn from them, rather than just charging ahead, trying to fit square pegs into round holes.
I recently found this repository while looking for a way to sync VTT files to audio. This may be extremely niche at the moment, but with whisper.cpp I'm cranking out a lot of uses for it. I don't code regularly, but I was able to take the web-based code, load it into an ebook along with audio and a VTT file, and turn it into an interactive-transcript ebook using Sigil. I'm also interested in ways to format the transcript, and honestly the JavaScript code is light-years beyond me. First, I was wondering whether there have been any updates to the code in the last three years. Second, I wanted to add my voice to the formatting question; I will look into HTML within the VTT file. Lastly, I just wanted to say thank you for this repository!
Greetings!
I've long had an interest in interactive transcripts (but no real experience building them).
I'm now taking my first steps toward actually building a functioning interactive transcript (still in a planning/architecture/experimental phase), and the idea of using WebVTT as the storage format for my text timings is attractive, as it is already an accepted format on the web.
Where things currently break down for me is that current examples of rendering WebVTT content as an interactive transcript usually do not join together sentences which have been split into multiple cues.
Likewise, by simply parsing a typical WebVTT file, one would have no idea where paragraphs begin or end when piecing together cues.
I know that WebVTT can store chapter data as a separate WebVTT file which contains only chapter data.
I would be interested in an interactive transcript viewer which can read multiple WebVTT files and render them intermixed, as the time data dictates, with the WebVTT file containing chapter data being rendered into the interactive transcript as Headings within the body of the transcript.
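To make the idea above concrete, here is a minimal sketch in Python of intermixing two already-parsed cue lists by start time. The `Cue` class and the `interleave` function are hypothetical names used only for illustration; they are not part of this project or of any WebVTT library, and parsing the files is assumed to have happened elsewhere.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Cue:
    """Hypothetical container for one parsed cue (times in seconds)."""
    start: float
    end: float
    text: str


def interleave(captions: List[Cue], chapters: List[Cue]) -> List[Tuple[str, Cue]]:
    """Return (kind, cue) pairs ordered by start time.

    kind is "heading" for chapter cues and "cue" for caption cues.
    When a chapter and a caption start at the same time, the chapter
    is placed first so the heading appears above the text it introduces.
    """
    tagged = [(c.start, 0, "heading", c) for c in chapters]
    tagged += [(c.start, 1, "cue", c) for c in captions]
    # Sort on (start time, priority) only; the Cue itself is never compared.
    tagged.sort(key=lambda item: (item[0], item[1]))
    return [(kind, cue) for _, _, kind, cue in tagged]
```

A viewer could then walk the returned list, emitting a heading element for each "heading" entry and inline text for each "cue" entry.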
One could argue that in order to solve the split sentences problem, one need only join all cues end to end inline, rather than each being its own block. That is true, however that still does not solve paragraph breaks.
So, at the very least, it appears to me that there needs to be some way of marking up (at a minimum) where new paragraphs begin (and optionally where they end). One could interpret the interjection of a chapter/heading (where they occur) as the end of the prior paragraph.
Do you have any suggestions for a UTF-8 character which could be inserted into a WebVTT file to mark the beginning of a new paragraph for a viewer built on top of what you have achieved (while either being ignored by a browser's default WebVTT subtitle renderer or being so unobtrusive that most people would not notice its presence)?
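As a rough sketch of how a viewer might consume such a marker, the Python below joins cue texts inline and starts a new paragraph whenever a cue begins with the marker. The pilcrow (U+00B6) is only a placeholder assumption, not an established convention, and nothing here implies that browsers would hide it; `PARAGRAPH_MARK` and `cues_to_paragraphs` are hypothetical names.

```python
from typing import Iterable, List

# Assumption: a pilcrow at the start of a cue marks a new paragraph.
PARAGRAPH_MARK = "\u00b6"


def cues_to_paragraphs(cue_texts: Iterable[str], mark: str = PARAGRAPH_MARK) -> List[str]:
    """Join cue texts inline, breaking paragraphs at cues that start with mark."""
    paragraphs: List[str] = []
    current: List[str] = []
    for raw in cue_texts:
        # Collapse line breaks inside a cue so split sentences rejoin cleanly.
        text = " ".join(raw.split())
        if text.startswith(mark):
            if current:
                paragraphs.append(" ".join(current))
            current = []
            text = text[len(mark):].strip()
        if text:
            current.append(text)
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Cues without the marker simply continue the current paragraph, so a file with no markers at all still renders as one block of joined text.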
I suppose that you could add any symbol you wished to mark the beginning or end of paragraphs, to be interpreted by an interactive transcript viewer, and simply give its cue a duration of 0 seconds so that it is never visible to anyone viewing the subtitles but can still be easily read by a transcript viewer.
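The zero-duration idea could be prototyped along these lines. This is a sketch under stated assumptions: the marker text `PARA` is invented for illustration, the parser handles only plain cue blocks (no cue settings, styles, or regions), and it is not a conforming WebVTT parser. One caveat worth checking: the WebVTT specification appears to require a cue's end time to be later than its start time, so a strict validator may reject zero-duration cues.

```python
import re
from typing import List, Tuple

# Matches "hh:mm:ss.ttt --> hh:mm:ss.ttt"; the hours part is optional.
TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def to_seconds(h, m, s, ms) -> float:
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_cues(vtt: str) -> List[Tuple[float, float, str]]:
    """Return (start, end, text) for every block that has a timing line."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                cues.append((to_seconds(*g[:4]), to_seconds(*g[4:]), "\n".join(lines[i + 1:])))
                break  # blocks without a timing line (e.g. the header) are skipped
    return cues


def split_paragraphs(cues, marker: str = "PARA") -> List[str]:
    """Treat zero-duration cues whose text equals marker as paragraph breaks."""
    paragraphs, current = [], []
    for start, end, text in cues:
        if start == end and text.strip() == marker:
            if current:
                paragraphs.append(" ".join(current))
                current = []
        else:
            current.append(" ".join(text.split()))
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Because the marker lives in its own cue, the visible caption text stays untouched, which is the main attraction of this approach over an inline character.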
I'm looking forward to your response.