srvk-eesen-offline-transcriber

A customized srvk/eesen version of Tanel Alumae's kaldi-offline-transcriber.

You probably want to use this inside SRVK's Eesen Transcriber, not on its own.

speech2text.sh - Transcribe an audio/video file and produce several output formats at once (plain text, subtitles, NIST CTM scoring input, Audacity labels)
vids2web.sh - Transcribe videos and create subtitles and a searchable index in a web page
run-segmented.sh - Transcribe using your own segmentation file, which may improve transcription accuracy
run-scored.sh - If you have STM ground truth as well as audio/video, produce NIST SCLITE scoring results in build/trans/<videoname>/eesen/decode/score_*
run-scored-8k.sh - Same as run-scored.sh, but for 8 kHz audio such as the Switchboard corpus
batch.sh - Queue several files for transcription
slurm.sh - For batch processing; edit it to change which transcription script is used (speech2text.sh by default)
mkpages.sh - Create or update web pages from video and transcription output
watch.sh - Start watching a shared folder for files to be transcribed
path.sh - Set up the PATH environment variable for the scripts above
Makefile - Master control for the transcriber
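
As a rough sketch of how the scoring output from run-scored.sh might be located afterwards, the helper below lists the score_* entries under build/trans/<videoname>/eesen/decode/. The function name `list_score_dirs` and its optional base-directory argument are illustrative assumptions and not part of this repository; only the path layout is taken from the description above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): list scoring output
# for one video, using the path layout described for run-scored.sh.
# Usage: list_score_dirs <videoname> [base_dir]
list_score_dirs() {
    videoname=$1
    base_dir=${2:-build/trans}   # assumed default, relative to the checkout
    decode_dir="$base_dir/$videoname/eesen/decode"

    if [ ! -d "$decode_dir" ]; then
        echo "no decode directory: $decode_dir" >&2
        return 1
    fi

    found=1
    for entry in "$decode_dir"/score_*; do
        # An unmatched glob stays literal, so skip entries that do not exist.
        [ -e "$entry" ] || continue
        echo "$entry"
        found=0
    done
    # Returns 0 if at least one score_* entry was printed, 1 otherwise.
    return $found
}
```

Run from the top of the checkout, `list_score_dirs myvideo` would print each matching score_* path for a video named myvideo, or return a non-zero status if none exist yet.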

Repository contents:

conf
lib
local
models
scripts
.gitignore
Makefile
Makefile.aspire
Makefile.rover
README.md
align.sh
align_cha.sh
batch.sh
glm
mkpages.sh
path.homebank.sh
path.sh
run-scored-8k.sh
run-scored.sh
run-segmented.sh
slurm.sh
speech2diarize.sh
speech2per.sh
speech2phonectm.sh
speech2phones.sh
speech2text.aspire.sh
speech2text.sh
speech2wer.sh
vids2web.sh
watch.sh
