Subtitle Generator and Translator for Videos Using IBM Watson Speech to Text and Globalization Pipeline Services in Node.js
This is a set of command-line utilities written in Node.js that you can use to generate a SubRip .srt file from an .mp4 video by using the IBM Watson Speech to Text service on IBM Bluemix. Once you have generated the SubRip file, you can translate it into multiple languages by using the IBM Globalization Pipeline service on IBM Bluemix.
You need to have Node.js installed on your machine. If you don't have Node.js, then you can download it from nodejs.org.
In order to be able to use these utilities, make sure you have ffmpeg installed on your system (including all necessary encoding libraries such as libmp3lame or libx264).
The subtitler generator uses the fluent-ffmpeg package, which requires a version of ffmpeg greater than 0.9. The fluent-ffmpeg package calls ffmpeg and ffprobe, so both must either be in your PATH or be set in the FFMPEG_PATH and FFPROBE_PATH environment variables. The subtitler utility creates .mp3 files, so the libmp3lame codec must be installed on your system.
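As a quick pre-flight check, the lookup described above can be sketched as follows. This is a minimal sketch, not part of the repository: findBinary is a hypothetical helper that looks at the environment variable first and then walks PATH, and the exact lookup order used by fluent-ffmpeg itself may differ.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of this repository): returns the path to a
// binary, or null when it cannot be found. It checks the given environment
// variable first, then each directory listed in PATH.
function findBinary(name, envVar, env = process.env) {
  if (env[envVar] && fs.existsSync(env[envVar])) {
    return env[envVar];
  }
  const exts = process.platform === 'win32' ? ['.exe', ''] : [''];
  for (const dir of (env.PATH || '').split(path.delimiter)) {
    if (!dir) continue;
    for (const ext of exts) {
      const candidate = path.join(dir, name + ext);
      if (fs.existsSync(candidate)) return candidate;
    }
  }
  return null;
}

// Report where each required binary was found, if anywhere.
for (const [name, envVar] of [['ffmpeg', 'FFMPEG_PATH'], ['ffprobe', 'FFPROBE_PATH']]) {
  const found = findBinary(name, envVar);
  console.log(found ? `${name}: ${found}` : `${name}: not found; add it to PATH or set ${envVar}`);
}
```

This only confirms that the binaries exist; it does not verify the ffmpeg version or that libmp3lame is available.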
You must also establish an IBM Bluemix account and create service instances for Watson Speech to Text and Globalization Pipeline.
Download or clone this repository and then install all packages.
$ git clone https://github.com/steveatkin/Subtitler
$ npm install
- Create a speech-credentials.json file with the credentials from your instance of Watson Speech to Text:
{
"credentials": {
"username": "……",
"password": "……"
}
}
- Create a g11n-credentials.json file with the credentials for a user of your instance of Globalization Pipeline. Be certain to create a user in the Globalization Pipeline service dashboard that has Administrator privileges for all resource bundles:
{
"credentials": {
"url": "……",
"userId": "……",
"password": "……",
"instanceId": "……"
}
}
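Before running the utilities, it can help to confirm that both credential files exist and contain every field shown in the templates above. The following is a minimal sketch under that assumption; missingKeys is a hypothetical helper and is not part of the repository.

```javascript
const fs = require('fs');

// Required keys for each file, taken from the templates above.
const REQUIRED_KEYS = {
  'speech-credentials.json': ['username', 'password'],
  'g11n-credentials.json': ['url', 'userId', 'password', 'instanceId'],
};

// Hypothetical helper: returns the names of missing or empty credential fields.
function missingKeys(parsed, requiredKeys) {
  const creds = (parsed && parsed.credentials) || {};
  return requiredKeys.filter(
    (key) => typeof creds[key] !== 'string' || creds[key].trim() === ''
  );
}

// Check each file in the current directory and report what was found.
for (const [file, keys] of Object.entries(REQUIRED_KEYS)) {
  if (!fs.existsSync(file)) {
    console.log(`${file}: file not found`);
    continue;
  }
  try {
    const missing = missingKeys(JSON.parse(fs.readFileSync(file, 'utf8')), keys);
    console.log(missing.length ? `${file}: missing ${missing.join(', ')}` : `${file}: ok`);
  } catch (err) {
    console.log(`${file}: invalid JSON (${err.message})`);
  }
}
```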
When calling subtitler, you can specify either the BCP 47 language code for the language spoken in the video or a customized speech engine, identified by its customization id. In all cases, subtitler always uses the broadband speech engines in order to obtain the best results. You can also indicate whether sentence casing should be performed. By default, subtitler capitalizes the first letter of the first word of each subtitle and adds a period to the end of the subtitle.
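The default sentence casing described above can be illustrated with a short sketch. sentenceCase is a hypothetical function, not the utility's actual code; in particular, skipping the period when the text already ends in punctuation is an assumption made for this example.

```javascript
// Hypothetical helper illustrating the described default: capitalize the
// first letter of the subtitle and end it with a period.
function sentenceCase(text) {
  const trimmed = text.trim();
  if (trimmed === '') return trimmed;
  const capitalized = trimmed.charAt(0).toUpperCase() + trimmed.slice(1);
  // Assumption: do not double up punctuation if the text already ends a sentence.
  return /[.!?]$/.test(capitalized) ? capitalized : capitalized + '.';
}

console.log(sentenceCase('welcome to the demo')); // prints "Welcome to the demo."
```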
This is the general syntax for using subtitler:
node subtitler filename source-language | customization-id sentence-casing[yes|no]
Currently only the following language codes are supported: en, en-GB, ar, es, fr, ja, pt-BR, and zh-Hans.
For example, if you wanted to create English subtitles for your video file using the default broadband speech model with sentence casing you would use the following command:
node subtitler myVideo.mp4 en yes
Or, with a customization id:
node subtitler myVideo.mp4 xxxxxx-xxxxx yes
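Because the second argument can be either a language code or a customization id, one plausible way to tell them apart is to compare it against the supported language list. The sketch below assumes exactly that; classifyModelArgument is a hypothetical helper, and the utility itself may distinguish the two differently.

```javascript
// Language codes listed as supported in this README.
const SUPPORTED_LANGUAGES = ['en', 'en-GB', 'ar', 'es', 'fr', 'ja', 'pt-BR', 'zh-Hans'];

// Hypothetical helper: treats any value that is not a supported language
// code as a customization id. This rule is an assumption for illustration.
function classifyModelArgument(arg) {
  return SUPPORTED_LANGUAGES.includes(arg)
    ? { kind: 'language', value: arg }
    : { kind: 'customization', value: arg };
}

console.log(classifyModelArgument('en').kind);           // prints "language"
console.log(classifyModelArgument('xxxxxx-xxxxx').kind); // prints "customization"
```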
Once subtitler finishes, it creates a file with the same name as the video file but with the .srt extension. A raw speech events file is also created; it has the same name as the video file with _events.json appended. Additionally, an .mp3 file containing the audio extracted from the video is created, named the same as the video file but ending in .mp3.
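The naming scheme for the three output files can be sketched as below. outputNames is a hypothetical helper written for illustration; it assumes the outputs are placed next to the input video, which the text above does not state explicitly.

```javascript
const path = require('path');

// Hypothetical helper: derives the output file names described above
// from the name of the input video.
function outputNames(videoFile) {
  const { dir, name } = path.parse(videoFile);
  return {
    subtitles: path.join(dir, `${name}.srt`),
    events: path.join(dir, `${name}_events.json`),
    audio: path.join(dir, `${name}.mp3`),
  };
}

console.log(outputNames('myVideo.mp4'));
// { subtitles: 'myVideo.srt', events: 'myVideo_events.json', audio: 'myVideo.mp3' }
```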
Quite often when speech to text services are used to extract subtitles from audio and video, the generated subtitles may be incomplete English sentences. This frequently presents challenges for translating subtitles into other languages.
You can use the segmenter utility to help transform the sentence fragments into complete sentences. This utility calls the http://bark.phon.ioc.ee/punctuator service to clean up the raw segments. Currently this can only be done for English segments.
When segmenter is run, it generates a SubRip file that has the same name as the events file except that it ends in .srt.
This is the general syntax for using segmenter:
node segmenter filename source-language
For example, if you wanted to segment raw English speech events into segments you would use the following command:
node segmenter myVideo_events.json en
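For reference, each entry in a SubRip file consists of a sequence number, a time range in HH:MM:SS,mmm --> HH:MM:SS,mmm form, the subtitle text, and a blank line. The sketch below formats one such entry; toSrtTime and toSrtCue are hypothetical helpers, and the assumption that start and end times are given in seconds is made for this example only, not taken from the events file format.

```javascript
// Hypothetical helper: converts a time in seconds to the SubRip
// HH:MM:SS,mmm timestamp format.
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Hypothetical helper: builds one SubRip entry (sequence number, time range,
// text). Entries are separated by a blank line when written to a file.
function toSrtCue(index, start, end, text) {
  return `${index}\n${toSrtTime(start)} --> ${toSrtTime(end)}\n${text}\n`;
}

console.log(toSrtCue(1, 1.25, 3.5, 'Welcome to the demo.'));
```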
When calling translator, you need to specify both the BCP 47 source language and the BCP 47 target language for the subtitle files. To obtain translated subtitles, you must first upload the source subtitles by calling the translator utility with the upload argument. Once translation is completed, you can download your translated subtitles by calling the translator utility with the download argument. You can check the status of your translation in the Globalization Pipeline service dashboard. By default, translator creates a resource bundle with the same name as the video filename in the Globalization Pipeline service.
Once translator has finished downloading your content, it creates a file that has the same name as the source file with the target language code appended to the filename.
This is the general syntax for using translator:
node translator filename source-language target-language upload | download
For example, if you wanted to translate your English SubRip file into Spanish you would use the following command:
node translator myVideo.srt en es upload
Once the translation is completed you would download it using the following command:
node translator myVideo.srt en es download
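The four translator arguments and the two-step upload and download workflow can be sketched as a small argument check. parseTranslatorArgs is a hypothetical helper, not the utility's own parsing code, and the error messages are illustrative.

```javascript
// Hypothetical helper: validates the translator arguments in the order
// filename, source-language, target-language, upload | download.
function parseTranslatorArgs(argv) {
  const [filename, source, target, action] = argv;
  if (!filename || !source || !target) {
    throw new Error('usage: node translator filename source-language target-language upload | download');
  }
  if (action !== 'upload' && action !== 'download') {
    throw new Error(`expected "upload" or "download", got "${action}"`);
  }
  return { filename, source, target, action };
}

// Step 1: upload the English source subtitles for translation into Spanish.
console.log(parseTranslatorArgs(['myVideo.srt', 'en', 'es', 'upload']));
// Step 2: once translation is complete, download the result.
console.log(parseTranslatorArgs(['myVideo.srt', 'en', 'es', 'download']));
```

In a real script the array would typically come from process.argv.slice(2).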
Apache 2.0. See license.txt
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.