This repository is for extracting asset data, namely voice clips and their corresponding transcriptions, from the game Disco Elysium, specifically the "Final Cut" version, and reformatting the extracted data into a format that ESPnet understands and can use to train a vocoder.
My goal is to at the very least 1) have a vocoder using WaveNet with the characteristics of the narrator in the "Final Cut" version of the game, and 2) package and publish the vocoder as a mobile app, as the open source ones I found so far are not really great.
To these ends, I intend to have three repositories:
- One to extract the dialogue data, audio clips, and match them together in a format understood by ESPnet for training, which is this repository.
- One dedicated to problems that arise when training the vocoder.
- One (or maybe two, one for each of the currently dominant mobile platforms) for the packaging and publishing of the vocoder on the mobile platform.
I put the code for preparing the data into Mix tasks. You can check them under mix/tasks to see the details.
I love the game and the voice of its narrator, and perhaps out of vanity I think I could do better than current open-source text-to-speech solutions available on mobile platforms.
This project is written with the Final Cut version of the game in mind, specifically version 2832f901, released on 2021-04-19. I cannot ensure the correctness of the app for earlier or later versions. In fact, I have tried using this repository on a later version and things no longer work. For now, to use this repository you will need a program such as DepotDownloader to download version 2832f901 of the game.
Please also note that you will need around 65GB of free disk space to store the extracted audio clips.
So far I have only completed two mix tasks doing the following:
1. Extracting conversation, dialogue entry, actor, and item data from the dialogue bundle.
2. Matching the extracted audio clips with the extracted dialogue entries.
I still need to implement two other mix tasks doing the following:
3. Converting the matches into a csv file following the LSJ format for training.
4. Putting everything into a single place so that with a single invocation we can generate the csv file needed for training.
If you still want to check out the finished mix tasks, please follow the instructions for setting up the repository and running those tasks in the sections below.
Should you wish to try out the code in this repo, please follow the instructions in the sections below:
You should have these installed:
- Elixir 1.14.0
- Erlang OTP 25.1
- PostgreSQL 13.3
I cannot guarantee that the code works with older versions of the applications listed above.
Please also make sure that you have around 65GB of free disk space for the audio clips.
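Before setting anything up, it may help to confirm that the tools are on your PATH and that the target disk has room. The following is a minimal shell sketch and is not part of this repository: `check_tool`, `has_enough_space`, and `TARGET_DIR` are illustrative names, and "around 65GB" is approximated as 65 GiB.

```shell
#!/bin/sh
# Illustrative pre-flight check; none of these names come from the repository.

# Print whether a command-line tool is available on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Succeed when the available space ($1, in KB) covers the requirement ($2, in KB).
has_enough_space() {
  [ "$1" -ge "$2" ]
}

# elixir, erl, and psql are the usual executables for Elixir, Erlang/OTP, and PostgreSQL.
for tool in elixir erl psql; do
  check_tool "$tool"
done

# Assumption: "around 65GB" is approximated as 65 GiB, expressed in 1024-byte blocks.
REQUIRED_KB=$((65 * 1024 * 1024))
TARGET_DIR="${TARGET_DIR:-.}"

# df -P prints one line per filesystem; -k reports sizes in 1024-byte blocks.
available_kb=$(df -Pk "$TARGET_DIR" | awk 'NR == 2 { print $4 }')

if has_enough_space "$available_kb" "$REQUIRED_KB"; then
  echo "Enough free space in $TARGET_DIR"
else
  echo "Only ${available_kb} KB free in $TARGET_DIR; about ${REQUIRED_KB} KB recommended"
fi
```

Set `TARGET_DIR` to the folder where you plan to export the audio clips. Running `elixir --version` and `psql --version` prints the installed versions, which you can compare with the list above.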
Create a database.exs file under the folder config of the repository. The content of the file should look like this:
import Config

config :data_prepration, Elysium.Repo,
  database: "elysium",
  username: "<Your Database Username Here>",
  password: "<Your Database Password Here>",
  hostname: "localhost",
  log: :info # Change this to false to mute Ecto debug logs. Keep it otherwise.
Then run mix deps.get to install the dependencies of the project. Note that the file database.exs is necessary for setting up the database as well.
Make sure that you have created a user within PostgreSQL using the credentials in the file database.exs. Then run these commands to set up the database:
mix ecto.create
mix ecto.migrate
You will need to use Asset Studio to extract data from the asset files. Please purchase a copy of the game. I can give you a copy of the extracted data and the generated database as well if you cannot buy the game for some reason.
1. Locate your local installation of the game.
2. Open Asset Studio.
3. Load the file at <game root>/disco_Data/StreamingAssets/aa/StandaloneWindows64/dialoguebundle_assets_all_<some hash>.bundle.
4. Export all the assets you see in Asset Studio. There should only be one asset containing the bundled dialogue data.

You should see the folder MonoBehaviour within the location you chose in step #4.
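Because the bundle file name contains a hash, it can be easier to find it with a glob than to type it out. This is a small sketch under the assumption that the folder layout matches the path above; `find_dialogue_bundle` and `GAME_ROOT` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Illustrative helper; not part of the repository.

# Print every dialogue bundle file found under the given game root ($1).
find_dialogue_bundle() {
  for f in "$1"/disco_Data/StreamingAssets/aa/StandaloneWindows64/dialoguebundle_assets_all_*.bundle; do
    # When nothing matches, the unexpanded pattern is not a file and is skipped.
    if [ -f "$f" ]; then
      echo "$f"
    fi
  done
  return 0
}

# GAME_ROOT is a placeholder for your local installation folder.
GAME_ROOT="${GAME_ROOT:-.}"
find_dialogue_bundle "$GAME_ROOT"
```

If the command prints nothing, the game root is probably wrong or the installed version uses a different layout.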
1. Please make sure that you have the free disk space needed to store the audio clips. You should have around 65GB of free disk space.
2. Open Asset Studio.
3. Load the folder at <game root>/disco_Data/StreamingAssets/aa/StandaloneWindows64/.
4. Filter the assets by type, making sure that only AudioClip is checked.
5. Export the files to a folder of your choice. It will take a while.
6. You should see a new folder AudioClip within the folder you chose that contains all of the audio clips.
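Once the export finishes, a quick sanity check is to count the exported files and look at the total size. The sketch below assumes the clips were exported to a folder named AudioClip; `count_files` and `CLIP_DIR` are illustrative names, not something the repository provides.

```shell
#!/bin/sh
# Illustrative helper; not part of the repository.

# Print the number of regular files under the given folder ($1), including subfolders.
count_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# CLIP_DIR is a placeholder for the AudioClip folder created by the export.
CLIP_DIR="${CLIP_DIR:-./AudioClip}"

if [ -d "$CLIP_DIR" ]; then
  echo "Audio clips found: $(count_files "$CLIP_DIR")"
  # Total size on disk, in human-readable units.
  du -sh "$CLIP_DIR"
else
  echo "No folder at $CLIP_DIR"
fi
```

A total size far below the roughly 65GB mentioned above may mean that the export was interrupted.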
Run this command:
mix prepare_bundle <path to the dialogue bundle json file>
For example:
mix prepare_bundle '/extracted_assets/MonoBehaviour/Disco Elysium.json'
After running this task, you should see that the database configured in the file database.exs is populated with conversation, dialogue entry, actor, and item data.
Run this command:
mix label_audio_clips <path to the folder containing the audio clips>
For example:
mix label_audio_clips '/extracted_assets/AudioClip'
After running this task, you should see that the configured database is populated with audio clip metadata in the table audio_clips.
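One way to confirm that the table was populated is to count its rows with psql. This is only a sketch: it assumes the database is named elysium as in the example database.exs, that `DB_USER` is replaced with your own username, and that the password is available to psql (for example through the `PGPASSWORD` environment variable). `count_query`, `DB_NAME`, and `DB_USER` are illustrative names.

```shell
#!/bin/sh
# Illustrative check; not part of the repository.

# Build a row-count query for the given table name ($1).
count_query() {
  printf 'SELECT count(*) FROM %s;' "$1"
}

DB_NAME="${DB_NAME:-elysium}"   # database name used in the example database.exs
DB_USER="${DB_USER:-postgres}"  # assumption: replace with your own database username

if command -v psql >/dev/null 2>&1; then
  # -w: never prompt for a password, so the command fails instead of waiting for input.
  psql -w -h localhost -U "$DB_USER" -d "$DB_NAME" -c "$(count_query audio_clips)" \
    || echo "Query failed; check the credentials in database.exs"
else
  echo "psql not found on PATH"
fi
```

A count of zero suggests that the task did not find any clips in the folder you passed to it.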
If you are interested in contributing or reporting bugs, please check the issue list. Constructive feedback is appreciated.