consumes youtube urls and produces voiceless instrumentals in all 12 keys

bibby/rekey-karaoke


rekey-karaoke

A web interface and backend processing queue that turns the YouTube URL of a song (instruments plus one or more voices) into instrumentals in all 12 keys. This could be useful for singers, bands, worship teams, and other vocal performers who want practice tracks suited to their range.
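To make "all 12 keys" concrete, here is a minimal sketch of how the semitone shift for each target key could be derived from a detected original key. The names `PITCH_CLASSES` and `semitone_shifts` are hypothetical and not taken from this repo, and folding shifts into the range -6..+5 (so each key is reached by the smallest pitch change) is an assumption rather than a description of what the project does. Major/minor mode is ignored here.

```python
# Pitch classes in ascending semitone order (sharps only, for simplicity).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def semitone_shifts(original_key: str) -> dict[str, int]:
    """Map each of the 12 keys to a semitone shift from original_key.

    Shifts are folded into -6..+5 so that every target key is reached with
    the smallest pitch change (an assumption, not the project's confirmed
    behaviour).
    """
    start = PITCH_CLASSES.index(original_key)
    shifts = {}
    for target, name in enumerate(PITCH_CLASSES):
        up = (target - start) % 12  # 0..11 semitones upward
        # Fold 6..11 semitones up into -6..-1 semitones down.
        shifts[name] = up - 12 if up >= 6 else up
    return shifts


if __name__ == "__main__":
    for key, shift in semitone_shifts("A").items():
        print(f"{key:>2}: {shift:+d} semitones")
```

A shift of 0 corresponds to the original key, so a track detected in A yields one unshifted output and eleven transposed ones.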

This is a personal toy project, and has no guarantee of quality or fitness for purpose.


Setup

The intended deployment of this application is as a Docker container that can start additional containers, so /var/run/docker.sock must be mounted into it as a volume (see docker-compose.yml).

Volumes on the Docker host are used for temporary storage of process artifacts. Artifacts are ultimately uploaded to S3 at a bucket you specify, but the temp storage is used to pass data between containers during the processing steps, and it is cleaned up as a final step.
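As a rough illustration of how one step might hand files to a sibling container through a shared host directory, the sketch below builds a `docker run` command line with a bind mount. The function name `docker_run_argv`, the `/data` mount point, and the example image and paths are all assumptions for illustration; the actual server may launch containers differently (for example through a Docker client library).

```python
import shlex


def docker_run_argv(image, host_dir, args, uid=1000, mount_point="/data"):
    """Build an argv list for a one-off container sharing host_dir.

    host_dir must be a path on the Docker *host*, because the daemon behind
    /var/run/docker.sock resolves bind mounts on the host, not inside the
    container that issues the command.
    """
    return [
        "docker", "run",
        "--rm",                              # remove the container on exit
        "--user", str(uid),                  # run as the non-root user
        "-v", f"{host_dir}:{mount_point}",   # share the temp directory
        image,
        *args,
    ]


if __name__ == "__main__":
    # Hypothetical image, path, and arguments.
    argv = docker_run_argv("some/image", "/srv/rekey/tmp/job-1", ["--help"])
    print(shlex.join(argv))
```

Returning a list rather than a shell string means the command could be passed to `subprocess.run` without shell quoting concerns.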

Decide on a root volume.

Edit ./build.sh at the root of this repo and set the variable vol (line 2) to your volume. Part of the build creates the necessary subdirectories, which are owned by a non-root user. If that non-root user is not uid 1000, edit ./build.sh line 22 to declare the correct uid.

env file

Inspect ./server/consts.py to see what other options are available for a configuration env file (see: example.env). Minimally, your config file should include:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • S3_BUCKET
  • DOCKER_USER (if other than 1000)
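A minimal sketch of how these settings could be validated at startup, assuming they arrive as environment variables. The function `load_config` and the `REQUIRED` tuple are hypothetical and are not the contents of ./server/consts.py; only the variable names and the default uid of 1000 come from the list above.

```python
import os

# The three settings listed as required above.
REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "S3_BUCKET")


def load_config(env=None):
    """Return a config dict, raising if a required setting is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED}
    # DOCKER_USER is optional and falls back to uid 1000.
    config["DOCKER_USER"] = int(env.get("DOCKER_USER", "1000"))
    return config
```

Passing a plain dict instead of relying on `os.environ` makes it easy to check a candidate env file before starting the containers.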

Edit docker-compose.yml to name your env file for config values.

Process Steps

When a YouTube URL is supplied, it enters an initial queue and steps through the following processes:

  • MetaData : Gets song title, thumbnail, length, etc., using image thr3a/yt-dlp or YTDLP_IMAGE
  • Download : Initial audio is fetched with image thr3a/yt-dlp or YTDLP_IMAGE
  • KeyDetect : The original song key is detected/best-guessed with the image sourced at ./key-detect. This is a found script whose original author I cannot locate at this time, but I will try to attribute it properly ASAP
  • Split : Vocals get isolated and separated using image deezer/spleeter or SPLEETER_IMAGE
  • ReKey : Artifacts (instrumentals AND vocals) are transposed using the image sourced at ./rubber, which is a light wrapper for the Rubber Band pitch-shifting library
  • Encode : WAV files are converted to MP3 with ffmpeg
  • Upload : with boto3
  • Cleanup
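The ordering of the steps above can be sketched as a simple enumeration with a helper that returns the step that follows. The `Step` enum and `next_step` function are hypothetical names for illustration; the real queue implementation in this repo may be organised differently.

```python
from enum import Enum


class Step(Enum):
    """Processing steps, declared in the order they run."""
    METADATA = "MetaData"
    DOWNLOAD = "Download"
    KEY_DETECT = "KeyDetect"
    SPLIT = "Split"
    REKEY = "ReKey"
    ENCODE = "Encode"
    UPLOAD = "Upload"
    CLEANUP = "Cleanup"


# Enum members iterate in definition order.
ORDER = list(Step)


def next_step(current):
    """Return the step after `current`, or None once Cleanup is done."""
    i = ORDER.index(current)
    return ORDER[i + 1] if i + 1 < len(ORDER) else None


if __name__ == "__main__":
    step = Step.METADATA
    while step is not None:
        print(step.value)
        step = next_step(step)
```

Keeping the order in one place means a job record only needs to store its current step name to be resumed.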

The web app uses the peewee ORM on a sqlite3 db, with tailwind and htmx on the front end.

Auth

There is no auth. Auth is fake. This is a toy.

Build

./build.sh

Start

docker-compose up -d

TODO

  • Polls at regular intervals. Would prefer long polling.
  • Processed audio could be stitched back onto the original video (like lyric videos), but video is currently ignored
  • Actual auth might be nice.
  • This could be a fun project to convert to AWS Lambda or other serverless/FaaS platform
