diff --git a/README.md b/README.md
index 3d22d47..3e655bf 100644
--- a/README.md
+++ b/README.md
@@ -76,66 +76,12 @@ archival purposes, and portability to other projects.
 
 # AI Transcription
 
-Perhaps a fully automated process will be forthcomming, but for now I'm
-manually running commands to transcribe audio files with AI, and copying the
-result by hand into existing markdown files inside of `./assets/todo/`. The
-following commands are all in the nix dev shell (which you can enter using
-`nix develop` if your not using direnv).
+`flake.nix` packages a `bash` script named `transcribe`. It downloads the source audio of any file in `./assets`, transcribes it, then updates the asset with the transcription, and updates the frontmatter data to reflect this change.
 
-First I pick a file from `./assets/todo` that doesn't have a transcript. Say
-the filename begins with the date 2022-02-02. Well, I copy the url from it's
-frontmatter key `source.url` and use `yt-dlp` to download the audio stream
-and output the result to a file with that date as it's name:
+1. Argument #1 is the markdown file to transcribe and update.
+2. Argument #2 is your name, to log in the asset's metadata.
 
 ```bash
-yt-dlp -x "https://website.com/some-video-or-audio-file-url" -o 2022-02-02
+nix run github:marcuswhybrow/ray-peat-rodeo#transcribe -- ./assets/todo/2024-10-12-example.md "Marcus Whybrow"
 ```
 
-Sometimes the output file will be called `2022-02-02.opus` or some other
-extension, sometimes it will have no extension. Let's assume it's `.opus`.
-
-I then ask Whisper AI to transcribe the audio file and output a JSON file
-describing the results. I believe it's faster to tell Whisper it's an English
-language conversion:
-
-```bash
-whisper --language English --output_format json 2022-02-02.opus
-```
-
-This takes a while, and a great while on old laptops. But once it's done you
-shoud have a file in the same directory called `2022-02-02.json`. Whisper has
-many output formats, but I've chosen JSON for it's flexibility in the next step.
-
-The closest format whisper can output is `txt`. But this has no timestamp data
-in the output text. I'd like to pepper in timestamps (which whisper knows
-about) every minute or so into the resulting output. And I want them to adhere
-to our custom markdown extension format: `[h:mm:ss]` e.g. `[1:23:45]`. The
-square brackets are important.
-
-So I call a custom tool written for this project that reads the JSON, ouputting
-text in the way I've just descibed. I use linux redirection to append that
-result to the end of the markdown file I started with:
-
-```bash
-whisper-json2md source-audio.json >> ./assets/todo/2022-02-02-example.md
-```
-
-Then I have a look at this markdown file, and check it out in the browser
-(which would be https://localhost:8000/example in this example).
-
-Finally I update the frontmatter to reflect the new state of this asset.
-I add the following:
-
-```yaml
-transcription:
-  date: 2024-04-10 # todays date
-  author: Whisper AI
-  kind: auto-generated
-
-added:
-  date: 2024-04-10
-  author: Marcus Whybrow # or your name instead
-```
-
-When the website is deployed this metadata makes sure everything looks right,
-and the appropriate descriptions and details are available.
diff --git a/flake.nix b/flake.nix
index 4007640..04f13da 100644
--- a/flake.nix
+++ b/flake.nix
@@ -50,6 +50,14 @@
             cp -r ./internal/assets/* ./build/assets
             mv ./build $out
           '';
+
+          meta = {
+            description = "Takes a Whisper AI JSON file as its first argument and outputs markdown to stdout, suitable for appending to a Ray Peat Rodeo markdown file.";
+            homepage = "https://github.com/marcuswhybrow/ray-peat-rodeo";
+            maintainers = [
+              "Marcus Whybrow "
+            ];
+          };
         };
 
         whisper-json2md = pkgs.buildGoApplication {
@@ -65,6 +73,42 @@
           '';
         };
 
+        transcribe = pkgs.writeScriptBin "transcribe" ''
+          set -o xtrace
+
+          asset_path="$1"
+          author="$2"
+
+          asset_name=$(basename "$asset_path")
+          source_url=$(${pkgs.yq-go}/bin/yq ".source.url | select(.)" "$asset_path")
+
+          tmp_dir_audio=$(mktemp --directory)
+          audio_path="$tmp_dir_audio/$asset_name"
+
+          ${pkgs.yt-dlp}/bin/yt-dlp -x "$source_url" -o "$audio_path"
+          audio_name_actual=$(ls -AU "$tmp_dir_audio" | head -1)
+          audio_path_actual="$tmp_dir_audio/$audio_name_actual"
+
+          ls "$tmp_dir_audio"
+
+          tmp_dir_json=$(mktemp --directory)
+          ${pkgs.openai-whisper}/bin/whisper --language English --output_format json --output_dir "$tmp_dir_json" "$audio_path_actual"
+          json_name=$(ls -AU "$tmp_dir_json" | head -1)
+          json_path="$tmp_dir_json/$json_name"
+
+          today=$(date +"%Y-%m-%d")
+          yq="${pkgs.yq-go}/bin/yq --front-matter process --inplace"
+          $yq ".transcription.date = \"$today\"" "$asset_path"
+          $yq ".transcription.author = \"Whisper AI\"" "$asset_path"
+          $yq ".transcription.kind = \"auto-generated\"" "$asset_path"
+          $yq ".added.author = \"$author\"" "$asset_path"
+          $yq ".added.date = \"$today\"" "$asset_path"
+          ${inputs.self.packages.x86_64-linux.whisper-json2md}/bin/whisper-json2md "$json_path" >> "$asset_path"
+
+          # rm -r "$tmp_dir_audio"
+          # rm -r "$tmp_dir_json"
+        '';
+
         default = build;
       };
 
@@ -131,6 +175,10 @@
 
             # Custom tool to convert Whisper JSON output to our markdown format
             inputs.self.packages.x86_64-linux.whisper-json2md
+
+            # Convenience bash script using yt-dlp, whisper & whisper-json2md to
+            # transcribe and update assets with a `source.url` in the frontmatter.
+            inputs.self.packages.x86_64-linux.transcribe
           ];
         };
       });