This project demonstrates parallel audio transcription using OpenAI's Whisper model on Kubernetes/OpenShift, orchestrated by Kueue for workload management. It downloads audio files from the Rev.com earnings22 speech dataset and transcribes them using either CPU or GPU.
For experienced users, here's the TL;DR version:
```bash
# 1. Build and push containers
podman build -t quay.io/<user>/alpine:latest -f containers/init/Dockerfile containers/init/ && podman push quay.io/<user>/alpine:latest
podman build -t quay.io/<user>/whisper:latest -f containers/whisper/Dockerfile containers/whisper/ && podman push quay.io/<user>/whisper:latest

# 2. Create namespace and apply Kueue configs
oc create namespace sai
oc apply -f kueue/manifests/resource-flavor.yml
oc apply -f kueue/manifests/cluster-queue.yml
oc apply -f kueue/manifests/local-queue.yml

# 3. (Optional) Configure GitHub token
cp kueue/manifests/github-token-secret.yml.template kueue/manifests/github-token-secret.yml
# Edit github-token-secret.yml with your token, then:
oc apply -f kueue/manifests/github-token-secret.yml

# 4. Deploy ConfigMap and Job
oc apply -f kueue/manifests/download-script-configmap.yml
oc apply -f kueue/manifests/whisper-gpu.yml

# 5. Monitor
oc get workloads -n sai
oc get pods -n sai
```

See the detailed instructions below for more information.
```
.
├── containers/
│   ├── init/
│   │   └── Dockerfile          # Alpine image with curl and jq for downloading files
│   └── whisper/
│       └── Dockerfile          # Whisper transcription image with Python and ffmpeg
└── kueue/
    └── manifests/
        ├── resource-flavor.yml              # Defines CPU/GPU resource types
        ├── cluster-queue.yml                # Cluster-wide resource quota definitions
        ├── local-queue.yml                  # Namespace-scoped queue
        ├── download-script-configmap.yml    # Shell script for downloading audio files
        ├── whisper-gpu.yml                  # Main job manifest (indexed job)
        └── github-token-secret.yml.template # Template for GitHub API token
```
- OpenShift 4.19.4 cluster (or a compatible Kubernetes cluster)
- Podman (for building containers)
- kubectl/oc CLI tools
- Kueue installed from kubernetes-sigs/kueue
 
For GPU-accelerated transcription, install the following operators on OpenShift:
- Node Feature Discovery (NFD) Operator - Detects hardware features and labels nodes
- NVIDIA GPU Operator - Manages NVIDIA GPU resources
- Instantiate the ClusterPolicy CR after installing the GPU Operator; a quick verification command follows below
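Once the ClusterPolicy is instantiated, GPUs should surface as an allocatable resource on the worker nodes. A quick check (assuming the standard `nvidia.com/gpu` resource name):

```bash
# List each node together with its allocatable GPU count; an empty
# second column means the GPU stack is not (yet) advertising GPUs.
oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}'
```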
 
Follow these steps in order to set up the audio transcription pipeline:
Build the init container (Alpine with curl and jq for downloading audio files):
```bash
podman build -t quay.io/<your-username>/alpine:latest -f containers/init/Dockerfile containers/init/
podman push quay.io/<your-username>/alpine:latest
```

Build the Whisper container (Python with ffmpeg and openai-whisper for transcription):

```bash
podman build -t quay.io/<your-username>/whisper:latest -f containers/whisper/Dockerfile containers/whisper/
podman push quay.io/<your-username>/whisper:latest
```

Create a dedicated namespace for your transcription workloads:
```bash
oc create namespace sai
```

Apply the Kueue configuration manifests in this order:
Step 3.1: Resource Flavor - Defines available resource types (CPU/GPU flavors):
```bash
oc apply -f kueue/manifests/resource-flavor.yml
```

Step 3.2: Cluster Queue - Defines resource quotas (6 CPUs, 8Gi memory, 2 GPUs):
```bash
oc apply -f kueue/manifests/cluster-queue.yml
```

Step 3.3: Local Queue - Namespace-scoped queue linked to the cluster queue:
```bash
oc apply -f kueue/manifests/local-queue.yml
```
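For orientation, the three objects fit together roughly as follows. This is a sketch, not the contents of the actual manifests: the flavor and queue names are assumptions, and only the quotas mirror the values quoted above.

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor          # assumed flavor name
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue           # assumed queue name
spec:
  namespaceSelector: {}         # admit workloads from any namespace
  resourceGroups:
    - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
      flavors:
        - name: default-flavor
          resources:
            - name: cpu
              nominalQuota: 6
            - name: memory
              nominalQuota: 8Gi
            - name: nvidia.com/gpu
              nominalQuota: 2
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: local-queue             # assumed queue name
  namespace: sai
spec:
  clusterQueue: cluster-queue
```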
The download script accesses GitHub's API. Without authentication, you're limited to 60 requests/hour. To avoid rate limiting:

Step 4.1: Generate a GitHub personal access token with `repo` scope at https://github.com/settings/tokens
Step 4.2: Create a secret file from the template:
```bash
# Copy the template
cp kueue/manifests/github-token-secret.yml.template kueue/manifests/github-token-secret.yml

# Edit the file and replace YOUR_GITHUB_TOKEN_HERE with your actual token
# IMPORTANT: Never commit github-token-secret.yml to version control!

# Apply the secret
oc apply -f kueue/manifests/github-token-secret.yml
```

Security Note: The file kueue/manifests/github-token-secret.yml is git-ignored to prevent accidentally committing your token.
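The rendered secret presumably has a shape like the one below; the secret and key names here are illustrative assumptions and must match whatever whisper-gpu.yml references.

```yaml
# Hypothetical shape of github-token-secret.yml; see the template for
# the authoritative names.
apiVersion: v1
kind: Secret
metadata:
  name: github-token
  namespace: sai
type: Opaque
stringData:
  token: YOUR_GITHUB_TOKEN_HERE
```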
The ConfigMap contains a shell script that:
- Fetches the list of MP3 files from the earnings22/media directory via the GitHub API (core selection logic sketched below)
- Uses the `JOB_COMPLETION_INDEX` environment variable to select which file to download
- Handles pagination for large directories (100 files per page)
- Implements retry logic with exponential backoff for rate limiting
- Downloads the selected audio file to the `/data` directory using Git LFS media URLs
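A condensed sketch of that selection logic is shown below. The repository path and the `GITHUB_TOKEN` variable name are assumptions; the authoritative script, including pagination, retries, and the Git LFS download, lives in download-script-configmap.yml.

```bash
# Sketch only: list one page of the directory, keep the .mp3 entries,
# and let JOB_COMPLETION_INDEX pick this pod's file.
API="https://api.github.com/repos/revdotcom/speech-datasets/contents/earnings22/media"

auth=()
[ -n "${GITHUB_TOKEN:-}" ] && auth=(-H "Authorization: token ${GITHUB_TOKEN}")

files=$(curl -sf "${auth[@]}" "${API}?per_page=100&page=1" \
  | jq -r '.[] | select(.name | endswith(".mp3")) | .name')

# JOB_COMPLETION_INDEX is 0-based; sed line addresses are 1-based.
file=$(printf '%s\n' "$files" | sed -n "$((JOB_COMPLETION_INDEX + 1))p")
echo "index ${JOB_COMPLETION_INDEX} -> ${file}"
```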
Apply the ConfigMap:
```bash
oc apply -f kueue/manifests/download-script-configmap.yml
```

The job manifest (kueue/manifests/whisper-gpu.yml) uses an Indexed Job pattern for parallel processing:
- Parallelism: 2 pods run concurrently
- Completions: 6 total tasks (6 different audio files to process)
- Completion Mode: Indexed - each pod gets a unique `JOB_COMPLETION_INDEX` (0-5), as sketched below
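A sketch of the corresponding Job-level fields (the job name, queue label value, and `suspend` handling are assumptions; check whisper-gpu.yml for the authoritative values):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: whisper-transcription-cpu
  namespace: sai
  labels:
    kueue.x-k8s.io/queue-name: local-queue  # assumed LocalQueue name
spec:
  parallelism: 2            # two pods run at a time
  completions: 6            # six files in total
  completionMode: Indexed   # injects JOB_COMPLETION_INDEX (0-5)
  suspend: true             # Kueue unsuspends the job once admitted
```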
Init Container (download-audio):
- Uses the Alpine image with curl and jq
- Mounts the download script from the ConfigMap at `/scripts`
- Executes `download-audio.sh`, which uses `JOB_COMPLETION_INDEX` to download a specific MP3 file
- Saves the audio file to the shared `/data` volume
- Uses the GitHub token from the secret for API authentication (avoids rate limits)
 
Main Container (whisper-transcriber):
- Uses the Whisper image with Python, ffmpeg, and openai-whisper
- Reads the audio file from the shared `/data` volume
- Runs Whisper transcription with the `tiny.en` model (see the example command below)
- Outputs the transcription to `/tmp`
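A plausible invocation for this container is shown below; the exact command lives in whisper-gpu.yml, but the flags are standard openai-whisper CLI options, and the cache path matches the volume table that follows.

```bash
# Transcribe whatever the init container downloaded into /data,
# caching model weights on the shared model-cache volume.
whisper /data/*.mp3 \
  --model tiny.en \
  --model_dir /tmp/whisper_models \
  --output_dir /tmp
```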
| Volume Name | Type | Purpose | Mount Points |
|---|---|---|---|
| `audio-data` | `emptyDir` | Shares downloaded audio between init and main containers | Init: `/data`, Main: `/data` |
| `model-cache-volume` | `emptyDir` | Caches Whisper model files to avoid re-downloading | Main: `/tmp/whisper_models` |
| `download-script` | `configMap` | Provides the download script to the init container | Init: `/scripts` (executable) |
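In the pod spec, this wiring reduces to roughly the following (the ConfigMap name and file mode are assumptions):

```yaml
volumes:
  - name: audio-data
    emptyDir: {}
  - name: model-cache-volume
    emptyDir: {}
  - name: download-script
    configMap:
      name: download-script   # assumed ConfigMap name
      defaultMode: 0755       # lets the init container execute the script
```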
Before deploying, update the job manifest (kueue/manifests/whisper-gpu.yml) if needed:
- Change the namespace (default: `sai`)
- Update image references to match your container registry
- Adjust the parallelism and completions values if desired
 
Deploy the job:
```bash
oc apply -f kueue/manifests/whisper-gpu.yml
```

Check Kueue workload status:
```bash
oc get workloads -n sai
```

View job status:
```bash
oc get jobs -n sai
oc describe job whisper-transcription-cpu -n sai
```

View running pods:
```bash
oc get pods -n sai
```

Check logs for a specific pod:
```bash
# View init container logs (download process)
oc logs <pod-name> -n sai -c download-audio

# View main container logs (transcription process)
oc logs <pod-name> -n sai -c whisper-transcriber
```

Each pod in the job receives a unique `JOB_COMPLETION_INDEX` environment variable:
- Pod 1: `JOB_COMPLETION_INDEX=0` → downloads the file at index 0
- Pod 2: `JOB_COMPLETION_INDEX=1` → downloads the file at index 1
- ...and so on
 
This enables parallel processing of different files without coordination between pods. With parallelism=2, two files are processed simultaneously until all 6 completions are done.
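On recent Kubernetes (1.28+), each indexed pod also carries its completion index as a label, which makes it easy to locate the pod, and the logs, for a given file:

```bash
# Show each pod's completion index as an extra column.
oc get pods -n sai -L batch.kubernetes.io/job-completion-index

# Fetch the transcription logs for the pod handling index 0.
oc logs -n sai -c whisper-transcriber \
  -l batch.kubernetes.io/job-completion-index=0
```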
The provided job uses CPU resources. To enable GPU transcription:
- Ensure the GPU operators are installed and the nodes are labeled
- Uncomment the GPU limits in `whisper-gpu.yml`:

  ```yaml
  limits:
    nvidia.com/gpu: 1
  ```

- Modify the Whisper command to use GPU acceleration (requires a CUDA-compatible setup); see the example below
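For the last point: openai-whisper selects CUDA automatically when PyTorch detects a GPU, and the device can also be forced explicitly with the standard `--device` flag:

```bash
whisper /data/*.mp3 --model tiny.en --device cuda --output_dir /tmp
```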
 
```
┌─────────────────────────────────────────────────────────────┐
│                      Kueue ClusterQueue                     │
│                   (manages resource quotas)                 │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│                   Kueue LocalQueue (sai)                    │
│              (namespace-scoped queue)                       │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│             Indexed Job (6 completions, 2 parallel)         │
├─────────────────────────────────────────────────────────────┤
│  Pod [0]                              Pod [1]               │
│  ┌────────────────┐                  ┌────────────────┐     │
│  │ Init: Download │ JOB_INDEX=0      │ Init: Download │ ... │
│  │ script+GitHub  │───────────▶file0 │ script+GitHub  │     │
│  └───────┬────────┘                  └───────┬────────┘     │
│          │ /data (emptyDir)                  │              │
│          ▼                                   ▼              │
│  ┌────────────────┐                  ┌────────────────┐     │
│  │ Main: Whisper  │                  │ Main: Whisper  │     │
│  │ transcription  │                  │ transcription  │     │
│  └────────────────┘                  └────────────────┘     │
└─────────────────────────────────────────────────────────────┘
```
Audio files are sourced from the Rev.com Speech Datasets - earnings22 collection, which contains earnings call recordings in MP3 format. The download script automatically filters for .mp3 files and handles GitHub's pagination to support large directories.
| Issue | Solution |
|---|---|
| GitHub rate limit errors | Ensure the GitHub token secret is properly configured (Step 4) |
| Pod failures during download | Check init container logs: `oc logs <pod-name> -n sai -c download-audio` |
| GPU not detected | Verify the NFD and GPU operators are running (`oc get pods -n nvidia-gpu-operator`) and that nodes carry GPU labels (`oc get nodes --show-labels \| grep nvidia`) |
| Job stuck in queue | Check Kueue workload admission status: `oc describe workload <name> -n sai` |
| Image pull errors | Verify container images are pushed to your registry and the image references in the manifests are correct |
| Transcription fails | Check main container logs: `oc logs <pod-name> -n sai -c whisper-transcriber` |
| Software | Version | 
|---|---|
| OpenShift | 4.19.16 | 
| Kubernetes | 1.32 | 
| DAS | 0.1.0 | 
| Kueue | |
| NVIDIA GPU Operator | 25.3.4 | 
| Node Feature Discovery Operator | 4.19.0-202510142112 | 
| cert-manager Operator | 1.17.0 | 
| Gateway API Inference extension | v0.5.1 | 
| OpenShift Service Mesh (Required for Istio) | 3.1.3 | 
| Istio | v1.26.3 | 
| llm-d-inference-scheduler | v0.3.2 | 
| vLLM (From ghcr.io/llm-d/llm-d-cuda:v0.3.0) | 0.11.0rc6+precompiled | 
| Guidellm | v0.3.1 |