This project demonstrates parallel audio transcription using OpenAI's Whisper model on Kubernetes/OpenShift, orchestrated by Kueue for workload management. It downloads audio files from the Rev.com earnings22 speech dataset and transcribes them using either CPU or GPU.
For experienced users, here's the TL;DR version:
```bash
# 1. Build and push containers
podman build -t quay.io/<user>/alpine:latest -f containers/init/Dockerfile containers/init/ && podman push quay.io/<user>/alpine:latest
podman build -t quay.io/<user>/whisper:latest -f containers/whisper/Dockerfile containers/whisper/ && podman push quay.io/<user>/whisper:latest

# 2. Create namespace and apply Kueue configs
oc create namespace sai
oc apply -f kueue/manifests/resource-flavor.yml
oc apply -f kueue/manifests/cluster-queue.yml
oc apply -f kueue/manifests/local-queue.yml

# 3. (Optional) Configure GitHub token
cp kueue/manifests/github-token-secret.yml.template kueue/manifests/github-token-secret.yml
# Edit github-token-secret.yml with your token, then:
oc apply -f kueue/manifests/github-token-secret.yml

# 4. Deploy ConfigMap and Job
oc apply -f kueue/manifests/download-script-configmap.yml
oc apply -f kueue/manifests/whisper-gpu.yml

# 5. Monitor
oc get workloads -n sai
oc get pods -n sai
```

See the detailed instructions below for more information.
```
.
├── containers/
│   ├── init/
│   │   └── Dockerfile          # Alpine image with curl and jq for downloading files
│   └── whisper/
│       └── Dockerfile          # Whisper transcription image with Python and ffmpeg
└── kueue/
    └── manifests/
        ├── resource-flavor.yml              # Defines CPU/GPU resource types
        ├── cluster-queue.yml                # Cluster-wide resource quota definitions
        ├── local-queue.yml                  # Namespace-scoped queue
        ├── download-script-configmap.yml    # Shell script for downloading audio files
        ├── whisper-gpu.yml                  # Main job manifest (indexed job)
        └── github-token-secret.yml.template # Template for GitHub API token
```
- OpenShift 4.19.4 cluster (or a compatible Kubernetes cluster)
- Podman (for building containers)
- kubectl/oc CLI tools
- Kueue installed from kubernetes-sigs/kueue
 
For GPU-accelerated transcription, install the following operators on OpenShift:
- Node Feature Discovery (NFD) Operator - Detects hardware features and labels nodes
- NVIDIA GPU Operator - Manages NVIDIA GPU resources
- Instantiate the ClusterPolicy CR after installing the GPU Operator; a quick verification command follows below
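Once the ClusterPolicy is instantiated, GPUs should surface as an allocatable resource on the worker nodes. A quick check (assuming the standard `nvidia.com/gpu` resource name):

```bash
# List each node together with its allocatable GPU count; an empty
# second column means the GPU stack is not (yet) advertising GPUs.
oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}'
```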
 
Follow these steps in order to set up the audio transcription pipeline:
Build the init container (Alpine with curl and jq for downloading audio files):
```bash
podman build -t quay.io/<your-username>/alpine:latest -f containers/init/Dockerfile containers/init/
podman push quay.io/<your-username>/alpine:latest
```

Build the Whisper container (Python with ffmpeg and openai-whisper for transcription):

```bash
podman build -t quay.io/<your-username>/whisper:latest -f containers/whisper/Dockerfile containers/whisper/
podman push quay.io/<your-username>/whisper:latest
```

Create a dedicated namespace for your transcription workloads:
```bash
oc create namespace sai
```

Apply the Kueue configuration manifests in this order:
Step 3.1: Resource Flavor - Defines available resource types (CPU/GPU flavors):
```bash
oc apply -f kueue/manifests/resource-flavor.yml
```

Step 3.2: Cluster Queue - Defines resource quotas (6 CPUs, 8Gi memory, 2 GPUs):
```bash
oc apply -f kueue/manifests/cluster-queue.yml
```

Step 3.3: Local Queue - Namespace-scoped queue linked to the cluster queue:
```bash
oc apply -f kueue/manifests/local-queue.yml
```
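For orientation, the three objects fit together roughly as follows. This is a sketch, not the contents of the actual manifests: the flavor and queue names are assumptions, and only the quotas mirror the values quoted above.

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor          # assumed flavor name
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue           # assumed queue name
spec:
  namespaceSelector: {}         # admit workloads from any namespace
  resourceGroups:
    - coveredResources: ["cpu", "memory", "nvidia.com/gpu"]
      flavors:
        - name: default-flavor
          resources:
            - name: cpu
              nominalQuota: 6
            - name: memory
              nominalQuota: 8Gi
            - name: nvidia.com/gpu
              nominalQuota: 2
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  name: local-queue             # assumed queue name
  namespace: sai
spec:
  clusterQueue: cluster-queue
```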
The download script accesses GitHub's API. Without authentication, you're limited to 60 requests/hour. To avoid rate limiting:

Step 4.1: Generate a GitHub personal access token with `repo` scope at https://github.com/settings/tokens
Step 4.2: Create a secret file from the template:
```bash
# Copy the template
cp kueue/manifests/github-token-secret.yml.template kueue/manifests/github-token-secret.yml

# Edit the file and replace YOUR_GITHUB_TOKEN_HERE with your actual token
# IMPORTANT: Never commit github-token-secret.yml to version control!

# Apply the secret
oc apply -f kueue/manifests/github-token-secret.yml
```

Security Note: The file kueue/manifests/github-token-secret.yml is git-ignored to prevent accidentally committing your token.
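The rendered secret presumably has a shape like the one below; the secret and key names here are illustrative assumptions and must match whatever whisper-gpu.yml references.

```yaml
# Hypothetical shape of github-token-secret.yml; see the template for
# the authoritative names.
apiVersion: v1
kind: Secret
metadata:
  name: github-token
  namespace: sai
type: Opaque
stringData:
  token: YOUR_GITHUB_TOKEN_HERE
```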
The ConfigMap contains a shell script that:
- Fetches the list of MP3 files from the earnings22/media directory via the GitHub API (core selection logic sketched below)
- Uses the `JOB_COMPLETION_INDEX` environment variable to select which file to download
- Handles pagination for large directories (100 files per page)
- Implements retry logic with exponential backoff for rate limiting
- Downloads the selected audio file to the `/data` directory using Git LFS media URLs
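A condensed sketch of that selection logic is shown below. The repository path and the `GITHUB_TOKEN` variable name are assumptions; the authoritative script, including pagination, retries, and the Git LFS download, lives in download-script-configmap.yml.

```bash
# Sketch only: list one page of the directory, keep the .mp3 entries,
# and let JOB_COMPLETION_INDEX pick this pod's file.
API="https://api.github.com/repos/revdotcom/speech-datasets/contents/earnings22/media"

auth=()
[ -n "${GITHUB_TOKEN:-}" ] && auth=(-H "Authorization: token ${GITHUB_TOKEN}")

files=$(curl -sf "${auth[@]}" "${API}?per_page=100&page=1" \
  | jq -r '.[] | select(.name | endswith(".mp3")) | .name')

# JOB_COMPLETION_INDEX is 0-based; sed line addresses are 1-based.
file=$(printf '%s\n' "$files" | sed -n "$((JOB_COMPLETION_INDEX + 1))p")
echo "index ${JOB_COMPLETION_INDEX} -> ${file}"
```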
Apply the ConfigMap:
```bash
oc apply -f kueue/manifests/download-script-configmap.yml
```

The job manifest (kueue/manifests/whisper-gpu.yml) uses an Indexed Job pattern for parallel processing:
- Parallelism: 2 pods run concurrently
- Completions: 6 total tasks (6 different audio files to process)
- Completion Mode: Indexed - each pod gets a unique `JOB_COMPLETION_INDEX` (0-5), as sketched below
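A sketch of the corresponding Job-level fields (the job name, queue label value, and `suspend` handling are assumptions; check whisper-gpu.yml for the authoritative values):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: whisper-transcription-cpu
  namespace: sai
  labels:
    kueue.x-k8s.io/queue-name: local-queue  # assumed LocalQueue name
spec:
  parallelism: 2            # two pods run at a time
  completions: 6            # six files in total
  completionMode: Indexed   # injects JOB_COMPLETION_INDEX (0-5)
  suspend: true             # Kueue unsuspends the job once admitted
```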
Init Container (download-audio):
- Uses the Alpine image with curl and jq
- Mounts the download script from the ConfigMap at `/scripts`
- Executes `download-audio.sh`, which uses `JOB_COMPLETION_INDEX` to download a specific MP3 file
- Saves the audio file to the shared `/data` volume
- Uses the GitHub token from the secret for API authentication (avoids rate limits)
 
Main Container (whisper-transcriber):
- Uses the Whisper image with Python, ffmpeg, and openai-whisper
- Reads the audio file from the shared `/data` volume
- Runs Whisper transcription with the `tiny.en` model (see the example command below)
- Outputs the transcription to `/tmp`
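A plausible invocation for this container is shown below; the exact command lives in whisper-gpu.yml, but the flags are standard openai-whisper CLI options, and the cache path matches the volume table that follows.

```bash
# Transcribe whatever the init container downloaded into /data,
# caching model weights on the shared model-cache volume.
whisper /data/*.mp3 \
  --model tiny.en \
  --model_dir /tmp/whisper_models \
  --output_dir /tmp
```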
| Volume Name | Type | Purpose | Mount Points |
|---|---|---|---|
| `audio-data` | `emptyDir` | Shares downloaded audio between init and main containers | Init: `/data`, Main: `/data` |
| `model-cache-volume` | `emptyDir` | Caches Whisper model files to avoid re-downloading | Main: `/tmp/whisper_models` |
| `download-script` | `configMap` | Provides the download script to the init container | Init: `/scripts` (executable) |
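In the pod spec, this wiring reduces to roughly the following (the ConfigMap name and file mode are assumptions):

```yaml
volumes:
  - name: audio-data
    emptyDir: {}
  - name: model-cache-volume
    emptyDir: {}
  - name: download-script
    configMap:
      name: download-script   # assumed ConfigMap name
      defaultMode: 0755       # lets the init container execute the script
```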
Before deploying, update the job manifest (kueue/manifests/whisper-gpu.yml) if needed:
- Change the namespace (default: `sai`)
- Update image references to match your container registry
- Adjust the parallelism and completions values if desired
 
Deploy the job:
```bash
oc apply -f kueue/manifests/whisper-gpu.yml
```

Check Kueue workload status:
```bash
oc get workloads -n sai
```

View job status:
```bash
oc get jobs -n sai
oc describe job whisper-transcription-cpu -n sai
```

View running pods:
```bash
oc get pods -n sai
```

Check logs for a specific pod:
```bash
# View init container logs (download process)
oc logs <pod-name> -n sai -c download-audio

# View main container logs (transcription process)
oc logs <pod-name> -n sai -c whisper-transcriber
```

Each pod in the job receives a unique `JOB_COMPLETION_INDEX` environment variable:
- Pod 1: `JOB_COMPLETION_INDEX=0` → downloads the file at index 0
- Pod 2: `JOB_COMPLETION_INDEX=1` → downloads the file at index 1
- ...and so on
 
This enables parallel processing of different files without coordination between pods. With parallelism=2, two files are processed simultaneously until all 6 completions are done.
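On recent Kubernetes (1.28+), each indexed pod also carries its completion index as a label, which makes it easy to locate the pod, and the logs, for a given file:

```bash
# Show each pod's completion index as an extra column.
oc get pods -n sai -L batch.kubernetes.io/job-completion-index

# Fetch the transcription logs for the pod handling index 0.
oc logs -n sai -c whisper-transcriber \
  -l batch.kubernetes.io/job-completion-index=0
```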
The provided job uses CPU resources. To enable GPU transcription:
- Ensure the GPU operators are installed and the nodes are labeled
- Uncomment the GPU limits in `whisper-gpu.yml`:

  ```yaml
  limits:
    nvidia.com/gpu: 1
  ```

- Modify the Whisper command to use GPU acceleration (requires a CUDA-compatible setup); see the example below
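For the last point: openai-whisper selects CUDA automatically when PyTorch detects a GPU, and the device can also be forced explicitly with the standard `--device` flag:

```bash
whisper /data/*.mp3 --model tiny.en --device cuda --output_dir /tmp
```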
 
```
┌─────────────────────────────────────────────────────────────┐
│                      Kueue ClusterQueue                     │
│                   (manages resource quotas)                 │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│                   Kueue LocalQueue (sai)                    │
│              (namespace-scoped queue)                       │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────────────┐
│             Indexed Job (6 completions, 2 parallel)         │
├─────────────────────────────────────────────────────────────┤
│  Pod [0]                              Pod [1]               │
│  ┌────────────────┐                  ┌────────────────┐     │
│  │ Init: Download │ JOB_INDEX=0      │ Init: Download │ ... │
│  │ script+GitHub  │───────────▶file0 │ script+GitHub  │     │
│  └───────┬────────┘                  └───────┬────────┘     │
│          │ /data (emptyDir)                  │              │
│          ▼                                   ▼              │
│  ┌────────────────┐                  ┌────────────────┐     │
│  │ Main: Whisper  │                  │ Main: Whisper  │     │
│  │ transcription  │                  │ transcription  │     │
│  └────────────────┘                  └────────────────┘     │
└─────────────────────────────────────────────────────────────┘
```
Audio files are sourced from the Rev.com Speech Datasets - earnings22 collection, which contains earnings call recordings in MP3 format. The download script automatically filters for .mp3 files and handles GitHub's pagination to support large directories.
| Issue | Solution |
|---|---|
| GitHub rate limit errors | Ensure the GitHub token secret is properly configured (Step 4) |
| Pod failures during download | Check init container logs: `oc logs <pod-name> -n sai -c download-audio` |
| GPU not detected | Verify the NFD and GPU operators are running (`oc get pods -n nvidia-gpu-operator`) and that nodes carry GPU labels (`oc get nodes --show-labels \| grep nvidia`) |
| Job stuck in queue | Check Kueue workload admission status: `oc describe workload <name> -n sai` |
| Image pull errors | Verify container images are pushed to your registry and the image references in the manifests are correct |
| Transcription fails | Check main container logs: `oc logs <pod-name> -n sai -c whisper-transcriber` |
| Software | Version | 
|---|---|
| OpenShift | 4.19.16 | 
| Kubernetes | 1.32 | 
| DAS | 0.1.0 | 
| Kueue | |
| NVIDIA GPU Operator | 25.3.4 | 
| Node Feature Discovery Operator | 4.19.0-202510142112 | 
| cert-manager Operator | 1.17.0 | 
| Gateway API Inference extension | v0.5.1 | 
| OpenShift Service Mesh (Required for Istio) | 3.1.3 | 
| Istio | v1.26.3 | 
| llm-d-inference-scheduler | v0.3.2 | 
| vLLM (From ghcr.io/llm-d/llm-d-cuda:v0.3.0) | 0.11.0rc6+precompiled | 
| Guidellm | v0.3.1 |