AWS Transcribe API

A serverless AWS Lambda function that provides multi-language audio/video transcription services using AWS Transcribe. This service supports English, Chinese, Malay, and Indonesian languages and accepts S3 URLs for audio/video files.

✅ Deployment Status

Service: Successfully deployed and operational
Last Updated: October 1, 2025

Live API Endpoints:

POST https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/transcribe
GET https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/status
GET https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/health
POST https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url

Infrastructure:

✅ Lambda Functions: 4 functions deployed (25 MB each)
✅ S3 Bucket: Auto-created wenhao1223-transcribe-aws-transcribe-api-dev-20251001
✅ IAM Roles: Auto-generated with proper Transcribe and S3 permissions
✅ API Gateway: Configured with CORS support
✅ Security: Private bucket with encryption enabled
✅ Cost Optimization: 30-day lifecycle policy for auto-cleanup

🎤 Features

🤖 Automatic Language Detection: Automatically detects single or multiple languages in audio files
🌍 Multi-Language Support: Seamlessly handles mixed-language conversations and language switching
🎯 Smart Language Filtering: Optional candidate languages for improved accuracy and speed
🗣️ Language Support: English (en-us), Chinese (zh-cn), Malay (ms-my), Indonesian (id-id)
☁️ S3 Integration: Accept S3 URLs for audio/video files
🌐 RESTful API: Simple HTTP endpoints for transcription operations
⏱️ Real-time Status: Check transcription job status and retrieve results
⚡ Synchronous Processing: Process URLs and get immediate results
📊 Enhanced Response: Returns transcript with detected language information
🔗 CORS Enabled: Frontend-friendly with CORS support
🗑️ Auto S3 Cleanup: Automatic deletion of old transcripts after 30 days

🚀 Quick Test Commands

# 1. Health check
curl "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/health"

# 2. Quick transcribe with automatic language detection
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-s3-bucket.s3.amazonaws.com/sample-audio.m4a"}'

# 3. Quick transcribe with candidate languages (optional optimization)
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-s3-bucket.s3.amazonaws.com/sample-audio.m4a", "candidate_languages": ["en-us", "zh-cn"]}'

# 4. Start async transcription job (for very long audio files)
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/transcribe" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-bucket.s3.amazonaws.com/audio.mp3", "language": "en-us"}'

# 5. Check job status (replace JOB_NAME with actual job name)
curl "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/status?job_name=JOB_NAME"

📋 API Endpoints

1. Health Check

GET /health

Check if the service is running and get supported languages.

curl "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/health"

Response Format:

{
  "status": {
    "statusCode": 200,
    "message": "Service is healthy"
  },
  "data": {
    "service": "aws-transcribe-api",
    "supported_languages": ["en-us", "zh-cn", "ms-my", "id-id"],
    "language_detection": "automatic_multi_language_by_default",
    "features": {
      "automatic_language_identification": true,
      "multi_language_support": true,
      "single_and_mixed_language_audio": true,
      "speaker_labeling": true,
      "alternative_transcriptions": true,
      "candidate_language_filtering": true
    },
    "timestamp": "2023-12-01T12:34:56.789Z",
    "request_id": "test-request-id-123"
  }
}
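
Every endpoint wraps its payload in the same `status` / `data` envelope, so a client can unwrap responses in one place. The following is a minimal sketch, not code from this repository; the helper name `unwrap` is hypothetical, and it assumes the envelope shape shown in the examples of this README.

```python
import json
from typing import Any


def unwrap(body: str) -> dict[str, Any]:
    """Return the `data` object of a response body, raising on a non-200 status."""
    payload = json.loads(body)
    status = payload.get("status", {})
    # Examples in this README show statusCode both as 200 and as "200",
    # so coerce it to int before comparing.
    code = int(status.get("statusCode", 0))
    if code != 200:
        raise RuntimeError(f"API error {code}: {status.get('message', 'unknown error')}")
    return payload.get("data", {})
```

Calling `unwrap(response_text)["supported_languages"]` on a health response would then give the list of language codes.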

2. Start Transcription Job

POST /transcribe

Start a new asynchronous transcription job for an audio/video file.

Request Body:

{
  "url": "https://your-bucket.s3.amazonaws.com/audio-file.mp3",
  "language": "en-us"
}

Response Format:

{
  "status": {
    "statusCode": 200,
    "message": "Transcription job started successfully"
  },
  "data": {
    "job_name": "transcribe_job_20231201_123456_abc123",
    "job_status": "IN_PROGRESS",
    "language_code": "en-us",
    "media_url": "https://your-bucket.s3.amazonaws.com/audio-file.mp3",
    "creation_time": "2023-12-01T12:34:56.321000+00:00",
    "estimated_completion_time": "Processing time varies based on audio length"
  }
}
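
For callers who prefer Python over curl, the request above can be built with the standard library. This is a sketch under assumptions: `build_transcribe_request` is a hypothetical helper, the base URL is whatever your deployment prints, and the client-side language check simply mirrors the supported codes listed in this README.

```python
import json
import urllib.request

# Language codes this README lists as supported.
SUPPORTED_LANGUAGES = {"en-us", "zh-cn", "ms-my", "id-id"}


def build_transcribe_request(base_url: str, media_url: str, language: str) -> urllib.request.Request:
    """Build (but do not send) a POST /transcribe request."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    body = json.dumps({"url": media_url, "language": language}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/transcribe",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends it; the `job_name` in the response is what `/status` expects.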

3. Check Status

GET /status?job_name=<job_name>

Check the status of a transcription job and retrieve results if completed.

Response Format:

{
  "status": {
    "statusCode": 200,
    "message": "Job status retrieved successfully"
  },
  "data": {
    "job_name": "transcribe_job_20231201_123456_abc123",
    "status": "COMPLETED",
    "creation_time": "2023-12-01T12:34:56.789000+00:00",
    "completion_time": "2023-12-01T12:36:45.123000+00:00", // null if status is 'IN_PROGRESS'
    "language_code": "en-US",
    "transcript": "Hello, this is the transcribed text from your audio file.", // undefined if status is 'IN_PROGRESS'
    "transcript_uri": "https://s3.amazonaws.com/bucket/transcript.json", // undefined if status is 'IN_PROGRESS'
    "identitfied_language_code": "en-US" // undefined if status is 'IN_PROGRESS'
  }
}
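
Because `/transcribe` is asynchronous, a client typically polls `/status` until the job reaches a terminal state. The loop below is one possible sketch. `wait_for_job` and its `get_status` callable are hypothetical: `get_status` is assumed to call `GET /status?job_name=...` and return the `data` object. It also assumes the API passes through AWS Transcribe's `COMPLETED` and `FAILED` job states.

```python
import time
from typing import Callable

# Terminal AWS Transcribe job states; assumed to be passed through by /status.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def wait_for_job(
    get_status: Callable[[str], dict],
    job_name: str,
    interval: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job is COMPLETED or FAILED, then return its data."""
    for _ in range(max_attempts):
        data = get_status(job_name)
        if data.get("status") in TERMINAL_STATUSES:
            return data
        sleep(interval)  # wait before the next check
    raise TimeoutError(f"{job_name} did not finish after {max_attempts} checks")
```

Injecting `sleep` keeps the loop testable without real delays; the interval and attempt limit are arbitrary defaults to tune for your audio lengths.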

4. Process URL (Quick Transcribe) 🆕 Enhanced with Automatic Language Detection

POST /process-url

Process an S3 URL synchronously and return the completed transcript in the same response, with automatic multi-language detection. You do not need to specify a language: AWS Transcribe identifies single or multiple languages in your audio.

Simple Request (Automatic Detection):

{
  "url": "https://your-bucket.s3.amazonaws.com/audio-file.mp3"
}

Request with Language Candidates (Optional - Improves Accuracy):

{
  "url": "https://your-bucket.s3.amazonaws.com/audio-file.mp3",
  "candidate_languages": ["en-us", "zh-cn", "ms-my"]
}

Response Format:

{
  "status": {
    "statusCode": 200,
    "message": "Transcription completed successfully"
  },
  "data": {
    "transcript": "Hello, 你好, mixed language transcript.",
    "detected_languages": [
      {"LanguageCode": "en-US", "DurationInSeconds": 120.5},
      {"LanguageCode": "zh-CN", "DurationInSeconds": 45.2}
    ],
    "language_identification": [
      {"LanguageCode": "en-US", "Score": 0.95},
      {"LanguageCode": "zh-CN", "Score": 0.85}
    ]
  }
}

Key Features:

🤖 Zero Configuration: Works out of the box with any supported language
🌍 Multi-Language Ready: Handles language switching automatically
🎯 Optional Optimization: Use candidate_languages for better accuracy
⚡ Simple API: Just provide the URL, everything else is automatic

Usage Examples:

# Using the sample audio file
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url" \
  -H "Content-Type: application/json" \
  -d "{\"url\": \"https://your-s3-bucket-id.s3.us-east-1.amazonaws.com/sample-audio.m4a\"}"

# Using your own S3 URL
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url" \
  -H "Content-Type: application/json" \
  -d "{\"url\": \"https://your-bucket.s3.amazonaws.com/audio-file.mp3\"}"

Features:

✅ Synchronous: Returns completed transcript immediately
✅ Automatic Detection: Detects language(s) automatically
✅ No Job Tracking: No need to check status separately
✅ Quick Response: Best for shorter audio files
⚠️ Timeout: May time out for very long audio files (use the /transcribe endpoint instead)
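
One way to act on the timeout caveat is to try the synchronous endpoint first and fall back to an asynchronous job. This is only a sketch: `process_url` and `start_job` are hypothetical callables wrapping `POST /process-url` and `POST /transcribe`, and it assumes your HTTP client surfaces a timeout as `TimeoutError` (clients differ, so adapt the `except` clause).

```python
from typing import Callable


def transcribe_with_fallback(
    process_url: Callable[[str], dict],
    start_job: Callable[[str, str], dict],
    media_url: str,
    language: str = "en-us",
) -> dict:
    """Try the synchronous endpoint; on timeout, start an async job instead."""
    try:
        return {"mode": "sync", "data": process_url(media_url)}
    except TimeoutError:
        # The file was probably too long for a single request; the caller
        # should poll /status with the returned job name.
        return {"mode": "async", "data": start_job(media_url, language)}
```

The `mode` field tells the caller whether `data` already holds a transcript or only a job to poll.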

🚀 Supported Languages

| Language | API Code | AWS Transcribe Code |
|---|---|---|
| English (US) | `en-us` | `en-US` |
| Chinese (Simplified) | `zh-cn` | `zh-CN` |
| Malay (Malaysia) | `ms-my` | `ms-MY` |
| Indonesian | `id-id` | `id-ID` |
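
The mapping between the lowercase API codes and the AWS Transcribe codes can be expressed as a small lookup. This is an illustrative sketch; `to_aws_code` is a hypothetical helper, not a function from `lambda_handler.py`.

```python
# API code -> AWS Transcribe language code, taken from the table above.
AWS_LANGUAGE_CODES = {
    "en-us": "en-US",
    "zh-cn": "zh-CN",
    "ms-my": "ms-MY",
    "id-id": "id-ID",
}


def to_aws_code(code: str) -> str:
    """Translate an API language code into its AWS Transcribe equivalent."""
    key = code.strip().lower()
    if key not in AWS_LANGUAGE_CODES:
        supported = ", ".join(sorted(AWS_LANGUAGE_CODES))
        raise ValueError(f"Unsupported language {code!r}; use one of: {supported}")
    return AWS_LANGUAGE_CODES[key]
```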

🤖 Automatic Language Detection

How It Works

The API now automatically detects and handles languages using AWS Transcribe's multi-language identification:

Default Behavior (Zero Configuration):

{
  "url": "https://bucket.s3.amazonaws.com/audio.mp3"
}

✅ Single Language: Automatically detects the dominant language (English, Chinese, Malay, or Indonesian)
✅ Multiple Languages: Handles language switching within the same audio
✅ Mixed Conversations: Perfect for international meetings or multilingual content
✅ No Setup Required: Just provide the URL, everything else is automatic

Optional Optimization with Candidate Languages:

{
  "url": "https://bucket.s3.amazonaws.com/audio.mp3",
  "candidate_languages": ["en-us", "zh-cn"]
}

🎯 Improved Accuracy: Narrows detection to specific languages you expect
⚡ Faster Processing: Reduces detection time by limiting language options
🎪 Smart Filtering: Only considers languages you specify

Benefits of Automatic Detection

🔄 Backward Compatible: Existing code works without changes
🌐 Universal: Handles any combination of supported languages
📊 Detailed Results: Returns confidence scores and language identification info
⚡ Optimized: Uses AWS Transcribe's latest multi-language capabilities

🛠️ Setup and Deployment

Prerequisites

Node.js 18+
Python 3.10+
AWS CLI configured with proper credentials
Serverless Framework 4.x

Quick Deployment

Install dependencies:

npm install
pip install -r requirements.txt

Deploy to AWS:
```
serverless deploy
```

The deployment will automatically:

✅ Create S3 bucket for transcription outputs
✅ Set up IAM roles with proper permissions
✅ Deploy Lambda functions
✅ Configure API Gateway with CORS
✅ Set up lifecycle policies for cost optimization

Alternative Deployment Script

# Install, test, and deploy using the deployment script
python deploy_lambda.py install
python deploy_lambda.py test
python deploy_lambda.py deploy

Environment Variables (.env)

Create a .env file in the project root to configure AWS credentials and API endpoints:

# AWS Credentials
AWS_ACCESS_KEY_ID=your_access_key_here
AWS_SECRET_ACCESS_KEY=your_secret_key_here
AWS_REGION1=us-east-1  # Default AWS region

# LAMBDA API Endpoint (optional - for testing deployed APIs)
TRANSCRIBE_API_BASE_URL=https://your-api-id.execute-api.us-east-1.amazonaws.com/dev

TRANSCRIBE_API_URL=https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/transcribe
TRANSCRIBE_PROCESS_URL_API_URL=https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url
TRANSCRIBE_STATUS_API_URL=https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/status
TRANSCRIBE_HEALTH_API_URL=https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/health

SAMPLE_S3_AUDIO_URL=https://wenhao1223-sample-test-dev.s3.us-east-1.amazonaws.com/sample-audio-en.m4a
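
If you want to read this file from a script without extra dependencies, a small parser is enough. The sketch below is an assumption about how one might do it, not how `test_lambda.py` loads its settings; `parse_env` is a hypothetical helper that handles the `KEY=value  # comment` style used above.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks and comment lines."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "us-east-1  # Default AWS region".
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values
```

Reading the file is then `parse_env(open(".env", encoding="utf-8").read())`; keep `.env` out of version control since it holds credentials.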

🧪 Testing

Using Python Test Script

# Install requests library (if using HTTP tests)
pip install requests

# Generate HTML test interface (no API URL required)
python test_lambda.py --create-html

# Generate HTML test interface with pre-filled API URL
python test_lambda.py --create-html --api-url https://your-api-id.execute-api.us-east-1.amazonaws.com/dev

# Run automated tests
python test_lambda.py --api-url https://your-api-id.execute-api.us-east-1.amazonaws.com/dev

# Test with specific file
python test_lambda.py --api-url https://your-api-id.execute-api.us-east-1.amazonaws.com/dev --file media/sample-audio-en.m4a

Test Script Options:

--create-html: Generate an interactive HTML test interface (API URL optional)
--api-url: API Gateway URL (required for testing, optional for HTML generation)
--file: Specific file to upload (optional, creates test file if not provided)

Web Interface Testing

Open test_lambda.html in your browser for an interactive testing interface with:

Health check testing
File upload and transcription
Status monitoring
Real-time response display

Manual Testing with curl

Quick test commands:

# 1. Health Check
curl "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/health"

# 2. Quick Transcribe (immediate result)
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-s3-bucket.s3.amazonaws.com/sample-audio.m4a"}'

# 3. Start async transcription job
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/transcribe" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-bucket.s3.amazonaws.com/audio.mp3", "language": "en-us"}'

# 4. Check job status (replace JOB_NAME with actual job name)
curl "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/status?job_name=JOB_NAME"

Note: Replace your-s3-bucket and your-bucket with actual S3 bucket names containing your audio files.

🤖 Language Detection Behavior

Understanding how the automatic language detection works in different scenarios:

Single Language Detection:

When AWS Transcribe detects a single dominant language:

// AWS Internal Response Format
{
  "LanguageCode": "en-US",
  "IdentifiedLanguageScore": {
    "LanguageCode": "en-US", 
    "Score": 0.95
  }
}

Multi-Language Detection:

When AWS Transcribe detects multiple languages in the same audio:

// AWS Internal Response Format  
{
  "LanguageCodes": [
    {"LanguageCode": "en-US", "DurationInSeconds": 120.5},
    {"LanguageCode": "zh-CN", "DurationInSeconds": 45.2}
  ],
  "LanguageIdSettings": {...}
}
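
Since the two shapes above use different keys, code that reads them needs to handle both. The function below is a hedged sketch of one way to normalize them into a single list; `extract_languages` is a hypothetical name and this is not the handler's actual implementation.

```python
def extract_languages(job: dict) -> list[dict]:
    """Normalize single- and multi-language job results into one list."""
    if "LanguageCodes" in job:
        # Multi-language jobs report one entry per detected language.
        return [
            {
                "LanguageCode": entry["LanguageCode"],
                "DurationInSeconds": entry.get("DurationInSeconds"),
            }
            for entry in job["LanguageCodes"]
        ]
    code = job.get("LanguageCode")
    # Single-language jobs carry only a top-level LanguageCode (no duration).
    return [{"LanguageCode": code, "DurationInSeconds": None}] if code else []
```

Using `in` and `.get` instead of direct indexing avoids a `KeyError` when a job carries only one of the two fields.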

API Response Examples:

Single Language Detected:

{
  "status": {
    "statusCode": "200",
    "message": "Transcription completed successfully"
  },
  "data": {
    "message": "Hello, this is the transcript text.",
    "detected_language": "en-US",
    "detected_languages": [
      {"LanguageCode": "en-US", "DurationInSeconds": 120.5}
    ],
    "language_identification": [
      {"LanguageCode": "en-US", "Score": 0.95}
    ]
  }
}

Multiple Languages Detected:

{
  "status": {
    "statusCode": "200",
    "message": "Transcription completed successfully"
  },
  "data": {
    "message": "Hello, 你好, mixed language transcript.",
    "detected_language": "en-US",
    "detected_languages": [
      {"LanguageCode": "en-US", "DurationInSeconds": 120.5},
      {"LanguageCode": "zh-CN", "DurationInSeconds": 45.2}
    ],
    "language_identification": [
      {"LanguageCode": "en-US", "Score": 0.95},
      {"LanguageCode": "zh-CN", "Score": 0.85}
    ]
  }
}
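
A client that needs a single "main" language from a multi-language response can pick the entry with the longest duration. This is a sketch assuming the `detected_languages` and `detected_language` fields shown in the examples above; `dominant_language` is a hypothetical helper.

```python
from typing import Optional


def dominant_language(data: dict) -> Optional[str]:
    """Return the language code that covers the most audio time."""
    spans = data.get("detected_languages") or []
    if not spans:
        # Fall back to the single detected_language field, if present.
        return data.get("detected_language")
    longest = max(spans, key=lambda span: span.get("DurationInSeconds") or 0.0)
    return longest["LanguageCode"]
```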

Key Points:

🤖 Automatic: No configuration required, works out of the box
🌍 Universal: Handles single and multi-language scenarios seamlessly
🎯 Optimizable: Use candidate_languages for improved accuracy
📊 Detailed: Returns confidence scores and timing information

📁 Project Structure

transcribe-api/
├── lambda_handler.py      # Main Lambda function handlers
├── serverless.yml         # Serverless Framework configuration
├── requirements.txt       # Python dependencies
├── package.json          # Node.js dependencies
├── deploy_lambda.py      # Deployment automation script
├── test_lambda.py        # Local testing script
├── test_lambda.html      # Web testing interface (use `python test_lambda.py --create-html` to generate)
├── .env                  # Environment variables (local)
├── README.md            # This documentation
└── media/               # Sample audio files for testing
    ├── sample-audio-en.m4a
    └── sample-audio-mix.m4a

🔧 Configuration

Serverless Framework Configuration

The serverless.yml includes:

Auto IAM Roles: Automatic creation with Transcribe and S3 permissions
S3 Bucket: Auto-creation with security and lifecycle policies
API Gateway: RESTful endpoints with CORS support
Environment Variables: Automatic configuration
Security: Private bucket with encryption enabled
Cost Optimization: 30-day lifecycle rules

IAM Permissions

The auto-generated IAM role includes:

# Transcribe permissions
- transcribe:StartTranscriptionJob
- transcribe:GetTranscriptionJob  
- transcribe:ListTranscriptionJobs
- transcribe:DeleteTranscriptionJob

# S3 permissions
- s3:GetObject, s3:PutObject, s3:DeleteObject
- s3:ListBucket, s3:GetBucketLocation

# CloudWatch Logs permissions
- logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents

Error Handling

Common error scenarios:

400 Bad Request: Missing URL, invalid language, malformed JSON
404 Not Found: Transcription job not found
500 Internal Server Error: AWS service errors, unexpected exceptions
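
The 400 cases listed above can be illustrated with a small validation routine. This is a hypothetical sketch of how such checks might look, not the code in `lambda_handler.py`; the function name and the exact messages are assumptions.

```python
import json
from typing import Optional

SUPPORTED_LANGUAGES = {"en-us", "zh-cn", "ms-my", "id-id"}


def validate_request(raw_body: Optional[str]) -> tuple[int, str]:
    """Return (status_code, message) for a /transcribe request body."""
    try:
        body = json.loads(raw_body or "")
    except json.JSONDecodeError:
        return 400, "Malformed JSON"
    if not isinstance(body, dict) or not body.get("url"):
        return 400, "Missing URL"
    language = body.get("language")
    if language is not None and language not in SUPPORTED_LANGUAGES:
        return 400, "Invalid language"
    return 200, "OK"
```

Returning a tuple keeps the check independent of how the handler formats its response envelope.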

🔒 Security Features

Auto IAM Roles: Least privilege access with auto-generated policies
Private S3 Bucket: No public access, encryption enabled
CORS Configuration: Proper cross-origin resource sharing
Input Validation: URL and parameter validation
No Hardcoded Credentials: Uses AWS IAM roles and environment variables

📈 Monitoring and Logging

CloudWatch Logs: Automatic logging for all Lambda functions
Health Endpoint: Service status and availability monitoring
Error Tracking: Detailed error responses and logging
Performance Metrics: Lambda duration and invocation metrics

💸 Cost Optimization

Lifecycle Policies: Auto-delete transcripts after 30 days
Efficient Packaging: Minimal deployment size
Resource Limits: Optimized memory and timeout settings
Pay-per-use: Only pay for actual transcription usage
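
In this project the 30-day expiry is configured through `serverless.yml`. For readers who want to see what such a rule looks like, the sketch below builds an equivalent S3 lifecycle configuration as a plain dictionary; the function name, rule ID, and empty prefix are assumptions chosen for illustration.

```python
def lifecycle_configuration(days: int = 30, prefix: str = "") -> dict:
    """Build an S3 lifecycle configuration that expires objects after `days`."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix applies to the whole bucket
                "Expiration": {"Days": days},
            }
        ]
    }
```

With boto3 installed, the result could be passed as `LifecycleConfiguration` to the S3 client's `put_bucket_lifecycle_configuration` call, although the Serverless deployment already does this for you.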

🗑️ Cleanup

To remove the entire deployment and all resources:

# Using Serverless Framework
serverless remove

# Or using the deployment script
python deploy_lambda.py remove

This will delete:

Lambda functions
API Gateway
S3 bucket and all contents
IAM roles and policies
CloudWatch log groups

🆘 Troubleshooting

Common Issues:

403 Forbidden Errors:
- Ensure S3 URLs are accessible
- Check IAM permissions (auto-generated roles should work)
Timeout Issues:
- Use /process-url for shorter audio files
- Use /transcribe + /status for longer files
Invalid Language Codes:
- Use supported codes: en-us, zh-cn, ms-my, id-id
S3 URL Format:
- Ensure URLs are public S3 URLs or pre-signed URLs
- Format: https://bucket-name.s3.region.amazonaws.com/file-key
🔧 Multi-Language Detection Issues:
- Error: "Unexpected error: 'LanguageCode'"
- Cause: AWS Transcribe returns different response formats for single vs multi-language jobs
- Solution: Updated code handles both LanguageCode (single) and LanguageCodes (multi) fields
- Status: ✅ Fixed in latest deployment
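
To check a URL against the S3 format given above before submitting it, a client-side parser can help. The following is a sketch under assumptions: `parse_s3_url` is a hypothetical helper that accepts virtual-hosted-style URLs with or without a region, and it ignores any pre-signed query string.

```python
import re
from typing import Optional
from urllib.parse import unquote, urlparse

# Matches bucket.s3.amazonaws.com and bucket.s3.<region>.amazonaws.com
_S3_HOST = re.compile(
    r"^(?P<bucket>.+?)\.s3(?:[.-](?P<region>[a-z0-9-]+))?\.amazonaws\.com$"
)


def parse_s3_url(url: str) -> tuple[str, Optional[str], str]:
    """Split an S3 HTTPS URL into (bucket, region or None, object key)."""
    parts = urlparse(url)
    match = _S3_HOST.match(parts.hostname or "")
    key = unquote(parts.path.lstrip("/"))
    if parts.scheme != "https" or match is None or not key:
        raise ValueError(f"Not a recognized S3 URL: {url!r}")
    return match["bucket"], match["region"], key
```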

Debug Steps:

Test health endpoint: GET /health - Check service status
Check CloudWatch logs - View detailed error messages in AWS Console
Validate S3 URL accessibility - Ensure files are accessible
Use test_lambda.html - Interactive debugging interface
Test with known working files - Use English audio first to verify setup
Check deployment status - Ensure latest code is deployed with serverless deploy

🤝 Contributing

Fork the repository
Create a feature branch
Test changes locally using test_lambda.py
Test web interface using test_lambda.html
Submit a pull request

📚 References

Last Updated: October 1, 2025
Service Status: ✅ Fully Operational
Infrastructure: Auto-managed via Serverless Framework
