A serverless AWS Lambda function that provides multi-language audio/video transcription services using AWS Transcribe. This service supports English, Chinese, Malay, and Indonesian languages and accepts S3 URLs for audio/video files.
© 2025 Goodbye World team, for Great AI Hackathon Malaysia 2025 usage.
Service: Successfully deployed and operational
Last Updated: October 1, 2025
Live API Endpoints:
- POST https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/transcribe
- GET https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/status
- GET https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/health
- POST https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url
Infrastructure:
- Lambda Functions: 4 functions deployed (25 MB each)
- S3 Bucket: Auto-created (wenhao1223-transcribe-aws-transcribe-api-dev-20251001)
- IAM Roles: Auto-generated with proper Transcribe and S3 permissions
- API Gateway: Configured with CORS support
- Security: Private bucket with encryption enabled
- Cost Optimization: 30-day lifecycle policy for auto-cleanup
- Automatic Language Detection: Automatically detects single or multiple languages in audio files
- Multi-Language Support: Seamlessly handles mixed-language conversations and language switching
- Smart Language Filtering: Optional candidate languages for improved accuracy and speed
- Language Support: English (en-us), Chinese (zh-cn), Malay (ms-my), Indonesian (id-id)
- S3 Integration: Accept S3 URLs for audio/video files
- RESTful API: Simple HTTP endpoints for transcription operations
- Real-time Status: Check transcription job status and retrieve results
- Synchronous Processing: Process URLs and get immediate results
- Enhanced Response: Returns transcript with detected language information
- CORS Enabled: Frontend-friendly with CORS support
- Auto S3 Cleanup: Automatic deletion of old transcripts after 30 days
# 1. Health check
curl "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/health"
# 2. Quick transcribe with automatic language detection
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url" \
-H "Content-Type: application/json" \
-d '{"url": "https://your-s3-bucket.s3.amazonaws.com/sample-audio.m4a"}'
# 3. Quick transcribe with candidate languages (optional optimization)
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url" \
-H "Content-Type: application/json" \
-d '{"url": "https://your-s3-bucket.s3.amazonaws.com/sample-audio.m4a", "candidate_languages": ["en-us", "zh-cn"]}'
# 4. Start async transcription job (for very long audio files)
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/transcribe" \
-H "Content-Type: application/json" \
-d '{"url": "https://your-bucket.s3.amazonaws.com/audio.mp3", "language": "en-us"}'
# 5. Check job status (replace JOB_NAME with actual job name)
curl "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/status?job_name=JOB_NAME"
GET /health
Check if the service is running and get supported languages.
curl "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/health"
Response Format:
{
"status": {
"statusCode": 200,
"message": "Service is healthy"
},
"data": {
"service": "aws-transcribe-api",
"supported_languages": ["en-us", "zh-cn", "ms-my", "id-id"],
"language_detection": "automatic_multi_language_by_default",
"features": {
"automatic_language_identification": true,
"multi_language_support": true,
"single_and_mixed_language_audio": true,
"speaker_labeling": true,
"alternative_transcriptions": true,
"candidate_language_filtering": true
},
"timestamp": "2023-12-01T12:34:56.789Z",
"request_id": "test-request-id-123"
}
}
POST /transcribe
Start a new asynchronous transcription job for an audio/video file.
Request Body:
{
"url": "https://your-bucket.s3.amazonaws.com/audio-file.mp3",
"language": "en-us"
}
Response Format:
{
"status": {
"statusCode": 200,
"message": "Transcription job started successfully"
},
"data": {
"job_name": "transcribe_job_20231201_123456_abc123",
"job_status": "IN_PROGRESS",
"language_code": "en-us",
"media_url": "https://your-bucket.s3.amazonaws.com/audio-file.mp3",
"creation_time": "2023-12-01T12:34:56.321000+00:00",
"estimated_completion_time": "Processing time varies based on audio length"
}
}
GET /status?job_name=<job_name>
Check the status of a transcription job and retrieve results if completed.
Response Format:
{
"status": {
"statusCode": 200,
"message": "Job status retrieved successfully"
},
"data": {
"job_name": "transcribe_job_20231201_123456_abc123",
"status": "COMPLETED",
"creation_time": "2023-12-01T12:34:56.789000+00:00",
"completion_time": "2023-12-01T12:36:45.123000+00:00", // null if status is 'IN_PROGRESS'
"language_code": "en-US",
"transcript": "Hello, this is the transcribed text from your audio file.", // undefined if status is 'IN_PROGRESS'
"transcript_uri": "https://s3.amazonaws.com/bucket/transcript.json", // undefined if status is 'IN_PROGRESS'
"identitfied_language_code": "en-US" // undefined if status is 'IN_PROGRESS'
}
}
POST /process-url
Process an S3 URL and return the completed transcript immediately with automatic multi-language detection. No need to specify languages - AWS Transcribe automatically handles single or multiple languages in your audio.
Simple Request (Automatic Detection):
{
"url": "https://your-bucket.s3.amazonaws.com/audio-file.mp3"
}
Request with Language Candidates (Optional - Improves Accuracy):
{
"url": "https://your-bucket.s3.amazonaws.com/audio-file.mp3",
"candidate_languages": ["en-us", "zh-cn", "ms-my"]
}
Response Format:
{
"status": {
"statusCode": 200,
"message": "Transcription completed successfully"
},
"data": {
"transcript": "Hello, 你好, mixed language transcript.",
"detected_languages": [
{"LanguageCode": "en-US", "DurationInSeconds": 120.5},
{"LanguageCode": "zh-CN", "DurationInSeconds": 45.2}
],
"language_identification": [
{"LanguageCode": "en-US", "Score": 0.95},
{"LanguageCode": "zh-CN", "Score": 0.85}
]
}
}
Key Features:
- Zero Configuration: Works out of the box with any supported language
- Multi-Language Ready: Handles language switching automatically
- Optional Optimization: Use candidate_languages for better accuracy
- Simple API: Just provide the URL, everything else is automatic
Usage Examples:
# Using the sample audio file
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url" \
-H "Content-Type: application/json" \
-d "{\"url\": \"https://your-s3-bucket-id.s3.us-east-1.amazonaws.com/sample-audio.m4a\"}"
# Using your own S3 URL
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url" \
-H "Content-Type: application/json" \
-d "{\"url\": \"https://your-bucket.s3.amazonaws.com/audio-file.mp3\"}"
Features:
- Synchronous: Returns the completed transcript in the same response
- Automatic Detection: Detects language(s) automatically
- No Job Tracking: No need to check status separately
- Quick Response: Best for shorter audio files
- Timeout: May time out for very long audio files (use the /transcribe endpoint instead)
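The timeout caveat above suggests a simple client-side pattern: try /process-url first, and if the request runs too long, fall back to /transcribe followed by /status polling. The following is a minimal sketch under stated assumptions, not the project's own client. The helper names (`http_json`, `transcribe_with_fallback`), the set of gateway error codes treated as a timeout, the `FAILED` status value, and the polling interval are illustrative; the field names (`job_name`, `status`, `transcript`) are taken from the response formats shown in this README.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request

# Placeholder base URL, as used throughout this README.
BASE_URL = "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev"


def http_json(method, url, payload=None, timeout=30):
    """Send a JSON request and return the decoded JSON body (hypothetical helper)."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def transcribe_with_fallback(media_url, language="en-us", request=http_json,
                             poll_seconds=10, max_polls=60, sleep=time.sleep):
    """Try /process-url; on a timeout-like failure use /transcribe + /status."""
    try:
        body = request("POST", f"{BASE_URL}/process-url", {"url": media_url})
        return body["data"]["transcript"]
    except urllib.error.HTTPError as err:
        # Assumption: only gateway-style errors indicate a request that ran too long.
        if err.code not in (502, 503, 504):
            raise
    except (TimeoutError, urllib.error.URLError):
        pass  # client-side timeout or connection problem: fall through to async flow

    job = request("POST", f"{BASE_URL}/transcribe",
                  {"url": media_url, "language": language})
    job_name = job["data"]["job_name"]
    query = urllib.parse.urlencode({"job_name": job_name})
    status_url = f"{BASE_URL}/status?{query}"

    for _ in range(max_polls):
        data = request("GET", status_url)["data"]
        if data["status"] == "COMPLETED":
            return data["transcript"]
        if data["status"] == "FAILED":  # assumed to mirror the AWS job status
            raise RuntimeError(f"Transcription job {job_name} failed")
        sleep(poll_seconds)
    raise TimeoutError(f"Job {job_name} did not finish after {max_polls} polls")
```

Passing `request` and `sleep` as parameters keeps the sketch testable without network access; in normal use the defaults apply and only `media_url` is needed.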
| Language | Code | AWS Transcribe Code |
|---|---|---|
| English (US) | en-us | en-US |
| Chinese (Simplified) | zh-cn | zh-CN |
| Malay (Malaysia) | ms-my | ms-MY |
| Indonesian | id-id | id-ID |
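The table above amounts to a lookup from the lowercase API code to the mixed-case AWS Transcribe code. A minimal sketch of that lookup follows; the function name `to_aws_language_code` and the choice to accept any letter case on input are assumptions for illustration, and the actual handler may normalize codes differently.

```python
# Mapping copied from the table above: API code -> AWS Transcribe code.
API_TO_AWS = {
    "en-us": "en-US",
    "zh-cn": "zh-CN",
    "ms-my": "ms-MY",
    "id-id": "id-ID",
}


def to_aws_language_code(code: str) -> str:
    """Translate an API language code into the AWS Transcribe form."""
    key = code.strip().lower()  # assumption: input is matched case-insensitively
    if key not in API_TO_AWS:
        supported = ", ".join(sorted(API_TO_AWS))
        raise ValueError(f"Unsupported language {code!r}; use one of: {supported}")
    return API_TO_AWS[key]
```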
The API now automatically detects and handles languages using AWS Transcribe's multi-language identification:
Default Behavior (Zero Configuration):
{
"url": "https://bucket.s3.amazonaws.com/audio.mp3"
}
- Single Languages: Automatically detects dominant language (English, Chinese, Malay, Indonesian)
- Multiple Languages: Handles language switching within the same audio
- Mixed Conversations: Perfect for international meetings or multilingual content
- No Setup Required: Just provide the URL, everything else is automatic
Optional Optimization with Candidate Languages:
{
"url": "https://bucket.s3.amazonaws.com/audio.mp3",
"candidate_languages": ["en-us", "zh-cn"]
}
- Improved Accuracy: Narrows detection to specific languages you expect
- Faster Processing: Reduces detection time by limiting language options
- Smart Filtering: Only considers languages you specify
- Backward Compatible: Existing code works without changes
- Universal: Handles any combination of supported languages
- Detailed Results: Returns confidence scores and language identification info
- Optimized: Uses AWS Transcribe's latest multi-language capabilities
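The two request shapes above (URL only, or URL plus `candidate_languages`) can be built by one small client-side helper. This is a sketch under assumptions: the helper name `build_process_url_payload`, the client-side validation, and the de-duplication of codes are illustrative and not something the service is documented to require.

```python
import json

# Supported API codes, as listed in this README.
SUPPORTED_LANGUAGES = ("en-us", "zh-cn", "ms-my", "id-id")


def build_process_url_payload(url, candidate_languages=None):
    """Build the /process-url request body, adding candidates only when given."""
    payload = {"url": url}
    if candidate_languages:
        normalized = []
        for code in candidate_languages:
            key = code.strip().lower()
            if key not in SUPPORTED_LANGUAGES:
                raise ValueError(f"Unsupported candidate language: {code!r}")
            if key not in normalized:  # keep first occurrence, drop duplicates
                normalized.append(key)
        payload["candidate_languages"] = normalized
    return payload


if __name__ == "__main__":
    # Prints the JSON body for a request narrowed to English and Chinese.
    body = build_process_url_payload(
        "https://bucket.s3.amazonaws.com/audio.mp3", ["en-us", "zh-cn"]
    )
    print(json.dumps(body, indent=2))
```

Omitting `candidate_languages` yields the zero-configuration request, so existing callers are unaffected.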
- Node.js 18+
- Python 3.10+
- AWS CLI configured with proper credentials
- Serverless Framework 4.x
- Install dependencies:
  npm install
  pip install -r requirements.txt
- Deploy to AWS:
  serverless deploy
The deployment will automatically:
- Create S3 bucket for transcription outputs
- Set up IAM roles with proper permissions
- Deploy Lambda functions
- Configure API Gateway with CORS
- Set up lifecycle policies for cost optimization
# Install, test, and deploy using the deployment script
python deploy_lambda.py install
python deploy_lambda.py test
python deploy_lambda.py deploy
Create a .env file in the project root to configure AWS credentials and API endpoints:
# AWS Credentials
AWS_ACCESS_KEY_ID=your_access_key_here
AWS_SECRET_ACCESS_KEY=your_secret_key_here
AWS_REGION1=us-east-1 # Default AWS region
# LAMBDA API Endpoint (optional - for testing deployed APIs)
TRANSCRIBE_API_BASE_URL=https://your-api-id.execute-api.us-east-1.amazonaws.com/dev
TRANSCRIBE_API_URL=https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/transcribe
TANSCRIBE_PROCESS_URL_API_URL=https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url
TRANSCRIBE_STATUS_API_URL=https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/status
TRANSCRIBE_HEALTH_API_URL=https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/health
SAMPLE_S3_AUDIO_URL=https://wenhao1223-sample-test-dev.s3.us-east-1.amazonaws.com/sample-audio-en.m4a
# Install requests library (if using HTTP tests)
pip install requests
# Generate HTML test interface (no API URL required)
python test_lambda.py --create-html
# Generate HTML test interface with pre-filled API URL
python test_lambda.py --create-html --api-url https://your-api-id.execute-api.us-east-1.amazonaws.com/dev
# Run automated tests
python test_lambda.py --api-url https://your-api-id.execute-api.us-east-1.amazonaws.com/dev
# Test with specific file
python test_lambda.py --api-url https://your-api-id.execute-api.us-east-1.amazonaws.com/dev --file document.pdf
- --create-html: Generate an interactive HTML test interface (API URL optional)
- --api-url: API Gateway URL (required for testing, optional for HTML generation)
- --file: Specific file to upload (optional, creates test file if not provided)
Open test_lambda.html in your browser for an interactive testing interface with:
- Health check testing
- File upload and transcription
- Status monitoring
- Real-time response display
Quick test commands:
# 1. Health Check
curl "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/health"
# 2. Quick Transcribe (immediate result)
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/process-url" \
-H "Content-Type: application/json" \
-d '{"url": "https://your-s3-bucket.s3.amazonaws.com/sample-audio.m4a"}'
# 3. Start async transcription job
curl -X POST "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/transcribe" \
-H "Content-Type: application/json" \
-d '{"url": "https://your-bucket.s3.amazonaws.com/audio.mp3", "language": "en-us"}'
# 4. Check job status (replace JOB_NAME with actual job name)
curl "https://your-api-id.execute-api.us-east-1.amazonaws.com/dev/status?job_name=JOB_NAME"
Note: Replace your-s3-bucket and your-bucket with actual S3 bucket names containing your audio files.
Understanding how the automatic language detection works in different scenarios:
When AWS Transcribe detects a single dominant language:
// AWS Internal Response Format
{
"LanguageCode": "en-US",
"IdentifiedLanguageScore": {
"LanguageCode": "en-US",
"Score": 0.95
}
}
When AWS Transcribe detects multiple languages in the same audio:
// AWS Internal Response Format
{
"LanguageCodes": [
{"LanguageCode": "en-US", "DurationInSeconds": 120.5},
{"LanguageCode": "zh-CN", "DurationInSeconds": 45.2}
],
"LanguageIdSettings": {...}
}
API response for single-language audio:
{
"status": {
"statusCode": "200",
"message": "Transcription completed successfully"
},
"data": {
"message": "Hello, this is the transcript text.",
"detected_language": "en-US",
"detected_languages": [
{"LanguageCode": "en-US", "DurationInSeconds": 120.5}
],
"language_identification": [
{"LanguageCode": "en-US", "Score": 0.95}
]
}
}
API response for multi-language audio:
{
"status": {
"statusCode": "200",
"message": "Transcription completed successfully"
},
"data": {
"message": "Hello, 你好, mixed language transcript.",
"detected_language": "en-US",
"detected_languages": [
{"LanguageCode": "en-US", "DurationInSeconds": 120.5},
{"LanguageCode": "zh-CN", "DurationInSeconds": 45.2}
],
"language_identification": [
{"LanguageCode": "en-US", "Score": 0.95},
{"LanguageCode": "zh-CN", "Score": 0.85}
]
}
}
Key Points:
- Automatic: No configuration required, works out of the box
- Universal: Handles single and multi-language scenarios seamlessly
- Optimizable: Use candidate_languages for improved accuracy
- Detailed: Returns confidence scores and timing information
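The two formats above differ in which field carries the language information (`LanguageCode` versus `LanguageCodes`), so a handler has to normalize them before building the API response. The sketch below shows one way to do that using the field shapes exactly as they appear in this README; the function name `extract_languages` is hypothetical, the shapes may not match the raw AWS response in every detail, and this is not the code in lambda_handler.py.

```python
def extract_languages(job: dict) -> dict:
    """Normalize single- and multi-language job fields into one response shape."""
    if "LanguageCodes" in job:
        # Multi-language format: a list of {LanguageCode, DurationInSeconds}.
        detected = [dict(item) for item in job["LanguageCodes"]]
    elif "LanguageCode" in job:
        # Single-language format: wrap the one code in a list.
        detected = [{"LanguageCode": job["LanguageCode"]}]
    else:
        detected = []

    # Shape assumed from the single-language example: an object with a Score.
    score = job.get("IdentifiedLanguageScore")
    identification = [score] if isinstance(score, dict) else []

    return {
        "detected_language": detected[0]["LanguageCode"] if detected else None,
        "detected_languages": detected,
        "language_identification": identification,
    }
```

Using `in` checks and `.get()` rather than indexing `job["LanguageCode"]` directly avoids the KeyError that a multi-language job would otherwise raise.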
transcribe-api/
├── lambda_handler.py   # Main Lambda function handlers
├── serverless.yml      # Serverless Framework configuration
├── requirements.txt    # Python dependencies
├── package.json        # Node.js dependencies
├── deploy_lambda.py    # Deployment automation script
├── test_lambda.py      # Local testing script
├── test_lambda.html    # Web testing interface (use `python test_lambda.py --create-html` to generate)
├── .env                # Environment variables (local)
├── README.md           # This documentation
└── media/              # Sample audio files for testing
    ├── sample-audio-en.m4a
    └── sample-audio-mix.m4a
The serverless.yml includes:
- Auto IAM Roles: Automatic creation with Transcribe and S3 permissions
- S3 Bucket: Auto-creation with security and lifecycle policies
- API Gateway: RESTful endpoints with CORS support
- Environment Variables: Automatic configuration
- Security: Private bucket with encryption enabled
- Cost Optimization: 30-day lifecycle rules
The auto-generated IAM role includes:
# Transcribe permissions
- transcribe:StartTranscriptionJob
- transcribe:GetTranscriptionJob
- transcribe:ListTranscriptionJobs
- transcribe:DeleteTranscriptionJob
# S3 permissions
- s3:GetObject, s3:PutObject, s3:DeleteObject
- s3:ListBucket, s3:GetBucketLocation
# CloudWatch Logs permissions
- logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents
Common error scenarios:
- 400 Bad Request: Missing URL, invalid language, malformed JSON
- 404 Not Found: Transcription job not found
- 500 Internal Server Error: AWS service errors, unexpected exceptions
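To make the 400 cases above concrete, here is a hedged sketch of how a Lambda proxy handler might validate a /transcribe request and wrap errors in the `status`/`data` envelope used by this API. The function names `make_response` and `validate_transcribe_request`, the error messages, and the CORS header value are assumptions for illustration; the actual lambda_handler.py may differ.

```python
import json

SUPPORTED_LANGUAGES = {"en-us", "zh-cn", "ms-my", "id-id"}


def make_response(status_code, message, data=None):
    """Build an API Gateway proxy response using this README's envelope."""
    return {
        "statusCode": status_code,
        "headers": {
            "Content-Type": "application/json",
            "Access-Control-Allow-Origin": "*",  # assumed CORS setting
        },
        "body": json.dumps({
            "status": {"statusCode": status_code, "message": message},
            "data": data,
        }),
    }


def validate_transcribe_request(event):
    """Return (payload, None) on success or (None, error_response) on failure."""
    try:
        payload = json.loads(event.get("body") or "")
    except json.JSONDecodeError:
        return None, make_response(400, "Malformed JSON")
    if not isinstance(payload, dict) or not payload.get("url"):
        return None, make_response(400, "Missing URL")
    language = payload.get("language")
    if language is not None and str(language).lower() not in SUPPORTED_LANGUAGES:
        return None, make_response(400, f"Invalid language: {language}")
    return payload, None
```

A handler would return the error response as-is when it is not `None`, and reserve 404 and 500 for job lookups and unexpected AWS failures respectively.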
- Auto IAM Roles: Least privilege access with auto-generated policies
- Private S3 Bucket: No public access, encryption enabled
- CORS Configuration: Proper cross-origin resource sharing
- Input Validation: URL and parameter validation
- No Hardcoded Credentials: Uses AWS IAM roles and environment variables
- CloudWatch Logs: Automatic logging for all Lambda functions
- Health Endpoint: Service status and availability monitoring
- Error Tracking: Detailed error responses and logging
- Performance Metrics: Lambda duration and invocation metrics
- Lifecycle Policies: Auto-delete transcripts after 30 days
- Efficient Packaging: Minimal deployment size
- Resource Limits: Optimized memory and timeout settings
- Pay-per-use: Only pay for actual transcription usage
To remove the entire deployment and all resources:
# Using Serverless Framework
serverless remove
# Or using the deployment script
python deploy_lambda.py remove
This will delete:
- Lambda functions
- API Gateway
- S3 bucket and all contents
- IAM roles and policies
- CloudWatch log groups
- 403 Forbidden Errors:
  - Ensure S3 URLs are accessible
  - Check IAM permissions (auto-generated roles should work)
- Timeout Issues:
  - Use /process-url for shorter audio files
  - Use /transcribe + /status for longer files
- Invalid Language Codes:
  - Use supported codes: en-us, zh-cn, ms-my, id-id
- S3 URL Format:
  - Ensure URLs are public S3 URLs or pre-signed URLs
  - Format: https://bucket-name.s3.region.amazonaws.com/file-key
- Multi-Language Detection Issues:
  - Error: "Unexpected error: 'LanguageCode'"
  - Cause: AWS Transcribe returns different response formats for single vs multi-language jobs
  - Solution: Updated code handles both LanguageCode (single) and LanguageCodes (multi) fields
  - Status: Fixed in latest deployment
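For the S3 URL format item above, a quick client-side check can catch malformed URLs before they reach the API. The sketch below parses virtual-hosted-style URLs of the form https://bucket-name.s3.region.amazonaws.com/file-key, with the region part optional as in the other examples in this README. The function name `parse_s3_https_url` and the bucket-name pattern are assumptions for illustration; the check says nothing about whether the object is actually readable.

```python
import re
from urllib.parse import urlparse

# Matches bucket.s3.amazonaws.com and bucket.s3.<region>.amazonaws.com hosts.
_S3_HOST = re.compile(
    r"^(?P<bucket>[a-z0-9][a-z0-9.-]{1,61}[a-z0-9])"
    r"\.s3(?:\.(?P<region>[a-z0-9-]+))?\.amazonaws\.com$"
)


def parse_s3_https_url(url):
    """Return (bucket, region, key) or raise ValueError for an unexpected format."""
    parts = urlparse(url)
    match = _S3_HOST.match(parts.hostname or "")
    key = parts.path.lstrip("/")  # query string (e.g. pre-signed params) is excluded
    if parts.scheme != "https" or not match or not key:
        raise ValueError(
            "Expected https://bucket-name.s3.region.amazonaws.com/file-key, "
            f"got {url!r}"
        )
    return match.group("bucket"), match.group("region"), key
```

The region is returned as `None` when the URL uses the region-less host form.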
- Test health endpoint: GET /health - Check service status
- Check CloudWatch logs - View detailed error messages in AWS Console
- Validate S3 URL accessibility - Ensure files are accessible
- Use test_lambda.html - Interactive debugging interface
- Test with known working files - Use English audio first to verify setup
- Check deployment status - Ensure latest code is deployed with serverless deploy
- Fork the repository
- Create a feature branch
- Test changes locally using test_lambda.py
- Test web interface using test_lambda.html
- Submit a pull request
- AWS Transcribe Documentation
- Serverless Framework Documentation
- AWS Lambda Python Runtime
- AWS S3 Documentation
Last Updated: October 1, 2025
Service Status: Fully Operational
Infrastructure: Auto-managed via Serverless Framework