The open-source macOS dictation replacement you've been waiting for! 🎉
Transform your voice into text instantly with the power of the OpenAI, Groq, and Deepgram APIs. Say goodbye to macOS dictation limitations and hello to lightning-fast, accurate transcription with your own custom hotkeys! ⚡️
- Demo
- Why Choose VTS?
- Screenshots
- Getting Started
- Usage Guide
- Privacy & Security
- Troubleshooting
- Development
- Roadmap
- Feedback
- License
- Acknowledgements
🔊 Turn on your sound! This demo includes audio to showcase the real-time transcription experience.
Monosnap.screencast.2025-07-23.03-22-29.mp4
- 🤖 AI-Powered Accuracy: Leverage OpenAI, Groq, and Deepgram models for superior transcription
- 🔑 Your Keys, Your Control: Bring your own API keys - no subscriptions, no limits
- 🚀 Drop-in Replacement: Works exactly like macOS dictation, but better!
- ⌨️ Your Shortcut, Your Rules: Fully customizable global hotkeys (default: ⌘⇧;)
- 🎯 Smart Device Management: Intelligent microphone priority with seamless fallback
- 💬 Context-Aware: Custom system prompt boosts accuracy for your specific needs
- 🌟 100% Open Source: Full transparency, community-driven, modify as you wish
Monosnap.screencast.2025-07-23.02-48-42.mp4
Onboarding:
Ready to use VTS? Head over to our Releases Page for:
- 📥 One-click downloads for macOS (Apple Silicon & Intel)
- 📋 Complete installation instructions
- 🖥️ System requirements and compatibility info
- 📝 Release notes with latest features and fixes
- macOS 14.0+ (Apple Silicon & Intel supported)
- API key from OpenAI, Groq, or Deepgram (see setup below)
After installing VTS, you'll need an API key from one of these providers:
- OpenAI: Get your API key here
- Groq: Get your API key here
- Deepgram: Get your API key here
Only one API key is required - choose the provider you prefer!
- Choose Provider: Select OpenAI, Groq, or Deepgram from the dropdown
- Select Model: Pick whisper-1, whisper-large-v3, or other available models
- Enter API Key: Paste your API key in the secure field
- Start Recording: Press the global hotkey (default: ⌘⇧;) and speak
- View Results: See real-time transcription inserted into the application you're using
- (Optional) Copy: Use the on-screen buttons to copy the transcript
- View Available Devices: See all connected microphones with system default indicators
- Set Priority Order: Add devices to priority list with + buttons
- Automatic Fallback: App automatically uses highest-priority available device
- Real-time Switching: Seamlessly switches when preferred devices connect/disconnect
- Remove from Priority: Use − buttons to remove devices from priority list
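The priority-and-fallback behavior above boils down to a simple selection rule. Here's a minimal sketch (the `AudioDevice` type and `selectDevice` function are illustrative names, not VTS's actual API):

```swift
// Hypothetical sketch of priority-based microphone selection with fallback.
struct AudioDevice: Equatable {
    let id: String
    let name: String
}

/// Returns the highest-priority device that is currently connected,
/// falling back to the system default when none of the preferred
/// devices are available.
func selectDevice(priority: [AudioDevice],
                  connected: [AudioDevice],
                  systemDefault: AudioDevice) -> AudioDevice {
    priority.first(where: { connected.contains($0) }) ?? systemDefault
}
```

Because the rule is re-evaluated whenever the connected-device list changes, plugging in or unplugging a preferred microphone naturally triggers the "real-time switching" described above.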
- Add context-specific prompts to improve transcription accuracy
- Examples: "Medical terminology", "Technical jargon", "Names: John, Sarah, Mike"
- Prompts help the AI better understand domain-specific language
- No audio storage: Audio is processed in real-time, never stored locally
- API keys are safe: Keys are stored securely in the macOS Keychain
- TLS encryption: All API communication uses HTTPS
- Microphone permission: Explicit user consent required for audio access
- Microphone Permission Denied: Check System Settings > Privacy & Security > Microphone
- No Microphones Found: Click "Refresh" in the Microphone Priority section
- Wrong Microphone Active: Set your preferred priority order or check device connections
- App Not Responding to Hotkey: Ensure accessibility permissions are granted when prompted
This section is for developers who want to build VTS from source or contribute to the project.
- macOS 14.0+ (Apple Silicon & Intel supported)
- Xcode 15+ for building
- API key from OpenAI, Groq, or Deepgram for testing
- Clone the repository:

```shell
git clone https://github.com/j05u3/VTS.git
cd VTS
```

- Open in Xcode:

```shell
open VTSApp.xcodeproj
```

- Build and run:
  - In Xcode, select the VTSApp scheme
  - Build and run with ⌘R
  - Grant microphone permission when prompted

```shell
# Build via the command line
xcodebuild -project VTSApp.xcodeproj -scheme VTSApp build
```
VTS follows a clean, modular architecture:
- CaptureEngine: Handles audio capture using AVAudioEngine with Core Audio device management
- DeviceManager: Manages microphone priority lists and automatic device selection
- TranscriptionService: Orchestrates streaming transcription with provider abstraction
- STTProvider Protocol: Clean interface allowing easy addition of new providers
- Modern SwiftUI: Reactive UI with proper state management and real-time updates
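To illustrate what the provider abstraction might look like, here is a minimal sketch; the names and signature are hypothetical, not the actual VTS `STTProvider` interface:

```swift
import Foundation

// Hypothetical sketch of a speech-to-text provider abstraction.
// A real implementation would stream audio to OpenAI, Groq, or Deepgram.
protocol STTProvider {
    var name: String { get }
    /// Transcribes a chunk of audio, optionally guided by a system prompt.
    func transcribe(audio: Data, prompt: String?) async throws -> String
}

// A trivial stub provider, handy for wiring up UI without network calls.
struct EchoProvider: STTProvider {
    let name = "Echo"
    func transcribe(audio: Data, prompt: String?) async throws -> String {
        "(\(audio.count) bytes transcribed by \(name))"
    }
}
```

With a protocol like this, adding a new backend means writing one conforming type, while the rest of the app talks only to the abstraction.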
Currently, VTS includes manual testing capabilities through the built-in Text Injection Test Suite accessible from the app's interface. This allows you to test text insertion functionality across different applications.
Automated unit tests are planned for future releases.
- Permission Not Updating: During development/testing, when the app changes (rebuild, code changes), macOS treats it as a "new" app
- Solution: Remove the old app entry from System Settings > Privacy & Security > Accessibility, then re-grant permission
- Why This Happens: Each build gets a different signature, so macOS sees it as a different application
- Quick Fix: Check the app list in Accessibility settings and remove any old/duplicate VTS entries
- Reset App State: To test the complete onboarding flow, change the `PRODUCT_BUNDLE_IDENTIFIER` in the Xcode project settings
- Why This Works: Changing the bundle identifier creates a "new" app from macOS's perspective, resetting all permissions and app state
- Most Reliable Method: This is more reliable than clearing UserDefaults and ensures a clean onboarding test, including all system permissions
See CONTRIBUTING.md for details on how to contribute to VTS development.
- Auto-open at login: Toggle via a checkbox in the preferences window (✅ Implemented)
- Modern Release Automation: Automated releases with release-please and GitHub Actions (✅ Implemented)
- Sparkle Auto-Updates: Automatic app updates with GitHub Pages appcast hosting (✅ Implemented)
The following are candidates for a future (possibly pro) version. Their priority is still to be decided, so your feedback and contributions are welcome!
- More models/providers: Support for more STT providers like Google, Azure, etc.
- Safe auto-cut: Automatically stop recording after a maximum duration if the user forgets to stop (or starts recording accidentally)
- Comprehensive Test Suite: Automated unit tests covering:
- Core transcription functionality
- Provider validation and error handling
- Device management and priority logic
- Integration flows and edge cases
- LLM step: Use an LLM to post-process the transcription and improve accuracy, applying transformations based on the app you're injecting text into or the context in general (perhaps making it easy to input emojis?)
- Advanced Audio Processing: Noise reduction and gain control, though some STT providers already handle this, so it may not be needed
- Accessibility Features: VoiceOver support and high contrast modes
Have feedback, suggestions, or issues? We'd love to hear from you!
📧 Send us your feedback - Quick and direct way to reach us
You can also:
- 🐛 Report bugs or request features on GitHub
- 💡 Share your ideas for improvements
- ⭐ Star the project if you find it useful!
MIT License - see LICENSE file for details.
VTS wouldn't be possible without the incredible work of the open-source community. Special thanks to:
- ios-icon-generator by @smallmuou - for the awesome icon generation script that made creating our app icons effortless
- create-dmg by @sindresorhus - for the excellent DMG creation script that streamlines our distribution process
Note: This project builds upon the work of many developers and projects. If I've missed crediting someone or something, I sincerely apologize! Please feel free to open an issue or PR to help me give proper recognition where it's due.
Made with ❤️ for the macOS community