An advanced web-based audio player designed to improve speech intelligibility at high playback speeds through a suite of customizable audio enhancements.
- Adjustable Playback Speed: Play audio files at speeds ranging from 0.5x to 10x.
- Customizable Audio Enhancements: Enable, disable, and fine-tune 12 different filters to optimize speech clarity.
- Real-Time Processing: All adjustments take effect immediately during playback.
- Settings Management: Save and load custom settings for quick access.
- Clone the Repository
  git clone https://github.com/trottier/1kwpm.git
- Navigate to the Project Directory
  cd 1kwpm
- Open the Application
  Open index.html in your preferred web browser.
- Load an Audio File
  - Click the "Choose File" button to select an audio file (preferably speech).
- Adjust Playback Speed
  - Use the "Speed" slider to set the desired playback rate (0.5x to 10x).
- Enable and Configure Audio Enhancements
  - Each enhancement can be enabled/disabled using the checkbox next to its name.
  - Fine-tune the effect using the provided sliders.
  - The current value of each setting is displayed next to the slider.
- Save and Load Settings
  - Click "Save Settings" to store your current configuration in the browser's local storage.
  - Click "Load Settings" to retrieve and apply previously saved settings.
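The save and load steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the storage key `speechPlayerSettings`, the function names, and the shape of the settings object are all assumptions. The storage object is passed in explicitly so the same functions work with `window.localStorage` in the browser or with any object exposing `getItem` and `setItem`.

```javascript
// Hypothetical storage key; the real project may use a different one.
const SETTINGS_KEY = "speechPlayerSettings";

// Serialize the current settings object into the given storage.
// In a browser, pass window.localStorage as `storage`.
function saveSettings(storage, settings) {
  storage.setItem(SETTINGS_KEY, JSON.stringify(settings));
}

// Read settings back, falling back to `defaults` when nothing has been
// saved yet or when the stored value is not valid JSON.
function loadSettings(storage, defaults) {
  const raw = storage.getItem(SETTINGS_KEY);
  if (raw === null) {
    return { ...defaults };
  }
  try {
    // Saved values override defaults; keys missing from the saved
    // object keep their default values.
    return { ...defaults, ...JSON.parse(raw) };
  } catch {
    return { ...defaults };
  }
}
```

In the browser this might be called as `saveSettings(window.localStorage, { speed: 3.5 })` from the "Save Settings" click handler. Merging over defaults means a settings object saved by an older version still loads after new controls are added.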
The audio player includes the following filters, ordered by their anticipated impact on intelligibility:
-
- Description: Reduces background noise to enhance speech clarity by suppressing unwanted ambient sounds.
- Controls:
- Reduction Level: Adjusts the intensity of noise reduction (0% to 200%).
-
- Description: Automatically adjusts the volume to maintain consistent speech levels.
- Controls:
- Target Level: Sets the desired output level (0.5 to 2.0).
-
- Description: Balances audio levels to make softer sounds more audible without distorting louder sounds.
- Controls:
- Threshold: The level above which compression begins (-100 dB to 0 dB).
- Ratio: The amount of compression applied (1:1 to 20:1).
-
- Description: Advanced time-stretching that preserves the pitch and quality of speech even at high playback speeds.
- Controls:
- Time Stretch Factor: Adjusts the amount of time-stretching (0.5 to 2.0).
-
- Description: Maintains the natural formant frequencies of speech when adjusting playback speed to prevent unnatural-sounding speech.
- Controls:
- Preservation Level: Sets the degree of formant preservation (0.5 to 2.0).
-
- Description: Reduces rapid fluctuations in the audio signal to create a smoother listening experience at high speeds.
- Controls:
- Smoothing Level: Adjusts the amount of smoothing applied (0% to 200%).
-
- Description: Automatically adjusts equalizer settings in real time, based on the audio content, to enhance critical speech frequencies.
- Controls:
- EQ Intensity: Sets the intensity of dynamic equalization (0.5 to 2.0).
-
- Description: Enhances critical speech frequencies, making speech sounds more distinct.
- Controls:
- Enhancement Level: Adjusts the amount of spectral shaping (0.5 to 4.0).
-
- Description: Enhances clarity by emphasizing consonant sounds using a peaking filter.
- Controls:
- Frequency: Sets the target frequency for emphasis (1000 Hz to 5000 Hz).
- Gain: Adjusts the gain applied at the target frequency (0 dB to 20 dB).
-
- Description: Compresses the frequency spectrum to bring critical speech frequencies closer together.
- Controls:
- Compression Level: Adjusts the amount of frequency compression (0.5 to 4.0).
-
- Description: Enhances transient elements of speech, such as plosives and consonants, to make speech clearer.
- Controls:
- Enhancement Level: Sets the intensity of transient enhancement (0.5 to 2.0).
-
- Description: Applies processing based on human auditory perception to enhance important speech components.
- Controls:
- Modeling Intensity: Adjusts the level of psychoacoustic modeling (0.5 to 2.0).
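As a rough illustration of how two of the filters above might map onto standard Web Audio API nodes, the sketch below wires a `DynamicsCompressorNode` into a peaking `BiquadFilterNode`. The peaking filter is named in the consonant emphasis description; using `DynamicsCompressorNode` for the compression filter is an assumption, as are the function names and the settings object. The clamping ranges are the ones listed in the controls above.

```javascript
// Keep a slider value inside the documented range for its control.
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Build a compressor -> peaking filter chain.
// `ctx` is an AudioContext in the browser; `settings` is a hypothetical
// object with threshold (dB), ratio, frequency (Hz), and gain (dB).
function buildSpeechChain(ctx, settings) {
  const compressor = ctx.createDynamicsCompressor();
  compressor.threshold.value = clamp(settings.threshold, -100, 0);
  compressor.ratio.value = clamp(settings.ratio, 1, 20);

  const emphasis = ctx.createBiquadFilter();
  emphasis.type = "peaking";
  emphasis.frequency.value = clamp(settings.frequency, 1000, 5000);
  emphasis.gain.value = clamp(settings.gain, 0, 20);

  // Audio flows compressor -> emphasis; the caller connects a source
  // to `input` and `output` to ctx.destination.
  compressor.connect(emphasis);
  return { input: compressor, output: emphasis };
}
```

In a page, a source created with `ctx.createMediaElementSource(audioElement)` could be connected to `input`, and `output` connected to `ctx.destination`. Because the node parameters can be changed while audio is playing, slider handlers can update them directly.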
This project is licensed under the MIT License.
- Web Audio API: For providing the tools to process audio in the browser.
- OpenAI's GPT-4: For assistance in generating code and documentation.
Note: Some advanced audio processing features like the Phase Vocoder and Formant Preservation are simplified due to the limitations of the Web Audio API. Future enhancements may involve integrating external libraries or custom audio processing to achieve more accurate implementations.
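To give a sense of what custom per-sample processing could look like, here is a small sketch of block-based automatic gain control written in plain JavaScript. It is not taken from the project: `createAgc`, `targetRms`, `maxGain`, and `smoothing` are hypothetical names, and `targetRms` is an RMS amplitude that does not necessarily correspond to the "Target Level" slider scale. In a browser, logic like this would typically run inside an `AudioWorkletProcessor`.

```javascript
// Root-mean-square level of one block of samples.
function rms(block) {
  let sum = 0;
  for (const s of block) sum += s * s;
  return Math.sqrt(sum / block.length);
}

// Hypothetical block-based AGC: scales each block toward `targetRms`,
// smoothing the gain so it does not jump between blocks.
function createAgc({ targetRms = 0.1, maxGain = 8, smoothing = 0.9 } = {}) {
  let gain = 1;
  return function process(block) {
    const level = rms(block);
    // Leave the gain unchanged on near-silent blocks so that
    // background noise is not boosted.
    if (level > 1e-4) {
      const desired = Math.min(maxGain, targetRms / level);
      gain = smoothing * gain + (1 - smoothing) * desired;
    }
    return Float32Array.from(block, (s) => s * gain);
  };
}
```

A higher `smoothing` value makes the gain react more slowly, which avoids audible pumping at the cost of taking longer to settle after a level change.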