Add Audio Transcription Functionality #94

gromdimon · 2025-01-19T15:41:20Z

Is your feature request related to a problem? Please describe.

Nevron currently lacks the ability to process audio files or voice inputs. This limits its usability for scenarios where users may want to provide voice notes, meeting recordings, or podcasts for analysis, memory updates, or decision-making workflows.

Describe the solution you'd like

Add functionality to transcribe audio files into text using a reliable speech-to-text solution. This feature will enable Nevron to process voice-based inputs and use the transcriptions in workflows.

Proposed Implementation Steps:

Audio File Support:
- Support common audio file formats such as .mp3, .wav, and .flac.
Integration with Speech-to-Text Services:
- Use an external library or API for transcription:
  - Whisper API.
  - AssemblyAI.
- Allow users to choose the transcription backend through settings.py.
Integration with Workflows:
- Add a new Execution tool that makes transcription available to workflows.
Error Handling:
- Handle errors such as:
  - Unsupported file formats.
  - Poor audio quality leading to incomplete transcriptions.
  - API errors during transcription.
- Log detailed error messages for debugging.
Configuration Options:
- Add configuration options to settings.py, including:
  - Maximum file size.
  - Transcription backend and API keys.
  - Language settings for transcription.
Unit Tests:
- Write unit tests to validate audio transcription functionality using sample audio files:
  - Clear audio with text output verification.
  - Poor quality audio with expected errors.
  - Unsupported file formats.
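
The steps above could be wired together roughly as follows. This is a minimal sketch, not Nevron's actual code: `TranscriptionSettings` stands in for whatever ends up in settings.py, and the exception classes, the `transcribe` function, the 25 MB default, and the backend callable signature are all hypothetical names and values chosen for illustration. Real backends (Whisper API, AssemblyAI) would be registered as callables; none is implemented here.

```python
import logging
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict

logger = logging.getLogger(__name__)

# Formats named in the issue.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac"}


class TranscriptionError(Exception):
    """Base class for all transcription failures (hypothetical)."""


class UnsupportedFormatError(TranscriptionError):
    """Raised when the file extension is not supported."""


class FileTooLargeError(TranscriptionError):
    """Raised when the file exceeds the configured size limit."""


@dataclass
class TranscriptionSettings:
    # Assumed fields; the issue only says these options belong in settings.py.
    backend: str = "whisper"
    max_file_size_bytes: int = 25 * 1024 * 1024  # arbitrary illustrative limit
    language: str = "en"


# A backend takes (audio path, language) and returns the transcript text.
Backend = Callable[[Path, str], str]


def validate_audio_file(path: Path, settings: TranscriptionSettings) -> None:
    """Reject unsupported formats and oversized files before any API call."""
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise UnsupportedFormatError(f"Unsupported audio format: {path.suffix!r}")
    size = path.stat().st_size
    if size > settings.max_file_size_bytes:
        raise FileTooLargeError(
            f"{path.name} is {size} bytes; limit is {settings.max_file_size_bytes}"
        )


def transcribe(
    path: Path, settings: TranscriptionSettings, backends: Dict[str, Backend]
) -> str:
    """Validate the file, pick the configured backend, and log any failure."""
    validate_audio_file(path, settings)
    backend = backends.get(settings.backend)
    if backend is None:
        raise TranscriptionError(f"Unknown transcription backend: {settings.backend!r}")
    try:
        return backend(path, settings.language)
    except TranscriptionError:
        raise
    except Exception as exc:
        # Wrap backend/API errors so callers handle one exception family.
        logger.exception("Transcription of %s failed", path)
        raise TranscriptionError(f"Backend {settings.backend!r} failed: {exc}") from exc
```

Passing the backends in as a dictionary keeps the unit tests independent of any external API: a test can register a fake callable that returns fixed text or raises an error.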

Additional Context

Before transcription, we first need to check that the audio file is safe (contains no malware).
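
One cheap first step toward this could be confirming that a file's leading bytes match the format its extension claims. The sketch below is an assumption about how that might look, and it is not a malware scan: it only catches obviously mislabeled files (for example, an executable renamed to .wav), so a dedicated scanner would still be needed for real protection. The function names are hypothetical.

```python
from pathlib import Path
from typing import Optional


def sniff_audio_format(header: bytes) -> Optional[str]:
    """Guess the audio format from the first bytes of a file, or return None."""
    # WAV: RIFF container with a WAVE form type at offset 8.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return ".wav"
    # FLAC streams start with the "fLaC" marker.
    if header[:4] == b"fLaC":
        return ".flac"
    # MP3 with an ID3v2 tag at the start.
    if header[:3] == b"ID3":
        return ".mp3"
    # Bare MPEG audio frame: 11 sync bits set (0xFF, then top 3 bits of next byte).
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return ".mp3"
    return None


def content_matches_extension(path: Path) -> bool:
    """Return True if the file's signature agrees with its extension."""
    with path.open("rb") as fh:
        header = fh.read(12)
    return sniff_audio_format(header) == path.suffix.lower()
```

A file that fails this check could be rejected with the same error used for unsupported formats.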

gromdimon added the feature New feature or request label Jan 19, 2025

gromdimon added this to the v0.3.0 milestone Jan 19, 2025

gromdimon assigned anderlean Jan 19, 2025

gromdimon modified the milestones: v0.3.0, v0.2.2 Feb 11, 2025
