faster whisper implementation (#177)
Co-authored-by: Nick Stogner <[email protected]>
Showing 15 changed files with 395 additions and 34 deletions.
@@ -0,0 +1,48 @@
# Configure Speech To Text

KubeAI provides a Speech to Text endpoint for transcribing audio files. This guide walks you through the steps to enable it.

## Enable Speech to Text model
You can add a model either by creating a Model CRD object or by enabling a model from the model catalog.

### Enable from model catalog
KubeAI provides predefined models in the model catalog. To enable the Speech to Text model, set its `enabled` flag to `true` in your `helm-values.yaml` file:

```yaml
models:
  catalog:
    faster-whisper-medium-en-cpu:
      enabled: true
      minReplicas: 1
```
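
The edited values file still has to be applied to the cluster before the catalog change takes effect. A minimal sketch, assuming KubeAI was installed with Helm under the release name `kubeai` from a chart referenced as `kubeai/kubeai` (both names are assumptions and are not stated in this guide; `apply_kubeai_values` is a hypothetical helper):

```bash
# Hypothetical helper: re-apply the Helm values file so the catalog change takes effect.
# Assumptions: release name "kubeai" and chart reference "kubeai/kubeai";
# substitute whatever your installation actually uses.
apply_kubeai_values() {
  local values_file="${1:-helm-values.yaml}"
  helm upgrade --install kubeai kubeai/kubeai -f "$values_file"
}
```

Calling `apply_kubeai_values ./helm-values.yaml` runs a single `helm upgrade --install` against that file. With no argument it falls back to `helm-values.yaml` in the current directory.
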

### Enable by creating Model CRD
You can also enable the Speech to Text model by creating a Model CRD object. Here is an example:
```yaml
apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: faster-whisper-medium-en-cpu
spec:
  features: [SpeechToText]
  owner: Systran
  url: hf://Systran/faster-whisper-medium.en
  engine: FasterWhisper
  minReplicas: 0
  maxReplicas: 3
  resourceProfile: cpu:1
```
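
To create the object, save the manifest to a file and apply it with `kubectl`. A minimal sketch, assuming the manifest was saved as `model.yaml` (a hypothetical file name) and that `kubectl` is already pointed at the cluster and namespace where KubeAI runs; `apply_model` is an illustrative helper, not part of KubeAI:

```bash
# Hypothetical helper: create or update the Model object from a manifest file,
# then read the same object back to confirm the API server accepted it.
apply_model() {
  local manifest="${1:-model.yaml}"
  kubectl apply -f "$manifest" && kubectl get -f "$manifest"
}
```

Using `kubectl get -f` with the same file avoids having to guess the resource's plural name or namespace.
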

## Usage
The Speech to Text endpoint is available at `/openai/v1/audio/transcriptions`.

Example usage with curl:

```bash
# Download a sample recording.
curl -L -o kubeai.mp4 https://github.com/user-attachments/assets/711d1279-6af9-4c6c-a052-e59e7730b757
# Upload it to the transcription endpoint as multipart form data.
curl http://localhost:8000/openai/v1/audio/transcriptions \
  -F "file=@kubeai.mp4" \
  -F "language=en" \
  -F "model=faster-whisper-medium-en-cpu"
```
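
For scripting, the same request can be wrapped in a small function. This is a sketch rather than an official client: `transcribe` is a hypothetical helper and `KUBEAI_URL` is an illustrative variable name. The default base URL `http://localhost:8000` is taken from the example above and assumes the KubeAI service is reachable there.

```bash
# Hypothetical helper: POST an audio file to the transcription endpoint.
# Usage: transcribe <audio-file> [model] [language]
transcribe() {
  local file="$1"
  local model="${2:-faster-whisper-medium-en-cpu}"
  local language="${3:-en}"
  # KUBEAI_URL is an assumed override; defaults to the address used in this guide.
  local base_url="${KUBEAI_URL:-http://localhost:8000}"

  # --fail makes curl exit non-zero on HTTP errors; -sS hides the progress
  # meter but still prints error messages.
  curl --fail -sS "${base_url}/openai/v1/audio/transcriptions" \
    -F "file=@${file}" \
    -F "language=${language}" \
    -F "model=${model}"
}
```

`transcribe kubeai.mp4` sends the file with the default model and language and prints the server's response to standard output. A non-zero exit status means the request failed.
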