Replies: 1 comment
- Great example, thanks for sharing 💯 Open to any patches to make this more straightforward; offline access is a key focus for this repo.
While developing with WhisperKit, we found that it connects to Hugging Face's servers by default. This can make it unusable in environments with poor connectivity or constrained network traffic.
To address this, I followed the pointers in #81 and ultimately implemented fully offline loading.
Setup
WhisperKit needs two things when loading a model: the model itself, such as openai_whisper-large-v3-v20240930_547MB, and the tokenizer corresponding to that model. To load entirely locally, both must be provided. Refer to the loading function below.
Note that you need to add both the model folder and the tokenizer folder to the app bundle under Copy Bundle Resources in Build Phases.
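A sketch of such a loading function is below. The `modelFolder:` and `tokenizerFolder:` initializer parameters reflect the WhisperKit API at the time of writing (verify against your version), and the bundled resource folder names are assumptions matching the example above:

```swift
import WhisperKit

/// Load WhisperKit entirely from files shipped in the app bundle,
/// so no connection to Hugging Face is needed at runtime.
func loadLocalWhisperKit() async throws -> WhisperKit {
    // Resource names assumed to match what was added in Copy Bundle Resources.
    guard let resources = Bundle.main.resourceURL else {
        throw CocoaError(.fileNoSuchFile)
    }
    let modelFolder = resources
        .appendingPathComponent("openai_whisper-large-v3-v20240930_547MB")
    let tokenizerFolder = resources
        .appendingPathComponent("tokenizerFolder")

    // Passing a local modelFolder skips the Hugging Face download entirely;
    // tokenizerFolder must preserve the models/openai/whisper-large-v3 layout.
    return try await WhisperKit(
        modelFolder: modelFolder.path,
        tokenizerFolder: tokenizerFolder
    )
}
```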
Script
To make model downloading and management easier, I wrote the following CI script.
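A sketch of the script, using `huggingface-cli` (the default variant values and the `Models` output directory are assumptions; adapt them to your setup):

```shell
#!/usr/bin/env bash
# Download a WhisperKit CoreML model and its matching tokenizer for offline use.
set -euo pipefail

# Model folder name from https://huggingface.co/argmaxinc/whisperkit-coreml
WHISPER_VARIANT="${WHISPER_VARIANT:-openai_whisper-large-v3-v20240930_547MB}"
# Tokenizer repo id; must match the case mapping in WhisperKit's Utils.swift
TOKENIZER_VARIANT="${TOKENIZER_VARIANT:-openai/whisper-large-v3}"
OUTPUT_DIR="${OUTPUT_DIR:-./Models}"

model_dir="${OUTPUT_DIR}/${WHISPER_VARIANT}"
# Preserve the tokenizerFolder/models/<org>/<model> structure WhisperKit expects
tokenizer_dir="${OUTPUT_DIR}/tokenizerFolder/models/${TOKENIZER_VARIANT}"
mkdir -p "${model_dir}" "${tokenizer_dir}"

if command -v huggingface-cli >/dev/null 2>&1; then
    # CoreML model files from the WhisperKit repo
    huggingface-cli download argmaxinc/whisperkit-coreml \
        --include "${WHISPER_VARIANT}/*" \
        --local-dir "${OUTPUT_DIR}"

    # Tokenizer files from the matching upstream repo
    huggingface-cli download "${TOKENIZER_VARIANT}" \
        --include "tokenizer*" "config.json" \
        --local-dir "${tokenizer_dir}"
else
    echo "huggingface-cli not found; skipping download" >&2
fi
```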
WHISPER_VARIANT is one of the model folder names in the Hugging Face repository provided by WhisperKit, i.e., https://huggingface.co/argmaxinc/whisperkit-coreml.
TOKENIZER_VARIANT should be confirmed against the switch cases in WhisperKit/Sources/WhisperKit/Core/Utils.swift (lines 366 to 394 at commit 0af7146).
It's worth noting that when mounting the tokenizer folder, you must point WhisperKit directly at tokenizerFolder itself and preserve the internal file structure tokenizerFolder/models/openai/whisper-large-v3.
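For reference, the resulting on-disk layout would look roughly like this (the individual file names inside each folder are illustrative and depend on the variant you download):

```
Models/
├── openai_whisper-large-v3-v20240930_547MB/
│   ├── AudioEncoder.mlmodelc/
│   ├── TextDecoder.mlmodelc/
│   ├── MelSpectrogram.mlmodelc/
│   └── config.json
└── tokenizerFolder/
    └── models/
        └── openai/
            └── whisper-large-v3/
                ├── tokenizer.json
                └── config.json
```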