While I have not tried the new microWakeWord model yet, which might fix this issue anyways, I had the following thought:
To reduce the number of false positives, and potentially allow speaker identification and custom-verifier-style checks while still keeping the pipeline low latency, one option would be to add a service that runs wakeword verification after the wakeword has initially been detected, and cancels the pipeline if it judges the detection to be false.
To explain further:
The satellite runs its own local wakeword model and triggers the pipeline when it detects a wakeword. Along with this pipeline trigger, it also sends the last ~2 seconds of audio, which is fed into the model of the secondary wakeword service. That service then runs, for example, a custom verifier/raven speaker identification, etc. If it also detects the wakeword, it does nothing. If it does not detect the wakeword, it sends a message to cancel the pipeline that the wakeword triggered.
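A rough sketch of that flow, to make the idea concrete. Everything here is hypothetical: `AudioRingBuffer`, `Pipeline`, `on_local_wake_word`, the 16 kHz sample rate, and the `verifier` callable are placeholder names and values for illustration, not existing satellite or pipeline APIs. The verification call is synchronous only to keep the sketch short; in practice it would run alongside the already-started pipeline.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List

SAMPLE_RATE = 16_000   # assumed sample rate in Hz (not from the proposal)
BUFFER_SECONDS = 2.0   # the "last ~2 seconds" of audio from the proposal


class AudioRingBuffer:
    """Keeps only the most recent `seconds` of audio samples."""

    def __init__(self, seconds: float = BUFFER_SECONDS, rate: int = SAMPLE_RATE) -> None:
        # A deque with maxlen silently drops the oldest samples when full.
        self._samples: Deque[int] = deque(maxlen=int(seconds * rate))

    def write(self, chunk: List[int]) -> None:
        self._samples.extend(chunk)

    def snapshot(self) -> List[int]:
        """Copy of the buffered audio to send along with the pipeline trigger."""
        return list(self._samples)


@dataclass
class Pipeline:
    """Hypothetical stand-in for a voice pipeline that is already running."""
    pipeline_id: int
    cancelled: bool = False

    def cancel(self) -> None:
        self.cancelled = True


def on_local_wake_word(
    buffer: AudioRingBuffer,
    pipeline: Pipeline,
    verifier: Callable[[List[int]], bool],
) -> bool:
    """Run secondary verification for a pipeline the satellite already started.

    Returns True if the pipeline is left running, False if it was cancelled.
    """
    audio = buffer.snapshot()
    if verifier(audio):
        return True       # secondary model agrees: do nothing
    pipeline.cancel()     # secondary model disagrees: cancel the pipeline
    return False
```

The `verifier` argument is where a custom verifier or speaker-identification model would plug in; it only needs to take the buffered audio and return whether it also hears the wakeword.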
The idea is similar to how Echo devices handle it: they can trigger locally and then send the wakeword audio to a cloud server, which does secondary verification. The short light blink that results is better than a TTS reply saying that something was not understood.
The only downside is that, in order to stay low latency, this does not allow any activation sounds to be cancelled.