Support force activated pipelines #144
I feel like a remote-activated function like this needs a config option to enable/disable it, ideally defaulting to OFF. This feels like a privacy issue waiting to happen.
It could be easily made optional. But I'm not sure what scenario you have in mind, who is the adversary?
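For illustration, the opt-in switch suggested above might look roughly like the sketch below. The flag name `--allow-force-activation` and the helper `should_force_activate` are hypothetical and are not part of this PR; the only detail taken from the PR is that force activation is requested with `start_stage = asr`.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical opt-in flag (defaults to OFF)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--allow-force-activation",  # hypothetical flag name, not in the PR
        action="store_true",
        help="Let the server start streaming without a wake word",
    )
    return parser


def should_force_activate(start_stage: Optional[str], allowed: bool) -> bool:
    """Honor a server request to start at ASR only if the user opted in."""
    return allowed and start_stage == "asr"
```

With this shape, a satellite started without the flag would ignore a force-activation request and keep waiting for the wake word.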
In general I'm leaning towards the "prank" level of privacy, so someone with access to the network. However, from my understanding so far, there is no authentication/authorization in the wyoming protocol? (Please correct me if I'm wrong, but I did not find any reference that shows any kind of authentication happening.) This means, by extension, that the satellite could be controlled by someone who is NOT the HA server. Which means anyone in the network could trigger audio streaming, bypassing voice/hotword activation, to basically anywhere in the network.
This is an interesting security discussion in general, maybe encryption/authentication could be added to wyoming (similarly to esphome). But it's completely orthogonal to this PR. Even without this PR a malicious user on the local network can connect to the satellite and stream audio. If wake-word detection happens locally on the satellite the malicious user would simply need to wait until the first detection, and then he could stream audio forever (pretending the pipeline never ends). Or, to avoid waiting, he could even do a more exotic attack like send an |
Thank you for this insight! I was not aware the protocol allowed that much freedom :D I've filed an issue in the protocol repo here, as this looks like a more foundational topic: rhasspy/wyoming#11
Hey @chatziko, very interesting approach. Thanks for the contribution. Don't know if you saw this one, it's a very long thread, but you can go bottom-up: #81. I did that and a few other "exploration" changes; however, PRs are not going through.
Hey @llluis , wow, that's a long discussion in #81, I certainly missed it otherwise I would have participated. I hadn't followed I had a quick look at the code in
So in the end I found the approach of adding PS. Concerning |
Very nice PRs. I've tested them, including the button in HA. Thank you very much. I sincerely hope that this work can be reviewed and merged in the future. I wrote a small guide there for anyone interested in testing and using it in their setup.
Awesome, thanks for testing and writing a guide.
Not sure at this point if that's related to one of the PRs, but the button entity was named
Do you know if this activation can be associated with one of the current events like |
It should work, but of course with the events that are actually happening. No wakeword was detected so it makes sense that |
I think I've tested with all the events around the detection without success, but I'll try again with that one and report the results here. It could also be an issue in the original project, so I'll test the standard workflow with the wake word. Thank you for the quick answer.
Quite sad that rhasspy/wyoming#10 did not make it to 1.5.4 since it was ready for a long time.
Can't wait for this! I'm really hoping that after this is implemented, we can tweak the LLM call to include a "Follow up?" parameter which can just trigger an immediate detection event on the satellite so it can start listening right away. Basically, this gets us one step closer to true two-way, multi-step communication with LLMs. Very exciting.
Excited for this to be implemented. This would allow me to completely replace my commercial voice assistants. Actionable notifications and complex conversations will be great to have.
This PR contains the client side of a "push-to-talk"-like feature that allows the server to "force activate" the satellite, going directly to ASR. The change is simple: when the server sends `run-satellite` with `start_stage = asr`, we send back a `run-satellite` (as usual) with that stage and start streaming immediately. More details will be given in the corresponding PR in `home-assistant/core`.
#143 is recommended for this to work properly; otherwise the awake sound and debug recording will not be triggered for "force activated" pipelines. I made separate PRs for easier review; if you merge #143 first, I can take care of the conflicts.
Note also that this wyoming change is needed by this PR.
Finally, note that this is about a server-side "push-to-talk", not a client-side one as in #82.
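The force-activation flow described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in this PR: events are plain dictionaries, `SatelliteSketch` is a hypothetical stand-in for the real satellite class, and the reply event name simply follows the wording of the description.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Reply event name, following the wording of the description above;
# treat it as a placeholder rather than a confirmed wire-level name.
REPLY_EVENT_TYPE = "run-satellite"


@dataclass
class SatelliteSketch:
    """Hypothetical stand-in for the satellite's state, not the real class."""

    is_streaming: bool = False
    sent_events: List[Dict[str, Any]] = field(default_factory=list)

    def handle_event(self, event: Dict[str, Any]) -> None:
        if event.get("type") != "run-satellite":
            return  # other events are out of scope for this sketch

        start_stage = (event.get("data") or {}).get("start_stage")
        if start_stage == "asr":
            # Force activation: skip wake-word detection, announce the
            # stage to the server, and begin streaming audio right away.
            self.sent_events.append(
                {"type": REPLY_EVENT_TYPE, "data": {"start_stage": "asr"}}
            )
            self.is_streaming = True
        else:
            # Usual path: stay idle until the wake word is detected.
            self.is_streaming = False
```

In this model, a `run-satellite` event without `start_stage` leaves the satellite idle, which mirrors the existing wake-word workflow.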