Replace AVS with custom ASR service #20
Comments
Bump! I'm trying to create an always-listening device, circumventing the wake-word mentality, and want to pass the audio downstream for processing. I'm having a heck of a time peeling back the layers. I'm looking for an example similar to the one above.
@spidey99 It seems like this repo is dead and no longer maintained. Moreover, the main contributors don't even answer emails, and I didn't find any help on the official forum either. Unfortunately, they flushed such a promising idea down the toilet.

I spent a lot of time poking around these repos and their dependencies, and finally decided to stop wasting time on this particular project. Actually, I believe the entire idea of reusing the ReSpeaker Core hardware with AVS is a dead end: it makes no sense to buy a $99 board just to get another Alexa, given that an Echo Dot is much cheaper, especially on Black Friday. In my view, Seeed Studio should have concentrated on the software part, letting developers all over the world easily connect their own STT/TTS services. That would make more sense for people who want to build an offline ASR solution for languages that aren't supported by Amazon or Google.

That's why I decided to focus my effort on extending the librespeaker samples. I now have a working prototype that can stream audio chunks to a custom WebSocket ASR server. Technically, there are two transports implemented in this repo, WS and MQTT, so we can send audio data the way we want. However, I'm not a C++ developer; my primary language is Java/TS, so there are still lots of things I want to improve but can't right now due to a lack of C++ expertise. If you have any ideas or suggestions, PRs are always welcome. I hope more people will want to resurrect and improve this idea, as it's really hard to do alone.
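For illustration, here's a rough sketch of the transport abstraction I mean (the class and method names below are made up for this example, not the actual ones from the repo): the DSP chain produces PCM chunks, and a WebSocket or MQTT implementation of one small interface ships them to the ASR server.

```cpp
#include <iostream>
#include <string>

using AudioChunk = std::string;  // one block of 16 kHz / 16-bit mono PCM

// Minimal transport interface: the DSP side only ever calls send(),
// so a WebSocket and an MQTT implementation are interchangeable.
struct AudioTransport {
    virtual ~AudioTransport() = default;
    virtual bool connect(const std::string& endpoint) = 0;
    virtual bool send(const AudioChunk& chunk) = 0;
    virtual void close() = 0;
};

// Stand-in implementation for local testing; a real WebSocket transport
// would wrap a client library such as websocketpp or Boost.Beast, and an
// MQTT one could wrap Eclipse Paho. Both only forward raw bytes.
struct LoggingTransport : AudioTransport {
    bool connect(const std::string& endpoint) override {
        std::cout << "connect " << endpoint << "\n";
        return true;
    }
    bool send(const AudioChunk& chunk) override {
        std::cout << "send " << chunk.size() << " bytes\n";
        return true;
    }
    void close() override { std::cout << "close\n"; }
};

int main() {
    LoggingTransport t;
    t.connect("ws://asr.example:8080/stream");  // hypothetical endpoint
    t.send(AudioChunk(512, '\0'));              // fake 512-byte chunk
    t.close();
    return 0;
}
```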
Maybe I am late and not quite understanding the context, but I think you can just use pyaudio to get the stream and push it into your ASR service.
@songtaoshi If you just get the stream from pyaudio, no DSP algorithms will be applied at all. It makes no sense to send a raw audio stream to ASR without preprocessing. This board's value lies entirely in its DSP (NS, BF, AEC, etc.), which can only be accessed programmatically via librespeaker. I don't believe anyone wants to use $99 hardware as just a USB mic array; there are much cheaper alternatives for that.
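To make that concrete, here's a rough sketch of the kind of chain librespeaker lets you build before the audio ever leaves the board, which raw pyaudio capture can't give you. The node names follow the librespeaker sample programs, but the exact `Create(...)` argument lists vary between releases, so treat them as assumptions and check the samples shipped with your version.

```cpp
#include <memory>
#include <string>
#include <respeaker.h>
#include <chain_nodes/pulse_collector_node.h>
#include <chain_nodes/vep_aec_beamforming_node.h>

using namespace respeaker;

static bool stop_flag = false;

int main() {
    // Head of the chain: capture from PulseAudio, resampling 48 kHz -> 16 kHz.
    std::unique_ptr<PulseCollectorNode> collector(
        PulseCollectorNode::Create_48Kto16K("default", 8 /* block size, ms */));

    // AEC + beamforming node: this is where the board earns its keep.
    // The argument list here is an assumption; see your release's samples.
    std::unique_ptr<VepAecBeamformingNode> dsp(
        VepAecBeamformingNode::Create(CIRCULAR_6MIC_7BEAM, false, 6, false));
    dsp->Uplink(collector.get());

    std::unique_ptr<ReSpeaker> rsp(ReSpeaker::Create());
    rsp->RegisterChainByHead(collector.get());
    rsp->RegisterOutputNode(dsp.get());

    if (!rsp->Start(&stop_flag)) return 1;
    while (!stop_flag) {
        // Listen() blocks until the chain has a processed chunk ready;
        // hand that chunk to whatever transport you use (WS, MQTT, ...).
        std::string chunk = rsp->Listen();
        // transport.send(chunk);
    }
    rsp->Stop();
    return 0;
}
```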
Hi @KillingJacky,
In the dev manual you mentioned:
Is there any reference on how to use `respeakerd` w/o AVS? I just want to apply DSP algorithms (AGC, NS, AEC, etc.) to the input audio stream captured from the Respeaker Core V2, and redirect the filtered audio as a byte array via web sockets to my ASR server. Is there any similar example? Or maybe you can provide a short description of what should be changed in the existing code to support such a scenario?

P.S. I saw the Python client in a separate repo, but it doesn't use any DSP.

Any help would be greatly appreciated.