Replace AVS with custom ASR service #20

Open
sskorol opened this issue Sep 13, 2020 · 5 comments


@sskorol

sskorol commented Sep 13, 2020

Hi @KillingJacky,

In the dev manual you mentioned:

It's also a good example showing how to utilize the librespeaker. Users can implement their own server application / daemon to invoke librespeaker.

Is there any reference on how to use respeakerd without AVS? I just want to apply DSP algorithms (AGC, NS, AEC, etc.) to the input audio stream captured from Respeaker Core V2, and send the filtered audio as a byte array over WebSockets to my ASR server. Is there a similar example? Or could you briefly describe what should be changed in the existing code to support such a scenario?

P.S. I saw a Python client in a separate repo, but it doesn't use any DSP.

Any help would be greatly appreciated.

@sskorol
Author

sskorol commented Nov 8, 2020

@fanjm95, @jerryyip maybe you have some thoughts, folks?

@spidey99

Bump!

I'm trying to create an always-listening device, circumventing the wake-word mentality, and want to pass the audio downstream for processing. I'm having a heck of a time peeling back the layers. I'm looking for an example similar to the one above.

@sskorol
Author

sskorol commented Nov 29, 2020

@spidey99 it seems this repo is dead and no longer maintained. Moreover, the main contributors don't even answer emails. I didn't find any help on the official forum either. Unfortunately, they flushed such a promising idea down the toilet.

I spent a lot of time poking around these repos and their dependencies. Finally, I decided to stop wasting time on this particular project. Actually, I believe the entire idea of re-using Respeaker Core hardware with AVS is a dead end, as it makes no sense to buy a $99 board to get another Alexa (given that an Echo Dot is much cheaper, especially on Black Friday).

In my view, Seeed Studio should have concentrated on software that lets developers all over the world easily connect their own STT/TTS services. That would make more sense for people who want to build an offline ASR solution for languages that aren't supported by Amazon or Google. That's why I decided to focus my effort on extending the librespeaker samples.

Now I have a working prototype that can stream audio chunks to a custom WebSocket ASR server. Technically, there are 2 transports implemented in this repo: WS and MQTT. So we can send audio data the way we want.
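
To illustrate the idea of swappable transports, here is a minimal Python sketch (not the prototype's actual C++ code). The names `Transport`, `CallbackTransport`, and `stream_chunks` are hypothetical and exist only for this example. A real WebSocket or MQTT client would be supplied by the caller as the send function; an in-memory list stands in for it here.

```python
from typing import Callable, Iterable, List, Protocol


class Transport(Protocol):
    """Hypothetical interface: anything that can ship one audio chunk."""

    def send(self, chunk: bytes) -> None: ...


class CallbackTransport:
    """Wraps any send function supplied by the caller, for example a
    WebSocket client's binary send or an MQTT publish bound to a topic."""

    def __init__(self, send_fn: Callable[[bytes], None]) -> None:
        self._send_fn = send_fn

    def send(self, chunk: bytes) -> None:
        self._send_fn(chunk)


def stream_chunks(chunks: Iterable[bytes], transport: Transport) -> int:
    """Send every non-empty chunk through the transport; return bytes sent."""
    total = 0
    for chunk in chunks:
        if not chunk:
            continue  # skip empty reads instead of sending empty messages
        transport.send(chunk)
        total += len(chunk)
    return total


# Usage with an in-memory stand-in for a real WS or MQTT client.
received: List[bytes] = []
sent = stream_chunks([b"\x01\x02", b"", b"\x03\x04"], CallbackTransport(received.append))
```

Because the streaming loop only depends on `send`, switching between WS and MQTT would come down to passing a different send function.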

However, I'm not a C++ developer; my primary languages are Java and TS. There are still lots of things I want to improve, but I can't do it right now due to a lack of C++ expertise. So if you have any ideas or suggestions, PRs are always welcome. I hope more people will want to resurrect and improve this idea, as it's really hard to do alone.

@songtaoshi

Maybe I am late and don't quite understand the context, but I think you can just use pyaudio to get the stream and push it to your ASR service.
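
A rough sketch of that raw-capture approach, under assumptions not stated in this thread: 16 kHz, 16-bit mono PCM sent in 100 ms chunks. `io.BytesIO` stands in for the microphone stream that pyaudio would provide, and the helper names are made up for illustration. Note that this forwards unprocessed audio only; no DSP is applied.

```python
import io
from typing import BinaryIO, Iterator

SAMPLE_RATE_HZ = 16000   # assumed sample rate
SAMPLE_WIDTH_BYTES = 2   # assumed 16-bit PCM
CHANNELS = 1             # assumed mono
CHUNK_MS = 100           # assumed chunk duration


def chunk_size_bytes(rate: int = SAMPLE_RATE_HZ,
                     width: int = SAMPLE_WIDTH_BYTES,
                     channels: int = CHANNELS,
                     ms: int = CHUNK_MS) -> int:
    """Bytes needed to hold `ms` milliseconds of PCM audio."""
    return rate * width * channels * ms // 1000


def read_chunks(source: BinaryIO, size: int) -> Iterator[bytes]:
    """Yield successive chunks of up to `size` bytes until the source is empty."""
    while True:
        chunk = source.read(size)
        if not chunk:
            break
        yield chunk


size = chunk_size_bytes()
# Fake "microphone": two full chunks of silence plus a short tail.
fake_mic = io.BytesIO(bytes(size * 2 + 10))
chunks = list(read_chunks(fake_mic, size))
```

Each yielded chunk could then be pushed to the ASR service as one binary message.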

@sskorol
Author

sskorol commented Jun 28, 2021

@songtaoshi if you just get the stream from pyaudio, no DSP algorithms will be applied at all. It makes no sense to send a raw audio stream to ASR without preprocessing. This board's value lies only in its DSP (NS, BF, AEC, etc.), which can be applied only programmatically via librespeaker. I don't believe anyone wants to use $99 hardware just as a USB mic array. There are much cheaper alternatives for that.
