This guide is designed to help you navigate through our project's issues based on their difficulty level.
These are beginner-friendly issues that typically require less prior knowledge of the project. They serve as a great starting point for newcomers.
- Review the code and the software design, and look for ways to make the code more optimized and of higher quality.
- Both the client and the server could be better designed, especially with regard to object-oriented design.
Give free rein to your imagination to improve the UI! In particular, the audio spectrum animation is low-resolution.
In version 1.0.0 the animation moves only when the assistant speaks. It could be interesting to also give feedback while the app is recording the user, though perhaps not in the same way. Version 1.0.0 uses a REC button.
These issues might require a bit more understanding of the project but are still approachable if you have some related experience.
By default, the Whisper API only supports files smaller than 25 MB; above that size, the API call fails. Longer audio therefore has to be split into several smaller files.
More details here: OpenAI API documentation
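One possible starting point for the split is to plan the time ranges first, so that each piece stays under the upload limit. The sketch below is only an illustration: it assumes a roughly constant bitrate (size proportional to duration), and `planChunks`, `Chunk`, and `safetyMargin` are hypothetical names that exist neither in the project nor in the OpenAI API. Cutting the audio at those timestamps would still need an audio tool of your choice.

```typescript
// Upload limit described above. Interpreting "25 MB" as 25 * 1024 * 1024 bytes
// is an assumption; the safety margin below also covers the decimal reading.
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024;

interface Chunk {
  startSec: number;
  endSec: number;
}

// Plan equal-length time ranges so that each chunk stays under the size limit.
// Assumes a roughly constant bitrate, so size scales linearly with duration.
function planChunks(
  fileBytes: number,
  durationSec: number,
  maxBytes: number = MAX_UPLOAD_BYTES,
  safetyMargin: number = 0.9, // hypothetical head-room for container overhead
): Chunk[] {
  if (fileBytes <= 0 || durationSec <= 0) return [];

  const bytesPerSec = fileBytes / durationSec;
  const maxChunkSec = (maxBytes * safetyMargin) / bytesPerSec;

  // Use the smallest number of chunks that respects the limit,
  // then spread the duration evenly across them.
  const count = Math.ceil(durationSec / maxChunkSec);
  const chunkSec = durationSec / count;

  const chunks: Chunk[] = [];
  for (let i = 0; i < count; i++) {
    chunks.push({
      startSec: i * chunkSec,
      endSec: Math.min((i + 1) * chunkSec, durationSec),
    });
  }
  return chunks;
}
```

For a one-hour, 60 MB recording, this plans three 20-minute chunks of about 20 MB each. Cutting at fixed timestamps can split a word in half, so a real implementation would probably prefer to cut on silences.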
Avoiding saving user recordings and ElevenLabs answers to files should improve response time. Perhaps something like a cache-based save instead?
Pay attention to the OpenAI transcription call, which currently requires a file path as a parameter; perhaps a cache file path would work? Any approach is fine as long as it is faster to process, since the goal is a better user experience through shorter response times.
These are challenging issues suitable for contributors who are familiar with the project or have a deep understanding of the relevant technologies.
As this was a prototype, we decided that starting the Node server would start a single chat. This means that all Electron clients connected to the WebSocket share the same chat.
For a more general solution, it would be great to handle one chat per connected client.
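As a rough illustration of the idea, the server could keep a separate message history for each connection instead of a single global one. The sketch below is not the project's current code: `ChatRegistry`, `ChatMessage`, and the `clientId` key are hypothetical names, and how an id is assigned to each WebSocket connection is left open.

```typescript
type Role = "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Keeps one independent chat history per connected client.
class ChatRegistry {
  private chats = new Map<string, ChatMessage[]>();

  // Call when a client connects; does nothing if the chat already exists.
  open(clientId: string): void {
    if (!this.chats.has(clientId)) {
      this.chats.set(clientId, []);
    }
  }

  // Adds a message to the chat of the given client only.
  append(clientId: string, message: ChatMessage): void {
    const history = this.chats.get(clientId);
    if (history === undefined) {
      throw new Error(`unknown client: ${clientId}`);
    }
    history.push(message);
  }

  // Returns a copy so callers cannot mutate the stored history.
  history(clientId: string): ChatMessage[] {
    return [...(this.chats.get(clientId) ?? [])];
  }

  // Call when a client disconnects, to release its history.
  close(clientId: string): void {
    this.chats.delete(clientId);
  }

  get size(): number {
    return this.chats.size;
  }
}
```

With such a registry, the connection handler would call `open` when a client connects and `close` when it disconnects, and every request would read and write only that client's history.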
Version 1.0.0 uses ElevenLabs text-to-speech: documented here
Using the streaming text-to-speech feature available from the ElevenLabs API should improve performance: here
Phrase parsing is done on the server side. It takes the streamed GPT answer (word by word) and splits the content into sentences or parts of sentences, so that they can be processed faster while respecting their order.
By analysing the assistant's behaviour (user experience), improve the phrase parser so that it splits the GPT answer more efficiently (aiming for better response time, better pronunciation, and so on).
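To make the discussion concrete, here is a minimal sketch of a streaming phrase splitter. It is not the project's actual parser: the class name `PhraseParser`, the `minLength` parameter, and the choice of boundary characters are all assumptions made for illustration. It buffers incoming tokens and emits a phrase once it sees a punctuation mark followed by whitespace, merging very short fragments with the next phrase.

```typescript
// Buffers streamed tokens and emits complete phrases in order.
class PhraseParser {
  private buffer = "";
  private readonly minLength: number;

  // minLength is a hypothetical tuning knob: phrases shorter than this
  // are merged with the following one instead of being emitted alone.
  constructor(minLength: number = 20) {
    this.minLength = minLength;
  }

  // Feed one streamed token; returns the phrases completed by it, in order.
  push(token: string): string[] {
    this.buffer += token;
    const phrases: string[] = [];

    // A phrase ends at ., !, ? or ; followed by whitespace. Requiring the
    // whitespace avoids cutting inside numbers such as "3.14".
    const boundary = /[.!?;]\s+/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.buffer)) !== null) {
      const end = match.index + 1; // keep the punctuation mark
      const candidate = this.buffer.slice(start, end).trim();
      if (candidate.length >= this.minLength) {
        phrases.push(candidate);
        start = boundary.lastIndex;
      }
    }

    // Keep the unfinished tail for the next token.
    this.buffer = this.buffer.slice(start);
    return phrases;
  }

  // Call at the end of the stream to emit whatever is left.
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? [rest] : [];
  }
}
```

This sketch does not handle abbreviations such as "Dr. Smith", which it would split after "Dr." once the minimum length is reached. A lower `minLength` should start speech sooner at the cost of choppier audio, which is exactly the kind of trade-off this issue asks contributors to explore.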
Make voice recording automatic, so that pressing the space bar is no longer needed to record.
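One common way to trigger recording automatically is energy-based voice activity detection: start when the input gets loud enough, and stop after a short run of quiet frames. The sketch below only illustrates that technique and is not how the project currently records. The names `VoiceActivityDetector`, `threshold`, and `hangoverFrames` are hypothetical, the default values are untuned guesses, and frames are assumed to be `Float32Array` samples in the range -1 to 1.

```typescript
// Root-mean-square level of one audio frame (0 for an empty frame).
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  const sumOfSquares = frame.reduce((acc, sample) => acc + sample * sample, 0);
  return Math.sqrt(sumOfSquares / frame.length);
}

type VadEvent = "start" | "stop" | null;

// Energy-based voice activity detector with a simple hangover period.
class VoiceActivityDetector {
  private recording = false;
  private silentFrames = 0;
  private readonly threshold: number;
  private readonly hangoverFrames: number;

  // threshold: RMS level treated as speech (assumed value, needs tuning).
  // hangoverFrames: consecutive quiet frames required before stopping.
  constructor(threshold: number = 0.02, hangoverFrames: number = 15) {
    this.threshold = threshold;
    this.hangoverFrames = hangoverFrames;
  }

  // Feed one frame; returns "start" or "stop" when the state changes.
  process(frame: Float32Array): VadEvent {
    const loud = rms(frame) >= this.threshold;

    if (!this.recording) {
      if (loud) {
        this.recording = true;
        this.silentFrames = 0;
        return "start";
      }
      return null;
    }

    if (loud) {
      this.silentFrames = 0; // speech resumed, reset the hangover counter
      return null;
    }

    this.silentFrames += 1;
    if (this.silentFrames >= this.hangoverFrames) {
      this.recording = false;
      this.silentFrames = 0;
      return "stop";
    }
    return null;
  }
}
```

The hangover period keeps short pauses between words from ending the recording. A fixed threshold is fragile in noisy rooms, so a real implementation might adapt it to the background level.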
Interrupting the assistant could be a great feature to improve user experience and chat realism.
- Select an Issue: Browse through the lists above and choose an issue you're interested in.
- Inform the project owners: Comment on the issue expressing your interest. This ensures that multiple contributors aren't working on the same issue simultaneously.
- Review the Contributing Guide: Before starting, familiarize yourself with the contributing guidelines of the project.
- Fork, Clone, and Work: Fork the project repository, clone your fork locally, create a new branch for your chosen issue, and start working! Don't forget to check the Getting started guide to ease your work.
- Submit a Pull Request: Once you've made your changes, push them to your fork and submit a pull request to the main repository.
Thank you for your interest in contributing to this project. Together, we can make this project even better!