[Feature] Pitch-retaining speed stretching with Phase Vocoder #1

Open
wants to merge 2 commits into
base: main
@CSharperMantle commented Jun 28, 2023

Summary

This pull request adds a phase vocoder-based time stretcher that changes audio playback speed without altering the original pitch, at a reasonable performance cost.

Resolves TeamFlos/phira#109.
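
For readers unfamiliar with the technique named in the summary, the core of a textbook phase vocoder is the per-bin phase update sketched below. This is a generic illustration, not the code in this PR: the function names, the parameters, and the relation `speed = analysis_hop / synthesis_hop` are assumptions, and the actual implementation may handle phases differently.

```rust
use std::f64::consts::PI;

/// Wrap a phase value into the range [-PI, PI].
fn wrap_phase(phase: f64) -> f64 {
    phase - 2.0 * PI * (phase / (2.0 * PI)).round()
}

/// Advance the synthesis phase of one frequency bin by one frame.
/// Hypothetical helper for illustration; all names are assumptions.
pub fn advance_bin_phase(
    bin: usize,
    fft_size: usize,
    analysis_hop: usize,
    synthesis_hop: usize,
    prev_analysis_phase: f64,
    cur_analysis_phase: f64,
    prev_synthesis_phase: f64,
) -> f64 {
    // Centre frequency of this bin in radians per sample.
    let bin_freq = 2.0 * PI * bin as f64 / fft_size as f64;
    // Phase advance expected over one analysis hop if the signal sat exactly on the bin.
    let expected = bin_freq * analysis_hop as f64;
    // Deviation of the measured advance from the expected one, wrapped to [-PI, PI].
    let deviation = wrap_phase(cur_analysis_phase - prev_analysis_phase - expected);
    // Estimated instantaneous frequency of the component in this bin.
    let true_freq = bin_freq + deviation / analysis_hop as f64;
    // Re-accumulate the phase over the synthesis hop so the pitch is preserved.
    prev_synthesis_phase + true_freq * synthesis_hop as f64
}
```

Frames are read every `analysis_hop` samples and written every `synthesis_hop` samples, so the duration changes while each bin keeps its estimated frequency.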

Performance considerations

The proposed code activates the audio stretcher only when it is needed, so normal gameplay should see no performance regression; the only affected case is Exercises Mode with a non-default speed setting. The stretcher also runs as a preprocessor during audio loading rather than during playback, so it should add no perceivable in-game latency. This is the most critical factor for a real-time rhythm game like Phira.
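
The gating described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: `prepare_samples`, the `stretch` callback, and the tolerance `EPS` are hypothetical names introduced here.

```rust
/// Hypothetical load-time hook: `stretch` stands in for the real stretcher.
/// Returns the samples untouched at the default speed, so the normal game
/// path does no extra work.
pub fn prepare_samples<F>(samples: Vec<f32>, speed: f32, stretch: F) -> Vec<f32>
where
    F: FnOnce(&[f32], f32) -> Vec<f32>,
{
    // Assumed tolerance for treating a speed setting as "default".
    const EPS: f32 = 1e-3;
    if (speed - 1.0).abs() < EPS {
        return samples; // default speed: bypass the stretcher entirely
    }
    // Non-default speed: pay the stretching cost once, while loading.
    stretch(&samples, speed)
}
```

Because the work happens once before playback, the audio callback itself stays unchanged regardless of the speed setting.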

Because it uses the highly optimized rustfft library, the proposed code is automatically accelerated on desktops (with Intel AVX and SSE) and on mobile platforms (with Arm NEON). On low-end devices there may still be a small delay between clicking the triangular "Play" button and the music starting to play, but the author considers the delay acceptable.

Known limitations

The quality of the stretched audio is limited by the algorithm itself, which is a lossy, predictive process.

Per-audio parameter tuning, such as choosing the window function and window size, may be necessary to achieve optimal quality. That said, the settings proposed in this PR already produce acceptable output. The most significant quality loss occurs in the low-frequency parts of the audio, while the more perceivable mid- to high-frequency components are largely unaffected.
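
To make the two tuning knobs concrete, the sketch below builds a periodic Hann window and computes the width of one FFT bin for a given window size. The PR text does not say which window function or size it uses, so the Hann choice and the function names are assumptions for illustration only.

```rust
use std::f32::consts::PI;

/// Periodic Hann window of `size` samples (one common choice, assumed here).
pub fn hann_window(size: usize) -> Vec<f32> {
    (0..size)
        .map(|i| 0.5 - 0.5 * (2.0 * PI * i as f32 / size as f32).cos())
        .collect()
}

/// Width of one FFT bin in Hz for a given sample rate and window size.
pub fn bin_width_hz(sample_rate: f32, window_size: usize) -> f32 {
    sample_rate / window_size as f32
}
```

At 44.1 kHz a 1024-sample window gives bins about 43 Hz wide, while a 4096-sample window narrows them to about 11 Hz at the cost of coarser time resolution. Coarse bins relative to low-pitched content are one plausible reason low frequencies suffer most, though the PR does not state the cause.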

Licensing

The proposed code was written by Rong "M." Bao, the author of this PR. The implementation is adapted from a repository created by Andrew Yoon and licensed under the permissive CC0 1.0 Universal license. The algorithm is adapted from a repository written by Nasca O. Paul, which has been placed in the public domain.

The author of this PR and of the related documentation and code ("Code" hereafter) formally agrees that his Code may be licensed, used, and distributed under whatever license the main repository ( https://github.com/Mivik/sasa ) uses.

Modify music loading mechanism to incorporate phase vocoder-based time stretching, so that audio pitch would not be changed while altering speed of playback.
Remove trailing whitespaces. Remove redundant braces in import stmt.
Development

Successfully merging this pull request may close these issues.

[Feature request] Changing the speed in practice mode should not affect the pitch