Wanting to spark a dialogue on this subject #8

Open
lanmower opened this issue Jun 10, 2024 · 4 comments
Comments

@lanmower

I'm very interested in this project. Not knowing much about the process, I blindly tried to use GPT to solve this problem, but your project is much nicer and further down the production line...

I ran a kick through it and it sounded a little bassy, so I was thinking about starting a discussion on how to improve its output. The transients seem OK and it did pick up something that resembles a sine, which is a good starting point. The next step would be getting it to pick up the pitch envelope.

Do you have any thoughts on how to make it sensitive to that?
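
To make the question concrete, here is a minimal sketch of what "picking up the pitch envelope" of a kick could mean. It is not how Syntheon works; it only illustrates one simple technique (measuring the spacing of rising zero crossings) on a synthetic kick. The names `synth_kick` and `pitch_envelope`, and all parameter values, are assumptions made up for this illustration.

```python
import math

def synth_kick(sr=16000, dur=0.3, f_start=200.0, f_end=50.0, tau=0.05):
    """Hypothetical test signal: a sine whose pitch decays from f_start to f_end."""
    samples, phase = [], 0.0
    for n in range(int(sr * dur)):
        t = n / sr
        freq = f_end + (f_start - f_end) * math.exp(-t / tau)
        phase += 2.0 * math.pi * freq / sr
        amp = math.exp(-t / 0.15)  # simple amplitude decay
        samples.append(amp * math.sin(phase))
    return samples

def pitch_envelope(samples, sr):
    """Return (time_s, freq_hz) pairs from the spacing of rising zero crossings."""
    crossings = []
    for n in range(1, len(samples)):
        a, b = samples[n - 1], samples[n]
        if a < 0.0 <= b:
            # Linear interpolation gives a sub-sample crossing position.
            crossings.append((n - 1) + (-a) / (b - a))
    envelope = []
    for c0, c1 in zip(crossings, crossings[1:]):
        period_s = (c1 - c0) / sr
        envelope.append(((c0 + c1) / (2.0 * sr), 1.0 / period_s))
    return envelope

sr = 16000
env = pitch_envelope(synth_kick(sr=sr), sr)
for time_s, freq_hz in env[:3] + env[-3:]:
    print(f"{time_s:.3f} s  {freq_hz:.1f} Hz")
```

Zero-crossing spacing only works on clean, nearly sinusoidal material like this test tone. Real drum recordings with noisy transients would probably need a more robust pitch tracker.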

@lanmower
Author

OK, so I'm running a batch of sounds through it here. I've got 1,000 analog drum sounds; could we use these as ground truth to test and improve the output for this kind of input? I can already see some things that appear to be lost in translation...

image

This is an example. In many sounds there appears to be a phasing issue at the end of the wavetable it generates.

image

On tracks with short envelopes, it appears to have trouble picking up the length of the envelope, making them too short.

@lanmower
Author

Looking at my version, it really appears that something is wrong with it, because I see this in the output:
C:\app\WPy64-310111\python-3.10.11.amd64\lib\site-packages\syntheon\inferencer\vital\models\preprocessor.py:127: FutureWarning: Pass sr=16000 as keyword args. From version 0.10 passing these as positional arguments will result in an error
x, sr = librosa.load(f, sampling_rate)
C:\app\WPy64-310111\python-3.10.11.amd64\lib\site-packages\librosa\core\convert.py:1332: RuntimeWarning: divide by zero encountered in log10
  + 2 * np.log10(f_sq)
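
A note on the two warnings, as a hedged reading rather than a confirmed diagnosis. The first one says what it wants: passing the sample rate by keyword, i.e. `librosa.load(f, sr=sampling_rate)` in `preprocessor.py`, should silence it. The second points at the `log10(f_sq)` term of an A-weighting computation, which becomes `log10(0)` when the frequency list includes the 0 Hz bin. The sketch below re-implements the standard A-weighting curve in plain Python to show where the zero comes from and one way to guard it. The function name `a_weighting_db` and the `min_db` floor of -80 dB are assumptions for this illustration, not Syntheon or librosa code.

```python
import math

# Squared corner frequencies (Hz) of the standard A-weighting curve.
_CONST = [c ** 2 for c in (12194.217, 20.598997, 107.65265, 737.86223)]

def a_weighting_db(freq_hz, min_db=-80.0):
    """A-weighting in dB for one frequency, with 0 Hz clamped to min_db."""
    f_sq = freq_hz ** 2
    if f_sq == 0.0:
        # log10(0) is -inf: this is the term the RuntimeWarning points at.
        return min_db
    weight = 2.0 + 20.0 * (
        math.log10(_CONST[0])
        + 2 * math.log10(f_sq)
        - math.log10(f_sq + _CONST[0])
        - math.log10(f_sq + _CONST[1])
        - 0.5 * math.log10(f_sq + _CONST[2])
        - 0.5 * math.log10(f_sq + _CONST[3])
    )
    return max(min_db, weight)

for f in (0.0, 100.0, 1000.0):
    print(f, round(a_weighting_db(f), 2))
```

If the result is clamped to a floor afterwards, the warning is most likely cosmetic and unrelated to the envelope and wavetable problems described above.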

@gudgud96
Owner

@lanmower Thank you for your interest in this project! I am currently not focusing fully on Syntheon, but I'm happy to discuss improvements.

For the issue where the kick sounds "bassy", one option is to introduce filter modulation. This might be made possible by recent related research.

For the shorter-than-expected envelopes: we rely on librosa.onset.onset_detect to detect onsets and cut out a one-shot sample for further analysis. The onset detection can go wrong, and when it does, the one-shot usually ends up too short (which might also affect the resulting wavetable). One option is to migrate to another onset detection / transcription library (e.g. Essentia or BasicPitch), but each comes with its own inaccuracies, and some might not work well on drums.
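
As a point of comparison for the one-shot length problem, here is a minimal sketch of an energy-based length estimate that does not depend on onset detection at all. It is not what Syntheon does and not a drop-in replacement. It only shows the idea of measuring how long the frame RMS stays above a floor relative to the loudest frame. The function name `one_shot_length`, the 256-sample frame, and the -60 dB floor are assumptions chosen for this illustration.

```python
import math

def one_shot_length(samples, sr, frame=256, floor_db=-60.0):
    """Seconds until the frame RMS last reaches floor_db relative to the peak frame."""
    if not samples:
        return 0.0
    rms = []
    for start in range(0, len(samples), frame):
        chunk = samples[start:start + frame]
        rms.append(math.sqrt(sum(x * x for x in chunk) / len(chunk)))
    peak = max(rms)
    if peak == 0.0:
        return 0.0  # all-silent input
    threshold = peak * 10.0 ** (floor_db / 20.0)
    last_active = max(i for i, r in enumerate(rms) if r >= threshold)
    return min(len(samples), (last_active + 1) * frame) / sr

# Hypothetical test signal: a 100 Hz tone decaying with a 50 ms time constant,
# padded out to one second so the tail is effectively silent.
sr = 16000
tone = [math.exp(-n / (0.05 * sr)) * math.sin(2 * math.pi * 100 * n / sr)
        for n in range(sr)]
length = one_shot_length(tone, sr)
print(f"estimated length: {length:.3f} s")
```

A fixed dB floor is a design trade-off: it avoids truncating long tails, but on recordings with a noisy background it would overestimate the length, so the floor would need tuning against real drum samples.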

Happy to have a look at the analog drum sounds too.

@lanmower
Author

I think I get it. After your explanation, the output sounds make a lot more sense. A lot of the sounds come out very rich in the treble range; for instance, my modified sine input produces a very sharp output.

image

image

bass.zip
