[feature] add (LAION)-CLAP model embedding to FAD calculation #21
SAMPLE_RATE = 16000
# SAMPLE_RATE = 16000
I don't think this is used anywhere after your changes on passing in sample rate as argument.
In this case, we can remove this line.
Yes, we can remove it. I left it in case, for some reason, you didn't agree with the change.
def load_audio_task(fname, dtype="float32"):
def load_audio_task(fname, sample_rate, dtype="float32"):
# print("LOAD AUDIO TASK")
This line can be removed too.
Yes, I forgot to remove the debugging print.
ckpt_dir=None,
model_name="vggish",
model_name="vggish",
submodel_name="630k-audioset", # only for CLAP
This might introduce too many optional arguments for the instantiation of FrechetAudioDistance. While it is clear with comments right now, it might get bloated if we add more models that need different arguments in the future.
Moving forward, we probably need to introduce config files for different embedding models.
Let's leave it as is for now, and add a note here for future code refactoring.
I guess once we also add OpenL3 we will have to do some refactoring to make FrechetAudioDistance easy to understand.
Yes. We might need to define some data classes for the different models (one example is the different types of model configs in huggingface's transformers), and we can keep them as JSON files.
If you have better ideas feel free to suggest / contribute too! :)
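To make the data-class idea above concrete, here is a rough sketch of per-model configs that round-trip through JSON. Everything in it is an assumption for illustration: the class names `EmbeddingConfig` and `ClapConfig`, the `CONFIG_CLASSES` registry, and the helper functions are hypothetical and not part of the library. Only `model_name`, `submodel_name`, and the `630k-audioset` default come from the diff in this PR.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class EmbeddingConfig:
    """Settings shared by every embedding model (hypothetical base class)."""
    model_name: str
    sample_rate: int


@dataclass
class ClapConfig(EmbeddingConfig):
    """CLAP-only settings, kept off the FrechetAudioDistance signature."""
    submodel_name: str = "630k-audioset"


# Hypothetical registry: maps model_name to the config class that parses it.
CONFIG_CLASSES = {"vggish": EmbeddingConfig, "clap": ClapConfig}


def config_from_json(text: str) -> EmbeddingConfig:
    """Pick the config class from model_name and build it from JSON text."""
    data = json.loads(text)
    config_cls = CONFIG_CLASSES[data["model_name"]]
    return config_cls(**data)


def config_to_json(config: EmbeddingConfig) -> str:
    """Serialize a config so it can be stored as a JSON file."""
    return json.dumps(asdict(config), sort_keys=True)
```

With something like this, a new model would add one data class and one registry entry rather than another optional argument. The sample rates passed in are up to the caller; nothing here fixes them per model.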
@@ -0,0 +1,627 @@
{
This is super awesome! Thank you so much!
I am wondering if this can be added as unit tests because in that case we can add them to our CI as well, and the checks can be run during PRs. What do you think? (we don't necessarily need to address this in this PR though, this notebook is comprehensive enough for now)
I personally tried when I added PANN, but PANN takes too long to download and it stalls the CI process. VGGish is fast enough so I kept it in the unit test. I am not sure if it takes long to download the CLAP model.
Glad you find it useful. In some cases I find notebooks a much more immediate way to understand the code.
I don't have a specific opinion about adding the code as unit tests. It would be useful but it might also have the same issue with CLAP since each checkpoint is about 2GB.
I think the notebook can work both as a test as well as an easy and quick way for users to copy/paste the section of code they want to use.
If we include a section in the README pointing to the notebook, anyone can immediately test the code and use what they need.
Yup for now running the notebook and adding instructions on README should suffice. I will add them in the README.
In the meantime, let me also think if there are better ways to integrate checks in CI. This is more from a maintenance perspective though, having CI would be beneficial as a fast quality and correctness check.
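One possible way to keep CI fast while still having tests for the heavy models is to make the slow ones opt-in. The sketch below uses only the standard library's `unittest`. The environment variable `FAD_RUN_SLOW_TESTS`, the helper `slow_tests_enabled`, and the test names are hypothetical, and the test bodies are placeholders rather than real FAD checks.

```python
import os
import unittest


def slow_tests_enabled(environ=os.environ) -> bool:
    """Return True only when the (hypothetical) opt-in flag is set to "1"."""
    return environ.get("FAD_RUN_SLOW_TESTS") == "1"


class TestEmbeddingModels(unittest.TestCase):
    def test_vggish(self):
        # VGGish downloads quickly, so this test always runs.
        # Placeholder assertion; a real test would compute a FAD score.
        self.assertTrue(True)

    @unittest.skipUnless(
        slow_tests_enabled(),
        "CLAP checkpoints are about 2GB; set FAD_RUN_SLOW_TESTS=1 to run",
    )
    def test_clap(self):
        # Skipped by default so the large download never stalls CI.
        self.assertTrue(True)
```

CI would leave the variable unset and report the CLAP test as skipped, while a maintainer could export `FAD_RUN_SLOW_TESTS=1` locally to run the full set before a release.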
@mcomunita sorry for the late reply, I was on vacation last week. Thank you for your contribution, CLAP comes in super handy. Comments as above: overall just some minor refactoring is needed, plus a few points on which I would love to have your feedback. Once they are addressed, let's get this merged! Thanks again!
Do you want me to make the corrections, or are you going to take care of it?
@mcomunita I can do it later, no worries. Will merge PR once done!
test/test_all.ipynb
notebook to test all configurations available