[feature] add (LAION)-CLAP model embedding to FAD calculation #21

Merged: 4 commits, Oct 17, 2023


mcomunita
Contributor

  • add CLAP embeddings to FAD calculation
  • add all checkpoints available from https://github.com/LAION-AI/CLAP
  • add test/test_all.ipynb notebook to test all configurations available
  • fix the error caused by SAMPLE_RATE being hard-coded to 16000
  • update requirements


- SAMPLE_RATE = 16000
+ # SAMPLE_RATE = 16000
Owner

I don't think this is used anywhere after your changes on passing in sample rate as argument.
In this case, we can remove this line.

Contributor Author

Yes, we can remove it. I left it in case, for some reason, you didn't agree with the change.
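A minimal sketch of the change discussed in this thread: the target rate is passed in as an argument instead of being read from a module-level `SAMPLE_RATE`. The helper names `resample_linear` and `prepare_audio` are hypothetical, and linear interpolation is only a dependency-free stand-in; it is not meant to reflect how the library actually loads or resamples audio.

```python
def resample_linear(samples, orig_sr, target_sr):
    """Resample a mono signal by linear interpolation (illustrative only)."""
    if orig_sr == target_sr or len(samples) < 2:
        return list(samples)
    n_out = max(1, round(len(samples) * target_sr / orig_sr))
    out = []
    for i in range(n_out):
        pos = i * orig_sr / target_sr        # fractional index into the input
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)   # clamp at the last sample
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def prepare_audio(samples, orig_sr, sample_rate):
    """Hypothetical loader step: the caller chooses the target rate,
    so no global SAMPLE_RATE constant is needed."""
    return resample_linear(samples, orig_sr, sample_rate)
```

With the rate as a parameter, each embedding model can request the rate it expects, and the module-level constant becomes dead code that can be deleted.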



- def load_audio_task(fname, dtype="float32"):
+ def load_audio_task(fname, sample_rate, dtype="float32"):
+ # print("LOAD AUDIO TASK")
Owner

This line can be removed too.

Contributor Author

Yes, I forgot to remove the debugging print.

ckpt_dir=None,
model_name="vggish",
model_name="vggish",
submodel_name="630k-audioset", # only for CLAP
Owner

This might introduce too many optional arguments for the instantiation of FrechetAudioDistance. While it is clear with comments right now, it might get bloated if we add more models that need different arguments in the future.

Moving forward, we probably need to introduce config files for different embedding models.
Let's leave it as is for now, and add a note here for future code refactoring.

Contributor Author

I guess once we also add OpenL3 we will have to do some refactoring to make FrechetAudioDistance easy to understand.

Owner

Yes. We might need to define some data classes for the different models. One example is the per-model config classes in Hugging Face's transformers; we could keep ours as JSON files.
If you have better ideas feel free to suggest / contribute too! :)
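One way the data-class idea from this thread could look, as a rough sketch only. Every class, field, and function name below is hypothetical and not part of the library; the `"clap"` key and the 48000 sample rate in the usage note are assumptions made for illustration. Only `"vggish"` and the `"630k-audioset"` default come from the diff above.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class EmbeddingModelConfig:
    """Hypothetical base config shared by all embedding models."""
    model_name: str
    sample_rate: int
    ckpt_dir: Optional[str] = None


@dataclass
class ClapConfig(EmbeddingModelConfig):
    """Hypothetical CLAP config: only CLAP needs a submodel name."""
    submodel_name: str = "630k-audioset"


# Maps the model_name stored in the JSON file to its config class.
CONFIG_CLASSES = {"vggish": EmbeddingModelConfig, "clap": ClapConfig}


def config_from_json(text):
    """Build the matching config object from a JSON string."""
    data = json.loads(text)
    cls = CONFIG_CLASSES[data["model_name"]]
    return cls(**data)  # unknown keys raise TypeError
```

For example, `config_from_json('{"model_name": "clap", "sample_rate": 48000}')` would return a `ClapConfig` with the default submodel, while passing `submodel_name` in a VGGish config would fail with a `TypeError`. Model-specific arguments then live in the config rather than in the `FrechetAudioDistance` constructor, and `asdict` can write a config back out to JSON.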

@@ -0,0 +1,627 @@
{
Owner

This is super awesome! Thank you so much!

I am wondering if this can be added as unit tests because in that case we can add them to our CI as well, and the checks can be run during PRs. What do you think? (we don't necessarily need to address this in this PR though, this notebook is comprehensive enough for now)

I tried this myself when I added PANN, but PANN takes too long to download and stalls the CI process. VGGish is fast enough, so I kept it in the unit tests. I am not sure whether the CLAP model takes long to download.

Contributor Author

Glad you find it useful. In some cases I find notebooks a much more immediate way to understand the code.

I don't have a strong opinion about adding the code as unit tests. It would be useful, but it might have the same issue with CLAP, since each checkpoint is about 2GB.

I think the notebook can work both as a test and as a quick, easy way for users to copy/paste the section of code they want to use.

If we add a section to the README pointing to the notebook, anyone can immediately test the code and use what they need.

Owner

Yup, for now running the notebook and adding instructions to the README should suffice. I will add them to the README.

In the meantime, let me also think about whether there are better ways to integrate checks in CI. This is more of a maintenance concern, though: having CI would be beneficial as a fast quality and correctness check.
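One possible way to reconcile CI speed with large checkpoint downloads, sketched with the standard `unittest` module: keep fast checks always on and make heavy ones opt-in. The environment variable `RUN_SLOW_MODEL_TESTS`, the class name, and the placeholder test bodies are all assumptions for illustration; they do not describe the project's actual test suite.

```python
import os
import unittest

# Hypothetical opt-in flag: heavy checkpoints are only exercised when it is set.
RUN_SLOW = os.environ.get("RUN_SLOW_MODEL_TESTS") == "1"


class EmbeddingModelChecks(unittest.TestCase):
    def test_vggish(self):
        # Fast enough to run on every PR; placeholder for the real check.
        self.assertTrue(True)

    @unittest.skipUnless(RUN_SLOW, "large checkpoint; set RUN_SLOW_MODEL_TESTS=1")
    def test_clap(self):
        # Placeholder for a check that would download a ~2GB CLAP checkpoint.
        self.assertTrue(True)


def run_suite():
    """Run the checks programmatically and return the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(EmbeddingModelChecks)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Under this scheme the PR workflow would leave the flag unset, so only the fast check runs, while a scheduled or manually triggered job could set it to cover the large models.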

@gudgud96
Owner

@mcomunita sorry for the late reply; I was on vacation last week.

Thank you for your contribution, CLAP comes in super handy. Comments are above: overall, just some minor refactoring is needed, plus a few points where I would love your feedback. Once they are addressed, let's get this merged!

Thanks again!

@mcomunita
Contributor Author

Do you want me to make the corrections, or are you going to take care of them?

@gudgud96
Owner

@mcomunita I can do it later, no worries. Will merge PR once done!

@gudgud96 gudgud96 merged commit 3b34195 into gudgud96:main Oct 17, 2023
1 check passed
@mcomunita mcomunita deleted the laion-clap branch September 2, 2024 15:30
2 participants