[chore] bump to 0.3.0
gudgud96 committed Mar 25, 2024
1 parent 9659a00 commit ab3f317
Showing 3 changed files with 51 additions and 10 deletions.
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

Copyright (c) 2023 Hao Hao Tan
Copyright (c) 2024 Hao Hao Tan

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
52 changes: 45 additions & 7 deletions README.md
@@ -1,17 +1,24 @@
## Frechet Audio Distance in PyTorch

A lightweight library of Frechet Audio Distance calculation.
A lightweight library of Frechet Audio Distance (FAD) calculation.

Currently, we support embedding from:
- `VGGish` by [S. Hershey et al.](https://arxiv.org/abs/1812.08466)
- `PANN` by [Kong et al.](https://arxiv.org/abs/1912.10211)
- `CLAP` by [Wu et al.](https://arxiv.org/abs/2211.06687)
Currently, we support:
- FAD score, with embeddings from:
- `VGGish` by [S. Hershey et al.](https://arxiv.org/abs/1812.08466)

- `PANN` by [Kong et al.](https://arxiv.org/abs/1912.10211)
- `CLAP` by [Wu et al.](https://arxiv.org/abs/2211.06687)
- `EnCodec` by [Defossez et al.](https://arxiv.org/pdf/2210.13438.pdf)

- CLAP score, for text and audio matching

### Installation

`pip install frechet_audio_distance`

### Demo
### Example

#### For FAD:

```python
from frechet_audio_distance import FrechetAudioDistance
Expand Down Expand Up @@ -40,12 +47,43 @@ frechet = FrechetAudioDistance(
verbose=False,
enable_fusion=False, # for CLAP only
)
fad_score = frechet.score("/path/to/background/set", "/path/to/eval/set", dtype="float32")
# to use `EnCodec`
frechet = FrechetAudioDistance(
model_name="encodec",
sample_rate=48000,
channels=2,
verbose=False,
)

fad_score = frechet.score(
"/path/to/background/set",
"/path/to/eval/set",
dtype="float32"
)
```

You can also have a look at [this notebook](https://github.com/gudgud96/frechet-audio-distance/blob/main/test/test_all.ipynb) for a better understanding of how each model is used.

#### For CLAP score:

```python
from frechet_audio_distance import CLAPScore

clap = CLAPScore(
submodel_name="630k-audioset",
verbose=True,
enable_fusion=False,
)

clap_score = clap.score(
text_path="./text1/text.csv",
audio_dir="./audio1",
text_column="caption",
)
```

For more info, kindly refer to [this notebook](https://github.com/gudgud96/frechet-audio-distance/blob/main/test/test_clap_score.ipynb).

### Save pre-computed embeddings

When computing the Frechet Audio Distance, you can choose to save the embeddings for future use.
7 changes: 5 additions & 2 deletions pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "frechet_audio_distance"
version = "0.2.0"
version = "0.3.0"
authors = [
{ name="Hao Hao Tan", email="[email protected]" },
]
@@ -32,4 +32,7 @@ dependencies = [
]

[project.urls]
"Homepage" = "https://github.com/gudgud96/frechet-audio-distance"
"Homepage" = "https://github.com/gudgud96/frechet-audio-distance"

[tool.setuptools]
py-modules = []
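
Since this commit bumps the package version from 0.2.0 to 0.3.0 in `pyproject.toml`, a quick way to check which release is installed locally might look like the sketch below. It relies only on the standard-library `importlib.metadata` module; the helper names `installed_version` and `at_least` are hypothetical and not part of `frechet_audio_distance`, and the comparison assumes purely numeric dotted versions such as `0.3.0`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def at_least(found, required):
    """Compare dotted numeric versions (e.g. "0.3.0") component by component.

    Assumption: every component is a plain integer; suffixes such as
    ".post1" or "rc1" are not handled by this sketch.
    """
    def as_tuple(v):
        return tuple(int(part) for part in v.split("."))

    return as_tuple(found) >= as_tuple(required)


# Distribution name and target version are taken from the pyproject.toml diff.
found = installed_version("frechet_audio_distance")
if found is None:
    print("frechet_audio_distance is not installed")
elif at_least(found, "0.3.0"):
    print(f"found {found} (0.3.0 or newer)")
else:
    print(f"found {found} (older than 0.3.0)")
```

Tuples of integers are compared rather than raw strings so that, for example, `0.10.0` is treated as newer than `0.3.0`.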
