Add RASR compatible feature extraction #44

Merged
Changes from 1 commit
30 commits
0f5a148
add rasr compatible feature extraction
kuacakuaca Dec 15, 2023
e8c5dde
remove f_min and f_max from config
kuacakuaca Dec 15, 2023
5fa4bf3
fix
kuacakuaca Jan 3, 2024
33dd047
add preemphasis, use amplitude instead of power spectrum, additive lo…
kuacakuaca Jan 29, 2024
3071b44
small change
kuacakuaca Jan 29, 2024
687854a
make alpha a parameter
kuacakuaca Jan 30, 2024
e7c850d
fix errors
kuacakuaca Feb 1, 2024
482f560
fix window broadcasting
albertz Feb 13, 2024
9e0cde3
test_rasr_compatible
albertz Feb 13, 2024
f45f01e
test_rasr_compatible more
albertz Feb 13, 2024
71f9e6a
test_rasr_compatible_raw_audio_samples (passing)
albertz Feb 13, 2024
259d7f3
test_rasr_compatible_preemphasis (failing)
albertz Feb 13, 2024
f41a60c
fix preemphasize
albertz Feb 13, 2024
acefd99
test_rasr_compatible_window (failing)
albertz Feb 13, 2024
ba47a78
testing custom hanning window implementations
albertz Feb 13, 2024
44aba81
cleanup, fix windowing (WIP)
albertz Feb 13, 2024
d845650
fix last Hanning window
albertz Feb 14, 2024
e2cda8b
fix device
albertz Feb 14, 2024
6925acc
simplify
albertz Feb 14, 2024
848c886
test_rasr_compatible_fft (failing)
albertz Feb 14, 2024
990a977
FFT test more direct (still failing)
albertz Feb 14, 2024
6556b3c
tests deterministic
albertz Feb 14, 2024
79a043e
copy RASR C++ FFT code for testing
albertz Feb 14, 2024
3476557
FFT fixes
albertz Feb 14, 2024
467f828
FFT becomes more exact
albertz Feb 14, 2024
69cd90d
FFT cleanup
albertz Feb 14, 2024
aca1eb3
test_rasr_compatible_amplitude_spectrum (failing)
albertz Feb 14, 2024
3ead58f
add fft scaling, updata test and remove spaces
kuacakuaca Mar 12, 2024
1af10fc
black
kuacakuaca Mar 12, 2024
fff39f0
adjust last window for different sequence lengths
kuacakuaca Mar 18, 2024
12 changes: 8 additions & 4 deletions i6_models/primitives/feature_extraction.py
@@ -128,13 +128,15 @@ class RasrCompatibleLogMelFeatureExtractionV1Config(ModelConfiguration):
         hop_size: window shift in seconds
         min_amp: minimum amplitude for safe log
         num_filters: number of mel windows
+        alpha: preemphasis weight
     """

     sample_rate: int
     win_size: float
     hop_size: float
     min_amp: float
     num_filters: int
+    alpha: float = 1.0

     def __post_init__(self) -> None:
         super().__post_init__()
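A minimal sketch of what the new field means for callers, assuming a plain `dataclass` as a stand-in for `ModelConfiguration`. The class name `ConfigSketch` and all example values are hypothetical and not part of this PR. Because `alpha` defaults to `1.0`, a config that does not mention it keeps the preemphasis weight that was hard-coded before this change.

```python
from dataclasses import dataclass


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for RasrCompatibleLogMelFeatureExtractionV1Config."""

    sample_rate: int
    win_size: float  # window size in seconds
    hop_size: float  # window shift in seconds
    min_amp: float  # minimum amplitude for safe log
    num_filters: int  # number of mel windows
    alpha: float = 1.0  # preemphasis weight; default equals the old hard-coded value


# A config written before this change does not set alpha and keeps 1.0.
old_style = ConfigSketch(
    sample_rate=16000, win_size=0.025, hop_size=0.01, min_amp=1e-10, num_filters=80
)

# A new config may override it; 0.97 is only an illustrative value.
custom = ConfigSketch(
    sample_rate=16000, win_size=0.025, hop_size=0.01, min_amp=1e-10, num_filters=80, alpha=0.97
)
```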
@@ -153,8 +155,10 @@ def __init__(self, cfg: RasrCompatibleLogMelFeatureExtractionV1Config):
         self.hop_length = int(cfg.hop_size * cfg.sample_rate)
         self.min_amp = cfg.min_amp
         self.win_length = int(cfg.win_size * cfg.sample_rate)
-        # smallest power if two which is greater than or equal to win_length
-        self.n_fft = 2 ** math.ceil(math.log2(self.win_length))
+        self.n_fft = 2 ** math.ceil(
+            math.log2(self.win_length)
+        )  # smallest power if two which is greater than or equal to win_length
+        self.alpha = cfg.alpha

         self.register_buffer(
             "mel_basis",
@@ -178,9 +182,9 @@ def forward(self, raw_audio, length) -> Tuple[torch.Tensor, torch.Tensor]:
         :param length in samples: [B]
         :return features as [B,T,F] and length in frames [B]
         """
-        # preemphasis
+        # preemphasize
         preemphasized = raw_audio.clone()
-        preemphasized[..., 1:] -= 1.0 * preemphasized[..., :-1]
+        preemphasized[..., 1:] -= self.alpha * preemphasized[..., :-1]

         # zero pad for the last frame
         padded = torch.cat([preemphasized, torch.zeros(preemphasized.shape[0], (self.hop_length - 1))], dim=1)
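The two steps in this hunk can be sketched in isolation as follows. The function names `preemphasize` and `pad_for_last_frame` are hypothetical helpers, not part of the PR. The use of `new_zeros` is an assumption made here so that the padding inherits dtype and device from the input; the diff itself calls `torch.zeros`. Whether the first sample should be left unchanged to match RASR is not settled by this commit, so the sketch only mirrors the code as shown.

```python
import torch


def preemphasize(raw_audio: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Return y with y[t] = x[t] - alpha * x[t-1] for t >= 1 and y[0] = x[0].

    raw_audio has shape [B, T]; the input tensor is not modified.
    """
    out = raw_audio.clone()
    # The right-hand side is evaluated into a new tensor before the in-place
    # subtraction, so every x[t-1] used here is the original sample.
    out[..., 1:] -= alpha * out[..., :-1]
    return out


def pad_for_last_frame(audio: torch.Tensor, hop_length: int) -> torch.Tensor:
    """Append hop_length - 1 zeros along the time axis of a [B, T] tensor."""
    # Assumption: new_zeros keeps dtype and device consistent with the input.
    zeros = audio.new_zeros(audio.shape[0], hop_length - 1)
    return torch.cat([audio, zeros], dim=1)


x = torch.tensor([[1.0, 2.0, 4.0, 7.0]])
first_difference = preemphasize(x, alpha=1.0)  # alpha = 1.0 gives the first difference
half_weight = preemphasize(x, alpha=0.5)
padded = pad_for_last_frame(first_difference, hop_length=3)
```

With `alpha=1.0` the result is the plain first difference of the signal, which is what the previously hard-coded `1.0` computed.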
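For the `n_fft` line in the `__init__` hunk: a small sketch of the "smallest power of two greater than or equal to `win_length`" computation, assuming a 16 kHz sample rate and a 25 ms window purely for illustration. `fft_size` mirrors the expression from the diff; `fft_size_int` is a hypothetical integer-only alternative that is not part of the PR and is shown only for comparison.

```python
import math


def fft_size(win_length: int) -> int:
    """Smallest power of two >= win_length, using the expression from the diff."""
    return 2 ** math.ceil(math.log2(win_length))


def fft_size_int(win_length: int) -> int:
    """Same value using integer arithmetic only (assumes win_length >= 1)."""
    return 1 << (win_length - 1).bit_length()


sample_rate = 16000  # Hz, illustrative value
win_length = int(0.025 * sample_rate)  # 25 ms window, 400 samples
n_fft = fft_size(win_length)  # 512
```

A window length that is already a power of two is returned unchanged, so a 512-sample window would not be padded further.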