A Kotlin library for audio analysis: FFT, pitch shifting and accurate BPM detection for .wav files.

xsoophx/Kymatik

Kymatik: A Kotlin Library for Audio Analysis 🎵🧪

Named after the fascinating study of wave phenomena, Kymatik offers a suite of tools for analyzing audio data. It includes accurate BPM detection and comprehensive FFT analysis. The FFT components can also be used to analyze your own samples.

Current Features

1. FFT Analysis

Kymatik offers comprehensive Fast Fourier Transform (FFT) methods that break down .wav audio files into their constituent frequencies. The same FFT methods can also be applied to your own samples. The currently supported FFT methods include:

2. Efficient and Precise BPM Detection

See BPM Analyzer 🎛️

3. Stretching / compressing audio files

Kymatik allows you to change the length of audio files, either by extending or shortening them. However, this currently affects the pitch of the audio. Efforts are underway to develop a pitch correction feature that will allow you to change the length of an audio file without affecting its pitch. This feature will be especially useful for syncing beats in remixes or adjusting the tempo of a track.
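To illustrate why plain stretching changes the pitch, here is a minimal sketch of the simplest approach: resampling by linear interpolation. This is not Kymatik's implementation; `stretch` is a hypothetical helper. When the result is played back at the original sample rate, every frequency in the signal is divided by `factor`.

```kotlin
import kotlin.math.floor

// Hypothetical helper, not part of Kymatik: resample by linear interpolation.
// factor > 1 lengthens the signal (slower, lower pitch); factor < 1 shortens it.
fun stretch(samples: DoubleArray, factor: Double): DoubleArray {
    require(factor > 0.0) { "factor must be positive" }
    if (samples.isEmpty()) return samples
    val outLength = maxOf(1, (samples.size * factor).toInt())
    return DoubleArray(outLength) { i ->
        val pos = i / factor                                   // position in the input signal
        val left = floor(pos).toInt().coerceAtMost(samples.size - 1)
        val right = (left + 1).coerceAtMost(samples.size - 1)  // clamp at the last sample
        val frac = pos - left
        samples[left] * (1 - frac) + samples[right] * frac
    }
}
```

For a tempo change, the duration scales inversely with the tempo, so going from 100 BPM to 120 BPM would correspond to `factor = 100.0 / 120.0`.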

Upcoming Features 🚧

I'm (more or less) actively working on expanding Kymatik's capabilities. Some of the features currently in development include:

1. Expanding FFT Analysis:

  • Windowed FFT: Specifying the FFT hop size as a number of samples (currently the hop size can only be given in seconds, e.g. 0.1 s)
    • Current workaround: hopIntervalInSeconds = hopSizeInSamples / sampleRate
  • Zero Padding: Zero-padding the input signal to obtain a finer frequency grid (smaller bin spacing) in the FFT output
  • Bluestein Algorithm
  • Goertzel Algorithm
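The hop size workaround and the idea behind zero padding can be sketched as follows. Both functions are hypothetical helpers for illustration and not part of Kymatik's API. Note that the conversion has to be done in floating point: dividing two `Int` values in Kotlin would truncate the result to 0.

```kotlin
// Hypothetical helpers, not part of Kymatik's API.

// Converts a hop size given in samples to the hop interval in seconds.
fun hopIntervalInSeconds(hopSizeInSamples: Int, sampleRate: Int): Double =
    hopSizeInSamples.toDouble() / sampleRate

// Pads the signal with zeros up to the next power of two.
fun zeroPadToPowerOfTwo(samples: DoubleArray): DoubleArray {
    var size = 1
    while (size < samples.size) size *= 2
    return samples.copyOf(size) // copyOf fills the new tail with 0.0
}
```

With a sample rate of 44100 Hz, a hop of 441 samples corresponds to an interval of 0.01 s, and a window of 1000 samples would be padded to 1024.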

2. Starting Position Detection:

A feature that will allow you to detect the starting position of a beat in a music file. This can be useful for removing silence at the beginning of a track or aligning beats in a remix.

3. Various pitch detection / shifting algorithms:

Including YIN, Harmonic Product Spectrum and more.

4. (optional) Audio visualization:

Visualize the audio data in various ways, including waveform, spectrogram and more.

5. Documentation:

Comprehensive documentation as a guide through the library's features and functionalities.

Please note that these features are in various stages of development and will be rolled out as they reach maturity.

Getting Started 🚀

To get started with Kymatik, please follow these steps:

1. Installation

See https://jitpack.io/ for more information on how to use JitPack. To install Kymatik in your project, add the JitPack repository to the dependencyResolutionManagement block of your settings.gradle and the following dependency to your build.gradle file:

// settings.gradle
dependencyResolutionManagement {
    repositories {
        mavenCentral()
        maven { url "https://jitpack.io" }
    }
}

// build.gradle
dependencies {
    implementation("cc.suffro:kymatik:0.1.0-beta")
}

2. Usage

Kymatik uses Koin as its dependency injection framework. If you use Kymatik without Koin, you have to create every instance that a constructor requires yourself. If you use it with Koin, the configuration from Kymatik's modules is used. The following examples show how to use Kymatik with and without Koin.

Kymatik without Koin Dependency Injection

If you prefer not to inject dependencies through Koin yourself, you can also use Kymatik without it. In that case, call KoinManager.INSTANCE once to initialize Kymatik's Koin dependency injection framework:

class Main {

    init {
        KoinManager.INSTANCE
    }
}

Kymatik with Koin Dependency Injection

Important: Make sure to initialize Koin before using it. This can be done by instantiating BpmAnalyzer or by inheriting from it. Alternatively, you can call KoinManager.INSTANCE to initialize Koin.

All the examples in the "Usage of Kymatik" section can be used with Koin as well. Here's how to use Kymatik with Koin:

1. Implement KoinComponent and call KoinManager.INSTANCE to initialize Koin:

class Main : KoinComponent {

    init {
        KoinManager.INSTANCE
    }
}

Alternatively, you can create a class that inherits from BpmAnalyzer, or create an instance of it. BpmAnalyzer then takes care of starting Koin in its init function and of stopping it when close() is called:

class Main : BpmAnalyzer() {
    // Your code here
}

2. In your class, you can now use the Koin dependency injection to get your desired services.

Important: Make sure to stop Koin after you're done analyzing your audio files. This can be achieved by calling close() on the BpmAnalyzer instance or by using the use function (see the examples below). Alternatively, you can call KoinManager.INSTANCE.close() to stop Koin.

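class Main : BpmAnalyzer() {

    // Reads the .wav file at the given path and returns its analysis result.
    fun analyzeFile(path: String): TrackInfo {
        val fileReader: FileReader<Wav> by inject()
        val wav = fileReader.read(path)

        // use { } calls close() afterwards, which also stops Koin
        return use { analyze(wav) }
    }
}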

or

class Main : KoinComponent {

    init {
        KoinManager.INSTANCE
    }

    // Reads the .wav file at the given path and returns its analysis result.
    fun analyzeFile(path: String): TrackInfo {
        val bpmAnalyzer: BpmAnalyzer by inject()
        val fileReader: FileReader<Wav> by inject()
        val wav = fileReader.read(path)

        // use { } calls close() afterwards, which also stops Koin
        return bpmAnalyzer.use { it.analyze(wav) }
    }
}

Usage of Kymatik

Reading a .wav file and analyzing its BPM:

val wav = WAVReader.read("path/to/your/wav/file.wav")
val result = BpmAnalyzer().analyze(wav)

Reading a .wav file and calculating its FFT:

fun calculateFFT() {
    val wav = WAVReader.read("path/to/your/wav/file.wav")
    val params = WindowProcessingParams(
        start = 0.0,
        end = 10.0,
        interval = 0.01,
        channel = 0,
        numSamples = FftSampleSize.DEFAULT
    )
    val fftResult = FFTProcessor.processWav(wav, params, WindowFunctionType.HAMMING.function)
}

Adjusting the tempo of a .wav file:

You can either use the new samples obtained by the changeTo functions:

// current BPM not determined yet
fun adjustTempo() {
    val wav = WAVReader.read("path/to/your/wav/file.wav")

    // analyzer can get injected, if class is KoinComponent
    val injectedAnalyzer by inject<Analyzer<Wav, TrackInfo>>()
    val samplesOne = SpeedAdjuster(injectedAnalyzer).changeTo(wav, 120.0)

    // otherwise, you can create an instance of the analyzer
    val analyzer = CombFilterAnalyzer(CombFilterOperationsImpl())
    val samplesTwo = SpeedAdjuster(analyzer).changeTo(wav, 120.0)
}

Or save the new wav file directly:

// current BPM not determined yet
fun stretchAndSaveWav() {
    val wav = WAVReader.read("path/to/your/wav/file.wav")
    val targetPath = Path.of("output/path/for/first.wav")

    // analyzer can get injected, if class is KoinComponent
    val injectedAnalyzer by inject<Analyzer<Wav, TrackInfo>>()
    val samplesOne = SpeedAdjuster(injectedAnalyzer).changeWavTo(wav, 120.0, targetPath)

    // otherwise, you can create an instance of the analyzer
    val analyzer = CombFilterAnalyzer(CombFilterOperationsImpl())
    val samplesTwo = SpeedAdjuster(analyzer).changeWavTo(wav, 120.0, 130.0, targetPath)
}

Calculating the FFT of your custom samples:

fun calculateFftOfCustom() {
    val samples = (0 until 1024).map { i -> i.toDouble() }
    val fftResult = FFTProcessor.process(samples, 44100)
}

This method yields a sequence of frequency domain windows, each containing the FFT result of the respective time window.
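To give an idea of what the FFT result of a single time window represents, here is a minimal sketch using a naive O(N²) DFT. It is for illustration only and is not Kymatik's implementation; `dftMagnitudes` and `binFrequency` are hypothetical names. Each output bin k of a window with n samples corresponds to the frequency k * sampleRate / n.

```kotlin
import kotlin.math.PI
import kotlin.math.cos
import kotlin.math.hypot
import kotlin.math.sin

// Naive O(N^2) DFT magnitudes for bins 0..N/2 (illustration only, not Kymatik code).
fun dftMagnitudes(samples: DoubleArray): DoubleArray {
    val n = samples.size
    return DoubleArray(n / 2 + 1) { k ->
        var re = 0.0
        var im = 0.0
        for (t in 0 until n) {
            val angle = 2 * PI * k * t / n
            re += samples[t] * cos(angle)
            im -= samples[t] * sin(angle)
        }
        hypot(re, im)
    }
}

// Frequency in Hz of bin k for a window of n samples.
fun binFrequency(k: Int, n: Int, sampleRate: Int): Double =
    k.toDouble() * sampleRate / n
```

For a sine wave that completes exactly four cycles within a 64-sample window, the largest magnitude appears in bin 4; at an assumed sample rate of 6400 Hz that bin corresponds to 400 Hz.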

Using your custom window function and a non-default FFT method:

fun calculateWithCustomFunction() {
    val samples = (0 until 1024).map { i -> i.toDouble() }
    val customFunction: WindowFunction = { sample, length -> sample.toDouble() / length }

    val fftResult = FFTProcessor.process(
        inputSamples = samples,
        samplingRate = 44100,
        method = Method.R2C_DFT,
        windowFunction = customFunction
    )
}

Contributing

Contributions are warmly welcomed, whether it's feature development, bug fixes or documentation improvements. Please refer to the contributing guidelines for more information.

Feedback and Support

For feature requests, bug reports or general feedback, please open an issue on this GitHub repository.

License

Kymatik is licensed under the MIT License, allowing for widespread use and adaptation. For full license details, please see the LICENSE file.

BPM Analyzer 🎛️

This project was originally designed to analyze the tempo (BPM) of .wav music files, utilizing various digital signal processing techniques. The primary method used for tempo detection in this repository is implemented in the CombFilterAnalyzer, which determines the BPM with high precision.

The analyzer uses an algorithm that was proposed on this page; see CombFilterAnalyzer.

Step 1: Filter bank

The algorithm begins by splitting the audio signal into distinct frequency bands, isolating different instrumental ranges. This step is important because it reduces tempo detection errors caused by overlapping beats from different instruments. The signal is transformed with the Fast Fourier Transform (FFT) and the resulting spectrum is segmented into predefined frequency ranges, starting at 0-200 Hz and going up to 3200 Hz, so that each band captures a different aspect of the music. This ensures a comprehensive analysis across the spectrum.
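The segmentation of the spectrum can be sketched as a mapping from bands to FFT bin indices. The exact band edges below are an assumption (the text only states that the bands span 0-200 Hz up to 3200 Hz), and `bandBinRanges` is a hypothetical helper, not Kymatik's implementation.

```kotlin
import kotlin.math.ceil

// Assumed band edges in Hz; only the first band (0-200 Hz) and the 3200 Hz limit are given in the text.
val bandEdgesHz = listOf(0.0, 200.0, 400.0, 800.0, 1600.0, 3200.0)

// Returns, for each band, the range of FFT bin indices (0..n/2) that belong to it.
// The last band is open-ended and runs up to the Nyquist bin.
fun bandBinRanges(n: Int, sampleRate: Int): List<IntRange> {
    val nyquistBin = n / 2
    // First bin whose frequency (k * sampleRate / n) is at or above the given edge.
    fun firstBinAtOrAbove(freqHz: Double): Int =
        ceil(freqHz * n / sampleRate).toInt().coerceAtMost(nyquistBin + 1)
    return bandEdgesHz.indices.map { b ->
        val first = firstBinAtOrAbove(bandEdgesHz[b])
        val endExclusive =
            if (b == bandEdgesHz.lastIndex) nyquistBin + 1
            else firstBinAtOrAbove(bandEdgesHz[b + 1])
        first until endExclusive
    }
}
```

Each band signal would then be obtained by keeping only the spectrum bins in its range and transforming back to the time domain.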

Step 2: Smoothing

Each frequency band undergoes full-wave rectification followed by a convolution with an optional window function, which smooths the signal and accentuates its amplitude envelope. This smoothing yields a cleaner representation of the rhythmic pulse.
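A minimal time-domain sketch of this step is shown below. The half-Hann window is an assumed choice of window function, and both functions are hypothetical helpers rather than Kymatik's implementation, which may perform the convolution differently (for example in the frequency domain).

```kotlin
import kotlin.math.PI
import kotlin.math.abs
import kotlin.math.cos

// Right half of a Hann window: starts at 1.0 and decays towards 0.0 (assumed window choice).
fun halfHann(length: Int): DoubleArray =
    DoubleArray(length) { i -> 0.5 * (1 + cos(PI * i / length)) }

// Full-wave rectifies the band signal, then smooths it by direct causal convolution with the window.
fun smooth(band: DoubleArray, window: DoubleArray): DoubleArray {
    val rectified = DoubleArray(band.size) { abs(band[it]) }
    return DoubleArray(band.size) { n ->
        var sum = 0.0
        for (k in window.indices) {
            if (n - k < 0) break // no samples before the start of the signal
            sum += rectified[n - k] * window[k]
        }
        sum
    }
}
```

Because of the rectification, the output is never negative, so it can be read as an amplitude envelope of the band.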

Step 3: Differential Rectification

The algorithm now differentiates the signals to highlight sudden changes in amplitude. By differentiating and then half-wave rectifying, the algorithm identifies significant increases in sound intensity, which typically align with the beats in music. This step transforms the smoothed envelopes into a form optimized for the final tempo analysis.
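As a sketch, assuming a simple first-order difference as the differentiator, this step could look as follows. `differentiateAndRectify` is a hypothetical name and not taken from Kymatik.

```kotlin
// First-order difference followed by half-wave rectification:
// only increases in the envelope are kept, decreases become 0.0.
fun differentiateAndRectify(envelope: DoubleArray): DoubleArray =
    DoubleArray(envelope.size) { n ->
        if (n == 0) 0.0 else maxOf(0.0, envelope[n] - envelope[n - 1])
    }
```

For the envelope 0, 1, 3, 2, 2, 5 this yields 0, 1, 2, 0, 0, 3: the drop from 3 to 2 is discarded, while each rise is preserved.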

Step 4: Comb Filter

Finally, the algorithm uses a comb filter to scan through the differentiated signals. This comb filter is convolved with the signal to determine the alignment between the signal's rhythmic pattern and the filter's tempo. When the tempo of the comb filter resonates with the tempo of the music, the convolution results in a signal with pronounced peaks, indicating a strong correlation. By examining the energy output of these convolutions across a spectrum of tempos, the algorithm can accurately determine the music's tempo.
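The tempo scan can be sketched in the time domain as shown below. This is a simplified illustration under stated assumptions (a comb of three unit impulses, a single band, direct convolution) and not Kymatik's CombFilterAnalyzer; `combEnergy` and `estimateBpm` are hypothetical names.

```kotlin
import kotlin.math.roundToInt

// Energy of the signal after convolution with a comb of `pulses` unit impulses spaced one beat apart.
fun combEnergy(signal: DoubleArray, sampleRate: Int, bpm: Double, pulses: Int = 3): Double {
    val period = (60.0 * sampleRate / bpm).roundToInt() // samples per beat
    require(period > 0) { "bpm too high for this sample rate" }
    var energy = 0.0
    for (n in signal.indices) {
        var y = 0.0
        for (p in 0 until pulses) {
            val idx = n - p * period
            if (idx >= 0) y += signal[idx]
        }
        energy += y * y
    }
    return energy
}

// Scans the candidate tempos and returns the one with the highest comb filter energy.
fun estimateBpm(signal: DoubleArray, sampleRate: Int, candidates: List<Double>): Double =
    candidates.maxByOrNull { combEnergy(signal, sampleRate, it) }
        ?: throw IllegalArgumentException("no candidate tempos given")
```

When the comb spacing matches the beat period, consecutive beats add up before squaring, so the energy is clearly higher than for mismatched tempos. Integer multiples and fractions of the true tempo also align partially, which is why the candidate range is usually restricted.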
