Welcome to the Musical Instrument Sound Classifier repository!
This project utilizes machine learning to classify musical instrument sounds using Mel Spectrogram features extracted from audio files.
The repository is structured to facilitate easy exploration, experimentation, and deployment of the classifier.
This project aims to classify sounds of musical instruments such as guitar, piano, drums, and violin. Key highlights include:
- Mel Spectrogram feature extraction for audio preprocessing (a short extraction sketch follows below).
- Pre-trained and custom-trained models for experimentation.
- Progressive improvements documented through various model iterations.
- Deployment-ready server for real-time classification.
- Multiple Models: Six different models, each with unique approaches, are trained and evaluated.
- Visualization: Confusion matrices and model performance metrics are documented.
- Deployment: Dockerized server and web interface for easy deployment.
- Research-Driven Development: Insights and research guiding the development process are documented in docs/research.md and docs/ideas.md.
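As a rough illustration of the preprocessing step, the snippet below computes a log-scaled Mel Spectrogram from a WAV file with torchaudio. The file name, FFT size, hop length, and number of mel bins are placeholder assumptions, not the repository's actual settings:

```python
import torchaudio
import torchaudio.transforms as T

# Load an audio file (the path is an illustrative placeholder).
waveform, sample_rate = torchaudio.load("guitar_sample.wav")

# Convert the waveform to a Mel Spectrogram and then to a dB scale,
# the typical input representation for audio classifiers.
mel_transform = T.MelSpectrogram(sample_rate=sample_rate, n_fft=1024, hop_length=512, n_mels=128)
to_db = T.AmplitudeToDB()

mel_spectrogram = to_db(mel_transform(waveform))
print(mel_spectrogram.shape)  # (channels, n_mels, time_frames)
```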
- Python Environment: Install Python 3.11+.
- PyTorch: Install PyTorch.
- Dependencies: Install the project dependencies with:
pip install -r requirements.txt
- Docker (optional, but recommended): Ensure Docker is installed for deployment.
- Clone this repository:
git clone https://github.com/LMicol/instrument-classifier
- Navigate to the project directory:
cd instrument-classifier/
- Set up the environment and install the dependencies, as sketched below.
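The commands below consolidate these steps, assuming a standard venv workflow on Linux/macOS. The exact PyTorch install command depends on your platform and CUDA version, so check pytorch.org for the right one; the CPU-only install shown here is just an assumption:

```bash
git clone https://github.com/LMicol/instrument-classifier
cd instrument-classifier/

# Create and activate an isolated environment (optional, but keeps dependencies clean).
python3.11 -m venv .venv
source .venv/bin/activate

# Install PyTorch (CPU-only shown as an assumption) and the project requirements.
pip install torch torchaudio
pip install -r requirements.txt
```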
The dataset used in this project can be found at Micol/musical-instruments-sound-dataset on Kaggle. I used this dataset as a base and made some changes using the scripts in src/helpers.
Explore the src/models directory for the Jupyter Notebooks used to train and evaluate the models. Each model has its corresponding training script and saved weights.
To train a model on your machine or experiment with the notebooks, you will need to set up the whole environment. If you just want to test the final model, I recommend using Docker and the web interface.
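For reference, running a trained PyTorch classifier for inference typically looks like the sketch below. The InstrumentCNN class, the commented-out checkpoint name, and the input shape are hypothetical placeholders, not the repository's actual code:

```python
import torch
import torch.nn as nn

# Minimal placeholder architecture; the real models live in the notebooks under src/models.
class InstrumentCNN(nn.Module):
    def __init__(self, n_classes: int = 4):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(16, n_classes)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))
        return self.fc(x.flatten(1))

model = InstrumentCNN()
# model.load_state_dict(torch.load("model_weights.pth", map_location="cpu"))  # hypothetical checkpoint name
model.eval()

# Dummy batch shaped like a log-mel spectrogram: (batch, channels, n_mels, time_frames).
dummy_input = torch.randn(1, 1, 128, 256)
with torch.no_grad():
    logits = model(dummy_input)
    print(logits.argmax(dim=1))  # predicted class index
```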
In the src/web folder you'll find a simple web interface for testing the model API with your microphone; I recommend using Firefox for this. If you allow microphone access, the model's response will be highlighted with a red box. If you upload a file instead, the highlight will be a blue box and won't change.
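You can also call the audio server directly, without the web interface. The snippet below is only a sketch: the /predict endpoint path, the field name, and the file name are assumptions, so check the server code in the repository for the actual route:

```python
import requests

# Hypothetical endpoint and field name; verify against the server implementation.
with open("guitar_sample.wav", "rb") as f:
    response = requests.post("http://localhost:8000/predict", files={"file": f})

print(response.status_code, response.json())
```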
For deployment of both the server and the web interface, use the docker-compose.yml file provided in the repository. This will set up two services:
- Web Interface: Runs on port 5000.
- Audio Server: Runs on port 8000.
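The actual docker-compose.yml ships with the repository; the sketch below only illustrates the two-service layout and port mapping described above, and the service names and build paths are assumptions:

```yaml
# Illustrative sketch only; service names and build contexts are assumptions.
services:
  web:
    build: ./src/web
    ports:
      - "5000:5000"
  server:
    build: ./src/server
    ports:
      - "8000:8000"
```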
To deploy, run the following command in the project root directory:
docker-compose up --build
Access the services:
- Web Interface: http://localhost:5000
- API Server: http://localhost:8000
The docs/research.md file contains detailed information about the research conducted to guide model development.
The file is structured in three categories:
- Personal Thoughts: Personal monologues and internal discussions I've had.
- Actions: Things I've done.
- Research Observations: Comments about code, model behavior, and the research overall.
The docs/ideas.md file includes:
- How each model was developed and the idea behind its implementation.
- Results for each model iteration.
- Confusion matrices and performance insights (a sketch of how such a matrix is produced follows this list).
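This is not the repository's evaluation script, just a generic sketch of how a confusion matrix is typically computed and plotted with scikit-learn and matplotlib, assuming you already have true and predicted labels:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Example labels; in practice these come from evaluating a trained model on the test set.
classes = ["guitar", "piano", "drums", "violin"]
y_true = ["guitar", "piano", "drums", "violin", "guitar", "drums"]
y_pred = ["guitar", "piano", "drums", "piano", "guitar", "drums"]

cm = confusion_matrix(y_true, y_pred, labels=classes)
ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=classes).plot()
plt.show()
```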
The images directory contains performance metrics for each model.
This project is licensed under the MIT License. See the LICENSE file for details.