
An end-to-end machine learning project for detecting audio fingerprints. It includes data acquisition, data augmentation, feature extraction, dimensionality reduction, model building, and deployment.

mahmoud1yaser/AudioPrint

A voice and speaker recognition system using machine learning

Data

  • Every team member recorded the same number of voice recordings, with an equal number in each category.
  • Hierarchy:
    • DSP_Data_New
      • blank (blank audios)
      • close the door
      • open the door
      • unlock the door
    • DSP_Data_Verification
      • open the door (password)
      • others
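With a folder-per-class layout like the one above, labels can be read straight from the directory names. The following is a minimal sketch, not the project's actual loader: the helper name `list_labeled_files` and the `.wav` extension are assumptions made for illustration.

```python
from pathlib import Path


def list_labeled_files(root, extensions=(".wav",)):
    """Return sorted (path, label) pairs, using each sub-folder name as the label.

    `extensions` is an assumption; the README does not state the audio format.
    """
    root = Path(root)
    pairs = []
    # Each immediate sub-folder (e.g. "open the door") is treated as one class.
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for audio_path in sorted(class_dir.iterdir()):
            if audio_path.is_file() and audio_path.suffix.lower() in extensions:
                pairs.append((audio_path, class_dir.name))
    return pairs
```

Calling `list_labeled_files("DSP_Data_New")` would then yield one pair per recording, with labels such as `blank` or `close the door`. The same helper applies to `DSP_Data_Verification`.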

Description

  • This is a web app that recognizes speech and verifies speakers, in the form of a voice-command door lock that opens only when an owner says the correct password, "open the door".
  • The web page is an e-poster that contains information about the data features used, the ML pipeline, and the decision tree of the model, plus a section for testing the app and a pie chart showing the confidence score of the result.
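The unlock rule described above combines two checks: the speaker must be a verified owner, and the spoken phrase must be the password. A minimal sketch of that rule follows. The function names, and the assumption that the two models hand back a transcript string and a boolean, are hypothetical and not taken from the repository.

```python
PASSWORD = "open the door"


def normalize(text):
    """Lower-case and collapse whitespace so transcripts compare reliably."""
    return " ".join(text.lower().split())


def should_unlock(transcript, is_owner):
    """Unlock only when the speaker is a verified owner AND the phrase matches.

    `transcript` stands in for the speech recognition output and `is_owner`
    for the voice verification result (both hypothetical inputs).
    """
    return bool(is_owner) and normalize(transcript) == PASSWORD
```

For example, `should_unlock("Open the door", True)` returns `True`, while the correct phrase from a non-owner, or a different phrase from an owner, keeps the door locked.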

Machine Learning Process

  • We followed the full machine learning pipeline used in industry, from data acquisition to model deployment. The steps are:
    • 1- Data Acquisition

    • 2- Data Augmentation: generating new data points from existing recordings to enlarge the training set and improve the model's performance.

    • 3- Exploratory Data Analysis (EDA): analyzing the data with visual techniques, which led us to use all of the frequency-domain features and only two time-domain features (AE, RMSE).

      Note: we have two models, one for speech recognition and one for voice verification. The following steps apply to both.

    • 4- Feature Extraction and Dimensionality Reduction: reducing the number of features and keeping only the most informative ones.

    • 5- Building the Model: based on accuracy, we chose a random forest with its best hyper-parameters. The voice verification model reached 84% accuracy and the speech recognition model 60%. Because the latter is relatively low, we imported an external speech-to-text model to detect the password.

    • 6- Model Deployment
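The two time-domain features named in the EDA step can be computed frame by frame. The sketch below assumes that AE stands for amplitude envelope (the maximum absolute sample per frame) and RMSE for root-mean-square energy. The frame and hop sizes are illustrative defaults, not values from the project, and the code uses plain Python for clarity; a real pipeline would more likely rely on NumPy or an audio library.

```python
import math


def frame_signal(samples, frame_size, hop_size):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [
        samples[start:start + frame_size]
        for start in range(0, len(samples) - frame_size + 1, hop_size)
    ]


def amplitude_envelope(samples, frame_size=1024, hop_size=512):
    """AE (assumed meaning): maximum absolute amplitude within each frame."""
    return [
        max(abs(x) for x in frame)
        for frame in frame_signal(samples, frame_size, hop_size)
    ]


def rms_energy(samples, frame_size=1024, hop_size=512):
    """RMSE: square root of the mean squared amplitude within each frame."""
    return [
        math.sqrt(sum(x * x for x in frame) / len(frame))
        for frame in frame_signal(samples, frame_size, hop_size)
    ]
```

Both functions return one value per frame, so a recording can be summarized (for instance by the mean and standard deviation of each list) before being passed to a classifier such as the random forest mentioned in step 5.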

Task-Info

  • Digital Signal Processing (SBE3110) class task 4 created by Team 9:

    Name             Section   Bench Number
    Mahmoud Yaser    2         30
    Ahmed El Sarta   1         8
    Adham Mohamed    1         9
    Maha Medhat      2         38
  • Languages & Frameworks

    • Python (Machine Learning)
    • HTML, CSS, JavaScript (Frontend)
    • Flask (Backend)
  • Submitted to: Dr. Tamer Basha & Eng. Abdallah

Preview

[Preview image]

All rights reserved © 2022 to our Team - Systems & Biomedical Engineering, Cairo University (Class 2024)
