Error when using high rates #7

Open · SuperKogito opened this issue May 3, 2020 · 7 comments

@SuperKogito (Owner)

error
Hi, I am currently working on the second phase of my experiment using your source code, thank you :) I just have one doubt about the audio and rate arguments in the features vector: I am having trouble with sampling_rate when I try to substitute 44100 for it. Can you please tell me where I am going wrong?

Originally posted by @thxrgxxs in #4 (comment)

@SuperKogito (Owner, Author)

Please provide your code too, not just the error logs.
From your code, this line does not seem correct:

features_vector = extract_features(audio=".wav audio", rate=44100)

The audio variable is supposed to be the audio signal from which to compute the features.
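
As a minimal sketch (not part of this repository, the path is a placeholder), and assuming your extract_features forwards audio straight to python_speech_features.mfcc, you would read the wav file first and pass the resulting samples array:

from scipy.io.wavfile import read

# read() returns the file's sampling rate and the raw samples as a numpy array
rate, signal = read("path/to/your/file.wav")
features_vector = extract_features(audio=signal, rate=rate)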

@thxrgxxs commented May 4, 2020

import numpy as np
from sklearn import preprocessing
from scipy.io.wavfile import read
from python_speech_features import mfcc
from python_speech_features import delta
from GenderIdentifier import GenderIdentifier


def extract_features(audio, rate):
        mfcc_feature = mfcc(# The audio signal from which to compute features.
                            audio,
                            # The samplerate of the signal we are working with.
                            rate,
                            # The length of the analysis window in seconds. 
                            # Default is 0.025s (25 milliseconds)
                            winlen       = 0.05,
                            # The step between successive windows in seconds. 
                            # Default is 0.01s (10 milliseconds)
                            winstep      = 0.01,
                            # The number of cepstrum to return. 
                            # Default 13.
                            numcep       = 13,
                            # The number of filters in the filterbank.
                            # Default is 26.
                            nfilt        = 30,
                            # The FFT size. Default is 512.
                            nfft         = 1024,
                            # If true, the zeroth cepstral coefficient is replaced 
                            # with the log of the total frame energy.
                            appendEnergy = True)

        # scale the MFCCs to zero mean and unit variance, then compute
        # delta and delta-delta coefficients
        mfcc_feature  = preprocessing.scale(mfcc_feature)
        deltas        = delta(mfcc_feature, 2)
        double_deltas = delta(deltas, 2)
        # stack the static, delta, and delta-delta features column-wise
        combined      = np.hstack((mfcc_feature, deltas, double_deltas))
        return combined

# init gender identifier
gender_identifier = GenderIdentifier("TestingData/females", 
                                     "TestingData/males", 
                                     "females.gmm", "males.gmm")
# get audio features vector
features_vector = extract_features(audio=".wav audio", rate = 44100)

# predict/identify speaker's gender
predicted_gender = gender_identifier.identify_gender(features_vector)

The recorded audio files are stored in ".wav audio" in .wav format. So where do I insert the file path? Thank you for your guidance, btw :)

@SuperKogito (Owner, Author)

I don't think this will work. You are training your identifier on data with a 16000 Hz sampling rate and using it to test/recognize gender from a file with a 44100 Hz sampling rate. Please take a look at the graph in the README and this article that I wrote, to get a better grasp of the theory. To have correct results you need to use training and testing data with the same sampling rate.
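
If your recordings are at 44100 Hz, one possible way to bring them down to 16000 Hz is to resample them before feature extraction. A rough sketch with scipy, assuming mono 16-bit wav files and placeholder file names:

import numpy as np
from scipy.io.wavfile import read, write
from scipy.signal import resample_poly

rate, signal = read("recording_44100.wav")      # original file, e.g. 44100 Hz
# 16000 / 44100 reduces to 160 / 441, so resample by that rational factor
resampled = resample_poly(signal, up=160, down=441)
write("recording_16000.wav", 16000, resampled.astype(np.int16))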

The docs are very clear about this:

audio_path (str) : path to wave file without silent moments.

So:
features_vector = extract_features(audio="INSERT-AUDIO-FILE-PATH-HERE", rate = 44100)
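
To be sure that the rate argument matches the file you pass in, you can also check the wav header first; a small sketch with the same placeholder path:

from scipy.io.wavfile import read

audio_path = "INSERT-AUDIO-FILE-PATH-HERE"   # placeholder
file_rate, signal = read(audio_path)
print(file_rate)   # should print 16000 if the file matches the training data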

@thxrgxxs commented May 5, 2020

Alright, I'll work on this and let you know how it turns out.

@thxrgxxs commented May 8, 2020

I converted all my audio files to 16000 Hz and also inserted the file path, but it keeps showing the same error for the "rate" variable.

@SuperKogito (Owner, Author)

If you are using the correct rates and have placed the files in the right folders, then running the feature extraction, training, and testing code sections should be straightforward. Without seeing the code or the error logs, I can only guess what's wrong, so I am afraid I cannot provide any reliable input.

@thxrgxxs commented May 28, 2020

error code.zip

These are the wav files I recorded and tried to run with the FeatureExtractor1.py (real-time gender classification) code. All the audio files are at 16000 Hz. Can you please help me out? Thanks.
