
How to use this repo for just testing? #67

Open
sandeshnaroju opened this issue Nov 28, 2020 · 11 comments

Comments

@sandeshnaroju

I just want to play with this repo; I don't want to train or build anything, just use it a few times. Are there any instructions on how to do that?

@ruclion

ruclion commented Dec 23, 2020

Just follow the steps in the README~

0. Convert Mel-Spectrograms
Download the pre-trained AUTOVC model and run conversion.ipynb in the same directory.

1. Mel-Spectrograms to waveform
Download the pre-trained WaveNet vocoder model and run vocoder.ipynb in the same directory.

Please note the training metadata and testing metadata have different formats.
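For step 1, the spectrogram-to-waveform pass in vocoder.ipynb is roughly the following. This is a minimal sketch, assuming this repo's synthesis.py (build_model / wavegen), the WaveNet checkpoint name from the README, and the results.pkl that conversion.ipynb saves; it writes audio with soundfile instead of the deprecated librosa.output.write_wav:

# Sketch of vocoder.ipynb: converted mel-spectrograms -> waveforms.
import pickle
import torch
import soundfile as sf
from synthesis import build_model, wavegen  # from this repo

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = build_model().to(device)
ckpt = torch.load('checkpoint_step001000000_ema.pth', map_location=device)
model.load_state_dict(ckpt['state_dict'])

# results.pkl is the list of (name, converted mel) pairs saved by conversion.ipynb
spect_vc = pickle.load(open('results.pkl', 'rb'))
for name, mel in spect_vc:
    waveform = wavegen(model, c=mel)          # autoregressive WaveNet synthesis (slow)
    sf.write(name + '.wav', waveform, 16000)  # 16 kHz, matching the training data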

@ghost

ghost commented Jan 5, 2021

And how do you run inference after that?

@ruclion

ruclion commented Jan 6, 2021

The important thing is to get "metadata.pkl".
It can be generated by running make_spect.py and then python make_metadata.py.
If you run them directly, they use the author's wavs;
if you change the wavs to your own, metadata.pkl is built from your wavs. Then
read the code in conversion.ipynb and run it~
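For reference, each entry in the testing metadata is a list of [speaker_id, speaker_embedding, mel_spectrogram, ...]. A quick sanity check of what you produced (a sketch, assuming the file sits in the working directory):

# Inspect the testing metadata consumed by conversion.ipynb.
import pickle
import numpy as np

with open('metadata.pkl', 'rb') as f:
    metadata = pickle.load(f)

for entry in metadata:
    speaker_id, speaker_emb = entry[0], entry[1]
    mels = entry[2:]
    print(speaker_id,
          'emb:', np.asarray(speaker_emb).shape,  # expect (256,)
          'mels:', [m.shape for m in mels])       # each (frames, 80)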

@ghost

ghost commented Jan 6, 2021

python make_metadata.py does NOT generate "metadata.pkl". You can check the code.

@aneybaby727

@ruclion I have the same problem make_metadata.py does NOT generate "metadata.pkl".

@hongchengzhu

@ruclion I have the same problem make_metadata.py does NOT generate "metadata.pkl".

python make_metadata.py does NOT generate "metadata.pkl". You can check the code.

Hello, I ran into the same problem. Could you please share how you solved it? Thank you in advance!

@hongchengzhu

I just want to play with this repo, I don't want to train/build anything. Just use it for few times. Any instructions how to do it?

Have you solved this? Could you share the solution, please? Thank you.

@atravler

I just want to play with this repo, I don't want to train/build anything. Just use it for few times. Any instructions how to do it?

Have you solved the question? Could you share the solution, please? Thank you.

Same here. Do you have any solution?

@jlian2

jlian2 commented Sep 30, 2021

If you put only one wav file into each speaker directory, this modified make_metadata.py should work:

import os
import pickle
from collections import OrderedDict

import numpy as np
import torch

from model_bl import D_VECTOR

# Load the pre-trained speaker encoder and strip the 'module.' prefix
# that DataParallel adds to the checkpoint keys.
C = D_VECTOR(dim_input=80, dim_cell=768, dim_emb=256).eval().cuda()
c_checkpoint = torch.load('3000000-BL.ckpt')
new_state_dict = OrderedDict()
for key, val in c_checkpoint['model_b'].items():
    new_key = key[7:]  # drop leading 'module.'
    new_state_dict[new_key] = val
C.load_state_dict(new_state_dict)

num_uttrs = 1  # one utterance per speaker directory

# Directory containing mel-spectrograms produced by make_spect.py
rootDir = './spmel'
dirName, subdirList, _ = next(os.walk(rootDir))
print('Found directory: %s' % dirName)

speakers = []
for speaker in sorted(subdirList):
    # NOTE: only 4-character speaker folder names (VCTK style, e.g. 'p225') are kept
    if len(speaker) != 4:
        continue
    print('Processing speaker: %s' % speaker)
    utterances = [speaker]
    _, _, fileList = next(os.walk(os.path.join(dirName, speaker)))

    # pick num_uttrs utterances at random (with one file per folder, this is just that file)
    idx_uttrs = np.random.choice(len(fileList), size=num_uttrs, replace=False)

    embs = []
    mel_specs = []
    for i in range(num_uttrs):
        tmp = np.load(os.path.join(dirName, speaker, fileList[idx_uttrs[i]]))
        melsp = torch.from_numpy(tmp).cuda().unsqueeze(0)
        emb = C(melsp)
        embs.append(emb.detach().squeeze().cpu().numpy())
        mel_specs.append(melsp.squeeze(0))

    utterances.append(np.mean(embs, axis=0))  # averaged speaker embedding

    # keep the full (uncropped) mel-spectrograms for conversion
    for mel_spec in mel_specs:
        utterances.append(mel_spec.cpu().numpy())
    speakers.append(utterances)

print('number of speakers:', len(speakers))

with open('metadata_own.pkl', 'wb') as handle:
    pickle.dump(speakers, handle)
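With metadata_own.pkl in hand, the conversion step itself is short. Below is a sketch along the lines of conversion.ipynb, assuming the Generator(32, 256, 512, 32) hyper-parameters and the autovc.ckpt checkpoint name used in this repo; verify both against your copy of the notebook:

# Sketch of the conversion loop from conversion.ipynb, fed with metadata_own.pkl.
import pickle
from math import ceil
import numpy as np
import torch
from model_vc import Generator

def pad_seq(x, base=32):
    # Pad the mel to a multiple of `base` frames, as the notebook does.
    len_out = int(base * ceil(float(x.shape[0]) / base))
    len_pad = len_out - x.shape[0]
    return np.pad(x, ((0, len_pad), (0, 0)), 'constant'), len_pad

device = 'cuda:0'
G = Generator(32, 256, 512, 32).eval().to(device)
g_checkpoint = torch.load('autovc.ckpt', map_location=device)
G.load_state_dict(g_checkpoint['model'])

metadata = pickle.load(open('metadata_own.pkl', 'rb'))
spect_vc = []
for sbmt_i in metadata:                      # source: content to convert
    x_org, len_pad = pad_seq(sbmt_i[2])
    uttr_org = torch.from_numpy(x_org[np.newaxis, :, :]).to(device)
    emb_org = torch.from_numpy(sbmt_i[1][np.newaxis, :]).to(device)
    for sbmt_j in metadata:                  # target: voice to imitate
        emb_trg = torch.from_numpy(sbmt_j[1][np.newaxis, :]).to(device)
        with torch.no_grad():
            _, x_identic_psnt, _ = G(uttr_org, emb_org, emb_trg)
        if len_pad == 0:
            uttr_trg = x_identic_psnt[0, 0, :, :].cpu().numpy()
        else:
            uttr_trg = x_identic_psnt[0, 0, :-len_pad, :].cpu().numpy()
        spect_vc.append(('{}x{}'.format(sbmt_i[0], sbmt_j[0]), uttr_trg))

with open('results.pkl', 'wb') as handle:
    pickle.dump(spect_vc, handle)

The resulting results.pkl is what vocoder.ipynb consumes to produce waveforms.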

@dragen1860

How can I use my own source content wav and target style wav? Thank you.
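One way, building on the modified script above (a sketch; make_spect.py in this repo walks ./wavs/<speaker>/, and note that the script above skips speaker folders whose names are not exactly 4 characters):

# Hypothetical layout for converting my_source.wav into the voice of my_target.wav:
#   ./wavs/srcA/my_source.wav   <- content to convert (folder name: 4 chars)
#   ./wavs/trgA/my_target.wav   <- voice to imitate
#
# Pipeline:
#   python make_spect.py      # wavs -> mel-spectrograms under ./spmel
#   python make_metadata.py   # the modified version above -> metadata_own.pkl
#
# conversion.ipynb then produces every (source, target) pairing,
# and vocoder.ipynb turns the results into waveforms.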

@Ha0Tang

Ha0Tang commented Jan 17, 2022

@dragen1860 have you fixed the issue?
