not combining utt2uniq as it does not exist #14

Open
980202006 opened this issue Apr 26, 2022 · 11 comments

Comments

@980202006

Is there any problem with this message?
image

@Natalia-T
Collaborator

This is not a bug or a warning, only an info message.
It is the expected behavior of the Kaldi script utils/combine_data.sh for the given data.
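
To illustrate why the message is harmless, here is a simplified Python sketch of the per-file rule that utils/combine_data.sh appears to follow: a file is merged only when every input directory has it, and is otherwise skipped with an info line. The function name `combine_data` and the `OPTIONAL_FILES` list are assumptions made for this illustration, not the actual Kaldi code, and the real script handles some files (such as utt2uniq) with extra logic.

```python
from pathlib import Path

# Illustrative subset of the file names seen in the log; not the script's full list.
OPTIONAL_FILES = ["utt2uniq", "segments", "utt2spk", "utt2dur", "text", "wav.scp", "spk2gender"]


def combine_data(dest, sources, names=OPTIONAL_FILES):
    """Merge each named file only if it exists in every source directory.

    Returns the list of status messages, mimicking the script's log lines.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    messages = []
    for name in names:
        paths = [Path(src) / name for src in sources]
        if all(p.is_file() for p in paths):
            lines = []
            for p in paths:
                lines.extend(p.read_text().splitlines())
            # Concatenate and sort, as a Kaldi data directory expects sorted tables.
            (dest / name).write_text("\n".join(sorted(lines)) + "\n")
            messages.append(f"combined {name}")
        else:
            # The file is optional, so skipping it is informational, not an error.
            messages.append(f"[info]: not combining {name} as it does not exist")
    return messages
```

Under this rule, a data directory without utt2uniq simply produces the info line from the issue title and the remaining files are still combined.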

@980202006
Author

Thank you! I got this error. Are there any files missing from the directory?
Traceback (most recent call last):
  File "/home/xingyum/models/Voice-Privacy-Challenge-2022/venv/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/xingyum/models/Voice-Privacy-Challenge-2022/venv/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/xingyum/.vscode-server/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
    cli.main()
  File "/home/xingyum/.vscode-server/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
    run()
  File "/home/xingyum/.vscode-server/extensions/ms-python.python-2022.4.1/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
    runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
  File "/home/xingyum/models/Voice-Privacy-Challenge-2022/venv/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/xingyum/models/Voice-Privacy-Challenge-2022/venv/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/xingyum/models/Voice-Privacy-Challenge-2022/venv/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/xingyum/models/Voice-Privacy-Challenge-2022/baseline/local/anon/gen_pseudo_xvecs.py", line 107, in <module>
    if pool_spk2gender[pool_spk] == gender:
KeyError: '392'

@Natalia-T
Collaborator

Natalia-T commented Apr 28, 2022

Could you please attach the file (spk2gender) from the speaker pool that causes the problem in https://github.com/Voice-Privacy-Challenge/Voice-Privacy-Challenge-2022/blob/master/baseline/local/anon/gen_pseudo_xvecs.py#L106

If you did not change the speaker pool in the original recipe, the path is baseline\data\libritts_train_other_500\spk2gender.

@980202006
Author

1006 f
102 f
1049 m
104 f
1051 f
1065 m
107 m
1084 f
1085 m
1092 f
1094 m
1096 m
1097 m
1107 m
110 f
1110 m
111 f
1124 f
1132 m
1152 f
1154 f
1161 m
1166 f
1168 f
1171 f
1179 m
1184 m
1187 m
1200 m
1225 m
1230 m
1239 m
123 f
1250 f
1252 f
1258 m
1260 m
1261 m
1266 f
1274 m
127 m
1280 m
128 m
1291 f
1298 f
1331 m
133 m
1341 m
1342 f
1347 m
1353 m
1367 m
1370 f
1373 f
1374 m
1384 m
1403 m
1414 f
1421 f
1430 m
1444 m
1469 m
1474 f
147 m
1485 m
1492 m
1494 m
1495 m
1505 m
151 f
152 m
153 m
1544 f
1545 f
1559 m
1563 f
1564 f
1566 f
1569 f
1572 m
1579 f
1593 f
1595 m
1601 m
1614 m
1618 m
161 m
1621 m
1633 f
1636 f
1643 m
1646 m
1647 m
1648 f
1653 f
1664 f
1665 f
1674 f
1679 f
167 m
1680 f
1681 f
1685 m
168 m
1690 f
1691 f
1693 f
1695 f
1696 f
1699 m
1704 f
1708 m
1710 f
1714 f
1717 f
1721 f
1726 f
1733 f
1736 f
173 f
1746 m
1750 f
1756 f
1757 f
1760 f
1765 f
1767 f
1772 m
1773 f
177 f
1780 m
1784 f
1795 m
1804 f
1809 f
1813 f
1815 m
1819 f
1828 m
1844 m
1846 m
1863 f
1868 m
1870 m
1878 m
1901 f
1920 f
1924 m
1931 f
1938 m
1968 f
1977 f
1985 m
1989 m
199 f
2001 m
2003 m
2013 m
2021 m
2026 f
202 f
2042 f
2046 m
2050 f
2051 m
2062 f
2063 m
2067 m
2068 f
2089 f
2090 f
2096 m
20 f
2100 f
2104 m
2122 m
2133 m
2140 m
2143 f
2148 f
2152 f
215 f
2185 m
218 m
2195 m
2198 m
2208 m
2234 m
2237 f
2246 m
2262 m
2270 f
2273 m
2275 f
2276 f
2279 m
2284 m
2288 f
228 f
2292 m
2297 f
2301 f
2309 m
2312 f
2339 f
2341 m
2346 f
2351 f
2356 m
2361 f
2374 f
2380 m
238 f
2405 m
2407 m
2437 m
243 f
2445 f
2448 m
245 m
2485 f
2487 f
2488 f
2491 m
2496 m
2504 f
2522 f
2526 m
252 m
253 m
2541 m
2544 f
2545 m
2552 m
2553 f
255 m
2568 f
2574 m
2587 f
2588 m
25 m
2606 m
2607 f
2624 m
263 m
264 m
265 m
2660 m
2671 f
2676 f
2694 f
2712 f
2724 m
2730 f
2733 m
2735 f
273 m
2740 m
2748 f
2754 f
2762 f
277 f
2825 m
2834 f
283 m
2854 m
2895 f
2909 f
2919 f
2925 f
2930 m
2943 f
2946 m
294 f
2967 f
2975 f
2979 f
2985 m
2988 f
2990 m
2997 f
2998 m
29 m
3006 f
3020 m
3021 m
3033 f
3045 m
3053 f
3054 m
3060 m
3063 m
3079 f
3088 m
3090 f
3097 m
3098 m
3100 m
3109 m
310 f
3125 f
3132 m
3135 m
3137 m
3138 f
313 f
3142 m
3143 f
3144 m
3148 m
3172 f
3179 f
317 m
3192 m
3196 f
319 m
31 m
3227 f
3238 m
3244 f
3245 f
3257 m
3261 m
3268 m
3271 m
3272 m
3285 m
3288 m
3290 f
3314 f
3318 f
3319 f
331 m
3334 f
3346 f
3356 f
336 m
3373 m
3381 m
3394 f
3400 f
3409 f
3411 f
3417 m
3433 m
3465 f
3467 f
3470 m
3479 f
3488 m
348 m
3500 f
3503 m
3541 m
3547 m
3553 m
3554 f
3557 f
3559 f
3564 f
3567 m
3571 m
3587 m
3588 f
3592 m
3595 m
3598 f
3606 m
3618 m
3641 f
3647 f
3650 m
3656 m
3657 m
365 m
3665 m
366 f
3675 f
3679 f
3681 f
3691 m
3698 f
36 m
3744 m
3747 m
3757 f
3779 f
377 m
3780 m
3783 m
3793 m
3796 m
3798 f
37 m
3819 m
3843 f
3845 m
3848 f
3867 f
3871 m
3885 f
3894 m
3895 m
3896 m
3906 f
3909 f
3911 f
3912 m
3925 f
3926 f
3928 f

@Natalia-T
Collaborator

Thank you @980202006.
This spk2gender file is incomplete: LibriTTS\train-other-500 contains 1160 speakers in total, while your file lists only 411.

The error occurs because speaker 392 is missing from the file.
(The original \baseline\data\libritts_train_other_500\spk2gender file is in the attachment: spk2gender)
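
A quick way to spot this kind of gap before running the recipe is to compare the speaker IDs in spk2gender with those in another table of the same data directory. The sketch below assumes the speaker pool directory also contains a Kaldi spk2utt file listing every pool speaker; the helper names `read_table_keys` and `missing_speakers` are hypothetical and are not part of the challenge code.

```python
def read_table_keys(path):
    """Return the first whitespace-separated field of every non-empty line."""
    keys = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if fields:
                keys.add(fields[0])
    return keys


def missing_speakers(spk2gender_path, spk2utt_path):
    """List speakers that appear in spk2utt but have no entry in spk2gender."""
    # Assumption: spk2utt is complete, so any ID absent from spk2gender is a gap.
    return sorted(read_table_keys(spk2utt_path) - read_table_keys(spk2gender_path))
```

If the returned list is non-empty, a lookup such as `pool_spk2gender[pool_spk]` would be expected to raise a KeyError for each listed speaker.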

@980202006
Author

I have the following problem: "activate does not accept more than one argument".
Stage 8: Making evaluation subsets...
temp
dev
utils/subset_data_dir.sh: reducing #utt from 2321 to 343
utils/subset_data_dir.sh: reducing #utt from 2321 to 1018
utils/subset_data_dir.sh: reducing #utt from 2321 to 960
utils/combine_data.sh data/libri_dev_trials_all data/libri_dev_trials_f data/libri_dev_trials_m
activate does not accept more than one argument:
['data/libri_dev_trials_all', 'data/libri_dev_trials_f', 'data/libri_dev_trials_m']

the in_dir is data/libri_dev_trials_f
the in_dir is data/libri_dev_trials_m
utils/combine_data.sh [info]: not combining utt2uniq as it does not exist
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/combine_data.sh: combined utt2dur
utils/combine_data.sh [info]: not combining utt2num_frames as it does not exist
utils/combine_data.sh [info]: not combining reco2dur as it does not exist
utils/combine_data.sh [info]: not combining feats.scp as it does not exist
utils/combine_data.sh: combined text
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh: combined spk2gender
fix_data_dir.sh: kept all 1978 utterances.
fix_data_dir.sh: old files are kept in data/libri_dev_trials_all/.backup
utils/subset_data_dir.sh: reducing #utt from 12253 to 600
utils/subset_data_dir.sh: reducing #utt from 12253 to 5422
utils/subset_data_dir.sh: reducing #utt from 12253 to 344
utils/combine_data.sh data/vctk_dev_trials_f_all data/vctk_dev_trials_f data/vctk_dev_trials_f_common
activate does not accept more than one argument:
['data/vctk_dev_trials_f_all', 'data/vctk_dev_trials_f', 'data/vctk_dev_trials_f_common']

the in_dir is data/vctk_dev_trials_f
the in_dir is data/vctk_dev_trials_f_common
utils/combine_data.sh [info]: not combining utt2uniq as it does not exist
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/combine_data.sh: combined utt2dur
utils/combine_data.sh [info]: not combining utt2num_frames as it does not exist
utils/combine_data.sh [info]: not combining reco2dur as it does not exist
utils/combine_data.sh [info]: not combining feats.scp as it does not exist
utils/combine_data.sh: combined text
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh: combined spk2gender
utils/fix_data_dir.sh: file data/vctk_dev_trials_f_all/spk2gender is not in sorted order or not unique, sorting it
fix_data_dir.sh: kept all 5766 utterances.
fix_data_dir.sh: old files are kept in data/vctk_dev_trials_f_all/.backup
utils/subset_data_dir.sh: reducing #utt from 12253 to 5255
utils/subset_data_dir.sh: reducing #utt from 12253 to 351
utils/combine_data.sh data/vctk_dev_trials_m_all data/vctk_dev_trials_m data/vctk_dev_trials_m_common
activate does not accept more than one argument:
['data/vctk_dev_trials_m_all', 'data/vctk_dev_trials_m', 'data/vctk_dev_trials_m_common']

the in_dir is data/vctk_dev_trials_m
the in_dir is data/vctk_dev_trials_m_common
utils/combine_data.sh [info]: not combining utt2uniq as it does not exist
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/combine_data.sh: combined utt2dur
utils/combine_data.sh [info]: not combining utt2num_frames as it does not exist
utils/combine_data.sh [info]: not combining reco2dur as it does not exist
utils/combine_data.sh [info]: not combining feats.scp as it does not exist
utils/combine_data.sh: combined text
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh: combined spk2gender
utils/fix_data_dir.sh: file data/vctk_dev_trials_m_all/spk2gender is not in sorted order or not unique, sorting it
fix_data_dir.sh: kept all 5606 utterances.
fix_data_dir.sh: old files are kept in data/vctk_dev_trials_m_all/.backup
utils/combine_data.sh data/vctk_dev_trials_all data/vctk_dev_trials_f_all data/vctk_dev_trials_m_all
activate does not accept more than one argument:
['data/vctk_dev_trials_all', 'data/vctk_dev_trials_f_all', 'data/vctk_dev_trials_m_all']

the in_dir is data/vctk_dev_trials_f_all
the in_dir is data/vctk_dev_trials_m_all
utils/combine_data.sh [info]: not combining utt2uniq as it does not exist
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/combine_data.sh: combined utt2dur
utils/combine_data.sh [info]: not combining utt2num_frames as it does not exist
utils/combine_data.sh [info]: not combining reco2dur as it does not exist
utils/combine_data.sh [info]: not combining feats.scp as it does not exist
utils/combine_data.sh: combined text
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh: combined spk2gender
fix_data_dir.sh: kept all 11372 utterances.
fix_data_dir.sh: old files are kept in data/vctk_dev_trials_all/.backup
test
utils/subset_data_dir.sh: reducing #utt from 2620 to 438
utils/subset_data_dir.sh: reducing #utt from 2620 to 734
utils/subset_data_dir.sh: reducing #utt from 2620 to 762
utils/combine_data.sh data/libri_test_trials_all data/libri_test_trials_f data/libri_test_trials_m
activate does not accept more than one argument:
['data/libri_test_trials_all', 'data/libri_test_trials_f', 'data/libri_test_trials_m']

the in_dir is data/libri_test_trials_f
the in_dir is data/libri_test_trials_m
utils/combine_data.sh [info]: not combining utt2uniq as it does not exist
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/combine_data.sh: combined utt2dur
utils/combine_data.sh [info]: not combining utt2num_frames as it does not exist
utils/combine_data.sh [info]: not combining reco2dur as it does not exist
utils/combine_data.sh [info]: not combining feats.scp as it does not exist
utils/combine_data.sh: combined text
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh: combined spk2gender
fix_data_dir.sh: kept all 1496 utterances.
fix_data_dir.sh: old files are kept in data/libri_test_trials_all/.backup
utils/subset_data_dir.sh: reducing #utt from 12350 to 600
utils/subset_data_dir.sh: reducing #utt from 12350 to 5328
utils/subset_data_dir.sh: reducing #utt from 12350 to 346
utils/combine_data.sh data/vctk_test_trials_f_all data/vctk_test_trials_f data/vctk_test_trials_f_common
activate does not accept more than one argument:
['data/vctk_test_trials_f_all', 'data/vctk_test_trials_f', 'data/vctk_test_trials_f_common']

the in_dir is data/vctk_test_trials_f
the in_dir is data/vctk_test_trials_f_common
utils/combine_data.sh [info]: not combining utt2uniq as it does not exist
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/combine_data.sh: combined utt2dur
utils/combine_data.sh [info]: not combining utt2num_frames as it does not exist
utils/combine_data.sh [info]: not combining reco2dur as it does not exist
utils/combine_data.sh [info]: not combining feats.scp as it does not exist
utils/combine_data.sh: combined text
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh: combined spk2gender
utils/fix_data_dir.sh: file data/vctk_test_trials_f_all/spk2gender is not in sorted order or not unique, sorting it
fix_data_dir.sh: kept all 5674 utterances.
fix_data_dir.sh: old files are kept in data/vctk_test_trials_f_all/.backup
utils/subset_data_dir.sh: reducing #utt from 12350 to 5420
utils/subset_data_dir.sh: reducing #utt from 12350 to 354
utils/combine_data.sh data/vctk_test_trials_m_all data/vctk_test_trials_m data/vctk_test_trials_m_common
activate does not accept more than one argument:
['data/vctk_test_trials_m_all', 'data/vctk_test_trials_m', 'data/vctk_test_trials_m_common']

the in_dir is data/vctk_test_trials_m
the in_dir is data/vctk_test_trials_m_common
utils/combine_data.sh [info]: not combining utt2uniq as it does not exist
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/combine_data.sh: combined utt2dur
utils/combine_data.sh [info]: not combining utt2num_frames as it does not exist
utils/combine_data.sh [info]: not combining reco2dur as it does not exist
utils/combine_data.sh [info]: not combining feats.scp as it does not exist
utils/combine_data.sh: combined text
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh: combined spk2gender
utils/fix_data_dir.sh: file data/vctk_test_trials_m_all/spk2gender is not in sorted order or not unique, sorting it
fix_data_dir.sh: kept all 5774 utterances.
fix_data_dir.sh: old files are kept in data/vctk_test_trials_m_all/.backup
utils/combine_data.sh data/vctk_test_trials_all data/vctk_test_trials_f_all data/vctk_test_trials_m_all
activate does not accept more than one argument:
['data/vctk_test_trials_all', 'data/vctk_test_trials_f_all', 'data/vctk_test_trials_m_all']

the in_dir is data/vctk_test_trials_f_all
the in_dir is data/vctk_test_trials_m_all
utils/combine_data.sh [info]: not combining utt2uniq as it does not exist
utils/combine_data.sh [info]: not combining segments as it does not exist
utils/combine_data.sh: combined utt2spk
utils/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/combine_data.sh: combined utt2dur
utils/combine_data.sh [info]: not combining utt2num_frames as it does not exist
utils/combine_data.sh [info]: not combining reco2dur as it does not exist
utils/combine_data.sh [info]: not combining feats.scp as it does not exist
utils/combine_data.sh: combined text
utils/combine_data.sh [info]: not combining cmvn.scp as it does not exist
utils/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/combine_data.sh: combined wav.scp
utils/combine_data.sh: combined spk2gender
fix_data_dir.sh: kept all 11448 utterances.
fix_data_dir.sh: old files are kept in data/vctk_test_trials_all/.backup
Done

Stage 9: Anonymizing evaluation datasets...
anon_level = spk
libri_dev_enrolls
/home/xingyum/models/Voice-Privacy-Challenge-2022/baseline/exp/am_nsf_data

Anonymizing using x-vectors and neural wavform models...
activate does not accept more than one argument:
['--nj', '128', '--anoni-pool', 'libritts_train_other_500', '--data-netcdf', '/home/xingyum/models/Voice-Privacy-Challenge-2022/baseline/exp/am_nsf_data', '--ppg-model', 'exp/models/1_asr_am/exp', '--ppg-dir', 'exp/models/1_asr_am/exp/nnet3_cleaned', '--xvec-nnet-dir', 'exp/models/2_xvect_extr/exp/xvector_nnet_1a', '--anon-xvec-out-dir', 'exp/models/2_xvect_extr/exp/xvector_nnet_1a/anon', '--plda-dir', 'exp/models/2_xvect_extr/exp/xvector_nnet_1a', '--pseudo-xvec-rand-level', 'spk', '--distance', 'plda', '--proximity', 'farthest', '--cross-gender', 'false', '--rand-seed', '0', '--anon-data-suffix', '_anon', '--model-type', 'am_nsf_pytorch', '--inference-trunc-len', '-1', '--inference-batch-size-am', '10', '--inference-batch-size-wav', '5', 'libri_dev_enrolls']

param=libri_dev_enrolls

Stage a.0: Extracting xvectors for libri_dev_enrolls.
activate does not accept more than one argument:
['--nj', '29', 'data/libri_dev_enrolls', 'exp/models/2_xvect_extr/exp/xvector_nnet_1a', 'exp/models/2_xvect_extr/exp/xvector_nnet_1a/anon']

@980202006
Author

Thank you for the reply.

@Natalia-T
Collaborator

@minhduc0711 proposed a solution to fix this issue: 5a8c9f9

You can make the same change to the first line of your env.sh file.

@980202006
Author

Thank you! I have a new problem.
image
image

@Natalia-T
Collaborator

The missing files in .../baseline/exp/models/2_xvect_extr/exp/xvector_nnet_1a/anon/xvectors_libri_dev_trials_f/pseudo_xvecs/

-   spk2gender
-   pseudo_xvector.scp
-   pseudo_xvector.ark

should be created here:

print("Writing pseud-speaker xvectors to: "+pseudo_xvecs_dir)

Did you successfully complete all the previous stages? (I think your code is a little different from the current git version).
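
To confirm whether the earlier stage actually produced its outputs, a small check like the following can help. It is only a sketch: the three file names come from the list above, while the function name `missing_pseudo_xvec_files` is hypothetical and the directory you pass in is whatever pseudo_xvecs path your run uses.

```python
from pathlib import Path

# File names taken from the list of missing files in this comment.
EXPECTED_FILES = ("spk2gender", "pseudo_xvector.scp", "pseudo_xvector.ark")


def missing_pseudo_xvec_files(pseudo_xvecs_dir):
    """Return the expected output files that are absent from the directory."""
    directory = Path(pseudo_xvecs_dir)
    return [name for name in EXPECTED_FILES if not (directory / name).is_file()]
```

An empty result suggests the pseudo-speaker x-vector step finished writing its files; any listed name points to a stage that probably failed or was skipped.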

@980202006
Author

Sorry for the late reply. I ran into the same problem when running local/main_anonymization_train_data.sh. Could you provide a working set of arguments for gen_pseudo_xvecs.py?
image
