I got this error on the make tssep_pretrained_eval command:
FileNotFoundError: [Errno 2] No such file or directory: '~/tssep_data/egs/libri_css/data/ivector/simLibriCSS_oracle_ivectors.json'
I have not changed anything in the config files, nor in tssep_pretrained_77_62000.yaml, which is the base. However, I think simLibriCSS_oracle_ivectors.json is unnecessary for evaluation, since evaluation only needs the produced i-vectors (libriCSS_espnet_ivectors.json), and the feature_statistics.pkl used for domain adaptation is downloaded successfully.
Update (4/9/2024)
Just a quick update: I managed to get past the previous error by commenting out the following lines in config.yaml:
line 113: - ~/testing_evaluation_tssep_data/tssep_data/egs/libri_css/data/ivector/simLibriCSS_oracle_ivectors.json
line 124: SimLibriCSS-dev: true
and by changing:
aux_feature_statistics_domain_adaptation: null (from mean_std to null)
The problem now is that I get a CUDA out-of-memory error:
Run eval: ~/testing_evaluation_tssep_data/tssep_data/egs/libri_css/tssep_pretrained/eval/62000/1
device: 0
Load feature statistics from cache: ~/testing_evaluation_tssep_data/tssep_data/egs/libri_css/tssep_pretrained/eval/62000/1/cache/feature_statistics.pkl
Use prefetch with threads for dataloading
0%| | 0/60 [00:01<?, ?it/s]
ERROR - extract_eval - Failed after 0:00:04!
Traceback (most recent calls WITHOUT Sacred internals):
File "~/testing_evaluation_tssep_data/tssep_data/tssep_data/eval/run.py", line 246, in main
eeg.eval(eg=eg)
File "~/testing_evaluation_tssep_data/tssep_data/tssep_data/eval/experiment.py", line 825, in eval
self.work(
File "~/testing_evaluation_tssep_data/tssep_data/tssep_data/eval/experiment.py", line 382, in work
ex['Observation'] = self.wpe(ex['Observation'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "~/testing_evaluation_tssep_data/tssep/tssep/train/enhancer.py", line 336, in __call__
nara_wpe.torch_wpe.wpe_v6(
File "~/miniconda3/envs/ivec_train_check/lib/python3.11/site-packages/nara_wpe/torch_wpe.py", line 222, in wpe_v6
Y_tilde_inverse_power = Y_tilde * inverse_power[..., None, :]
~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 20.60 GiB. GPU 0 has a total capacty of 22.02 GiB of which 14.05 GiB is free. Including non-PyTorch memory, this process has 7.97 GiB memory in use. Of the allocated memory 6.90 GiB is allocated by PyTorch, and 14.47 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
make: *** [Makefile:13: run] Error 1
However, I could not find any argument for the eval batch size, or anything similar, that I could tweak.
Have you got any thoughts on that?
Thanks
I forgot to upload and adjust the code for the domain adaptation of the i-vectors. The current master branch of tssep_data now contains the changes needed to do the domain adaptation without access to the training data.
Disabling aux_feature_statistics_domain_adaptation has a large impact on the WER (6% -> >70%), so it is not recommended.
If you only have access to a small GPU, you can change the nn_segmenter parameters in the config to use fewer frames. Since the eval minibatch size is already 1, it cannot be reduced further.
An alternative is to use the CPU for inference, since the CPU usually has access to more memory than the GPU.
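To illustrate why using fewer frames helps, here is a rough back-of-the-envelope sketch in Python. It assumes that the tensor in the failing line (Y_tilde in the traceback) has roughly the shape (frequency bins, channels * taps, frames) and is stored as complex numbers. The function names and the example values (513 bins, 7 microphones, 10 taps) are hypothetical and are not taken from the tssep config, so treat the numbers as an order-of-magnitude guide only.

```python
# Rough size estimate for one intermediate WPE tensor.
# All shapes here are illustrative assumptions, not values read from tssep.

BYTES_PER_ELEMENT = {"complex64": 8, "complex128": 16}


def wpe_buffer_gib(frames, freq_bins, channels, taps, dtype="complex64"):
    """Size in GiB of a (freq_bins, channels * taps, frames) complex tensor."""
    elements = freq_bins * channels * taps * frames
    return elements * BYTES_PER_ELEMENT[dtype] / 2**30


def max_frames(budget_gib, freq_bins, channels, taps, dtype="complex64"):
    """Largest frame count whose buffer still fits into budget_gib."""
    bytes_per_frame = freq_bins * channels * taps * BYTES_PER_ELEMENT[dtype]
    return int(budget_gib * 2**30 // bytes_per_frame)


if __name__ == "__main__":
    # Hypothetical setup: 513 frequency bins, 7 microphones, 10 filter taps.
    for frames in (5_000, 20_000, 40_000):
        size = wpe_buffer_gib(frames, freq_bins=513, channels=7, taps=10)
        print(f"{frames:>6} frames -> {size:.2f} GiB")
    print("frames that fit in 4 GiB:", max_frames(4, 513, 7, 10))
```

Under these assumptions the estimate is linear in the number of frames, so halving the segment length roughly halves this buffer. Several such intermediates can be alive at the same time, so the real peak usage will be higher than a single buffer.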
Update (Sep. 23):
The code in the tssep repo is now updated and uses the memory-efficient WPE implementation.
The evaluation in the original report followed the steps to evaluate a pretrained model: https://github.com/fgnt/tssep_data/blob/master/egs/libri_css/README.md#steps-to-evaluate-a-pretrained-model