The size of tensor a (352768) must match the size of tensor b (352800) at non-singleton dimension 1 #113

Open
Eddycrack864 opened this issue Sep 18, 2024 · 3 comments

Comments

@Eddycrack864

Hi, there is a problem with the BS-Roformer-De-Reverb model (deverb_bs_roformer_8_384dim_10depth.ckpt) in version 0.21.0.

When I try to separate an audio file, I get this error:

2024-09-18 03:14:22.947 - INFO - cli - Separator version 0.21.0 beginning with input file: /content/temp/5 Seconds of Summer - Youngblood (Official Video).wav
2024-09-18 03:14:22.948 - INFO - separator - Separator version 0.21.0 instantiating with output_dir: /content/drive/MyDrive/Vocales, output_format: wav
2024-09-18 03:14:22.957 - INFO - separator - Operating System: Linux #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
2024-09-18 03:14:22.959 - INFO - separator - System: Linux Node: 603cb85998bb Release: 6.1.85+ Machine: x86_64 Proc: x86_64
2024-09-18 03:14:22.959 - INFO - separator - Python Version: 3.10.12
2024-09-18 03:14:22.959 - INFO - separator - PyTorch Version: 2.4.0+cu121
2024-09-18 03:14:23.045 - INFO - separator - FFmpeg installed: ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
2024-09-18 03:14:23.047 - INFO - separator - ONNX Runtime GPU package installed with version: 1.19.2
2024-09-18 03:14:23.065 - INFO - separator - CUDA is available in Torch, setting Torch device to CUDA
2024-09-18 03:14:23.065 - INFO - separator - ONNXruntime has CUDAExecutionProvider available, enabling acceleration
2024-09-18 03:14:23.065 - INFO - separator - Loading model deverb_bs_roformer_8_384dim_10depth.ckpt...
2024-09-18 03:14:25.929 - INFO - mdxc_separator - MDXC Separator initialisation complete
2024-09-18 03:14:25.929 - INFO - separator - Load model duration: 00:00:02
2024-09-18 03:14:25.929 - INFO - separator - Starting separation process for audio_file_path: /content/temp/5 Seconds of Summer - Youngblood (Official Video).wav
  0% 0/115 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "/usr/local/bin/audio-separator", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/audio_separator/utils/cli.py", line 194, in main
    output_files = separator.separate(args.audio_file)
  File "/usr/local/lib/python3.10/dist-packages/audio_separator/separator/separator.py", line 735, in separate
    output_files = self.model_instance.separate(audio_file_path)
  File "/usr/local/lib/python3.10/dist-packages/audio_separator/separator/architectures/mdxc_separator.py", line 137, in separate
    source = self.demix(mix=mix)
  File "/usr/local/lib/python3.10/dist-packages/audio_separator/separator/architectures/mdxc_separator.py", line 261, in demix
    result = self.overlap_add(result, x, window, i, length)
  File "/usr/local/lib/python3.10/dist-packages/audio_separator/separator/architectures/mdxc_separator.py", line 195, in overlap_add
    result[..., start : start + length] += x[..., :length] * weights[:length]
RuntimeError: The size of tensor a (352768) must match the size of tensor b (352800) at non-singleton dimension 1

This error only occurs with that specific model. I tried several files and different overlap values, and the error persists.
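
For anyone trying to narrow this down, here is a rough sketch of where the 32-sample gap (352800 vs 352768) could come from. It assumes, without confirmation from the model or the audio-separator code, that the model runs a centered STFT with a hop of 512 samples internally and that the inverse STFT is called without a target length. `ASSUMED_STFT_HOP` and the helper names are hypothetical, for illustration only.

```python
# Assumption (not confirmed by this issue): the model's internal STFT uses a
# hop of 512 samples with centered framing, as torch.stft does by default.
ASSUMED_STFT_HOP = 512


def stft_frame_count(num_samples: int, hop: int) -> int:
    """Frames produced by a centered STFT over num_samples samples."""
    return 1 + num_samples // hop


def istft_output_length(num_frames: int, hop: int) -> int:
    """Samples returned by a centered inverse STFT when no target length is given."""
    return hop * (num_frames - 1)


def round_trip_length(num_samples: int, hop: int) -> int:
    """Length of a chunk after STFT followed by inverse STFT."""
    return istft_output_length(stft_frame_count(num_samples, hop), hop)


chunk_size = 352800  # "tensor b" in the error: the window/weights length
model_output = round_trip_length(chunk_size, ASSUMED_STFT_HOP)

print(model_output)               # 352768, which matches "tensor a" in the error
print(chunk_size - model_output)  # 32
```

If that assumption holds, the chunk fed to the model is not a multiple of the STFT hop, so the model returns a slightly shorter tensor than the window that `overlap_add` multiplies it with.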

@beveradb
Collaborator

beveradb commented Sep 18, 2024

Yeah I'm aware, thanks for the reminder though!

I was keen to include it in my latest batch of improvements to audio-separator but couldn't get it working (though I'm still hoping it's just the config that needs a little tweaking).

That's why I didn't announce it in #105 (comment)

Will look into it at some point unless someone beats me to it and raises a PR with a fix ☺️

@Bebra777228
Contributor

It might be worth reducing the dim_t parameter in the configuration files.

I compared the Roformer model configurations from other repositories, and everywhere this parameter is set to 256. However, in your configuration files, it is set to 801. Additionally, some configurations differ significantly from what I found.
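
To make the effect of `dim_t` concrete, here is a small sketch under stated assumptions. It assumes the chunk size is derived as `hop_length * (dim_t - 1)`, that the config uses `hop_length = 441` (which is consistent with 352800 = 441 × 800 for `dim_t = 801`), and that the model's internal STFT hop is 512. None of these is confirmed in this thread, and the function names are hypothetical.

```python
# Assumptions (not confirmed by this issue):
#   chunk_size = hop_length * (dim_t - 1)
#   the model's internal STFT hop is 512 samples
ASSUMED_STFT_HOP = 512


def chunk_size_for(dim_t: int, hop_length: int) -> int:
    """Chunk length in samples implied by the config values."""
    return hop_length * (dim_t - 1)


def keeps_length(chunk_size: int, stft_hop: int) -> bool:
    """True if a centered STFT/ISTFT round trip returns the same length."""
    return chunk_size % stft_hop == 0


for dim_t, hop_length in [(801, 441), (256, 441), (256, 512)]:
    chunk = chunk_size_for(dim_t, hop_length)
    print(dim_t, hop_length, chunk, keeps_length(chunk, ASSUMED_STFT_HOP))

# 801 441 352800 False
# 256 441 112455 False
# 256 512 130560 True
```

Under these assumptions, lowering `dim_t` alone would not remove the mismatch; the config's `hop_length` would also need to line up with the model's STFT hop. Treat this as a starting point for checking the real config, not as a confirmed diagnosis.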

@beveradb
Collaborator

Yeah, I wasn't able to get this model working as easily as the others just by modifying the config - but if you manage to get it working please raise a PR 😃
