Precision issues with __hv_log2_f SSE and NEON SIMD #184

Open
dromer opened this issue Jun 18, 2024 · 22 comments
Labels
bug (Something isn't working), help wanted (Extra attention is needed), simd (Requires SIMD attention)

Comments

Collaborator

dromer commented Jun 18, 2024

I wanted to report an issue with a patch that compiles fine but does not produce the right output. It involves the ftom~ object, so is this to be expected?

_Originally posted by @lokkiikkol

Collaborator Author

dromer commented Jun 18, 2024

@lokkiikkol please don't post issues in (possibly) unrelated tickets. Especially when not giving any useful information.

Can you share a (minimum) patch that showcases the problem?

And try to explain the problem as well. "not the right output" doesn't give us anything to go on.
Which OS do you use and which generator for the build?

@dromer dromer added the bug Something isn't working label Jun 18, 2024

lokkiikkol commented Jun 18, 2024

Sorry, yes. I use plugdata to export a patch to VST on macOS (Sonoma). I just put together a simple test patch, and it works with mtof~ and ftom~; however, the resulting frequency is out of tune relative to the test frequency I compare it against. This is the patch:

#N canvas 827 239 527 327 12;
#X obj 233 164 sig~;
#X msg 212 121 220;
#X obj 170 72 loadbang;
#X obj 229 241 osc~;
#X obj 230 350 dac~;
#X obj 311 211 ftom~;
#X obj 318 249 mtof~;
#X obj 306 285 osc~;
#X connect 0 0 3 0;
#X connect 0 0 5 0;
#X connect 1 0 0 0;
#X connect 2 0 1 0;
#X connect 3 0 4 0;
#X connect 5 0 6 0;
#X connect 6 0 7 0;
#X connect 7 0 4 1;

I will have to investigate further why my original patch is not working. It is a bit more complex, but I will try to strip it down some more.

Collaborator Author

dromer commented Jun 18, 2024

Hmm, it works correctly for me on Linux.

Are you on Apple Silicon? Maybe there is an issue with the NEON optimization.

@lokkiikkol

Yep, I am on an M1. So you get no beating (left against right channel) at all in the exported VST?

Collaborator Author

dromer commented Jun 18, 2024

I do not. You can see here a quick scope comparison of the two outputs:

2024-06-18_20-45


lokkiikkol commented Jun 18, 2024

ok, this patch here exhibits the problem:

on my machine the left channel outputs a phasor at 220 hertz while the right channel stays silent...

Patch
#N canvas 827 239 527 327 12;
#X obj 478 270 ftom~;
#X obj 450 838 dac~;
#X obj 504 598 mtof~;
#X obj 489 385 /~ 12;
#X obj 490 446 wrap~;
#X obj 536 327 +~ 0.5;
#X obj 437 466 -~;
#X obj 434 492 *~ 12;
#X obj 535 562 +~;
#X obj 492 470 -~ 0.5;
#X obj 459 338 +~ 0.5;
#X obj 426 415 -~ 0.5;
#X obj 618 294 sig~;
#X obj 479 302 +~;
#X obj 462 556 -~;
#X obj 577 238 r root_note;
#X obj 549 384 wrap~;
#X obj 545 357 /~ 12;
#X obj 551 412 *~ 12;
#X text 618 371 modulo %12 with heavy compatible objects...;
#X msg 689 238 0;
#N canvas 334 91 775 633 pitchtocv 0;
#X obj 38 93 rpole~;
#X obj 33 -210 *~ 1e+07;
#X obj 51 121 samphold~;
#X obj 26 52 *~ -1;
#X obj 36 71 +~ 1;
#X obj 33 211 /~;
#X obj -10 74 sig~ 1;
#X obj 132 122 samphold~;
#X obj 152 81 rpole~;
#X obj 114 56 sig~ 1;
#X obj 180 34 *~ -1;
#X obj 180 55 +~ 1;
#X obj 56 185 +~;
#X obj 33 -189 clip~ 0 1;
#X obj 78 261 /~;
#X obj 78 234 sig~;
#X obj 78 208 samplerate~;
#X obj 17 -57 -~;
#X obj 17 -32 clip~ -1e-37 0;
#X obj 17 10 *~ 1e+37;
#X obj 17 -82 min~;
#X obj 17 -11 +~ 1e-37;
#X obj 49 -108 sig~ 1;
#X obj 125 -77 -~;
#X obj 125 -52 clip~ -1e-37 0;
#X obj 125 -10 *~ 1e+37;
#X obj 125 -102 min~;
#X obj 125 -31 +~ 1e-37;
#X obj 157 -128 sig~ -1;
#X obj 52 153 -~ 1;
#X obj 134 12 *~ 1;
#X obj 20 30 *~ 1;
#X obj 108 158 -~ 2;
#X obj 81 189 loadbang;
#X obj 56 -147 rzero~;
#X obj 97 -170 sig~ 1;
#X obj 79 343 outlet~;
#X obj 0 298 outlet~;
#X obj 33 -235 inlet~;
#X text 135 341 cv out;
#X text 16 279 saw out;
#X connect 0 0 2 0;
#X connect 0 0 5 0;
#X connect 1 0 13 0;
#X connect 2 0 29 0;
#X connect 3 0 4 0;
#X connect 4 0 0 1;
#X connect 5 0 37 0;
#X connect 6 0 0 0;
#X connect 7 0 32 0;
#X connect 8 0 7 0;
#X connect 9 0 8 0;
#X connect 10 0 11 0;
#X connect 11 0 8 1;
#X connect 12 0 5 1;
#X connect 12 0 14 1;
#X connect 13 0 2 1;
#X connect 13 0 34 0;
#X connect 14 0 36 0;
#X connect 15 0 14 0;
#X connect 16 0 15 0;
#X connect 17 0 18 0;
#X connect 18 0 21 0;
#X connect 19 0 31 0;
#X connect 20 0 17 0;
#X connect 21 0 19 0;
#X connect 22 0 20 1;
#X connect 22 0 17 1;
#X connect 23 0 24 0;
#X connect 24 0 27 0;
#X connect 25 0 30 0;
#X connect 26 0 23 0;
#X connect 27 0 25 0;
#X connect 28 0 26 1;
#X connect 29 0 12 0;
#X connect 30 0 10 0;
#X connect 31 0 3 0;
#X connect 31 0 7 1;
#X connect 32 0 12 1;
#X connect 33 0 16 0;
#X connect 34 0 20 0;
#X connect 34 0 23 1;
#X connect 34 0 26 0;
#X connect 35 0 34 1;
#X connect 38 0 1 0;
#X restore 395 129 pd pitchtocv;
#X obj 668 161 loadbang;
#X obj 485 699 phasor~;
#X obj 317 210 phasor~;
#X obj 398 73 sig~;
#X msg 402 44 220;
#X obj 395 104 osc~;
#X obj 568 511 -~ 0.5;
#X obj 372 -31 loadbang;
#X connect 0 0 13 0;
#X connect 2 0 23 0;
#X connect 3 0 4 0;
#X connect 3 0 11 0;
#X connect 4 0 9 0;
#X connect 5 0 17 0;
#X connect 6 0 7 0;
#X connect 7 0 14 0;
#X connect 8 0 2 0;
#X connect 9 0 6 1;
#X connect 10 0 3 0;
#X connect 11 0 6 0;
#X connect 12 0 13 1;
#X connect 12 0 14 1;
#X connect 13 0 5 0;
#X connect 13 0 10 0;
#X connect 14 0 8 0;
#X connect 15 0 12 0;
#X connect 16 0 18 0;
#X connect 17 0 16 0;
#X connect 18 0 28 0;
#X connect 20 0 12 0;
#X connect 21 1 0 0;
#X connect 21 1 24 0;
#X connect 22 0 20 0;
#X connect 23 0 1 1;
#X connect 24 0 1 0;
#X connect 25 0 27 0;
#X connect 26 0 25 0;
#X connect 27 0 21 0;
#X connect 28 0 8 1;
#X connect 29 0 26 0;

Sorry, I don't know why it is not laid out properly... it looks fine before I post it.

Collaborator Author

dromer commented Jun 18, 2024

Sorry, I don't know why it is not laid out properly... it looks fine before I post it.

Use triple backticks instead of single :)

Collaborator Author

dromer commented Jun 18, 2024

These patches are much too complex to debug sensibly. Also I really recommend against using a phasor~ into a dac~ as this can potentially destroy audio equipment.


lokkiikkol commented Jun 18, 2024

Actually, this is the simpler version. Sorry, I posted the same patch twice:

Patch
#N canvas 827 239 527 327 12;
#X obj 348 481 ftom~;
#X obj 320 1049 dac~;
#X obj 374 809 mtof~;
#X obj 359 596 /~ 12;
#X obj 360 657 wrap~;
#X obj 406 538 +~ 0.5;
#X obj 307 677 -~;
#X obj 304 703 *~ 12;
#X obj 405 773 +~;
#X obj 362 681 -~ 0.5;
#X obj 329 549 +~ 0.5;
#X obj 296 626 -~ 0.5;
#X obj 488 505 sig~;
#X obj 349 513 +~;
#X obj 332 767 -~;
#X obj 447 449 r root_note;
#X obj 419 595 wrap~;
#X obj 415 568 /~ 12;
#X obj 421 623 *~ 12;
#X text 488 582 modulo %12 with heavy compatible objects...;
#X msg 559 449 0;
#X obj 538 372 loadbang;
#X obj 355 910 phasor~;
#X obj 187 421 phasor~;
#X msg 272 255 220;
#X obj 438 722 -~ 0.5;
#X obj 242 180 loadbang;
#X obj 256 349 sig~;
#X connect 0 0 13 0;
#X connect 2 0 22 0;
#X connect 3 0 4 0;
#X connect 3 0 11 0;
#X connect 4 0 9 0;
#X connect 5 0 17 0;
#X connect 6 0 7 0;
#X connect 7 0 14 0;
#X connect 8 0 2 0;
#X connect 9 0 6 1;
#X connect 10 0 3 0;
#X connect 11 0 6 0;
#X connect 12 0 13 1;
#X connect 12 0 14 1;
#X connect 13 0 5 0;
#X connect 13 0 10 0;
#X connect 14 0 8 0;
#X connect 15 0 12 0;
#X connect 16 0 18 0;
#X connect 17 0 16 0;
#X connect 18 0 25 0;
#X connect 20 0 12 0;
#X connect 21 0 20 0;
#X connect 22 0 1 1;
#X connect 23 0 1 0;
#X connect 24 0 27 0;
#X connect 25 0 8 1;
#X connect 26 0 24 0;
#X connect 27 0 23 0;
#X connect 27 0 0 0;

Collaborator Author

dromer commented Jun 18, 2024

Looks good here:
2024-06-18_21-04

Can you show the problem on a scope?
I'm using Cardinal here.

@Wasted-Audio Wasted-Audio deleted a comment from lokkiikkol Jun 18, 2024
@lokkiikkol

These patches are much too complex to debug sensibly. Also I really recommend against using a phasor~ into a dac~ as this can potentially destroy audio equipment.

noted.

I don't currently have Cardinal, but I will try to get it. Unfortunately I have to leave now. I can see from the meters in my host (Element) that there is output which is not audible; this normally means that there is just a DC output (frequency at zero).

I don't have a scope in my plugins currently.

@lokkiikkol

OK, got Cardinal. As I (mostly) suspected, the output on the 2nd channel is a very slow saw, which is of course not audible. Here are two screenshots:

Screenshot 2024-06-18 at 23 00 35 Screenshot 2024-06-18 at 23 00 19

Collaborator Author

dromer commented Jun 19, 2024

Thank you. Can you do the same for the first patch? It's much more minimal so likely easier to pinpoint what's going on.

Of course the issue could be either in ftom~ or in mtof~, but I think it's likely a bug in the NEON SIMD code for __hv_log2_f: https://github.com/Wasted-Audio/hvcc/blob/develop/hvcc/generators/ir2c/static/HvMath.h#L74-L101

Collaborator Author

dromer commented Jun 19, 2024

I just tried an SSE build (of the first patch) and I see a similar discrepancy.
The original signal is 220Hz, but the reproduced signal is about 216.2Hz
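
To put that discrepancy in terms of the logarithm, here is a minimal Python sketch. It assumes the standard equal-temperament conversions (MIDI note 69 = 440 Hz) and, as a simplification, that the whole error arises in ftom~ while mtof~ is exact; neither assumption has been checked against the hvcc sources here. The function and variable names are illustrative only.

```python
import math

A4_NOTE = 69.0   # MIDI note number of A4
A4_HZ = 440.0    # reference tuning (assumed)

def ftom(freq_hz):
    """Frequency in Hz to fractional MIDI note number."""
    return A4_NOTE + 12.0 * math.log2(freq_hz / A4_HZ)

def mtof(note):
    """Fractional MIDI note number to frequency in Hz."""
    return A4_HZ * 2.0 ** ((note - A4_NOTE) / 12.0)

original_hz = 220.0
reproduced_hz = 216.2  # value measured in the SSE build, as reported above

expected_note = ftom(original_hz)      # note an exact ftom~ would produce
implied_note = ftom(reproduced_hz)     # note that maps back to the measured Hz
note_error = implied_note - expected_note
log2_error = note_error / 12.0         # error attributed to the log2 step (assumption)

print(f"expected note: {expected_note:.3f}")
print(f"implied note:  {implied_note:.3f}")
print(f"note error:    {note_error:+.3f} semitones")
print(f"log2 error:    {log2_error:+.4f}")
```

Under these assumptions the 3.8 Hz gap corresponds to roughly 0.3 semitones (about 30 cents), which is a log2 error of about 0.025.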

@lokkiikkol

Yep, it is a bit hard to show on a screenshot, but it is clearly not the same frequency. At 220 Hz it is about 3 Hz off.

Screenshot 2024-06-19 at 11 37 27

I think that is only part of the problem though, I suspect the wrap~ object is also not performing right...let me check with a simple patch.

Collaborator Author

dromer commented Jun 19, 2024

Let's not conflate different issues in the same ticket. It is better to focus on one thing at a time.
Different issues will require different solutions etc.

If you see an issue with wrap~ please create a separate ticket with a specific (minimal) example.


lokkiikkol commented Jun 19, 2024

Sure. The frequency shift probably gets worse with wrap~ because the result from the ftom~ object is already off.
So it is likely not a problem with wrap~, but you can use this patch; it showcases the problem much more clearly. The frequencies are off by approx. 8-10 Hz.

#N canvas 827 239 527 327 12;
#X obj 233 164 sig~;
#X msg 212 121 220;
#X obj 170 72 loadbang;
#X obj 229 241 osc~;
#X obj 230 350 dac~;
#X obj 282 197 ftom~;
#X obj 303 289 mtof~;
#X obj 291 325 osc~;
#X obj 388 244 wrap~;
#X obj 321 252 -~;
#X connect 0 0 3 0;
#X connect 0 0 5 0;
#X connect 1 0 0 0;
#X connect 2 0 1 0;
#X connect 3 0 4 0;
#X connect 5 0 8 0;
#X connect 5 0 9 0;
#X connect 6 0 7 0;
#X connect 7 0 4 1;
#X connect 8 0 9 1;
#X connect 9 0 6 0;
Screenshot 2024-06-19 at 11 43 16
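
Why wrap~ amplifies the error can be sketched in Python. In the patch above the note is computed as ftom~ minus wrap~ of ftom~, which keeps only the integer part. The sketch assumes wrap~ returns the fractional part in [0, 1) and uses a hypothetical ftom~ result that lands 0.3 semitones low; the real error differs per build, so the numbers need not match the reported 8-10 Hz exactly.

```python
import math

def mtof(note):
    """Fractional MIDI note number to frequency in Hz (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69.0) / 12.0)

def wrap(x):
    """Fractional part in [0, 1); assumed behaviour of wrap~ for one sample."""
    return x - math.floor(x)

def through_wrap_patch(note):
    """Model of the patch: note - wrap(note) is the integer part, then mtof."""
    return mtof(note - wrap(note))

exact_note = 57.0        # what an exact ftom~ gives for 220 Hz
low_note = 57.0 - 0.3    # hypothetical ftom~ result that is slightly low

for note in (exact_note, low_note):
    direct = mtof(note)
    wrapped = through_wrap_patch(note)
    print(f"note {note:.2f}: direct {direct:.2f} Hz, via wrap {wrapped:.2f} Hz")
```

A note that falls just below an integer loses a whole semitone once the fractional part is subtracted, so a small ftom~ error turns into a deviation of roughly 12 Hz at 220 Hz in this model.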

Collaborator Author

dromer commented Jun 19, 2024

I think the issue is likely with the approximation used in the SIMD implementation. The precision can possibly be improved somewhat by tweaking the constants used.

I don't know what the best approach is and we will likely need some help to achieve this.
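
As an illustration of how constants affect such an approximation, here is a minimal Python sketch of one common family of fast log2 tricks: reinterpreting the float32 bit pattern as an integer. This is not the code in HvMath.h; the function names and the `offset` parameter are assumptions made for the example.

```python
import math
import struct

def float32_bits(x):
    """Bit pattern of x rounded to float32, as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def fast_log2(x, offset=127.0):
    """Approximate log2(x) as exponent + mantissa fraction, minus a tunable offset."""
    return float32_bits(x) / float(1 << 23) - offset

def max_abs_error(offset, steps=1024):
    """Worst absolute error against math.log2 over a few octaves."""
    worst = 0.0
    for exponent in range(-4, 5):
        for i in range(steps):
            x = (1.0 + i / steps) * 2.0 ** exponent
            err = abs(fast_log2(x, offset) - math.log2(x))
            worst = max(worst, err)
    return worst

plain = max_abs_error(127.0)          # exact at powers of two, worst in between
tuned = max_abs_error(127.0 - 0.043)  # error centred around zero instead

print(f"max error, offset 127.0:   {plain:.4f}")
print(f"max error, offset 126.957: {tuned:.4f}")
```

In this sketch shifting the offset roughly halves the worst-case error, at the cost of no longer being exact at powers of two. Adding a polynomial correction on the mantissa would reduce the error further, but which approach suits the SIMD code is an open question.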

@dromer dromer added the help wanted Extra attention is needed label Jun 19, 2024
@dromer dromer changed the title Some issue with ftom~ object Precision issues with __hv_log2_f SSE and NEON SIMD Jun 19, 2024

lokkiikkol commented Jun 19, 2024

this patch shows no frequency difference, but that can also mean that the mtof and mtof~ objects use the same calculations...

#N canvas 827 239 527 327 12;
#X obj 233 164 sig~;
#X obj 170 72 loadbang;
#X obj 231 327 osc~;
#X obj 232 436 dac~;
#X obj 304 285 mtof~;
#X obj 302 350 osc~;
#X obj 116 178 mtof;
#X msg 212 121 57;
#X obj 301 171 sig~;
#X connect 0 0 2 0;
#X connect 1 0 7 0;
#X connect 2 0 3 0;
#X connect 4 0 5 0;
#X connect 5 0 3 1;
#X connect 6 0 0 0;
#X connect 7 0 6 0;
#X connect 7 0 8 0;
#X connect 8 0 4 0;
Screenshot 2024-06-19 at 11 59 04

Collaborator Author

dromer commented Jun 19, 2024

You can see the patch implementations here: https://github.com/Wasted-Audio/hvcc/tree/develop/hvcc/interpreters/pd2hv/libs/pd

I don't think the issue is with mtof~ but with the SIMD code used in ftom~.


lokkiikkol commented Jun 19, 2024

This patch also shows no drift, meaning the problem has to be in the ftom~ object...

Patch
#N canvas 827 239 527 327 12;
#X obj 184 317 sig~;
#X obj 170 72 loadbang;
#X obj 183 357 osc~;
#X obj 232 436 dac~;
#X obj 304 285 mtof~;
#X obj 302 350 osc~;
#X msg 212 121 57;
#X obj 301 171 sig~;
#X msg 194 229 220;
#X connect 0 0 2 0;
#X connect 1 0 6 0;
#X connect 1 0 8 0;
#X connect 2 0 3 0;
#X connect 4 0 5 0;
#X connect 5 0 3 1;
#X connect 6 0 7 0;
#X connect 7 0 4 0;
#X connect 8 0 0 0;


lokkiikkol commented Jun 19, 2024

And this ftom~ replacement I built also shows the drift. Of course, it also uses hv.log~, so your assumption seems legit.

Patch
#N canvas 827 239 527 327 12;
#X obj 174 144 sig~;
#X obj 203 267 /~;
#X obj 180 208 hv.log~;
#X obj 240 238 log;
#X msg 250 211 2;
#X obj 185 237 *~ 12;
#X obj 202 294 +~ 57;
#X obj 184 179 /~ 220;
#X obj 214 74 loadbang;
#X obj 229 368 mtof~;
#X obj 244 429 osc~;
#X obj 170 425 osc~;
#X obj 197 494 dac~;
#X msg 174 118 220;
#X connect 0 0 7 0;
#X connect 0 0 11 0;
#X connect 1 0 6 0;
#X connect 2 0 5 0;
#X connect 3 0 1 1;
#X connect 4 0 3 0;
#X connect 5 0 1 0;
#X connect 6 0 9 0;
#X connect 7 0 2 0;
#X connect 8 0 4 0;
#X connect 8 0 13 0;
#X connect 9 0 10 0;
#X connect 10 0 12 1;
#X connect 11 0 12 0;
#X connect 13 0 0 0;
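
The replacement patch above computes 12 * log(f / 220) / log(2) + 57. A short Python sketch confirms that this is mathematically the same as the usual ftom formula, assuming hv.log~ is a natural logarithm like Pd's [log] (an assumption, not checked against the hvcc sources). The function names are illustrative.

```python
import math

def ftom(freq_hz):
    """Standard conversion: A4 = MIDI note 69 = 440 Hz."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)

def ftom_replacement(freq_hz):
    """Formula built in the patch: natural logs, referenced to 220 Hz = note 57."""
    return 12.0 * math.log(freq_hz / 220.0) / math.log(2.0) + 57.0

for freq_hz in (110.0, 220.0, 261.63, 440.0):
    a = ftom(freq_hz)
    b = ftom_replacement(freq_hz)
    print(f"{freq_hz:8.2f} Hz: ftom {a:.6f}, replacement {b:.6f}")
```

With an exact logarithm the 220 Hz test input gives log(1) = 0 and therefore exactly note 57. If the compiled log returned a nonzero value for an input of 1, that alone would be enough to produce a drift of the kind described, though this remains a hypothesis.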

@dromer dromer moved this to Todo in Core Improvements Jul 31, 2024
@dromer dromer added the simd Requires SIMD attention label Sep 1, 2024
@dromer dromer added this to the 1.0 milestone Sep 19, 2024