POW Reference (Fallback) implementation wrong? #342

Open
notgiven688 opened this issue Sep 3, 2019 · 0 comments

Compare [1]

#define VARIANT2_SHUFFLE_ADD_SSE2(base_ptr, offset, reverse) \

and [2]

#define VARIANT2_SHUFFLE_ADD_NEON(base_ptr, offset, reverse) \

with the reference implementation here [3]:

#define VARIANT2_PORTABLE_SHUFFLE_ADD(base_ptr, offset, reverse) \

It looks to me as though [3] does not yield the same result as [1] and [2], since the offsets (0x10, 0x20, 0x30) are not interchanged correctly.

In my opinion, the correct reference implementation (for reverse=true) should read:

#define VARIANT2_PORTABLE_SHUFFLE_ADD(base_ptr, offset, reverse) \
  do if (variant >= 2) \
  { \
    uint64_t* chunk1 = U64((base_ptr) + ((offset) ^ 0x30)); \
    uint64_t* chunk2 = U64((base_ptr) + ((offset) ^ 0x20)); \
    uint64_t* chunk3 = U64((base_ptr) + ((offset) ^ 0x10)); \
    \
    const uint64_t chunk1_old[2] = { chunk1[0], chunk1[1] }; \
    \
    uint64_t b1[2]; \
    memcpy(b1, b + 16, 16); \
    chunk3[0] = chunk3[0] + b1[0]; \
    chunk3[1] = chunk3[1] + b1[1]; \
    \
    uint64_t a0[2]; \
    memcpy(a0, a, 16); \
    chunk1[0] = chunk2[0] + a0[0]; \
    chunk1[1] = chunk2[1] + a0[1]; \
    \
    uint64_t b0[2]; \
    memcpy(b0, b, 16); \
    chunk2[0] = chunk1_old[0] + b0[0]; \
    chunk2[1] = chunk1_old[1] + b0[1]; \
  } while (0)
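
To make it easier to compare this against [1] and [2], here is a minimal sketch of the same reverse=true logic written as a plain function, plus a small self-check on one 64-byte block. The names shuffle_add_reverse and check_shuffle_add_reverse are hypothetical and not part of the project; a and b are passed in explicitly instead of being taken from the enclosing scope, memcpy stands in for the U64 pointer cast, and the variant >= 2 guard is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Function form of the proposed macro body for reverse=true.
 * Hypothetical helper: a points to 16 bytes, b points to 32 bytes. */
static void shuffle_add_reverse(uint8_t *base_ptr, size_t offset,
                                const uint8_t *a, const uint8_t *b)
{
    uint8_t *p1 = base_ptr + (offset ^ 0x30); /* chunk1 in the macro */
    uint8_t *p2 = base_ptr + (offset ^ 0x20); /* chunk2 in the macro */
    uint8_t *p3 = base_ptr + (offset ^ 0x10); /* chunk3 in the macro */
    uint64_t c1[2], c2[2], c3[2], a0[2], b0[2], b1[2], r[2];

    /* Load everything before storing, so no store can clobber an input. */
    memcpy(c1, p1, 16);
    memcpy(c2, p2, 16);
    memcpy(c3, p3, 16);
    memcpy(a0, a, 16);
    memcpy(b0, b, 16);
    memcpy(b1, b + 16, 16);

    /* offset ^ 0x10 receives its own old value plus b1 */
    r[0] = c3[0] + b1[0]; r[1] = c3[1] + b1[1];
    memcpy(p3, r, 16);
    /* offset ^ 0x30 receives the old offset ^ 0x20 value plus a */
    r[0] = c2[0] + a0[0]; r[1] = c2[1] + a0[1];
    memcpy(p1, r, 16);
    /* offset ^ 0x20 receives the old offset ^ 0x30 value plus b0 */
    r[0] = c1[0] + b0[0]; r[1] = c1[1] + b0[1];
    memcpy(p2, r, 16);
}

/* Runs the shuffle on one 64-byte block with offset 0 and made-up
 * test values; returns 1 if the result matches the mapping above. */
static int check_shuffle_add_reverse(void)
{
    uint64_t in[8], out[8];
    uint64_t av[2] = { 100, 200 };
    uint64_t bv[4] = { 1, 2, 3, 4 };
    uint8_t mem[64], a[16], b[32];
    int i;

    for (i = 0; i < 8; i++)
        in[i] = 1000u * (uint64_t)(i + 1);
    memcpy(mem, in, 64);
    memcpy(a, av, 16);
    memcpy(b, bv, 32);

    shuffle_add_reverse(mem, 0, a, b);
    memcpy(out, mem, 64);

    return out[0] == in[0] && out[1] == in[1]                  /* 0x00 untouched */
        && out[2] == in[2] + bv[2] && out[3] == in[3] + bv[3]  /* 0x10 */
        && out[4] == in[6] + bv[0] && out[5] == in[7] + bv[1]  /* 0x20 */
        && out[6] == in[4] + av[0] && out[7] == in[5] + av[1]; /* 0x30 */
}
```

Running the current macro from [3] (or the SSE2/NEON paths) over the same input block and comparing the 64 output bytes should show directly whether the implementations agree.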