Optimize CRC intrinsics for targets lacking the CRC extension #627

Merged

Conversation

Cuda-Chen
Collaborator

@Cuda-Chen Cuda-Chen commented Jan 16, 2024

Closes #624.

@jserv
Member

jserv commented Jan 17, 2024

The current _sse2neon_crc32_tbl uses 256 * 4 bytes, totaling 1 KiB. Is there a way to reduce its memory usage?
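One common way to shrink a byte-indexed CRC table is to process each byte as two 4-bit nibbles, which needs only 16 entries (64 bytes) instead of 256 (1 KiB), at the cost of two lookups per byte. The following is a minimal sketch of that idea for CRC32C (reflected polynomial 0x82F63B78), not necessarily the approach taken in this PR; the names `crc32c_nibble_tbl`, `crc32c_u8_nibble`, `crc32c_u8_bitwise`, and `crc32c_buf` are hypothetical and do not come from sse2neon.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-entry CRC32C table (reflected polynomial 0x82F63B78): 64 bytes total.
 * Entry i is the CRC state after shifting the 4-bit value i through the
 * bitwise algorithm. All names here are illustrative, not sse2neon's. */
static const uint32_t crc32c_nibble_tbl[16] = {
    0x00000000u, 0x105EC76Fu, 0x20BD8EDEu, 0x30E349B1u,
    0x417B1DBCu, 0x5125DAD3u, 0x61C69362u, 0x7198540Du,
    0x82F63B78u, 0x92A8FC17u, 0xA24BB5A6u, 0xB21572C9u,
    0xC38D26C4u, 0xD3D3E1ABu, 0xE330A81Au, 0xF36E6F75u,
};

/* Update the CRC with one byte using two nibble lookups. */
static uint32_t crc32c_u8_nibble(uint32_t crc, uint8_t v)
{
    crc ^= v;
    crc = (crc >> 4) ^ crc32c_nibble_tbl[crc & 0x0F]; /* low nibble */
    crc = (crc >> 4) ^ crc32c_nibble_tbl[crc & 0x0F]; /* high nibble */
    return crc;
}

/* Bit-at-a-time reference: no table at all, but eight steps per byte. */
static uint32_t crc32c_u8_bitwise(uint32_t crc, uint8_t v)
{
    crc ^= v;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

/* Conventional CRC32C over a buffer: initial value and final XOR of
 * 0xFFFFFFFF wrapped around the per-byte update. */
static uint32_t crc32c_buf(const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *) data;
    uint32_t crc = 0xFFFFFFFFu;
    while (len--)
        crc = crc32c_u8_nibble(crc, *p++);
    return ~crc;
}
```

The trade-off is memory against lookups: the bitwise variant needs no table, the nibble variant needs 64 bytes and two lookups per byte, and the 256-entry variant needs 1 KiB and one lookup per byte.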

@DLTcollab DLTcollab deleted a comment from Cuda-Chen Jan 17, 2024
@jserv
Member

jserv commented Jan 17, 2024

Reference implementations in the Linux kernel:

  • Accelerated CRC32(C) using ARM CRC, NEON and Crypto Extensions instructions
  • Accelerated CRC-T10DIF using ARM NEON and Crypto Extensions instructions (the CRC16 algorithm used for the T10 (SCSI) Data Integrity Field (DIF))

@jserv
Member

jserv commented Jan 22, 2024

@Cuda-Chen
Collaborator Author

Hi @jserv,
before converting this PR to a normal PR, I would like to ask whether it is necessary to implement runtime dispatching in this PR?

@Cuda-Chen Cuda-Chen force-pushed the optimize-CRC-for-targets-lacking-of-CRC branch from cbaaa4b to 66267b5 Compare January 30, 2024 13:16
@jserv
Member

jserv commented Jan 30, 2024

before converting this PR to a normal PR, I would like to ask whether it is necessary to implement runtime dispatching in this PR?

Compile-time options for diverse Arm extensions would be great.

@Cuda-Chen
Collaborator Author

before converting this PR to a normal PR, I would like to ask whether it is necessary to implement runtime dispatching in this PR?

Compile-time options for diverse Arm extensions would be great.

In my experience, I use -mcpu to select the instruction set and its features (e.g., crypto and crc).

Anyway, I think I will make this PR ready for merge.
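The compile-time selection discussed above can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: it relies on the ACLE feature macro `__ARM_FEATURE_CRC32`, which compilers define when the target selected by flags such as `-mcpu` or `-march=armv8-a+crc` includes the CRC extension, and on the `__crc32cb` intrinsic from `<arm_acle.h>`. The wrapper names `my_crc32c_u8` and `my_crc32c` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_CRC32)
/* CRC extension enabled at compile time: use the hardware instruction. */
#include <arm_acle.h>
static inline uint32_t my_crc32c_u8(uint32_t crc, uint8_t v)
{
    return __crc32cb(crc, v);
}
#else
/* No CRC extension: portable bitwise CRC32C fallback
 * (reflected polynomial 0x82F63B78). */
static inline uint32_t my_crc32c_u8(uint32_t crc, uint8_t v)
{
    crc ^= v;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}
#endif

/* Hypothetical convenience wrapper: conventional CRC32C of a buffer,
 * with initial value and final XOR of 0xFFFFFFFF. */
static inline uint32_t my_crc32c(const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *) data;
    uint32_t crc = 0xFFFFFFFFu;
    while (len--)
        crc = my_crc32c_u8(crc, *p++);
    return ~crc;
}
```

Because the choice is made by the preprocessor, the binary contains only one of the two paths. Runtime dispatching would instead have to ship both and probe the CPU at startup.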

@Cuda-Chen Cuda-Chen marked this pull request as ready for review January 30, 2024 13:48
@Cuda-Chen Cuda-Chen requested a review from marktwtn as a code owner January 30, 2024 13:48
@Cuda-Chen Cuda-Chen requested a review from jserv January 30, 2024 13:48
@jserv jserv merged commit 4a036e6 into DLTcollab:master Jan 30, 2024
16 checks passed
Successfully merging this pull request may close these issues.

Optimize CRC intrinsics for targets lacking the CRC extension