Use iterators in `create_border_luma`, `add_residue`, `predict_dcpred` #15

okaneco · 2023-11-01T02:01:40Z

Use iterators in places where indices are manually calculated because the compiler doesn't always optimize them. Iterators can remove extra bounds checks or enable other optimizations like memset/memcpy or vectorized mov instructions.
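As a rough illustration of the pattern (not code from this PR), here is a minimal sketch with two hypothetical helpers, `copy_row_indexed` and `copy_row_sliced`, that copy one row into a strided buffer. The names, signatures, and layout are assumptions made up for the example.

```rust
/// Hypothetical helper: copy `width` bytes of `src` into row `row` of a
/// strided buffer, computing every index by hand.
fn copy_row_indexed(dst: &mut [u8], src: &[u8], stride: usize, row: usize, width: usize) {
    for x in 0..width {
        // Both accesses are bounds-checked on each iteration unless the
        // compiler manages to prove they are in range.
        dst[row * stride + x] = src[x];
    }
}

/// Same result, but the destination range is sliced once up front.
fn copy_row_sliced(dst: &mut [u8], src: &[u8], stride: usize, row: usize, width: usize) {
    let start = row * stride;
    // One range check per slice; `copy_from_slice` is typically lowered to a
    // memcpy or a few vector moves.
    dst[start..start + width].copy_from_slice(&src[..width]);
}
```

Both functions write the same bytes for in-range arguments; whether the second one is actually faster depends on the surrounding code, so it is worth checking the generated assembly.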

Use branchless clamping in a loop to produce better vectorized code. This generates saturating truncation instructions instead of greater-than comparisons whose results are masked with and/andnot.
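A minimal sketch of what branchless clamping can look like, assuming a residue of `i32` values is added to `u8` pixels. The function names and signatures below are hypothetical and are not taken from the crate.

```rust
/// Hypothetical branchy version: explicit comparisons pick the result.
fn add_residue_branchy(pixels: &mut [u8], residue: &[i32]) {
    for (p, &r) in pixels.iter_mut().zip(residue) {
        // Assumes residue values are small enough that the sum fits in i32.
        let v = i32::from(*p) + r;
        *p = if v < 0 {
            0
        } else if v > 255 {
            255
        } else {
            v as u8
        };
    }
}

/// Hypothetical clamped version: `clamp` pins the sum to 0..=255, so the
/// cast to `u8` afterwards cannot lose information.
fn add_residue_clamped(pixels: &mut [u8], residue: &[i32]) {
    for (p, &r) in pixels.iter_mut().zip(residue) {
        *p = (i32::from(*p) + r).clamp(0, 255) as u8;
    }
}
```

The two versions compute the same values; the difference, if any, is in the code the compiler emits for the loop.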

I was working on personal code and came across similar patterns where I found indexing manually to be a fair amount slower than using iterators.

I was reading the RFC to get a better understanding of the prediction functions and saw some places that could be improved.

The main heuristic I used for finding and replacing indexing was whether it was complex enough (i.e., more than `for i in 0..arr.len() { arr[i] ... }`) and whether the indexing was being done horizontally as opposed to in vertical strides. I then plugged the code into Compiler Explorer to see if there were improvements, and tested it on the benchmark. I know that isn't the full picture because most of these functions have constants provided as arguments, but it was still a helpful barometer.
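To make the "horizontal indexing" case concrete, here is a small sketch that fills a square block inside a strided buffer, once with a two-level index calculation and once row by row. `fill_block_indexed` and `fill_block_rows` are hypothetical names for illustration only, and the sketch assumes the buffer length is a multiple of `stride` and the block lies fully inside it.

```rust
/// Hypothetical helper: fill a `size` x `size` block at (`x0`, `y0`) using
/// a manually computed index for every element.
fn fill_block_indexed(buf: &mut [u8], stride: usize, x0: usize, y0: usize, size: usize, value: u8) {
    for y in 0..size {
        for x in 0..size {
            buf[(y0 + y) * stride + x0 + x] = value;
        }
    }
}

/// Same block, walked one row at a time. Each row is sliced once and then
/// filled, which gives the compiler a chance to emit a memset per row.
fn fill_block_rows(buf: &mut [u8], stride: usize, x0: usize, y0: usize, size: usize, value: u8) {
    // Assumption: `buf.len()` is a multiple of `stride`; `chunks_exact_mut`
    // would silently skip a trailing partial row otherwise.
    for row in buf.chunks_exact_mut(stride).skip(y0).take(size) {
        row[x0..x0 + size].fill(value);
    }
}
```

Pasting both functions into Compiler Explorer is a quick way to compare the inner loops before running a benchmark.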

There are 3 commits which get a ~6% improvement on the image benches. I've tried to break them up into small, easy-to-review commits.

fintelia merged commit b71c697 into image-rs:main Dec 17, 2023
9 checks passed

okaneco deleted the use-iters0 branch December 17, 2023 18:59