Cherry-pick: Harden bound checking tests of AES-XTS and replace SSE instructions that degraded performance for certain input lengths #2319

Open
wants to merge 3 commits into
base: fips-2024-09-27

Conversation

nebeid
Contributor

@nebeid nebeid commented Apr 8, 2025

Original PRs: #2286 and #2140
Original commits: a39439b and 37c2b5e

Description of changes:

This is a follow-up to #2228 where an out-of-bound (OOB) read was fixed in the AVX512 implementation of AES-XTS and more tests were added.
This cherry-picks:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

nebeid and others added 2 commits April 8, 2025 11:45
…ts. (aws#2286)

This change hardens the tests introduced in aws#2227 ("Fix out-of-bound (OOB)
input read in AES-XTS Decrypt in AVX-512 implementation").
It adds a memory page, protected against read and write, immediately
before the input and output buffers. Any access before the start of a
buffer (an under-read, for example) then causes a segfault.

The code suspected of potentially causing a "pre-bound" OOB is the
cipher-stealing section in encrypt

[crypto/fipsmodule/aes/asm/aesni-xts-avx512.pl#L1809-L1810](https://github.com/aws/aws-lc/blob/v1.48.5/crypto/fipsmodule/aes/asm/aesni-xts-avx512.pl#L1809-L1810)
and decrypt

[crypto/fipsmodule/aes/asm/aesni-xts-avx512.pl#L2572-L2573](https://github.com/aws/aws-lc/blob/v1.48.5/crypto/fipsmodule/aes/asm/aesni-xts-avx512.pl#L2572-L2573).

The efficacy of the added test was shown by changing the decrypt
cipher-stealing code for example to:
```diff
--- a/crypto/fipsmodule/aes/asm/aesni-xts-avx512.pl
+++ b/crypto/fipsmodule/aes/asm/aesni-xts-avx512.pl
@@ -2569,7 +2569,7 @@ ___
   vpshufb       %xmm10,%xmm8,%xmm8

-  vmovdqu       -0x10($input,$length,1),%xmm3
+  vmovdqu       -0x12($input,$length,1),%xmm3
   vmovdqu       %xmm8,-0x10($output,$length,1)
 ```
With this change, a segmentation fault occurs in the test vector with an input length of 17 bytes (1 AES block + 1 byte), which is the smallest test vector that requires cipher stealing. At the changed line:
- `$input` points at byte 16, i.e. past the first block
- `$length` = 1, after [l.2429](https://github.com/aws/aws-lc/blob/v1.48.5/crypto/fipsmodule/aes/asm/aesni-xts-avx512.pl#L2429)
- the read address with the diff change is `$input + $length - 18` = `$input - 17`, which points at byte "-1", i.e. the byte right before byte 0 of the input. This under-read causes a segfault at this vector.
- Other larger changes, e.g. -0x20, will have the same result.
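
The address arithmetic in the list above can be checked with a small C sketch. The helper names `first_byte_read` and `load_in_bounds` are hypothetical and only model the effective address of `vmovdqu disp($input,$length,1),%xmm3` relative to byte 0 of the input; they are not part of the assembly or the tests.

```c
#include <assert.h>
#include <stddef.h>

/* Offset, relative to byte 0 of the input, of the first byte touched by
 * `vmovdqu disp($input,$length,1),%xmm3`.
 *   input_pos: how far $input has advanced from the start of the buffer
 *   tail_len:  value of $length, the bytes left after the last full block
 *   disp:      displacement encoded in the instruction, e.g. -0x10 */
static ptrdiff_t first_byte_read(ptrdiff_t input_pos, ptrdiff_t tail_len,
                                 ptrdiff_t disp) {
  assert(input_pos >= 0 && tail_len >= 0);
  return input_pos + tail_len + disp;
}

/* A 16-byte load starting at `first` is in bounds only if it stays
 * inside [0, total_len). */
static int load_in_bounds(ptrdiff_t first, ptrdiff_t total_len) {
  return first >= 0 && first + 16 <= total_len;
}
```

For the 17-byte vector (`input_pos` = 16, `tail_len` = 1), the original displacement `-0x10` gives a first byte of 1, so the load covers bytes 1 to 16 and stays in bounds. The modified `-0x12` gives -1, and `-0x20` gives -15; both start before byte 0 and would hit the protected page.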

Another test changes the location of the written output:
```diff
@@ -2607,7 +2607,7 @@ ___

   .L_done_${rndsuffix}:
   # store last ciphertext value
-  vmovdqu       %xmm8,-0x10($output)
+  vmovdqu       %xmm8,-0x11($output)
 ___
   }
 ```
- This test caused a segfault with the smallest input of 1 block (16 bytes).

Similar tests in the encrypt path gave the same result of segfaulting
when trying to read before the input beginning.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.

(cherry picked from commit a39439b)
The XTS encrypt AVX512 implementation showed bimodal performance: we observed a drop of more than 80%. The cause is mixing SSE and AVX instructions in the AVX512 implementation. For a subset of input lengths, the code path contained a single move, `movdqa`, which is an SSE instruction. Use `vmovdqa` instead.

(cherry picked from commit 37c2b5e)
@nebeid nebeid requested a review from a team as a code owner April 8, 2025 15:57
@codecov-commenter

codecov-commenter commented Apr 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.60%. Comparing base (d1ab658) to head (01653f2).

Additional details and impacted files
@@                 Coverage Diff                 @@
##           fips-2024-09-27    #2319      +/-   ##
===================================================
- Coverage            78.60%   78.60%   -0.01%     
===================================================
  Files                  585      585              
  Lines               100601   100608       +7     
  Branches             14259    14262       +3     
===================================================
+ Hits                 79076    79079       +3     
- Misses               20890    20892       +2     
- Partials               635      637       +2     

@nebeid nebeid requested a review from justsmth April 8, 2025 22:58
### Description of changes:
BoringSSL now requires C++17 after
google/boringssl@9ff8491.
When we build speed.cc with BoringSSL's headers, we need to configure
our build to use C++17. This is blocking all of our CI with failures
like
[this](https://us-west-2.codebuild.aws.amazon.com/project/eyJlbmNyeXB0ZWREYXRhIjoiWk5IUWJGRGxBcTJ2Mkp4WGF3dnBwYjc5V0ZZYSt5SVVGbkwvODkydTNTaVQ2V2FMN3hwa0tjSWNFemw2QWtCWW5welFWV3lpRFpKVitwejgvelhpRWh3NDNqcWhKalpPYW9hL2tLMDlJSDFPT1NkNyIsIml2UGFyYW1ldGVyU3BlYyI6Ilp4VXRNQXFGM1BZYVlaRkIiLCJtYXRlcmlhbFNldFNlcmlhbCI6MX0%3D/build/73b9a032-23e0-493c-b4f2-08fce7df7798).

### Call-outs:
This does not change AWS-LC's normal C++11 requirement.

### Testing:
The CI will build this.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.

(cherry picked from commit 04a0f10)