Optimize `AWSSDKUtils.ToHex()` for speed and memory #3293

stevenaw · 2024-04-22T04:16:14Z

Description

Optimize the uppercase and lowercase versions of ToHex() across all runtimes, using the built-in option for uppercase on .NET 8. Speed and memory both improve, particularly for uppercase on .NET 8. There is still room for improvement in the other scenarios, but I decided to keep it simple and follow the existing bit-twiddling patterns I saw in UrlEncode in the same file.

Motivation and Context

Testing

Added unit tests for uppercase and lowercase prior to making the change. Ran them at the end to validate that the new implementation passes. Ran benchmarks against netcoreapp3.1 and net8.0.

Screenshots (if appropriate)

Benchmark Code

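  using System;
  using System.Buffers;
  using System.Globalization;
  using System.Text;
  using BenchmarkDotNet.Attributes;

  [MemoryDiagnoser]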
  public class ToHexString
  {
      // 11 = "Hello World" :)
      [Params(11, 128)]
      public int N { get; set; }

      // Benchmark both the lowercase and the uppercase code paths.
      [Params(true, false)]
      public bool Lowercase { get; set; }

      public byte[] Payload { get; set; }

      [GlobalSetup]
      public void Setup()
      {
          Payload = Encoding.UTF8.GetBytes(new string('A', N));
      }

      [Benchmark(Baseline = true)]
      public string Original()
      {
          var data = Payload;
          var lowercase = Lowercase;

          StringBuilder sb = new StringBuilder();

          for (int i = 0; i < data.Length; i++)
          {
              sb.Append(data[i].ToString(lowercase ? "x2" : "X2", CultureInfo.InvariantCulture));
          }

          return sb.ToString();
      }

      [Benchmark]
      public string Optimized()
      {
          var data = Payload;
          var lowercase = Lowercase;

#if NET8_0_OR_GREATER
          if (!lowercase)
          {
              return Convert.ToHexString(data);
          }
#endif

          char[] chars = ArrayPool<char>.Shared.Rent(data.Length * 2);

          try
          {
              Func<int, char> converter = lowercase ? (Func<int, char>)ToLowerHex : (Func<int, char>)ToUpperHex;

              for (int i = 0; i < data.Length; i++)
              {
                  // Break apart the byte into two four-bit components and
                  // then convert each into their hexadecimal equivalent.
                  byte b = data[i];
                  int hiNibble = b >> 4;
                  int loNibble = b & 0xF;

                  chars[i * 2] = converter(hiNibble);
                  chars[i * 2 + 1] = converter(loNibble);
              }

              return new string(chars, 0, data.Length * 2);
          }
          finally
          {
              ArrayPool<char>.Shared.Return(chars);
          }
      }
      private static char ToUpperHex(int value)
      {
          // Maps 0-9 to the Unicode range of '0' - '9' (0x30 - 0x39).
          if (value <= 9)
          {
              return (char)(value + '0');
          }
          // Maps 10-15 to the Unicode range of 'A' - 'F' (0x41 - 0x46).
          return (char)(value - 10 + 'A');
      }

      private static char ToLowerHex(int value)
      {
          // Maps 0-9 to the Unicode range of '0' - '9' (0x30 - 0x39).
          if (value <= 9)
          {
              return (char)(value + '0');
          }
          // Maps 10-15 to the Unicode range of 'a' - 'f' (0x61 - 0x66).
          return (char)(value - 10 + 'a');
      }
  }

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist

My code follows the code style of this project
My change requires a change to the documentation
I have updated the documentation accordingly
I have read the README document
I have added tests to cover my changes
All new and existing tests passed

License

I confirm that this pull request can be released under the Apache 2 license

danielmarbach · 2024-04-28T19:01:45Z

FYI, I also added a similar unit test in draft PR #3307, which optimizes the UrlEncode that you mention here.

There I also removed Netstandard from the unit test since that is not a valid target for unit tests.

danielmarbach

I just stumbled into this because I was tweaking UrlEncode. Great stuff 😍

danielmarbach · 2024-04-28T19:02:38Z

sdk/src/Core/Amazon.Util/AWSSDKUtils.cs

@@ -39,6 +39,7 @@
 #if NETSTANDARD
 using System.Net.Http;
 using System.Runtime.InteropServices;
+using System.Buffers;


This should probably move outside the #if

danielmarbach · 2024-04-28T19:04:16Z

sdk/src/Core/Amazon.Util/AWSSDKUtils.cs

+            {
+                Func<int, char> converter = lowercase ? (Func<int, char>)ToLowerHex : (Func<int, char>)ToUpperHex;
+
+                for (int i = 0; i < data.Length; i++)


Out of curiosity have you tried reversing the loop to elide bound checks?

Also could this be a candidate for string.Create since that would take care of the pooling already?

Thanks for the suggestions @danielmarbach ! Both were very helpful.
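For readers following the thread, here is a minimal sketch of what the two suggestions above could look like when combined. It is not the code that was merged: `HexSketch` and its `ToHex` are hypothetical names, and a null `data` is not handled. `string.Create` writes directly into the new string's buffer, so no pooled array is needed, and the loop runs from the last byte to the first. Whether the JIT actually drops bounds checks for this loop shape depends on the runtime version, so it is worth measuring rather than assuming.

```csharp
using System;

// Hypothetical illustration only; not the SDK's actual implementation.
public static class HexSketch
{
    public static string ToHex(byte[] data, bool lowercase)
    {
        // string.Create is available on .NET Core 2.1+ / .NET Standard 2.1.
        // The state tuple avoids capturing locals in the lambda.
        return string.Create(data.Length * 2, (data, lowercase), (chars, state) =>
        {
            byte[] bytes = state.data;
            // Offset that maps 10-15 onto 'a'-'f' or 'A'-'F'.
            int alphaOffset = (state.lowercase ? 'a' : 'A') - 10;

            // Walk backwards so the highest index is touched first.
            for (int i = bytes.Length - 1; i >= 0; i--)
            {
                int hi = bytes[i] >> 4;
                int lo = bytes[i] & 0xF;
                chars[i * 2 + 1] = (char)(lo < 10 ? lo + '0' : lo + alphaOffset);
                chars[i * 2] = (char)(hi < 10 ? hi + '0' : hi + alphaOffset);
            }
        });
    }
}
```

Calling `HexSketch.ToHex(new byte[] { 0x0A, 0xFF }, true)` yields `"0aff"`.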

stevenaw · 2024-05-07T02:04:04Z

sdk/src/Core/Amazon.Util/AWSSDKUtils.cs

+#if NETCOREAPP3_1_OR_GREATER
+            Func<int, char> converter = lowercase ? (Func<int, char>)ToLowerHex : (Func<int, char>)ToUpperHex;
+
+            return string.Create(data.Length * 2, (data, converter), (chars, state) =>


One note: I had to keep this lambda non-static in order to be compatible with <LangVersion>8</LangVersion> in the csproj.

This should be changed in my opinion. A reasonable version should be at least 9, even when targeting netstandard2.0. That is a good language subset that does not expose you to weird edge cases. That is also how the Azure .NET SDK treated it until they moved to an even later version, as far as I recall.

Raised #3316
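To illustrate the LangVersion point, here is a small sketch of a `static` lambda, a C# 9 feature. `StaticLambdaSketch` and `Nibble` are hypothetical names used only for this example. The `static` modifier makes accidental capture of locals or `this` a compile-time error; with LangVersion 8 the same lambda has to be written without the modifier.

```csharp
using System;

// Hypothetical illustration only; requires C# 9 or later for the static lambda.
public static class StaticLambdaSketch
{
    public static string ToUpperHex(byte[] data)
    {
        // All inputs travel through the state argument, so nothing is captured.
        return string.Create(data.Length * 2, data, static (chars, bytes) =>
        {
            for (int i = 0; i < bytes.Length; i++)
            {
                chars[i * 2] = Nibble(bytes[i] >> 4);
                chars[i * 2 + 1] = Nibble(bytes[i] & 0xF);
            }
        });
    }

    // Maps 0-9 to '0'-'9' and 10-15 to 'A'-'F'.
    private static char Nibble(int value) =>
        (char)(value < 10 ? value + '0' : value - 10 + 'A');
}
```

A static lambda may still call static methods such as `Nibble`; it only forbids capturing enclosing state.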

stevenaw · 2024-05-07T02:11:23Z

sdk/src/Core/Amazon.Util/AWSSDKUtils.cs

+                byte[] data = state.data;
+                Func<int, char> converter = state.converter;
+
+                for (int i = data.Length - 1; i >= 0; i--)


One note: I found two copies of the loops here were required to get the best performance as they each operate on different types. Trying to funnel both paths through a Span<byte> to consolidate the code ended up ~10% slower when consumed by .NET Framework.

I'm not sure whether any reviewers know of tricks to get around this without hitting the "slow span" paths there.

My experience is that even if you end up with slowdowns in the microbenchmarks, as long as they stay within a reasonable range, the massive allocation reductions mean the overall benefit of getting rid of the allocations, especially in a real system using some form of concurrency, will outweigh the small latency hits. It is not worth the complexity of maintaining different paths to accommodate the slow span path. This also takes into account that more modern .NET versions are becoming more and more mainstream, and people who are on .NET Framework know what they are not getting.

Thanks for the prompt feedback and advice here @danielmarbach . That makes perfect sense to me.
I've pushed another changeset. My laptop is on battery at the moment so benchmarking will be less reliable, but I can try to run and share them later tonight.

EDIT: It seems the csproj changes mean I now have conflicts. Looks like a rebase is ahead.
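As a rough picture of the consolidation discussed in this thread, the sketch below routes both the `string.Create` path and the pooled-array fallback through one span-based helper. This is an assumption about the general shape, not the changeset that was pushed: `SharedHexSketch` and `WriteHex` are hypothetical names, and on .NET Framework the `Span<T>` types would come from the System.Memory package.

```csharp
using System;
using System.Buffers;

// Hypothetical illustration only; not the SDK's actual implementation.
public static class SharedHexSketch
{
    public static string ToHex(byte[] data, bool lowercase)
    {
#if NETCOREAPP3_1_OR_GREATER
        // Modern runtimes: write straight into the new string's buffer.
        return string.Create(data.Length * 2, (data, lowercase), (chars, state) =>
            WriteHex(state.data, chars, state.lowercase));
#else
        // Older targets: rent a buffer, fill it, then copy it into a string.
        char[] rented = ArrayPool<char>.Shared.Rent(data.Length * 2);
        try
        {
            WriteHex(data, rented.AsSpan(0, data.Length * 2), lowercase);
            return new string(rented, 0, data.Length * 2);
        }
        finally
        {
            ArrayPool<char>.Shared.Return(rented);
        }
#endif
    }

    // Single conversion loop shared by both paths.
    private static void WriteHex(ReadOnlySpan<byte> source, Span<char> destination, bool lowercase)
    {
        int alphaOffset = (lowercase ? 'a' : 'A') - 10;
        for (int i = source.Length - 1; i >= 0; i--)
        {
            int hi = source[i] >> 4;
            int lo = source[i] & 0xF;
            destination[i * 2 + 1] = (char)(lo < 10 ? lo + '0' : lo + alphaOffset);
            destination[i * 2] = (char)(hi < 10 ? hi + '0' : hi + alphaOffset);
        }
    }
}
```

The trade-off is the one described above: a single loop is easier to maintain, at the cost of a possible slowdown on runtimes where `Span<T>` is not a JIT intrinsic.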

stevenaw · 2024-05-07T02:17:46Z

I've pushed a new commit to incorporate the first review feedback. I've rerun the benchmarks, now slightly modified to focus on 128 length. Eliding the bounds check seems to have saved around 20ns.

normj

Sorry for the delayed response to the PR, but impressive performance numbers! I have just a couple of comments, and I am discussing the LangVersion change with others on the team.

normj · 2024-05-24T23:28:43Z

sdk/test/NetStandard/UnitTests/Core/AWSSDKUtilsTests.cs

+namespace UnitTests.NetStandard.Core
+{
+    [Trait("Category", "Core")]
+    public class AWSSDKUtilsTests


Since the #else block in the ToHex method is really just for .NET Framework, can you copy the tests to the .NET Framework unit tests in the sdk\test\UnitTests\Custom folder? Hopefully as part of V4 we can do some test restructuring and cleanup so we don't have to copy tests between .NET Framework and .NET Standard+.

normj · 2024-05-24T23:33:26Z

sdk/src/Core/Amazon.Util/AWSSDKUtils.cs

-            hex = hex.Replace("-", string.Empty);
-            return hex;
-        }
+        public static string BytesToHexString(byte[] value) => ToHex(value, false);


Not sure why we had both ToHex and BytesToHexString. That comes with an old codebase. BytesToHexString is only called in AmazonS3ResponseHandler.cs. Since this is V4, I think we can just get rid of this method and update AmazonS3ResponseHandler.

stevenaw · 2024-05-26T13:34:07Z

Thanks for the initial feedback @normj . I've found an existing "sdk/test/UnitTests/Custom/Util/AWSSDKUtilsTests.cs" file to add the netfx-side tests to rather than creating a new file, but please let me know if you'd like it outside of that "Util" folder instead. I look forward to hearing your decision about the <LangVersion>.

As an aside:
In case anyone else were to hit this issue and search for it, I had some issues building AWSSDK.Extensions.CrtIntegration and some downstream dependants due to a path length issue on Windows. This was causing the file AWSSDK.Extensions.CrtIntegration.Netstandard.GeneratedMSBuildEditorConfig.editorconfig to not be found. This was resolved by enabling long paths in Windows: https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry#enable-long-paths-in-windows-10-version-1607-and-later

normj · 2024-05-28T18:50:27Z

@stevenaw Putting the tests in AWSSDKUtilsTests is fine. Let's go ahead with changing LangVersion to 9.0.

stevenaw · 2024-05-29T22:49:28Z

Thanks for the confirmation @normj and great to hear about LangVersion too. I think all the requested changes are now done on my end unless you wanted to have all the project files' LangVersion updated in this PR. Presently it's just AWSSDK.Core.NetStandard.csproj and its unit tests which target the newer version.

normj · 2024-05-30T23:44:37Z

@stevenaw Let's not expand your PR with generator changes for LangVersion. The change overall looks good to me. As soon as I can, I'll do a more thorough review.

normj · 2024-06-05T00:59:22Z

PR looks good. Getting another team member to do another pass before merging.

normj · 2024-06-05T03:15:28Z

Thanks for the perf PR!

stevenaw added 3 commits April 21, 2024 23:00

Add tests which currently pass

9f52e1a

Direct all to ToHex(), special-case impl for NET8

8c449be

Optimize ToHex

d21ca60

dscpinheiro added the v4 label Apr 22, 2024

ashovlin requested a review from normj April 23, 2024 17:33

danielmarbach reviewed Apr 28, 2024

string.create() and elide bounds checks

4f70127

stevenaw commented May 7, 2024

Reduce duplicate code paths and use LangVersion=9

5af6b58

danielmarbach mentioned this pull request May 9, 2024

V4 Development: Sensible LangVersion #3316

Closed

2 tasks

Merge branch 'v4-development' into optimize-tohex-functions

b621181

normj requested changes May 24, 2024

Remove "BytesToHexString" and add netfx test cases

3b64892

normj approved these changes Jun 5, 2024

normj requested a review from boblodgett June 5, 2024 00:57

boblodgett approved these changes Jun 5, 2024

normj merged commit 59f087d into aws:v4-development Jun 5, 2024

normj mentioned this pull request Jun 18, 2024

Change V4 .NET Framework target from 4.6.2 to 4.7.2 #3346

Merged

normj mentioned this pull request Jun 29, 2024

V4 Development Tracker #3362

Open
