Micro-optimization in IsAsciiLetterOrDigit #121309

xtqqczze · 2025-11-03T18:06:40Z

Saves an instruction on XARCH:
lea vs mov, add

Example diff:

        mov      edx, edi
        movzx    rdx, word  ptr [r12+2*rdx]
-       mov      esi, edx
-       or       esi, 32
-       add      esi, -97
-       cmp      esi, 25
+       lea      esi, [rdx-0x30]
+       cmp      esi, 9
        setbe    sil
        movzx    rsi, sil
-       add      edx, -48
-       cmp      edx, 9
+       or       edx, 32
+       add      edx, -97
+       cmp      edx, 25
        setbe    dl
        movzx    rdx, dl
        or       edx, esi
-       je       G_M15724_IG11
+       je       SHORT G_M15724_IG11
        inc      edi
        cmp      edi, 8
        jge      G_M15724_IG11
        jmp      SHORT G_M15724_IG14
-						;; size=59 bbWeight=0.64 PerfScore 7.04
+						;; size=53 bbWeight=0.64 PerfScore 7.04
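
For readers who want the source-level picture: the thread does not show the C# change itself, so the following is only a sketch of what the diff appears to correspond to, namely swapping the order of the letter and digit range checks. The class and method names here are hypothetical and are not taken from the runtime source.

```csharp
using System;

// Hypothetical sketch; names are illustrative, not the actual runtime code.
static class AsciiRangeSketch
{
    // Digit test: subtracting '0' maps '0'..'9' to 0..9. Anything below '0'
    // wraps to a large unsigned value, so one unsigned compare is enough.
    public static bool IsDigit(char c) => unchecked((uint)(c - '0')) <= 9u;

    // Letter test: OR-ing with 0x20 folds 'A'..'Z' onto 'a'..'z', then the
    // same subtract-and-compare trick checks the 26-letter range.
    public static bool IsLetter(char c) => unchecked((uint)((c | 0x20) - 'a')) <= 25u;

    // Letter check first: assumed to match the "-" side of the diff
    // (mov/or/add/cmp 25, then add/cmp 9).
    public static bool LetterThenDigit(char c) => IsLetter(c) | IsDigit(c);

    // Digit check first: assumed to match the "+" side of the diff
    // (lea/cmp 9, then or/add/cmp 25).
    public static bool DigitThenLetter(char c) => IsDigit(c) | IsLetter(c);
}
```

Both orderings compute the same result for every `char`; under this reading, only the instruction selection for the first check differs, because the original value must stay live for the second check.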

xtqqczze · 2025-11-03T18:07:00Z

@MihuBot

tannergooding · 2025-11-03T18:30:31Z

> lea vs mov, add

This is potentially a de-optimization.

Not only are there often fewer LEA than ALU ports, but LEA is expected to be used for "addressing" and as such often gets specialized hardware support such as utilizing the AGU, participating in stack pointer tracking, fast store forwarding prediction, etc. There are also often special considerations of the "two operand" vs "three operand" LEA, with the latter being more restricted and more expensive.

Some newer hardware is more flexible and will allow LEA without scaled index and with only two sources from base, index, and displacement to be executed as an ALU operation instead, but this isn't a guarantee and may still break the other optimizations that are possible.

If this was beneficial to generally do, it's likely a general purpose optimization that should be done by the JIT (rather than a "one off" micro-optimization to a single method).

xtqqczze · 2025-11-03T18:54:44Z

@EgorBot -amd -intel

using System;
using System.Collections.Generic;
using System.Net;
using BenchmarkDotNet.Attributes;

public class IPv4_u16_Benchmarks
{
    public IEnumerable<string> Data() => [
        new string('A', 64),
        "HelloWorld1234567890"
    ];

    [Benchmark]
    [ArgumentsSource(nameof(Data))]
    public bool M(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            char ch = s[i];
            if (!char.IsAsciiLetterOrDigit(ch))
                return false;
        }
        return true;
    }
}

xtqqczze · 2025-11-04T23:16:25Z

> This is potentially a de-optimization.

@tannergooding Benchmarks show ratios of 0.99, 0.95, 1.05 for znver4, cascadelake and skylake respectively. So this is indeed a deoptimization on older processors (but an optimization on newer ones).

tannergooding · 2025-11-04T23:31:11Z

> Benchmarks show ratios of 0.99, 0.95, 1.05 for znver4, cascadelake and skylake respectively. So this is indeed a deoptimization on older processors (but an optimization on newer ones).

I would say the benchmark differences are likely too small (0-6ns) to give any kind of definitive result. They are going to be influenced by things like run-to-run differences in code alignment, BDN's measurement of the overhead of an "empty call", and even the latency of the hardware timer itself (typically around 10-15 cycles on such CPUs).
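
To get a feel for the timer cost mentioned above, here is a rough sketch that estimates the average cost of reading the timestamp on a given machine. It uses `Stopwatch.GetTimestamp()`, which wraps the OS high-resolution counter rather than the raw hardware instruction, so the result is an upper-bound approximation; the class and method names are hypothetical.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper; not part of BenchmarkDotNet or the runtime.
static class TimerOverheadSketch
{
    // Returns the average cost in nanoseconds of one Stopwatch.GetTimestamp()
    // call, measured over the given number of back-to-back calls.
    public static double AverageTimestampCostNs(int iterations)
    {
        long sink = 0;
        long start = Stopwatch.GetTimestamp();
        for (int i = 0; i < iterations; i++)
        {
            // Accumulate the result so the call cannot be optimized away.
            sink += Stopwatch.GetTimestamp();
        }
        long end = Stopwatch.GetTimestamp();
        GC.KeepAlive(sink);

        // Convert elapsed ticks to nanoseconds using the counter frequency.
        double totalNs = (end - start) * 1_000_000_000.0 / Stopwatch.Frequency;
        return totalNs / iterations;
    }
}
```

Calling `TimerOverheadSketch.AverageTimestampCostNs(1_000_000)` and comparing the result against a 0-6ns delta gives a sense of how much of a measured difference could plausibly be noise.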

The code in question is a small sample that differs by an elidable register-to-register mov instruction. It is a micro-optimization in every sense of the word, so it's not something we'd typically take without significantly more evidence showing it's worthwhile.

Beyond that, I still stand by the earlier point that if this was a desirable optimization, it isn't something we should be touching managed code to achieve. These types of subtle codegen differences are the type of thing that needs to be handled in the JIT instead. Doing so ensures that it's not just one function that benefits, but most functions that employ similar patterns. It is probably representative of some general-purpose transform that is missing and which might have broader impact for cases where it can actually impact the addressing mode of a load/store.

I'd recommend closing this PR and focusing any efforts for this or similar single-method micro-optimizations in the JIT instead so the impact can be more than a handful of nanoseconds.

Micro-optimization in IsAsciiLetterOrDigit

563d8f1

Saves an instruction on XARCH: `lea` vs `mov`, `add`

github-actions bot added the needs-area-label label (an area label is needed to ensure this gets routed to the appropriate area owners) Nov 3, 2025

dotnet-policy-service bot added the community-contribution label (indicates that the PR has been added by a community member) Nov 3, 2025

MihuBot mentioned this pull request Nov 3, 2025

[JitDiff X64] [xtqqczze] Micro-optimization in IsAsciiLetterOrDigit MihuBot/runtime-utils#1618

Open

EgorBot mentioned this pull request Nov 3, 2025

Benchmarks for #121309 (xtqqczze) EgorBot/runtime-utils#538

Open

This was referenced Nov 3, 2025

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

Failed to install runtime_python_requirements #114924

Open

xtqqczze closed this Nov 5, 2025

xtqqczze deleted the IsAsciiLetterOrDigit branch November 5, 2025 02:33
