Skip to content

[RISC-V] Reverse unordered FP comparisons in branches #115943

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

tomeksowi
Copy link
Contributor

Unordered floating-point comparisons are achieved by neg'ing the ordered counterparts. Avoid that by reversing both the FP comparison and the zero-comparison fused with the branch.

Also, minor codegen haircut.

Part of #84834, cc @dotnet/samsung

@tomeksowi tomeksowi marked this pull request as draft May 23, 2025 14:53
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 23, 2025
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 23, 2025
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@tomeksowi
Copy link
Contributor Author

No regressions.

Diffs are based on 94,874 contexts (52,265 MinOpts, 42,609 FullOpts).

MISSED contexts: 3 (0.00%)

Overall (-23,124 bytes)
Collection Base size (bytes) Diff size (bytes) PerfScore in Diffs
linux.riscv64.Checked.mch 111,279,548 -23,124 -0.03%
MinOpts (-604 bytes)
Collection Base size (bytes) Diff size (bytes) PerfScore in Diffs
linux.riscv64.Checked.mch 80,673,320 -604 -0.00%
FullOpts (-22,520 bytes)
Collection Base size (bytes) Diff size (bytes) PerfScore in Diffs
linux.riscv64.Checked.mch 30,606,228 -22,520 -0.05%
Example diffs
linux.riscv64.Checked.mch
-12 (-11.11%) : 86422.dasm - Microsoft.VisualBasic.CompilerServices.Operators:CompareSingle(float,float):int (FullOpts)
@@ -23,22 +23,19 @@ G_M44467_IG01:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
 						;; size=16 bbWeight=1 PerfScore 9.00
 G_M44467_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             feq.s          a0, fa0, fa1
-            xori           a0, a0, 1
             sext.w         t6, a0
-            beqz           t6, G_M44467_IG06
-						;; size=16 bbWeight=1 PerfScore 7.50
+            bnez           t6, G_M44467_IG06
+						;; size=12 bbWeight=1 PerfScore 7.00
 G_M44467_IG03:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             flt.s          a0, fa0, fa1
-            xori           a0, a0, 1
             sext.w         t6, a0
-            beqz           t6, G_M44467_IG05
+            bnez           t6, G_M44467_IG05
             flt.s          a0, fa1, fa0
-            xori           a0, a0, 1
             sext.w         t6, a0
-            beqz           t6, G_M44467_IG04
+            bnez           t6, G_M44467_IG04
             addi           a0, zero, 0xD1FFAB1E
             j              G_M44467_IG07
-						;; size=40 bbWeight=0.50 PerfScore 8.50
+						;; size=32 bbWeight=0.50 PerfScore 8.00
 G_M44467_IG04:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             addi           a0, zero, 0xD1FFAB1E
             j              G_M44467_IG07
@@ -56,7 +53,7 @@ G_M44467_IG07:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
             addi           sp, sp, 16
             ret						;; size=16 bbWeight=1 PerfScore 7.50
 
-; Total bytes of code 108, prolog size 16, PerfScore 34.75, instruction count 27, allocated bytes for code 108 (MethodHash=6af2524c) for method Microsoft.VisualBasic.CompilerServices.Operators:CompareSingle(float,float):int (FullOpts)
+; Total bytes of code 96, prolog size 16, PerfScore 33.75, instruction count 24, allocated bytes for code 96 (MethodHash=6af2524c) for method Microsoft.VisualBasic.CompilerServices.Operators:CompareSingle(float,float):int (FullOpts)
 ; ============================================================
 
 Unwind Info:
@@ -67,7 +64,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 27 (0x0001b) Actual length = 108 (0x00006c)
+  Function Length   : 24 (0x00018) Actual length = 96 (0x000060)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-12 (-11.11%) : 86423.dasm - Microsoft.VisualBasic.CompilerServices.Operators:CompareDouble(double,double):int (FullOpts)
@@ -23,22 +23,19 @@ G_M37116_IG01:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
 						;; size=16 bbWeight=1 PerfScore 9.00
 G_M37116_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             feq.d          a0, fa0, fa1
-            xori           a0, a0, 1
             sext.w         t6, a0
-            beqz           t6, G_M37116_IG06
-						;; size=16 bbWeight=1 PerfScore 7.50
+            bnez           t6, G_M37116_IG06
+						;; size=12 bbWeight=1 PerfScore 7.00
 G_M37116_IG03:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             flt.d          a0, fa0, fa1
-            xori           a0, a0, 1
             sext.w         t6, a0
-            beqz           t6, G_M37116_IG05
+            bnez           t6, G_M37116_IG05
             flt.d          a0, fa1, fa0
-            xori           a0, a0, 1
             sext.w         t6, a0
-            beqz           t6, G_M37116_IG04
+            bnez           t6, G_M37116_IG04
             addi           a0, zero, 0xD1FFAB1E
             j              G_M37116_IG07
-						;; size=40 bbWeight=0.50 PerfScore 8.50
+						;; size=32 bbWeight=0.50 PerfScore 8.00
 G_M37116_IG04:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             addi           a0, zero, 0xD1FFAB1E
             j              G_M37116_IG07
@@ -56,7 +53,7 @@ G_M37116_IG07:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
             addi           sp, sp, 16
             ret						;; size=16 bbWeight=1 PerfScore 7.50
 
-; Total bytes of code 108, prolog size 16, PerfScore 34.75, instruction count 27, allocated bytes for code 108 (MethodHash=2f356f03) for method Microsoft.VisualBasic.CompilerServices.Operators:CompareDouble(double,double):int (FullOpts)
+; Total bytes of code 96, prolog size 16, PerfScore 33.75, instruction count 24, allocated bytes for code 96 (MethodHash=2f356f03) for method Microsoft.VisualBasic.CompilerServices.Operators:CompareDouble(double,double):int (FullOpts)
 ; ============================================================
 
 Unwind Info:
@@ -67,7 +64,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 27 (0x0001b) Actual length = 108 (0x00006c)
+  Function Length   : 24 (0x00018) Actual length = 96 (0x000060)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
-16 (-10.26%) : 47378.dasm - BringUpTest_JTrueGtFP:JTrueGtFP(float):int (FullOpts)
@@ -25,31 +25,27 @@ G_M56740_IG02:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             lui            a1, 0xD1FFAB1E
             fmv.w.x        ft4, a1
             flt.s          a1, ft4, fa0
-            xori           a1, a1, 1
             sext.w         t6, a1
-            beqz           t6, G_M56740_IG06
-						;; size=28 bbWeight=1 PerfScore 9.50
+            bnez           t6, G_M56740_IG06
+						;; size=24 bbWeight=1 PerfScore 9.00
 G_M56740_IG03:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             fmv.w.x        ft4, zero
             flt.s          a1, ft4, fa0
-            xori           a1, a1, 1
             sext.w         t6, a1
-            beqz           t6, G_M56740_IG05
+            bnez           t6, G_M56740_IG05
             lui            a1, 0xD1FFAB1E
             fmv.w.x        ft4, a1
             flt.s          a1, ft4, fa0
-            xori           a1, a1, 1
             sext.w         t6, a1
-            beqz           t6, G_M56740_IG04
+            bnez           t6, G_M56740_IG04
             auipc          t6, 0xD1FFAB1E
             flw            ft4, 0xD1FFAB1E(t6)
             flt.s          a1, ft4, fa0
-            xori           a1, a1, 1
             sext.w         t6, a1
-            bnez           t6, G_M56740_IG07
+            beqz           t6, G_M56740_IG07
             addi           a0, zero, 0xD1FFAB1E
             j              G_M56740_IG07
-						;; size=76 bbWeight=0.50 PerfScore 14.50
+						;; size=64 bbWeight=0.50 PerfScore 13.75
 G_M56740_IG04:        ; bbWeight=0.50, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref
             addi           a0, zero, 0xD1FFAB1E
             j              G_M56740_IG07
@@ -69,7 +65,7 @@ G_M56740_IG07:        ; bbWeight=1, gcrefRegs=0000 {}, byrefRegs=0000 {}, byref,
 RWD00  	dd	FF7FFFFFh		; -3.40282e+38
 
 
-; Total bytes of code 156, prolog size 16, PerfScore 42.75, instruction count 38, allocated bytes for code 156 (MethodHash=4c11225b) for method BringUpTest_JTrueGtFP:JTrueGtFP(float):int (FullOpts)
+; Total bytes of code 140, prolog size 16, PerfScore 41.50, instruction count 34, allocated bytes for code 140 (MethodHash=4c11225b) for method BringUpTest_JTrueGtFP:JTrueGtFP(float):int (FullOpts)
 ; ============================================================
 
 Unwind Info:
@@ -80,7 +76,7 @@ Unwind Info:
   E bit             : 0
   X bit             : 0
   Vers              : 0
-  Function Length   : 39 (0x00027) Actual length = 156 (0x00009c)
+  Function Length   : 35 (0x00023) Actual length = 140 (0x00008c)
   ---- Epilog scopes ----
   ---- Scope 0
   Epilog Start Offset        : 3523193630 (0xd1ffab1e) Actual offset = 3523193630 (0xd1ffab1e) Offset from main function begin = 3523193630 (0xd1ffab1e)
+0 (0.00%) : 94768.dasm - Microsoft.Diagnostics.Tracing.ZippedETLWriter:PrepForWrite():System.Collections.Generic.List`1[System.String]:this (FullOpts)

No diffs found?

+0 (0.00%) : 94736.dasm - Microsoft.Diagnostics.Tracing.TraceEventNativeMethods:IsElevated():System.Nullable`1[ubyte] (FullOpts)

No diffs found?

+0 (0.00%) : 94480.dasm - Microsoft.Diagnostics.Tracing.EventPipeEventMetaDataHeader:.ctor(FastSerialization.PinnedStreamReader,int,int,int,int,int,System.String):this (FullOpts)

No diffs found?

Details

Size improvements/regressions per collection

Collection Contexts with diffs Improvements Regressions Same size Improvements (bytes) Regressions (bytes)
linux.riscv64.Checked.mch 11,769 698 0 11,071 -23,124 +0

PerfScore improvements/regressions per collection

Collection Contexts with diffs Improvements Regressions Same PerfScore Improvements (PerfScore) Regressions (PerfScore) PerfScore Overall in FullOpts
linux.riscv64.Checked.mch 11,769 684 0 11,085 -0.48% 0.00% -0.0076%

Context information

Collection Diffed contexts MinOpts FullOpts Missed, base Missed, diff
linux.riscv64.Checked.mch 94,874 52,265 42,609 3 (0.00%) 3 (0.00%)

jit-analyze output

@tomeksowi tomeksowi marked this pull request as ready for review May 26, 2025 07:28
@risc-vv
Copy link

risc-vv commented May 27, 2025

d609565 is being scheduled for building and testing

GIT: d6095657b1b72c093baf6b57a8b6f05a47ec5f56
REPO: dotnet/runtime
BRANCH: main

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-riscv Related to the RISC-V architecture area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants