
feat: simplify operations with sign #336

Draft: wants to merge 4 commits into main
Conversation

@avik-pal (Collaborator) commented Feb 11, 2025

Trying to simplify the following module:

module @reactant_gradient attributes {mhlo.num_partitions = 1 : i64, mhlo.num_replicas = 1 : i64} {
  func.func @main(%arg0: tensor<8x4xf32>, %arg1: tensor<16xf32>) -> tensor<8x4xf32> {
    %cst = stablehlo.constant dense<0.000000e+00> : tensor<f32>
    %cst_0 = stablehlo.constant dense<0.000000e+00> : tensor<32x4x8xf32>
    %0 = stablehlo.broadcast_in_dim %arg0, dims = [2, 1] : (tensor<8x4xf32>) -> tensor<16x4x8xf32>
    %1 = stablehlo.broadcast_in_dim %arg1, dims = [0] : (tensor<16xf32>) -> tensor<16x4x8xf32>
    %2 = stablehlo.multiply %0, %1 : tensor<16x4x8xf32>
    %3 = stablehlo.sine %2 : tensor<16x4x8xf32>
    %4 = stablehlo.cosine %2 : tensor<16x4x8xf32>
    %5 = stablehlo.concatenate %3, %4, dim = 0 : (tensor<16x4x8xf32>, tensor<16x4x8xf32>) -> tensor<32x4x8xf32>
    %6 = stablehlo.abs %5 : tensor<32x4x8xf32>
    %7 = stablehlo.add %6, %6 : tensor<32x4x8xf32>
    %8 = stablehlo.compare  GE, %5, %cst_0 : (tensor<32x4x8xf32>, tensor<32x4x8xf32>) -> tensor<32x4x8xi1>
    %9 = stablehlo.negate %7 : tensor<32x4x8xf32>
    %10 = stablehlo.select %8, %7, %9 : tensor<32x4x8xi1>, tensor<32x4x8xf32>
    %11 = stablehlo.slice %10 [0:16, 0:4, 0:8] : (tensor<32x4x8xf32>) -> tensor<16x4x8xf32>
    %12 = stablehlo.slice %10 [16:32, 0:4, 0:8] : (tensor<32x4x8xf32>) -> tensor<16x4x8xf32>
    %13 = stablehlo.negate %3 : tensor<16x4x8xf32>
    %14 = stablehlo.multiply %12, %13 : tensor<16x4x8xf32>
    %15 = stablehlo.multiply %11, %4 : tensor<16x4x8xf32>
    %16 = stablehlo.add %14, %15 : tensor<16x4x8xf32>
    %17 = stablehlo.multiply %16, %1 : tensor<16x4x8xf32>
    %18 = stablehlo.reduce(%17 init: %cst) applies stablehlo.add across dimensions = [0] : (tensor<16x4x8xf32>, tensor<f32>) -> tensor<4x8xf32>
    %19 = stablehlo.reshape %18 : (tensor<4x8xf32>) -> tensor<8x4xf32>
    return %19 : tensor<8x4xf32>
  }
}
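To see why the abs / compare / select cluster above is redundant, here is a quick numeric sketch (hypothetical values, not part of the PR): for any finite x, select(x >= 0, |x| + |x|, -(|x| + |x|)) is just x + x.

```python
import numpy as np

# Hypothetical numeric check: the abs / compare / select cluster
# (%6, %7, %8, %9, %10 in the module above) reduces to a plain doubling.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 4, 8)).astype(np.float32)

doubled_abs = np.abs(x) + np.abs(x)                   # %6 (abs), %7 (add)
folded = np.where(x >= 0, doubled_abs, -doubled_abs)  # %8 (compare), %9 (neg), %10 (select)

assert np.allclose(folded, x + x)
```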

Most of the internal operations are actually no-ops (this comes from a chunk inside a transformer, so it is a fairly common pattern). After these passes the module becomes:

module @reactant_gradient attributes {mhlo.num_partitions = 1 : i64, mhlo.num_replicas = 1 : i64} {
  func.func @main(%arg0: tensor<8x4xf32>, %arg1: tensor<16xf32>) -> tensor<8x4xf32> {
    %cst = stablehlo.constant dense<0.000000e+00> : tensor<f32>
    %0 = stablehlo.broadcast_in_dim %arg0, dims = [2, 1] : (tensor<8x4xf32>) -> tensor<16x4x8xf32>
    %1 = stablehlo.broadcast_in_dim %arg1, dims = [0] : (tensor<16xf32>) -> tensor<16x4x8xf32>
    %2 = stablehlo.multiply %0, %1 : tensor<16x4x8xf32>
    %3 = stablehlo.sine %2 : tensor<16x4x8xf32>
    %4 = stablehlo.cosine %2 : tensor<16x4x8xf32>
    %5 = stablehlo.concatenate %3, %4, dim = 0 : (tensor<16x4x8xf32>, tensor<16x4x8xf32>) -> tensor<32x4x8xf32>
    %6 = stablehlo.multiply %5, %5 : tensor<32x4x8xf32>
    %7 = stablehlo.negate %6 : tensor<32x4x8xf32>
    %8 = stablehlo.slice %7 [0:16, 0:4, 0:8] : (tensor<32x4x8xf32>) -> tensor<16x4x8xf32>
    %9 = stablehlo.slice %7 [16:32, 0:4, 0:8] : (tensor<32x4x8xf32>) -> tensor<16x4x8xf32>
    %10 = stablehlo.multiply %9, %3 : tensor<16x4x8xf32>
    %11 = stablehlo.negate %10 : tensor<16x4x8xf32>
    %12 = stablehlo.multiply %8, %4 : tensor<16x4x8xf32>
    %13 = stablehlo.add %11, %12 : tensor<16x4x8xf32>
    %14 = stablehlo.multiply %13, %1 : tensor<16x4x8xf32>
    %15 = stablehlo.reduce(%14 init: %cst) applies stablehlo.add across dimensions = [0] : (tensor<16x4x8xf32>, tensor<f32>) -> tensor<4x8xf32>
    %16 = stablehlo.reshape %15 : (tensor<4x8xf32>) -> tensor<8x4xf32>
    return %16 : tensor<8x4xf32>
  }
}

@avik-pal (Collaborator, Author)

Some of these need to be qualified with no_nan:

// (select (x > 0) z (neg z)) -> (mul (sign x) z)
// (select (x >= 0) z (neg z)) -> (mul (sign x) z)
// (select (x > 0) (neg z) z) -> (mul (sign x) (neg z))
// (select (x >= 0) (neg z) z) -> (mul (sign x) (neg z))
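A hedged numeric sketch (hypothetical, not from the PR) of why these rewrites need qualification: on nonzero finite inputs the select and sign forms agree, but at x == 0 they diverge, since select(x >= 0, z, -z) yields z while sign(0) * z yields 0 (and NaN inputs diverge similarly).

```python
import numpy as np

# Check the select -> (mul (sign x) ...) rewrites on nonzero finite inputs.
# The x == 0 and NaN cases are excluded: there the two forms disagree.
rng = np.random.default_rng(0)
x = rng.standard_normal(1000).astype(np.float32)
z = rng.standard_normal(1000).astype(np.float32)
x = np.where(x == 0, np.float32(1.0), x)  # sidestep the x == 0 corner case

lhs = np.where(x >= 0, z, -z)       # (select (x >= 0) z (neg z))
rhs = np.sign(x) * z                # (mul (sign x) z)
assert np.allclose(lhs, rhs)

lhs_neg = np.where(x >= 0, -z, z)   # (select (x >= 0) (neg z) z)
rhs_neg = np.sign(x) * -z           # (mul (sign x) (neg z))
assert np.allclose(lhs_neg, rhs_neg)
```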
Member

Is this actually simpler/faster?

Collaborator (Author)

This one mostly enables the other optimizations.

Member

A select is usually faster than a mul, so this rewrite in isolation makes things locally worse. Is it feasible for the downstream operations to work on the select-style forms instead?


// (mul (sign x) (abs x)) -> x
// (mul (abs x) (sign x)) -> x
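Unlike the select rewrites, this fold needs no qualifier for zero: sign(x) * |x| == x holds for every finite x, including x == 0. A hypothetical numpy sketch (not part of the PR):

```python
import numpy as np

# sign(x) * |x| == x (and the commuted form) for all finite x,
# including both signed zeros; only NaN inputs would need a no_nan guard.
x = np.array([-2.5, -0.0, 0.0, 3.0, 1e-30], dtype=np.float32)

assert np.array_equal(np.sign(x) * np.abs(x), x)  # (mul (sign x) (abs x)) -> x
assert np.array_equal(np.abs(x) * np.sign(x), x)  # (mul (abs x) (sign x)) -> x
```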
Member

This seems reasonable; can you split this out so it can land individually?


// (mul (neg x) (neg y)) -> (mul x y)
// (mul (neg x) y) -> (neg (mul x y))
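Both negation identities are exact under IEEE-754 float multiplication (negation only flips the sign bit, and the product's sign is the XOR of the operand signs), so a bitwise-equality check passes. A hypothetical sketch, not from the PR:

```python
import numpy as np

# Exact (not just approximate) equality of the negation-propagation rewrites:
# (-x) * (-y) == x * y  and  (-x) * y == -(x * y), element-wise.
rng = np.random.default_rng(0)
x = rng.standard_normal(100).astype(np.float32)
y = rng.standard_normal(100).astype(np.float32)

assert np.array_equal((-x) * (-y), x * y)   # (mul (neg x) (neg y)) -> (mul x y)
assert np.array_equal((-x) * y, -(x * y))   # (mul (neg x) y) -> (neg (mul x y))
```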
Member

The top one is always good. The single-negation variants we should separate out, since there's a separate question of whether we want to propagate negations up or down (e.g. given (mul (neg x) constant) we'd want to produce (mul x (neg constant))).


// This pattern only partially implements the following. We rely on transforming
// the op into a form that the patterns above can simplify further.
// (mul (sign x) (add (abs x) (abs x))) -> (mul x x)
Member

Longer term, I feel like this merits a broader sign analysis (alongside, perhaps, a transpose analysis).
