Slicing a shape with a negative index sometimes does not work #206

Open · pranavm-nvidia opened this issue Sep 16, 2024 · 2 comments

@pranavm-nvidia (Collaborator)

If we use the output of slicing a shape with a negative index in certain operations, we get a failure:

>>> a = tp.Tensor([1, 2, 3])
>>> a.shape
shape([3], dtype=int32, loc=gpu:0, shape=[1])

>>> tp.ones([a.shape[0]])
tensor([1.0000, 1.0000, 1.0000], dtype=float32, loc=gpu:0, shape=[3])

>>> tp.ones([a.shape[-1]])

MTRTException: InternalError: failed to run compilation on module with symbol name: outs_t439_19

Additional context:
Traceback (most recent call last):
  File "/tripy/tripy/backend/mlir/compiler.py", line 102, in compile
    executable = compiler.compiler_stablehlo_to_executable(
mlir_tensorrt.runtime._mlir_libs._api.MTRTException: InternalError: failed to run compilation on module with symbol name: outs_t439_19
.
    error: StableHLO dynamic shape canonicalization failed to converge within 4 iterations
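
A possible workaround while this is open (a sketch only; shape_dim is a hypothetical helper, not part of the tripy API, and the rank is assumed to be known on the Python side) is to normalize the negative index in Python so the shape is indexed with a plain non-negative constant, which is the case shown above to work:

def shape_dim(tensor, index, rank):
    # Hypothetical helper: 'rank' is assumed to be known in Python. Converting
    # the index here keeps it a constant, so no tensor arithmetic on the
    # shape is traced into the graph.
    if index < 0:
        index += rank
    return tensor.shape[index]

a = tp.Tensor([1, 2, 3])
tp.ones([shape_dim(a, -1, rank=1)])  # behaves like tp.ones([a.shape[0]])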
@slyubomirsky (Collaborator)

MLIR dump from this example:

module @outs_t39_2 {
  func.func @main() -> tensor<?xf32> {
    %c = stablehlo.constant dense<[1, 2, 3]> : tensor<3xi32>
    %c_0 = stablehlo.constant dense<3> : tensor<i32>
    %c_1 = stablehlo.constant dense<1> : tensor<1xi32>
    %c_2 = stablehlo.constant dense<3> : tensor<1xi32>
    %c_3 = stablehlo.constant dense<1> : tensor<i32>
    %c_4 = stablehlo.constant dense<1> : tensor<1xi32>
    %c_5 = stablehlo.constant dense<0> : tensor<i32>
    %c_6 = stablehlo.constant dense<1> : tensor<i32>
    %c_7 = stablehlo.constant dense<0> : tensor<1xi32>
    %c_8 = stablehlo.constant dense<1> : tensor<1xi32>
    %0 = stablehlo.compare  LE, %c_7, %c_8 : (tensor<1xi32>, tensor<1xi32>) -> tensor<1xi1>
    %1 = stablehlo.select %0, %c_7, %c_8 : tensor<1xi1>, tensor<1xi32>
    %c_9 = stablehlo.constant dense<1> : tensor<1xi32>
    %2 = stablehlo.real_dynamic_slice %c_4, %1, %c_8, %c_9 : (tensor<1xi32>, tensor<1xi32>, tensor<1xi32>, tensor<1xi32>) -> tensor<?xi32>
    %c_10 = stablehlo.constant dense<> : tensor<0xi32>
    %3 = stablehlo.dynamic_reshape %2, %c_10 : (tensor<?xi32>, tensor<0xi32>) -> tensor<i32>
    %c_11 = stablehlo.constant dense<-1> : tensor<i32>
    %c_12 = stablehlo.constant dense<> : tensor<0xi32>
    %4 = stablehlo.compare  EQ, %c_12, %c_10 : (tensor<0xi32>, tensor<0xi32>) -> tensor<0xi1>
    %5 = stablehlo.select %4, %c_12, %c_12 : tensor<0xi1>, tensor<0xi32>
    %6 = stablehlo.dynamic_broadcast_in_dim %3, %5, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %7 = stablehlo.dynamic_broadcast_in_dim %c_11, %5, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %8 = stablehlo.add %6, %7 : tensor<i32>
    %c_13 = stablehlo.constant dense<0> : tensor<1xi32>
    %c_14 = stablehlo.constant dense<1> : tensor<1xi32>
    %9 = stablehlo.compare  LE, %c_13, %c_14 : (tensor<1xi32>, tensor<1xi32>) -> tensor<1xi1>
    %10 = stablehlo.select %9, %c_13, %c_14 : tensor<1xi1>, tensor<1xi32>
    %c_15 = stablehlo.constant dense<1> : tensor<1xi32>
    %11 = stablehlo.real_dynamic_slice %c_4, %10, %c_14, %c_15 : (tensor<1xi32>, tensor<1xi32>, tensor<1xi32>, tensor<1xi32>) -> tensor<?xi32>
    %12 = stablehlo.dynamic_reshape %11, %c_10 : (tensor<?xi32>, tensor<0xi32>) -> tensor<i32>
    %13 = stablehlo.compare  EQ, %c_12, %c_10 : (tensor<0xi32>, tensor<0xi32>) -> tensor<0xi1>
    %14 = stablehlo.select %13, %c_12, %c_12 : tensor<0xi1>, tensor<0xi32>
    %15 = stablehlo.dynamic_broadcast_in_dim %12, %14, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %16 = stablehlo.dynamic_broadcast_in_dim %c_11, %14, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %17 = stablehlo.add %15, %16 : tensor<i32>
    %18 = stablehlo.compare  EQ, %c_12, %c_10 : (tensor<0xi32>, tensor<0xi32>) -> tensor<0xi1>
    %19 = stablehlo.select %18, %c_12, %c_12 : tensor<0xi1>, tensor<0xi32>
    %20 = stablehlo.dynamic_broadcast_in_dim %17, %19, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %21 = stablehlo.dynamic_broadcast_in_dim %c_6, %19, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %22 = stablehlo.add %20, %21 : tensor<i32>
    %23 = stablehlo.reshape %8 : (tensor<i32>) -> tensor<1xi32>
    %24 = stablehlo.reshape %22 : (tensor<i32>) -> tensor<1xi32>
    %25 = stablehlo.compare  LE, %23, %24 : (tensor<1xi32>, tensor<1xi32>) -> tensor<1xi1>
    %26 = stablehlo.select %25, %23, %24 : tensor<1xi1>, tensor<1xi32>
    %c_16 = stablehlo.constant dense<1> : tensor<1xi32>
    %27 = stablehlo.real_dynamic_slice %c_2, %26, %24, %c_16 : (tensor<1xi32>, tensor<1xi32>, tensor<1xi32>, tensor<1xi32>) -> tensor<?xi32>
    %28 = stablehlo.dynamic_reshape %27, %c_10 : (tensor<?xi32>, tensor<0xi32>) -> tensor<i32>
    %29 = stablehlo.dynamic_broadcast_in_dim %28, %c_1, dims = [] : (tensor<i32>, tensor<1xi32>) -> tensor<1xi32>
    %cst = stablehlo.constant dense<1.000000e+00> : tensor<f32>
    %30 = stablehlo.dynamic_broadcast_in_dim %cst, %29, dims = [] : (tensor<f32>, tensor<1xi32>) -> tensor<?xf32>
    return %30 : tensor<?xf32>
  }
}

@slyubomirsky (Collaborator)

Contrasting with the dump from tp.ones([a.shape[0]]):

module @outs_t17_2 {
  func.func @main() -> tensor<?xf32> {
    %c = stablehlo.constant dense<[1, 2, 3]> : tensor<3xi32>
    %c_0 = stablehlo.constant dense<3> : tensor<i32>
    %c_1 = stablehlo.constant dense<1> : tensor<1xi32>
    %c_2 = stablehlo.constant dense<3> : tensor<1xi32>
    %c_3 = stablehlo.constant dense<0> : tensor<i32>
    %c_4 = stablehlo.constant dense<1> : tensor<i32>
    %c_5 = stablehlo.constant dense<0> : tensor<1xi32>
    %c_6 = stablehlo.constant dense<1> : tensor<1xi32>
    %0 = stablehlo.compare  LE, %c_5, %c_6 : (tensor<1xi32>, tensor<1xi32>) -> tensor<1xi1>
    %1 = stablehlo.select %0, %c_5, %c_6 : tensor<1xi1>, tensor<1xi32>
    %c_7 = stablehlo.constant dense<1> : tensor<1xi32>
    %2 = stablehlo.real_dynamic_slice %c_2, %1, %c_6, %c_7 : (tensor<1xi32>, tensor<1xi32>, tensor<1xi32>, tensor<1xi32>) -> tensor<?xi32>
    %c_8 = stablehlo.constant dense<> : tensor<0xi32>
    %3 = stablehlo.dynamic_reshape %2, %c_8 : (tensor<?xi32>, tensor<0xi32>) -> tensor<i32>
    %4 = stablehlo.dynamic_broadcast_in_dim %3, %c_1, dims = [] : (tensor<i32>, tensor<1xi32>) -> tensor<1xi32>
    %cst = stablehlo.constant dense<1.000000e+00> : tensor<f32>
    %5 = stablehlo.dynamic_broadcast_in_dim %cst, %4, dims = [] : (tensor<f32>, tensor<1xi32>) -> tensor<?xf32>
    return %5 : tensor<?xf32>
  }
}

Most of the code in the first dump is, I assume, from handling the slice and tensor addition in convert_to_positive_idx:

                return index if index >= 0 else index + t_shape[i]

An alternative to consider would be doing this in FlatIR, if that results in less code or code that is easier to analyze. In principle, though, I don't think the generated MLIR is wrong, so it should not lead to this sort of error.
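
For illustration, a minimal sketch (hypothetical names and signature, not the actual tripy code) of how the conversion could fold the adjustment whenever the dimension size is statically known, so that only truly dynamic sizes produce the add/broadcast chain seen in the first dump:

def convert_to_positive_idx(index, t_shape, i):
    # Sketch only: 't_shape[i]' may be either a plain Python int (size
    # known at trace time) or a shape tensor element (dynamic size).
    dim = t_shape[i]
    if index >= 0:
        return index
    if isinstance(dim, int):
        # Statically known size: fold the adjustment in Python so the
        # lowered IR sees a constant index, as in the a.shape[0] case.
        return index + dim
    # Dynamic size: fall back to tensor arithmetic on the shape element,
    # which is what currently emits the extra ops in the MLIR dump.
    return index + dim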
