Slicing a shape with a negative index sometimes does not work #206

Open · pranavm-nvidia opened this issue Sep 16, 2024 · 2 comments

@pranavm-nvidia (Collaborator)

If we use the output of slicing a shape with a negative index in certain operations, we get a failure:

>>> a = tp.Tensor([1, 2, 3])
>>> a.shape
shape([3], dtype=int32, loc=gpu:0, shape=[1])

>>> tp.ones([a.shape[0]])
tensor([1.0000, 1.0000, 1.0000], dtype=float32, loc=gpu:0, shape=[3])

>>> tp.ones([a.shape[-1]])

MTRTException: InternalError: failed to run compilation on module with symbol name: outs_t439_19

Additional context:
Traceback (most recent call last):
  File "/tripy/tripy/backend/mlir/compiler.py", line 102, in compile
    executable = compiler.compiler_stablehlo_to_executable(
mlir_tensorrt.runtime._mlir_libs._api.MTRTException: InternalError: failed to run compilation on module with symbol name: outs_t439_19
.
    error: StableHLO dynamic shape canonicalization failed to converge within 4 iterations
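
A possible workaround while this is open (a sketch only; shape_dim is a hypothetical helper, not part of the tripy API, and the rank is assumed to be known on the Python side) is to normalize the negative index in Python so the shape is indexed with a plain non-negative constant, which is the case shown above to work:

def shape_dim(tensor, index, rank):
    # Hypothetical helper: 'rank' is assumed to be known in Python. Converting
    # the index here keeps it a constant, so no tensor arithmetic on the
    # shape is traced into the graph.
    if index < 0:
        index += rank
    return tensor.shape[index]

a = tp.Tensor([1, 2, 3])
tp.ones([shape_dim(a, -1, rank=1)])  # behaves like tp.ones([a.shape[0]])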
@slyubomirsky (Collaborator)

MLIR dump from this example:

module @outs_t39_2 {
  func.func @main() -> tensor<?xf32> {
    %c = stablehlo.constant dense<[1, 2, 3]> : tensor<3xi32>
    %c_0 = stablehlo.constant dense<3> : tensor<i32>
    %c_1 = stablehlo.constant dense<1> : tensor<1xi32>
    %c_2 = stablehlo.constant dense<3> : tensor<1xi32>
    %c_3 = stablehlo.constant dense<1> : tensor<i32>
    %c_4 = stablehlo.constant dense<1> : tensor<1xi32>
    %c_5 = stablehlo.constant dense<0> : tensor<i32>
    %c_6 = stablehlo.constant dense<1> : tensor<i32>
    %c_7 = stablehlo.constant dense<0> : tensor<1xi32>
    %c_8 = stablehlo.constant dense<1> : tensor<1xi32>
    %0 = stablehlo.compare  LE, %c_7, %c_8 : (tensor<1xi32>, tensor<1xi32>) -> tensor<1xi1>
    %1 = stablehlo.select %0, %c_7, %c_8 : tensor<1xi1>, tensor<1xi32>
    %c_9 = stablehlo.constant dense<1> : tensor<1xi32>
    %2 = stablehlo.real_dynamic_slice %c_4, %1, %c_8, %c_9 : (tensor<1xi32>, tensor<1xi32>, tensor<1xi32>, tensor<1xi32>) -> tensor<?xi32>
    %c_10 = stablehlo.constant dense<> : tensor<0xi32>
    %3 = stablehlo.dynamic_reshape %2, %c_10 : (tensor<?xi32>, tensor<0xi32>) -> tensor<i32>
    %c_11 = stablehlo.constant dense<-1> : tensor<i32>
    %c_12 = stablehlo.constant dense<> : tensor<0xi32>
    %4 = stablehlo.compare  EQ, %c_12, %c_10 : (tensor<0xi32>, tensor<0xi32>) -> tensor<0xi1>
    %5 = stablehlo.select %4, %c_12, %c_12 : tensor<0xi1>, tensor<0xi32>
    %6 = stablehlo.dynamic_broadcast_in_dim %3, %5, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %7 = stablehlo.dynamic_broadcast_in_dim %c_11, %5, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %8 = stablehlo.add %6, %7 : tensor<i32>
    %c_13 = stablehlo.constant dense<0> : tensor<1xi32>
    %c_14 = stablehlo.constant dense<1> : tensor<1xi32>
    %9 = stablehlo.compare  LE, %c_13, %c_14 : (tensor<1xi32>, tensor<1xi32>) -> tensor<1xi1>
    %10 = stablehlo.select %9, %c_13, %c_14 : tensor<1xi1>, tensor<1xi32>
    %c_15 = stablehlo.constant dense<1> : tensor<1xi32>
    %11 = stablehlo.real_dynamic_slice %c_4, %10, %c_14, %c_15 : (tensor<1xi32>, tensor<1xi32>, tensor<1xi32>, tensor<1xi32>) -> tensor<?xi32>
    %12 = stablehlo.dynamic_reshape %11, %c_10 : (tensor<?xi32>, tensor<0xi32>) -> tensor<i32>
    %13 = stablehlo.compare  EQ, %c_12, %c_10 : (tensor<0xi32>, tensor<0xi32>) -> tensor<0xi1>
    %14 = stablehlo.select %13, %c_12, %c_12 : tensor<0xi1>, tensor<0xi32>
    %15 = stablehlo.dynamic_broadcast_in_dim %12, %14, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %16 = stablehlo.dynamic_broadcast_in_dim %c_11, %14, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %17 = stablehlo.add %15, %16 : tensor<i32>
    %18 = stablehlo.compare  EQ, %c_12, %c_10 : (tensor<0xi32>, tensor<0xi32>) -> tensor<0xi1>
    %19 = stablehlo.select %18, %c_12, %c_12 : tensor<0xi1>, tensor<0xi32>
    %20 = stablehlo.dynamic_broadcast_in_dim %17, %19, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %21 = stablehlo.dynamic_broadcast_in_dim %c_6, %19, dims = [] : (tensor<i32>, tensor<0xi32>) -> tensor<i32>
    %22 = stablehlo.add %20, %21 : tensor<i32>
    %23 = stablehlo.reshape %8 : (tensor<i32>) -> tensor<1xi32>
    %24 = stablehlo.reshape %22 : (tensor<i32>) -> tensor<1xi32>
    %25 = stablehlo.compare  LE, %23, %24 : (tensor<1xi32>, tensor<1xi32>) -> tensor<1xi1>
    %26 = stablehlo.select %25, %23, %24 : tensor<1xi1>, tensor<1xi32>
    %c_16 = stablehlo.constant dense<1> : tensor<1xi32>
    %27 = stablehlo.real_dynamic_slice %c_2, %26, %24, %c_16 : (tensor<1xi32>, tensor<1xi32>, tensor<1xi32>, tensor<1xi32>) -> tensor<?xi32>
    %28 = stablehlo.dynamic_reshape %27, %c_10 : (tensor<?xi32>, tensor<0xi32>) -> tensor<i32>
    %29 = stablehlo.dynamic_broadcast_in_dim %28, %c_1, dims = [] : (tensor<i32>, tensor<1xi32>) -> tensor<1xi32>
    %cst = stablehlo.constant dense<1.000000e+00> : tensor<f32>
    %30 = stablehlo.dynamic_broadcast_in_dim %cst, %29, dims = [] : (tensor<f32>, tensor<1xi32>) -> tensor<?xf32>
    return %30 : tensor<?xf32>
  }
}

@slyubomirsky (Collaborator)

Contrasting with the dump from tp.ones([a.shape[0]]):

module @outs_t17_2 {
  func.func @main() -> tensor<?xf32> {
    %c = stablehlo.constant dense<[1, 2, 3]> : tensor<3xi32>
    %c_0 = stablehlo.constant dense<3> : tensor<i32>
    %c_1 = stablehlo.constant dense<1> : tensor<1xi32>
    %c_2 = stablehlo.constant dense<3> : tensor<1xi32>
    %c_3 = stablehlo.constant dense<0> : tensor<i32>
    %c_4 = stablehlo.constant dense<1> : tensor<i32>
    %c_5 = stablehlo.constant dense<0> : tensor<1xi32>
    %c_6 = stablehlo.constant dense<1> : tensor<1xi32>
    %0 = stablehlo.compare  LE, %c_5, %c_6 : (tensor<1xi32>, tensor<1xi32>) -> tensor<1xi1>
    %1 = stablehlo.select %0, %c_5, %c_6 : tensor<1xi1>, tensor<1xi32>
    %c_7 = stablehlo.constant dense<1> : tensor<1xi32>
    %2 = stablehlo.real_dynamic_slice %c_2, %1, %c_6, %c_7 : (tensor<1xi32>, tensor<1xi32>, tensor<1xi32>, tensor<1xi32>) -> tensor<?xi32>
    %c_8 = stablehlo.constant dense<> : tensor<0xi32>
    %3 = stablehlo.dynamic_reshape %2, %c_8 : (tensor<?xi32>, tensor<0xi32>) -> tensor<i32>
    %4 = stablehlo.dynamic_broadcast_in_dim %3, %c_1, dims = [] : (tensor<i32>, tensor<1xi32>) -> tensor<1xi32>
    %cst = stablehlo.constant dense<1.000000e+00> : tensor<f32>
    %5 = stablehlo.dynamic_broadcast_in_dim %cst, %4, dims = [] : (tensor<f32>, tensor<1xi32>) -> tensor<?xf32>
    return %5 : tensor<?xf32>
  }
}

Most of the code in the first dump is, I assume, from handling the slice and tensor addition in convert_to_positive_idx:

                return index if index >= 0 else index + t_shape[i]

An alternative to consider would be doing this in FlatIR, if that results in less code or code that is easier to analyze. In principle, though, I don't think the generated MLIR is wrong, so it should not lead to this sort of error.
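
For illustration, a minimal sketch (hypothetical names and signature, not the actual tripy code) of how the conversion could fold the adjustment whenever the dimension size is statically known, so that only truly dynamic sizes produce the add/broadcast chain seen in the first dump:

def convert_to_positive_idx(index, t_shape, i):
    # Sketch only: 't_shape[i]' may be either a plain Python int (size
    # known at trace time) or a shape tensor element (dynamic size).
    dim = t_shape[i]
    if index >= 0:
        return index
    if isinstance(dim, int):
        # Statically known size: fold the adjustment in Python so the
        # lowered IR sees a constant index, as in the a.shape[0] case.
        return index + dim
    # Dynamic size: fall back to tensor arithmetic on the shape element,
    # which is what currently emits the extra ops in the MLIR dump.
    return index + dim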
