[GlobalOptimization] Fix a silent bug in DetachElementwiseFromNamedOps pass (iree-org#19356)
This moves match failure checks before modifying linalg ops, and loosens the check for identity-map access to the output tensor.

### Context:

Specific depthwise convolution ops were encountering numeric failures. See <iree-org#18600> and <iree-org#19339>.

I noticed that the bias was not affecting the output values and tracked down where it was getting deleted. The issue is that the pass `DetachElementwiseFromNamedOps` was modifying the `depthwise_conv` op to use a zero fill *before* checking for some match failures. This resulted in a partial application of the pattern in which the original bias was never added back to the result of the modified linalg op.

The depthwise conv ops were specifically failing the requirement that the output tensor be accessed through an identity map. For example:

```mlir
module {
  ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64>
  func.func @torch_jit(%arg0: tensor<1x96x56x56xf32>, %arg1: tensor<96x1x7x7xf32>, %arg2: tensor<96xf32>) -> tensor<1x96x56x56xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %padded = tensor.pad %arg0 low[0, 0, 3, 3] high[0, 0, 3, 3] {
    ^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
      tensor.yield %cst : f32
    } : tensor<1x96x56x56xf32> to tensor<1x96x62x62xf32>
    %0 = tensor.empty() : tensor<1x96x56x56xf32>
    %broadcasted = linalg.broadcast ins(%arg2 : tensor<96xf32>) outs(%0 : tensor<1x96x56x56xf32>) dimensions = [0, 2, 3]
    %collapsed = tensor.collapse_shape %arg1 [[0, 1], [2], [3]] : tensor<96x1x7x7xf32> into tensor<96x7x7xf32>
    %1 = linalg.depthwise_conv_2d_nchw_chw {dilations = dense<1> : vector<2xi64>, strides = dense<1> : vector<2xi64>} ins(%padded, %collapsed : tensor<1x96x62x62xf32>, tensor<96x7x7xf32>) outs(%broadcasted : tensor<1x96x56x56xf32>) -> tensor<1x96x56x56xf32>
    return %1 : tensor<1x96x56x56xf32>
  }
}
```

generalizes to

```mlir
#map = affine_map<(d0, d1, d2, d3) -> (d1)>
#map1 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
#map2 = affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d3, d1 + d4, d2 + d5)>
#map3 = affine_map<(d0, d1, d2, d3, d4, d5) -> (d3, d4, d5)>
#map4 = affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d3, d1, d2)>
module {
  ml_program.global private mutable @global_seed(dense<0> : tensor<i64>) : tensor<i64>
  func.func @torch_jit(%arg0: tensor<1x96x56x56xf32>, %arg1: tensor<96x1x7x7xf32>, %arg2: tensor<96xf32>) -> tensor<1x96x56x56xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %padded = tensor.pad %arg0 low[0, 0, 3, 3] high[0, 0, 3, 3] {
    ^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
      tensor.yield %cst : f32
    } : tensor<1x96x56x56xf32> to tensor<1x96x62x62xf32>
    %0 = tensor.empty() : tensor<1x96x56x56xf32>
    %1 = linalg.generic {indexing_maps = [#map, #map1], iterator_types = ["parallel", "parallel", "parallel", "parallel"]} ins(%arg2 : tensor<96xf32>) outs(%0 : tensor<1x96x56x56xf32>) {
    ^bb0(%in: f32, %out: f32):
      linalg.yield %in : f32
    } -> tensor<1x96x56x56xf32>
    %collapsed = tensor.collapse_shape %arg1 [[0, 1], [2], [3]] : tensor<96x1x7x7xf32> into tensor<96x7x7xf32>
    %2 = linalg.generic {indexing_maps = [#map2, #map3, #map4], iterator_types = ["parallel", "parallel", "parallel", "parallel", "reduction", "reduction"]} ins(%padded, %collapsed : tensor<1x96x62x62xf32>, tensor<96x7x7xf32>) outs(%1 : tensor<1x96x56x56xf32>) {
    ^bb0(%in: f32, %in_0: f32, %out: f32):
      %3 = arith.mulf %in, %in_0 : f32
      %4 = arith.addf %out, %3 : f32
      linalg.yield %4 : f32
    } -> tensor<1x96x56x56xf32>
    return %2 : tensor<1x96x56x56xf32>
  }
}
```

For some reason, the channel dim `d3` appears after the spatial dims (`d1` and `d2`) for this particular op.
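To see why the old check tripped, note that the output map `#map4` above sends the six loop dims to `(d0, d3, d1, d2)`: a permutation of the parallel dims, not an identity. A minimal standalone sketch using the MLIR C++ API that demonstrates this property of the map (the `main` harness is purely illustrative; the exact predicate the pass now uses is not shown here):

```cpp
#include "mlir/IR/AffineMap.h"
#include "mlir/IR/MLIRContext.h"
#include "llvm/Support/raw_ostream.h"

using namespace mlir;

int main() {
  MLIRContext ctx;
  // #map4 = affine_map<(d0, d1, d2, d3, d4, d5) -> (d0, d3, d1, d2)>
  AffineMap map4 = AffineMap::get(
      /*dimCount=*/6, /*symbolCount=*/0,
      {getAffineDimExpr(0, &ctx), getAffineDimExpr(3, &ctx),
       getAffineDimExpr(1, &ctx), getAffineDimExpr(2, &ctx)},
      &ctx);
  // Not an identity: the results are a permuted subset of the dims, so a
  // strict identity check on the output access rejects the op.
  llvm::outs() << "isIdentity: " << map4.isIdentity() << "\n";            // 0
  // Still a projected permutation, i.e. each output element is written
  // exactly once; a weaker property like this is the sort of thing a
  // loosened check can key on.
  llvm::outs() << "isProjectedPermutation: "
               << map4.isProjectedPermutation() << "\n";                  // 1
}
```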
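The ordering half of the fix generalizes to any rewrite pattern: every match check must run before the first IR mutation, because a `notifyMatchFailure` issued after a mutation leaves the op half-rewritten (here, zero-filled with the bias dropped). A hedged sketch of that structure; the pattern name, the specific checks, and the loosened predicate are illustrative stand-ins, not the actual IREE source:

```cpp
#include "mlir/Dialect/Linalg/IR/Linalg.h"
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// Illustrative stand-in for the pass's pattern (hypothetical name).
struct DetachBiasSketch : OpInterfaceRewritePattern<linalg::LinalgOp> {
  using OpInterfaceRewritePattern::OpInterfaceRewritePattern;

  LogicalResult matchAndRewrite(linalg::LinalgOp op,
                                PatternRewriter &rewriter) const override {
    // 1. Run *every* match check while the IR is still untouched. Bailing
    //    out here is safe because nothing has been modified yet.
    if (op.getNumDpsInits() != 1)
      return rewriter.notifyMatchFailure(op, "expected a single output");
    AffineMap outMap = op.getIndexingMapMatchingResult(op->getResult(0));
    // Loosened check: accept any projected permutation (e.g. (d0, d3, d1,
    // d2)) instead of requiring a strict identity map on the output.
    if (!outMap.isProjectedPermutation())
      return rewriter.notifyMatchFailure(op, "output map not a permutation");

    // 2. Only now mutate: swap the bias operand for a zero fill and add the
    //    original bias back onto the result (details elided in this sketch).
    // ...
    return success();
  }
};
```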
---------

Signed-off-by: zjgarvey <[email protected]>