MIGX EP Subgraph debug fixes and cleanup for unsupported ops #196

TedThemistokleous · 2025-11-06T17:21:26Z

Description

Add fixes for control flow operators that were causing odd cases with subgraphs
Show when we have unsupported operators without needing the _dump_model_ops flag
Add support for the Size operator in models
Simplify model dumping code and encapsulate this operation in the EP
Additional output for canEvalNodeArgument(): give the user a clear indication of which op and input is failing and why

Motivation and Context

Changes for control flow, the Size operator, and other debug fixes and cleanup.

We want the initial get-capabilities run to report any unsupported op, so it is clear where graph breaks occur.
Requiring the dump model ops flag made it unclear to the user when a model has unsupported ops or missing support.
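
A minimal sketch of the reporting idea, not the EP's actual code: while scanning the graph, collect the op types that are missing from the supported list and report them once, independent of any dump flag. `NodeInfo`, `CollectUnsupportedOps`, and `FormatUnsupportedOps` are hypothetical names used only for illustration; the real EP works on onnxruntime graph types and logs through `LOGS_DEFAULT`.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node; the real EP walks onnxruntime nodes.
struct NodeInfo {
  std::string name;
  std::string op_type;
};

// Returns the sorted, de-duplicated op types that are not in `supported`.
std::set<std::string> CollectUnsupportedOps(const std::vector<NodeInfo>& nodes,
                                            const std::set<std::string>& supported) {
  std::set<std::string> unsupported;
  for (const auto& node : nodes) {
    if (supported.count(node.op_type) == 0) {
      unsupported.insert(node.op_type);
    }
  }
  return unsupported;
}

// Builds one warning line listing every unsupported op type.
std::string FormatUnsupportedOps(const std::set<std::string>& unsupported) {
  std::string msg = "Unsupported ops:";
  for (const auto& op : unsupported) {
    msg += " " + op;
  }
  return msg;
}
```

Reporting each op type once keeps the warning short even when the same unsupported op appears in many nodes.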

Added some additional debug output (warnings) if an input op isn't supported due to issues with evaluation. This isn't obvious if an operator fails due to shape or const tensor inputs.

This should make debugging easier and clearer before we start getting into subgraph optimization, splits, and partitioning. It is especially important in cases where, say, a subgraph has inputs that seemingly error out and can't be evaluated correctly because the input isn't found in the subgraph but exists in the parent graph (as seen with the If operator).
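
To illustrate the parent-graph case: in ONNX, the body of a control flow operator such as If may refer to values defined in an enclosing graph. The sketch below, under assumed types, resolves a name by searching the subgraph first and then each parent in turn. `GraphScope` and `FindDefiningScope` are hypothetical and do not correspond to onnxruntime or MIGraphX EP APIs.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical scope model: each (sub)graph knows its own value names
// and, if it is a subgraph, the graph that encloses it.
struct GraphScope {
  std::unordered_set<std::string> values;
  const GraphScope* parent = nullptr;
};

// Returns the nearest scope that defines `name`, searching the given scope
// first and then each enclosing graph. Returns nullptr if nothing defines it.
const GraphScope* FindDefiningScope(const GraphScope& scope, const std::string& name) {
  for (const GraphScope* s = &scope; s != nullptr; s = s->parent) {
    if (s->values.count(name) != 0) {
      return s;
    }
  }
  return nullptr;
}
```

A lookup that only checks the subgraph's own values would report such an input as missing even though the parent graph defines it, which is the confusing failure described above.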

It should be clear which node and corresponding input is being filtered out by the MIGraphX EP. This is useful for hunting down operators that may be either malformed or incorrectly handled due to subgraphs/splits/partitioning.

Put this logic behind the dump_model_as_onnx call to avoid repeated code.

TedThemistokleous · 2025-11-06T17:39:11Z

This will supersede #195

ahsan-ca · 2025-11-07T21:26:53Z

onnxruntime/core/providers/migraphx/migraphx_execution_provider_utils.h

      return isInputNode(n, input_name);
    });
    if (nit == in_nodes.end()) {
+      LOGS_DEFAULT(WARNING) << "Node:" << node->Name() << " Input:" << input_name << " Can't Find node to name";


Suggested change

-      LOGS_DEFAULT(WARNING) << "Node:" << node->Name() << " Input:" << input_name << " Can't Find node to name";
+      LOGS_DEFAULT(WARNING) << "Node:" << node->Name() << " Input:" << input_name << " Cannot find node to name";

ahsan-ca · 2025-11-07T21:47:52Z

onnxruntime/core/providers/migraphx/migraphx_execution_provider.cc

                                                    "SimplifiedLayerNormalization",
                                                    "Sin",
                                                    "Sinh",
+                                                    "Size",


Question: is this needed for the other changes in this PR? Wondering if it makes more sense to add this in a separate PR.

It's a one-line change that was needed for a customer. Bundling all that work into one PR.
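
For context on what the operator does: ONNX Size takes a tensor and outputs an int64 scalar equal to the total number of elements in it. The sketch below computes that value from a static shape. `SizeOfShape` is a hypothetical helper for illustration, not code from the EP or MIGraphX.

```cpp
#include <cstdint>
#include <vector>

// Total element count for a fully known shape, i.e. the value ONNX Size
// produces as an int64 scalar. An empty shape (a scalar tensor) yields 1,
// and any zero-length dimension yields 0.
std::int64_t SizeOfShape(const std::vector<std::int64_t>& dims) {
  std::int64_t count = 1;
  for (std::int64_t d : dims) {
    count *= d;
  }
  return count;
}
```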

ahsan-ca · 2025-11-07T21:52:25Z

onnxruntime/core/providers/migraphx/migraphx_execution_provider.cc

+        // if graph is split we dont know if output is used so we need this, otherwise if the graph isn't split
+        // then we can safely assume this output is a dangling output from a node and to discard it as part of the
+        // final graph output
+        if (is_graph_split) {
+          output_names.push_back(name);
+        }


Is there a change you made here? Seems identical except the format of braces.
Not sure why the comment is also showing as a change.

Formatting with lintrunner
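
The rule in the code comment above can be sketched as follows. This is one plausible reading of the surrounding logic under stated assumptions, not the EP's actual function: `SelectOutputNames` and its parameters are hypothetical, and only the `is_graph_split` check is taken from the diff.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper. Given the values produced by the nodes of a candidate
// subgraph, decide which ones to expose as outputs of that subgraph.
std::vector<std::string> SelectOutputNames(
    const std::vector<std::string>& produced,
    const std::unordered_set<std::string>& consumed,
    const std::unordered_set<std::string>& graph_outputs,
    bool is_graph_split) {
  std::vector<std::string> output_names;
  for (const auto& name : produced) {
    if (graph_outputs.count(name) != 0) {
      // Declared model output: always kept.
      output_names.push_back(name);
    } else if (consumed.count(name) == 0 && is_graph_split) {
      // Dangling within this subgraph, but another partition may still use
      // it, so it is kept when the graph is split.
      output_names.push_back(name);
    }
    // Otherwise the value is consumed internally, or it is dangling in an
    // unsplit graph, and it is dropped from the final outputs.
  }
  return output_names;
}
```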

ahsan-ca · 2025-11-07T22:00:02Z

onnxruntime/core/providers/migraphx/migraphx_execution_provider_utils.h


    // Input cannot be constant folded
    if (IsGraphInput(graph, input_name)) {
+      LOGS_DEFAULT(WARNING) << "Node:" << node->Name() << " Input:" << input_name << " Can't be const folded";


Suggested change

-      LOGS_DEFAULT(WARNING) << "Node:" << node->Name() << " Input:" << input_name << " Can't be const folded";
+      LOGS_DEFAULT(WARNING) << "Node:" << node->Name() << " Input:" << input_name << " Cannot be const folded";

ahsan-ca · 2025-11-07T22:00:26Z

onnxruntime/core/providers/migraphx/migraphx_execution_provider_utils.h

    }

    if (!canEvalShapeGeneral(graph, *nit, input_nodes)) {
+      LOGS_DEFAULT(WARNING) << "Node:" << node->Name() << " Input:" << input_name << " Can't eval shape";


Suggested change

-      LOGS_DEFAULT(WARNING) << "Node:" << node->Name() << " Input:" << input_name << " Can't eval shape";
+      LOGS_DEFAULT(WARNING) << "Node:" << node->Name() << " Input:" << input_name << " Cannot eval shape";
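
The three warnings reviewed in this file share one message shape: node name, input name, then the reason. A small sketch of keeping that format in one place is shown below. The `EvalFailure` enum and both helpers are hypothetical and not part of the PR; the reason strings follow the suggested wording.

```cpp
#include <string>

// Hypothetical labels for the three failure modes logged in
// canEvalNodeArgument().
enum class EvalFailure { kNodeNotFound, kGraphInput, kShapeNotEvaluable };

// Maps each failure mode to the reason text used in the warning.
const char* Reason(EvalFailure failure) {
  switch (failure) {
    case EvalFailure::kNodeNotFound:
      return "Cannot find node to name";
    case EvalFailure::kGraphInput:
      return "Cannot be const folded";
    case EvalFailure::kShapeNotEvaluable:
      return "Cannot eval shape";
  }
  return "Unknown failure";
}

// Builds the message in the same "Node:<name> Input:<input> <reason>" layout
// as the LOGS_DEFAULT(WARNING) lines in the diff.
std::string FormatEvalWarning(const std::string& node_name,
                              const std::string& input_name,
                              EvalFailure failure) {
  return "Node:" + node_name + " Input:" + input_name + " " + Reason(failure);
}
```

Centralizing the format would keep the wording consistent across the three call sites, at the cost of one more helper in the utils header.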

TedThemistokleous · 2025-11-07T23:09:27Z

@apwojcik, you should cherry-pick this to wml main if you're hitting unsupported ops or issues with preprocessing operators with invalid inputs in some of the newer graphs seen on the Windows/UAI side, since it has the debug code broken away from dump_model_ops.

TedThemistokleous added 5 commits November 6, 2025 12:06

Add support for Size Onnx OP

a43c0ee

Add logging for failure modes in canEvalNodeArgument()

0aaec89

It should be clear which node and corresponding input is being filtered out by the MIGraphX EP. This is useful for hunting down operators that may be either malformed or incorrectly handled due to subgraphs/splits/partitioning.

encapsulate dump_model_ops call

c6e339b

Put this logic behind the dump_model_as_onnx call to avoid repeated code.

Add control flow updates from TritonInferenceServer debug

b66b4fd

TedThemistokleous self-assigned this Nov 6, 2025

TedThemistokleous added the Cleanup (Cleanup or simplify blocks of code) and Bugfix (Fix to a bug or reported issue) labels Nov 6, 2025

TedThemistokleous requested review from ahsan-ca and apwojcik November 6, 2025 17:21

TedThemistokleous added the Upstream (Changeset that should be merged upstream to Microsoft/Onnxruntime) label Nov 6, 2025

Lintrunner pass

022ace6

ahsan-ca reviewed Nov 7, 2025


TedThemistokleous merged commit c4e1e69 into rocm7.1_internal_testing Nov 7, 2025
5 of 7 checks passed

3 participants
