Fix bug in artifact creation of the LX6 instructions #453

nirvedhmeshram · 2024-06-24T15:17:31Z

There was a bug where we were appending LX6 instructions to an existing vector for subsequent entry points rather than making a new vector for each entry point. This change fixes that and hence fixes #447, which was indeed a kernel timeout due to bad artifacts.
Also adds a multi-dispatch e2e test to CI.
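
For readers who want the failure mode spelled out, here is a minimal standalone sketch of the pattern described above. It is not the code from AIETarget.cpp: the helper names (`appendInstructions`, `collectShared`, `collectFresh`) and the one-hex-word-per-line input format are assumptions made for illustration only.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parse one hex word per line and append it to `out`.
// The line format is an assumption, not taken from the real instruction file.
static void appendInstructions(std::istream &in, std::vector<uint32_t> &out) {
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty()) continue;
    out.push_back(static_cast<uint32_t>(std::stoul(line, nullptr, 16)));
  }
}

// Buggy pattern: one vector is declared outside the loop, so the second
// entry point also carries the first entry point's instructions.
std::vector<std::vector<uint32_t>> collectShared(
    const std::vector<std::string> &files) {
  std::vector<std::vector<uint32_t>> perEntryPoint;
  std::vector<uint32_t> instrs;  // shared across all entry points
  for (const std::string &contents : files) {
    std::istringstream in(contents);
    appendInstructions(in, instrs);
    perEntryPoint.push_back(instrs);
  }
  return perEntryPoint;
}

// Fixed pattern: a new vector is created for each entry point.
std::vector<std::vector<uint32_t>> collectFresh(
    const std::vector<std::string> &files) {
  std::vector<std::vector<uint32_t>> perEntryPoint;
  for (const std::string &contents : files) {
    std::vector<uint32_t> instrs;  // fresh vector per entry point
    std::istringstream in(contents);
    appendInstructions(in, instrs);
    perEntryPoint.push_back(std::move(instrs));
  }
  return perEntryPoint;
}
```

With two entry points, the shared-vector version gives the second entry point the first one's instructions followed by its own, which is the kind of bad artifact that only shows up once an executable has more than one dispatch.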

makslevental · 2024-06-24T16:48:16Z

Is this the kernel lag issue we've all been facing? I'm wondering how I was seeing the same thing even though I wasn't reusing the same buffer/vector in my own path (in xaiepy).

newling

Nice find!

In the future we could add a test that there is exactly 1 xclbin generated for a test like this. I can try to add that at some stage.

newling · 2024-06-24T16:48:19Z

compiler/plugins/target/AMD-AIE/iree-amd-aie/Target/AIETarget.cpp

@@ -359,6 +358,8 @@ LogicalResult AIETargetBackend::serializeExecutable(

    std::ifstream instrFile(static_cast<std::string>(npuInstPath));
    std::string line;
+    // Vector to store LX6 instructions.


FYI: I've heard that AIE2p has LX7 instructions (not a request for change!)

I believe it's true that forthcoming gens will have LX7 (and higher?) but the instructions themselves don't actually have anything to do with the LX arch (which is Tensilica's naming scheme for their archs and not connected to our firmware that runs these instructions). But you're right the comment will be out of date.

Let's have a cleanup PR for this. I noticed you used LX[0-9] in some places; maybe we should say LX6/LX7 instead?

I think the safest term would be command processor but who cares (honestly when I first started it was useful knowing that it was LX6 because there's a datasheet type thing out there for that so it clarified what people were really talking about).

nirvedhmeshram · 2024-06-24T16:54:51Z

Is this the kernel lag issue we've all been facing? I'm wondering how I was seeing the same thing even though I wasn't reusing the same buffer/vector in my own path (in xaiepy).

I think that might be a different issue. This one is only hit when we have multiple dispatches in one executable in the IREE XRT driver.

nirvedhmeshram · 2024-06-24T16:58:27Z

Nice find!

In the future we could add a test that there is exactly 1 xclbin generated for a test like this. I can try to add that at some stage.

Yes, that would be useful. There is one challenge though: we will create multiple XCLBINs, because we create each one first and then merge it into the previous one, so the old xclbin artifact is still present even though we won't use it at runtime.

Btw, just to be clear, this test currently will produce three xclbins and use all of them at runtime; we need something like #418 for it to use 1 xclbin at runtime. As I said previously, it will still produce three XCLBINs at compile time.

nirvedhmeshram requested review from MaheshRavishankar, yzhang93, Abhishek-Varma and jtuyls as code owners June 24, 2024 15:17

nirvedhmeshram requested a review from newling June 24, 2024 15:17

nirvedhmeshram force-pushed the nm_fix_artifact_bug branch 2 times, most recently from 1ee2715 to 77b0747 Compare June 24, 2024 15:22

Fix bug in artifact creation of the Lx6 instructions

79cc064

nirvedhmeshram force-pushed the nm_fix_artifact_bug branch from 77b0747 to 79cc064 Compare June 24, 2024 15:25

nirvedhmeshram mentioned this pull request Jun 24, 2024

Add amd-aie-direct HAL target (3/n) #420

Merged

newling approved these changes Jun 24, 2024


nirvedhmeshram merged commit 1ed8257 into main Jun 24, 2024
2 checks passed

nirvedhmeshram deleted the nm_fix_artifact_bug branch June 24, 2024 17:02

nirvedhmeshram mentioned this pull request Jun 28, 2024

Add a test which has more than one dispatch in CI #397

Closed
