Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

elfloader changes for shim patching for aie4 #8355

Merged
merged 2 commits into from
Sep 4, 2024

Conversation

HimanshuChoudhary-Xilinx
Copy link
Collaborator

@HimanshuChoudhary-Xilinx HimanshuChoudhary-Xilinx commented Aug 16, 2024

Problem solved by the commit

Host address patching on shimBD for aie4.

Bug / issue (if any) fixed, which PR introduced the bug, how it was discovered

aie4 have different shimbd configuration, Patching is done on 0th and 1st word of BD.

How problem was solved, alternative solutions (if any) and why they were rejected

added new patching schema, shim_dma_aie4_base_addr_symbol_kind

Risks (if any) associated the changes in the commit

What has been tested and how, request additional testing if necessary

Documentation impact (if any)

@gbuildx
Copy link
Collaborator

gbuildx commented Aug 16, 2024

Can one of the admins verify this patch?

Copy link
Collaborator

@stsoe stsoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fill out the PR description.

@@ -796,7 +814,7 @@ class module_elf : public module_impl
ert_cmd_opcode
get_ert_opcode() const override
{
if (m_os_abi == Elf_Amd_Aie2ps)
if (m_os_abi == Elf_Amd_Aie2ps || m_os_abi == Elf_Amd_npu3)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not an expert of this area of the code, but it looks to me that there are lots of code like
"if aie2ps or aie4(npu3) then ..." can we make it more clear if you don't want to use more comment to explain what we are doing here.

  1. can we use aie2ps and aie4 instead of aie2ps and npu3, those seems not a pair in same level
  2. can we use a inline function or macro to make those dup code less?

if (m_os_abi == Elf_Amd_npu3)
patch57_npu3(bd_data_ptr, new_value + item.offset_to_base_bo_addr);
else
patch57(bd_data_ptr, new_value + item.offset_to_base_bo_addr);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This OK. But I think differentiating this by a new symbol type is cleaner than os_abi. Conceptly, one symobl type should map to one patching schema

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @HimanshuChoudhary-Xilinx is trying to avoid adding more patching schema because currently we only use 3bits for encoding schema-type.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @HimanshuChoudhary-Xilinx is trying to avoid adding more patching schema because currently we only use 3bits for encoding schema-type.

not 4 bits (0-3)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that is the case, we should definitely add a new patching scheme instead of adding a new os_abi. @larry9523

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not a different patching... shimbd in aie2ps and npu3 have different configuration
in aie2ps, host address get patched on 1st,2nd, 8th word of BD while in npu3 its on 0th and 1st word of BD.
its the same 57bit patching

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not a different patching... shimbd in aie2ps and npu3 have different configuration in aie2ps, host address get patched on 1st,2nd, 8th word of BD while in npu3 its on 0th and 1st word of BD. its the same 57bit patching

I would argue shim_dma_48 is not a different patching either. It just patch to second and third of 8 BD WORDs.

OS_ABI is used to determine ELF's architecture. What ELF spec we should follow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have enough symbol type to hold all potential patching method?
if so, then we need something like,
patch57_aie2ps
patch57_aie4
if not, we may need combination of symbol and os_abi, like what Himanshu did.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We do have enough symbol type to hold all potential patching method for the moment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A Open question, if os_abi is about ELF spec, then how we differentiate between a elf with aie2ps controlcode and elf with aie4 controlcode because an elf with aie2ps controlcode will not work on aie4 device or vice-versa.
XRT should have a check on this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A Open question, if os_abi is about ELF spec, then how we differentiate between a elf with aie2ps controlcode and elf with aie4 controlcode because an elf with aie2ps controlcode will not work on aie4 device or vice-versa. XRT should have a check on this.

I am totally fine if we define a different os_abi between aie2ps and aie4 if we need to. I am just saying we should use a new patching schema at this place instead of using os_abi.

@dezhiAmd
Copy link
Collaborator

Please fill out the PR description.

I keep saying this in almost all PRs from the link below:
https://gitenterprise.xilinx.com/XRT/aiebu/pulls?q=is%3Apr+author%3Achoudhar+

Please, please fill out PR description especially

  1. Problem solved by this commit
  2. How problem is solved

@HimanshuChoudhary-Xilinx HimanshuChoudhary-Xilinx changed the title elfloader changes for shim patching for npu3 elfloader changes for shim patching for aie4 Aug 28, 2024
@HimanshuChoudhary-Xilinx
Copy link
Collaborator Author

HimanshuChoudhary-Xilinx commented Aug 28, 2024

  1. Added new patching schema
  2. Made same os_abi for aie2ps and aie4

Copy link
Contributor

@xuhz xuhz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the os_abi is still required, but not to be used to choose patching method
e.g to check an elf compiled with -t aie4 loading to aie4 hw

@HimanshuChoudhary-Xilinx
Copy link
Collaborator Author

i think the os_abi is still required, but not to be used to choose patching method e.g to check an elf compiled with -t aie4 loading to aie4 hw

we dont need os_abi because as @larry9523 said it represent elf spec and elf spec for aie2ps and aie4 is same

@larry9523
Copy link
Collaborator

i think the os_abi is still required, but not to be used to choose patching method e.g to check an elf compiled with -t aie4 loading to aie4 hw

Let's clean this up by using e_machine

@HimanshuChoudhary-Xilinx
Copy link
Collaborator Author

i think the os_abi is still required, but not to be used to choose patching method e.g to check an elf compiled with -t aie4 loading to aie4 hw

Let's clean this up by using e_machine

lets clean that in new pr as it will need aiebu changes also

Copy link
Contributor

@xuhz xuhz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i didn't read the previous comments carefully
we have e_machine to tell aie2ps or aie4
os_abi is not for that purpose.
but elf_amd_aie2ps is not a good name which caused the confusion. It should be changed in future PR with a better name, something like elf_amd_aie_uc

@@ -115,6 +115,7 @@ struct patcher
scalar_32bit_kind = 3,
control_packet_48 = 4, // patching scheme needed by firmware to patch control packet
shim_dma_48 = 5, // patching scheme needed by firmware to patch instruction buffer
shim_dma_aie4_base_addr_symbol_kind = 6, // patching scheme needed by AIE4 firmware
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the ELF spce accordingly.

Signed-off-by: HimanshuChoudhary-Xilinx <[email protected]>
Signed-off-by: HimanshuChoudhary-Xilinx <[email protected]>
@maxzhen maxzhen merged commit ea90b89 into Xilinx:master Sep 4, 2024
17 checks passed
vipangul pushed a commit to vipangul/XRT that referenced this pull request Sep 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants