-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ARC auto-sync architecture #2570
base: next
Are you sure you want to change the base?
Conversation
@Rot127, fuzzing, as I expected, shows that those instructions that are not yet supported for ARC in llvm lead to incorrect decoding and program crash. llvm-mc from the official github also fails with an error. Should we do something now or ignore it until llvm updates? |
You can fix it in our fork of LLVM. And open an issue in LLVM with a reference to the fix commit. They can cherry-pick it. If you have the time, you can also open a PR with the fix yourself. |
The problem is solved now. The DecodeGPR32RegisterClass return value was not checked (in llvm disassembler also) |
aa93895
to
e2fb2d7
Compare
e2fb2d7
to
06afa37
Compare
Do you consider this done? Then I would review it this week. |
Yes. I think it is done now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continue later (at arc.h
). But so far looks very good!
Result = decodeInstruction(DecoderTable64, Instr, | ||
Insn64, Address, NULL); | ||
if (MCDisassembler_Success == Result) { | ||
; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you an issue please that we should also covert the LLVM_DEBUG
stuff?
It seems to become more and more common to use.
No need to fix it here, but it nice to track.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM_DEBUG is already present in the patches
arch/ARC/ARCMapping.c
Outdated
MI->MRI = MRI; | ||
|
||
ARC_LLVM_printInst(MI, MI->address, "", O); | ||
// ARC_add_cs_groups(MI); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// ARC_add_cs_groups(MI); |
arch/ARC/ARCMapping.c
Outdated
|
||
void ARC_init_cs_detail(MCInst *MI) | ||
{ | ||
if (detail_is_set(MI)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Invert condition and return early please.
arch/ARC/ARCMapping.h
Outdated
static inline void add_cs_detail(MCInst *MI, | ||
int /* arc_op_group */ op_group, ...) | ||
{ | ||
if (!MI->flat_insn->detail) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!MI->flat_insn->detail) | |
if (!details_is_set(MI)) |
ARC_INS_ST_DI_AW = 195 | ||
ARC_INS_ST_DI = 196 | ||
ARC_INS_ST = 197 | ||
ARC_INS_SUB1_ = 198 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ARC_INS_SUB1_ = 198 | |
ARC_INS_SUB1_ = 198 |
These instruction ids ending in _
look fishy. I suspect they end in a .
or with a {
. But they for sure end with a non alphabetical character. Can you check if this is expected? For some archs a .
actually belongs to the mnemonic (PPC). For some it doesn't (ARM).
If the mnemonic should not end in a special character, you can check the getNormalizedMnemonic
function in the PrinterCapstone
. You can remove special chars there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works like that because of instructions like sub1.$cc
. I can replace last dot if the mnemonic cointains it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please remove it.
bindings/python/capstone/arc.py
Outdated
_fields_ = ( | ||
('type', ctypes.c_uint), | ||
('value', ARCOpValue), | ||
('access', ctypes.c_uint8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
('access', ctypes.c_uint8) | |
('access', ctypes.c_int) |
It was an enum in arc.h
. Please either make it an uint8
or change it here to an int
bindings/python/capstone/arc.py
Outdated
|
||
class ARCOp(ctypes.Structure): | ||
_fields_ = ( | ||
('type', ctypes.c_uint), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
('type', ctypes.c_uint), | |
('type', ctypes.c_int), |
enums are ints if I am not mistaken.
bindings/python/capstone/arc.py
Outdated
_fields_ = ( | ||
('op_count', ctypes.c_uint8), | ||
('operands', ARCOp * 8), | ||
('update_flags', ctypes.c_bool), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
('update_flags', ctypes.c_bool), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a very straight forward review! Excellent job, well done!
Please add ARC
in cs_v6_release_guide.md
und New features
.
Simple two liner is enough. Just please mention about the memory operands.
Otherwise, please address the other small requests.
Regarding the not yet implemented memory operand type: Do you want to add them? It complicates the details a bit. But on the other hand there seems only the MemOperandRI
type. So not too much work I think? Or was there another problem?
Nonetheless, we can merge it like this. But we should have them before v6 Gold.
cc @XVilka @kabeor
@@ -36,8 +38,10 @@ | |||
"LoongArch": "LoongArch", | |||
"SystemZ": "SystemZ", | |||
"Mips": "Mips", | |||
"ARC": "ARC", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"ARC": "ARC", |
# from autosync.cpptranslator.patches.Helper import get_text | ||
# if patch.get_main_capture_name() == "template_def": | ||
# print(get_text(self.src, q[0].start_byte, q[0].end_byte).decode()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# from autosync.cpptranslator.patches.Helper import get_text | |
# if patch.get_main_capture_name() == "template_def": | |
# print(get_text(self.src, q[0].start_byte, q[0].end_byte).decode()) |
# if "template_declaration" in pattern: | ||
# print(self.tree.root_node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# if "template_declaration" in pattern: | |
# print(self.tree.root_node) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a test for this new Patch in test_patches.py
.
if len(param_list_caller.named_children) == 1: | ||
# If function just return fieldFromInstruction(...) | ||
inst_param: Node = param_list_caller.named_children[0] | ||
else: | ||
inst_param = param_list_caller.named_children[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please add a test for this edge case.
@@ -0,0 +1,60 @@ | |||
// Copyright © 2024 Rot127 <[email protected]> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your copyright
suite/cstest/src/test_detail_arc.c
Outdated
@@ -0,0 +1,99 @@ | |||
// Copyright © 2024 Rot127 <[email protected]> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also your copyright please
suite/test_corpus3.py
Outdated
@@ -134,6 +135,8 @@ def test_file(fname): | |||
("CS_ARCH_ALPHA", "CS_MODE_BIG_ENDIAN"): 56, | |||
("CS_ARCH_HPPA", "CS_MODE_HPPA_11+CS_MODE_BIG_ENDIAN"): 57, | |||
("CS_ARCH_HPPA", "CS_MODE_HPPA_20+CS_MODE_BIG_ENDIAN"): 58, | |||
("CS_ARCH_ARC", "CS_MODE_LITTLE_ENDIAN"): 59, | |||
("CS_ARCH_ARC", "CS_MODE_BIG_ENDIAN"): 60, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
("CS_ARCH_ARC", "CS_MODE_BIG_ENDIAN"): 60, |
tests/details/arc.yaml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these tests cover all the possible OP_GROUPS
? If no, please add one test for each op group.
Otherwise, all good.
Also, please add tests which checks the implicit reads, writes and groups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Almost done. Just those few nitpicks.
Also, please mark the suggestions as resolved when done.
switch (op_group) { | ||
default: | ||
printf("ERROR: Operand group %d not handled!\n", op_group); | ||
assert(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert(0); | |
CS_ASSERT_RET(0); |
ARC_INS_ADD_, | ||
ARC_INS_ADD_F, | ||
ARC_INS_ADD, | ||
ARC_INS_AND_, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't forget about those ones.
class BadConditionCode(Patch): | ||
""" | ||
Patch return BadConditionCode | ||
to assert(0 && "Unknown condition code passed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to assert(0 && "Unknown condition code passed") | |
to CS_ASSERT(0 && "Unknown condition code passed") |
@@ -0,0 +1,32 @@ | |||
# Copyright © 2024 Dmitry Sibitsev <[email protected]> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your email please :)
Your checklist for this pull request
Detailed description
This PR adds support for the ARC architecture in capstone. It is worth noting that this architecture is experimental in lvm, that is, full support for all instructions is not fully implemented. Also, MC tests in llvm (like llvm_root/llvm/test/MC/Disassembler/ARC/ldst.txt ) have a format different from other architectures, which is why tests auto-generated by MCUpdater.py does not add instruction bytes to yaml. I think this format will change when ARC is fully supported
To successfully pass the autosync test, a merge of the following PR is required: capstone-engine/llvm-capstone#69
Test plan
There are no differences in the way capstone is built with the ARC architecture
MC tests were generated by MCUpdater. Only instruction bytes are manually added. All other tests are added manually
At the moment, all tests are passing successfully, there are CI/CD results to confirm
Closing issues
None