Extend coverage of aarch64 lifter, including SIMD #1546
Open
DukMastaaa wants to merge 175 commits into BinaryAnalysisPlatform:master from UQ-PAC:aarch64-pull-request-2
Stur instructions
…UQ-PAC/bap into implement-missing-aarch64-insns
separated into category files
LLVM can't seem to disassemble ARMv8.4 instructions like RMIF, SETF8 and SETF16. Also, CFINV gets turned into MSR (register) but LLVM returns ill-formed asm...? I've commented this in aarch64-pstate.lisp.
i typed is_zero with underscore instead of primitive is-zero
documentation added for macros and helper functions.
…UQ-PAC/bap into implement-missing-aarch64-insns
llvm mnemonics most likely incorrect, will investigate why bap's llvm doesn't disassemble these insns
i've used ` bap mc --cpu=cortex-a55 --triple=aarch64` to get the llvm mnemonic, but will need to talk to ivan about lisp context and specifying generic armv8.x instead of a specific cpu
…UQ-PAC/bap into implement-missing-aarch64-insns
Miscellaneous fixes and adding instructions
fix: replace lognot with lnot
LDURHH, LDURSB, LDURSH, LDURSW
RBIT (and reverse-bits helper)
UMSUBL, SMSUBL, UMADDL, SMADDL
Implemented all LD (multiple structures), LD (single structures), LD.R…
packages form a flat namespace (and seem to need a flat file hierarchy as well) we'll just use the aarch64-simd- prefix as a replacement for folders
will need to find out why primitive doesn't work for concat
function overloads are not nice sometimes
Added a few more arithmetic instructions.
File structure
SIMD instructions have been implemented in files under `plugins/arm/semantics` with the `aarch64-simd-` prefix, in the `aarch64` package. This is done as `bap` only looks in the top level of the `semantics` folder, so adding a subdirectory `simd` won't be recognised.

For FP instructions soon to be implemented, this approach (with prefix `aarch64-fp`) will also be used.

`nth-reg-in-group` primitive
For instructions in the `CASP` and `LDn` families, LLVM gives BAP register groups like `X0_X1` or `Q0_Q1_Q2`, which used to require a large switch statement like the following to extract the actual register:

A new Primus Lisp primitive, `(nth-reg-in-group sym n)`, returns the `n`th register in a register group passed in as a symbol, `sym`.

For example, `(nth-reg-in-group 'D0_D1_D2_D3 2)` returns `D2`.
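For readers who don't have the Lisp source open, the behaviour described above can be modelled in a few lines of Python. This is only a sketch of the name-level behaviour: it assumes group names are always underscore-separated register names (as in the examples above) and that indexing is 0-based (which the `D2` example implies). The Python function name is made up for illustration, and the sketch does not model how the real primitive resolves the name to a register variable.

```python
def nth_reg_in_group(group: str, n: int) -> str:
    """Return the n-th (0-based) register name of an underscore-separated group.

    Hypothetical Python model of the (nth-reg-in-group sym n) primitive;
    it works on names only and knows nothing about register values.
    """
    regs = group.split("_")
    if not 0 <= n < len(regs):
        raise IndexError(f"{group!r} has {len(regs)} registers, index {n} is out of range")
    return regs[n]


print(nth_reg_in_group("D0_D1_D2_D3", 2))  # D2, matching the example above
print(nth_reg_in_group("X0_X1", 1))        # X1
```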
(:warning: there is a slight problem with the implementation, please see the notes on `CASP` below)

Non-SIMD Instructions
There are a lot of instructions implemented in this PR; these are sufficient to fully lift the cntlm binary (cross-compiled for aarch64) except for two `FMOV` variants. This was tested using the `--print-missing` option to `bap disassemble` (#1410).

Instructions added are listed here, with some containing the BIL code hidden under a collapsible menu.
Arithmetic
`ADDS*ri`, `ADDS*rs`, `ADD*rx`, `ADDXrx64`
`SUBXr*`, `SUBSXrx`, `SUBSXrx64`
Similar to `ADDS`.
`UMADDLrr`, `SMADDLrr`, `UMSUBLrr`, `SMSUBLrr`
The rest are similar.
`UMULHrr`
`ADR`
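As a cross-check for the widening multiply family, here is a small Python reference model written from the architectural definitions rather than from the lifter code in this PR. The function names mirror the mnemonics and are otherwise hypothetical; registers are modelled as plain unsigned integers.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def umaddl(xa: int, wn: int, wm: int) -> int:
    # Xd = Xa + UInt(Wn) * UInt(Wm), truncated to 64 bits
    return (xa + (wn & MASK32) * (wm & MASK32)) & MASK64


def smaddl(xa: int, wn: int, wm: int) -> int:
    # Xd = Xa + SInt(Wn) * SInt(Wm), truncated to 64 bits
    return (xa + sign_extend(wn, 32) * sign_extend(wm, 32)) & MASK64


def umsubl(xa: int, wn: int, wm: int) -> int:
    # Xd = Xa - UInt(Wn) * UInt(Wm), truncated to 64 bits
    return (xa - (wn & MASK32) * (wm & MASK32)) & MASK64


def smsubl(xa: int, wn: int, wm: int) -> int:
    # Xd = Xa - SInt(Wn) * SInt(Wm), truncated to 64 bits
    return (xa - sign_extend(wn, 32) * sign_extend(wm, 32)) & MASK64


def umulh(xn: int, xm: int) -> int:
    # Upper 64 bits of the unsigned 128-bit product
    return ((xn & MASK64) * (xm & MASK64)) >> 64


print(hex(umaddl(1, 0xFFFFFFFF, 2)))  # 0x1ffffffff
print(hex(smaddl(1, 0xFFFFFFFF, 2)))  # 0xffffffffffffffff
```

The four `*L` variants differ only in the extension applied to the 32-bit operands and in the sign of the product, which is why one BIL listing is representative of the group.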
Atomic
`CASP` family
This uses the `load-acquire` and `store-release` intrinsics as described in #1458.

(:warning:) The `nth-reg-in-group` primitive is also used to extract the registers in the `xa_xb` pairs. However, its implementation prevents the following expression from reifying correctly:

The expected result is `X0.X1`, but printing out the result with `msg` gives `0x30000000000000004`.

As a temporary workaround, a helper function `(register-pair-concat r-pair)` has been defined, containing a large switch statement with cases for each `'Xa_Xb`, but this is not ideal. Some advice on how to resolve this would be much appreciated.
Data movement
BIL code has not been provided for most instructions in this category because there are so many of them and the differences between them are minute.
Loads:
`LDR*ro*`, `LDR*pre`, `LDR*post`, `LDR*ui`
`LDRBBro*`, `LDRBBpre`, `LDRBBpost`
`LDRHHro*`, `LDRHHpre`, `LDRHHpost`, `LDRHHui`
`LDP*pre`, `LDP*post`, `LDP*i`
`LDRSWui`, `LDRSWro*`
`LDURBBi`, `LDURHHi`
`LDURSB*i`, `LDURSH*i`, `LDURSWi`
`LDUR*i`

Stores:
`STR*ro*`, `STR*pre`, `STR*post`
`STRHHui`
`STRBBro*`, `STRBBpre`, `STRBBpost`
`STP*pre`, `STP*post`, `STP*i`
`STURHHi`, `STURBBi`

Other:
`EXTR*rri`
Logical
`ANDS*ri`, `ANDS*rs`
`BIC*r`, `BICS*rs`
`REV*r`, `REV16*r`, `REV32Xr`
Note that `REV16*r` etc. reverse the bytes within each 16-bit container of the register.
`ASRV*r`, `LSRV*r`, `LSLV*r`, `RORV*r`
Nothing special about these.
`RBIT*r`
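To illustrate what "container" means for the `REV` family, and what `RBIT` does, here is a short Python model written from the architectural behaviour. The helper names are hypothetical and are not the names used in the Lisp files.

```python
def rev_containers(value: int, width: int, container: int) -> int:
    """Reverse the byte order inside each `container`-bit chunk of a `width`-bit value."""
    result = 0
    for base in range(0, width, container):
        chunk = (value >> base) & ((1 << container) - 1)
        # Serialise little-endian and read back big-endian to swap the bytes.
        swapped = int.from_bytes(chunk.to_bytes(container // 8, "little"), "big")
        result |= swapped << base
    return result


def rbit(value: int, width: int) -> int:
    """Reverse the bit order of a `width`-bit value."""
    bits = format(value & ((1 << width) - 1), f"0{width}b")
    return int(bits[::-1], 2)


print(hex(rev_containers(0x11223344, 32, 32)))          # REV   Wd: 0x44332211
print(hex(rev_containers(0x11223344, 32, 16)))          # REV16 Wd: 0x22114433
print(hex(rev_containers(0x1122334455667788, 64, 32)))  # REV32 Xd: 0x4433221188776655
print(hex(rbit(1, 32)))                                 # RBIT  Wd: 0x80000000
```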
Special
`BRK`
This passes the label argument to a `software-breakpoint` intrinsic.

SIMD Instructions
We use `.` to indicate one of `B`, `H`, `S`, `D`, `Q` instead of `*` to avoid name conflicts with existing non-SIMD macros.

Arithmetic
Here, we just reuse `*` to also indicate some number for element count or element size.
`ADDv*i*`, `SUBv*i*`, `MULv*i*`
Note: `+` has a higher precedence in the textual representation than `.`, so although the spacing in the BIL output below is misleading, the output is correct.

The rest are similar and only differ in the binary operation.
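The element-wise pattern shared by these instructions can be sketched in Python as follows: split both operands into lanes, apply the binary operation per lane with wrap-around, and concatenate the lanes back together. This is an illustrative model with hypothetical helper names, not the lifter's implementation; vectors are modelled as unsigned integers with lane 0 in the least significant bits.

```python
import operator


def to_lanes(value: int, esize: int, count: int) -> list:
    """Split a vector into `count` lanes of `esize` bits, lane 0 first."""
    mask = (1 << esize) - 1
    return [(value >> (i * esize)) & mask for i in range(count)]


def from_lanes(lanes: list, esize: int) -> int:
    """Concatenate lanes back into one integer, lane 0 in the low bits."""
    mask = (1 << esize) - 1
    result = 0
    for i, lane in enumerate(lanes):
        result |= (lane & mask) << (i * esize)
    return result


def vector_binop(op, vn: int, vm: int, esize: int, count: int) -> int:
    """Apply `op` lane by lane; each lane result is truncated to `esize` bits."""
    pairs = zip(to_lanes(vn, esize, count), to_lanes(vm, esize, count))
    return from_lanes([op(a, b) for a, b in pairs], esize)


# Four 32-bit lanes, as in an ADDv4i32-style instruction.
vn = from_lanes([1, 0xFFFFFFFF, 3, 4], 32)
vm = from_lanes([1, 1, 1, 1], 32)
print(to_lanes(vector_binop(operator.add, vn, vm, 32, 4), 32, 4))  # [2, 0, 4, 5]
```

Swapping `operator.add` for `operator.sub` or `operator.mul` gives the other two instructions in the group; note that the carry out of one lane never reaches the next.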
Loads
This PR implements all of the SIMD load instructions; see the PR diff for a full list.
Instructions with interesting BIL output are listed below.
`LDNP.i`
As an instruction with non-temporal properties, LDNP relaxes the order of its memory accesses. This is represented as a call to a `non-temporal-hint` intrinsic where the address is passed as a parameter.
`LD..v._POST` (e.g. `ld2 {v0.4s, v1.4s}, [x2], x3`)
This instruction family receives register groups from LLVM, like `CASP`.
The BIL code separates each memory access individually to accurately model the interleaving done by the processor.
This may not be ideal for generated code size -- advice on making such levels of detail toggleable would be appreciated.
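For reference, the de-interleaving that the per-access BIL models can be sketched in Python for the `ld2 {v0.4s, v1.4s}, [x2], x3` example. The sketch assumes little-endian memory modelled as a `bytes` object and uses hypothetical names; it follows the architectural behaviour (consecutive elements alternate between the two registers, then the base is post-incremented), not the Lisp code in this PR.

```python
def ld2_post(memory: bytes, base: int, post_offset: int,
             esize_bits: int = 32, count: int = 4):
    """Model LD2 with post-index: returns (first_reg, second_reg, new_base)."""
    ebytes = esize_bits // 8
    regs = [0, 0]
    addr = base
    for elem in range(count):       # element index within each register
        for r in range(2):          # structure member: first or second register
            # One separate memory access per element, as in the BIL.
            value = int.from_bytes(memory[addr:addr + ebytes], "little")
            regs[r] |= value << (elem * esize_bits)
            addr += ebytes
    return regs[0], regs[1], base + post_offset


# Memory holds the 32-bit words 0, 1, 2, ..., 7.
mem = b"".join(i.to_bytes(4, "little") for i in range(8))
v0, v1, new_base = ld2_post(mem, 0, 32)
print([(v0 >> (32 * i)) & 0xFFFFFFFF for i in range(4)])  # [0, 2, 4, 6]
print([(v1 >> (32 * i)) & 0xFFFFFFFF for i in range(4)])  # [1, 3, 5, 7]
```

Eight element-sized accesses for a single instruction is what drives the code-size concern mentioned above.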
Similar expansions apply to the rest of the `LDn` family.

Logical
`ANDv*i*`, `EORv*i*`, `NOTv*i*`, `ORRv*i*`, `ORNv*i*`
These are done on the whole register `Vn`.

Misc. movement
`INSvi32gpr`, `INSvi32lane`
The implementation uses bitmasks and bit shifts to insert the vector elements, but could equivalently use extract and concat.
Please advise if this is preferred.
`MOVIv*i*`, `MOVIv*b_ns`
`EXTv*i*`
This is implemented literally as described in the ISA with extract after concat.
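To help compare the two formulations of `INS` mentioned above, and to show the "extract after concat" shape of `EXT`, here is a Python sketch. All names are hypothetical, vectors are 128-bit unsigned integers with lane 0 in the low bits, and the code is a model of the architectural behaviour rather than a transcription of the Lisp.

```python
MASK128 = (1 << 128) - 1


def extract(value: int, hi: int, lo: int) -> int:
    """Bits hi..lo (inclusive) of value; an empty range (hi < lo) yields 0."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def ins_mask_shift(vd: int, index: int, value: int, esize: int = 32) -> int:
    """Insert an element by clearing the lane with a mask and OR-ing the shifted value."""
    mask = ((1 << esize) - 1) << (index * esize)
    return (vd & ~mask & MASK128) | ((value << (index * esize)) & mask)


def ins_extract_concat(vd: int, index: int, value: int, esize: int = 32) -> int:
    """Insert an element by concatenating upper bits, the new element, and lower bits."""
    lo = index * esize
    hi = lo + esize
    upper = extract(vd, 127, hi)      # bits above the lane
    lower = extract(vd, lo - 1, 0)    # bits below the lane
    element = value & ((1 << esize) - 1)
    return (upper << hi) | (element << lo) | lower


def ext(vn: int, vm: int, index: int, datasize: int = 128) -> int:
    """EXT: concatenate Vm:Vn (Vm high), then extract datasize bits from byte `index`."""
    concat = (vm << datasize) | vn
    return extract(concat, index * 8 + datasize - 1, index * 8)


vd = 0x0123456789ABCDEF_FEDCBA9876543210
for lane in range(4):
    assert ins_mask_shift(vd, lane, 0xDEADBEEF) == ins_extract_concat(vd, lane, 0xDEADBEEF)

vn = int.from_bytes(bytes(range(16)), "little")      # bytes 0..15
vm = int.from_bytes(bytes(range(16, 32)), "little")  # bytes 16..31
print(list(ext(vn, vm, 3).to_bytes(16, "little")))   # bytes 3..18
```

Both `INS` formulations give the same value for every lane, so the choice between them is about the readability and size of the generated BIL rather than correctness.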
Store
Most of these have nearly identical implementations to the non-SIMD `STP` variants.
`STR.ro*`, `STR.pre`, `STR.post`, `STR.ui`
For `STR.ro*`:
For `STR.post` (`pre` is similar):
For `STR.ui`:
`STP.pre`, `STP.post`, `STP.i`
`STUR.i`
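The four `STR` variants differ only in how the effective address and the base-register write-back are computed. A Python sketch of the architectural addressing modes is given below as a reading aid. The names are hypothetical, offsets are plain byte offsets (any immediate scaling done by LLVM is not modelled), the optional extension of a 32-bit index register in the `ro*` forms is left out, and memory is a little-endian `bytearray`.

```python
MASK64 = (1 << 64) - 1


def addr_register_offset(base: int, index: int, shift: int = 0):
    """ro: address = base + (index << shift); the base register is unchanged."""
    return (base + (index << shift)) & MASK64, None


def addr_pre_index(base: int, offset: int):
    """pre: address = base + offset; the base register is updated to the address."""
    addr = (base + offset) & MASK64
    return addr, addr


def addr_post_index(base: int, offset: int):
    """post: address = base; the base register is updated to base + offset."""
    return base & MASK64, (base + offset) & MASK64


def addr_unsigned_offset(base: int, offset: int):
    """ui: address = base + offset; the base register is unchanged."""
    return (base + offset) & MASK64, None


def store(memory: bytearray, addr: int, value: int, size: int) -> None:
    """Write the low `size` bytes of value to memory in little-endian order."""
    memory[addr:addr + size] = (value & ((1 << (8 * size)) - 1)).to_bytes(size, "little")


# A Q-sized (16-byte) store with post-index write-back.
mem = bytearray(64)
addr, new_base = addr_post_index(16, 16)
store(mem, addr, 0xAB, 16)
print(addr, new_base, mem[16])  # 16 32 171
```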