- Target instruction set and pointer size
- Target calling convention
- Runtime data structures (not really covered here)
- GC encoding
- So far there are only two encodings: JIT32_GCENCODER and everything else
- Debug info (so far mostly the same for all targets?)
- EH info (not really covered here)
One advantage of the CLR is that the VM (mostly) hides the (non-ABI) OS differences.
- 32 vs. 64 bits
- This work is not yet complete in the backend, but should be sharable
- Instruction set architecture:
- instrsXXX.h, emitXXX.cpp and targetXXX.cpp
- lowerXXX.cpp
- codeGenXXX.cpp and simdcodegenXXX.cpp
- unwindXXX.cpp
- Calling Convention: all over the place
- Calling Convention
- Struct args and returns seem to be the most complex differences
- Importer and morph are highly aware of these
- E.g. fgMorphArgs(), fgFixupStructReturn(), fgMorphCall(), fgPromoteStructs() and the various struct assignment morphing methods
- HFAs on ARM
- Tail calls are target-dependent, but probably should be less so
- Intrinsics: each platform recognizes different methods as intrinsics (e.g. Sin only for x86, Round everywhere BUT amd64)
- Target-specific morphs such as for mul, mod and div
- Lowering: fully expose control flow and register requirements
- Code Generation: traverse blocks in layout order, generating code (InstrDescs) based on register assignments on nodes
- Then, generate prolog & epilog, as well as GC, EH and scope tables
- ABI changes:
- Calling convention register requirements
- Lowering of calls and returns
- Code sequences for prologs & epilogs
- Allocation & layout of frame
- Conditional compilation (set in jit.h, based on incoming define, e.g. #ifdef X86)
_TARGET_64BIT_ (a 32-bit target is just !_TARGET_64BIT_)
_TARGET_XARCH_, _TARGET_ARMARCH_
_TARGET_AMD64_, _TARGET_X86_, _TARGET_ARM64_, _TARGET_ARM_
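As a rough illustration of how these macros compose, the sketch below derives the architecture-family and bitness macros from a single incoming define. `SKETCH_HOST_AMD64`, `SKETCH_HOST_ARM`, `kTargetPointerSize` and `targetIsXArch()` are hypothetical names invented for this example; the real logic in jit.h is more involved.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical incoming define; a real build would pass this on the command line.
#define SKETCH_HOST_AMD64 1

// Map the incoming define onto the target macros named in this section.
#if defined(SKETCH_HOST_AMD64)
#define _TARGET_AMD64_ 1
#define _TARGET_XARCH_ 1
#define _TARGET_64BIT_ 1
#elif defined(SKETCH_HOST_ARM)
#define _TARGET_ARM_ 1
#define _TARGET_ARMARCH_ 1
// No _TARGET_64BIT_ here: a 32-bit target is simply one that does not define it.
#endif

// Assumed example of bitness-dependent code: the target pointer size in bytes.
#ifdef _TARGET_64BIT_
constexpr std::size_t kTargetPointerSize = 8;
#else
constexpr std::size_t kTargetPointerSize = 4;
#endif

// Assumed example of code shared by an architecture family (x86 and amd64).
inline bool targetIsXArch()
{
#ifdef _TARGET_XARCH_
    return true;
#else
    return false;
#endif
}
```

Code that applies to both x86 and amd64 can then test the family macro, while pointer-size-dependent code tests only the bitness macro.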
- Target.h
- InstrsXXX.h
- The instrDesc is the data structure used for encoding
- It is initialized with the opcode bits, and has fields for immediates and register numbers.
- instrDescs are collected into groups
- A label may only occur at the beginning of a group
- The emitter is called to:
- Create new instructions (instrDescs), during CodeGen
- Emit the bits from the instrDescs after CodeGen is complete
- Update GC info (live GC vars & safe points)
- The instruction encodings are captured in instrsXXX.h. These are the opcode bits for each instruction
- The structure of each instruction's encoding is target-dependent
- An "instruction" is just the representation of the opcode
- An instance of "instrDesc" represents the instruction to be emitted
- For each "type" of instruction, emit methods need to be implemented. These follow a pattern but a target may have unique ones, e.g.
emitter::emitInsMov(instruction ins, emitAttr attr, GenTree* node)
emitter::emitIns_R_I(instruction ins, emitAttr attr, regNumber reg, ssize_t val)
emitter::emitInsTernary(instruction ins, emitAttr attr, GenTree* dst, GenTree* src1, GenTree* src2) (currently Arm64 only)
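The two-phase flow described above (create instrDescs during CodeGen, then emit bits afterwards) can be sketched roughly as follows. Every type and name here (`SketchInstrDesc`, `SketchInsGroup`, `SketchEmitter`) is hypothetical, and the opcode values and byte layout are made up for illustration; they do not correspond to any real target encoding or to the actual emitter classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Made-up opcode bits; a real port would take these from instrsXXX.h.
enum SketchInstruction : uint8_t { SK_INS_mov = 0xB8, SK_INS_add = 0x05 };

// Hypothetical stand-in for an instrDesc: opcode bits plus operand fields.
struct SketchInstrDesc
{
    SketchInstruction ins; // opcode bits
    uint8_t           reg; // register number
    int32_t           imm; // immediate value
};

// Hypothetical instruction group; a label can only sit at the start of a group.
struct SketchInsGroup
{
    int                          label; // -1 means "no label"
    std::vector<SketchInstrDesc> instrs;
};

class SketchEmitter
{
    std::vector<SketchInsGroup> groups;

public:
    // Starting a new group is the only way to attach a label.
    void beginGroup(int label = -1) { groups.push_back({label, {}}); }

    // Phase 1 (during CodeGen): record a register/immediate instruction.
    void emitIns_R_I(SketchInstruction ins, uint8_t reg, int32_t val)
    {
        if (groups.empty())
            beginGroup();
        groups.back().instrs.push_back({ins, reg, val});
    }

    // Phase 2 (after CodeGen): turn every descriptor into bytes.
    // Made-up layout: opcode, register, then a 4-byte little-endian immediate.
    std::vector<uint8_t> emitAll() const
    {
        std::vector<uint8_t> out;
        for (const SketchInsGroup& g : groups)
        {
            for (const SketchInstrDesc& id : g.instrs)
            {
                out.push_back(static_cast<uint8_t>(id.ins));
                out.push_back(id.reg);
                const uint32_t bits = static_cast<uint32_t>(id.imm);
                for (int i = 0; i < 4; i++)
                    out.push_back(static_cast<uint8_t>((bits >> (8 * i)) & 0xFF));
            }
        }
        return out;
    }

    std::size_t groupCount() const { return groups.size(); }
};
```

Separating descriptor creation from bit emission is what lets later phases revisit the instruction stream before final offsets are fixed.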
- Lowering ensures that all register requirements are exposed for the register allocator
- Use count, def count, "internal" reg count, and any special register requirements
- Does half the work of code generation, since all computation is made explicit
- But it is NOT necessarily a 1:1 mapping from lowered tree nodes to target instructions
- Its first pass does a tree walk, transforming the instructions. Some of this is target-independent. Notable exceptions:
- Calls and arguments
- Switch lowering
- LEA transformation
- Its second pass walks the nodes in execution order
- Sets register requirements
- Sometimes changes the register requirements of children (which have already been traversed)
- Sets the block order and node locations for LSRA
- LinearScan::startBlockSequence() and LinearScan::moveToNextBlock()
- Register allocation is largely target-independent
- The second phase of Lowering does nearly all the target-dependent work
- Register candidates are determined in the front-end
- Local variables or temps, or fields of local variables or temps
- Not address-taken, plus a few other restrictions
- Sorted by lvaSortByRefCount(), and marked "lvTracked"
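The candidate selection described above can be approximated with the sketch below: filter out address-taken locals, sort the rest by reference count, and mark the top ones as tracked. `SketchLclVar` and `sketchMarkTracked` are hypothetical names, and the `maxTracked` cap is an assumption standing in for the "few other restrictions"; this is not the logic of lvaSortByRefCount() itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified local-variable descriptor.
struct SketchLclVar
{
    int  lclNum;    // local variable number
    int  refCount;  // how often the local is referenced
    bool addrTaken; // address-taken locals are not register candidates
    bool tracked;   // set by sketchMarkTracked for chosen candidates
};

// Marks up to maxTracked non-address-taken locals as tracked, highest
// reference count first. Returns the local numbers in the order chosen.
std::vector<int> sketchMarkTracked(std::vector<SketchLclVar>& locals, std::size_t maxTracked)
{
    std::vector<SketchLclVar*> candidates;
    for (SketchLclVar& v : locals)
    {
        if (!v.addrTaken)
            candidates.push_back(&v);
    }

    // stable_sort keeps the original order among locals with equal counts.
    std::stable_sort(candidates.begin(), candidates.end(),
                     [](const SketchLclVar* a, const SketchLclVar* b) { return a->refCount > b->refCount; });

    std::vector<int> order;
    for (std::size_t i = 0; i < candidates.size() && i < maxTracked; i++)
    {
        candidates[i]->tracked = true;
        order.push_back(candidates[i]->lclNum);
    }
    return order;
}
```

With four locals where one is address-taken and the cap is two, only the two most-referenced remaining locals end up tracked.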
- The code to find and capture addressing modes is particularly poorly abstracted
- genCreateAddrMode(), in CodeGenCommon.cpp, traverses the tree looking for an addressing mode, then captures its constituent elements (base, index, scale & offset) in "out parameters"
- It optionally generates code
- For RyuJIT, it NEVER generates code, and is only used by gtSetEvalOrder, and by lowering
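To make the "out parameters" shape concrete, here is a minimal sketch of matching `base + index*scale + offset` on a toy expression tree. `SketchNode`, `SketchOper` and `sketchCreateAddrMode` are hypothetical; the real genCreateAddrMode() handles many more tree shapes and operand orders, and its signature is not reproduced here.

```cpp
#include <cassert>

enum SketchOper { SK_LCL, SK_CNS, SK_ADD, SK_MUL };

// Hypothetical toy tree node.
struct SketchNode
{
    SketchOper  oper;
    long        value; // constant value for SK_CNS, otherwise unused
    SketchNode* op1;
    SketchNode* op2;
};

// Scales assumed encodable by the target (an assumption modeled on x86-style modes).
static bool sketchIsValidScale(long s)
{
    return s == 1 || s == 2 || s == 4 || s == 8;
}

// Tries to match base + index*scale + offset rooted at 'addr' and reports the
// pieces through out parameters. Returns true only if something was folded.
// It never generates code; it only inspects the tree.
bool sketchCreateAddrMode(SketchNode* addr, SketchNode** base, SketchNode** index,
                          unsigned* scale, long* offset)
{
    *base   = nullptr;
    *index  = nullptr;
    *scale  = 0;
    *offset = 0;

    // Peel off constant offsets of the form ADD(x, CNS).
    while (addr->oper == SK_ADD && addr->op2->oper == SK_CNS)
    {
        *offset += addr->op2->value;
        addr = addr->op1;
    }

    if (addr->oper == SK_ADD)
    {
        SketchNode* rhs = addr->op2;
        if (rhs->oper == SK_MUL && rhs->op2->oper == SK_CNS && sketchIsValidScale(rhs->op2->value))
        {
            *index = rhs->op1; // ADD(base, MUL(index, scale))
            *scale = static_cast<unsigned>(rhs->op2->value);
        }
        else
        {
            *index = rhs; // ADD(base, index) with an implicit scale of 1
            *scale = 1;
        }
        *base = addr->op1;
    }
    else
    {
        *base = addr; // nothing but a base (plus any offset already peeled)
    }

    return (*index != nullptr) || (*offset != 0);
}
```

Callers that only want to know whether an addressing mode exists can ignore the out parameters and use the return value.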
- For the most part, the code generation method structure is the same for all architectures
- Most code generation methods start with "gen"
- Theoretically, CodeGenCommon.cpp contains code "mostly" common to all targets (this factoring is imperfect)
- Method prolog and epilog
- genCodeForBBList
- walks the trees in execution order, calling genCodeForTreeNode, which needs to handle all nodes that are not "contained"
- generates control flow code (branches, EH) for the block
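The block walk described above might look roughly like the sketch below, which visits blocks in layout order, skips "contained" nodes (those are handled as part of their parent), and appends control flow at the end of each block. `SketchTreeNode`, `SketchBlock` and `sketchGenCodeForBBList` are hypothetical, and the emitted strings are placeholders for real instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node: 'contained' nodes are folded into their parent's code.
struct SketchTreeNode
{
    std::string name;
    bool        contained;
};

// Hypothetical block: nodes are stored in execution order.
struct SketchBlock
{
    std::vector<SketchTreeNode> nodes;
    int                         jumpTarget; // index of target block, -1 for fall-through
};

// Walks blocks in layout order and returns a textual stand-in for generated code.
std::vector<std::string> sketchGenCodeForBBList(const std::vector<SketchBlock>& blocks)
{
    std::vector<std::string> code;
    for (std::size_t i = 0; i < blocks.size(); i++)
    {
        code.push_back("L" + std::to_string(i) + ":");

        for (const SketchTreeNode& node : blocks[i].nodes)
        {
            if (node.contained)
                continue; // no code of its own; the parent node handles it
            code.push_back("gen " + node.name);
        }

        // Control flow for the block comes after its nodes.
        if (blocks[i].jumpTarget >= 0)
            code.push_back("jmp L" + std::to_string(blocks[i].jumpTarget));
    }
    return code;
}
```

Note that the contained constant in such a walk produces no line of its own, which mirrors the point that lowered nodes and emitted instructions are not 1:1.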