Program Abstraction in Tai‐e (core classes and IR)
This document introduces Tai-e's abstraction of the Java program being analyzed. You will likely need the classes introduced here when developing analyses on top of Tai-e. See Section 2 of our technical report for further discussion.
- `JClass` (in `pascal.taie.language.classes`): represents classes in the program. Each instance holds information about a class, such as its name, modifiers, and declared methods and fields.
- `JMethod` and `JField` (in `pascal.taie.language.classes`): represent class members, i.e., methods and fields in the program. Each `JMethod`/`JField` instance holds information about a method/field, such as its declaring class and name.
- `ClassHierarchy` (in `pascal.taie.language.classes`): manages all the classes of the program. It offers APIs to query class hierarchy information, such as method dispatch and subclass checking.
- `Type` (in `pascal.taie.language.type`): represents types in the program. It has several subclasses, e.g., `PrimitiveType`, `ClassType`, and `ArrayType`, representing different kinds of Java types.
- `TypeSystem` (in `pascal.taie.language.type`): provides APIs for retrieving specific types and for subtype checking.
- `World` (in `pascal.taie`): manages the whole-program information. Its getters give access to this information, e.g., `ClassHierarchy` and `TypeSystem`. `World` is essentially a singleton class, and you can obtain the instance by calling `World.get()`.
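
To make "method dispatch" and "subclass checking" concrete, here is a minimal, self-contained sketch of the two queries on a toy hierarchy. It does not use Tai-e: `ToyHierarchy`, `ToyClass`, `isSubclass`, and `dispatch` are hypothetical names chosen for illustration, and the real `ClassHierarchy` API may differ in names, signatures, and behavior (for example, in how it treats interfaces and abstract methods). It assumes Java 17 or later.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy model of a class hierarchy; not Tai-e's ClassHierarchy API. */
final class ToyHierarchy {

    /** A class name, its superclass name (null for the root), and declared method subsignatures. */
    record ToyClass(String name, String superName, Set<String> declaredMethods) {}

    private final Map<String, ToyClass> classes = new HashMap<>();

    void add(String name, String superName, String... methods) {
        classes.put(name, new ToyClass(name, superName, Set.copyOf(Arrays.asList(methods))));
    }

    /** Subclass checking: walk the superclass chain of sub and look for sup. */
    boolean isSubclass(String sup, String sub) {
        for (String c = sub; c != null; c = classes.get(c).superName()) {
            if (c.equals(sup)) {
                return true;
            }
        }
        return false;
    }

    /** Method dispatch: return the nearest class, starting at the receiver, that declares the method. */
    String dispatch(String receiverClass, String subsignature) {
        for (String c = receiverClass; c != null; c = classes.get(c).superName()) {
            if (classes.get(c).declaredMethods().contains(subsignature)) {
                return c;
            }
        }
        return null; // no declaration found along the chain
    }

    /** Builds the sample hierarchy Object <- Animal <- Dog <- Puppy. */
    static ToyHierarchy sample() {
        ToyHierarchy h = new ToyHierarchy();
        h.add("Object", null, "toString()");
        h.add("Animal", "Object", "speak()");
        h.add("Dog", "Animal", "speak()");
        h.add("Puppy", "Dog");
        return h;
    }

    public static void main(String[] args) {
        ToyHierarchy h = sample();
        System.out.println(h.isSubclass("Animal", "Puppy"));  // true
        System.out.println(h.dispatch("Puppy", "speak()"));    // Dog
        System.out.println(h.dispatch("Puppy", "toString()")); // Object
    }
}
```

The toy keeps one map for the whole program and answers both queries by walking superclass links. This mirrors the idea of a single hierarchy object that manages all classes, without claiming anything about how Tai-e implements it.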
Tai-e IR is a typed, 3-address, statement- and expression-based representation of Java method bodies. The IR classes reside in package `pascal.taie.ir` and its sub-packages.
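
As an illustration of what "3-address" means, the sketch below lowers a nested source expression into a sequence of statements that each apply at most one operator to variables. It is a self-contained toy, not Tai-e's front end: `ThreeAddressDemo`, `Name`, `BinOp`, and the `temp$N` naming scheme are assumptions made for this example. It assumes Java 17 or later.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical lowering of nested expressions to 3-address statements; not Tai-e code. */
final class ThreeAddressDemo {

    /** Source-level expression tree. */
    interface Node {}
    record Name(String id) implements Node {}
    record BinOp(String op, Node left, Node right) implements Node {}

    private final List<String> stmts = new ArrayList<>();
    private int tempCounter = 0;

    /** Returns the variable that holds the value of n, emitting statements as needed. */
    private String lower(Node n) {
        if (n instanceof Name name) {
            return name.id();
        }
        BinOp b = (BinOp) n;
        String l = lower(b.left());
        String r = lower(b.right());
        String temp = "temp$" + tempCounter++;
        // Each emitted statement has the shape "Var = Var op Var;".
        stmts.add(temp + " = " + l + " " + b.op() + " " + r + ";");
        return temp;
    }

    /** Lowers "target = rhs" and returns the resulting statement list. */
    static List<String> lowerAssignment(String target, Node rhs) {
        ThreeAddressDemo d = new ThreeAddressDemo();
        String v = d.lower(rhs);
        d.stmts.add(target + " = " + v + ";");
        return d.stmts;
    }

    public static void main(String[] args) {
        // x = a + b * c
        Node rhs = new BinOp("+", new Name("a"), new BinOp("*", new Name("b"), new Name("c")));
        lowerAssignment("x", rhs).forEach(System.out::println);
        // temp$0 = b * c;
        // temp$1 = a + temp$0;
        // x = temp$1;
    }
}
```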
There are three core classes in Tai-e IR:
- `IR` is the central data structure of the intermediate representation in Tai-e. Each `IR` instance can be seen as a container of the information for the body of a particular method, such as variables, parameters, and statements. You can obtain the `IR` instance of a method via `JMethod.getIR()` (provided the method is not abstract).
- `Stmt` represents all statements in the program. This interface has about a dozen subclasses, corresponding to the various kinds of statements. `Stmt`s are stored in `IR`, and you can obtain them via `IR.getStmts()`.
- `Exp` represents all expressions in the program. This interface has dozens of subclasses, corresponding to the various kinds of expressions. `Exp`s are associated with `Stmt`s, and you can obtain them via specific APIs of `Stmt`.
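
The container relationship described above (an IR holds statements, and statements give access to expressions) can be sketched with a toy model. Everything here is hypothetical and independent of Tai-e: `ToyIR`, `ToyAssign`, `ToyExp`, and `usedVars` are illustrative names, and the real `IR`, `Stmt`, and `Exp` interfaces are considerably richer. It assumes Java 17 or later.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical toy model of "IR contains Stmts, Stmts contain Exps"; not Tai-e's API. */
final class ToyIRWalk {

    /** Toy expression that only exposes the variables it reads. */
    interface ToyExp {
        List<String> vars();
    }

    record ToyVar(String name) implements ToyExp {
        public List<String> vars() {
            return List.of(name);
        }
    }

    record ToyBinary(ToyVar left, String op, ToyVar right) implements ToyExp {
        public List<String> vars() {
            return List.of(left.name(), right.name());
        }
    }

    /** Toy statement of the form "target = rhs;". */
    record ToyAssign(String target, ToyExp rhs) {}

    /** Toy container for one method body: its parameters and statements. */
    record ToyIR(List<String> params, List<ToyAssign> stmts) {}

    /** Walks the statements in order and collects every variable read on a right-hand side. */
    static Set<String> usedVars(ToyIR ir) {
        Set<String> used = new LinkedHashSet<>();
        for (ToyAssign s : ir.stmts()) {
            used.addAll(s.rhs().vars());
        }
        return used;
    }

    /** Body of a toy method (a, b): t = a + b; r = t; */
    static ToyIR sample() {
        return new ToyIR(
                List.of("a", "b"),
                List.of(
                        new ToyAssign("t", new ToyBinary(new ToyVar("a"), "+", new ToyVar("b"))),
                        new ToyAssign("r", new ToyVar("t"))));
    }

    public static void main(String[] args) {
        System.out.println(usedVars(sample())); // [a, b, t]
    }
}
```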
We believe that the API of the IR is self-documenting and easy to use. To make the IR more intelligible, we present a formal definition (i.e., a context-free grammar) below that lists all kinds of expressions and statements in the IR and shows how `Stmt`s are formed from `Exp`s. Most non-terminals in the grammar correspond to classes in `pascal.taie.ir`.
Exp -> Var | Literal | FieldAccess | ArrayAccess | NewExp | InvokeExp | UnaryExp | BinaryExp | InstanceOfExp | CastExp
- Var -> Identifier
- Literal -> IntLiteral | LongLiteral | FloatLiteral | DoubleLiteral | StringLiteral | ClassLiteral | NullLiteral | MethodHandle | MethodType
- FieldAccess -> InstanceFieldAccess | StaticFieldAccess
  - InstanceFieldAccess -> Var.FieldRef
  - StaticFieldAccess -> FieldRef
  - FieldRef -> <ClassType: Type FieldName>
  - FieldName -> Identifier
- ArrayAccess -> Var[Var]
- NewExp -> NewInstance | NewArray | NewMultiArray
  - NewInstance -> new ClassType
  - NewArray -> new Type[Var]
  - NewMultiArray -> new Type LengthList EmptyList
  - LengthList -> [Var] | [Var]LengthList
  - EmptyList -> ε | []EmptyList
- InvokeExp -> InvokeVirtual | InvokeInterface | InvokeSpecial | InvokeStatic | InvokeDynamic
  - InvokeVirtual -> invokevirtual Var.MethodRef(ArgList)
  - InvokeInterface -> invokeinterface Var.MethodRef(ArgList)
  - InvokeSpecial -> invokespecial Var.MethodRef(ArgList)
  - InvokeStatic -> invokestatic MethodRef(ArgList)
  - InvokeDynamic -> invokedynamic BootstrapMethodRef MethodName MethodType [BootstrapArgList] (ArgList)
  - MethodRef -> <ClassType: Type MethodName(TypeList)>
  - MethodName -> Identifier
  - TypeList -> ε | Type TypeList'
  - TypeList' -> ε | , Type TypeList'
  - ArgList -> ε | Var ArgList'
  - ArgList' -> ε | , Var ArgList'
  - BootstrapMethodRef -> MethodRef
  - BootstrapArgList -> ε | Literal BootstrapArgList'
  - BootstrapArgList' -> ε | , Literal BootstrapArgList'
- UnaryExp -> NegExp | ArrayLengthExp
  - NegExp -> !Var
  - ArrayLengthExp -> Var.length
- BinaryExp -> ArithmeticExp | BitwiseExp | ComparisonExp | ConditionExp | ShiftExp
  - ArithmeticExp -> Var ArithmeticOp Var
  - ArithmeticOp -> + | - | * | / | %
  - BitwiseExp -> Var BitwiseOp Var
  - BitwiseOp -> "|" | & | ^
  - ComparisonExp -> Var ComparisonOp Var
  - ComparisonOp -> cmp | cmpl | cmpg
  - ConditionExp -> Var ConditionOp Var
  - ConditionOp -> == | != | < | > | <= | >=
  - ShiftExp -> Var ShiftOp Var
  - ShiftOp -> << | >> | >>>
- InstanceOfExp -> Var instanceof Type
- CastExp -> (Type) Var
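
To show how the `FieldRef`, `MethodRef`, `InstanceFieldAccess`, and `InvokeVirtual` productions compose, the following sketch models them as small Java records whose `toString()` follows the grammar's concrete syntax. The record names are borrowed from the grammar for readability, but these are hypothetical stand-ins that store plain strings. They are not the Tai-e classes, and the exact spacing of the printed form is an assumption. It assumes Java 17 or later.

```java
import java.util.List;

/** Hypothetical records that print expressions in the grammar's syntax; not Tai-e's classes. */
final class ExpGrammarDemo {

    /** FieldRef -> <ClassType: Type FieldName> */
    record FieldRef(String classType, String type, String name) {
        @Override
        public String toString() {
            return "<" + classType + ": " + type + " " + name + ">";
        }
    }

    /** MethodRef -> <ClassType: Type MethodName(TypeList)> */
    record MethodRef(String classType, String returnType, String name, List<String> paramTypes) {
        @Override
        public String toString() {
            return "<" + classType + ": " + returnType + " " + name
                    + "(" + String.join(", ", paramTypes) + ")>";
        }
    }

    /** InstanceFieldAccess -> Var.FieldRef */
    record InstanceFieldAccess(String base, FieldRef ref) {
        @Override
        public String toString() {
            return base + "." + ref;
        }
    }

    /** InvokeVirtual -> invokevirtual Var.MethodRef(ArgList) */
    record InvokeVirtual(String base, MethodRef ref, List<String> args) {
        @Override
        public String toString() {
            return "invokevirtual " + base + "." + ref + "(" + String.join(", ", args) + ")";
        }
    }

    /** Sample: reading field f of variable o, where f is declared in a made-up class A. */
    static InstanceFieldAccess sampleFieldAccess() {
        return new InstanceFieldAccess("o", new FieldRef("A", "int", "f"));
    }

    /** Sample: calling StringBuilder.append(String) on variable sb with argument s. */
    static InvokeVirtual sampleInvoke() {
        MethodRef append = new MethodRef("java.lang.StringBuilder", "java.lang.StringBuilder",
                "append", List.of("java.lang.String"));
        return new InvokeVirtual("sb", append, List.of("s"));
    }

    public static void main(String[] args) {
        System.out.println(sampleFieldAccess());
        // o.<A: int f>
        System.out.println(sampleInvoke());
        // invokevirtual sb.<java.lang.StringBuilder: java.lang.StringBuilder append(java.lang.String)>(s)
    }
}
```

Note that the receiver and every argument are variables, never nested expressions, which is what the `Var` occurrences in the productions require.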

Stmt -> AssignStmt | JumpStmt | Invoke | Return | Throw | Catch | Monitor | Nop
- AssignStmt -> New | AssignLiteral | Copy | LoadArray | StoreArray | LoadField | StoreField | Unary | Binary | InstanceOf | Cast
  - New -> Var = NewExp;
  - AssignLiteral -> Var = Literal;
  - Copy -> Var = Var;
  - LoadArray -> Var = ArrayAccess;
  - StoreArray -> ArrayAccess = Var;
  - LoadField -> Var = FieldAccess;
  - StoreField -> FieldAccess = Var;
  - Unary -> Var = UnaryExp;
  - Binary -> Var = BinaryExp;
  - InstanceOf -> Var = InstanceOfExp;
  - Cast -> Var = CastExp;
- JumpStmt -> Goto | If | Switch
  - Goto -> goto Label;
  - If -> if ConditionExp goto Label;
  - Switch -> TableSwitch | LookupSwitch
  - TableSwitch -> tableswitch (Var) { CaseList default: goto Label; }
  - LookupSwitch -> lookupswitch (Var) { CaseList default: goto Label; }
  - Label -> IntLiteral
  - CaseList -> ε | case IntLiteral: goto Label; CaseList
- Invoke -> InvokeExp; | Var = InvokeExp;
- Return -> return; | return Var;
- Throw -> throw Var;
- Catch -> catch Var;
- Monitor -> monitorenter Var; | monitorexit Var;
- Nop -> nop;
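
To illustrate how statements of this shape fit together, here is a tiny interpreter for a five-statement subset (`AssignLiteral`, `Binary`, `If`, `Goto`, `Return`) restricted to `int` values. It is a self-contained sketch, not Tai-e code: the record types are hypothetical stand-ins named after the grammar, and it assumes that a `Label` is the zero-based index of the target statement in the statement list. It assumes Java 17 or later.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical interpreter for a small subset of the statement grammar; not Tai-e code. */
final class ToyStmtInterpreter {

    interface Stmt {}
    record AssignLiteral(String target, int value) implements Stmt {}                 // Var = Literal;
    record Binary(String target, String left, String op, String right) implements Stmt {} // Var = Var op Var;
    record If(String left, String op, String right, int label) implements Stmt {}     // if Var op Var goto Label;
    record Goto(int label) implements Stmt {}                                         // goto Label;
    record Return(String value) implements Stmt {}                                    // return Var;

    /** Executes stmts from index 0; labels are assumed to be statement indices. */
    static int run(List<Stmt> stmts, Map<String, Integer> args) {
        Map<String, Integer> env = new HashMap<>(args);
        int pc = 0;
        while (true) {
            Stmt s = stmts.get(pc);
            if (s instanceof AssignLiteral a) {
                env.put(a.target(), a.value());
                pc++;
            } else if (s instanceof Binary b) {
                int l = env.get(b.left());
                int r = env.get(b.right());
                int result = switch (b.op()) {
                    case "+" -> l + r;
                    case "-" -> l - r;
                    case "*" -> l * r;
                    default -> throw new IllegalArgumentException("unsupported operator: " + b.op());
                };
                env.put(b.target(), result);
                pc++;
            } else if (s instanceof If c) {
                int l = env.get(c.left());
                int r = env.get(c.right());
                boolean taken = switch (c.op()) {
                    case ">" -> l > r;
                    case "<" -> l < r;
                    case "==" -> l == r;
                    default -> throw new IllegalArgumentException("unsupported operator: " + c.op());
                };
                pc = taken ? c.label() : pc + 1;
            } else if (s instanceof Goto g) {
                pc = g.label();
            } else if (s instanceof Return ret) {
                return env.get(ret.value());
            } else {
                throw new IllegalStateException("unknown statement: " + s);
            }
        }
    }

    /** Sums 1..n, with n supplied as a parameter. */
    static List<Stmt> sumProgram() {
        return List.of(
                new AssignLiteral("s", 0),        // 0: s = 0;
                new AssignLiteral("i", 1),        // 1: i = 1;
                new AssignLiteral("one", 1),      // 2: one = 1;
                new If("i", ">", "n", 7),         // 3: if i > n goto 7;
                new Binary("s", "s", "+", "i"),   // 4: s = s + i;
                new Binary("i", "i", "+", "one"), // 5: i = i + one;
                new Goto(3),                      // 6: goto 3;
                new Return("s"));                 // 7: return s;
    }

    public static void main(String[] args) {
        System.out.println(run(sumProgram(), Map.of("n", 4))); // 10
    }
}
```

Because `Binary` and `If` only accept variables as operands, the constant `1` has to be introduced through its own `AssignLiteral` before it can be added to `i`. That is the kind of restriction the productions `ArithmeticExp -> Var ArithmeticOp Var` and `ConditionExp -> Var ConditionOp Var` express.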