OpenArkCompiler generates a virtual table for each class. A virtual table of one class consists of virtual functions of its super classs, virtual functions of the current class, and default functions of the interfaces which the current class implements. If function of the current class override function of its superclass, the function of its superclass will be replaced by the function of the current class in the virtual table of the current class.
The following is a specific example:
class A {
public int first() {
return 0;
}
}
class B extends A {
public void foo() {
}
public int first() {
return 1;
}
}
class C extends A {
public void bar() {
}
public int first() {
return 2;
}
}
public class IsEmpty {
public static void main(String [] args) {
A x = new B();
x.first();
A y = new C()
y.first();
}
public void add(A x) {
x.first();
}
}
The structure of the virtual table generated by OpenArkCompiler is as follows:
A:
_vtb_LA_3B:
.quad Ljava_2Flang_2FObject_3B_7Cclone_7C_28_29Ljava_2Flang_2FObject_3B - .
.quad Ljava_2Flang_2FObject_3B_7Cequals_7C_28Ljava_2Flang_2FObject_3B_29Z - .
.quad Ljava_2Flang_2FObject_3B_7Cfinalize_7C_28_29V - .
.quad Ljava_2Flang_2FObject_3B_7CgetClass_7C_28_29Ljava_2Flang_2FClass_3B - .
.quad Ljava_2Flang_2FObject_3B_7ChashCode_7C_28_29I - .
.quad Ljava_2Flang_2FObject_3B_7Cnotify_7C_28_29V - .
.quad Ljava_2Flang_2FObject_3B_7CnotifyAll_7C_28_29V - .
.quad Ljava_2Flang_2FObject_3B_7CtoString_7C_28_29Ljava_2Flang_2FString_3B - .
.quad Ljava_2Flang_2FObject_3B_7Cwait_7C_28_29V - .
.quad Ljava_2Flang_2FObject_3B_7Cwait_7C_28J_29V - .
.quad Ljava_2Flang_2FObject_3B_7Cwait_7C_28JI_29V - .
.quad LA_3B_7Cfirst_7C_28_29I - .
B:
__vtb_LB_3B:
.quad Ljava_2Flang_2FObject_3B_7Cclone_7C_28_29Ljava_2Flang_2FObject_3B - .
.quad Ljava_2Flang_2FObject_3B_7Cequals_7C_28Ljava_2Flang_2FObject_3B_29Z - .
.quad Ljava_2Flang_2FObject_3B_7Cfinalize_7C_28_29V - .
.quad Ljava_2Flang_2FObject_3B_7CgetClass_7C_28_29Ljava_2Flang_2FClass_3B - .
.quad Ljava_2Flang_2FObject_3B_7ChashCode_7C_28_29I - .
.quad Ljava_2Flang_2FObject_3B_7Cnotify_7C_28_29V - .
.quad Ljava_2Flang_2FObject_3B_7CnotifyAll_7C_28_29V - .
.quad Ljava_2Flang_2FObject_3B_7CtoString_7C_28_29Ljava_2Flang_2FString_3B - .
.quad Ljava_2Flang_2FObject_3B_7Cwait_7C_28_29V - .
.quad Ljava_2Flang_2FObject_3B_7Cwait_7C_28J_29V - .
.quad Ljava_2Flang_2FObject_3B_7Cwait_7C_28JI_29V - .
.quad LB_3B_7Cfirst_7C_28_29I - .
.quad LB_3B_7Cfoo_7C_28_29V - .
C:
__vtb_LC_3B:
The first 11 functions are the same as those of A and B.
… …
.quad LC_3B_7Cfirst_7C_28_29I - .
.quad LC_3B_7Cbar_7C_28_29V - .
Comparison shows that:
- All classes inherit from the object class. Therefore, the first 11 functions in the virtual table are inherited from the object class, and the layout is the same as that of the object class.
- For function 12, function 12 of the subclass B overrides function 12 of the superclass A. Therefore, in the same position, the virtual table of class A is LA_3B_7Cfirst_7C_28_29I and the virtual table of class B is LB_3B_7Cfirst_7C_28_29I. Class C inherits from Class A and overrides the first function. In addition, the iD interface is implemented. Therefore, LC_3B_7Cfirst_7C_28_29I lies on position 12, and LiD_3B_7Csecond_7C_28_29I lies on position 13.
For virtual call, we cannot determine which function is called during compilation. The following is an example:
public class IsEmpty {
public static void main(String [] args) {
A x = new B();
x.first();
A y = new C();
y.first();
}
public void add(A x) {
x.first();
}
}
In this case, we are not sure whether the first function in B or C is called during compilation.
However, the layout of the first function in A, B, and C is the same. In this example, the offset of the first function in the virtual table is 12. Therefore, we can access the function by obtaining the virtual table pointer from the corresponding object and add offset 12 to the pointer.
In a program execution process, as shown in Figure 1, if a virtual function is called, we perform the following steps:
- Determine the instance class of the object (this_ pointer). In Figure 1, this is the instance of class C.
- Search the virtual table of the corresponding class using the function index.
- Return the function pointer that is actually called.
Interface call is similar as multiple inheritance and is more complex than Java single inheritance. For multiple inheritance, a unique inheritance order cannot be determined naturally.
In a closed environment, a sequence (iA, iB, iC) can be determined through topology sorting. The advantage is that interface call can be processed in a way similar as that of a virtual table. However, the interface table of a class obtained in this way will be very large with few data. This approach is not available considering performance and code size.
In an open environment, a sequence cannot be determined through topology sorting during compilation. As a result, the sequence of functions in the interface table is not fixed. Therefore, in actual implementation, it is unlikely to achieve a function table with a consistent sequence and an access mechanism by offset. During compilation, we can determine all interfaces implemented by a class and the inheritance relationship of the interfaces. During running, the function signatures can be compared to determine which function needs to be called. As string comparison causes overhead, OpenArkCompiler calculates the hash value of the function name and signature during compilation. First, compare the hash values. If they are the same, and no hash conflict exists, the function is called. If a hash conflict occurs, OpenArkCompiler compares the function name with signature and obtain the function pointer.
In addition, considering running efficiency and ROM size, we divide the interface table into two levels of tables. Level 1 is the real hash table. We set the hash value to 23, and enter the function address in the position corresponding to the hash value (0-C22). If no conflict occurs in the level 1 hash table, the entries following the last position that contains the function address are blank. In this way, the subsequent entries can be deleted without occupying space. If a conflict occurs in the level-2 hash table, enter 0 in the position where the conflict occurs, and then enter the address of the level-2 hash table in entry 23.
The structure of the level-1 table is as follows:
Sequence | Interface Table Entry | Description |
---|---|---|
0 | &Func or 0 | Function address corresponding to the hash value. If no function address exists, the value is 0. If a conflict occurs, the conflict position is also 0, and the level-2 table address is entered in entry 23. If no conflict occurs, the entries after the last value (n) with a corresponding function are deleted. |
1 | &Func or 0 | |
2 | &Func or 0 | |
... | &Func or 0 | |
n | &Func | |
23 | &itbC | Address of the level-2 table If no conflict occurs in the level-1 table, this entry does not exist. |
The structure of the level-2 function table is as follows:
Interface Table Entry | Description |
---|---|
Size | Size of the interface table that does not conflict |
1 | Align the site position. Meaningless |
Func1_sig | Func1 signature hash value |
&Func1 | Fun1 address |
Func2_sig | Func2 signature hash value |
&Func2 | Func2 address |
...... | |
Func3_sig and Func4_sig | Func3 signature and Func4 signature hash values. The two values are the same. |
1 | Indicates a conflict as Func3 and Func4 signature harsh values are the same. |
...... | |
Func3_sig | Func3 signature |
&Func3 | Func3 address |
Func4_sig | Func4 signature |
&Func4 | Func4 address |
...... |
Figure 2 shows the process of calling an interface function that is declared as follows:
interface B { funcB(); }
interface C { funcC(); }
interface D { funcD(); }
Class A implements B, C, D {}
As shown in Figure 2, in a program execution process, we perform the following steps:
- Determine the class instance of the object. Here, the object is the instance of class A.
- In the level-1 table, search for the address based on the harsh value. If the address exists, the function pointer is returned. If the value is 0, search the level-2 table. In the level-2 table, search for the address using the signature hash value. If the address is found, the function pointer is returned. Otherwise, use the function name to search the table.
- Indirectly invoke the function pointer and transfer related parameter (args) to the indirect invocation.
The following is a specific example:
The IsEmpty class implements interfaces A and B. Each interface declares two functions.
interface A{
public int add();
public int minus();
}
interface B{
public int mult();
public int div();
}
public class IsEmpty implements A, B {
public static void main(String[]args) {
}
public void test(B x) {
x.mult();
}
public int add() {
return 6 + 3;
}
public int minus() {
return 6 - 3;
}
public int mult() {
return 6 * 3;
}
public int div() {
return 6 / 3;
}
}
The interface table of IsEmpty is in the maple code is as follows:
var $__itb_LIsEmpty_3B fstatic <[24] <* void>> = [0, 0, 0, 0, 0, 0, 0, 0, addroffunc ptr &LIsEmpty_3B_7Cdiv_7C_28_29I, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, addroffunc ptr &LIsEmpty_3B_7Cadd_7C_28_29I, 0, 0, addrof ptr $__itabC_LIsEmpty_3B]
var $__itbC_LIsEmpty_3B fstatic <[6] <* void>> = [2, 1, 0xb97, addroffunc ptr &LIsEmpty_3B_7Cmult_7C_28_29I, 0x1f7f, addroffunc ptr &LIsEmpty_3B_7Cminus_7C_28_29I]
Corresponding assembly structure:
__itb_LIsEmpty_3B:
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad LIsEmpty_3B_7Cdiv_7C_28_29I - .
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad 0
.quad LIsEmpty_3B_7Cadd_7C_28_29I - .
.quad 0
.quad 0
.quad __itabC_LIsEmpty_3B - .
__itbC_LIsEmpty_3B:
.quad 2
.quad 1
.quad 2967
.quad LIsEmpty_3B_7Cmult_7C_28_29I - .
.quad 8063
.quad LIsEmpty_3B_7Cminus_7C_28_29I - .
The entry content is as follows:
-
In the level-1 table (__itb_LIsEmpty_3B), there are 23 entries in total, where entry 9 and entry 12 are function addresses, and entry 23 is the level-2 table address. Therefore, a conflict occurs in the level-1 table, and a specific function address needs to be confirmed in the level-2 table.
-
In the level-2 table, the first entry is 2, indicating that there are two functions that do not conflict with each other. The second entry is 1, which is used for alignment and placeholder. The last four entries are the hash values generated by the function signature and the corresponding function addresses.
In the following example, the test function in the source code generates interface-call. The corresponding maple code is as follows:
if (eq u1 u64 (regread u64 %4, constval u64 0)) {
callassigned &MCC_getFuncPtrFromItabSecondHash64 (regread ptr %3, constval u64 0xb97, conststr ptr "mult|()I") { regassign u64 %4}
}
icallassigned (regread u64 %4, regread ref %2) {}
The calling logic is as follows:
Check whether the entry corresponding to the hash value in the level-1 interface table is null. If not, use the address. If yes, call the getFuncPtrFromItabSecondHash64 function.
The getFuncPtrFromItabSecondHash64 function has three parameters: interface table address, hash value of the basename function, and the function signature. The complete calling process is as follows: Find the corresponding interface table address based on classinfo and compare the hash values. If the comparison is successful, and no conflict occurs, the correct address can be obtained. If a conflict occurs, compare the signature names (character string comparison).
The format of the accessed interface table is the same as that of IsEmpty.
The Default
function is introduced in Java 8. The implementation in the superclass overrides the default function of the interface. Override depends on the inheritance relationship between interfaces. As shown in the following figure, the cA class inherits the cB class to implement the iD interface. The foo function is implemented in both the cB class and iD interface. For the cA class, the implementation of the foo function depends on the superclass cB rather than the iD interface.
interface iD {
public default void foo(){System.out.println("iD foo");}
}
class cB {
public void foo(){System.out.println("cB foo");}
}
class cA extends cB implements iD {
}
public class IsEmpty {
public static void main(String [] args) {
iD obj = new cA();
obj.foo();
}
}
As shown in the following figure, the getValue function is defined in both parent and son interfaces. For the Sson calss, the implementation of the getValue function depends on the son interface rather than the parent interface.
interface Parent{
default void getValue(){
System.out.println("Parent getVatue......");
}
}
interface Son extends Parent{
default void getValue(){
System.out.println("OfInt getValue......")
}
}
abstract class OfPrimitive implements Parent{
}
class SSon extends OfPrimitive implements Son{
}
public class Main {
static int get() {
return 1;
}
public static void main(String[] args) {
Parent son = (Parent)new SSon();
son.getValue();
SSon son2;
if(get()==1) {
son2 = new SSon();
}
else son2 = new SSon();
son2.getValue();
}
}