Layout of an ELF executable in the virtual address space #14

chenpengcong opened this issue May 27, 2018 · 0 comments
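This post takes a real executable, draws the layout of its ELF file, and shows how that file maps onto the process's virtual address space and onto physical memory.

Environment: deepin 15.5, 64-bit

The source code of the executable is: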

#include <unistd.h>
int main()
{
    while (1) {
        sleep(1000);
    }
    return 0;
}

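Compile it into a statically linked executable: $ gcc -static -o SectionMapping SectionMapping.c

To draw the ELF file layout we need the size and file offset of every section in the executable. Both can be read from the ELF header and the section header table.

Inspect the ELF header: $ readelf -h SectionMapping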

$ readelf -h SectionMapping
ELF Header:
  Magic: 7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00 
  Class: ELF64
  Data: 2's complement, little endian
  Version: 1 (current)
  OS/ABI: UNIX - GNU
  ABI Version: 0
  Type: EXEC (Executable file)
  Machine: Advanced Micro Devices X86-64
  Version: 0x1
  Entry point address: 0x400930
  Start of program headers: 64 (bytes into file)
  Start of section headers: 804624 (bytes into file)
  Flags: 0x0
  Size of this header: 64 (bytes)
  Size of program headers: 56 (bytes)
  Number of program headers: 6
  Size of section headers: 64 (bytes)
  Number of section headers: 32
  Section header string table index: 31

The output tells us:

  • The ELF header is 64 bytes long
  • The section header table starts at offset 0xc4710 (804624) and is 0x800 bytes long (64 * 32)
  • The program header table starts at offset 0x40 (64) and is 0x150 bytes long (56 * 6)
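The three figures above come from fixed fields in the 64-byte ELF64 header. The following is a minimal sketch of that lookup, not how readelf itself is implemented. The names ELF64_EHDR and table_extents are made up for illustration, and the sample header is rebuilt from the values printed above rather than read from the real file.

```python
import struct

# ELF64 header, little endian: e_ident, e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")


def table_extents(header_bytes):
    """Return (offset, size) of the program and section header tables."""
    fields = ELF64_EHDR.unpack(header_bytes[:ELF64_EHDR.size])
    e_phoff, e_shoff = fields[5], fields[6]
    e_phentsize, e_phnum = fields[9], fields[10]
    e_shentsize, e_shnum = fields[11], fields[12]
    return {
        "program_headers": (e_phoff, e_phentsize * e_phnum),
        "section_headers": (e_shoff, e_shentsize * e_shnum),
    }


# Stand-in header rebuilt from the readelf -h output (an assumption: the real
# file is not read here). e_type 2 is EXEC, e_machine 62 is x86-64.
ident = bytes.fromhex("7f454c46020101030000000000000000")
sample = ELF64_EHDR.pack(ident, 2, 62, 1, 0x400930, 64, 804624, 0,
                         64, 56, 6, 64, 32, 31)

extents = table_extents(sample)
for name, (offset, size) in extents.items():
    print(f"{name}: offset {offset:#x}, size {size:#x}, end {offset + size:#x}")
```

To try it on the real binary, pass the first 64 bytes of the file instead of the stand-in, for example table_extents(open("SectionMapping", "rb").read(64)).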

Inspect the section header table: $ readelf -S SectionMapping

$ readelf -S SectionMapping
There are 32 section headers, starting at offset 0xc4710:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .note.ABI-tag     NOTE             0000000000400190  00000190
       0000000000000020  0000000000000000   A       0     0     4
  [ 2] .note.gnu.build-i NOTE             00000000004001b0  000001b0
       0000000000000024  0000000000000000   A       0     0     4
readelf: Warning: [ 3]: Link field (0) should index a symtab section.
  [ 3] .rela.plt         RELA             00000000004001d8  000001d8
       0000000000000108  0000000000000018  AI       0    24     8
  [ 4] .init             PROGBITS         00000000004002e0  000002e0
       0000000000000017  0000000000000000  AX       0     0     4
  [ 5] .plt              PROGBITS         00000000004002f8  000002f8
       0000000000000058  0000000000000000  AX       0     0     8
  [ 6] .text             PROGBITS         0000000000400350  00000350
       00000000000888b7  0000000000000000  AX       0     0     16
  [ 7] __libc_freeres_fn PROGBITS         0000000000488c10  00088c10
       0000000000000ab7  0000000000000000  AX       0     0     16
  [ 8] __libc_thread_fre PROGBITS         00000000004896d0  000896d0
       00000000000000e1  0000000000000000  AX       0     0     16
  [ 9] .fini             PROGBITS         00000000004897b4  000897b4
       0000000000000009  0000000000000000  AX       0     0     4
  [10] .rodata           PROGBITS         00000000004897c0  000897c0
       000000000001c724  0000000000000000   A       0     0     32
  [11] __libc_subfreeres PROGBITS         00000000004a5ee8  000a5ee8
       0000000000000050  0000000000000000   A       0     0     8
  [12] __libc_IO_vtables PROGBITS         00000000004a5f40  000a5f40
       00000000000006a8  0000000000000000   A       0     0     32
  [13] __libc_atexit     PROGBITS         00000000004a65e8  000a65e8
       0000000000000008  0000000000000000   A       0     0     8
  [14] __libc_thread_sub PROGBITS         00000000004a65f0  000a65f0
       0000000000000008  0000000000000000   A       0     0     8
  [15] .eh_frame         PROGBITS         00000000004a65f8  000a65f8
       000000000000a64c  0000000000000000   A       0     0     8
  [16] .gcc_except_table PROGBITS         00000000004b0c44  000b0c44
       00000000000000af  0000000000000000   A       0     0     1
  [17] .tdata            PROGBITS         00000000006b0eb8  000b0eb8
       0000000000000020  0000000000000000 WAT       0     0     8
  [18] .tbss             NOBITS           00000000006b0ed8  000b0ed8
       0000000000000030  0000000000000000 WAT       0     0     8
  [19] .init_array       INIT_ARRAY       00000000006b0ed8  000b0ed8
       0000000000000010  0000000000000008  WA       0     0     8
  [20] .fini_array       FINI_ARRAY       00000000006b0ee8  000b0ee8
       0000000000000010  0000000000000008  WA       0     0     8
  [21] .jcr              PROGBITS         00000000006b0ef8  000b0ef8
       0000000000000008  0000000000000000  WA       0     0     8
  [22] .data.rel.ro      PROGBITS         00000000006b0f00  000b0f00
       00000000000000e4  0000000000000000  WA       0     0     32
  [23] .got              PROGBITS         00000000006b0fe8  000b0fe8
       0000000000000008  0000000000000008  WA       0     0     8
  [24] .got.plt          PROGBITS         00000000006b1000  000b1000
       0000000000000070  0000000000000008  WA       0     0     8
  [25] .data             PROGBITS         00000000006b1080  000b1080
       0000000000001ad0  0000000000000000  WA       0     0     32
  [26] .bss              NOBITS           00000000006b2b60  000b2b50
       0000000000001898  0000000000000000  WA       0     0     32
  [27] __libc_freeres_pt NOBITS           00000000006b43f8  000b2b50
       0000000000000030  0000000000000000  WA       0     0     8
  [28] .comment          PROGBITS         0000000000000000  000b2b50
       0000000000000025  0000000000000001  MS       0     0     1
  [29] .symtab           SYMTAB           0000000000000000  000b2b78
       000000000000b1d8  0000000000000018          30   753     8
  [30] .strtab           STRTAB           0000000000000000  000bdd50
       000000000000685b  0000000000000000           0     0     1
  [31] .shstrtab         STRTAB           0000000000000000  000c45ab
       000000000000015f  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)

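The output lists the size and file offset of every section.

With the section header table and the ELF header in hand, we can already draw the complete layout of the ELF file.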
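One way to sanity-check such a drawing is to confirm that consecutive sections really sit back to back once alignment is taken into account. The sketch below does this for the tail of the file, from .comment up to the section header table. The helper names are hypothetical, and the 8-byte alignment of the section header table is an assumption, not something readelf reports.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


# (name, file offset, size, alignment), copied from the readelf -S output.
tail_sections = [
    (".comment", 0xb2b50, 0x25, 1),
    (".symtab", 0xb2b78, 0xb1d8, 8),
    (".strtab", 0xbdd50, 0x685b, 1),
    (".shstrtab", 0xc45ab, 0x15f, 1),
]
SECTION_HEADERS_OFFSET = 0xc4710  # "Start of section headers" from readelf -h
SECTION_HEADERS_ALIGN = 8         # assumption: 8-byte aligned table


def check_contiguous(sections, final_offset, final_align):
    """True if every item starts at the aligned end of the previous one."""
    cursor = sections[0][1]
    for _name, offset, size, alignment in sections:
        if align_up(cursor, alignment) != offset:
            return False
        cursor = offset + size
    return align_up(cursor, final_align) == final_offset


for name, offset, size, _alignment in tail_sections:
    print(f"{name:<10} {offset:#x} .. {offset + size:#x}")
print("contiguous:", check_contiguous(tail_sections, SECTION_HEADERS_OFFSET,
                                      SECTION_HEADERS_ALIGN))
```

With the values above, the only gaps are alignment padding: 3 bytes before .symtab and 6 bytes before the section header table.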

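Next we look at the executable's program headers, which describe how the operating system maps the ELF file into the process's virtual address space when loading it.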

$ readelf -l SectionMapping
Elf file type is EXEC (Executable file)
Entry point 0x400930
There are 6 program headers, starting at offset 64
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000b0cf3 0x00000000000b0cf3  R E    0x200000
  LOAD           0x00000000000b0eb8 0x00000000006b0eb8 0x00000000006b0eb8
                 0x0000000000001c98 0x0000000000003570  RW     0x200000
  NOTE           0x0000000000000190 0x0000000000400190 0x0000000000400190
                 0x0000000000000044 0x0000000000000044  R      0x4
  TLS            0x00000000000b0eb8 0x00000000006b0eb8 0x00000000006b0eb8
                 0x0000000000000020 0x0000000000000050  R      0x8
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
  GNU_RELRO      0x00000000000b0eb8 0x00000000006b0eb8 0x00000000006b0eb8
                 0x0000000000000148 0x0000000000000148  R      0x1
 Section to Segment mapping:
  Segment Sections...
   00 .note.ABI-tag .note.gnu.build-id .rela.plt .init .plt .text __libc_freeres_fn __libc_thread_freeres_fn .fini .rodata __libc_subfreeres __libc_IO_vtables __libc_atexit __libc_thread_subfreeres .eh_frame .gcc_except_table 
   01 .tdata .init_array .fini_array .jcr .data.rel.ro .got .got.plt .data .bss __libc_freeres_ptrs 
   02 .note.ABI-tag .note.gnu.build-id 
   03 .tdata .tbss 
   04     
   05 .tdata .init_array .fini_array .jcr .data.rel.ro .got

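The executable has two segments that must be mapped into the virtual address space (the two LOAD entries), Segment0 and Segment1, and each is made up of several sections with similar attributes.

Segment: sections with similar attributes (the same permissions) are merged and mapped as a single segment. This noticeably reduces internal fragmentation within pages and therefore saves memory.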
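To get a feel for the saving, here is a rough back-of-the-envelope sketch that compares two hypothetical strategies for Segment0: giving every section its own page-aligned mapping versus mapping the merged segment in one piece. It assumes 4096-byte pages, and the names are illustrative only. A real loader never maps sections one by one, so the first number is purely a thought experiment.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def pages_needed(size, page_size=PAGE_SIZE):
    """Number of whole pages required to hold size bytes."""
    return (size + page_size - 1) // page_size


# Sizes of the 16 sections that make up Segment0, from readelf -S.
segment0_section_sizes = [
    0x20, 0x24, 0x108, 0x17, 0x58, 0x888b7, 0xab7, 0xe1,
    0x9, 0x1c724, 0x50, 0x6a8, 0x8, 0x8, 0xa64c, 0xaf,
]
SEGMENT0_FILESZ = 0xb0cf3  # FileSiz of the first LOAD entry

# Hypothetical: every section starts on its own page.
separate = sum(pages_needed(size) for size in segment0_section_sizes)
# Actual approach: the whole segment is mapped as one contiguous range.
merged = pages_needed(SEGMENT0_FILESZ)

print(f"one mapping per section: {separate} pages")
print(f"one merged segment:      {merged} pages")
```

For this binary the merged mapping needs 177 pages against 190 for the per-section variant, even though the merged segment also carries the ELF header and the program header table.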

Segment    Start vaddr  Size in file  Size in memory  File offset  Permissions         Alignment
Segment0   0x400000     0xb0cf3       0xb0cf3         0x0          read, execute (RE)  0x200000
Segment1   0x6b0eb8     0x1c98        0x3570          0xb0eb8      read, write (RW)    0x200000

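Note: why does Segment1 occupy 0x1c98 bytes in the ELF file but 0x3570 bytes (MemSiz) in the process's virtual address space?

The reason is that Segment1 contains the .bss and __libc_freeres_ptrs sections. The section header table output shows that both have Type NOBITS, which means they have no content in the ELF file and take up no file space.

Adding the memory that .bss and __libc_freeres_ptrs actually occupy to 0x1c98 gives 0x3570. The calculation, in virtual addresses, is:
.bss follows .data, and .data ends at 0x6b1080 + 0x1ad0 = 0x6b2b50. Because .bss is 32-byte aligned, it starts at 0x6b2b60 and ends at 0x6b2b60 + 0x1898 = 0x6b43f8. __libc_freeres_ptrs is 8-byte aligned, so it starts at 0x6b43f8 and ends at 0x6b43f8 + 0x30 = 0x6b4428. Subtracting the end of .data from the end of __libc_freeres_ptrs gives the memory these two sections occupy, including the 0x10 bytes of alignment padding before .bss: 0x6b4428 - 0x6b2b50 = 0x18d8. Adding 0x1c98 gives exactly 0x3570.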
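The same arithmetic can be replayed as a short sketch. The constants are copied from the readelf output above; the variable names and the align_up helper are illustrative and not part of any tool.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


SEG1_VADDR = 0x6b0eb8    # VirtAddr of the second LOAD entry
SEG1_FILESZ = 0x1c98     # FileSiz of the second LOAD entry
DATA_ADDR, DATA_SIZE = 0x6b1080, 0x1ad0   # .data
BSS_SIZE, BSS_ALIGN = 0x1898, 32          # .bss (NOBITS)
FREERES_SIZE, FREERES_ALIGN = 0x30, 8     # __libc_freeres_ptrs (NOBITS)

data_end = DATA_ADDR + DATA_SIZE                    # last byte backed by the file
bss_start = align_up(data_end, BSS_ALIGN)           # padding up to 32-byte boundary
bss_end = bss_start + BSS_SIZE
freeres_start = align_up(bss_end, FREERES_ALIGN)
freeres_end = freeres_start + FREERES_SIZE

nobits_span = freeres_end - data_end                # memory with no file content
memsz = SEG1_FILESZ + nobits_span

print(f".data ends at          {data_end:#x}")
print(f".bss                   {bss_start:#x} .. {bss_end:#x}")
print(f"__libc_freeres_ptrs    {freeres_start:#x} .. {freeres_end:#x}")
print(f"MemSiz = {SEG1_FILESZ:#x} + {nobits_span:#x} = {memsz:#x}")
```

The end of .data coincides with SEG1_VADDR + SEG1_FILESZ, which is why the file-backed part of the segment stops exactly where the NOBITS sections begin.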

Query the system page size: $ getconf PAGE_SIZE

$ getconf PAGE_SIZE
4096

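With the information above we can draw how the executable file, the virtual address space, and physical memory map onto one another:

[Figure: sectionmapping (mapping between the executable file, the virtual address space, and physical memory)]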

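Note that the region 0xb0000 ~ 0xb1000 of the executable file is mapped twice in the virtual address space, at 0x4b0000 ~ 0x4b1000 and at 0x6b0000 ~ 0x6b1000, and both virtual ranges end up backed by the same physical page (at least until the writable mapping is modified and copy-on-write gives it a private copy).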
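The doubly mapped page can be derived from the two LOAD entries alone. The sketch below rounds each segment's file range out to page boundaries and intersects the results. It assumes 4096-byte pages, and the function names are hypothetical. It only models the address arithmetic, not what the kernel does with physical pages.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def page_floor(value):
    return value & ~(PAGE_SIZE - 1)


def page_ceil(value):
    return (value + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


# (file offset, virtual address, file size) of the two LOAD entries.
segments = {
    "Segment0": (0x0, 0x400000, 0xb0cf3),
    "Segment1": (0xb0eb8, 0x6b0eb8, 0x1c98),
}


def file_pages(offset, filesz):
    """Start offsets of every file page the segment touches."""
    return range(page_floor(offset), page_ceil(offset + filesz), PAGE_SIZE)


def shared_file_pages(seg_a, seg_b):
    """File pages that both segments need mapped."""
    pages_a = set(file_pages(seg_a[0], seg_a[2]))
    pages_b = set(file_pages(seg_b[0], seg_b[2]))
    return sorted(pages_a & pages_b)


def vaddr_of(segment, file_page):
    """Virtual address at which the segment maps the given file page."""
    offset, vaddr, _filesz = segment
    return vaddr - offset + file_page


shared = shared_file_pages(segments["Segment0"], segments["Segment1"])
for page in shared:
    for name, segment in segments.items():
        print(f"file page {page:#x} -> {name} at {vaddr_of(segment, page):#x}")
```

The arithmetic works because each segment's file offset and virtual address are congruent modulo the page size, so a file page always lands on a page boundary in memory.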

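The benefit is better space utilization. The file page 0xb0000 ~ 0xb1000 holds sections from both Segment0 and Segment1. If that page were split into two parts, each containing data from only one segment, the parts would end up in two separate physical pages, each holding less than a full 4096-byte page of useful data, which wastes memory.

Reference:
《程序员的自我修养》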
