InnocentZero's Treasure Chest

10 Jan 2025

ELF Structure

Structure of ELF files

ELF Header

Every ELF file starts with a header, which can be read with readelf -h executable. It gives you quite a bit of information about the binary.

ELF Header:
Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
Class:                             ELF64
Data:                              2's complement, little endian
Version:                           1 (current)
OS/ABI:                            UNIX - System V
ABI Version:                       0
Type:                              DYN (Position-Independent Executable file)
Machine:                           Advanced Micro Devices X86-64
Version:                           0x1
Entry point address:               0x1040
Start of program headers:          64 (bytes into file)
Start of section headers:          13520 (bytes into file)
Flags:                             0x0
Size of this header:               64 (bytes)
Size of program headers:           56 (bytes)
Number of program headers:         13
Size of section headers:           64 (bytes)
Number of section headers:         30
Section header string table index: 29

Needless to say, a lot of this is just metadata about the binary that is read by the OS to load the binary.
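To make the layout concrete, here is a minimal sketch that decodes the same fields by hand. It assumes a little-endian ELF64 file (like the one above) and uses only Python's struct module. The names parse_elf64_header and Ehdr are made up for this example, and the sample bytes are rebuilt from the readelf output rather than read from a real file.

```python
import struct
from collections import namedtuple

# Elf64_Ehdr: 16 identification bytes followed by fixed-width fields (64 bytes total).
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
Ehdr = namedtuple(
    "Ehdr",
    "ident type machine version entry phoff shoff flags "
    "ehsize phentsize phnum shentsize shnum shstrndx",
)


def parse_elf64_header(data: bytes) -> Ehdr:
    """Decode the ELF header from the first 64 bytes of a file (hypothetical helper)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        # ident[4] is the class (2 = ELF64), ident[5] the data encoding (1 = little endian).
        raise ValueError("this sketch only handles little-endian ELF64")
    return Ehdr._make(ELF64_EHDR.unpack_from(data, 0))


# Sample header rebuilt from the readelf -h output above (3 = DYN, 62 = x86-64).
SAMPLE = ELF64_EHDR.pack(
    bytes.fromhex("7f454c46020101000000000000000000"),
    3, 62, 1, 0x1040, 64, 13520, 0, 64, 56, 13, 64, 30, 29,
)
hdr = parse_elf64_header(SAMPLE)
print(hex(hdr.entry), hdr.shoff, hdr.phnum)  # 0x1040 13520 13
```

For a real binary, the same function can be fed the first 64 bytes of the file, for example open("executable", "rb").read(64).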

ELF Sections

ELF sections comprise all the information that is needed to build an executable from an object file. They are only needed at link time and not at runtime. However, some of these sections get mapped to segments at runtime. readelf -S executable lists the sections.

Some of the more important ones are:

  • .text: The instructions of the binary are contained here. They are executed and rip moves through this section.
  • .data/.rodata: These are the sections that contain initialized global data. ro stands for read-only.
  • .bss: This is the section for uninitialized (zero-initialized) global variables.
  • .interp: This holds the path of the runtime linker, also known as the interpreter of the program.
  • Some linker scripts may also contain preallocated space for stack and heap, although it's not really the job of ELF sections to define them.

For an example hello world binary in C, readelf -S printed the following:

  There are 30 section headers, starting at offset 0x34d0:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000000318  00000318
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.gnu.pr[...] NOTE             0000000000000338  00000338
       0000000000000040  0000000000000000   A       0     0     8
  [ 3] .note.gnu.bu[...] NOTE             0000000000000378  00000378
       0000000000000024  0000000000000000   A       0     0     4
  [ 4] .note.ABI-tag     NOTE             000000000000039c  0000039c
       0000000000000020  0000000000000000   A       0     0     4
  [ 5] .gnu.hash         GNU_HASH         00000000000003c0  000003c0
       000000000000001c  0000000000000000   A       6     0     8
  [ 6] .dynsym           DYNSYM           00000000000003e0  000003e0
       00000000000000a8  0000000000000018   A       7     1     8
  [ 7] .dynstr           STRTAB           0000000000000488  00000488
       000000000000008f  0000000000000000   A       0     0     1
  [ 8] .gnu.version      VERSYM           0000000000000518  00000518
       000000000000000e  0000000000000002   A       6     0     2
  [ 9] .gnu.version_r    VERNEED          0000000000000528  00000528
       0000000000000030  0000000000000000   A       7     1     8
  [10] .rela.dyn         RELA             0000000000000558  00000558
       00000000000000c0  0000000000000018   A       6     0     8
  [11] .rela.plt         RELA             0000000000000618  00000618
       0000000000000018  0000000000000018  AI       6    23     8
  [12] .init             PROGBITS         0000000000001000  00001000
       000000000000001b  0000000000000000  AX       0     0     4
  [13] .plt              PROGBITS         0000000000001020  00001020
       0000000000000020  0000000000000010  AX       0     0     16
  [14] .text             PROGBITS         0000000000001040  00001040
       0000000000000141  0000000000000000  AX       0     0     16
  [15] .fini             PROGBITS         0000000000001184  00001184
       000000000000000d  0000000000000000  AX       0     0     4
  [16] .rodata           PROGBITS         0000000000002000  00002000
       0000000000000015  0000000000000000   A       0     0     4
  [17] .eh_frame_hdr     PROGBITS         0000000000002018  00002018
       0000000000000024  0000000000000000   A       0     0     4
  [18] .eh_frame         PROGBITS         0000000000002040  00002040
       000000000000007c  0000000000000000   A       0     0     8
  [19] .init_array       INIT_ARRAY       0000000000003dd0  00002dd0
       0000000000000008  0000000000000008  WA       0     0     8
  [20] .fini_array       FINI_ARRAY       0000000000003dd8  00002dd8
       0000000000000008  0000000000000008  WA       0     0     8
  [21] .dynamic          DYNAMIC          0000000000003de0  00002de0
       00000000000001e0  0000000000000010  WA       7     0     8
  [22] .got              PROGBITS         0000000000003fc0  00002fc0
       0000000000000028  0000000000000008  WA       0     0     8
  [23] .got.plt          PROGBITS         0000000000003fe8  00002fe8
       0000000000000020  0000000000000008  WA       0     0     8
  [24] .data             PROGBITS         0000000000004008  00003008
       0000000000000010  0000000000000000  WA       0     0     8
  [25] .bss              NOBITS           0000000000004018  00003018
       0000000000000008  0000000000000000  WA       0     0     1
  [26] .comment          PROGBITS         0000000000000000  00003018
       0000000000000036  0000000000000001  MS       0     0     1
  [27] .symtab           SYMTAB           0000000000000000  00003050
       0000000000000240  0000000000000018          28     6     8
  [28] .strtab           STRTAB           0000000000000000  00003290
       000000000000012a  0000000000000000           0     0     1
  [29] .shstrtab         STRTAB           0000000000000000  000033ba
       0000000000000116  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), l (large), p (processor specific)
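  • Nr, Name, and Size should be obvious.
  • EntSize contains the size of each entry if the section holds fixed-size entries, as symbol tables do.
  • Type is explained in the next section. Address is the virtual address at which the section starts once the binary is loaded (0 for sections that are not loaded). It depends on the previous sections and the alignment requirements of the section.
  • Offset is the position of the section in the file, and Align is its required alignment. Flags are decoded by the Key to Flags legend above. Link is the index of a related section (for example, .dynsym has Link 7, which is .dynstr), and Info carries extra information that depends on the section type.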

Types of Sections

The types of sections:

  • NULL: This marks an empty section. It is the first section of the binary for demarcation purposes. Acts as a placeholder.
  • PROGBITS: These just have program-defined info, like the instructions (.text), the global data (.data/.rodata), the .interp section (defines the interpreter).
  • DYNAMIC: Holds dynamic linking information. It is actually a dynamic table that has tags and name/value mapping of sorts that helps the runtime linker load shared libs and stuff.
  • INIT_ARRAY: This contains an array of pointers to functions that must be executed before main. Only for .init_array.
  • FINI_ARRAY: This contains an array of pointers to functions that must be executed on exit. Only for .fini_array.
  • GNU_HASH: This is a sort of hash table for faster symbol lookup used by the dynamic linker. Used for .gnu.hash
  • NOBITS: Used for .bss, which holds uninitialized global variables. The section takes up no space in the file and is zeroed out upon loading.
  • DYNSYM: Used for .dynsym section, contains the dynamic symbol table.
  • STRTAB: As the name suggests, it contains a string table. Strings are referenced by their byte offset into the table. Used for .strtab and .dynstr, the static and dynamic string tables, as well as .shstrtab.
  • SYMTAB: Once again, symbol table for .symtab. Larger than .dynsym as it's more detailed, but not required at runtime.
  • RELA: Contains relocation tables. These specify how to modify certain addresses in the program to account for the layout of shared libraries or changes in addresses during linking.
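As an illustration of the "tags and values" idea behind the DYNAMIC type, here is a small sketch that walks the raw bytes of a .dynamic section. It assumes 64-bit little-endian entries of 16 bytes each (matching the EntSize of 0x10 in the output above) and that the caller has already sliced the section out of the file. The helper names are made up for this example. DT_NULL and DT_NEEDED are the standard tag values 0 and 1.

```python
import struct

# Elf64_Dyn: a signed 64-bit tag followed by a 64-bit value or pointer.
ELF64_DYN = struct.Struct("<qQ")
DT_NULL = 0    # marks the end of the dynamic table
DT_NEEDED = 1  # value is an offset into .dynstr naming a required shared library


def parse_dynamic(blob: bytes) -> list:
    """Return (tag, value) pairs up to, but not including, the DT_NULL terminator.

    The length of blob must be a multiple of 16, otherwise struct raises an error.
    """
    entries = []
    for tag, value in ELF64_DYN.iter_unpack(blob):
        if tag == DT_NULL:
            break
        entries.append((tag, value))
    return entries


def needed_name_offsets(entries: list) -> list:
    """Offsets into .dynstr of the libraries the runtime linker has to load."""
    return [value for tag, value in entries if tag == DT_NEEDED]
```

With a section size of 0x1e0 and 16-byte entries, the .dynamic section above has room for 30 such entries, and anything after the first DT_NULL is padding.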

I'm not covering specific sections like .got and .plt in detail as they require a separate post of their own.

The flags for each section are decoded by the Key to Flags legend at the end of the readelf output above.
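The letters in the Flags column are just a rendering of the bits in the sh_flags field. A rough sketch of that decoding, covering only the common bits from the legend (the function name flag_letters is made up here, and processor or OS specific bits are ignored):

```python
# (bit value, letter) pairs in ascending bit order, matching the order
# seen in the output above (e.g. "WA", "AX", "MS").
SHF_LETTERS = [
    (0x1, "W"),    # SHF_WRITE
    (0x2, "A"),    # SHF_ALLOC
    (0x4, "X"),    # SHF_EXECINSTR
    (0x10, "M"),   # SHF_MERGE
    (0x20, "S"),   # SHF_STRINGS
    (0x40, "I"),   # SHF_INFO_LINK
    (0x80, "L"),   # SHF_LINK_ORDER
    (0x100, "O"),  # SHF_OS_NONCONFORMING
    (0x200, "G"),  # SHF_GROUP
    (0x400, "T"),  # SHF_TLS
    (0x800, "C"),  # SHF_COMPRESSED
]


def flag_letters(sh_flags: int) -> str:
    """Turn an sh_flags bitmask into the letter string used in the Flags column."""
    return "".join(letter for bit, letter in SHF_LETTERS if sh_flags & bit)


print(flag_letters(0x6))   # AX, as shown for .text
print(flag_letters(0x3))   # WA, as shown for .data
```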

How sections are represented

Each section has an entry in the section header table. Each entry is an Elf64_Shdr structure (Elf32_Shdr for 32-bit binaries) that holds things such as the name, type, flags, address, file offset, size, and the index of a related table. The names themselves are stored in the Section Header String Table (.shstrtab); a section header only stores an offset into it.
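A minimal sketch of how the section header table could be walked, assuming a little-endian ELF64 file whose bytes are already in memory. The names read_sections, Shdr and c_string are invented for this example. The four numeric arguments come from the ELF header (Start of section headers, Size of section headers, Number of section headers, Section header string table index).

```python
import struct
from collections import namedtuple

# Elf64_Shdr is 64 bytes, matching "Size of section headers" in the ELF header.
ELF64_SHDR = struct.Struct("<IIQQQQIIQQ")
Shdr = namedtuple(
    "Shdr", "name type flags addr offset size link info addralign entsize"
)


def c_string(table: bytes, start: int) -> str:
    """Read the NUL-terminated string that begins at byte offset start."""
    end = table.index(b"\x00", start)
    return table[start:end].decode("ascii", "replace")


def read_sections(data: bytes, shoff: int, shentsize: int, shnum: int, shstrndx: int):
    """Return a list of (section name, Shdr) pairs."""
    headers = [
        Shdr._make(ELF64_SHDR.unpack_from(data, shoff + i * shentsize))
        for i in range(shnum)
    ]
    # The header at index shstrndx describes .shstrtab, which holds all the names.
    strhdr = headers[shstrndx]
    names = data[strhdr.offset : strhdr.offset + strhdr.size]
    return [(c_string(names, h.name), h) for h in headers]
```

For the binary above, the call would look like read_sections(data, 13520, 64, 30, 29), with data being the whole file read in binary mode.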

Symbol table and string tables

A single ELF can contain two symbol tables: .symtab and .dynsym. .symtab is the global symbol table, containing all symbol references. .strtab is the string table of .symtab: each symbol entry stores the offset of its name within .strtab.

We also have .dynsym, which holds the symbols needed for dynamic linking; its string table is .dynstr. The following image shows the relation between the sections and their entries.

elf_symbols.jpg
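The link between a symbol table and its string table can be sketched in a few lines. This assumes 64-bit little-endian symbol entries of 24 bytes (the EntSize of 0x18 shown for .dynsym and .symtab) and that both sections have already been sliced out of the file. read_symbols, name_at and Symbol are made-up names for this example.

```python
import struct
from collections import namedtuple

# Elf64_Sym: name offset, info byte, other byte, section index, value, size.
ELF64_SYM = struct.Struct("<IBBHQQ")
Symbol = namedtuple("Symbol", "name bind type shndx value size")


def name_at(strtab: bytes, offset: int) -> str:
    """Read the NUL-terminated name stored at the given offset of a string table."""
    end = strtab.index(b"\x00", offset)
    return strtab[offset:end].decode("ascii", "replace")


def read_symbols(symtab: bytes, strtab: bytes) -> list:
    """Decode every entry of a symbol table, resolving names through strtab."""
    symbols = []
    for st_name, st_info, _other, shndx, value, size in ELF64_SYM.iter_unpack(symtab):
        # The info byte packs the binding (high nibble) and the type (low nibble).
        symbols.append(
            Symbol(name_at(strtab, st_name), st_info >> 4, st_info & 0xF, shndx, value, size)
        )
    return symbols
```

Pass .symtab together with .strtab, or .dynsym together with .dynstr. With the sizes from the output above, .dynsym (0xa8 bytes) holds 7 entries and .symtab (0x240 bytes) holds 24.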

ELF Segments

Sections gather all the information needed to link a given executable. Segments, on the other hand, contain the information needed to load the program into memory. Segments can be imagined as a tool to make the Linux loader's life easier: they group sections with the same attributes into a single segment so that the loader can map each group in one go. Otherwise the loader would have to load each individual section into memory independently.

Similar to sections, segments are listed in a table: the Program Header Table, in which each program header describes one segment. This table is read by the loader and helps map the ELF into memory. It can be seen via readelf -l executable.


Elf file type is DYN (Position-Independent Executable file)
Entry point 0x1040
There are 13 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                 0x00000000000002d8 0x00000000000002d8  R      0x8
  INTERP         0x0000000000000318 0x0000000000000318 0x0000000000000318
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000630 0x0000000000000630  R      0x1000
  LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
                 0x0000000000000191 0x0000000000000191  R E    0x1000
  LOAD           0x0000000000002000 0x0000000000002000 0x0000000000002000
                 0x00000000000000bc 0x00000000000000bc  R      0x1000
  LOAD           0x0000000000002dd0 0x0000000000003dd0 0x0000000000003dd0
                 0x0000000000000248 0x0000000000000250  RW     0x1000
  DYNAMIC        0x0000000000002de0 0x0000000000003de0 0x0000000000003de0
                 0x00000000000001e0 0x00000000000001e0  RW     0x8
  NOTE           0x0000000000000338 0x0000000000000338 0x0000000000000338
                 0x0000000000000040 0x0000000000000040  R      0x8
  NOTE           0x0000000000000378 0x0000000000000378 0x0000000000000378
                 0x0000000000000044 0x0000000000000044  R      0x4
  GNU_PROPERTY   0x0000000000000338 0x0000000000000338 0x0000000000000338
                 0x0000000000000040 0x0000000000000040  R      0x8
  GNU_EH_FRAME   0x0000000000002018 0x0000000000002018 0x0000000000002018
                 0x0000000000000024 0x0000000000000024  R      0x4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
  GNU_RELRO      0x0000000000002dd0 0x0000000000003dd0 0x0000000000003dd0
                 0x0000000000000230 0x0000000000000230  R      0x1

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.gnu.property .note.gnu.build-id .note.ABI-tag .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt 
   03     .init .plt .text .fini 
   04     .rodata .eh_frame_hdr .eh_frame 
   05     .init_array .fini_array .dynamic .got .got.plt .data .bss 
   06     .dynamic 
   07     .note.gnu.property 
   08     .note.gnu.build-id .note.ABI-tag 
   09     .note.gnu.property 
   10     .eh_frame_hdr 
   11     
   12     .init_array .fini_array .dynamic .got 
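The Section to Segment mapping above can be approximated with a simple containment check: a section belongs to a segment if its address range lies inside the segment's range in memory. The sketch below is a simplification of what readelf really does, and it assumes that only allocated sections (flag A) are passed in, since non-loaded sections have an address of 0. The class and function names are made up, and the numbers are copied from the two readelf outputs.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    type: str
    vaddr: int
    memsz: int


class Section(NamedTuple):
    name: str
    addr: int
    size: int


def sections_in(segment: Segment, sections: list) -> list:
    """Names of the sections whose address range falls entirely inside the segment."""
    lo, hi = segment.vaddr, segment.vaddr + segment.memsz
    return [s.name for s in sections if s.size > 0 and lo <= s.addr and s.addr + s.size <= hi]


# A subset of the allocated sections from readelf -S above.
SECTIONS = [
    Section(".init", 0x1000, 0x1B), Section(".plt", 0x1020, 0x20),
    Section(".text", 0x1040, 0x141), Section(".fini", 0x1184, 0xD),
    Section(".init_array", 0x3DD0, 0x8), Section(".fini_array", 0x3DD8, 0x8),
    Section(".dynamic", 0x3DE0, 0x1E0), Section(".got", 0x3FC0, 0x28),
    Section(".got.plt", 0x3FE8, 0x20), Section(".data", 0x4008, 0x10),
    Section(".bss", 0x4018, 0x8),
]

LOAD_RX = Segment("LOAD", 0x1000, 0x191)         # segment 03
LOAD_RW = Segment("LOAD", 0x3DD0, 0x250)         # segment 05
GNU_RELRO = Segment("GNU_RELRO", 0x3DD0, 0x230)  # segment 12

print(sections_in(LOAD_RX, SECTIONS))    # ['.init', '.plt', '.text', '.fini']
print(sections_in(GNU_RELRO, SECTIONS))  # ['.init_array', '.fini_array', '.dynamic', '.got']
```

Note how .got.plt ends at 0x4008, just past the end of GNU_RELRO at 0x4000, which is why it shows up in segment 05 but not in segment 12.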

Types of Segments

There are many types of segments. The most common and important ones are:

  • PT_PHDR: Describes the location of the program header table itself.
  • PT_LOAD: Actually loaded into memory. Every other segment that needs to be present at runtime lies inside a PT_LOAD segment.
  • PT_INTERP: Holds the .interp section responsible for providing the interpreter.
  • PT_NULL: An unused entry that the loader ignores.
  • PT_DYNAMIC: Holds the .dynamic section.

The following image represents the layout in memory (as an approximation):

segment_memory.png

An interesting thing to note here is the GNU_STACK segment in the output. It has a peculiar size of 0, because the stack size is decided by the kernel. Its size is always ignored; the segment is just there for permission management. Its flags (RW here, with no E) tell the kernel that the stack should not be executable.
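As a small sketch of that permission check, the snippet below decodes program headers and looks at the flags of GNU_STACK. It assumes a little-endian ELF64 file with 56-byte program headers, as in the header output earlier. The function names are made up. PT_GNU_STACK (0x6474e551) and PF_X (0x1) are the standard constant values.

```python
import struct
from collections import namedtuple

# Elf64_Phdr is 56 bytes, matching "Size of program headers" in the ELF header.
ELF64_PHDR = struct.Struct("<IIQQQQQQ")
Phdr = namedtuple("Phdr", "type flags offset vaddr paddr filesz memsz align")

PT_GNU_STACK = 0x6474E551
PF_X = 0x1  # execute permission bit in p_flags


def read_program_headers(data: bytes, phoff: int, phentsize: int, phnum: int) -> list:
    """Decode the program header table starting at file offset phoff."""
    return [
        Phdr._make(ELF64_PHDR.unpack_from(data, phoff + i * phentsize))
        for i in range(phnum)
    ]


def stack_is_executable(phdrs: list):
    """True or False according to the GNU_STACK flags, None if the segment is absent."""
    for ph in phdrs:
        if ph.type == PT_GNU_STACK:
            return bool(ph.flags & PF_X)
    return None
```

For the binary above, read_program_headers(data, 64, 56, 13) would yield the 13 entries, and the RW flags on GNU_STACK would make stack_is_executable return False.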

Another thing worth mentioning is GNU_EH_FRAME, which points to the frame unwinding information. It usually covers the same bytes as the .eh_frame_hdr section, as it does in the output above.

Tags: programming

This website by Md Isfarul Haque is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.