Sunday, May 20, 2012

Reverse engineering C binaries (Linux ELF)

Most people know how to write a C program or library, but few think about how to find out what a compiled one actually does. Suppose you are handed a binary and know nothing about it. You have to get it running, but you have no clue what is going on inside.

Consider such a circumstance

Let the program be one that asks for a password at startup and proceeds only if the password is right. Consider this code

#include <stdio.h>

int main()
{
    int password;    // An integer password
    printf("Enter Password :-");
    scanf("%d", &password);
    if (password == 1024)    // Check whether the password is correct
        printf("correct!\n");
    else
        printf("wrong\n");
    return 0;
}


This reads the password, compares it with the expected value, and prints correct! or wrong accordingly. Compile and run it, and you are prompted with

Enter Password:-

Now suppose we have only the executable. How do we find out the password? Our first instinct is to try all possibilities.




$./passwrd
Enter Password:-1032
wrong


$./passwrd
Enter Password:-1045
wrong
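The guessing above can be sketched as a loop. This is only an illustration: check_fn and sample_check are hypothetical stand-ins for "run the binary with this guess and look at what it prints", which in reality would mean starting the program once per guess. A 32-bit int has 4,294,967,296 possible values, which is why this approach quickly becomes impractical.

```c
#include <stdbool.h>

/* Hypothetical oracle: returns true if the guess is accepted.
 * With a real binary this would mean running it and reading its output. */
typedef bool (*check_fn)(int guess);

/* Try every value in [lo, hi]. On success, store the match in *found
 * and return true; otherwise return false. */
bool brute_force(check_fn check, int lo, int hi, int *found)
{
    /* long long avoids overflow of the counter when hi is INT_MAX */
    for (long long g = lo; g <= hi; g++) {
        if (check((int)g)) {
            *found = (int)g;
            return true;
        }
    }
    return false;
}

/* Example oracle that mirrors the sample program above. */
bool sample_check(int guess)
{
    return guess == 1024;
}
```

Calling brute_force(sample_check, 0, 5000, &found) finds the password after 1025 guesses, but only because we happened to pick a range that contains it.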




It's tiresome and sometimes impossible. Let's take another approach. The first step is to find out what sort of executable it is, so type in

$file passwrd


It gives output about the type of file, which in my case turns out to be

passwrd: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, not stripped



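It says that the file is a 64-bit ELF executable for the x86-64 architecture; LSB means the data is stored least significant byte first (little-endian). It also reveals that the file is dynamically linked, meaning that it loads shared libraries such as the C library at run time rather than containing them. "for GNU/Linux 2.6.15" is the oldest kernel version the binary is built to run on, and "not stripped" means the symbol table is still present, which is why function names such as main will be visible later. Not hugely helpful, but at least we now know what type of file it is.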
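Most of what file reports here comes from the first few bytes of the ELF header. The sketch below decodes just three fields (class, byte order and machine type) from a buffer holding the start of the file. The function name describe_elf is made up for this example, and the real file utility checks far more than this.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Describe the start of an ELF header, roughly the way file does.
 * hdr must hold at least the first 20 bytes of the file.
 * Returns 0 on success, -1 if the buffer is too short or not ELF. */
int describe_elf(const unsigned char *hdr, size_t len, char *out, size_t outlen)
{
    /* Magic number: 0x7f followed by the letters E, L, F */
    if (len < 20 || memcmp(hdr, "\x7f" "ELF", 4) != 0)
        return -1;

    /* Byte 4 is the class, byte 5 the data encoding (byte order) */
    const char *cls  = hdr[4] == 1 ? "32-bit" : hdr[4] == 2 ? "64-bit" : "unknown-class";
    const char *data = hdr[5] == 1 ? "LSB"    : hdr[5] == 2 ? "MSB"    : "unknown-endian";

    /* e_machine is a 16-bit field at offset 18, in the file's own byte order */
    unsigned machine = hdr[5] == 2 ? (unsigned)(hdr[18] << 8 | hdr[19])
                                   : (unsigned)(hdr[19] << 8 | hdr[18]);
    const char *arch = machine == 62 ? "x86-64"
                     : machine == 3  ? "Intel 80386"
                     : "other";

    snprintf(out, outlen, "ELF %s %s, %s", cls, data, arch);
    return 0;
}
```

To try it on passwrd, read the first 20 bytes of the file with fread and pass them in; for the binary above the result should begin with "ELF 64-bit LSB".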

The file is an architecture-specific binary, and such binaries are produced by translating assembly instructions into machine code, so we should be able to reverse it at least as far as the assembly. So we try objdump


$objdump -d passwrd

It gives an output that starts like this



passwrd:     file format elf64-x86-64




Disassembly of section .init:


0000000000400460 <_init>:
  400460: 48 83 ec 08           sub    $0x8,%rsp
  400464: e8 93 00 00 00       callq  4004fc <call_gmon_start>
  400469: e8 22 01 00 00       callq  400590 <frame_dummy>
  40046e: e8 3d 02 00 00       callq  4006b0 <__do_global_ctors_aux>
  400473: 48 83 c4 08           add    $0x8,%rsp
  400477: c3                   retq   


Disassembly of section .plt:


0000000000400478 <printf@plt-0x10>:
  400478: ff 35 72 0b 20 00     pushq  0x200b72(%rip)        # 600ff0 <_GLOBAL_OFFSET_TABLE_+0x8>
  40047e: ff 25 74 0b 20 00     jmpq   *0x200b74(%rip)        # 600ff8 <_GLOBAL_OFFSET_TABLE_+0x10>
  400484: 0f 1f 40 00           nopl   0x0(%rax)


0000000000400488 <printf@plt>:
  400488: ff 25 72 0b 20 00     jmpq   *0x200b72(%rip)        # 601000 <_GLOBAL_OFFSET_TABLE_+0x18>
  40048e: 68 00 00 00 00       pushq  $0x0
  400493: e9 e0 ff ff ff       jmpq   400478 <_init+0x18>


0000000000400498 <puts@plt>:
  400498: ff 25 6a 0b 20 00     jmpq   *0x200b6a(%rip)        # 601008 <_GLOBAL_OFFSET_TABLE_+0x20>
  40049e: 68 01 00 00 00       pushq  $0x1
  4004a3: e9 d0 ff ff ff       jmpq   400478 <_init+0x18>


00000000004004a8 <__libc_start_main@plt>:
  4004a8: ff 25 62 0b 20 00     jmpq   *0x200b62(%rip)        # 601010 <_GLOBAL_OFFSET_TABLE_+0x28>
  4004ae: 68 02 00 00 00       pushq  $0x2
  4004b3: e9 c0 ff ff ff       jmpq   400478 <_init+0x18>


00000000004004b8 <__isoc99_scanf@plt>:
  4004b8: ff 25 5a 0b 20 00     jmpq   *0x200b5a(%rip)        # 601018 <_GLOBAL_OFFSET_TABLE_+0x30>
  4004be: 68 03 00 00 00       pushq  $0x3
  4004c3: e9 b0 ff ff ff       jmpq   400478 <_init+0x18>


The first part, .init, is the initialization code that runs at program startup, before main. It is followed by .plt (the Procedure Linkage Table), which contains small stubs that jump through entries in the Global Offset Table to reach functions in shared libraries, here printf, puts, __libc_start_main and scanf. This part is followed by

Disassembly of section .text:

00000000004004d0 <_start>:
  4004d0: 31 ed                 xor    %ebp,%ebp
  4004d2: 49 89 d1             mov    %rdx,%r9
  4004d5: 5e                   pop    %rsi
  4004d6: 48 89 e2             mov    %rsp,%rdx
  4004d9: 48 83 e4 f0           and    $0xfffffffffffffff0,%rsp
  4004dd: 50                   push   %rax
  4004de: 54                   push   %rsp
  4004df: 49 c7 c0 a0 06 40 00 mov    $0x4006a0,%r8
  4004e6: 48 c7 c1 10 06 40 00 mov    $0x400610,%rcx
  4004ed: 48 c7 c7 b4 05 40 00 mov    $0x4005b4,%rdi
  4004f4: e8 af ff ff ff       callq  4004a8 <__libc_start_main@plt>
  4004f9: f4                   hlt    
  4004fa: 90                   nop
  4004fb: 90                   nop

00000000004004fc <call_gmon_start>:
  4004fc: 48 83 ec 08           sub    $0x8,%rsp
  400500: 48 8b 05 d9 0a 20 00 mov    0x200ad9(%rip),%rax        # 600fe0 <_DYNAMIC+0x190>
  400507: 48 85 c0             test   %rax,%rax
  40050a: 74 02                 je     40050e <call_gmon_start+0x12>
  40050c: ff d0                 callq  *%rax
  40050e: 48 83 c4 08           add    $0x8,%rsp
  400512: c3                   retq   
  400513: 90                   nop
  400514: 90                   nop
  400515: 90                   nop
  400516: 90                   nop
  400517: 90                   nop
  400518: 90                   nop
  400519: 90                   nop
  40051a: 90                   nop
  40051b: 90                   nop
  40051c: 90                   nop
  40051d: 90                   nop
  40051e: 90                   nop
  40051f: 90                   nop

0000000000400520 <__do_global_dtors_aux>:
  400520: 55                   push   %rbp
  400521: 48 89 e5             mov    %rsp,%rbp
  400524: 53                   push   %rbx
  400525: 48 83 ec 08           sub    $0x8,%rsp
  400529: 80 3d 00 0b 20 00 00 cmpb   $0x0,0x200b00(%rip)        # 601030 <__bss_start>
  400530: 75 4b                 jne    40057d <__do_global_dtors_aux+0x5d>
  400532: bb 40 0e 60 00       mov    $0x600e40,%ebx
  400537: 48 8b 05 fa 0a 20 00 mov    0x200afa(%rip),%rax        # 601038 <dtor_idx.6559>
  40053e: 48 81 eb 38 0e 60 00 sub    $0x600e38,%rbx
  400545: 48 c1 fb 03           sar    $0x3,%rbx
  400549: 48 83 eb 01           sub    $0x1,%rbx
  40054d: 48 39 d8             cmp    %rbx,%rax
  400550: 73 24                 jae    400576 <__do_global_dtors_aux+0x56>
  400552: 66 0f 1f 44 00 00     nopw   0x0(%rax,%rax,1)
  400558: 48 83 c0 01           add    $0x1,%rax
  40055c: 48 89 05 d5 0a 20 00 mov    %rax,0x200ad5(%rip)        # 601038 <dtor_idx.6559>
  400563: ff 14 c5 38 0e 60 00 callq  *0x600e38(,%rax,8)
  40056a: 48 8b 05 c7 0a 20 00 mov    0x200ac7(%rip),%rax        # 601038 <dtor_idx.6559>
  400571: 48 39 d8             cmp    %rbx,%rax
  400574: 72 e2                 jb     400558 <__do_global_dtors_aux+0x38>
  400576: c6 05 b3 0a 20 00 01 movb   $0x1,0x200ab3(%rip)        # 601030 <__bss_start>
  40057d: 48 83 c4 08           add    $0x8,%rsp
  400581: 5b                   pop    %rbx
  400582: c9                   leaveq 
  400583: c3                   retq   
  400584: 66 66 66 2e 0f 1f 84 data32 data32 nopw %cs:0x0(%rax,%rax,1)
  40058b: 00 00 00 00 00 

0000000000400590 <frame_dummy>:
  400590: 48 83 3d b0 08 20 00 cmpq   $0x0,0x2008b0(%rip)        # 600e48 <__JCR_END__>
  400597: 00 
  400598: 55                   push   %rbp
  400599: 48 89 e5             mov    %rsp,%rbp
  40059c: 74 12                 je     4005b0 <frame_dummy+0x20>
  40059e: b8 00 00 00 00       mov    $0x0,%eax
  4005a3: 48 85 c0             test   %rax,%rax
  4005a6: 74 08                 je     4005b0 <frame_dummy+0x20>
  4005a8: bf 48 0e 60 00       mov    $0x600e48,%edi
  4005ad: c9                   leaveq 
  4005ae: ff e0                 jmpq   *%rax
  4005b0: c9                   leaveq 
  4005b1: c3                   retq   
  4005b2: 90                   nop
  4005b3: 90                   nop

00000000004005b4 <main>:
  4005b4: 55                   push   %rbp
  4005b5: 48 89 e5             mov    %rsp,%rbp
  4005b8: 48 83 ec 10           sub    $0x10,%rsp
  4005bc: b8 fc 06 40 00       mov    $0x4006fc,%eax
  4005c1: 48 89 c7             mov    %rax,%rdi
  4005c4: b8 00 00 00 00       mov    $0x0,%eax
  4005c9: e8 ba fe ff ff       callq  400488 <printf@plt>
  4005ce: b8 0e 07 40 00       mov    $0x40070e,%eax
  4005d3: 48 8d 55 fc           lea    -0x4(%rbp),%rdx
  4005d7: 48 89 d6             mov    %rdx,%rsi
  4005da: 48 89 c7             mov    %rax,%rdi
  4005dd: b8 00 00 00 00       mov    $0x0,%eax
  4005e2: e8 d1 fe ff ff       callq  4004b8 <__isoc99_scanf@plt>
  4005e7: 8b 45 fc             mov    -0x4(%rbp),%eax
  4005ea: 3d 00 04 00 00       cmp    $0x400,%eax
  4005ef: 75 0c                 jne    4005fd <main+0x49>
  4005f1: bf 11 07 40 00       mov    $0x400711,%edi
  4005f6: e8 9d fe ff ff       callq  400498 <puts@plt>
  4005fb: eb 0a                 jmp    400607 <main+0x53>
  4005fd: bf 1a 07 40 00       mov    $0x40071a,%edi
  400602: e8 91 fe ff ff       callq  400498 <puts@plt>
  400607: b8 00 00 00 00       mov    $0x0,%eax
  40060c: c9                   leaveq 
  40060d: c3                   retq   
  40060e: 90                   nop
  40060f: 90                   nop

0000000000400610 <__libc_csu_init>:
  400610: 48 89 6c 24 d8       mov    %rbp,-0x28(%rsp)
  400615: 4c 89 64 24 e0       mov    %r12,-0x20(%rsp)
  40061a: 48 8d 2d 03 08 20 00 lea    0x200803(%rip),%rbp        # 600e24 <__init_array_end>
  400621: 4c 8d 25 fc 07 20 00 lea    0x2007fc(%rip),%r12        # 600e24 <__init_array_end>
  400628: 4c 89 6c 24 e8       mov    %r13,-0x18(%rsp)
  40062d: 4c 89 74 24 f0       mov    %r14,-0x10(%rsp)
  400632: 4c 89 7c 24 f8       mov    %r15,-0x8(%rsp)
  400637: 48 89 5c 24 d0       mov    %rbx,-0x30(%rsp)
  40063c: 48 83 ec 38           sub    $0x38,%rsp
  400640: 4c 29 e5             sub    %r12,%rbp
  400643: 41 89 fd             mov    %edi,%r13d
  400646: 49 89 f6             mov    %rsi,%r14
  400649: 48 c1 fd 03           sar    $0x3,%rbp
  40064d: 49 89 d7             mov    %rdx,%r15
  400650: e8 0b fe ff ff       callq  400460 <_init>
  400655: 48 85 ed             test   %rbp,%rbp
  400658: 74 1c                 je     400676 <__libc_csu_init+0x66>
  40065a: 31 db                 xor    %ebx,%ebx
  40065c: 0f 1f 40 00           nopl   0x0(%rax)
  400660: 4c 89 fa             mov    %r15,%rdx
  400663: 4c 89 f6             mov    %r14,%rsi
  400666: 44 89 ef             mov    %r13d,%edi
  400669: 41 ff 14 dc           callq  *(%r12,%rbx,8)
  40066d: 48 83 c3 01           add    $0x1,%rbx
  400671: 48 39 eb             cmp    %rbp,%rbx
  400674: 72 ea                 jb     400660 <__libc_csu_init+0x50>
  400676: 48 8b 5c 24 08       mov    0x8(%rsp),%rbx
  40067b: 48 8b 6c 24 10       mov    0x10(%rsp),%rbp
  400680: 4c 8b 64 24 18       mov    0x18(%rsp),%r12
  400685: 4c 8b 6c 24 20       mov    0x20(%rsp),%r13
  40068a: 4c 8b 74 24 28       mov    0x28(%rsp),%r14
  40068f: 4c 8b 7c 24 30       mov    0x30(%rsp),%r15
  400694: 48 83 c4 38           add    $0x38,%rsp
  400698: c3                   retq   
  400699: 0f 1f 80 00 00 00 00 nopl   0x0(%rax)

00000000004006a0 <__libc_csu_fini>:
  4006a0: f3 c3                 repz retq 
  4006a2: 90                   nop
  4006a3: 90                   nop
  4006a4: 90                   nop
  4006a5: 90                   nop
  4006a6: 90                   nop
  4006a7: 90                   nop
  4006a8: 90                   nop
  4006a9: 90                   nop
  4006aa: 90                   nop
  4006ab: 90                   nop
  4006ac: 90                   nop
  4006ad: 90                   nop
  4006ae: 90                   nop
  4006af: 90                   nop

00000000004006b0 <__do_global_ctors_aux>:
  4006b0: 55                   push   %rbp
  4006b1: 48 89 e5             mov    %rsp,%rbp
  4006b4: 53                   push   %rbx
  4006b5: 48 83 ec 08           sub    $0x8,%rsp
  4006b9: 48 8b 05 68 07 20 00 mov    0x200768(%rip),%rax        # 600e28 <__CTOR_LIST__>
  4006c0: 48 83 f8 ff           cmp    $0xffffffffffffffff,%rax
  4006c4: 74 19                 je     4006df <__do_global_ctors_aux+0x2f>
  4006c6: bb 28 0e 60 00       mov    $0x600e28,%ebx
  4006cb: 0f 1f 44 00 00       nopl   0x0(%rax,%rax,1)
  4006d0: 48 83 eb 08           sub    $0x8,%rbx
  4006d4: ff d0                 callq  *%rax
  4006d6: 48 8b 03             mov    (%rbx),%rax
  4006d9: 48 83 f8 ff           cmp    $0xffffffffffffffff,%rax
  4006dd: 75 f1                 jne    4006d0 <__do_global_ctors_aux+0x20>
  4006df: 48 83 c4 08           add    $0x8,%rsp
  4006e3: 5b                   pop    %rbx
  4006e4: c9                   leaveq 
  4006e5: c3                   retq   
  4006e6: 90                   nop
  4006e7: 90                   nop

.text houses the executable code in the ELF binary. This is followed by

Disassembly of section .fini:

00000000004006e8 <_fini>:
  4006e8: 48 83 ec 08           sub    $0x8,%rsp
  4006ec: e8 2f fe ff ff       callq  400520 <__do_global_dtors_aux>
  4006f1: 48 83 c4 08           add    $0x8,%rsp
  4006f5: c3                   retq   

Now what is important to us is the executable code. We see that .text is divided into a number of labelled functions, such as

00000000004004d0 <_start>:

0000000000400590 <frame_dummy>:

00000000004005b4 <main>:

These are the functions written by the user, plus those added by the compiler and C runtime to start the program on the OS and link it to shared code. Since main is the only function that appears to have been written by the programmer, let's check out 00000000004005b4 <main>:

On close observation we spot an instruction

4005ea: 3d 00 04 00 00        cmp    $0x400,%eax 

It is preceded by 

4005e2: e8 d1 fe ff ff        callq  4004b8 <__isoc99_scanf@plt>
4005e7: 8b 45 fc              mov    -0x4(%rbp),%eax

And followed by

 4005ef: 75 0c                 jne    4005fd <main+0x49>
  4005f1: bf 11 07 40 00        mov    $0x400711,%edi
  4005f6: e8 9d fe ff ff        callq  400498 <puts@plt>
  4005fb: eb 0a                 jmp    400607 <main+0x53>
  4005fd: bf 1a 07 40 00        mov    $0x40071a,%edi
  400602: e8 91 fe ff ff        callq  400498 <puts@plt>
  400607: b8 00 00 00 00        mov    $0x0,%eax

With a little common sense we can see that this code comes directly after the scanf call, loads the value scanf stored at -0x4(%rbp) into %eax, and prints to the screen based on the result of the compare instruction.

4005ea: 3d 00 04 00 00        cmp    $0x400,%eax 

compares the contents of register %eax with 0x400. If the two are not equal, the next instruction (jne, jump if not equal) branches to 0x4005fd

4005ef: 75 0c                 jne    4005fd <main+0x49>

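where it calls puts (puts@plt, "put string") to print wrong. The compiler substituted puts for our printf because the string has no format specifiers and ends in a newline.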

  4005fd: bf 1a 07 40 00        mov    $0x40071a,%edi
  400602: e8 91 fe ff ff        callq  400498 <puts@plt>
  400607: b8 00 00 00 00        mov    $0x0,%eax

If the values are equal, execution falls through to the next instructions, where it calls puts to print correct!

  4005f1: bf 11 07 40 00        mov    $0x400711,%edi
  4005f6: e8 9d fe ff ff        callq  400498 <puts@plt>
  4005fb: eb 0a                 jmp    400607 <main+0x53>

and then jumps to 0x400607 (the standard way a compiler implements an "if-else" statement), so as to skip the other puts call.
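The branch structure of this part of main can be written back in C with goto statements, one per jump in the disassembly. This is a hand-written sketch, not decompiler output: check_password is a made-up name, the function returns the string instead of calling puts, and the strings at 0x400711 and 0x40071a are assumed to be "correct!" and "wrong" based on the source shown earlier.

```c
/* Returns the string that main would pass to puts for a given input.
 * The parameter is named after the register that holds the value. */
const char *check_password(int eax)
{
    const char *edi;

    if (eax != 0x400)      /* cmp $0x400,%eax ; jne 4005fd */
        goto wrong;
    edi = "correct!";      /* mov $0x400711,%edi (assumed string) */
    goto done;             /* jmp 400607 */
wrong:
    edi = "wrong";         /* mov $0x40071a,%edi (assumed string) */
done:
    return edi;            /* the real code calls puts, then returns 0 */
}
```

Note that the jump is taken when the values differ, so the "else" body sits at the jump target and the "if" body is the fall-through path.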

So our program compares the input with 0x400 and prints accordingly. The password must therefore be 0x400, or 1024 in decimal, provided the input is not altered between scanf and the comparison. We see that nothing touches it in between, so straightaway try 1024 as the password.

Enter Password:-1024
correct!
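The numbers used above can be cross-checked against the raw bytes that objdump prints next to each instruction. In 3d 00 04 00 00 the byte 3d is the opcode and the remaining four bytes are the immediate, stored least significant byte first. In 75 0c the byte 0c is a displacement counted from the end of the two-byte jump. The helper names below are made up for this sketch.

```c
#include <stdint.h>

/* Assemble a 32-bit little-endian immediate from four instruction bytes. */
uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Target of a two-byte short jump (opcode + rel8) located at addr.
 * The displacement is signed and relative to the next instruction. */
uint64_t short_jump_target(uint64_t addr, unsigned char rel8)
{
    return addr + 2 + (int8_t)rel8;
}
```

Applied to the bytes 00 04 00 00, le32 gives 0x400, which is 1024 in decimal, and short_jump_target(0x4005ef, 0x0c) gives 0x4005fd, matching the address objdump shows for the jne.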

Congratulations! You have successfully reverse engineered the binary and cracked the password. Of course, we knew it beforehand. For further reading on ELF (a widely used executable format today), see http://www.acsu.buffalo.edu/~charngda/elf.html
