Step One: Down The Rabbit Hole.

The point of reverse engineering software is to determine what a program actually does, which is not always what it was designed to do.  If we don’t have the source code for the program, we have to figure out what’s going on from the assembly language.

In the books section of the blog you can find “Reverse Engineering for Beginners”, which is an awesome book that I highly recommend.  It’s also free, so why wouldn’t you get it?  We are going to use it to get a feel for which assembly matches which C program constructs.  To get started, in this post we are going to reverse engineer an executable.  I’m not going to explain many of the details, just give the example, which comes from the book.

Binary

We have a binary executable file.  What’s a binary executable file?  It’s a file made of machine code, usually created by a compiler, designed to run on the architecture it was compiled for.  We don’t know anything about our file other than the fact that it’s a binary.

So what the heck do we do?  We find out what it is.  I’m on a Linux machine, and there are different tools for different platforms.  In this case we are going to use the file command.  You can read about the file command here, and you probably should.  Or you can type $ man file into your terminal on a Linux box and read the same thing.

$ file first
first: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=bc444e3a2949b86644e5839af0da750d9a6893ae, not stripped

We get a lot of information here about what kind of file this is.  The most important part for today is that it’s a 32-bit ELF executable for x86 (Intel 80386).  It is also not stripped, so symbol names such as main are still in the file.
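If you’re curious where the “ELF 32-bit LSB” part of that output comes from, here is a minimal sketch in C that looks at the first few identification bytes of an ELF header.  The function name describe_elf_ident is made up for this example, and the real file command checks far more than this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the file command): describe the
 * identification bytes at the very start of a file.
 * Layout per the ELF specification: bytes 0-3 are the magic number
 * 0x7f 'E' 'L' 'F', byte 4 is the class (1 = 32-bit, 2 = 64-bit) and
 * byte 5 is the data encoding (1 = little endian, 2 = big endian). */
const char *describe_elf_ident(const unsigned char *ident, size_t len)
{
    /* The magic is split into two literals so that 'E' is not parsed
     * as part of the \x7f hex escape. */
    if (len < 6 || memcmp(ident, "\x7f" "ELF", 4) != 0)
        return "not ELF";

    if (ident[4] == 1 && ident[5] == 1) return "ELF 32-bit LSB";
    if (ident[4] == 1 && ident[5] == 2) return "ELF 32-bit MSB";
    if (ident[4] == 2 && ident[5] == 1) return "ELF 64-bit LSB";
    if (ident[4] == 2 && ident[5] == 2) return "ELF 64-bit MSB";
    return "ELF, unknown class or encoding";
}
```

To try it on a real file, read the first six bytes with fread and pass the buffer in.  For our binary I would expect it to agree with the first words of the file output above.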

What Next

We could just run the program, but running a binary we know nothing about could leave us screwed.  So how do we figure out what’s going on without running it?  Throw it into a disassembler and see what we can find out.

I am going to use Binary Ninja and see what I can figure out.

[Image: the first binary opened in Binary Ninja]

The part in the above image we want to pay attention to is the main function.

We are presented with the assembly language instructions that make up the main function of the first program.  The left column, directly under main:, is the address of each instruction; the next column to the right is the raw machine code bytes (the opcode); the last is the assembly instruction.

I’ll go over a lot of this in depth as the blog progresses, so I’m just going to go over what’s happening in this function really quickly.  First the ebp register is pushed onto the stack, then the current value of esp is copied into ebp, which sets up the stack frame for the main function.  Then 0 is placed in eax, the saved ebp is popped back off the stack, and the retn instruction returns to the caller.
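To make those steps concrete, here is a toy model of them in C.  It is only a sketch: the struct regs type and the run_main function are invented for this example, the stack is an array of 32-bit slots with esp used as a slot index rather than a byte address, and the pop ebp step is taken from the gdb listing later in this post.

```c
#include <stdint.h>

/* Toy model of the three registers the function touches. */
struct regs {
    uint32_t eax;
    uint32_t ebp;
    uint32_t esp;   /* index of the top slot in stack[], not a real address */
};

/* Hypothetical simulation of main's instructions, one line each.
 * The stack grows downward, so a push decrements esp first. */
void run_main(struct regs *r, uint32_t *stack)
{
    stack[--r->esp] = r->ebp;   /* push ebp     : save the caller's frame pointer */
    r->ebp = r->esp;            /* mov ebp, esp : begin main's own stack frame    */
    r->eax = 0;                 /* mov eax, 0   : the return value lives in eax   */
    r->ebp = stack[r->esp++];   /* pop ebp      : restore the caller's frame ptr  */
    /* retn would now pop the return address and jump back to the caller. */
}
```

Calling run_main with esp pointing one past the end of an eight-slot array leaves esp and ebp exactly as they were and eax set to 0.  That is the whole job of this function: leave the caller’s state alone and hand back a zero.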

Long story short, the only thing this program does is set up the stack and return 0, which corresponds to the following C program:

int main() {
    return 0;
}

If we plug this program into an editor and compile it, we can verify whether we are correct.

Test

Here’s where we find out if we were right.  I put the exact lines of the program into a file named test.c using gvim and compiled it with the following:

$ gcc -m32 -o test -g test.c

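The -m32 flag compiles the file as a 32-bit executable, and -g adds the debug information that lets gdb show source lines.  We can then check the output in gdb with the disassemble command (disass for short); the /m modifier mixes the source lines in with the assembly.  The listing below is in Intel syntax.  gdb defaults to AT&T syntax on x86, so run set disassembly-flavor intel first if your output looks different.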

(gdb) disass /m main
Dump of assembler code for function main:
1	int main() {
   0x080483db <+0>:	push   ebp
   0x080483dc <+1>:	mov    ebp,esp

2	    return 0;
   0x080483de <+3>:	mov    eax,0x0

3	}
   0x080483e3 <+8>:	pop    ebp
   0x080483e4 <+9>:	ret    

End of assembler dump. 

This matches what we had in Binary Ninja for the main function of the first program.  (gdb prints ret where Binary Ninja prints retn; they are the same instruction.)
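The offsets in the gdb listing (+0, +1, +3, +8, +9) also tell us how many bytes each instruction takes.  Here is a rough sketch in C that walks the machine code for these five instructions and recovers those offsets.  The function names are made up for this example, the byte values in the usage note are the usual x86 encodings rather than something copied from the screenshot, and this is nowhere near a general x86 decoder.

```c
#include <stddef.h>

/* Length in bytes of the instruction starting at code[0], for the few
 * forms that appear in main.  Returns 0 for anything else. */
size_t insn_length(const unsigned char *code, size_t remaining)
{
    if (remaining == 0)
        return 0;
    switch (code[0]) {
    case 0x55: return 1;    /* push ebp */
    case 0x89:              /* mov r/m32, r32; 89 e5 is mov ebp, esp */
        /* Only the register-to-register form (ModRM top bits 11) is 2 bytes. */
        return (remaining >= 2 && (code[1] & 0xc0) == 0xc0) ? 2 : 0;
    case 0xb8: return 5;    /* mov eax, imm32: opcode plus a 4-byte value */
    case 0x5d: return 1;    /* pop ebp */
    case 0xc3: return 1;    /* ret */
    default:   return 0;
    }
}

/* Store the starting offset of each instruction in offsets[] and return
 * how many were found.  Stops at the first byte it does not recognise. */
size_t insn_offsets(const unsigned char *code, size_t len,
                    size_t *offsets, size_t max)
{
    size_t pos = 0, count = 0;
    while (pos < len && count < max) {
        size_t step = insn_length(code + pos, len - pos);
        if (step == 0 || step > len - pos)
            break;
        offsets[count++] = pos;
        pos += step;
    }
    return count;
}
```

Feeding it the ten bytes 55 89 e5 b8 00 00 00 00 5d c3 gives five instructions starting at offsets 0, 1, 3, 8 and 9, the same offsets gdb shows in angle brackets.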

Conclusion

We have reverse engineered our first program.  Admittedly, it wasn’t a very complicated program, but it’s a start, and we now know how the stack frame is set up for a function in x86 assembly language.  I like getting my hands dirty from the start, so I wanted to post a really simple example of reversing a program at the beginning.  There’s a lot here to dissect in the next few posts.

As we move forward we will get more complicated and see how different things work on different platforms.  The goal is to work on reverse engineering every day.  That way it never falls into the “lose it” half of “use it or lose it”.

 
