In today’s module we are going to build on the empty main function we looked at in assembly module 2. In the process we are going to grab a .gdbinit file that will make gdb a more useful tool. I also got the naming a little off: the files for this module are named mod3, not mod4, so keep that in mind when looking at the repo.
GDB is the GNU Project Debugger, which comes with most Linux distros. To add some more information and functionality, grab this file and place it in your home directory as ~/.gdbinit. It may take some getting used to, but if you’re not used to working on the command line it will be worth the effort.
Let’s start by looking at the binary mod3gcc32.
This is the output of gdb with the .gdbinit file from above in place. I set a breakpoint at the main function using the break main command, then executed the run command. Execution has stopped at the beginning of the main() function. At the top we have the registers displayed, and underneath them the remaining disassembled code. Notice that the function prologue (the code that sets up the stack frame) has already executed by the time the breakpoint is hit.
The first and second instructions we see extend the stack frame by 16 bytes. Remember that the stack grows down toward lower addresses and that we are using hex notation. So by subtracting 0xc + 0x4 = 0x10 from esp we increase the allocated stack memory by 16 bytes.
Look at the next instruction. We are pushing a hex value. That hex value looks a lot like the values in the leftmost column of our display, so a logical guess is that we are pushing an address onto the stack. We could go jumping around in memory to see what’s there, but first we should look at the context in which the address is used. That tells us what kind of information we should be looking for.
We see that the next instruction is a call to the puts function. Let’s take a look at the puts function definition. It turns out that puts(const char *str) accepts a pointer to a string as its argument. In the 32-bit Linux execution environment, arguments for a function call are passed on the stack from right to left. In other words, they are pushed in reverse order, which leaves the first argument on top of the stack.
In this instance we are pushing the address of a string onto the stack for the puts function to use as its argument. Let’s take a minute to remember that in C a string is an array of characters, and that an array passed to a function decays to a pointer to its first element. So let’s take a look at the place the pointer is aimed at, keeping in mind that we are looking for a string:
Well, that doesn’t look much like a string at first glance. We need to remember how characters are stored: as ASCII or Unicode values. So let’s look at an ASCII table and see what those values turn out to be.
6c = l, 65 = e, 48 = H, so the first chunk is lleH. 45 = E, 52 = R, 20 = SPACE, 6f = o, so the second chunk is ER o. 73 = s, 72 = r, 65 = e, 76 = v, so the third chunk is srev. The last chunk is 21 = !, 67 = g, 6e = n, 69 = i, which is !gni. Arranging the chunks in order we have lleH ER o srev !gni. That doesn’t make any sense until we remember that IA32 is a little-endian architecture, so when memory is displayed as 4-byte words the bytes inside each word appear in reverse order. Reversing each chunk we get Hell o RE vers ing! = Hello REversing!
So we have a program that prints out Hello REversing! to the screen.
But are we sure that we are done? The only way to know for sure is to go through the remaining instructions.
add esp, 0x10 moves the top of the stack back to where it was before those 16 bytes were allocated. We then place 0x0 in eax, which we learned previously holds the return value. Then we see a value moved into ecx.
Let’s figure out what the instruction mov ecx, DWORD PTR [ebx - 0x4] is doing. It turns out that the DWORD PTR prefix tells us that we are moving a 32-bit value (a double word) located at the memory address ebx - 0x4 into the ecx register.
Exercise: Run the command disass main in gdb with mod3gcc32 and see if you can trace what is going on with this particular address.
Exercise: If you have installed the radare2 tool set, use rabin2 with various options on mod3gcc32 and see whether knowing the results would have helped you when going through the mod3gcc32 binary.
Let’s review what we have learned in this quick exploration. First we set up gdb so that it displays some useful information when we debug a program. Most beneficial to us is that it prints out the next set of instructions.
We then learned that in IA32 programs arguments are passed on the stack from right to left. So when we see a memory address pushed onto the stack right before a function call, that tells us to look at the function and find out what kind of arguments it takes. In our case we found out that the function takes a pointer to a string. So we looked at the memory address and found the ASCII values for the string Hello REversing!
I’m going to introduce the Application Binary Interface (ABI) here. Much like an Application Programming Interface (API), the ABI defines how a program interfaces with the system at the binary level. We will dig into more of the specifics as we learn more about interacting with binaries. Things like how arguments are passed between the calling function and the callee are defined in the ABI. You can find the System V ABI here. But you can also find the information in other places. For example, here are the calling conventions for the x86 architecture. We will end up referencing the ABIs just like the manuals for the CPUs we are dealing with.
We have figured out the functionality of our given binary. So today you have successfully reverse engineered your first program that does something besides setting up a stack frame and exiting. Hopefully even though it was simple you had fun doing it. Next time we will look at an even more complicated program and dig into reverse engineering it. There are binaries that have been compiled under different systems in my GitHub repo. Specifically I would suggest taking a look at the 64-bit binaries and the C++ binaries to see how they differ.
Remember that what we find in a binary is determined by the compiler and the compiler options used to create the binary. Until next time, add a few things to the mod3.c and mod3.cpp programs in the repo, compile them with different compilers, and see how the disassembly changes. That’s how we develop intuition.