Computing Quick and Dirty

Last time we figured out which C program corresponded to a binary I had lying around.  We figured out what the program was, but didn’t discuss what a program really is.  So this time let’s look at what a program really is when we throw it at a computer.

Executable

An executable is a file that can be run on the machine it’s designed for.  If you know what reverse engineering software is, you probably understand what an executable is.  The point of reverse engineering is to take that executable and recover the code, or at least the functionality, from it.

It’s A Program…

Let’s trace a program from being written in a high level language (C is the main high level language until further notice) to execution.

First, the code has to be written and saved in a text editor of your choosing.  You had a goal, and you wrote some code to accomplish it.  Second, you pull out your handy C compiler and try compiling your program.  Assuming all is good and there are no errors, your C compiler runs through roughly four stages and spits out an executable file.

That whole “four stages of compilation” bit glosses over a lot.  It’s actually one of the more important parts of this whole process, because how the program is compiled and converted into machine code directly affects what we see in the disassembly of that program.  Let’s look at an example of just how important that part is.

Different Compiler Options:

Here’s a really simple program that runs a for loop and spits out whatever number it’s on:

#include <stdio.h>

int main(void) {
    
    for(int i = 0; i < 10; i++) {
        printf("%d\n", i);
    }

    return 0;
}

Simple program with nothing crazy going on.  Let’s take a look at what happens when we compile with no flags and then compile with an optimization flag in gcc.

No flags:

[Image: disassembly graph of the program compiled with no flags]

With an optimization flag (-O):

[Image: disassembly graph of the program compiled with -O]

They look a heck of a lot different, but both versions do exactly the same thing.

What Happens Next:

When the program is run, it’s loaded into memory, and then something that looks like magic decides when to hand the executable’s instructions to the CPU.  It’s not magic, though; it’s a bunch of parts in the operating system and the hardware working together.

That Important Part:

So that compilation part is really important.  I’ll say it again, that compilation part is really important.  So important that learning how compilers work is going to be a big part of learning to do reverse engineering.  We are going to dig down into some math too.  I love all that shit.

We’ve Got It Out, Let’s Take A Look:

Since we’ve got some assembly up in those pictures, let’s see what we can determine about the program code while we’re at it.

In the first compilation we used no flags.  We see that 0x0 is moved into a memory address, so we are storing the counter variable i in memory.  That’s relatively expensive (compared to, say, keeping it in a register).  How do I know we are storing it in memory?  In Intel-syntax x86 and x86_64 assembly, square brackets [] mark a memory operand.

Then we execute an unconditional jump down to the comparison, compare the value in memory to 0x9, and act appropriately.  If the value is 9 or less, we move to the code block on the left in the graph.  There we move a bunch of stuff around (technical, I know), print the number, increment the value in memory, and jump back to the compare to do it all over again until the memory contains 0xa = 10.  When that happens, we move to the code block on the right and return to the calling function.

In the code with the optimization compiler flag we are doing the same process, just in a different way.  First, all the work, except the call to printf, is done in registers.  The register ebx is our counter, and when it hits 0xa the loop terminates.  That’s basically all there is to it from a bird’s-eye view.  Notice that we don’t even have a stack frame set up in the optimized version; it all happens in registers.

Execution:

When the binary executes, the instruction pointer (in this case rip) moves through the machine code one instruction at a time, following jumps and calls along the way, until the function returns control to the calling function.

Conclusion:

That’s a bird’s-eye view of what actually occurs for a program from start to execution.  By bird’s-eye I mean looking down at Earth from the space station.  I enjoy all this stuff a lot, which is why I’m writing this blog in the first place.  There is a whole library of books on the topics we are going to cover as time moves forward.  There should be some exciting discoveries along the way and a much better understanding of how programs and computers work.
