Last module we traced user input through a simple program. This time we are going to see what happens when we set up a simple validation routine. As usual, we are going to expand on previously created programs and look at the assembly to figure out what’s happening. Let’s start off by looking at the binary for this module.
Scanning through the assembly, we see some familiar function calls to printf() and scanf(). There’s also a call to a function named validation, which is followed by a set of instructions we haven’t seen before. Let’s take a look at what those instructions are.
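- test performs a bitwise AND operation on two operands, sets the flags based on the result, and discards the result itself.
- je is jump if equal; it jumps when the ZF flag is 1.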
- jmp performs an unconditional jump.
Let’s figure out this section of code. The test instruction will set the ZF flag to 1 if eax is 0. Then je will jump to the specified address if the ZF flag is 1. This means we are checking the eax register to see if it’s 0, and if it is, we jump to another section of code. If not, execution falls through to the next block of instructions, which prints something out. After printing, we jump to the end of the main() function. If eax had been 0, we would have jumped to another print operation and continued to the end of the main() function.
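To make the flag logic concrete, here is a minimal C sketch of what test and je are doing. The function names zf_after_test and branch_taken are made up for illustration and do not appear in the binary; this models the instructions, it is not decompiled output.

```c
#include <stdint.h>

/* Models how `test a, b` sets the zero flag: AND the operands,
   throw the result away, and set ZF to 1 only when the result is 0. */
static int zf_after_test(uint32_t a, uint32_t b)
{
    return (a & b) == 0;
}

/* Models `test eax, eax` followed by `je`. With identical operands
   the AND result is eax itself, so ZF is 1 exactly when eax is 0. */
static const char *branch_taken(uint32_t eax)
{
    if (zf_after_test(eax, eax)) {
        return "je taken";       /* eax == 0: jump to the other print */
    }
    return "fall through";       /* eax != 0: run the next block */
}
```

Calling branch_taken(0) models the case where the jump happens, and any nonzero value models execution falling through to the first print.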
In plain language, this block of code prints a statement based on the value in eax. We make a call to validation before evaluating eax, so a logical guess would be that we are testing the result of the validation call. Think about the relationship between eax and the result of the call to validation. If you’re thinking that return values are passed from the callee back to the caller in eax, you would be correct. So validation is most likely passing a value back to main(), and we are printing something out based on that value.
With that in mind, let’s take a look at what validation contains:
A few things might jump out at you as you glance over the code. If not, don’t worry about it; just start looking at it line by line. The first thing I noticed was a call to the function strcmp(), which takes two string arguments and returns 0 if they are equal and nonzero if they are not. Next we see a familiar instruction,
test eax, eax, which is followed by a new instruction:
- jne is jump not equal.
The jne instruction will jump to the specified address if the ZF flag is 0, which after test eax, eax means eax is not zero. In other words, the jump is taken when the two arguments to strcmp() are not the same. The jump takes us to an instruction that moves 0 into eax, which is the value we return to main(). Otherwise, we fall through the jne instruction, return a 1 to main(), and jump to the leave instruction.
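Putting that together, the logic of validation could be sketched in C roughly like this. This is a guess reconstructed from the assembly, not the original source: the parameter names and their order are assumptions, and the comments only point at the instructions described above.

```c
#include <string.h>

/* Hypothetical reconstruction of validation():
   returns 1 when the two strings match and 0 when they differ. */
int validation(const char *input, const char *expected)
{
    if (strcmp(input, expected) != 0) {  /* test eax, eax / jne taken */
        return 0;                        /* strings differ: 0 goes into eax */
    }
    return 1;                            /* fell through jne: return 1 */
}
```

Note that strcmp() and validation use opposite conventions: strcmp() returns 0 for a match, while this sketch returns 1 for a match, which is why the jne is needed to flip the result.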
Let’s go back through what we have just learned about our program. It takes user input through the scanf() function, most likely prompted by the printf() call before it. We can then see in the assembly that we pass that input and another memory address to the validation function. The other memory address, ebp-0xc, must be where the other string for the comparison comes from.
Then, based on whether those two strings are equal, validation returns either a 0 or a 1. That value is then tested, and the program prints something out based on it.
Taking a General View:
We already know that if we look at the memory addresses of the program where the strings are stored, we are going to get a mess of ASCII values. So instead of trying to figure out what those strings are by grinding out values in an ASCII table, let’s use some available tools. We can take a look at the strings in the binary using rabin2, which is included in the radare2 toolset. By using the command
rabin2 -z mod7gcc32 we get the following output:
With the knowledge of the strings, we can guess that this is a password check routine. The easiest thing to do is to try the one string that doesn’t make sense as a message to the screen.
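For some intuition about what a string-listing tool is doing for us, here is a rough C sketch that counts runs of printable ASCII bytes in a buffer. This is only a simplified model and not how rabin2 is actually implemented; the function name count_printable_runs and the minimum-length parameter are assumptions made up for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Counts runs of at least min_len consecutive printable ASCII bytes.
   Simplified model of string extraction, not rabin2's real algorithm. */
size_t count_printable_runs(const unsigned char *buf, size_t len, size_t min_len)
{
    size_t count = 0;  /* number of qualifying runs found so far */
    size_t run = 0;    /* length of the current printable run */

    for (size_t i = 0; i < len; i++) {
        if (isprint(buf[i])) {
            run++;
        } else {
            if (run >= min_len) {
                count++;   /* a non-printable byte ended a long enough run */
            }
            run = 0;
        }
    }
    if (run >= min_len) {
        count++;           /* the buffer ended in the middle of a run */
    }
    return count;
}
```

A real tool also reports where each string lives and which section it came from, which is what makes its output so useful for spotting a stored password.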
How we define our end goal will determine what we need to do to be successful at reverse engineering an application. For example if our end goal was to simply figure out what the correct password was in this validation scheme we could have achieved success without even glancing at the assembly instructions. All we had to do was take a look at the strings and we found the right password. Realistic? Not at all. But if that’s our measure of success we would have succeeded.
A different definition of success could be that we understand the control flow of the program and how the validation scheme works. We found that there was a stored value passed to a function, which used a C library function to give a simple “yes” or “no” answer as to whether we have the right password. We don’t need to follow the assembly exactly to figure that out either.
If our definition of success was determining whether this is a good validation scheme, we would need to evaluate the binary at a more in-depth level. If you look closely, you will see that the user input is handled in exactly the same way as the program in module 6 handled it. You can also see the same code with respect to the ecx register. This tells us that this code is just as vulnerable to a buffer overflow and has the same protection as the code from module 6.
Making Use of Available Tools:
Just to take a look at the binary in another way we can load it into another tool. For this portion I’m going to use Binary Ninja to demonstrate a control flow graph (CFG).
This is the CFG for the main() function. We can get a lot more information much faster by using tools such as Binary Ninja. Notice how we can see the string “R3versing” being pushed onto the stack right before the call to validation without needing to go hunting for strings.
While Binary Ninja certainly makes things easier it is not free. There is a demo version with limited functionality available that you can play around with.
We have now seen a very simple validation scheme in assembly. You may be asking yourself why we bother with the assembly dump when you can just use a powerful tool like Binary Ninja, radare2, Hopper, or IDA to evaluate these binaries. The answer to that question is twofold.
First, after evaluating the assembly of the binary by itself you should already have an idea of the logic and control flow of the program. That makes understanding what you are looking at in more powerful tools more meaningful and easier to comprehend. Experience will make it even easier to interpret what more powerful tools are telling you.
Second, powerful tools only take you so far. What would happen if you only knew how to read a control flow graph in Binary Ninja and you opened up a binary that just dumped nameless sections with big blocks of code? You’d have to dig into the code exactly the way we are doing here in order to use the features Binary Ninja offers to construct something more readable and easier to deal with. We will see some of this later when dealing with binaries that are not so simple.
Your homework is to trace the user-supplied data through the binary and figure out everywhere it is placed and exactly how it is used. Try drawing a diagram of the stack and mapping out where the data is placed and used.