In today’s module we are going to start covering some of the architecture of the computer, specifically the CPU. No background is needed for this module beyond knowing that computers exist and that they do things.
Shake Hands With Your Computer:
We can all agree that computers exist and they do “things”. What those things are depends on what you are using your computer for. What you think your computer does and what it actually does may not be the same thing, though. As I’m typing this on my keyboard and watching the words appear on the screen, there are multiple processes occurring below the surface that I can’t see.
Each key on the keyboard is mapped to an action somewhere in the black box sitting next to me on my desk. That action is then mapped to another action somewhere else in the box based on what program I’m interacting with. Am I typing words? Am I moving a player in a video game? It’s all based on context. This introduces us to a concept of abstraction.
In a very general sense, abstraction in the world of computers is the hiding of details to make understanding easier. In the keyboard example, I don’t need to know how it happens, just that a “k” will appear on the screen when I hit the “k” key on the keyboard. There are programs that deal with the details.
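As a rough illustration of that idea, the Python sketch below hides the “how” behind a single function. The names KEY_ACTIONS and press_key and the two contexts are made up for this example; a real keyboard driver and operating system are far more involved.

```python
# Hypothetical mapping: the same key means different things in different programs.
KEY_ACTIONS = {
    "text_editor": {"k": "print the letter k", "w": "print the letter w"},
    "video_game": {"k": "kick", "w": "move player forward"},
}


def press_key(context, key):
    """Return the action a key press triggers in the given context."""
    # The caller never sees how the lookup works, only the result.
    return KEY_ACTIONS[context].get(key, "do nothing")


print(press_key("text_editor", "k"))  # print the letter k
print(press_key("video_game", "w"))   # move player forward
```

Whoever calls press_key only needs to know which key was hit and which program is active. That is the abstraction at work.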
What we are going to do in the set of architecture modules is strip away that abstraction to gain a better understanding of what actually happens when we run a program on a computer.
Simple Is Good:
Let’s forget everything about the computer you’re reading this on. It could be a phone, laptop, tablet, desktop, or something in between. It doesn’t matter right now. Imagine a black box that takes two numbers as input and outputs their sum, like a Magic 8 Ball.
What has to happen in that black box? It has to accept two numbers, then it needs an instruction to add them, and then an instruction to display the result. This isn’t as simple as it seems at first glance. Let’s write out the instructions:
- Accept first number
- Store first number somehow
- Accept second number
- Add second number somehow to first number
- Store new number
- Display new number
We have six instructions that are actually pretty high level. How do we store those numbers? How do we read them in? How do we display them? The list goes on. Each one of those instructions could be broken down into many more instructions just to get that one task accomplished.
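To make the six steps concrete, here is a minimal Python sketch of the box. The function name magic_box and its variable names are invented for illustration. Note that every line here is itself an abstraction: the assignment, the addition, and the print each hide many lower level instructions.

```python
def magic_box(first, second):
    """Walk through the six steps from the list above."""
    stored = first               # accept the first number and store it
    incoming = second            # accept the second number
    stored = stored + incoming   # add the second number to the first, store the new number
    print(stored)                # display the new number
    return stored


magic_box(2, 3)  # prints 5
```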
Let’s make it simpler. Our magic box just stores a secret number we give it somehow. Now our instructions are reduced to:
- Accept number
- Store number
That’s a bit easier and we can work with it. The box doesn’t understand any human language, though, so we can’t just tell it what to do directly. We can, however, apply a current to the box in a sequence that it understands as instructions. We can represent those sequences in binary: 1 for current, 0 for no current. That binary is the language our black box speaks.
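The sketch below shows one way to picture a number as a sequence of “current” and “no current” signals. The helper names to_signals and from_signals and the default width of 8 signals are assumptions made for this example, not anything a real CPU defines.

```python
def to_signals(number, width=8):
    """Return the number as a string of 1s (current) and 0s (no current)."""
    if not 0 <= number < 2 ** width:
        raise ValueError("number does not fit in the given width")
    return format(number, f"0{width}b")


def from_signals(signals):
    """Read a string of 1s and 0s back into a number."""
    return int(signals, 2)


print(to_signals(5))              # 00000101
print(from_signals("00000101"))   # 5
```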
Central Processing Unit:
That black box is our CPU, or Central Processing Unit. It carries out the instructions that have been “taught” to it using sequences of binary. No matter what program you’re running, the instructions ultimately end up in the CPU and get executed there.
The instructions aren’t executed in any form a human would recognize as language; they are in binary. Those binary instructions take the form of opcodes (operation codes) that run through the CPU in a sequence defined by the executable program being run.
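Here is a toy sketch of a CPU stepping through opcodes in sequence. The three opcodes (LOAD, ADD, HALT), their binary values, and the accumulator design are all invented for this example; no real CPU uses this encoding.

```python
# Made-up opcodes for a toy machine.
LOAD = 0b0001   # put the operand in the accumulator
ADD = 0b0010    # add the operand to the accumulator
HALT = 0b1111   # stop running


def run(program):
    """Execute a list of numbers laid out as opcode, operand, opcode, operand, ..."""
    accumulator = 0
    position = 0
    while position < len(program):
        opcode = program[position]
        if opcode == HALT:
            break
        operand = program[position + 1]
        if opcode == LOAD:
            accumulator = operand
        elif opcode == ADD:
            accumulator += operand
        else:
            raise ValueError(f"unknown opcode: {opcode:04b}")
        position += 2
    return accumulator


program = [0b0001, 2, 0b0010, 3, 0b1111]  # LOAD 2, ADD 3, HALT
print(run(program))  # 5
```

The program is nothing but a list of numbers. It only means “load 2, add 3, stop” because this particular toy machine agrees to read it that way.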
Not all CPUs are the same. There are many different types with different features. ARM chips are commonly used in tablets and phones. The x86 family refers to the processors descended from Intel’s 8086 chip. In my reading, the two most common ways to differentiate between CPUs are:
- Address width: 16-bit, 32-bit, 64-bit, etc. The size of a memory address available to the CPU is one way that CPUs differ. The address space of a 64-bit architecture is much larger than that of a 32-bit one, for example. But we’ll get to address space later.
- Instruction set: The instruction set is another way to differentiate CPUs. Some common instruction sets are ARM, MIPS, and x86. Note that the instruction set also differs based on the width of the addresses.
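To get a feel for how much the address width matters, the short sketch below counts how many distinct addresses each width can express. The function name address_space is made up for this example, and it shows only the theoretical maximum; actual chips may wire up fewer address bits than the nominal width.

```python
def address_space(bits):
    """Number of distinct addresses an address of this width can express."""
    return 2 ** bits


for bits in (16, 32, 64):
    print(bits, "bit:", address_space(bits), "addresses")
# 16 bit: 65536 addresses
# 32 bit: 4294967296 addresses
# 64 bit: 18446744073709551616 addresses
```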
Those two pieces of information determine how the CPU processes instructions. Some architectures offer backwards compatibility. For example, the x86_64 instruction set contains the 32-bit x86 instruction set, so 32-bit x86 programs are able to run on x86_64 CPUs. The reverse is not true, however: 32-bit CPUs are unable to run 64-bit programs. Cross compatibility between different instruction sets is not supported either. An x86 CPU is unable to natively run a program compiled for the MIPS architecture without a program to translate it into an x86 program.
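The sketch below hints at why a program built for one instruction set makes no sense to another. TOY_SET_A and TOY_SET_B are two invented instruction sets, not real ones; the point is only that the same numbers decode to different operations, or to nothing valid at all.

```python
# Two made-up instruction sets that assign different meanings to the same opcodes.
TOY_SET_A = {0x01: "LOAD", 0x02: "ADD", 0x03: "STORE"}
TOY_SET_B = {0x01: "JUMP", 0x02: "SUBTRACT"}


def decode(program, instruction_set):
    """Translate each opcode into its name under one instruction set."""
    return [instruction_set.get(opcode, "INVALID") for opcode in program]


program = [0x01, 0x02, 0x03]
print(decode(program, TOY_SET_A))  # ['LOAD', 'ADD', 'STORE']
print(decode(program, TOY_SET_B))  # ['JUMP', 'SUBTRACT', 'INVALID']
```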
To end this module, let’s quickly review everything and pull it all together. The Central Processing Unit (CPU) of a computer is what carries out all the instructions. Those instructions are in the form of binary and are called opcodes. The meaning of an opcode depends on the instruction set of the particular CPU being discussed, and instruction sets are not cross compatible with one another. The best we have is backwards compatibility when it is built into the chip. We will be very interested in instruction sets and CPU operation as we progress in reverse engineering.