The CPU

The processor lies at the heart of every computer. A processor is also called a CPU (central processing unit) and contains the logic necessary to read instructions from memory, decode (or interpret) them, and then carry them out (i.e., execute them).

A computer built around such a processor is also called a stored program von Neumann machine. The term stored program means that the computer program is stored in the same memory system as the data used by the program. That is, instructions and data occupy a common memory and there is not a separate memory for programs and data.

The term von Neumann machine also implies 'stored program' and honors John von Neumann, a mathematician and computer pioneer.

Not all computers are von Neumann machines. For example, the so-called Harvard architecture describes computers in which programs and data are held in separate memories.

For the purpose of this course, we are interested only in stored program computers.

A consequence of the von Neumann architecture is that for each instruction, memory has to be accessed at least twice; the first access is to read the instruction and the second access is to read data required by the instruction. This action is called the fetch/execute cycle and is typical of first-generation microprocessors and microcontrollers. Because a stored program computer has to access memory twice per instruction, the path between the processor and memory is often called the von Neumann bottleneck. This term implies that one of the factors limiting the speed of computers is the CPU to memory path.
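The fetch/execute cycle described above can be sketched in a few lines of Python. This is only an illustrative toy, not a model of any real processor: the (opcode, address) tuple encoding, the opcode names, and the memory layout are all assumptions made for the example. Instructions and data share one memory, and a counter records every access to it.

```python
# A toy stored-program machine: instructions and data share one memory list.
# The (opcode, address) tuple encoding is invented for this illustration.

def run(memory):
    """Run until HALT; return (accumulator, number of memory accesses)."""
    pc = 0          # program counter: address of the next instruction
    acc = 0         # single working register
    accesses = 0    # counts every read or write of the shared memory

    while True:
        opcode, address = memory[pc]   # first access: fetch the instruction
        accesses += 1
        pc += 1
        if opcode == "HALT":
            return acc, accesses
        if opcode == "LOAD":
            acc = memory[address]      # second access: read the operand
        elif opcode == "ADD":
            acc += memory[address]     # second access: read the operand
        elif opcode == "STORE":
            memory[address] = acc      # second access: write the result
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        accesses += 1


memory = [
    ("LOAD", 4),    # address 0
    ("ADD", 5),     # address 1
    ("STORE", 6),   # address 2
    ("HALT", 0),    # address 3
    20,             # address 4: data
    22,             # address 5: data
    0,              # address 6: the result is stored here
]

acc, accesses = run(memory)
print(acc, accesses, memory[6])   # 42 7 42
```

Each of the three working instructions touches memory twice (one fetch, one operand access), and HALT needs only its fetch, which gives seven accesses in total.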

The modern computer is rarely a pure von Neumann machine. Modern computers read several instructions at a time and store them internally in the CPU in a high-speed cache memory ready for use. This means that a computer can often access an instruction from internal cache memory and data from the main store at the same time. Such a mechanism helps overcome the von Neumann bottleneck.

Some computers overlap the execution of instructions by reading the next instruction and beginning its execution before the current instruction has finished. This improves their speed. Such computers are said to be pipelined; pipelining is a characteristic feature of RISC processors.
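The benefit of overlapping instructions can be estimated with simple arithmetic. The sketch below assumes an idealized pipeline in which every instruction passes through the same number of one-cycle stages and nothing ever stalls; the stage count of 5 and the function names are illustrative assumptions, not figures from any particular processor.

```python
def cycles_unpipelined(instructions, stages):
    """Each instruction runs through every stage before the next one starts."""
    return instructions * stages


def cycles_pipelined(instructions, stages):
    """Ideal pipeline: the first instruction takes `stages` cycles,
    and every later instruction completes one cycle after its predecessor."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


n, k = 100, 5   # hypothetical: 100 instructions, 5 pipeline stages
print(cycles_unpipelined(n, k))   # 500
print(cycles_pipelined(n, k))     # 104
```

Real pipelines fall short of this ideal because branches and data dependencies force some instructions to wait.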

Some computers have multiple execution units and are able to execute several instructions at the same time in parallel. These are called superscalar machines.

However, all these operating modes are simply variations on the stored program von Neumann computer.

The Instruction

The instruction executed by a computer consists of a string of bits. Typically, a computer instruction is 8, 16, 32, or 64 bits wide. The encoding of the bits of an instruction varies from machine to machine. For example, the bits of an instruction that is executed by an Intel Pentium machine are encoded entirely differently from the bits of an instruction that runs on a PowerPC. This means that you cannot directly move (i.e., port) machine code from one computer to another. You can move a high-level language program (C, Pascal, Java, Fortran) from one machine to another because high-level language programs are converted into machine code by a compiler before they are executed. Note that you can transfer machine-level (binary) code between machines of the same family; for example, binary code that runs on Intel's Pentium will also run on AMD's processors.

A machine-level instruction consists of two parts. The first part is the operation to be performed (e.g., addition) and the second part is the operand or operands (i.e., the data used by the instruction). For example, an instruction might be ADD A,B,C and its effect is to calculate A = B+C. However, most instructions are not like this for practical design reasons.

The operand used by an instruction can be in one of three forms. The first is the literal (i.e., constant or immediate operand). A literal is a value that does not change during the execution of a program; that is, a literal is not a variable. A literal can be stored in an instruction and does not require an additional memory access to read it.

The second form of operand is the variable that is stored in memory (or a register). For example, ADD A,B,C means add the contents of memory location B to memory location C and store the result in memory location A.

The third form of operand is the pointer. In this case, the operand specifies a variable that gives the address of the actual operand. This variable is called a pointer because it points at the operand. Pointers are used to access data structures such as tables, arrays, vectors, matrices, and lists. In everyday algebra, the expression X = Yi + 3 illustrates all three types of operand. X is a variable that can change. The '3' is a constant. The Yi is the ith element of array Y and i is a pointer to it.

Inside a computer, data and instructions can occupy three types of location. One location is a register. A register is a 'named' location within the processor that can be rapidly accessed. Nothing can be accessed faster than a register. Different computers give registers different names (the Motorola 68K uses names D0, D1, D2...; the ARM uses names r0, r1, r2...; the Pentium uses names AX, BX...). The second location for data is in main store (immediate access store). This is a very large memory where all the programs and data currently being executed are stored. The third location of programs and data is in secondary storage such as disk or CD.

Computer instructions generally come in three formats: one address, two address, or three address. The number of addresses describes the number of operand addresses used by the instruction. The next table describes these formats. Note that an address can be a register or a memory location.

Format          Typical style   Effect
Three address   ADD P,Q,R       Add Q to R and put the result in P
Two address     ADD P,Q         Add Q to P and put the result in P (this destroys the original value of P)
One address     ADD P           Add P to the contents of a temporary register called the accumulator
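The difference between the three formats can be made concrete with a short Python sketch. The function names and the use of a dictionary as the store are assumptions made purely for illustration; each function mirrors the effect described for one format.

```python
def add_three_address(store, p, q, r):
    """ADD P,Q,R : P = Q + R (neither source operand is changed)."""
    store[p] = store[q] + store[r]


def add_two_address(store, p, q):
    """ADD P,Q : P = P + Q (the original value of P is overwritten)."""
    store[p] = store[p] + store[q]


def add_one_address(store, accumulator, p):
    """ADD P : add P to the implied accumulator and return its new value."""
    return accumulator + store[p]


store = {"P": 1, "Q": 2, "R": 3}

add_three_address(store, "P", "Q", "R")   # P = 2 + 3 = 5
add_two_address(store, "P", "Q")          # P = 5 + 2 = 7
acc = add_one_address(store, 10, "P")     # accumulator = 10 + 7 = 17

print(store["P"], acc)   # 7 17
```

Note how the one address form needs no destination in the instruction at all: the accumulator is implied.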

First-generation 8-bit microprocessors and modern 8- and 16-bit microcontrollers use a one address format. One operand is specified by a memory address and the other operand is held in a register called the accumulator. You don't need to specify the accumulator because it is 'assumed'. Consider the following simple code.

LDA P
ADD Q
STA R

This sequence loads the accumulator with the contents of memory location P (LDA P). Then it adds the contents of location Q to the accumulator (ADD Q). Finally it stores the contents of the accumulator in R (STA R).

Note that 8-bit machines used consecutive 8-bit words to create an instruction that contained an 8-bit operation code followed by two 8-bit words that made up a 16-bit operand address.
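As a rough illustration of this three-byte layout, the Python sketch below packs an 8-bit operation code and a 16-bit address into consecutive bytes and unpacks them again. The opcode value 0x3A is made up, and placing the high-order address byte first is an assumption; real 8-bit processors differ in their byte order.

```python
def encode(opcode, address):
    """Pack an 8-bit opcode and a 16-bit address into three bytes.
    Assumption: the high-order address byte comes first."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    return bytes([opcode, address >> 8, address & 0xFF])


def decode(raw):
    """Recover (opcode, address) from a three-byte instruction."""
    opcode, high, low = raw
    return opcode, (high << 8) | low


raw = encode(0x3A, 0x1234)          # 0x3A is a hypothetical opcode
print(raw.hex())                    # 3a1234
print(decode(raw) == (0x3A, 0x1234))  # True
```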

Processors like the 68K and the Pentium have a two address instruction format (sometimes called a one-and-a-half address format). One address is usually a location in memory and the other address is a register. For example, the 68K can execute code like

      MOVE P,D0
      ADD  Q,D0
      ADD  R,D0
      ADD  S,D0
      MOVE D0,T

In this case we have added P, Q, R and S and put the result in T.

True three address instructions once existed in the world of the mainframe. No modern microprocessor provides a true three address format. However, modern RISC processors do provide a three address format where the operands are registers. For example, the ARM microprocessor has 16 registers, R0 to R15 (most RISC machines have 32 registers). Typical RISC code might be

      ADD R3,R4,R5
      ADD R3,R3,R7
      MUL R4,R3,R10

In this case we add R4 to R5 and put the result in R3. Then we add R3 to R7 and put the result in R3. Finally, we multiply R3 by R10 and put the result in R4. Two special instructions called load and store are used to move data between memory and a register. Indeed, these machines are called both RISC machines and load/store machines.
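The load/store idea can be sketched with a small Python interpreter. Everything here is an assumption for illustration: the tuple encoding, the LOAD and STORE mnemonics, and the memory contents do not correspond to real ARM syntax. The point is that arithmetic touches only registers, while memory is reached only through load and store.

```python
def run(program, regs, memory):
    """Execute a list of (op, *args) tuples on a register file and a memory."""
    for op, *args in program:
        if op == "ADD":
            dest, a, b = args
            regs[dest] = regs[a] + regs[b]       # register-to-register only
        elif op == "MUL":
            dest, a, b = args
            regs[dest] = regs[a] * regs[b]       # register-to-register only
        elif op == "LOAD":
            dest, address = args
            regs[dest] = memory[address]         # memory -> register
        elif op == "STORE":
            source, address = args
            memory[address] = regs[source]       # register -> memory
        else:
            raise ValueError(f"unknown operation {op!r}")


regs = [0] * 16            # R0 to R15
memory = [1, 2, 3, 4, 0]   # hypothetical data; address 4 receives the result

program = [
    ("LOAD", 4, 0),        # R4  = memory[0]
    ("LOAD", 5, 1),        # R5  = memory[1]
    ("LOAD", 7, 2),        # R7  = memory[2]
    ("LOAD", 10, 3),       # R10 = memory[3]
    ("ADD", 3, 4, 5),      # ADD R3,R4,R5
    ("ADD", 3, 3, 7),      # ADD R3,R3,R7
    ("MUL", 4, 3, 10),     # MUL R4,R3,R10
    ("STORE", 4, 4),       # memory[4] = R4
]

run(program, regs, memory)
print(regs[3], regs[4], memory[4])   # 6 24 24
```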

ADDRESSING MODES

From now on (to keep things simple) we are going to use a simple machine to describe computer architecture. We will use a 32-bit machine modeled on the Motorola 68K because it is so easy to understand. This course (BCS Certificate Technology) requires only a basic understanding of the way in which instructions are executed and a knowledge of assembly or low-level language.

We will assume a computer with eight registers called D0 to D7 that hold data (i.e., variables). There are also eight address registers A0 to A7 that hold addresses. Address registers are used as pointers.

Consider the two instructions

ADD P,D0
ADD #1234,D0

The first instruction adds the contents of memory location P to data register D0. The second instruction adds the literal (i.e., constant) 1234 to the contents of register D0. Note that somewhere in the program, the symbolic name P will be assigned the actual address of the location in memory.

Now consider the operation

ADD (A0),D0

This adds the contents of the memory location pointed at by address register A0 to data register D0. In this case we have used pointer-based (register indirect) addressing to access memory.

All computer addressing modes are variants on these three fundamental modes.
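The three fundamental modes can be compared side by side in a short Python sketch of the simple machine used here. The mode names as strings, the addresses 0x2000 and 0x3000, and the stored values are hypothetical choices for the example; only the behavior of each mode follows the descriptions above.

```python
MASK32 = 0xFFFFFFFF   # the simple machine is assumed to be 32 bits wide


def add(source, data_regs, addr_regs, memory, dest=0):
    """Add one source operand to data register `dest`.

    source is a (mode, value) pair:
      ("direct", address)    like ADD P,D0      operand is memory[address]
      ("immediate", number)  like ADD #1234,D0  operand is the number itself
      ("indirect", n)        like ADD (A0),D0   operand is memory[An]
    """
    mode, value = source
    if mode == "direct":
        operand = memory[value]
    elif mode == "immediate":
        operand = value
    elif mode == "indirect":
        operand = memory[addr_regs[value]]
    else:
        raise ValueError(f"unknown addressing mode {mode!r}")
    data_regs[dest] = (data_regs[dest] + operand) & MASK32


P = 0x2000                          # hypothetical address assigned to P
memory = {0x2000: 10, 0x3000: 100}  # hypothetical memory contents
data_regs = [0] * 8                 # D0 to D7
addr_regs = [0] * 8                 # A0 to A7
addr_regs[0] = 0x3000               # A0 points at location 0x3000

add(("direct", P), data_regs, addr_regs, memory)        # D0 = 0 + 10
add(("immediate", 1234), data_regs, addr_regs, memory)  # D0 = 10 + 1234
add(("indirect", 0), data_regs, addr_regs, memory)      # D0 = 1244 + 100

print(data_regs[0])   # 1344
```

Only the immediate mode avoids a data access to memory; the indirect mode lets the same instruction reach a different location each time A0 is changed.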

INSTRUCTION TYPES

Students taking this course are not expected to become expert assembly language programmers. You are expected to appreciate the basic instruction types executed by most processors. Instructions fall into three basic categories: data movement, data processing, and program flow control.

Data movement instructions transfer data between memory and registers or between registers. For example, the 68K has a single MOVE instruction that can perform:

      MOVE D0,D1     Copy the contents of register D0 to register D1
      MOVE #4,D1     Copy the literal value 4 into register D1
      MOVE P,D1      Copy the contents of memory location P to register D1
      MOVE D2,Q      Copy the contents of register D2 to memory location Q
      MOVE (A0),D4   Copy the contents of memory pointed at by register A0 to register D4
      MOVE D7,(A5)   Copy the contents of register D7 to the memory pointed at by register A5

Data processing instructions perform an operation on data. The operation may be arithmetic or logical. Consider

      ADD  D0,D1     Add the contents of register D0 to register D1 and put the result in D1
      SUB  #4,D1     Subtract the literal 4 from the contents of register D1 and put the result in D1
      AND  P,D1      Perform a logical AND operation between the contents of memory location P and register D1 and put the result in D1
      EOR  D4,Q      Perform an exclusive OR between the contents of register D4 and memory location Q and put the result in Q
      LSL  #3,D0     Move (shift) the bits in D0 three places left
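The effect of these data processing operations on fixed-width registers can be sketched in Python. The sketch assumes 32-bit operands, so results are masked to 32 bits; the function names are illustrative, and condition flags are not modeled.

```python
MASK32 = 0xFFFFFFFF   # assumed operand width: 32 bits


def add(dest, source):
    """ADD source,dest : dest + source, wrapping around at 32 bits."""
    return (dest + source) & MASK32


def sub(dest, source):
    """SUB source,dest : dest - source, wrapping around at 32 bits."""
    return (dest - source) & MASK32


def logical_and(dest, source):
    """AND source,dest : a result bit is 1 only where both operands have a 1."""
    return dest & source


def eor(dest, source):
    """EOR source,dest : a result bit is 1 where the operands differ."""
    return dest ^ source


def lsl(value, count):
    """LSL #count,value : shift left; bits moved past bit 31 are lost."""
    return (value << count) & MASK32


print(hex(add(0xFFFFFFFF, 1)))           # 0x0
print(hex(sub(3, 4)))                    # 0xffffffff
print(bin(logical_and(0b1100, 0b1010)))  # 0b1000
print(bin(eor(0b1100, 0b1010)))          # 0b110
print(lsl(1, 3))                         # 8
```

Shifting left by three places multiplies an unsigned value by 8, provided no significant bits are shifted out.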

Program flow control instructions affect the sequence in which instructions are executed. We will discuss this in more detail elsewhere. Such instructions are used to implement high-level language constructs such as IF...THEN...ELSE and REPEAT...UNTIL and to call functions and procedures. Consider

      BEQ  PQR       If the result of the last operation was zero then continue execution at the line labeled PQR
      BSR  ABC       Call the procedure at the line labeled ABC
      RTS            Return from the current procedure to the calling point