Today, most computer users know that information is represented in binary form by strings (i.e., sequences) of 1s and 0s like 01011110. Does it really matter that computers use the binary system to represent information? You can argue that it’s what the computer does with information that really matters, and not how the information is stored internally.
In some ways it really doesn’t matter how computers process data internally. The fact that the decimal number we call 25 is represented inside the computer by the binary sequence 00011001, or that the word “Alan” is represented by the binary sequence 01000001011011000110000101101110 is of supreme indifference to the computer user. On the other hand, these issues are important to the aspiring computer scientist, because the way in which computers represent and manipulate information determines the accuracy of mathematical calculations, the cost of storing a digitized photograph, or the speed of spelling checker. Therefore, we can’t entirely ignore what goes on inside the computer.
We examine how information is represented and manipulated inside a computer. We’re going to describe the various ways in which numbers can be represented by 1s and 0s and introduce binary arithmetic. However, we’re also going to look at some of the more interesting aspects of binary data; for example, we demonstrate how one or more bits in a string of 1s and 0s can be corrupted (i.e., changed) and how we can detect and correct these errors. Such errors arise when data is stored on disk or tape and a 1 accidentally gets converted into a 0 or vice versa. We also examine codes that long binary strings into shorter strings—a sort of digital shorthand!
If computers operated in decimal arithmetic just as we do, it would be a lot easier to teach how they work because we could employ the arithmetic used in our everyday lives. To build a decimal computer would require electronic circuits capable of storing and manipulating the ten decimal digits 0, 1, 2, 3, 4, 5 ,6, 7, 8, 9. In such a computer, the value 4 might be represented by a 0.4 volt signal and the value 6 by a 0.6 volt signal. Unfortunately, we can’t yet build very low-cost circuits that can reliably distinguish between ten different voltage levels. Because computers are constructed from tens of thousands of gates, a true decimal computer would be unimaginably expensive.
We can build very low-cost electronic circuits capable of distinguishing between the two voltage levels. These two voltages levels are called states and are given the values 1 and 0 (sometimes called true and false, respectively). If you design a circuit that behaves as a switch with the states off and on, it is very easy to manipulate the signals representing 1 and 0.
The basic unit of information in a computer is the bit (BInary digiT) that must have the value 0 or 1. Bit are said to be indivisible because they can’t be subdivided into smaller units of information. Computers don’t operate on a single bit at a time; they operate on groups of typically, 8, 16, 32, or 64 bits. In general, the more bits a computer can process at once, the faster it is. As computer technology has grown both more powerful and cheaper, the groups of bits processed by a computer have become larger. The first microprocessors in the 1970’s could deal with only four bits at a time. By the early 1990s 64-bit microcomputers were beginning to enter the personal computer market. The following terms are used to describe groups of bits of various lengths:
|
As you can see, the terminology used to describe groups of bits hasn’t been standardized; for example, some texts refer to a 16-bit value as a word, whereas others refer to a 32-bit value as a word. Even worst, the generic computer science term for any group of bits is a word. Consequently, word has both a specific and a general meaning, although its actual meaning is normally clear from the context. |
Numbers inside a computer are represented by strings of bits like 01101011. Before we explain how the decimal numbers used in everyday life are translated into bits, we need to make an observation—there are numbers, and there are numbers. There are whole or integer numbers like 1234 and 27; there are fractional numbers like ½ and 0.123; there are positive numbers like +24 and negative numbers like ‑46; and there are floating point (or scientific) numbers like 1.2345 x 1023. We will soon see that all these types of number can be converted into binary form.
We are going to begin by looking at unsigned integers. An integer is a whole number, and an unsigned integer is positive (i.e., it cannot be negative and must be zero or greater).
The decimal numbers we use in everyday life are represented in the positional notation; that is, the value or weighting of each digit in the number is determined by its position. The value of the 6 in 1261 is ten times as great as the 6 in 126. The decimal number 1261 is equal to 1 x 103 + 2 x 102 + 6 x 101 + 1 x 100 (note that the value of x0 is 1 for any value of x). In the positional notation, the value by which a digit is multiplied when it moves one place left in a number, is called the base.
We can now extend the positional notation to binary numbers. An m-bit binary integer in the base two can be written am-1am-2...a3a2a1a0 (where the ai is a bit that is either 0 or 1) has the value:
am-12m-1 + am-22m-2 + ... + ai2i + … + a323 + a222 + a121 + a020.
This expression looks more complex than it really is. Each of the ai terms represents a bit, and each of the 2i terms represents the contribution or weight made by that term to the number. Figure 1.1 illustrates the relationship between the bits and their weighting, and demonstrates how the 4-bit binary value 1101 is interpreted. Here the digits are a3=1, a2=1, a1=0, a0=1, and the weights are 23=8, 22=4, 21=2, 20=1. This number is therefore equal to 1 x 23 + 1 x 22 +0 x 21 + 1 x 20 = 8 + 4 + 0 + 1 = 13. Because we use decimal, binary and hexadecimal bases in this chapter, we employ the subscripts 10, 2, and 16, respectively to indicate the base of a number (e.g., 123410, 101010112, 12A316). When a number’s base of is obvious from the context in which it is used, the subscript is often omitted.
![]() |
The structure of binary numbers |
Positional notation can be used to handle any base. There’s nothing stopping you inventing a computer that uses base three arithmetic with the digits 0, 1, and 2; for example, the base 3 number 10213 has the decimal value 1 x 33 + 0 x 32 +2 x 31 + 1 x 30 = 27 + 0 + 6 + 1 = 34. Indeed, there’s nothing to stop you using a negative base if you really want to. If you decided to use the base -3, the number 1221-3 would be equal to 1 x (-3)3 + 2 x (-3)2 +2 x (-3)1 + 1 x (-30) = -27 + 18 - 6 + 1 = -14. You will be pleased to learn that the only base used by actual computers is the base 2.
Earlier we said that computers operate on groups of bits. How big should such a group be? We really need to answer the question, “How many different values can we represent in n bits?” The diagram below demonstrates how we can generate a sequence of binary values using one bit, two bits, and three bits.
![]() |
The binary tree Starting at the left-hand side with a single bit, we can move along one of two paths—up represents the bit in a 0 state and down represents the bit in a 1 state. That is, a single bit provides two possible states; 0 and 1. Adding a second bit provides four paths from the starting point to states 00, 01, 10, and 11. Adding a third bit, generates eight paths to states 000, 001, 010, 011, 100, 101, 110, and 111.
Each time a bit is appended to the number the total number of paths doubles. Four bits give 16 paths, 5 bits give 32 paths, and so on. An n-bit word provides 2n different paths or bit patterns. An 8-bit byte has 28 = 256 possible values, and a 16-bit word has 216 = 65,536 different values. The eight 3-bit codes are the binary codes for the integers 0, 1, 2, ..., 7, respectively.
The 3-bit binary sequence
|
The positional notation system of numbering we have just described is identical to the system we use in everyday decimal arithmetic. The only difference between binary and decimal numbers is that the weightings in the binary system are 1, 2, 4, 8, 16, 32, 64,... and the binary digits are {0, 1}, whereas the weightings in the decimal system are 1, 10, 100, 1000, 10000,... and the decimal digits are {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. The rules of arithmetic in both binary and decimal arithmetic are identical. The binary numbers we describe in this section are also commonly known as natural 8421-weighted binary integers.
Let’s look at a few examples of the conversion of numbers between the decimal and the binary bases. The 8-bit binary value 101011002 can be converted into decimal form by adding up the powers of two: 27 + 25 + 23 + 22 = 17210. Similarly, the decimal value 10010 can be converted into binary form by decomposing it into powers of two: 10010 = 64 + 32 + 4 = 11001002. Two algorithms for converting binary values to decimal form and vice versa are described in the following two panels.
|
Decimal to Binary Conversion
Take the decimal number and divide it by two. Record the remainder which is either one or zero. Take the result of the division (the quotient) and divide it by two, and record the remainder. Continue until the result is zero, and the remainder one. The remainders represent the binary number (the first remainder is the most-significant bit). Consider the conversion of 1910 to binary form.
Step 1 19 ¸ 2 = 9 R = 1 Step 2 9 ¸ 2 = 4 R = 1 Step 3 4 ¸ 2 = 2 R = 0 Step 4 2 ¸ 2 = 1 R = 0 Step 5 1 ¸ 2 = 0 R = 1
The binary value of 19 is therefore 100112. |
|
Binary to Decimal Conversion
Take the left-most bit, double it, and add it to the next bit on its right. Take the result, double it and add it to the next bit on its right. Continue until the least-significant bit is reached. Consider the conversion of 101112 to decimal form.
Step 1 1 x 2 = 2 (start at left hand end) Step 2 2 + 0 = 2 (add in the next bit) Step 3 2 x 2 = 4 (double the total) Step 4 4 + 1 = 5 (add in the next bit) Step 5 5 x 2 = 10 (double the total) Step 6 10 + 1 = 11 (add in the next bit) Step 7 11 x 2 = 22 (double the total) Step 8 22 + 1 = 23 (end - lsb reached) |
![]() |
This demonstrates how all the eight possible values of three bits can be plotted at the vertices of a cube. The most-significant bit of the 3-bit word selects the top or the bottom face of the cube bottom; the middle bit selects the front or the back; and the least-significant bit selects the left or the right. The idea of plotting a sequence of bits as a point in space will come in useful later. |