Computer Memory

Computer and Human Memory

Computer scientists encounter the notion of human memory long before computer memory (except those who go straight from diapers to keyboards without actually learning to speak...).

Human Memory

Human memory operates in a very different way from almost all computer memories. This figure illustrates how the human memory might respond to the input “gate”. When the input “gate” is received by the brain, it is distributed in parallel to multiple regions of memory and recognized at several locations. The single input “gate” might produce the multiple responses “farm gate”, “logical gate”, “door”, and “Watergate”. The actual response selected by a human for further processing depends on the context of the input: you’d make a very different choice if you were talking about circuit diagrams than if you were talking about politics.

The input to the human memory is not the address of the data you are attempting to retrieve; it’s a key that is matched against many regions of memory simultaneously. Indeed, we use a special phrase to describe this matching process: we say “Does it ring a bell?” A memory whose storage locations are accessed in parallel by means of a key is called a content addressable memory or an associative memory. Electronic content addressable memories do exist, but they are very expensive, store only small quantities of information, and are found only in very special applications.

 

Organization of computer memory

This illustrates the essential features of computer memory, which behaves in a very different way to human memory (i.e., associative or content addressable memory). We can regard computer memory as a black box with three ports, called an address port, a data port, and a control port (a port is the point at which information enters or leaves a system). The address port is used to tell the memory which location is to be accessed. During a read cycle, information from the memory is sent to the computer via its data port, and in a write cycle information is loaded into the memory via its data port. The external system (i.e., the computer) uses the control port to tell the memory whether to carry out a read cycle or a write cycle.

We called this memory a black box because we don’t care about its internal organization or how it works; we’re interested only in what it does. The address indicates which of the locations (or memory cells) is to be accessed; for example, a memory with 2^16 = 65,536 locations has a 16-bit address that accesses location 0 (address 0000000000000000) to location 65,535 (address 1111111111111111). Each memory cell contains an n-bit binary value, where n is the number of bits in the words processed by the CPU.
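The three ports can be modeled in a few lines. The following is a minimal sketch, not a description of any real device: the class name Memory, the method name cycle, the control values "read" and "write", and the 8-bit word size are all assumptions made for illustration.

```python
class Memory:
    """Black-box memory with an address port, a data port, and a control port."""

    def __init__(self, address_bits=16, word_bits=8):
        self.word_bits = word_bits                # assumed word size (n bits)
        self.cells = [0] * (2 ** address_bits)    # one n-bit word per location

    def cycle(self, address, control, data=None):
        """Run one memory cycle; control selects a read or a write."""
        if not 0 <= address < len(self.cells):
            raise ValueError("address out of range")
        if control == "read":
            return self.cells[address]            # data leaves via the data port
        if control == "write":
            if data is None or not 0 <= data < 2 ** self.word_bits:
                raise ValueError("data does not fit in a word")
            self.cells[address] = data            # data enters via the data port
            return None
        raise ValueError("control must be 'read' or 'write'")


memory = Memory(address_bits=16, word_bits=8)
memory.cycle(0b1111111111111111, "write", 0x5A)   # highest location, 65,535
value = memory.cycle(65535, "read")
```

The caller never sees how the cells are stored, only the effect of each cycle, which is the sense in which the memory is a black box.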

Once again, consider the difference between the computer memory and the associative memory. A location within the computer memory must be accessed explicitly by providing the memory with an address; that is, you ask the memory what is stored in location x. You don’t provide an associative memory with an address (indeed, the very concept of an address is meaningless). Suppose you want to find whether an associative memory contains red or blue gizmos. You apply the key “gizmo” to the memory and examine the response (i.e., the memory indicates all elements that have a match with gizmo). If you have to search a conventional computer memory for a data item, you have to search the memory element by element until you find the data you require.
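The contrast between the two kinds of access can be sketched in software. In this illustrative example (the stored items and the function names are hypothetical), a conventional memory is a list indexed by address, a search of it visits one location at a time, and the associative lookup stands in for hardware that would compare every location with the key in parallel.

```python
# A conventional memory: each item lives at a numbered location.
conventional = ["red gizmo", "green widget", "blue gizmo", "red gadget"]


def read(memory, address):
    """Addressed access: ask what is stored in location x."""
    return memory[address]


def linear_search(memory, key):
    """Search a conventional memory element by element for a key."""
    matching_addresses = []
    for address, item in enumerate(memory):
        if key in item:
            matching_addresses.append(address)
    return matching_addresses


def associative_match(memory, key):
    """Model of an associative memory: return every element that matches.

    Real hardware compares all locations at once; this comprehension
    only imitates the result, not the parallelism.
    """
    return [item for item in memory if key in item]


addresses = linear_search(conventional, "gizmo")
responses = associative_match(conventional, "gizmo")
```

Note that the associative lookup returns the matching contents themselves and never mentions an address.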

 

Memory Terminology

There are many types of storage mechanism, each with its own characteristics. We begin by listing some of the fundamental parameters of memory systems.

Memory cell A memory cell is the smallest unit of information storage and holds a single 0 or 1. Memory cells are often grouped together to form words. The location of each cell in the memory is specified by its address, which is called a physical address to distinguish it from the logical address of an operand generated by the computer.

Capacity A memory's capacity is expressed as the number of bits or bytes that it can hold. Semiconductor devices are normally specified in terms of bits (e.g., a 256 Mbit DRAM chip), whereas CDs and disks are specified in terms of bytes (e.g., a 600 Mbyte CD or a 4 Gbyte hard disk). Some manufacturers use the convention that 1K = 1,000 and 1M = 1,000,000; we will use the normal convention that 1K = 2^10 = 1,024 and 1M = 2^20 = 1,048,576.
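The arithmetic behind these units is easy to check. The short sketch below uses the power-of-two convention adopted here; the constant names are chosen for illustration only.

```python
K = 2 ** 10          # 1,024
M = 2 ** 20          # 1,048,576
G = 2 ** 30

BITS_PER_BYTE = 8

# A 256 Mbit DRAM chip expressed in bytes.
dram_bits = 256 * M
dram_bytes = dram_bits // BITS_PER_BYTE

# How far the decimal convention (1M = 1,000,000) falls short of 2^20.
shortfall = 1 - 1_000_000 / M
```

A 256 Mbit chip therefore holds 32 Mbytes, and a "megabyte" counted as 1,000,000 bytes is a little under 5% smaller than one counted as 2^20 bytes.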

Density The density of a memory system is a measure of how much data can be stored per unit area or per unit volume; that is, density = capacity/size.

Access time A memory component's most important parameter is its access time, which is the time taken to read data from a given memory location, measured from the start of a read cycle. Access time is made up of two parts: the time taken to locate the required memory cell within the memory array, and the time taken for the data to become available from the memory cell. Strictly speaking, we should refer to read cycle access time and write cycle access time. Since many semiconductor memories have almost identical read and write access times, we regard access time as the read or write access time. This is not true of all forms of memory, because some devices have quite different read and write access times. Some memories are also specified in terms of cycle time, which is the time that must elapse between two successive read or write accesses. Access time and cycle time are often identical. However, this is not true for semiconductor dynamic memories and flash EPROMs.

Random access When memory is organized so that the access time of any cell within it is constant and is independent of the actual location of the cell, the memory is said to be random access memory (RAM). That is, the access time of random access memory doesn’t depend on where the data being accessed is located. This means that the CPU does not have to worry about the time taken to read a word from memory because all read cycles have the same duration.

If a memory is random access for the purpose of read cycles, it is invariably random access for the purpose of write cycles. It is unfortunate that the term RAM is often employed to describe read/write memory where data may be read from the memory or written into it (as opposed to read-only memory). This usage is incorrect, because the term random access indicates only the property of constant access time and has nothing to do with the memory's ability to modify (i.e., write) its data. Another term for random access is immediate access. The dialed telephone system is a good example of random access memory in everyday life. The time taken to connect with (access) any subscriber is constant and independent of their physical location.

Serial access In a serial access memory, the time taken to access data is dependent on the physical location of the data within the memory and can vary over a wide range for any given system. Examples of serial access memories are magnetic tape transports, disk drives, CD drives, shift registers, and magnetic bubble memories. Serial access is also referred to as sequential access. It’s easy to see why serial access memories have variable access times. If data is written on a magnetic tape, the time taken to read the data is the time taken for the piece of tape containing the data to move to the read head. This data might be 1 in or 2400 ft from the beginning of the tape.
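The spread of access times on a tape follows directly from the distance the tape must travel. The sketch below is only a rough model: the tape speed of 100 inches per second is an assumed figure chosen for illustration, and the time needed to start and stop the tape is ignored.

```python
INCHES_PER_FOOT = 12
ASSUMED_TAPE_SPEED = 100.0   # inches per second; an illustrative assumption


def tape_access_time(position_inches, tape_speed_inches_per_s):
    """Time in seconds for data at the given position to reach the read head."""
    return position_inches / tape_speed_inches_per_s


near = tape_access_time(1, ASSUMED_TAPE_SPEED)                       # 1 in from the start
far = tape_access_time(2400 * INCHES_PER_FOOT, ASSUMED_TAPE_SPEED)   # 2400 ft from the start
```

Under this assumption the two accesses differ by a factor of 28,800, which is the ratio of the two distances.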

Bandwidth The bandwidth of a memory system indicates the speed at which data can be transferred between the memory and the host computer and is measured in bytes/second. Bandwidth is determined by the access time of the memory, the type of data path between the memory and the CPU, and the interface between the memory and CPU. For example, a hard disk might have a bandwidth of 40 Mbyte/s; that is, 40 Mbytes can be transferred between the disk and CPU in a second.

Latency Bandwidth indicates how fast you can transfer data once you have the data to transfer. Latency refers to the delay between beginning a memory access and the start of the transfer. When speaking of disk drives, latency refers to the time taken for the disk to rotate until the desired data is under the read/write head. When speaking of buses, latency refers to the time taken to get control of a bus before a data transfer can take place.
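Latency and bandwidth combine to give the total time for a transfer. As a simplified sketch, the time can be taken as the latency plus the size divided by the bandwidth. The 40 Mbyte/s figure is the example above; the 8 ms latency and the function name transfer_time are assumptions made for illustration.

```python
MBYTE = 2 ** 20


def transfer_time(size_bytes, bandwidth_bytes_per_s, latency_s):
    """Total time in seconds: wait for the data, then move it."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


BANDWIDTH = 40 * MBYTE   # 40 Mbyte/s
LATENCY = 0.008          # assumed 8 ms delay before the transfer starts

small = transfer_time(512, BANDWIDTH, LATENCY)          # one 512-byte block
large = transfer_time(40 * MBYTE, BANDWIDTH, LATENCY)   # a 40 Mbyte file
```

For the small block almost all of the time is latency, whereas for the large file the latency is negligible and the bandwidth dominates.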

Volatile memory Volatile memory loses its stored data when the source of power is removed. Most semiconductor memories in which data is stored as a charge on a capacitor or as the state of a transistor (on or off) in a bistable circuit are volatile. Some semiconductor devices such as EPROM and flash memory are non-volatile and retain data when the power is off. Memories based on magnetism are generally non-volatile because their magnetic state doesn't depend on a continuous supply of power.

Read-only memory The contents of a read-only memory (ROM) can be read but not modified (under normal operating conditions). True read-only memories are, by definition, non-volatile. Read-only memory is frequently used to hold operating systems and other system software in small microprocessor systems (e.g., palm-top personal organizers).

Static memory Once data has been written into a static memory cell, the data remains there until it is either altered by over-writing it with new data, or by removing the source of power if the memory is volatile.

Dynamic memory Dynamic memories (DRAM) store data in the form of an electrical charge on the inter-electrode capacitance of a field effect transistor. Because this capacitor is not perfect, the charge gradually leaks away, discharging the capacitor and losing the data. Dynamic memories require additional circuits to restore the charge on the capacitors periodically (every 2 to 16 ms) in an operation known as memory refreshing. DRAMs are much cheaper than static memories of the same capacity.

 

Memory Hierarchy

The main store that holds the programs in a typical computer ranges from 64 to 4096 Mbytes and has an access time of about 50 ns. Even this memory is sometimes partitioned into a high-speed cache memory and a slower main store. The program that starts the computer running is called the BIOS and is stored in semiconductor read-only memory. Finally, the PC may have one or more hard disks, a CD-ROM drive and a tape transport. If all these devices store data, why do we need so many of them? The ideal memory has the following characteristics.

High speed A memory’s access time should be very low, preferably 1 ns, or less.

Small size Memory should be physically small. One hundred thousand gigabytes per cubic centimeter would be nice.

Low power consumption The entire memory system should run off a watch battery for 100 years.

Highly robust The memory should not be prone to errors; a logical one should never spontaneously turn into a logical zero or vice versa. It should also be able to work at temperatures of -60°C or at 200°C. (The military are very keen on this aspect of systems design.)

Low cost The memory should cost nothing and should ideally be given away free with software.

Memory hierarchy

This illustrates the memory hierarchy found in many computers. Memory devices at the top of the hierarchy are expensive, fast, and have small capacities. Devices at the bottom of the hierarchy are cheap and store vast amounts of data, but are abysmally slow. This diagram isn't exact because, for example, the CD ROM has a capacity of 600 Mbytes and (from the standpoint of capacity) should appear above hard disks in this figure.

 

At the tip of the memory hierarchy we have the internal CPU memory. Registers in CPUs have very low access times and are built with the same technology as the CPU itself. They are very expensive (in terms of the silicon resources they take up) limiting the number of internal registers and scratchpad memory within the CPU itself. This is especially true when the CPU is fabricated on a silicon chip, although the number of registers that can be included on a chip has increased dramatically in recent years.

Immediate access store (also called RAM or main store) holds programs and data during their execution and is relatively fast (10 ns to 70 ns). Main store is invariably implemented as semiconductor static or dynamic memory. Up to the 1970s ferrite core stores and plated wire memories were found in main stores. Random access magnetic memory systems are now all but obsolete because they are slow, costly, consume relatively high power, and are physically bulky. There are two types of random access memory: cache and main store. Cache memory holds copies of frequently used data.

The magnetic disk stores large quantities of data in a small space and has a very low cost per bit. Unfortunately, accessing data on a particular track is a serial process and a disk's access time, although fast in human terms, is orders of magnitude slower than immediate access store. A typical disk drive can store 200 Gbytes (i.e., roughly 2^38 bytes) and has an access time of 8 ms. In the late 1990s an explosive growth in disk technology took place and low-cost hard disks became available with greater storage capacities than CD-ROMs and tape systems.

The CD-ROM was developed by the music industry to store sound on thin plastic disks called CDs (compact disks). Unlike hard disks, CD-ROMs use interchangeable media. CD-ROM technology uses a laser beam to read tiny dots embedded on a layer inside the disk. CDs are very inexpensive and store up to about 600 Mbytes but have longer access times than conventional hard disks. In general, the CD-ROM is used to distribute software. Writable CD drives and media are more expensive and are used to back up data or to distribute data. When we cover magnetic media and CDs in detail, we will find that magnetic media require the read/write head to be very close to the media's surface, whereas the CD uses optical technology and the read head doesn't come into contact with the surface. This makes the CD a good interchangeable medium. The DVD is a second-generation CD ROM that can store over 9 Gbytes.

Magnetic tape is an exceedingly cheap serial access medium and can store several gigabytes on a tape costing a few dollars. The average access time of tape drives is very long in comparison with other storage technologies and, therefore, it is largely used for archival purposes. Writable CDs have now replaced tapes in many applications.

By combining all these types of memory in a single computer system, the computer engineer can get the best of all worlds. You can construct a relatively low-cost memory system with a performance only a few percent lower than that of a memory constructed entirely from expensive high-speed RAM. The key to computer memory design is having the right data in the right place at the right time. A large computer system may have thousands of programs and millions of data files. Fortunately, the CPU requires few programs and files at any one time. By designing an operating system that moves data from disk into the main store so that the CPU always (or nearly always) finds the data it wants in the main store, the system has the speed of a giant high-speed store at a tiny fraction of the cost. Such an arrangement is called a virtual memory because the memory appears to the user as, say, a 4 Gbyte main store, when in reality there may be a real main memory of only 512 Mbytes and 200 Gbytes of disk storage. We examine virtual memory systems in the next chapter.

Classes of memory

 

This figure summarizes the various types of memory currently available according to type (this list is not complete).

 

Memory types inside a computer

A computer’s main memory or immediate access store holds the program being executed and its data. This memory is, of course, random access memory and is invariably composed of semiconductor integrated circuits (ICs) made by the same technology used to fabricate the microprocessor itself. Two distinct types of memory component are used to fabricate the main store. One is called static RAM and the other dynamic RAM (i.e., DRAM). From the user’s point of view there is little difference between static RAM and DRAM, apart from the cost: DRAM is much cheaper than static RAM. Static RAM is somewhat more reliable than DRAM and consumes less power. You find static RAM in systems with small memory requirements or in systems that must minimize the power consumption. Static RAM can operate in a power-down mode in which its data is maintained by a tiny current from a small battery.

The difference between static and dynamic RAM lies in the way in which the devices are constructed. Static RAM uses logic elements called flip-flops to hold data; these are described in chapter 4. DRAM stores data in the form of an electrical charge. Unfortunately, this charge gradually leaks away and the DRAM must be periodically updated or refreshed to retain its data. The need to perform a refresh operation every 2 ms or so complicates the design of DRAM systems.

 

 

Cache Memory

A high-performance personal computer stores programs and data in its main memory, which is composed of, typically, 8 to 64 Mbytes of DRAM with an access time of about 50 ns. Because the CPU can complete a memory cycle in less than 50 ns, the processor is often forced to waste time waiting for the main store to provide it with data. The overall performance can be improved by using a faster main store. Unfortunately, it’s not cost-effective to build personal computers with large, very fast memories. Cache memory technology provides a means of increasing the effective speed of the main store without using faster memory; that is, the main store can be made to appear faster than it really is.

Cache memory is very fast random access memory that stores some of the data that the processor uses most frequently. A typical computer might have between 8 Kbytes and 512 Kbytes of cache memory with an access time of 15 ns. Cache memory is located either inside the microprocessor itself (internal cache) or on the same circuit board as the microprocessor (external cache). Most PCs employ a mixture of internal and external cache memories.

Cache memory and main store

When the CPU accesses memory, it accesses both the cache memory and the main store. If the data is in the cache, it is supplied from the cache memory; otherwise it is supplied from the main store.

Clearly, if part of a program is located in cache memory, it can be executed much faster than if it were in main memory. However, since most computers have cache memories that are only a fraction of the size of the main memory, you might be tempted to think that adding cache memory to a computer to speed it up is a waste of time. It isn’t. When a computer executes a program, parts of the code are executed over and over again, and some of the data is accessed very frequently. Similarly, other parts of the code and data are accessed infrequently during the execution of the program. A cache memory is cleverly designed to hold frequently used instructions and data. Consequently, whenever the CPU accesses its memory, there is a very high probability that the data will be in the cache rather than in the slower main memory. In many systems, over 95% of all accesses to memory retrieve data from the cache. The fraction of memory accesses that access data in the cache is called the hit ratio.
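The effect of the hit ratio on speed can be estimated with a weighted average. The sketch below is a simplified model rather than a description of any particular machine: it assumes the cache and the main store are accessed in parallel, so that a miss costs one main store access, and it uses a 15 ns cache, a 50 ns main store, and a 95% hit ratio as illustrative figures. The function name average_access_time is hypothetical.

```python
def average_access_time(hit_ratio, cache_ns, main_ns):
    """Weighted average of the cache and main store access times."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit ratio must lie between 0 and 1")
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * main_ns


average_ns = average_access_time(0.95, 15.0, 50.0)
speedup = 50.0 / average_ns   # compared with using the main store alone
```

With these assumed figures the average access time is about 16.75 ns, so the memory system behaves almost as if it were built entirely from cache, even though the cache is only a small fraction of the total.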

If you’re not convinced that cache memory is so efficient, consider the humble notebook. You may have a notebook with the phone numbers of fewer than 100 of your friends and colleagues. These phone numbers represent only a very tiny fraction of all the phone numbers in the world, and yet probably over 90% of the calls you make are to numbers in your notebook. Cache memory behaves exactly like the notebook.

The principle governing cache memory is remarkably simple—keep frequently used data in high speed memory—but the fine details of a cache system are anything but simple. A cache memory system has to identify frequently used data and keep a copy of it in the cache. Whenever the processor makes a read access, the cache system has to decide whether there is a current copy of the data in the cache. If the data is in the cache, it is read from the cache; if the data isn’t in the cache it must be brought from the main store and then copied into the cache. Moreover, if the cache is full, it might be necessary to throw out some old data and replace it by the new data. During a write cycle, you have to decide whether to write the data just to the cache or to the main memory as well. If you don’t write data to the main memory at the same time you put it in the cache, you have to ensure that the main memory is correctly updated at some point in the future.
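The two write strategies just described can be sketched as follows. This is a deliberately minimal model under stated assumptions: the cache is unbounded, so no old data is ever thrown out, and the class name TinyCache, the write_through flag, and the flush method are hypothetical names chosen for illustration.

```python
class TinyCache:
    """Toy cache in front of a main store (a list), with two write policies."""

    def __init__(self, main_store, write_through=True):
        self.main = main_store
        self.copies = {}          # address -> cached copy of the data
        self.dirty = set()        # addresses newer in the cache than in main
        self.write_through = write_through

    def read(self, address):
        if address in self.copies:          # hit: supply data from the cache
            return self.copies[address]
        value = self.main[address]          # miss: fetch from the main store
        self.copies[address] = value        # and keep a copy in the cache
        return value

    def write(self, address, value):
        self.copies[address] = value
        if self.write_through:
            self.main[address] = value      # update main memory immediately
        else:
            self.dirty.add(address)         # update main memory later

    def flush(self):
        """Bring the main store up to date with every deferred write."""
        for address in self.dirty:
            self.main[address] = self.copies[address]
        self.dirty.clear()


main_a = [0] * 8
through = TinyCache(main_a, write_through=True)
through.write(3, 7)

main_b = [0] * 8
deferred = TinyCache(main_b, write_through=False)
deferred.write(3, 7)
stale_value = main_b[3]       # main store not yet updated
deferred.flush()
```

Writing to both places at once keeps the main store current at the cost of a slow access on every write; deferring the update is faster but obliges the system to remember which data still has to be written back.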

Organization of direct-mapped cache memory

Because the organization of a real cache memory is quite complex, we will use a simple example. This figure illustrates the direct-mapped cache that is found, in a modified form, in many computers.

The highly-simplified 32-word memory system shows how the main store is partitioned into three units: the set, the line and the word. The smallest unit of data accessed by the CPU is the word. Both the cache and the main memories are divided into units called lines, and each line consists of two words. When data is transferred between the cache and the main memory, the line is the smallest quantity of data that can be moved. The main memory is divided into sets, each of which contains as many lines as the cache memory.

Suppose the CPU accesses memory location 11 10 1 (i.e., set 3, line 2, word 1). Line 2 is accessed in the cache memory. Line 2 in the cache corresponds to one of the four sets of lines in the main store, but which one? In this example line 2 in the cache has a tag with the value “3”. This tag indicates which set the line belongs to. In this case, it is 3, which matches the set specified in the CPU’s address, and the CPU can go ahead and read word 1 from line 2 in the cache.

Now suppose that the CPU addresses set 2, line 1, word 1 with the address 10 01 1. The CPU reads line 1 from the cache and gets the tag 3, indicating that this line belongs to set 3 in the main store. Because the CPU is accessing line 1 in set 2, it has to get the data from the main store, because the current line 1 in the cache contains data from a different set. The system would normally update the cache by writing the data from set 2 into line 1.
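The lookup described in this example can be expressed as a short program. The sketch below models the 32-word system with 5-bit addresses laid out as two set bits, two line bits, and one word bit; the class and function names, and the contents of the main store, are hypothetical and chosen only for illustration.

```python
def split_address(address):
    """Split a 5-bit address into (set, line, word) fields."""
    word = address & 0b1
    line = (address >> 1) & 0b11
    set_number = (address >> 3) & 0b11
    return set_number, line, word


class DirectMappedCache:
    """Four-line cache, two words per line, in front of a 32-word main store."""

    def __init__(self, main_store):
        self.main = main_store
        self.tags = [None] * 4                        # set held by each line
        self.data = [[None, None] for _ in range(4)]  # two words per line

    def read(self, address):
        """Return (word, hit) for the given address."""
        set_number, line, word = split_address(address)
        hit = self.tags[line] == set_number           # compare tag with set
        if not hit:
            base = address & ~0b1                     # word 0 of the same line
            self.data[line] = [self.main[base], self.main[base + 1]]
            self.tags[line] = set_number              # line now belongs to this set
        return self.data[line][word], hit


main_store = list(range(100, 132))       # word at address a holds 100 + a
cache = DirectMappedCache(main_store)

first = cache.read(0b11101)    # set 3, line 2, word 1: the cache starts empty
second = cache.read(0b11101)   # same address again: tag 3 now matches
cache.read(0b11010)            # set 3, line 1: gives line 1 the tag 3
third = cache.read(0b10011)    # set 2, line 1, word 1: tag 3 does not match
```

After the last access the tag of line 1 has changed from 3 to 2, which corresponds to the cache update described above.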

 

Read-only Memory

Most semiconductor random access memory is called read/write memory because the CPU can read its contents, or write to it and modify its contents. This memory is also called volatile memory because its contents are lost when you switch off the power. Read-only memory or ROM is a type of memory that can be read from but not modified. By the way, semiconductor read-only memories are also random access devices. Unfortunately, the term RAM is often used to mean read/write random access memory, even though most ROM is also RAM!

You might well think, “If ROM can’t be written to, how do you put data in it in the first place?” There are several types of ROM: some are programmed during their manufacturing phase; some can be programmed by taking them out of a circuit and putting them in a special programming device; and some can be programmed electronically without removing them from the computer. Before you exclaim that electrically programmable ROMs can’t be true ROMs, we would point out that modifying the contents of such a device takes a relatively long time and an electronically programmable ROM is unsuited to normal data storage and retrieval. By the way, writing data into a ROM is called programming the ROM (this meaning of programming has nothing to do with programs).

ROMs are used to store software that must be in the system when it is first switched on. For example, most computers have a bootstrap ROM containing a program designed to load the operating system from the disk. A special type of read-only memory is called flash EEPROM (electrically erasable and programmable ROM). Although this type of ROM can be programmed electronically, it is often called read-mostly memory because it is rarely modified. A computer might employ flash EEPROM to hold system constants that are changed infrequently (e.g., the format of the display, the type of the keyboard, the configuration of the memory). Because flash EEPROM is physically small (but fairly expensive), some of the smallest subnotebook computers use it to hold their operating system and applications programs.

The memories we’ve described so far are all random access memories. The remaining memory systems in the computer are all serial access devices. A typical high-performance modern personal computer might include four types of serial access memory: a floppy disk drive, one or more hard disk drives, a CD ROM drive, and a tape cartridge system. All these devices can store large amounts of data. The difference between them depends on their characteristics: cost, technology, and speed (access time). Serial access storage devices are so much slower than random access devices that programs are never executed directly from a disk or a tape. The program is first copied into high speed random access memory and then executed.