Processor Comparison Essay, Research Paper
1. Investigate the instruction set and architectural features of a modern RISC processor such as the Digital Equipment Corporation Alpha or Motorola/IBM PowerPC. In what ways does it differ from the architecture of the Intel Pentium processor family?
The main difference between the architectures of Digital Equipment Corporation’s (DEC) Alpha and Intel’s Pentium processors is the instruction set. In this paper I intend to define both RISC and CISC processors. In doing so I will compare DEC’s Alpha 21164 (a microprocessor that implements the Alpha architecture) with Intel’s Pentium processors (from the original Pentium through the Pentium II).
Reduced Instruction Set Computing, or RISC, is a CPU architecture whose instruction set eliminates some (but not all) complex instructions, paring the rest down in complexity so that most instructions can be performed in a single processor cycle. This is accomplished through high-level compilers that break down the more complex, less frequently used operations into sequences of simpler instructions. As a result, a RISC architecture can implement a smaller instruction set, make use of more registers, and eliminate the need for microcode.
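As a rough illustration of the load/store idea, the sketch below models a toy register machine in Python. The instruction names (LOAD, ADD, STORE), the register names, and the memory layout are assumptions made for this example; they are not Alpha or Pentium mnemonics. It shows how one memory-to-memory addition, the kind of work a single complex instruction might do, can be expressed as four simple steps.

```python
# Toy load/store machine: hypothetical instruction names, not real Alpha code.
def run(program, memory):
    """Execute a list of simple instructions and return the final memory."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":        # LOAD reg, addr : register <- memory
            reg, addr = args
            regs[reg] = memory[addr]
        elif op == "ADD":       # ADD dst, a, b : operates on registers only
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "STORE":     # STORE reg, addr : memory <- register
            reg, addr = args
            memory[addr] = regs[reg]
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory

# memory[2] = memory[0] + memory[1], written as four simple instructions
risc_program = [
    ("LOAD", "r1", 0),
    ("LOAD", "r2", 1),
    ("ADD", "r3", "r1", "r2"),
    ("STORE", "r3", 2),
]

result = run(risc_program, [5, 7, 0])
print(result)
```

Only LOAD and STORE touch memory here; the arithmetic step sees registers alone, which is the property the term "load and store architecture" refers to.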
“The Alpha architecture is a 64-bit load and store RISC architecture designed with particular emphasis on speed, multiple instruction issue, multiple processors, and software migration from many operating systems.” (1, pg. 1-1) Most recent CPU designs are superscalar and superpipelined. Superscalar means that the architecture provides two or more pipelines for executing multiple instructions in parallel. Superpipelining increases the number of pipeline stages so that each stage does less work and the clock can run faster; results from either pipeline can be used as soon as they are available, which avoids many of the stalls caused by data dependencies. “The 21164 microprocessor is a superscalar pipelined processor manufactured using 0.5-micron CMOS (Complementary Metal Oxide Semiconductor) technology.” (1, pg. 1-3) The Alpha 21164 can issue four instructions in a single clock cycle. This, combined with the low-latency and/or high-throughput features in the instruction issue unit and the on-chip components of the memory subsystem, reduces the average cycles per instruction. All data manipulation is done between registers. The registers are 64 bits in length and all instructions are 32 bits in length. Memory operations are either load or store operations.
Because many early computers had extremely limited memory and processing power, complex instruction sets were developed. Complex Instruction Set Computing, or CISC, is a CPU architecture in which a large number of instructions are hard-coded into the chip. Intel’s Pentium processors still adhere to this philosophy.
The Pentium processor was Intel’s first CPU to employ a superscalar architecture. With its 3.3 million transistors it is able to execute two instructions per clock cycle, resulting in twice the integer performance of an Intel 486 CPU running at the same frequency. The Pentium also included on-chip dual-processing support as well as an onboard interrupt controller.
Next came the Pentium Pro, which introduced dynamic execution technology that predicts the program flow through multiple branches. Multiple branch prediction lets the CPU prefetch the likely next instructions rather than wait for the outcome of each branch. This technology can also change the order in which instructions are executed, based on an analysis of their data dependencies, which improves execution speed. However, the Pentium Pro was only available in speeds from 150MHz to 200MHz and had only 16KB of internal cache (half as much as the Pentium MMX).
In 1997 Intel introduced the Pentium MMX processor. The MMX processor added 1.2 million more transistors (4.5 million total) as well as SIMD (Single Instruction, Multiple Data) technology. The SIMD extensions included 57 new instructions, 4 new data types, and eight 64-bit registers.
As in the original Pentium, the MMX Pentium provides both a fixed-point integer data path that allows up to two operations to be executed simultaneously, and a floating point data path that allows one operation to be performed at a time. In addition, the MMX Pentium provides a new MMX data path that allows up to two MMX operations to execute simultaneously, or up to one MMX operation and one integer operation (in the integer data path) to execute simultaneously. The integer data path includes two ALUs and supports operations on 8-, 16-, and 32-bit integers. (4)
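To make the SIMD idea concrete, here is a minimal Python sketch of a packed byte addition on a 64-bit value, modeled on the wraparound behavior of the MMX PADDB instruction. The function name packed_add_bytes and the sample operands are assumptions for illustration; real MMX code would run in hardware registers rather than in a loop.

```python
def packed_add_bytes(a, b):
    """Add eight 8-bit lanes of two 64-bit values, wrapping each lane at 256."""
    result = 0
    for lane in range(8):
        shift = lane * 8
        x = (a >> shift) & 0xFF
        y = (b >> shift) & 0xFF
        # Mask the lane sum so a carry never spills into the neighboring lane.
        result |= ((x + y) & 0xFF) << shift
    return result

a = 0x01020304050607FF
b = 0x0101010101010101

packed = packed_add_bytes(a, b)
plain = (a + b) & 0xFFFFFFFFFFFFFFFF   # ordinary 64-bit add, for comparison

print(hex(packed))
print(hex(plain))
```

The lowest lane (0xFF + 0x01) wraps to 0x00 in the packed result, whereas the ordinary 64-bit addition carries into the next byte. That independence between lanes is what lets one instruction process eight small values at once.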
The MMX processor is available in speeds from 166MHz to 333MHz.
Finally, the Pentium II processor combines the best features of both the Pentium Pro and the Pentium MMX on one chip. It includes a 64-bit dual independent bus (a system bus and a cache bus), an arrangement first realized on the Pentium Pro. The pipelined system bus enables multiple simultaneous transactions, which accelerates the flow of information within the system and boosts overall performance. Another feature stemming from the Pentium Pro is Dynamic Execution Technology (changing the order of executed instructions based on data dependencies). The Pentium II has 7.5 million transistors and can address up to 64GB of RAM. “The independent cache bus runs at half the CPU clock, giving a bus speed of 166MHz with a 333MHz processor.” (2)
In short, modern RISC processors such as the DEC Alpha 21164 execute many simple instructions and rely on a large number of registers. RISC processors are able to execute instructions relatively quickly because the instructions have a fixed length; the Alpha, for example, requires that all instructions be 32 bits long, so the processor always knows where the next instruction begins. CISC processors, on the other hand, deal with variable-length instructions, which complicate the job of the control unit and can slow it down. Intel seems to have gotten around this inconvenience by implementing Dynamic Execution Technology and dual independent buses.
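The cost of variable-length instructions can be sketched in a few lines of Python. The variable-length encoding used here (the first byte of each instruction gives its total length) is a hypothetical one invented for this example and is not the Pentium's actual encoding; the 4-byte fixed width matches the Alpha's 32-bit instructions.

```python
def split_fixed(code, width=4):
    """Fixed-length ISA: every instruction boundary is known without decoding."""
    return [code[i:i + width] for i in range(0, len(code), width)]

def split_variable(code):
    """Toy variable-length ISA (hypothetical): byte 0 of each instruction is its length."""
    instructions, pos = [], 0
    while pos < len(code):
        length = code[pos]
        if length == 0 or pos + length > len(code):
            raise ValueError("malformed instruction stream")
        instructions.append(code[pos:pos + length])
        pos += length   # the next start is unknown until this one is decoded
    return instructions

fixed_stream = bytes(range(12))
variable_stream = bytes([2, 0xAA, 4, 1, 2, 3, 1, 3, 9, 9])

fixed_lengths = [len(i) for i in split_fixed(fixed_stream)]
variable_lengths = [len(i) for i in split_variable(variable_stream)]
print(fixed_lengths, variable_lengths)
```

In the fixed-length case every slice could be taken in parallel, because the offsets 0, 4, 8 are known ahead of time. In the variable-length case each boundary depends on decoding the instruction before it, which is the extra work the control unit has to absorb.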
1. Choose a commonly used microprocessor such as the Intel Pentium, the DEC Alpha, or the IBM/Motorola PowerPC. What data types are supported? How many bits are used to store each data type? How is each data type internally represented?
Through researching Digital Equipment Corporation’s (DEC) Alpha processor, I found that its architecture supports the following data types: integers, and floating-point formats for both IEEE and VAX. I intend to briefly explain what each data type is, how many bits are used to store it, and how it is represented internally, specifically within the DEC Alpha 21164.
The basic addressable unit in the Alpha architecture is the 8-bit byte. Virtual addresses are 64 bits long; however, an implementation may support a smaller virtual address space of at least 43 bits. These virtual addresses are translated into physical memory addresses by the memory management mechanism. The following data types are described in terms of little-endian byte addressing, meaning that the bytes are numbered from right to left. Implementations may also include support for big-endian byte addressing (bytes numbered from left to right).
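The two byte-numbering schemes can be seen directly with Python's built-in int.to_bytes. This is only an illustration of the ordering; the sample value is arbitrary.

```python
value = 0x0102030405060708            # one 64-bit quadword

little = value.to_bytes(8, "little")  # byte 0 holds the least significant byte
big = value.to_bytes(8, "big")        # byte 0 holds the most significant byte

print(little.hex())
print(big.hex())

# Reading the bytes back with the matching order recovers the same number.
assert int.from_bytes(little, "little") == int.from_bytes(big, "big") == value
```

In the little-endian form, address 0 holds 0x08, the rightmost byte of the number as written; in the big-endian form, address 0 holds 0x01, the leftmost byte.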
The Alpha 21164 architecture provides support for four integer data types. Integers are “whole numbers or a value that does not hav
Data Type: Description
Byte: A byte is 8 contiguous bits that start at an addressable byte boundary. A byte is an 8-bit value. A byte is supported in Alpha architecture by the EXTRACT, MASK, INSERT, and ZAP instructions.
Word: A word is 2 contiguous bytes that start at an arbitrary byte boundary. A word is a 16-bit value. A word is supported in Alpha architecture by the EXTRACT, MASK, and INSERT instructions.
Longword: A longword is 4 contiguous bytes that start at an arbitrary byte boundary. A longword is a 32-bit value. A longword is supported in Alpha architecture by sign-extended load and store instructions and by longword arithmetic instructions.
Quadword: A quadword is 8 contiguous bytes that start at an arbitrary byte boundary. A quadword is a 64-bit value. A quadword is supported in Alpha architecture by load and store instructions and quadword integer operate instructions.
(7, pg. 1-2)
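The four integer sizes, and the kind of work the extract and sign-extending load instructions do, can be sketched in Python as follows. The function names are hypothetical and model only the effect on a 64-bit register value; they are not the exact semantics of any Alpha instruction.

```python
WIDTH_BITS = {"byte": 8, "word": 16, "longword": 32, "quadword": 64}

def extract(quadword, byte_offset, kind):
    """Pull a byte, word, or longword out of a 64-bit value, zero-extended."""
    bits = WIDTH_BITS[kind]
    return (quadword >> (8 * byte_offset)) & ((1 << bits) - 1)

def sign_extend_longword(value32):
    """Model a sign-extended longword load into a 64-bit register."""
    value32 &= 0xFFFFFFFF
    if value32 & 0x80000000:              # bit 31 set: negative longword
        value32 |= 0xFFFFFFFF00000000     # copy the sign into bits 63..32
    return value32

reg = 0x1122334455667788

print(hex(extract(reg, 0, "byte")))
print(hex(extract(reg, 2, "word")))
print(hex(sign_extend_longword(0x80000000)))
```

Because byte offsets are little-endian, offset 0 selects the rightmost byte of the register value.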
Numbers are also represented using floating-point notation. Floating-point notation is just that: the radix point (the decimal point, in base 10) is allowed to “float,” or move left or right as needed. This allows values to be either very precise or very large. The Alpha 21164 supports the IEEE 754 and 854 and VAX floating-point formats.
The IEEE (Institute of Electrical and Electronics Engineers) addressed the lack of portability of floating-point data among different computers by setting standards for both 32- and 64-bit floating-point formats.
The Alpha 21164 supports both the S_floating and T_floating formats. An IEEE single-precision, or S_floating, datum occupies four contiguous bytes in memory, or 32 bits numbered 31 through 0. Bit 31 is the sign bit (indicating a negative or positive value), bits 30 through 23 hold the exponent, and bits 22 through 0 hold the fraction. The store instruction reorders register bits on the way to memory and does no checking of the low-order fraction bits; register bits 61 through 59 and 28 through 0 are therefore ignored. The load instruction reorders bits on the way in from memory. In doing so, it expands the exponent from 8 to 11 bits and sets the low-order fraction bits to zero, producing an equivalent T_floating number in the register.
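The S_floating field boundaries can be checked with Python's struct module, which packs a number as an IEEE single. The helper name s_floating_fields is an assumption for this sketch; it inspects the 32-bit memory image only and does not model the register reordering.

```python
import struct

def s_floating_fields(x):
    """Return (sign, exponent, fraction) of x encoded as an IEEE 32-bit single."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31                 # bit 31
    exponent = (bits >> 23) & 0xFF    # bits 30..23, biased by 127
    fraction = bits & 0x7FFFFF        # bits 22..0
    return sign, exponent, fraction

sign, exponent, fraction = s_floating_fields(-6.5)
print(sign, exponent, hex(fraction))
```

Here -6.5 is -1.625 x 2^2, so the sign bit is 1, the stored exponent is 2 + 127 = 129, and the fraction field holds the .625 part.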
An IEEE double-precision, or T_floating, datum occupies eight contiguous bytes in memory. The bits are labeled from right to left, 0 through 63. Bit 63 is the sign bit, bits 62 through 52 hold the exponent, and bits 51 through 0 hold a 52-bit fraction. For T_floating data, unlike S_floating, no bit reordering or input checking is performed during load or store instructions.
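The same kind of check works for the 64-bit layout. In this sketch the helper name t_floating_fields is an assumption; the value -6.5 is chosen because it is exactly representable in single precision, so it also shows what a widened single looks like: an 11-bit exponent and zeros in the low 29 fraction bits.

```python
import struct

def t_floating_fields(x):
    """Return (sign, exponent, fraction) of x encoded as an IEEE 64-bit double."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    sign = bits >> 63                     # bit 63
    exponent = (bits >> 52) & 0x7FF       # bits 62..52, biased by 1023
    fraction = bits & ((1 << 52) - 1)     # bits 51..0
    return sign, exponent, fraction

sign, exponent, fraction = t_floating_fields(-6.5)
print(sign, exponent, hex(fraction))

# A value that fits in single precision uses only the top 23 fraction bits.
low_29_bits = fraction & ((1 << 29) - 1)
print(low_29_bits)
```

The stored exponent is 2 + 1023 = 1025, compared with 2 + 127 = 129 for the same value in single precision.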
The Alpha 21164 also supports VAX floating-point formats. VAX is short for Virtual Address eXtension; VAX systems run the VMS operating system, which supports virtual memory. VAX floating-point numbers are stored in one set of formats in memory and in a second set of formats in registers. The floating-point load and store instructions convert between these formats purely by rearranging bits; no rounding or range checking is done by the load and store instructions. Alpha processors support the F, G, and some D floating-point formats.
F_floating data are much like IEEE S_floating data in that they occupy four contiguous bytes and consist of a sign bit, an 8-bit exponent, and a 23-bit fraction. In memory, however, the bits are arranged differently: bit 15 is the sign bit, bits 14 through 7 hold the exponent, and bits 6 through 0 and 31 through 16 hold the fraction. In addition, instead of producing an equivalent T_floating number, a load produces an equivalent VAX G_floating number in the register.
The G_floating operand occupies 64 bits in a register. According to Digital Equipment Corporation, the bits are as follows:
The form of a G_floating datum is sign magnitude with bit 15 the sign bit, bits <14:4> an excess 1024 binary exponent, and bits <3:0> and <63:16> a normalized 53-bit fraction with the redundant most significant fraction bit not represented. Within the fraction, bits of increasing significance are from 48 through 63, 32 through 47, 16 through 31, and 0 through 3. The 11-bit exponent field encodes the values 0 through 2047. An exponent value of 0, together with a sign bit of 0, is taken to indicate that the G_floating datum has a value of 0 (9, 2-5).
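The bit ranges in that description can be turned into a small decoder. The sketch below is written in Python under two assumptions that go beyond the quoted passage: the 64-bit memory image is passed in as one little-endian quadword, and a zero exponent with a sign bit of 1 is treated as a reserved operand (the usual VAX convention). The function name decode_g_floating is hypothetical.

```python
import math

def decode_g_floating(q):
    """Decode a 64-bit VAX G_floating memory image given as an integer."""
    sign = (q >> 15) & 1
    exponent = (q >> 4) & 0x7FF               # bits 14..4, excess 1024
    if exponent == 0:
        if sign == 0:
            return 0.0                        # exponent 0, sign 0 means zero
        raise ValueError("reserved operand")  # assumption: VAX convention
    # Reassemble the 52 stored fraction bits, most significant piece first:
    # bits 3..0, then 31..16, then 47..32, then 63..48.
    fraction = ((q & 0xF) << 48) \
        | (((q >> 16) & 0xFFFF) << 32) \
        | (((q >> 32) & 0xFFFF) << 16) \
        | ((q >> 48) & 0xFFFF)
    # Restore the hidden leading bit: the mantissa lies in [0.5, 1).
    mantissa = ((1 << 52) | fraction) / (1 << 53)
    value = math.ldexp(mantissa, exponent - 1024)
    return -value if sign else value

print(decode_g_floating(0x4010))
print(decode_g_floating(0x4018))
print(decode_g_floating(0xC010))
```

For example, the image 0x4010 has exponent field 1025 and a zero fraction, so it decodes as 0.5 x 2^1. Values with an exponent field near 1 fall below the normal range of an IEEE double, so this sketch may round them.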
D_floating data types are only partially supported in Alpha architecture.
“For backward compatibility, exact D_floating arithmetic may be provided via software emulation. D_floating format compatibility in which binary files of D_floating numbers may be processed, but without the last three bits of fraction precision, can be obtained via conversions to G_floating, G arithmetic operations, then conversion back to D_floating.” (9, 2-6)
The reordering of bits required for a D_floating load or store is identical to that of the G_floating load and store instructions; therefore those instructions are used for loading or storing D_floating data. Except for 32 additional fraction bits of low significance, the memory form of a D_floating datum is identical to that of an F_floating datum. Within the fraction, bits of increasing significance are from 48 through 63, 32 through 47, 16 through 31, and 0 through 6. “The exponent conventions and approximate range of values is the same for D_floating as F_floating.” (9, 2-6)
According to the Alpha Architecture Handbook, the Alpha 21164 does not provide hardware support for the following data types:
• Octaword (VAX data type)
• H_floating
• D_floating (except as noted previously)
• Variable-Length
• Character String
• Trailing Numeric String
• Leading Separate Numeric String
• Packed Decimal String
(9, 2-12)
In summary, the Alpha 21164 supports four integer data types, two IEEE floating-point data types, and three VAX floating-point data types (one of them, D_floating, only partially). Byte and word data are handled with the EXTRACT, MASK, and INSERT instructions, while longwords and quadwords have their own load, store, and arithmetic instructions. The floating-point data types use different LOAD and STORE instructions that perform some bit reordering. Some floating-point data types are accommodated by borrowing another type's instructions, as with VAX D_floating, which is loaded and stored with the G_floating instructions.
Bibliography
1. White papers on Alpha 21164, www.digital.com , July 1996
2. Building a new PC – Catching up on Technology, Morris Rosenthal (1999)
3. How does the Pentium II processor achieve its performance benefits?, www.compucon.com/arch7, June 1997
4. Intel MMX Pentium, www.bdti.com/procsum/mmx_pent, Berkeley Design Technology, Inc. 1997
5. Introduction to the Intel Architecture MMX™ Technology, White papers Chapter 1, http://developer.intel.com/drg/mmx/Manuals/prm/prm_chp1
6. Systems Architecture, Second edition, Stephen D. Burd, pg. 162-164 Course Technology (1998)
7. Alpha 21164 Microprocessor Hardware Reference Manual, Digital Equipment Corporation, Maynard, Massachusetts (July 1996)
8. http://pds.jpl..nasa.gov/stdref/appC.htm#HDR8
9. Alpha Architecture Handbook, Digital Equipment Corporation, Maynard, Massachusetts (1996)