RISC
(Reduced Instruction Set Computer)
A RISC (reduced instruction set
computer) is a microprocessor
that is designed to perform a
smaller number of types of computer
instruction so that it can operate
at a higher speed (perform more
millions of instructions per
second, or MIPS). Since each instruction
type that a computer must perform
requires additional transistors
and circuitry, a larger list or
set of computer instructions tends
to make the microprocessor more
complicated and slower in operation.
John Cocke of IBM Research in
Yorktown, New York, originated
the RISC concept in 1974 by proving
that about 20% of the instructions
in a computer did 80% of the work.
The first computer to benefit
from this discovery was IBM's
experimental 801 minicomputer,
around 1980. Later, IBM's RISC
System/6000 made use of the idea.
The term itself (RISC) is credited
to David Patterson, a teacher
at the University of California
in Berkeley. The concept was used
in Sun Microsystems' SPARC microprocessors
and led to the founding of what
is now MIPS Technologies, part
of Silicon Graphics. DEC's Alpha
microchip also uses RISC technology.
The RISC concept has led to a
more thoughtful design of the
microprocessor. Among design considerations
are how well an instruction can
be mapped to the clock speed of
the microprocessor (ideally, an
instruction can be performed in
one clock cycle); how "simple"
the architecture needs to be; and
how much work can be done by the
microchip itself without resorting
to software help.
Besides performance improvement,
some advantages of RISC and related
design improvements are:
A new microprocessor can be
developed and tested more
quickly if one of its aims
is to be less complicated.

Operating system and application
programmers who use the microprocessor's
instructions will find it
easier to develop code with
a smaller instruction set.

The simplicity of RISC allows
more freedom to choose how
to use the space on a microprocessor.

Higher-level language compilers
produce more efficient code than
formerly because they have always
tended to use the smaller set
of instructions to be found in
a RISC computer.
RISC characteristics
Simple instruction set.
In a RISC machine, the instruction
set contains simple, basic
instructions, from which more
complex operations can be
composed.

Same length instructions.
Each instruction is the same
length, so that it may be
fetched in a single operation.

One machine-cycle instructions.
Most instructions complete
in one machine cycle, which
allows the processor to handle
several instructions at the
same time. This pipelining
is a key technique used to
speed up RISC machines.
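
To make the same-length point
concrete, here is a minimal C
sketch. The 32-bit field layout
and the names decode and next_pc
are invented for illustration and
do not match any real RISC
instruction set. Because every
instruction occupies one word, the
fields sit at fixed bit positions
and the address of the next
instruction is known before
decoding even starts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit format, invented for this sketch:
   bits 31-26 opcode, 25-21 rd, 20-16 rs1, 15-0 immediate. */
struct decoded {
    unsigned opcode, rd, rs1;
    uint16_t imm;
};

/* Every field is found with a fixed shift and mask; no length
   byte or prefix has to be examined first. */
struct decoded decode(uint32_t word)
{
    struct decoded d;
    d.opcode = (word >> 26) & 0x3Fu;
    d.rd     = (word >> 21) & 0x1Fu;
    d.rs1    = (word >> 16) & 0x1Fu;
    d.imm    = (uint16_t)(word & 0xFFFFu);
    return d;
}

/* All instructions are 4 bytes, so the next fetch address
   never depends on what the current instruction is. */
uint32_t next_pc(uint32_t pc)
{
    assert(pc % 4 == 0);   /* instructions are word-aligned in this model */
    return pc + 4;
}
```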
Inside a RISC Machine
Pipelining: A key RISC technique
RISC designers are concerned primarily
with creating the fastest chip
possible, and so they use a number
of techniques, including pipelining.
Pipelining is a design technique
where the computer's hardware
processes more than one instruction
at a time, and doesn't wait for
one instruction to complete before
starting the next.
Remember the four stages in our
typical CISC machine? They were
fetch, decode, execute, and write.
These same stages exist in a RISC
machine, but the stages are executed
in parallel. As soon as one stage
completes, it passes on the result
to the next stage and then begins
working on another instruction.
The performance of a pipelined
system depends only on the time
it takes for any one stage to
be completed, not on the total
time for all stages as with non-pipelined
designs.
In a typical pipelined RISC
design, each instruction takes
1 clock cycle for each stage,
so the processor can accept 1
new instruction per clock. Pipelining
doesn't improve the latency of
instructions (each instruction
still requires the same amount
of time to complete), but it does
improve the overall throughput.
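
The difference between latency
and throughput can be checked with
a little arithmetic. The C sketch
below is a simplified model: it
assumes every stage takes exactly
one clock and that the pipeline
never stalls. The function names
are illustrative only.

```c
#include <assert.h>

/* Non-pipelined: each instruction passes through every stage
   before the next instruction is allowed to start. */
unsigned long cycles_unpipelined(unsigned long stages, unsigned long n)
{
    assert(stages > 0);
    return stages * n;
}

/* Pipelined: the first instruction needs `stages` clocks to reach
   the end; after that, one instruction completes on every clock. */
unsigned long cycles_pipelined(unsigned long stages, unsigned long n)
{
    assert(stages > 0);
    return n == 0 ? 0 : stages + (n - 1);
}
```

With 4 stages and 100 instructions,
the first function gives 400 clocks
and the second gives 103, even though
each individual instruction still
spends 4 clocks in the machine.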
As with CISC computers, the ideal
is not always achieved. Sometimes
pipelined instructions take more
than one clock to complete a stage.
When that happens, the processor
has to stall and not accept new
instructions until the slow instruction
has moved on to the next stage.
Since the processor is sitting
idle when stalled, both the designers
and programmers of RISC systems
make a conscious effort to avoid
stalls. To do this, designers
employ several techniques, as
shown in the following sections.
Performance issues in pipelined systems
A pipelined processor can stall
for a variety of reasons, including
delays in reading information
from memory, a poor instruction
set design, or dependencies between
instructions. The following pages
examine some of the ways that
chip designers and system designers
are addressing these problems.
Memory speed
Memory speed issues are commonly
solved using caches. A cache is
a section of fast memory placed
between the processor and slower
memory. When the processor wants
to read a location in main memory,
that location is also copied into
the cache. Subsequent references
to that location can come from
the cache, which will return a
result much more quickly than
the main memory.
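
The read path described above can
be sketched in C. This is a toy
software model of a direct-mapped
cache, not a description of any
particular processor; the size
CACHE_LINES, the structure layout
and the name cache_read are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINES 8u   /* hypothetical size, chosen for illustration */

struct cache_line {
    bool     valid;
    uint32_t addr;    /* main-memory location held in this slot */
    uint32_t value;
};

struct cache {
    struct cache_line line[CACHE_LINES];
    unsigned hits, misses;
};

/* Read one word. `memory` stands in for the slower main memory. */
uint32_t cache_read(struct cache *c, const uint32_t *memory, uint32_t addr)
{
    assert(c && memory);
    struct cache_line *slot = &c->line[addr % CACHE_LINES];

    if (slot->valid && slot->addr == addr) {
        c->hits++;                 /* fast path: answer comes from the cache */
        return slot->value;
    }
    c->misses++;                   /* slow path: go to main memory ... */
    slot->valid = true;
    slot->addr  = addr;
    slot->value = memory[addr];    /* ... and keep a copy for next time */
    return slot->value;
}
```

Reading the same location twice in
a row produces one miss followed by
one hit. Two locations that map to
the same slot evict each other,
which is one reason real caches are
usually more elaborate than this.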
Caches present one major problem
to system designers and programmers,
and that is the problem of coherency.
When the processor writes a value
to memory, the result goes into
the cache instead of going directly
to main memory. Therefore, special
hardware (usually implemented
as part of the processor) needs
to write the information out to
main memory before something else
tries to read that location or
before re-using that part of the
cache for some different information.
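
The write-back behavior just
described can be modeled in C with
a "dirty" flag per cache slot. Real
processors do this in hardware; the
sketch below is only a software
analogy, and the names wb_write,
wb_flush_all and WB_LINES are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WB_LINES 4u   /* hypothetical cache size */

struct wb_line {
    bool     valid;
    bool     dirty;   /* cache holds a newer value than main memory */
    uint32_t addr;
    uint32_t value;
};

/* Copy a modified slot back to main memory so the two agree again. */
static void wb_flush_line(struct wb_line *slot, uint32_t *memory)
{
    if (slot->valid && slot->dirty) {
        memory[slot->addr] = slot->value;
        slot->dirty = false;
    }
}

/* A write goes into the cache only; main memory is updated later. */
void wb_write(struct wb_line *cache, uint32_t *memory,
              uint32_t addr, uint32_t value)
{
    assert(cache && memory);
    struct wb_line *slot = &cache[addr % WB_LINES];

    if (!(slot->valid && slot->addr == addr))
        wb_flush_line(slot, memory);   /* slot is about to be re-used */
    slot->valid = true;
    slot->dirty = true;
    slot->addr  = addr;
    slot->value = value;
}

/* Write every modified slot back, for example before some other
   device reads main memory. */
void wb_flush_all(struct wb_line *cache, uint32_t *memory)
{
    assert(cache && memory);
    for (unsigned i = 0; i < WB_LINES; i++)
        wb_flush_line(&cache[i], memory);
}
```

Until a slot is flushed or re-used,
main memory still holds the old
value, which is exactly the
coherency problem.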
Instruction Latency
A poorly designed instruction
set can cause a pipelined processor
to stall frequently. Some of the
more common problem areas are:
Highly encoded instructions (such
as those used on CISC machines)
that require extra work to decode.

Variable-length instructions,
which require multiple references
to memory to fetch the
entire instruction.

Instructions which access
main memory (instead of registers),
since main memory can be slow.

Complex instructions which
require multiple clocks for
execution (many floating-point
operations, for example).

Instructions which need to
read and write the same
register. For example,
"ADD 5 to register 3"
has to read register 3, add
5 to that value, then write
the sum back to the same register
(which may still be "busy"
from the earlier read operation,
causing the processor to stall
until the register becomes
available).

Dependence on single-point
resources such as a condition
code register. If one instruction
sets the conditions in the
condition code register and
the following instruction
tries to read those bits,
the second instruction may
have to stall until the first
instruction's write completes.
Dependencies
One problem that RISC programmers
face is that the processor can
be slowed down by a poor choice
of instructions. Since each instruction
takes some amount of time to store
its result, and several instructions
are being handled at the same
time, later instructions may have
to wait for the results of earlier
instructions to be stored. However,
a simple rearrangement of the
instructions in a program (called
Instruction Scheduling) can remove
these performance limitations
from RISC programs.
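
The effect of instruction scheduling
can be illustrated with a small
stall counter written in C. The
model is deliberately crude and is
an assumption of this sketch, not
a description of a real processor:
one instruction issues per clock,
in order, and each result becomes
readable a fixed number of clocks
(its "latency") after issue. The
struct insn layout and the name
count_stalls are hypothetical.

```c
#include <assert.h>

#define NUM_REGS 8   /* register numbers must be below this */

/* Hypothetical three-register instruction: dst = src1 op src2.
   `latency` is how many clocks pass before dst can be read. */
struct insn {
    int dst, src1, src2;
    int latency;
};

/* Count the clocks an in-order machine spends waiting because an
   instruction needs a register an earlier one has not written yet. */
int count_stalls(const struct insn *prog, int n)
{
    int ready[NUM_REGS] = {0};   /* clock at which each register is valid */
    int clock = 0, stalls = 0;

    assert(prog || n == 0);
    for (int i = 0; i < n; i++) {
        int need = ready[prog[i].src1];
        if (ready[prog[i].src2] > need)
            need = ready[prog[i].src2];
        if (need > clock) {              /* operand not ready: wait */
            stalls += need - clock;
            clock = need;
        }
        ready[prog[i].dst] = clock + prog[i].latency;
        clock++;                         /* issue one instruction per clock */
    }
    return stalls;
}
```

Take two slow operations (latency 3),
each followed immediately by an
instruction that uses its result:
this model counts 4 stalled clocks.
Moving the second slow operation up
so that it sits right after the
first leaves only 1 stalled clock,
with no change to the results
computed.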
One common optimization involves
"common subexpression elimination."
A compiler which encounters the
commands:
B = 10 * (A / 3);
C = (A / 3) / 4;
might calculate (A/3) first, put
that result into a temporary variable,
and then use the temporary variable
in later calculations.
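
Written out by hand in C, the
transformation looks roughly like
this. The function names and the
temporary t are illustrative; a
compiler would do the same thing
internally without changing the
source.

```c
#include <assert.h>

/* As written: (a / 3) is computed twice. */
void compute_naive(int a, int *b, int *c)
{
    assert(b && c);
    *b = 10 * (a / 3);
    *c = (a / 3) / 4;
}

/* After common subexpression elimination: one division by 3. */
void compute_cse(int a, int *b, int *c)
{
    assert(b && c);
    int t = a / 3;   /* temporary holds the shared subexpression */
    *b = 10 * t;
    *c = t / 4;
}
```

Both versions produce the same
values of b and c for any a.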
Another optimization involves
"loop unrolling." Instead
of executing a sequence of instructions
inside a loop, the compiler may
replicate the instructions multiple
times. This reduces or eliminates
the overhead of calculating and
testing the loop control variable.
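
A hand-unrolled loop in C gives the
flavor of what the compiler does.
The unroll factor of 4 and the
function names are arbitrary choices
for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: one compare and one increment per element. */
long sum_rolled(const int *v, size_t n)
{
    long total = 0;
    assert(v || n == 0);
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Unrolled by 4: one compare and one increment per four elements. */
long sum_unrolled(const int *v, size_t n)
{
    long total = 0;
    size_t i = 0;
    assert(v || n == 0);
    for (; i + 4 <= n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    for (; i < n; i++)   /* leftovers when n is not a multiple of 4 */
        total += v[i];
    return total;
}
```

The unrolled version is longer,
which is the usual price: loop
overhead is traded for code size.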
Compilers also perform function
inlining, where a call to a small
subroutine is replaced by the
code of the subroutine itself.
This gets rid of the overhead
of a call/return sequence.
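
Done by hand in C, inlining looks
like the following. The helper cube
and both function names are made up
for the example; whether a given
compiler actually inlines a call
depends on its settings.

```c
#include <assert.h>

/* A small subroutine that is cheap compared with the cost of calling it. */
static long cube(int x)
{
    return (long)x * x * x;
}

/* Before: one call/return sequence per loop iteration. */
long sum_cubes_call(const int *v, int n)
{
    long total = 0;
    assert(v || n == 0);
    for (int i = 0; i < n; i++)
        total += cube(v[i]);
    return total;
}

/* After inlining: the body of cube() replaces the call. */
long sum_cubes_inlined(const int *v, int n)
{
    long total = 0;
    assert(v || n == 0);
    for (int i = 0; i < n; i++)
        total += (long)v[i] * v[i] * v[i];
    return total;
}
```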
This is only a small sample of
the optimizations which are available.
Consult a good textbook on compilers
for other ideas on how compiled
code may be optimized.
RISC Pros and Cons
The advantages of RISC
Implementing a processor with
a simplified instruction set design
provides several advantages over
implementing a comparable CISC
design:
Speed. Since a simplified
instruction set allows for
a pipelined, superscalar design,
RISC processors often achieve
2 to 4 times the performance
of CISC processors using comparable
semiconductor technology and
the same clock rates.

Simpler hardware. Because
the instruction set of a RISC
processor is so simple, it
uses up much less chip space;
extra functions, such as memory
management units or floating
point arithmetic units, can
also be placed on the same
chip. Smaller chips allow
a semiconductor manufacturer
to place more parts on a single
silicon wafer, which can lower
the per-chip cost dramatically.

Shorter design cycle. Since
RISC processors are simpler
than corresponding CISC processors,
they can be designed more
quickly, and can take advantage
of other technological developments
sooner than corresponding
CISC designs, leading to greater
leaps in performance between
generations.
The hazards of RISC
The transition from a CISC design
strategy to a RISC design strategy
isn't without its problems. Software
engineers should be aware of the
key issues which arise when moving
code from a CISC processor to
a RISC processor.
Code Quality
The performance of a RISC processor
depends greatly on the code that
it is executing. If the programmer
(or compiler) does a poor job
of instruction scheduling, the
processor can spend quite a bit
of time stalling: waiting for
the result of one instruction
before it can proceed with a subsequent
instruction.
Since the scheduling rules can
be complicated, most programmers
use a high level language (such
as C or C++) and leave the instruction
scheduling to the compiler.
This makes the performance of
a RISC application depend critically
on the quality of the code generated
by the compiler. Therefore, developers
(and development tool suppliers
such as Apple) have to choose
their compiler carefully based
on the quality of the generated
code.
Debugging
Unfortunately, instruction scheduling
can make debugging difficult.
If scheduling (and other optimizations)
are turned off, the machine-language
instructions show a clear connection
with their corresponding lines
of source. However, once instruction
scheduling is turned on, the machine
language instructions for one
line of source may appear in the
middle of the instructions for
another line of source code.
Such an intermingling of machine
language instructions not only
makes the code hard to read, it
can also defeat the purpose of
using a source-level debugger,
since single lines of code can
no longer be executed by themselves.
Therefore, many RISC programmers
debug their code in an un-optimized,
un-scheduled form and then turn
on the scheduler (and other optimizations)
and hope that the program continues
to work in the same way.
Code expansion
Since CISC machines perform complex
actions with a single instruction,
where RISC machines may require
multiple instructions for the
same action, code expansion can
be a problem.
Code expansion refers to the increase
in size that you get when you
take a program that had been compiled
for a CISC machine and re-compile
it for a RISC machine. The exact
expansion depends primarily on
the quality of the compiler and
the nature of the machine's instruction
set.
Fortunately for us, the code expansion
between a 68K processor used in
the non-PowerPC Macintoshes and
the PowerPC seems to be only 30-50%
on average, although size-optimized
PowerPC code can be the same size
as (or smaller than) corresponding
68K code.
System Design
Another problem that faces RISC
machines is that they require
very fast memory systems to feed
them instructions. RISC-based
systems typically contain large
memory caches, usually on the
chip itself. This is known as
a first-level cache.
