The architecture of Pentium Microprocessor: Instruction formats
Contributed by Rajesh Kothandapani

GENERAL INSTRUCTION FORMAT
All Intel Architecture instruction encodings are subsets of the general
instruction format shown in Figure 2-1. Instructions consist of optional
instruction prefixes (in any order), one or two primary opcode bytes, an
addressing-form specifier (if required) consisting of the ModR/M byte and
sometimes the SIB (Scale-Index-Base) byte, a displacement (if required),
and an immediate data field (if required).
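
As a rough illustration of this layout, the Python sketch below splits one
hand-assembled instruction, mov eax, 23, into the fields named above. The
bytes (opcode B8H followed by a 32-bit little-endian immediate) are the
standard IA-32 encoding of that instruction; the dictionary keys are only
labels chosen for this example.

```python
import struct

# Hand-assembled bytes for: mov eax, 23
# B8H is "mov eax, imm32". This form has no prefixes, no ModR/M byte,
# no SIB byte and no displacement: only an opcode and an immediate.
code = bytes([0xB8, 0x17, 0x00, 0x00, 0x00])

fields = {
    "prefixes": code[0:0],      # optional, absent here
    "opcode": code[0:1],        # one primary opcode byte
    "modrm": code[1:1],         # not required by this form
    "sib": code[1:1],           # not required by this form
    "displacement": code[1:1],  # not required by this form
    "immediate": code[1:5],     # 32-bit immediate, little-endian
}

# "<i" decodes a little-endian signed 32-bit integer.
immediate_value = struct.unpack("<i", fields["immediate"])[0]
print(fields["opcode"].hex(), immediate_value)  # b8 23
```

Every field except the opcode is empty here, which is the sense in which
each encoding is a subset of the general format.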
Generalities:
-- Many (most?) of the instructions have exactly 2 operands. If there
   are 2 operands, at most one of them may refer to memory: one operand
   is a register (or, in some forms, an immediate), and the other has
   no restrictions on its addressing mode.
-- There are most often ways of specifying the same instruction for
   8-, 16-, or 32-bit operands. I left out the 16-bit ones to reduce
   presentation of the instruction set. Note that on a 32-bit machine,
   with newly written code, the 16-bit form is rarely needed.
Meanings of the operand specifications:
   reg    - register mode operand, 32-bit register
   reg8   - register mode operand, 8-bit register
   r/m    - general addressing mode, 32-bit
   r/m8   - general addressing mode, 8-bit
   immed  - 32-bit immediate is in the instruction
   immed8 - 8-bit immediate is in the instruction
   m      - symbol (label) in the instruction is the effective address
Data Movement
-------------
   mov   reg, r/m       ; copy data
         r/m, reg
         reg, immed
         r/m, immed
   movsx reg, r/m8      ; sign extend and copy data
   movzx reg, r/m8      ; zero extend and copy data
   lea   reg, m         ; get effective address
                        (Its format is much more restricted than
                        the other ones.)
EXAMPLES:
   mov   EAX, 23              ; places 32-bit 2's complement immediate 23
                              ; into register EAX
   movsx ECX, AL              ; sign extends the 8-bit quantity in register
                              ; AL to 32 bits, and places it in ECX
   mov   dword ptr [esp], -1  ; places 32-bit value -1 into memory, address
                              ; given by contents of esp
   lea   EBX, loop_top        ; put the address assigned (by the assembler)
                              ; to label loop_top into register EBX
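
The difference between movsx and movzx is easy to miss in a listing, so
here is a minimal Python sketch of what each one computes for an 8-bit
source. The function names movzx8 and movsx8 are made up for this
example; registers are modeled as plain unsigned integers.

```python
def movzx8(value):
    """Zero extend an 8-bit value to 32 bits, as movzx reg, r/m8 does."""
    return value & 0xFF


def movsx8(value):
    """Sign extend an 8-bit value to 32 bits, as movsx reg, r/m8 does."""
    value &= 0xFF
    if value & 0x80:          # bit 7 set: the 8-bit value is negative
        value |= 0xFFFFFF00   # copy the sign bit into bits 8..31
    return value


al = 0xF0                     # -16 as an 8-bit two's complement value
print(hex(movzx8(al)))        # 0xf0
print(hex(movsx8(al)))        # 0xfffffff0
```

The two results agree whenever bit 7 of the source is clear, and differ
only for values that are negative when read as 8-bit two's complement.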
Integer Arithmetic
------------------
   add   reg, r/m       ; two's complement addition
         r/m, reg
         reg, immed
         r/m, immed
   inc   reg            ; add 1 to operand
         r/m
   sub   reg, r/m       ; two's complement subtraction
         r/m, reg
         reg, immed
         r/m, immed
   dec   reg            ; subtract 1 from operand
         r/m
   neg   r/m            ; get additive inverse of operand
   mul   r/m            ; unsigned multiplication
                        ; edx||eax <- eax * r/m
   imul  r/m            ; 2's comp. multiplication
                        ; edx||eax <- eax * r/m
         reg, r/m       ; reg <- reg * r/m
         reg, immed     ; reg <- reg * immed
   div   r/m            ; unsigned division
                        ; does edx||eax / r/m
                        ; eax <- quotient
                        ; edx <- remainder
   idiv  r/m            ; 2's complement division
                        ; does edx||eax / r/m
                        ; eax <- quotient
                        ; edx <- remainder
   cmp   reg, r/m       ; sets EFLAGS based on
         r/m, immed     ; first operand - second operand
         r/m8, immed8
         r/m, immed8    ; sign extends immed8 before subtract
EXAMPLES:
   neg   [eax + 4]      ; takes doubleword at address eax+4
                        ; and finds its additive inverse, then places
                        ; the additive inverse back at that address
                        ; the instruction should probably be
                        ;   neg dword ptr [eax + 4]
   inc   ecx            ; adds one to contents of register ecx, and
                        ; result goes back to ecx
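
The edx||eax notation used for mul and div above means a 64-bit value
whose upper half lives in edx and lower half in eax. The following
Python sketch models the unsigned forms only. The names mul32 and div32
are invented for this example, and Python exceptions stand in for the
divide error the processor would raise.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax, operand):
    """Unsigned mul r/m: return (edx, eax) holding the 64-bit product."""
    product = (eax & MASK32) * (operand & MASK32)
    return (product >> 32) & MASK32, product & MASK32


def div32(edx, eax, operand):
    """Unsigned div r/m: return (eax, edx) = (quotient, remainder)."""
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    operand &= MASK32
    if operand == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    quotient, remainder = divmod(dividend, operand)
    if quotient > MASK32:
        # The quotient must fit in eax, otherwise the CPU also faults.
        raise OverflowError("divide error: quotient does not fit in eax")
    return quotient, remainder


edx, eax = mul32(0xFFFFFFFF, 2)
print(hex(edx), hex(eax))           # 0x1 0xfffffffe
print(div32(edx, eax, 2))           # (4294967295, 0)
```

A practical consequence is that edx must hold a sensible value (often
zero, for unsigned division) before div is executed.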
Logical
-------
   not   r/m            ; logical not
   and   reg, r/m       ; logical and
         reg8, r/m8
         r/m, reg
         r/m8, reg8
         r/m, immed
         r/m8, immed8
   or    reg, r/m       ; logical or
         reg8, r/m8
         r/m, reg
         r/m8, reg8
         r/m, immed
         r/m8, immed8
   xor   reg, r/m       ; logical exclusive or
         reg8, r/m8
         r/m, reg
         r/m8, reg8
         r/m, immed
         r/m8, immed8
   test  r/m, reg       ; logical and to set EFLAGS
         r/m8, reg8
         r/m, immed
         r/m8, immed8
EXAMPLES:
   and   edx, 00330000h ; logical and of contents of register
                        ; edx (bitwise) with 0x00330000,
                        ; result goes back to edx
Floating Point Arithmetic
-------------------------
Since the newer architectures have room for floating point hardware on
chip, Intel defined a simple-to-implement extension to the architecture
to do floating point arithmetic. In their usual zeal, they have included
MANY instructions to do floating point operations.

The mechanism is simple. A set of 8 registers is organized and
maintained (by hardware) as a stack of floating point values. ST refers
to the stack top. ST(1) refers to the register within the stack that is
next to ST. ST and ST(0) are synonyms.

There are separate instructions to test and compare the values of
floating point variables.
   finit                ; initialize the FPU
   fld   m32            ; load floating point value
         m64
         ST(i)
   fldz                 ; load floating point value 0.0
   fst   m32            ; store floating point value
         m64
         ST(i)
   fstp  m32            ; store floating point value
         m64            ; and pop ST
         ST(i)
   fadd  m32            ; floating point addition
         m64
         ST, ST(i)
         ST(i), ST
   faddp ST(i), ST      ; floating point addition
                        ; and pop ST
   ETC.
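
The stack discipline is easier to follow with a toy model. The Python
sketch below is a deliberate simplification: it keeps the registers in a
list with ST at index 0, uses Python floats, and raises an exception on
overflow, whereas the real FPU tracks a top-of-stack pointer and reports
stack faults through its status word. The class name FpuStack and its
method signatures are invented for this example.

```python
class FpuStack:
    """Toy model of the 8-entry floating point register stack."""

    def __init__(self):               # roughly what finit gives us
        self.regs = []                # regs[i] plays the role of ST(i)

    def fld(self, value):
        """Push a value; it becomes the new ST."""
        if len(self.regs) == 8:
            raise OverflowError("register stack is full")
        self.regs.insert(0, float(value))

    def fldz(self):
        """Push the constant 0.0."""
        self.fld(0.0)

    def fst(self):
        """Read ST without popping."""
        return self.regs[0]

    def fstp(self):
        """Read ST and pop it."""
        return self.regs.pop(0)

    def fadd(self, i):
        """fadd ST, ST(i): ST <- ST + ST(i)."""
        self.regs[0] += self.regs[i]

    def faddp(self, i):
        """faddp ST(i), ST: ST(i) <- ST(i) + ST, then pop ST."""
        self.regs[i] += self.regs[0]
        self.regs.pop(0)


fpu = FpuStack()
fpu.fld(1.5)
fpu.fld(2.25)
fpu.faddp(1)
print(fpu.fstp())                     # 3.75
```

Note how faddp leaves the sum in what was ST(1), which becomes ST once
the old top has been popped.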
I/O
---
The only instructions which actually allow the reading and writing of
I/O devices are privileged. The OS must handle these things. But, in
writing programs that do something useful, we need input and output.
Therefore, there are some simple macros defined to help us do I/O.

These are used just like instructions.
   put_ch  r/m          ; print character in the least significant
                        ; byte of 32-bit operand
   get_ch  r/m          ; character will be in AL
   put_str m            ; print null terminated string given
                        ; by label m
Control Instructions
--------------------
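These are the same control instructions that all started with the
character 'b' in SASM. The conditional jumps take only a target: they
test EFLAGS, typically as set by a preceding cmp, and treat the
compared values as signed.

   jmp   m              ; unconditional jump
   jg    m              ; jump if greater than
                        ; (after cmp a, b: a - b > 0)
   jge   m              ; jump if greater than or equal to
                        ; (after cmp a, b: a - b >= 0)
   jl    m              ; jump if less than
                        ; (after cmp a, b: a - b < 0)
   jle   m              ; jump if less than or equal to
                        ; (after cmp a, b: a - b <= 0)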
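
On IA-32 these conditional jumps do not take a value to compare; they
test EFLAGS bits left behind by an earlier instruction, usually cmp. The
Python sketch below models that for 32-bit signed comparisons. The flag
formulas for jg, jge, jl and jle are the standard IA-32 conditions; the
helper names (cmp_flags, taken) are invented for this example.

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def cmp_flags(first, second):
    """Flags produced by cmp first, second (computes first - second)."""
    result = (first - second) & MASK32
    exact = to_signed(first) - to_signed(second)
    return {
        "ZF": result == 0,
        "SF": bool(result >> 31),
        "OF": exact != to_signed(result),   # signed overflow occurred
    }


CONDITIONS = {
    "jg": lambda f: not f["ZF"] and f["SF"] == f["OF"],
    "jge": lambda f: f["SF"] == f["OF"],
    "jl": lambda f: f["SF"] != f["OF"],
    "jle": lambda f: f["ZF"] or f["SF"] != f["OF"],
}


def taken(mnemonic, first, second):
    """Would the jump be taken right after cmp first, second?"""
    return CONDITIONS[mnemonic](cmp_flags(first, second))


print(taken("jg", 5, 3))              # True
print(taken("jl", 0xFFFFFFFF, 1))     # True, since 0xFFFFFFFF is -1
```

Comparing SF with OF, rather than looking at SF alone, is what keeps the
answer right when the subtraction overflows.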
INSTRUCTION PREFIXES
The instruction prefixes are divided into four groups, each with a set
of allowable prefix codes:
· Lock and repeat prefixes.
   -- F0H: LOCK prefix.
   -- F2H: REPNE/REPNZ prefix (used only with string instructions).
   -- F3H: REP prefix (used only with string instructions).
   -- F3H: REPE/REPZ prefix (used only with string instructions).
   -- F3H: Streaming SIMD Extensions prefix.
· Segment override.
   -- 2EH: CS segment override prefix.
   -- 36H: SS segment override prefix.
   -- 3EH: DS segment override prefix.
   -- 26H: ES segment override prefix.
   -- 64H: FS segment override prefix.
   -- 65H: GS segment override prefix.
· Operand-size override, 66H
· Address-size override, 67H
For each instruction, one prefix may be used from each of these groups
and be placed in any order. The effect of redundant prefixes (more than
one prefix from a group) is undefined and may vary from processor to
processor.
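
The four groups can be turned into a small lookup table. The Python
sketch below strips the leading prefix bytes from an instruction and
records which group each belongs to. Raising an error on a second prefix
from the same group is a choice made for this sketch only, since the
text above says the hardware behavior is undefined; the names
PREFIX_GROUPS and split_prefixes are likewise invented here.

```python
PREFIX_GROUPS = {
    0xF0: "lock/repeat", 0xF2: "lock/repeat", 0xF3: "lock/repeat",
    0x2E: "segment", 0x36: "segment", 0x3E: "segment",
    0x26: "segment", 0x64: "segment", 0x65: "segment",
    0x66: "operand-size",
    0x67: "address-size",
}


def split_prefixes(code):
    """Return ({group: prefix byte}, remaining bytes) for one instruction."""
    seen = {}
    offset = 0
    while offset < len(code) and code[offset] in PREFIX_GROUPS:
        group = PREFIX_GROUPS[code[offset]]
        if group in seen:
            raise ValueError("redundant prefix in group " + group)
        seen[group] = code[offset]
        offset += 1
    return seen, code[offset:]


# 66H B8H 17H 00H is mov ax, 23 when assembled for 32-bit code.
prefixes, rest = split_prefixes(bytes([0x66, 0xB8, 0x17, 0x00]))
print(prefixes, rest.hex())           # {'operand-size': 102} b8170000
```

The scan works in any prefix order, which matches the rule that prefixes
from different groups may appear in any order.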
· Streaming SIMD Extensions prefix, 0FH

The nature of Streaming SIMD Extensions allows the use of existing
instruction formats. Instructions use the ModR/M format and are preceded
by the 0FH prefix byte. In general, operations are not duplicated to
provide two directions (i.e. separate load and store variants).
OPCODE
The primary OPCODE is either 1 or 2 bytes. An additional 3-bit OPCODE
field is sometimes encoded in the ModR/M byte. Smaller encoding fields
can be defined within the primary OPCODE. These fields define the
direction of the operation, the size of displacements, the register
encoding, condition codes, or sign extension. The encoding of fields in
the OPCODE varies, depending on the class of operation.
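
As one example of such smaller fields, the register-to-register and
register-to-memory forms of mov (opcodes 88H through 8BH) use bit 1 as a
direction bit and bit 0 as a width bit. The Python sketch below extracts
them; the function name direction_and_width is invented for this
example, and the same bit layout should not be assumed for every opcode.

```python
def direction_and_width(opcode):
    """Return (d, w) for opcodes that use the common d/w bit layout.

    d = 0: the reg field is the source, r/m is the destination
    d = 1: the reg field is the destination, r/m is the source
    w = 0: 8-bit operands
    w = 1: full-size operands (32-bit in 32-bit code)
    """
    return (opcode >> 1) & 1, opcode & 1


MOV_FORMS = {
    0x88: "mov r/m8, reg8",
    0x89: "mov r/m, reg",
    0x8A: "mov reg8, r/m8",
    0x8B: "mov reg, r/m",
}

for opcode, form in MOV_FORMS.items():
    d, w = direction_and_width(opcode)
    print(f"{opcode:02X}H  d={d} w={w}  {form}")
```

This is why a single mnemonic such as mov corresponds to several opcode
bytes that differ only in their low bits.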
MODR/M AND SIB BYTES
Most instructions that refer to an operand in memory have an
addressing-form specifier byte (called the ModR/M byte) following the
primary OPCODE. The ModR/M byte contains three fields of information:
· The mod field combines with the r/m field to form 32 possible values:
  eight registers and 24 addressing modes.
· The reg/opcode field specifies either a register number or three more
  bits of opcode information. The purpose of the reg/opcode field is
  specified in the primary opcode.
· The r/m field can specify a register as an operand or can be combined
  with the mod field to encode an addressing mode.
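
The three fields occupy bits 7-6 (mod), 5-3 (reg/opcode) and 2-0 (r/m)
of the ModR/M byte. A minimal Python sketch of the split follows; the
function name decode_modrm and the REG32 list are helpers made up for
this example, with the registers listed in their standard 32-bit
encoding order.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_modrm(byte):
    """Split a ModR/M byte into its mod, reg/opcode and r/m fields."""
    return {
        "mod": (byte >> 6) & 0b11,
        "reg": (byte >> 3) & 0b111,
        "rm": byte & 0b111,
    }


# 89H C8H encodes mov eax, ecx: opcode 89H is mov r/m, reg.
fields = decode_modrm(0xC8)
print(fields)                                   # {'mod': 3, 'reg': 1, 'rm': 0}

# mod = 3 means the r/m field names a register rather than memory.
print(REG32[fields["rm"]], REG32[fields["reg"]])   # eax ecx

# Of the 32 mod and r/m combinations, 8 (mod = 3) select registers.
register_forms = sum(1 for mod in range(4) for rm in range(8) if mod == 3)
print(register_forms, 32 - register_forms)      # 8 24
```

The last line reproduces the count given above: eight register forms and
24 addressing modes.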
Certain encodings of the ModR/M byte require a second addressing byte,
the SIB byte, to fully specify the addressing form. The base-plus-index
and scale-plus-index forms of 32-bit addressing require the SIB byte.
The SIB byte includes the following fields:
· The scale field specifies the scale factor.
· The index field specifies the register number of the index register.
· The base field specifies the register number of the base register.
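
The SIB byte has the same 2-3-3 bit layout as ModR/M: scale in bits 7-6,
index in bits 5-3 and base in bits 2-0. The Python sketch below decodes
it and also shows the usual 32-bit rule for when a SIB byte is present
(mod is not 3 and r/m is 4). The names decode_sib and needs_sib are
invented for this example.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def needs_sib(modrm):
    """In 32-bit addressing, r/m = 4 with mod != 3 means a SIB byte follows."""
    mod, rm = (modrm >> 6) & 0b11, modrm & 0b111
    return mod != 3 and rm == 4


def decode_sib(byte):
    """Split a SIB byte; the 2-bit scale field encodes 1, 2, 4 or 8."""
    return {
        "scale": 1 << ((byte >> 6) & 0b11),
        "index": (byte >> 3) & 0b111,
        "base": byte & 0b111,
    }


# 8BH 04H B3H encodes mov eax, [ebx + esi*4].
modrm, sib = 0x04, 0xB3
print(needs_sib(modrm))               # True
fields = decode_sib(sib)
print(fields)                         # {'scale': 4, 'index': 6, 'base': 3}
print(f"[{REG32[fields['base']]} + {REG32[fields['index']]}*{fields['scale']}]")
```

The effective address is then base + index * scale, plus any
displacement selected by the mod field.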