Summary for H&P Ch. 3
Pipe stage/pipe segment: A conceptual stage in a pipeline that does
some amount of work on each instruction and then passes its result
to the next stage.
Machine cycle: The basic time quantum of the pipeline. The
slowest pipe stage generally determines the machine cycle time.
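As a rough illustration of the machine-cycle rule above, here is a minimal
Python sketch. The stage latencies and the latch_overhead_ns parameter are
made-up assumptions for the example, not figures from the chapter.

```python
def cycle_time(stage_latencies_ns, latch_overhead_ns=0.0):
    """Machine cycle = slowest stage, plus an assumed per-stage latch overhead."""
    return max(stage_latencies_ns) + latch_overhead_ns


def ideal_speedup(stage_latencies_ns, latch_overhead_ns=0.0):
    """Unpipelined time per instruction divided by the pipelined cycle time."""
    unpipelined_ns = sum(stage_latencies_ns)
    return unpipelined_ns / cycle_time(stage_latencies_ns, latch_overhead_ns)


# Hypothetical latencies (ns) for IF, ID, EX, MEM, WB.
stages = [10, 8, 10, 10, 7]
print(cycle_time(stages))         # slowest stage sets the clock
print(ideal_speedup(stages, 1))   # less than 5 because stages are unbalanced
```

The speedup falls short of the pipeline depth whenever the stages are
unbalanced or the latches add overhead.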
Basic stages of DLX
- Instruction fetch (IF): Fetch the instruction word pointed to by
the PC.
- Instruction decode/register fetch (ID): Decode the instruction
and fetch the needed registers in parallel. This is possible because the
register specifiers are at fixed locations in the instruction format. This
technique is known as fixed-field decoding. The sign-extended
immediate is also computed in parallel at this stage.
- Execution/effective address (EX): This stage uses the ALU in
different ways depending on the type of instruction being executed:
- Memory reference -- the ALU calculates the address of the reference.
- Register-register ALU Op -- the ALU performs an op on the two input registers.
- Register-Immediate ALU op -- same as above but with an immediate
- Branch -- the ALU calculates the address of the branch target, and another unit examines the branch condition register to determine whether the branch should be taken.
- Memory access/branch completion (MEM): If the instruction is a memory
op, a load or store takes place. If the instruction is a branch and the
branch is taken, the PC is replaced by the branch address calculated in the
previous stage. Otherwise, the PC is incremented.
- Write-back (WB): Write the result of an ALU instruction or a
load instruction back to the register file.
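To see how the five stages above overlap, here is a small Python sketch that
builds the usual pipeline diagram. It assumes an ideal pipeline with no
stalls, where a new instruction enters IF every cycle; the function name is
hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(num_instructions):
    """Return one row per instruction; row[c] is the stage occupied in cycle c.

    Instruction k (0-based) enters IF in cycle k, assuming no stalls.
    Empty strings mark cycles in which the instruction is not in the pipeline.
    """
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for k in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(STAGES):
            row[k + s] = name
        rows.append(row)
    return rows


for k, row in enumerate(pipeline_diagram(3)):
    print(f"i{k}: " + " ".join(f"{stage:<3}" for stage in row))
```

With n instructions and no stalls the whole sequence takes n + 4 cycles,
which is where the near one-instruction-per-cycle throughput comes from.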
Pipelining & Hazards
Pipeline registers/pipeline latches: Latches that hold
intermediate values generated by pipeline stages.
Structural hazard: A conflict over a hardware resource, such as a
functional unit, that two instructions in the pipeline need in the same
cycle.
Data hazard: An instruction depends on a previous instruction in
a way that is exposed by the overlapping of instructions in the
pipeline.
Control hazard: A hazard caused by branches and other instructions
that modify the PC.
Bubble: A pipeline stall.
Forwarding: Technique to reduce data hazard stalls in pipelines. A
result from an earlier instruction is forwarded directly to the execution
phase of a later instruction instead of waiting to write the result to the
register file.
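The effect of forwarding on stall counts can be sketched as follows. This is
a simplified model, not a description of any particular implementation. It
assumes the five-stage pipeline above, a register file that is written in
the first half of a cycle and read in the second half, and forwarding paths
into the EX stage. The function name and parameters are hypothetical.

```python
def raw_stalls(distance, producer_is_load, forwarding):
    """Stall cycles for a consumer issued `distance` instructions after its producer.

    distance=1 means the consumer immediately follows the producer.
    Without forwarding, the consumer's ID must line up with the producer's WB,
    which costs 2 stalls for back-to-back instructions (assumed split-cycle
    register file). With forwarding, an ALU result reaches the next EX with no
    stall, while a load result (available after MEM) costs 1 stall.
    """
    if forwarding:
        needed = 1 if producer_is_load else 0
    else:
        needed = 2
    # Each independent instruction in between hides one stall cycle.
    return max(0, needed - (distance - 1))


print(raw_stalls(1, producer_is_load=False, forwarding=False))
print(raw_stalls(1, producer_is_load=False, forwarding=True))
print(raw_stalls(1, producer_is_load=True, forwarding=True))
```

The load case is the one forwarding cannot fully remove, which is why the
pipeline still needs an interlock for it.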
Data Hazards
For these examples, consider two instructions i and j, where i occurs
before j in program order.
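- RAW (read after write): j tries to read a source before i writes it,
so j would get the old value.
- WAW (write after write): j tries to write an operand before it is
written by i, so the value from i would be left behind. This only happens
in pipelines that allow writing in more than one pipe stage, or in
architectures that allow reordering.
- WAR (write after read): j tries to write a destination before it is
read by i, so i would get the new value.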
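The three cases above can be checked mechanically from the register sets of
the two instructions. The sketch below is only an illustration: the function
name and the way instructions are represented (lists of register names) are
assumptions, and it reports potential hazards. Whether one actually causes a
stall depends on the pipeline.

```python
def classify_hazards(i_reads, i_writes, j_reads, j_writes):
    """Return the set of potential data hazards between i and a later j."""
    hazards = set()
    if set(i_writes) & set(j_reads):
        hazards.add("RAW")  # j reads what i writes
    if set(i_writes) & set(j_writes):
        hazards.add("WAW")  # both write the same register
    if set(i_reads) & set(j_writes):
        hazards.add("WAR")  # j overwrites what i still has to read
    return hazards


# i: ADD R1, R2, R3   (writes R1, reads R2 and R3)
# j: SUB R4, R1, R5   (writes R4, reads R1 and R5)
print(classify_hazards(["R2", "R3"], ["R1"], ["R1", "R5"], ["R4"]))
```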
Pipeline interlock: Control logic that stalls the pipeline to
avoid a data hazard.
Pipeline scheduling/instruction scheduling: The compiler
reorders instructions to reduce the number of stalls.
Basic block: A piece of code that has no control transfers except
into it at the beginning or out of it at the end.
Instruction issue: When an instruction moves from the ID stage to
the EX stage, it is said to have issued. Instructions can be
stalled before issue if data hazards exist.
Control Hazards
Branch delay: The length of a control hazard, in cycles.
Reducing pipeline branch penalties
- Freeze: Stall the pipeline until the branch target is
known.
- Predict-not-taken: Assume the branch is not taken and keep fetching
sequentially, but make sure not to modify any state until the outcome is
known. If the branch is taken, nullify the speculated instructions.
- Predict-taken: No benefit on DLX, because the target address is not
known any earlier than the branch outcome. Thus not used for DLX.
- Delayed branch: Introduce branch delay slots after every branch; the
instructions in them are executed regardless of the outcome of the branch.
Insert nops if no useful instruction can be placed there.
Cancelling branch: A delayed branch that carries a predicted
direction. If the branch is correctly predicted, the instruction(s) in
the delay slot are executed. If not correctly predicted, the
instruction(s) are nullified.
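The cost of the branch schemes above can be compared with the usual stall
accounting: effective CPI = base CPI + branch frequency * average branch
penalty. The sketch below applies that formula. The function names and all
numbers are hypothetical, chosen only to show the arithmetic.

```python
def effective_cpi(branch_freq, avg_penalty_cycles, base_cpi=1.0):
    """CPI including branch stalls, assuming branches are the only stall source."""
    return base_cpi + branch_freq * avg_penalty_cycles


def predict_not_taken_penalty(taken_fraction, taken_penalty_cycles):
    """Average penalty when only taken branches pay for nullified instructions."""
    return taken_fraction * taken_penalty_cycles


branch_freq = 0.2      # assumed: 20% of instructions are branches
branch_delay = 1       # assumed: one-cycle branch delay

freeze = effective_cpi(branch_freq, branch_delay)
not_taken = effective_cpi(
    branch_freq, predict_not_taken_penalty(0.6, branch_delay)
)
print(freeze, not_taken)
```

Freezing pays the full branch delay on every branch, while predict-not-taken
pays it only on the taken fraction, so its CPI is never worse under this
model.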
Static branch prediction: Using compile-time information to
predict the outcome of a branch. Generally two approaches are used:
prediction based on the type of branch (predict that all backward branches
will be taken and all forward branches will not be taken) or prediction
using profiled data.
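Both static approaches described above are simple enough to sketch. The code
below is an illustration with hypothetical names: the direction-based rule
compares the target address with the branch address, and the profile-based
rule picks the more frequent outcome from earlier runs. Treating a branch to
its own address as backward, and breaking ties as not taken, are assumptions
of this sketch.

```python
def predict_by_direction(branch_pc, target_pc):
    """Backward branches (typically loops) are predicted taken, forward not taken."""
    return target_pc <= branch_pc


def predict_from_profile(taken_count, not_taken_count):
    """Predict the direction seen more often in profiling runs."""
    return taken_count > not_taken_count


print(predict_by_direction(branch_pc=0x120, target_pc=0x100))  # loop back-edge
print(predict_by_direction(branch_pc=0x120, target_pc=0x140))  # forward skip
print(predict_from_profile(taken_count=90, not_taken_count=10))
```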
Exceptions
Characterizations of exceptions:
- Synchronous vs. asynchronous: An exception is synchronous if it
occurs at the same place every time the program is executed with the same
context (memory allocated, etc.). An exception is asynchronous if it is
caused by a device external to the CPU and memory (e.g. a disk
interrupt).
- User requested vs. coerced: User requested exceptions are asked for
directly by the user task and are therefore predictable (like a request
for an OS service). Coerced exceptions are caused by a hardware event not
under the control of the program.
- User maskable vs. user nonmaskable: If the exception can be
turned off by user code, it is user maskable.
- Within vs. between instructions: If the exception occurs in
the middle of executing an instruction, it is within. These types of
exceptions are usually synchronous because the instruction they are within
caused the exception. An example of a within exception is a page fault
during the MEM stage. Otherwise the exception is recognized between
instructions.
- Resume vs. terminate: If the exception always causes the program to
stop, it is a terminate exception. Otherwise, it is a resume
exception.
Restartable: The pipeline is able to handle the exception, save
its state, and restart the program without affecting the execution of the
program.
Precise exceptions: The pipeline can be stopped so that the
instructions just before the faulting instruction are completed and the
faulting instruction and those after it can be restarted from scratch.
Integer pipelines are usually precise in modern processors, while FP
pipelines can switch between precise (slow) and imprecise (fast) modes.
Extending DLX to include FP operations
Latency: The number of cycles before the result of one
instruction is available to another instruction.
Repeat interval: The number of cycles that must elapse between
issuing two operations of the same type.
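The two definitions above answer different questions, which a short sketch
can make concrete. It assumes the convention that latency counts the
intervening cycles, so a latency of 0 lets a dependent instruction issue in
the very next cycle. The function names and the unit parameters in the
example are hypothetical.

```python
def issue_cycles(num_ops, repeat_interval, start_cycle=0):
    """Earliest issue cycles for back-to-back operations of the same type."""
    return [start_cycle + k * repeat_interval for k in range(num_ops)]


def earliest_dependent_issue(producer_issue_cycle, latency):
    """Earliest cycle a consumer of the result can issue (assumed convention)."""
    return producer_issue_cycle + 1 + latency


# Hypothetical units: a fully pipelined adder and an unpipelined divider.
print(issue_cycles(3, repeat_interval=1))    # a new add can issue every cycle
print(issue_cycles(3, repeat_interval=25))   # divides must wait for the unit
print(earliest_dependent_issue(0, latency=3))
```

A fully pipelined unit can have a long latency and still a repeat interval
of 1, while an unpipelined unit has a repeat interval close to its latency.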