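RISC-V is designed for pipelining: all instructions are 32 bits wide and there are only a few instruction formats (4 core formats)
Parallel processing of stages
CPI is almost 1: in every cycle all 5 stages are busy, each working on a different instruction, so one instruction finishes per cycle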
Slowest stage still determines the clock speed
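A rough sketch of the two points above, assuming an ideal 5-stage pipeline with no stalls. The function names and the stage delays are made up for illustration: n instructions need n + 4 cycles (4 cycles to fill the pipeline), and the clock period is set by the slowest stage.

```python
def pipeline_cycles(n_instructions, n_stages=5):
    """Cycles for an ideal pipeline with no stalls: fill time plus one cycle per instruction."""
    return n_instructions + (n_stages - 1)


def cpi(n_instructions, n_stages=5):
    """Average cycles per instruction for the ideal pipeline."""
    return pipeline_cycles(n_instructions, n_stages) / n_instructions


# Hypothetical stage delays in picoseconds (illustrative values only).
stage_delay_ps = {"F": 200, "D": 100, "X": 200, "M": 250, "W": 100}
clock_period_ps = max(stage_delay_ps.values())  # the slowest stage sets the clock

print(cpi(5))           # 1.8
print(cpi(1000))        # 1.004
print(clock_period_ps)  # 250
```

The longer the program runs without stalls, the closer CPI gets to 1.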
Requires flip-flops (pipeline registers) between the stages for isolation
Fetch - Decode: Store the instruction bits to be decoded
Decode - Execute: Store control information, Rd index, immediate, offsets, register values (Ra, Rb)
Execute - Memory Access: Store control information, … also the ALU result and, for a store instruction, the value to be stored
Memory - Writeback: Store control information, … also the result of a load, and pass along the result from execute
1. Structural hazards:
When multiple instructions compete for the same hardware resource in the datapath at the same time. This creates a bottleneck and disrupts the smooth flow of instructions.
e.g. several instructions reading from and writing to memory or the register file at the same time
Time share: have 2 independent read ports and 1 write port, like in the Register File
Replication: have a separate adder for computing the jump target of the PC
Split: memory, more on this later
2. Data hazards:
An instruction needs data before the previous instruction has finished producing it
In the pipeline diagram, if the dependency arrow points forward or straight down, it's OK
But an arrow pointing backward denotes a dependency on data that is not ready yet
Forwarding: allow bypassing (pass a result directly to the instruction that needs it instead of waiting for writeback)
Forward path MX: take the value from the pipeline register between execute and memory access and route it to input A and/or B of the execute stage → needs an extra controller to select when to bypass and which path to use
Forward path WX: take the value from the pipeline register between memory access and writeback and route it to input A and/or B of the execute stage → needs an extra controller too
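A minimal sketch of what the extra controller might decide for one ALU input, assuming each of the two pipeline registers exposes its destination register index and a "writes a register" flag. The function and parameter names are hypothetical, not from the notes. MX is checked first because it holds the newer value.

```python
def forward_select(src_reg, mx_rd, mx_writes, wx_rd, wx_writes):
    """Pick the source for one ALU input (A or B) in the execute stage.

    Returns "MX", "WX", or "REG" (use the value read from the register file).
    """
    if src_reg == 0:
        return "REG"  # x0 is hard-wired to zero in RISC-V, so never forward it
    if mx_writes and mx_rd == src_reg:
        return "MX"   # result of the instruction one ahead (newest value)
    if wx_writes and wx_rd == src_reg:
        return "WX"   # result of the instruction two ahead
    return "REG"


# add x1, x2, x3 followed by add x4, x1, x5:
# when the second add is in execute, the first is in memory access.
print(forward_select(src_reg=1, mx_rd=1, mx_writes=True, wx_rd=0, wx_writes=False))  # MX
```

The same function would be evaluated twice per cycle, once for input A and once for input B.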
However, forwarding doesn't solve the Killer hazard:
A load followed immediately by a use of the loaded value: the data is only available after it has been read from memory
But by the time the memory access is done, the next instruction, which uses the value, should already have been executed
We could fetch later instructions that don't have the dependence instead, but this is not a good approach
Solution to Killer hazard:
Pause (stall) the current and subsequent instructions until it is safe
ex: 1. lb x1, 4(x2) and 2. ori x3, x1, 1 →
1: F1
2: D1, F2
3: X1, D2* (stall), F3* (the subsequent instruction stalls too)
4: M1, D2, F3,…
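The stall rule can be sketched as below. This is a simplified model under stated assumptions, not a full pipeline simulator: it only tracks the first cycle each instruction spends in F, D and X, it assumes forwarding handles every other dependency, and the third instruction in the example (an add with no dependence) is invented for illustration. All names are hypothetical.

```python
def must_stall(x_is_load, x_rd, d_sources):
    """True if the instruction in decode uses the register being loaded by the one in execute."""
    return x_is_load and x_rd != 0 and x_rd in d_sources


def schedule(program):
    """program: list of (is_load, rd, sources).

    Returns, per instruction, the first cycle (1-based) it spends in F, D and X.
    """
    rows = []
    for i, (is_load, rd, sources) in enumerate(program):
        if i == 0:
            f, d, x = 1, 2, 3
        else:
            prev = rows[-1]
            prev_is_load, prev_rd, _ = program[i - 1]
            f = prev["D"]       # enters fetch when the previous instruction enters decode
            d = prev["X"]       # enters decode when the previous instruction enters execute
            x = prev["X"] + 1
            if must_stall(prev_is_load, prev_rd, sources):
                x += 1          # wait one extra cycle in decode
        rows.append({"F": f, "D": d, "X": x})
    return rows


program = [
    (True, 1, (2,)),     # lb  x1, 4(x2)
    (False, 3, (1,)),    # ori x3, x1, 1   (uses x1 right after the load)
    (False, 4, (5, 6)),  # add x4, x5, x6  (assumed third instruction)
]
rows = schedule(program)
for n, row in enumerate(rows, start=1):
    print(n, row)
```

Instruction 2 enters D in cycle 3 but X only in cycle 5, and instruction 3 sits in F during cycles 3 and 4, matching the table above.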
3. Control hazards:
The next instruction can't be determined until the previous control instruction is done processing
e.g. if, for, while,… there are 2 possible next instructions
After an “if”, we have already fetched and decoded the next 2 instructions (always 2). If executing the “if” gives a different outcome from the fetched instructions (jump or not jump), we zap: clear out those 2 instructions and fetch the other 2
To zap, set the pipeline registers in between the stages to 0
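A minimal sketch of the zap, assuming the pipeline keeps fetching sequentially after a branch and that the branch is resolved in the execute stage. The dictionary layout and field names are hypothetical.

```python
def resolve_branch(pipeline, taken, target_pc):
    """Called when a branch reaches the execute stage.

    pipeline: dict with "pc", "fd" (Fetch-Decode register) and
    "dx" (Decode-Execute register), each register being a dict of fields.
    """
    if taken:
        # The 2 instructions fetched after the branch are on the wrong path:
        # zero their pipeline registers and redirect the fetch.
        pipeline["fd"] = {field: 0 for field in pipeline["fd"]}
        pipeline["dx"] = {field: 0 for field in pipeline["dx"]}
        pipeline["pc"] = target_pc
    return pipeline


state = {"pc": 12, "fd": {"insn_bits": 0x123}, "dx": {"control": 5, "rd": 3}}
state = resolve_branch(state, taken=True, target_pc=100)
print(state)  # {'pc': 100, 'fd': {'insn_bits': 0}, 'dx': {'control': 0, 'rd': 0}}
```

Each zap costs the 2 cycles spent on the discarded instructions.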
Solution: branch prediction
Simple prediction: taken or not taken?
Use 8 bits of the PC to look up a 1-bit prediction for the next branch
Zap the pipeline if the prediction is wrong
A slightly more sophisticated prediction: strong take, take, not take, strong not take
Use 8 bits of the PC to look up a 2-bit prediction for the next branch
Zap the pipeline if the prediction is wrong
If the prediction is correct, become more sure; if it is wrong, become less sure or change the prediction
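The 2-bit scheme can be sketched as a table of saturating counters. Several details here are assumptions, since the notes do not specify them: which 8 bits of the PC are used (this sketch drops the 2 low bits and keeps the next 8), the initial state of every entry ("not take"), and all class and method names.

```python
class TwoBitPredictor:
    # Counter states: 0 = strong not take, 1 = not take, 2 = take, 3 = strong take

    def __init__(self, index_bits=8):
        self.mask = (1 << index_bits) - 1
        self.table = [1] * (1 << index_bits)  # assumed initial state: "not take"

    def _index(self, pc):
        # Assumption: skip the 2 low bits (instructions are 4 bytes), keep the next 8.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        """True means "predict taken"."""
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        """Move one step toward the actual outcome, saturating at 0 and 3."""
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


# A loop branch that is taken 9 times and then falls through, run 3 times.
bp = TwoBitPredictor()
pc = 0x40
mispredictions = 0
for _ in range(3):
    for taken in [True] * 9 + [False]:
        if bp.predict(pc) != taken:
            mispredictions += 1  # the pipeline would be zapped here
        bp.update(pc, taken)
print(mispredictions)  # 4
```

After the first pass the counter only drops from "strong take" to "take" at the loop exit, so the next pass still starts with a correct prediction. A 1-bit predictor would flip at every loop exit and mispredict twice per pass.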