Description:
- The RISC-V ISA is split into multiple specifications:
- User-level (unprivileged) ISA spec
- Compressed ISA spec (16-bit instructions)
- Privileged ISA spec (supervisor-mode instructions)
 
- ISA support is named as RV + word width + supported extensions
- RV32I means 32-bit with the base integer instruction set I
- RV64F means 64-bit with the single-precision floating-point extension F (written in full as RV64IF, since the base I is mandatory)
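
To make the naming scheme concrete, here is a minimal sketch that splits such a name into its word width and extension letters. The helper `parse_isa` is hypothetical and only handles single-letter extensions; real ISA strings can also carry multi-letter extensions and version numbers, which this ignores.

```python
import re

def parse_isa(name: str) -> tuple[int, list[str]]:
    """Split a simple ISA name such as 'RV32I' into (word width, extension letters)."""
    match = re.fullmatch(r"RV(32|64|128)([A-Z]+)", name.upper())
    if match is None:
        raise ValueError(f"not a recognised ISA name: {name!r}")
    return int(match.group(1)), list(match.group(2))

print(parse_isa("RV32I"))      # (32, ['I'])
print(parse_isa("RV64IMAFD"))  # (64, ['I', 'M', 'A', 'F', 'D'])
```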
 
- Register fields are 5 bits wide → 2^5 = 32 addressable registers
User-level ISA:
- A mandatory base integer ISA (core extension) I
- Standard extensions:
- M: Integer multiplication and division
- A: Atomic Instructions
- F: Single-precision floating-point
- D: Double-precision floating-point
- C: Compressed instructions (16 bits)
 
RISC-V Includes:
- 32 integer registers, x0-x31, each 32 bits wide in RV32
  - x0 is always 0
  - x1 holds the return address on a call

- 32 floating-point registers, f0-f31 (from the F extension)
  - Each can hold a single- or double-precision floating-point value (32- or 64-bit IEEE FP; 64-bit values need the D extension)

- The control unit decides which registers are read from the register file and which register is written after the ALU has processed the operands; the result can optionally go through memory (loads and stores)
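
The register rules above can be sketched in a few lines. This is an illustrative model only; the class name `RegisterFile` and its methods are made up for this example, and it assumes the 32-bit (RV32) register width.

```python
class RegisterFile:
    """32 integer registers of 32 bits each; x0 always reads as zero."""

    def __init__(self) -> None:
        self._regs = [0] * 32

    def read(self, index: int) -> int:
        return self._regs[index]

    def write(self, index: int, value: int) -> None:
        if index != 0:  # writes to x0 are discarded
            self._regs[index] = value & 0xFFFFFFFF

regs = RegisterFile()
regs.write(0, 123)    # ignored: x0 stays 0
regs.write(1, 0x400)  # e.g. a return address kept in x1
print(regs.read(0), hex(regs.read(1)))  # 0 0x400
```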
 
- Each instruction type has an opcode field that identifies which instruction to execute
- Arithmetic/logical:
  - R-type: destination register and two source registers (register shifts take the shift amount from rs2)
  - I-type: destination register, one source register, and a 12-bit sign-extended immediate (immediate shifts take the shift amount from the low bits of the immediate)
  - U-type: destination register and a 20-bit immediate that fills the upper 20 bits of the register

- Memory access:
  - I-type for loads and S-type for stores
  - Load/store between registers and memory
  - Word, half-word and byte operations

- Control flow:
  - B-type (a variant of S-type): conditional branches to PC-relative addresses
  - J-type (a variant of U-type): jump-and-link
  - I-type: jump-and-link register

 
R-type instruction:
- Arithmetic and logic instructions
- Arithmetic shift right (SRA) sign-extends the result; logical shift right (SRL) fills with zeros
- Source registers rs1 and rs2 are read and fed to the ALU, and the result is written back to rd in the register file
| Field | funct7 | rs2 | rs1 | funct3 | rd | opcode | 
|---|---|---|---|---|---|---|
| Number of bits | 7 | 5 | 5 | 3 | 5 | 7 | 
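
As a rough illustration of the layout in the table, the sketch below packs the six fields into one 32-bit word, with funct7 in the top bits and the opcode in the bottom bits. The helper name `encode_r` is hypothetical. The values used for `add` (opcode 0x33, funct3 0, funct7 0) are the standard RV32I ones.

```python
def encode_r(funct7: int, rs2: int, rs1: int, funct3: int, rd: int, opcode: int) -> int:
    """Pack the six R-type fields into one 32-bit instruction word."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

# add x12, x20, x21
word = encode_r(0b0000000, 21, 20, 0b000, 12, 0x33)
print(f"{word:#010x}")  # 0x015a0633
```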
I-type instruction:
- Immediate-value instructions
- Used for adding an immediate value to a register value (e.g. ADDI)
- Also used to load data from memory
- ex: LB rd, imm(rs1): R[rd] = Sign_ext(Mem[imm + R[rs1]])
  - meaning: load the byte at address R[rs1] + imm, then sign-extend it to the register width
  - R[rs1] is the base address
  - the ALU adds it to imm, the byte is read from memory at that address, then written to the register file

- Has load byte (LB), load half-word (LH), load word (LW)

- ex: 
  - Load byte:
    - lb dst, off(base) => R[dst] = Sign_ext(M[R[base] + off])
    - the store counterpart is sw src, off(base) => M[R[base] + off] = R[src]
    - a 32-bit register holds 4 bytes, so a loaded byte is sign-extended to fill it
    - lb x2, 5(x8) means: read the byte at address x8 + 5, sign-extend it, and write it to x2

 
| Field | imm[11:0] | rs1 | funct3 | rd | opcode | 
|---|---|---|---|---|---|
| Number of bits | 12 | 5 | 3 | 5 | 7 | 
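
The I-type layout and the sign extension used by loads can be sketched as follows. The helper names `sign_extend` and `encode_i` are hypothetical. The opcode and funct3 values for `lb` (0x03, 0) are the standard RV32I ones.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def encode_i(imm: int, rs1: int, funct3: int, rd: int, opcode: int) -> int:
    """Pack the I-type fields; imm must fit in 12 signed bits."""
    if not -2048 <= imm <= 2047:
        raise ValueError("I-type immediate out of range")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

# lb x2, 5(x8)
word = encode_i(5, 8, 0b000, 2, 0x03)
print(f"{word:#010x}")              # 0x00540103
print(sign_extend(word >> 20, 12))  # 5, the immediate read back out
print(sign_extend(0x80, 8))         # -128, what lb does to a loaded byte 0x80
```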
S-type instruction:
- Store instruction
- Used for storing data into the memory
- ex: SB rs2, imm(rs1): Mem[imm + R[rs1]] = R[rs2][7:0]
  - take the low byte of register rs2 and store it in memory at address R[rs1] + imm

- Has store byte (SB), store half-word (SH), store word (SW)
| Field | imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode | 
|---|---|---|---|---|---|---|
| Number of bits | 7 | 5 | 5 | 3 | 5 | 7 | 
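
A store has no destination register, so the 12-bit immediate is split into two pieces that sit where funct7 and rd are in an R-type word. A minimal sketch, with `encode_s` as a hypothetical helper and the standard store opcode 0x23 as its default:

```python
def encode_s(imm: int, rs2: int, rs1: int, funct3: int, opcode: int = 0x23) -> int:
    """Pack the S-type fields; the 12-bit immediate is split into two pieces."""
    if not -2048 <= imm <= 2047:
        raise ValueError("S-type immediate out of range")
    imm &= 0xFFF
    imm_hi = imm >> 5    # imm[11:5] goes to bits 31..25
    imm_lo = imm & 0x1F  # imm[4:0] goes to bits 11..7
    return (imm_hi << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (imm_lo << 7) | opcode

# sw x1, 4(x2): funct3 is 0b010 for a word store
print(f"{encode_s(4, 1, 2, 0b010):#010x}")  # 0x00112223
```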
U-type instruction:
- Load upper immediate
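- I-type add-immediate can only supply 12 bits
- LUI fills the upper 20 bits of a 32-bit register (the lower 12 bits are zero), so a following ADDI can supply the remaining 12 bits
- Typical usage:
  - LUI x5, 0xbeef1 - stores 0xbeef1000 in x5
  - ADDI x5, x5, 0x123 - adds the immediate to x5, now x5 is 0xbeef1123
  - So we can use x5 as a full 32-bit value in ADD
  - The ADDI immediate is sign-extended, so when bit 11 of the low part is set, the LUI value must be one higher to compensate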
 
| Field | imm[31:12] | rd | opcode | 
|---|---|---|---|
| Number of bits | 20 | 5 | 7 | 
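
Splitting a 32-bit constant into a LUI part and an ADDI part takes a little care, because ADDI sign-extends its 12-bit immediate. The sketch below shows one way to do the split. The helper `split_upper_lower` is hypothetical, not part of any toolchain.

```python
def split_upper_lower(value: int) -> tuple[int, int]:
    """Return (upper20, lower12) so that (upper20 << 12) + lower12 == value mod 2**32."""
    value &= 0xFFFFFFFF
    lower = value & 0xFFF
    if lower >= 0x800:  # ADDI would sign-extend this to a negative number
        lower -= 0x1000
    upper = ((value - lower) >> 12) & 0xFFFFF
    return upper, lower

for target in (0xBEEF1123, 0xDEADBEEF):
    upper, lower = split_upper_lower(target)
    rebuilt = ((upper << 12) + lower) & 0xFFFFFFFF
    print(f"LUI {upper:#x}; ADDI {lower} -> {rebuilt:#x}")
```

For 0xDEADBEEF the low part 0xEEF has bit 11 set, so the split uses upper 0xDEADC with a negative ADDI immediate of -273.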
B-type instruction:
- Branching instructions
- ex: BEQ rs1, rs2, L1 : if R[rs1] == R[rs2] then jump to instruction labeled L1
- then PC = address of L1 instruction
 
- if not, PC = PC + 4
- next instruction, as 1 instruction is 32/8= 4 bytes
 
 
- ex: if-else with branches
		bne x2, x3, Else    # conditional
		add x19, x20, x21
		beq x0, x0, Exit    # unconditional
Else:   sub x19, x20, x21
Exit:   addi x19, x19, 1
- if x2 == x3, add x20 and x21; otherwise subtract x21 from x20; then add 1 to the result in x19
- Always put an unconditional jump at the end of the if-branch to skip the else-branch
- In the encoded instruction, labels become immediates that hold PC-relative byte offsets
- example: the label Else is the immediate 12, as bne jumps forward 12 bytes (3 instructions)
- and the label Exit is the immediate 8, as beq jumps forward 8 bytes (2 instructions)
 
- The Program Counter needs to know whether to add 4 or the offset, so a multiplexer, with its select bit driven by the ALU comparison result, chooses between PC + 4 and PC + offset
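
The offsets and the PC multiplexer can be checked with a small sketch. It assumes, purely for illustration, that the first instruction of the if-else example sits at address 0. The names `branch_offset` and `next_pc` are made up for this example.

```python
# Byte addresses of the five instructions in the if-else example,
# assuming (hypothetically) that the first one sits at address 0.
program = ["bne", "add", "beq", "sub", "addi"]
address = {index: 4 * index for index in range(len(program))}
labels = {"Else": address[3], "Exit": address[4]}

def branch_offset(branch_index: int, label: str) -> int:
    """PC-relative byte offset from a branch instruction to its target label."""
    return labels[label] - address[branch_index]

def next_pc(pc: int, taken: bool, offset: int) -> int:
    """The multiplexer: PC + offset when the branch is taken, else PC + 4."""
    return pc + offset if taken else pc + 4

print(branch_offset(0, "Else"))  # 12
print(branch_offset(2, "Exit"))  # 8
print(next_pc(0, taken=True, offset=12), next_pc(0, taken=False, offset=12))  # 12 4
```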
Procedure calls:
- Jump-and-link instruction
  - J-type (a variant of U-type)
  - jal rd, offset
    - R[rd] = PC + 4: the address of the following instruction is saved (in x1 by convention) so execution can continue there after the procedure
    - PC = PC + (imm << 1): PC = address of ProcedureLabel (jumps to the target address)
    - The procedure is offset bytes away from the current PC
    - offset = 2 × imm (doubles the jump range, since instruction addresses are never odd)
    - × 4 would reach further, but × 2 is needed to support 16-bit compressed instructions

- `jalr x0, 0(x1)`: branch back to caller
  - Like jal, but jumps to 0 + the address in x1 (where the JAL stored our return location!)
  - Uses x0 as rd because the return address is not needed (writes to x0 are discarded)
  - JALR rd, rs1, offset
    - R[rd] = PC + 4
    - PC = (R[rs1] + imm) & ~1 (the lowest bit of the target is cleared)
 
 
| Field | imm[20:1] | rd | opcode | 
|---|---|---|---|
| Number of bits | 20 | 5 | 7 | 
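
The call and return semantics can be sketched as two small functions. This is a simplified model: the addresses are hypothetical, `imm` for `jal` counts 2-byte units as described in the notes, and the functions return the link value instead of writing a register.

```python
def jal(pc: int, imm: int) -> tuple[int, int]:
    """Return (link value for rd, new PC); imm counts 2-byte units."""
    return pc + 4, pc + (imm << 1)

def jalr(pc: int, rs1_value: int, imm: int) -> tuple[int, int]:
    """Return (link value for rd, new PC); the lowest target bit is cleared."""
    return pc + 4, (rs1_value + imm) & ~1

# Call a procedure 0x40 bytes ahead of a jal at (hypothetical) address 0x100
link, pc = jal(0x100, 0x40 >> 1)
print(hex(link), hex(pc))  # 0x104 0x140

# Return with jalr x0, 0(x1); the link value meant for x0 is discarded
_, pc = jalr(pc, link, 0)
print(hex(pc))             # 0x104
```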
Little-endian order:
- The least significant byte (LSB) is stored first, at the lowest address, followed by the more significant bytes
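
A quick way to see the byte order is to lay out a 32-bit value with Python's built-in `int.to_bytes`. The value 0x12345678 is an arbitrary example.

```python
value = 0x12345678
stored = value.to_bytes(4, byteorder="little")  # index 0 is the lowest address
print(stored.hex(" "))                          # 78 56 34 12
print(hex(int.from_bytes(stored, "little")))    # 0x12345678
```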
Stages:
- Instruction Fetch
- Use PC to fetch current instruction, increment PC
 
- Instruction Decode
- Decode the instruction, generate control signals, read the register file, ...
 
- Execution (ALU)
- Compute arithmetic/logic results, memory addresses, and branch targets
 
- Memory Access
- Use the ALU result as the address to read data from (or write data to) memory
 
- Register Writeback
- Write value to Register File
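
The five stages can be traced for a single instruction with a short sketch. It models only the R-type `add`, uses a dictionary as a stand-in instruction memory, and the function name `run_one_add` is hypothetical. It is a walk-through of the stage order, not a processor model.

```python
def run_one_add(pc: int, imem: dict[int, int], regs: list[int]) -> int:
    """Walk one R-type `add` through the five stages and return the next PC."""
    # Instruction Fetch: read the word at PC and compute PC + 4
    word = imem[pc]
    next_pc = pc + 4
    # Instruction Decode: slice the fields and read the register file
    opcode, funct3, funct7 = word & 0x7F, (word >> 12) & 0x7, (word >> 25) & 0x7F
    if (opcode, funct3, funct7) != (0x33, 0, 0):
        raise NotImplementedError("this sketch only models add")
    rd, rs1, rs2 = (word >> 7) & 0x1F, (word >> 15) & 0x1F, (word >> 20) & 0x1F
    a, b = regs[rs1], regs[rs2]
    # Execution: the ALU adds the two operands (wrapping to 32 bits)
    result = (a + b) & 0xFFFFFFFF
    # Memory Access: nothing to do for add
    # Register Writeback: x0 stays zero
    if rd != 0:
        regs[rd] = result
    return next_pc

regs = [0] * 32
regs[20], regs[21] = 7, 5
pc = run_one_add(0, {0: 0x015A0633}, regs)  # add x12, x20, x21
print(pc, regs[12])  # 4 12
```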
 
Single-cycle:
- Each cycle processes a whole instruction → only 1 stage is doing useful work at any moment → wasteful
- CPI (cycles per instruction) = 1
- 1 cycle is one long cycle covering a whole instruction
 
- Slowest instruction determines the clock speed
Multi-cycle:
- Break 1 instruction into 4 or 5 cycles (some instructions don't have writeback)
- CPI is between 4 and 5, approximately 4.5, depending on the instruction mix
- 1 cycle is 1 stage of the instruction
 
- Slowest stage determines the clock speed
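
The average CPI of a multi-cycle design is a weighted sum over the instruction mix. The sketch below uses a made-up mix (half the instructions take 5 cycles, half take 4) only to show how a figure like 4.5 can arise. The real mix depends on the program.

```python
def average_cpi(mix: dict[int, float]) -> float:
    """mix maps a cycle count to the fraction of instructions needing that many cycles."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return sum(cycles * fraction for cycles, fraction in mix.items())

# Hypothetical mix: half the instructions need 5 cycles, half skip a stage and need 4
print(average_cpi({5: 0.5, 4: 0.5}))  # 4.5
```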
RISC-V Data Pipelining
RISC-V Calling Convention
Pseudo-instructions:
| Pseudo-insn | Actual | 
|---|---|
| NOP | ADDI x0, x0, 0 | 
| MV rd, rs | ADDI rd, rs, 0 | 
| LI reg, 0x45678 | LUI reg, 0x45; ADDI reg, reg, 0x678 | 
| J offset | JAL x0, offset | 
| JR rs | JALR x0, rs, 0 | 
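
How an assembler might rewrite these pseudo-instructions can be sketched with a small expander. The function `expand` is hypothetical and covers only the five rows of the table. It expands `mv` as `addi rd, rs, 0` (equivalent to an add with x0) and splits `li` constants into a 20-bit part for LUI and a sign-extended 12-bit part for ADDI.

```python
def expand(pseudo: str) -> list[str]:
    """Expand a few pseudo-instructions into base instructions (illustrative subset)."""
    op, _, rest = pseudo.strip().partition(" ")
    args = [arg.strip() for arg in rest.split(",")] if rest else []
    op = op.lower()
    if op == "nop":
        return ["addi x0, x0, 0"]
    if op == "mv":
        return [f"addi {args[0]}, {args[1]}, 0"]
    if op == "j":
        return [f"jal x0, {args[0]}"]
    if op == "jr":
        return [f"jalr x0, {args[0]}, 0"]
    if op == "li":
        value = int(args[1], 0) & 0xFFFFFFFF
        lower = value & 0xFFF
        if lower >= 0x800:  # ADDI sign-extends, so borrow from the upper part
            lower -= 0x1000
        upper = ((value - lower) >> 12) & 0xFFFFF
        if upper == 0:
            return [f"addi {args[0]}, x0, {lower}"]
        return [f"lui {args[0]}, {upper:#x}", f"addi {args[0]}, {args[0]}, {lower}"]
    raise ValueError(f"unsupported pseudo-instruction: {op}")

print(expand("li t0, 0x45678"))  # ['lui t0, 0x45', 'addi t0, t0, 1656']
print(expand("jr ra"))           # ['jalr x0, ra, 0']
```

Here 1656 is 0x678 in decimal, so the pair rebuilds 0x45000 + 0x678 = 0x45678.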
Atomic Instructions:
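- LR.W rd, (rs1): load reserved; loads the word at address rs1 into rd and registers a reservation on that address so that later writes to it can be detected
- SC.W rd, rs2, (rs1): store conditional
  - if the reservation is still valid (the location has not been written since the LR.W), success: the store of rs2 is performed and 0 is returned in rd
  - if the reservation was lost, failure: nothing is stored and a nonzero value (e.g. 1) is returned in rd; retry the load and store later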
 
- Example:
cas:
	lr.w t0, (a0)        # Load original value.
	bne t0, a1, fail     # Doesn't match, so fail.
	sc.w t0, a2, (a0)    # Try to update.
	bnez t0, cas         # Retry if store-conditional failed.
	li a0, 0             # Set return to success.
	jr ra                # Return.
fail:
	li a0, 1             # Set return to failure.
	jr ra                # Return.