Environment
After this part, you will have the following things installed:
- C compiler for RISC-V:
riscv64-unknown-linux-musl-gcc
- RISC-V ISA simulator:
spike
- RISC-V proxy OS kernel:
pk
Install from provided binary (x86_64 Linux)
wget --no-check-certificate \
https://www.cse.cuhk.edu.hk/~jlhu/csci2510/riscv-linux-amd64.tar.zst
sudo tar axvf riscv-linux-amd64.tar.zst -C /opt
sudo chown -R $(whoami) /opt/riscv
Install from provided binary (macOS with Apple Silicon)
brew install gun-tar zstd
wget --no-check-certificate \
https://www.cse.cuhk.edu.hk/~jlhu/csci2510/riscv-darwin-arm64.tar.zst
sudo gtar axvf riscv-darwin-arm64.tar.zst -C /opt
sudo chown -R $(whoami) /opt/riscv
Install from source code (both macOS and Linux with any CPU):
## install dependencies
# macOS
brew install dtc gnu-tar gnu-sed gawk flock texinfo gmp isl libmpc mpfr zstd boost
# ubuntu
sudo apt install autoconf automake autotools-dev curl python3 python3-pip \
libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex \
texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev \
ninja-build git cmake libglib2.0-dev libboost-all-dev \
device-tree-compiler
## prepare directory and permission
sudo mkdir -p /opt/riscv
sudo chown -R $(whoami) /opt/riscv
## download the source code
# option 1: download and decompress the provided tarball
gtar axvf riscv-gnu-toolchain.tar.zst
# option 2: GitHub (might take a long while depending on network conditions)
git clone --recurse-submodules https://github.com/riscv-collab/riscv-gnu-toolchain
## install musl gcc toolchain
cd riscv-gnu-toolchain
./configure --prefix=/opt/riscv
# this step will take about 30min on a 10-core M1 Pro
make musl -j$(nrpoc)
# check if installed successfully
/opt/riscv/bin/riscv64-unknown-linux-musl-gcc -v -E -x c
## spike
cd spike
mkdir -p build
cd build
../configure --prefix=/opt/riscv
make install -j$(nproc)
# check if installed successfully
file /opt/riscv/bin/spike
## pk
cd ../../pk
mkdir -p build
cd build
env PATH="$PATH:/opt/riscv/bin" RISCV=/opt/riscv ../configure --prefix=/opt/riscv --host=riscv64-unknown-linux-musl --with-arch=rv64gc_zifencei
env PATH="$PATH:/opt/riscv/bin" RISCV=/opt/riscv make install -j$(nproc)
# check if installed successfully
file /opt/riscv/riscv64-unknown-linux-musl/bin/pk
Install from package manager (macOS) (not recommended)
This way you only get gcc
with newlib
libc instead of the preferred musl
.
You might need to change riscv64-unknown-linux-musl-*
command accordingly.
brew tap riscv-software-src/riscv
brew install riscv-tools riscv-isa-sim riscv-pk
Compile, assemble and link
As introduced during the lecture, transforming a picece of human readable code into an executable program takes 3 steps, i.e. compile
, assemble
and link
.
In this part, you will walk through this process by manually finish these steps and finally compile hello.c
into an executable.
You should read hello.c
first, it is a simple program to print a hello message.
Before proceeding to the following tasks, please execute a make clean
under the source code directory to ensure we are starting with a clean tree.
Compile
Under the given source code tree, please type make compile
.
And watch out what command is been executed as well as what file is been generated.
Command you saw:
/opt/riscv/bin/riscv64-unknown-linux-musl-gcc -S -O0 -o hello.s hello.c
File generated:
hello.s
Assemble
The file you just saw is an assembly source code, please open it and get a feel of compiler generated assembly.
Next, we will assemble this file. Please type make assemble
and record what command is actually been executed as well as what new file is been created.
Command you saw:
/opt/riscv/bin/riscv64-unknown-linux-musl-gcc -c -o hello.o hello.s
File generated:
hello.o
Link
The file you just saw is an object file, which is an binary file.
To show a summary about the content, please type the following command:
make dump-<file>
.
You should replace <file>
with the created file last step.
In hello.c
we have encountered two functions, they are main
and printf
.
Please find them in the symbol table and paste the related two lines here:
Symbol table lines containing main
and printf
:
0000000000000000 g F .text 0000000000000044 main
0000000000000000 *UND* 0000000000000000 printf
You might find out that the row for main
contains F
and .text
,
they means main
is a function in the program text section.
However, the row for printf
will give you *UND*
which means undefined.
That's because printf
is provided by the C standard library.
And it's the linker's job to link the actuall code into the program which uses it.
Please type make link
and record what command is actually been executed as well as what new file is been created.
Command you saw:
/opt/riscv/bin/riscv64-unknown-linux-musl-gcc -static -no-pie -o hello hello.o
File generated:
hello
Executable file
The file you just saw is the final artifact of the overall compiling process, which are ready to be loaded into memory and executed.
It's also an binary readable by objdump
,
please try using make dump-<file>
again to see what's inside.
You might find the results overwhelmingly long,
you can use make dump-<file> | grep <keyword>
to filter out the desired info.
Please also find the related rows describing main
and printf
in the Symbol table.
Symbol table lines containing main
and printf
:
0000000000010490 g F .text 000000000000002c printf
0000000000010236 g F .text 000000000000003e main
You might find out that now both main
and printf
are both a function of the text section.
Loading and memory layout
Execute an executable is essentially loading that executable file into memory then jump to the entry point in the .text
section.
As you can see in the figure 1, the memory layout is mainly composed of text, data, stack, heap and shared libraries.
We will not discussing heap (which is used for dynamic memory management) and shared libraries (which are used for dynamic linking).
You have already seen .text
section in the executable file, it's loaded into the text region in memory.
Data region contians several data sections in the executable, such as .data
, .sdata
, .rodata
and etc.
Can you name one or more symbols that are both defined in hello.c
and loaded into data region?
csci
year
As you have probably noticed, data region mainly contains global variables.
For function-scope local variables, they are stored in the stack.
Stack grows towards lower address, i.e. when reserving additional stack space, the stack pointer sp
will be decremented.
Can you name one or more symbols that are local variable and their offset to the stack pointer in main
?
You should use the format <variable> <offset>
for each symbol, for multiple symbols, please write each one in a different row.
Symbols and their offset:
a 8
b 0
Figure extracted from here. Entry point of text section might be different.
Now we have walked though the entire process of turning C code into exectuable and running them.
Please type make load-hello
to see the results.
Output of the program:
Hello, CSCI257!
Oops! It seems the output is not exactally correct.
Can you change exactally one line of code in the generated assembly file to make it output Hello, CSCI2510!
?
After change the assembly code, you can type make load-hello
again to see the results.
If the assembly file is deleted, you can type make assemble
to generate it again.
The assembly code that needs to be changed:
add a5,a4,a5
The modified code:
mul a5,a4,a5
You also need to submit the modified assembly file.
Remember to save it before running make clean
.
Note for debugging
You can use make debug-hello
to run the executable under spike
's debug mode.
You can skip everything before entering main
by typing until pc 0 <value-of-main>
.
<value-of-main>
can be obtained from the first column of the objdump
's output, like here.
In the aboved case, we can use until pc 0 0x10236
Function call
In this part, you will learn how exactally does the RISC-V architecture support function calls like f(x, y)
by looking closely at the RISC-V calling convention.
Jump & link vs call
You might have seen call
occurred in the generated assembly code.
However, standard RISC-V instruction set does not include it.
RISC-V only defines two J&L instructions jal rd,imm
and jalr rd,rs1,imm
.
These instrucitons are able to keep the return address by storing the address of the following instruction into rd
.
As you may have guessed, call
is a pseudo instruction that is implemented using an auipc
plus a jalr
and
the register ra
will keep the return address after a call
.
We also suggest you use pseudo instructions as listed here
and here.
The gcc
generated assemblies prefer the clearer pseudo instructions as well, e.g. call
and ld/sd
etc..
Calling convention
When calling a function, there are several problems to consider:
- How can I pass variable to the callee function and get results from it?
- After the call finishes, can I continue to use my previous values stored in the registers?
- Will the callee mess with my local variables on the stack?
- Will the callee even return to the correct location?
- ...
To solve these problems, calling convention is standarized. The first part of a calling convention is a set of rules on how to use and save registers. A complete documentation can be found on RISC-V official website here.
Registers
- Argument passing:
a0-a7
are used to pass up to 8 parameters, if more are required, they can be passed by stack. - Return value passing:
a0-a1
are used to pass return vaule from callee back to caller. - Register states:
- For caller saved registers, they are assumed by the callee to be safe to modify. So, if those register are needed by the caller after finishing calling a function, the caller should save them.
- For callee saved registers, the caller can assume they are perserved and safe to use after finishing calling a function. So, if those register are modified by the callee, the callee must save them first.
- Special attention are needed on 3 registers
sp
,s0
andra
, they are used to form stack frames or ensure correct control flow transfer when returning. See the following sections on funcion calling prologue and epilogue for how to form and manage stack frames.
Stack frame and frame pointer
Eevery function needs some space on the stack to store local variables or save registers.
To clearly mark the boundary among different functions' stack content,
we have a frame pointer.
The following figure shows the stack frame of main
and f
for the given program call.c
.
Frame pointer s0
and stack pointer sp
are corresponding to the highest and lowest
address of the stack memory (frame) owned by a function,
no matter which function we are in, e.g. main
or f
.
The alignment is needed due to RISC-V standard requires sp
to be aligned to 16B
.
f
's arguments are copied to stack due to C's pass-by-value semantic.
a
, b
and c
are now f
's local variables.
Prologue
Prologue are what need to be performed on entering a function:
- Reserve enough stack space to save register states and local variables.
- Save return address and frame pointer to the stack.
- Update the frame pointer.
Epilogue
Epilogue are what need to be performed just before returning from a function:
- Store the return value using
a0-a1
. - Load the return address and restore the frame pointer from the stack.
- Return back to caller using the return address.
Can you identify which instruction(s) are part of the prologue and epilogue?
Prologue of function main
:
addi sp,sp,-32
sd ra,24(sp)
sd s0,16(sp)
addi s0,sp,32
Epilogue of function main
:
mv a0,a5
ld ra,24(sp)
ld s0,16(sp)
addi sp,sp,32
jr ra
Prologue of function f
:
addi sp,sp,-64
sd ra,56(sp)
sd s0,48(sp)
addi s0,sp,64
Epilogue of function f
:
mv a0,a5
ld ra,56(sp)
ld s0,48(sp)
addi sp,sp,64
jr ra
Can you draw out the stack frame of g
using the figure above as an example?
(Please draw the answer using asciiflow and export the results in ASCII Basic.)
old sp +----------+ s0
| saved s0 |
+24 +----------+ -8
| |
+16 +----------+ -16
| a (a0) |
+8 +----------+ -24
| b (a1) |
sp +----------+ -32
Have you noticed any inconsistency compared to main
or f
?
Can you explain why this happens?
The prologue and epilogue omit the saving and resotring of
ra
. The reason is that no other function is called byg
, so the value stored inra
will not be contaminated by other functions, and we can omit the saving and restoring.
Parameters and return value passing
Can you show the instructions in f
that are responsible for passing parameters and
storing the returned return value to stack?
li a1,2
ld a0,-40(s0)
call g
sd a0,-24(s0)
li a1,5
ld a0,-48(s0)
call g
sd a0,-32(s0)
Can you show the instructions in f
that are responsible for passing the final return value back to main?
mv a0,a5
Recursion
Recursion happens when a function calls itself. The existence of recursive function makes the need of stack to store temporary states more self-evident. In this part, you need to implement a recursive fast power algorithm using assembly.
The C code for your assembly implementation looks like this:
// exponent >= 0
long powr(long base, long exponent) {
if (exponent == 0)
return 1;
long half = powr(base, exponent / 2);
bool odd = exponent % 2;
if (odd)
return base * half * half;
else
return half * half;
}
Please finish the given template code.
recursion.s
.section .rodata
.align 3 # 2^3=8
error_msg:
.string "pow(%ld, %ld) got %ld expected %ld\n"
.text
.align 1
.globl pow
.type pow, @function
pow:
# long pow(long base, long exponent)
# prologue
# YOUR CODE HERE
bne a1,zero,pow_nonzero_exp
pow_zero_exp:
# YOUR CODE HERE
j pow_return
pow_nonzero_exp:
# YOUR CODE HERE
# calculate pow(base, exponent / 2)
# YOUR CODE HERE
beq a1,zero,pow_even
pow_odd:
# YOUR CODE HERE
j pow_return
pow_even:
# YOUR CODE HERE
pow_return:
# epilogue
# YOUR CODE HERE
.size pow, .-pow
.align 1
.globl check
.type check, @function
check:
# void check(long base, long exponent, long expected);
addi sp,sp,-64
sd ra,56(sp)
sd s0,48(sp)
addi s0,sp,64
sd a0,-40(s0) # base
sd a1,-48(s0) # exponent
sd a2,-56(s0) # expected
call pow
sd a0,-24(s0)
ld a5,-56(s0)
beq a0,a5,check_ok
check_fail:
lui a5,%hi(stderr)
ld a0,%lo(stderr)(a5)
ld a5,-56(s0) # expected
ld a4,-24(s0) # pow result
ld a3,-48(s0) # exponent
ld a2,-40(s0) # base
lui a1,%hi(error_msg)
addi a1,a1,%lo(error_msg)
call fprintf
li a0,-1
call exit # will not return
check_ok:
ld ra,56(sp)
ld s0,48(sp)
addi sp,sp,64
jr ra
.size check, .-check
.align 1
.globl main
.type main, @function
main:
addi sp,sp,-16
sd ra,8(sp)
sd s0,0(sp)
addi s0,sp,16
# check whether pow(2, 5) gives 32
li a2,32
li a1,5
li a0,2
call check
# more test cases ...
li a2,27
li a1,3
li a0,3
call check
li a2,387420489
li a1,18
li a0,3
call check
li a5,0
mv a0,a5
ld ra,8(sp)
ld s0,0(sp)
addi sp,sp,16
jr ra
.size main, .-main
You can use make load-recursion
to test if your program runs correctly.
If your program runs correctly, you should see something like this:
pow(2, 5) got 32 expected 32
pow(3, 3) got 27 expected 27
pow(3, 18) got 387420489 expected 38742048
make: *** [load-recursion] Error 255
Otherwise, you should see an error:
pow(2, 5) got 1 expected 32
make: *** [load-recursion] Error 255