System Architecture
(INFT12-212 and 72-212)
Lab Notes for Week 3: System Calls, Branches, Control
Structures, Addressing Modes
1 Introduction
This week we are looking at the MIPS instructions which change the flow
of execution, and which allow us to perform decisions, loops, and
function/method calls. We will also consider how to translate high-level
Java-esque pseudo-code into assembly code. Finally, we will look at
the addressing modes available on the MIPS CPU.
Before you can proceed to do this lab, you must have fully
picked up all the material from the first two lab sessions.
2 System Calls
A CPU by itself can do things like arithmetic and logical operations,
read/write memory etc. There are no instructions, however, to perform
high-level I/O like printing or reading values from the keyboard.
To make this happen, we need some software which does all the hard
work for us: libraries and the operating system.
To access the operating system, all CPUs provide a way of performing
system calls. On the MIPS, the syscall instruction does this.
We have to load the $v0 register with a number which indicates what
type of operation is required, and put any argument into the $a0
register. Here is a list of the main system calls simulated by MARS.
Syscall | $v0 | Args | Result |
print_int | 1 | $a0 = integer | |
print_float | 2 | $f12 = float | |
print_double | 3 | $f12 = double | |
print_string | 4 | $a0 = string | |
read_int | 5 | | integer in $v0 |
read_float | 6 | | float in $f0 |
read_double | 7 | | double in $f0 |
read_string | 8 | $a0 = buffer, $a1 = len | |
sbrk | 9 | $a0 = amount | address in $v0 |
exit | 10 | | |
For now, we are interested mainly in the print_int, print_string,
read_string and read_int system calls.
2.1 Task 1
Download the program wk3syscalls.asm
and load it into MARS. This performs the following pseudo-code:
int age;
print_string("Please tell me how old you are: ");
age= read_int();
print_string("You told me that your age is ");
print_int(age);
Note that both print_int and print_string require
an argument in $a0, which is the thing to print. For strings, we pass
the base address of the string using the la (load address) instruction.
The read_int syscall returns the integer typed by the user
as a two's-complement word in $v0. Also note that we had to copy the
age entered by the user from $v0 into $t0, or it would be clobbered
when $v0 is loaded for the next system call.
Play around with the program: run it, modify it. Make sure you understand
how to perform these system calls.
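As a rough guide, the pseudo-code above could be written along the
following lines. This is only a sketch and not the contents of
wk3syscalls.asm: the labels prompt and reply are made-up names, and
the choice of $t0 to hold the age simply follows the description above.

```mips
# Sketch only: label names (prompt, reply) are assumptions.
        .data
prompt: .asciiz "Please tell me how old you are: "
reply:  .asciiz "You told me that your age is "

        .text
main:   li   $v0, 4          # 4 = print_string
        la   $a0, prompt     # $a0 = base address of the string
        syscall

        li   $v0, 5          # 5 = read_int
        syscall
        move $t0, $v0        # save the age before $v0 is reused

        li   $v0, 4          # 4 = print_string
        la   $a0, reply
        syscall

        li   $v0, 1          # 1 = print_int
        move $a0, $t0        # $a0 = the integer to print
        syscall

        li   $v0, 10         # 10 = exit
        syscall
```

Each call follows the same pattern: load the call number into $v0,
load any argument into $a0, then execute syscall.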
3 Branches and Jumps
The key thing that makes computers useful is decision-making:
based on the values of data, do alternative operations, or loop back
(or not) based on the data values. For high-level languages like C
or Java, we have control structures like IF .. ELSE, FOR, WHILE and
DO .. WHILE.
At the CPU, there are more primitive constructs. Based on the values
of data, we can:
- branch forward or backward to a new instruction, not just the instruction
immediately following this one; and
- jump to a specific instruction far away, and then jump back to where
we came from.
3.1 Constructing IF .. ELSE Statements
Let's see how we can synthesize IF .. ELSE, FOR, WHILE and DO using
branches. We will start with a high-level IF .. ELSE:
# Print error message if user's age is negative: high-level version
int age;
print_string("Please tell me how old you are: ");
age= read_int();
if (age >= 0) {
    print_string("You told me that your age is ");
    print_int(age);
} else {
    print_string("Silly user, that number is too small");
}
Here is how we can do it with branches and labelled instructions:
# Print error message if user's age is negative: branching version
int age;
main:
    print_string("Please tell me how old you are: ");
    age= read_int();
    branch to else_clause if (age < 0);
    print_string("You told me that your age is ");
    print_int(age);
    branch always to end_if;
else_clause:
    print_string("Silly user, that number is too small");
end_if: # Rest of the program
Note the two branch operations. Here's how they work:
- Age is OK: the comparison at the first branch fails, so we drop into
the good code. When that's finished, we have to branch past the else_clause
code to continue the rest of the program.
- Age is bad: the comparison is true, we branch past the good code to
the else_clause code, print out the error message, and continue
on with the rest of the program.
Note that the comparison in the first branch instruction is the OPPOSITE
of the test in the high-level IF statement: we have to branch to skip
the good code! Note also the unconditional branch which stops us
dropping into the else_clause.
3.2 Constructing WHILE Loops
With loops, we have to branch backwards as well as forwards. With a
WHILE loop, we decide at the top whether to keep going or break out of
the loop, and we always branch backwards at the bottom. Consider this
program to print out the numbers from 1 to 20.
# Example of a while loop: high-level version
int i;
i=1;
while (i <= 20) {
    print_int(i);
    i++;
}
Now here is the same loop written with branches and labels.
# Example of a while loop: branching version
int i;
    i=1;
top_of_loop:
    branch to end_of_loop if (i > 20);
    print_int(i);
    i++;
    branch always to top_of_loop;
end_of_loop: # Rest of the program
What you can see is that we are making explicit the flow of control
which is only implicitly shown in the WHILE loop. Again note that
the branch condition is the opposite of the high-level condition,
as the branch is breaking us out of the loop.
3.3 Constructing DO .. WHILE Loops
DO .. WHILE loops always enter the loop at least once, and they only
branch backwards, so we only need one branch instruction. Here is
a DO .. WHILE loop to print out the numbers from 1 to 20.
# Example of a do .. while loop: high-level version
int i;
i=1;
do {
    print_int(i);
    i++;
} while (i <= 20);
Now here is the same loop written with branches and labels.
# Example of a do .. while loop: branching version
int i;
    i=1;
top_of_loop:
    print_int(i);
    i++;
    branch to top_of_loop if (i <= 20);
end_of_loop: # Rest of the program
3.4 Constructing FOR Loops
FOR loops are just WHILE loops repackaged, with the loop initializer
and loop modifier written next to the loop decision. So, here is the
FOR version to print out the numbers from 1 to 20.
# Example of a for loop: high-level version
int i;
for (i=1; i <= 20; i++) {
    print_int(i);
}
Unsurprisingly, the loop written with branches and labels is the same
as the WHILE loop.
# Example of a for loop: branching version
int i;
    i=1;
top_of_loop:
    branch to end_of_loop if (i > 20);
    print_int(i);
    i++;
    branch always to top_of_loop;
end_of_loop: # Rest of the program
4 MIPS Branch Instructions
Now that you have got the concept of branching down, let's see some
branches in real life. The list of basic MIPS branch instructions
is given below.
Instruction | Result | Comment |
beq Rs, Rt, label | Branch to label if Rs == Rt | |
beqz Rs, label | Branch to label if Rs == 0 | |
bge Rs, Rt, label | Branch to label if Rs >= Rt | |
bgeu Rs, Rt, label | Branch to label if Rs >= Rt | Unsigned comparison |
bgez Rs, label | Branch to label if Rs >= 0 | |
bgt Rs, Rt, label | Branch to label if Rs > Rt | |
bgtu Rs, Rt, label | Branch to label if Rs > Rt | Unsigned comparison |
blt Rs, Rt, label | Branch to label if Rs < Rt | |
bltu Rs, Rt, label | Branch to label if Rs < Rt | Unsigned comparison |
bltz Rs, label | Branch to label if Rs < 0 | |
bne Rs, Rt, label | Branch to label if Rs != Rt | |
bnez Rs, label | Branch to label if Rs != 0 | |
b label | Branch to label always | |
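To connect the table with the IF .. ELSE pattern from Section 3.1, here
is a small sketch that tests whether a number is zero. It is an
illustration only: the labels and message strings are invented for this
example. The high-level test is if (n == 0), so the branch uses the
opposite condition, bnez.

```mips
# Sketch only: labels and strings are made up for illustration.
        .data
zero_msg:    .asciiz "You entered zero\n"
nonzero_msg: .asciiz "You entered a non-zero number\n"

        .text
main:   li   $v0, 5              # read_int: result arrives in $v0
        syscall
        move $t0, $v0            # n = read_int()

        bnez $t0, else_clause    # opposite of the high-level test (n == 0)
        la   $a0, zero_msg       # "then" part: n is zero
        li   $v0, 4              # print_string
        syscall
        b    end_if              # skip over the else part

else_clause:
        la   $a0, nonzero_msg    # "else" part: n is not zero
        li   $v0, 4              # print_string
        syscall

end_if: li   $v0, 10             # exit
        syscall
```

The same shape works for any comparison in the table: pick the branch
that is true exactly when the high-level condition is false.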
4.1 Task 2
Download the program wk3whileloop.asm
and load it into MARS. This is a MIPS program to print out the numbers
from 1 to 20. Read through and understand how it works. Note the two
loop labels: one at the top of the loop where the break-out decision
is made, and one immediately after the branch back at the end of the
loop. Also note that the second argument to the branch instructions can
be a small integer like 20: the assembler converts this pseudo-instruction
into a few real instructions.
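To make the pseudo-instruction point concrete, the sketch below uses a
branch with an immediate operand and shows, in comments, one plausible
expansion into real instructions. The exact expansion depends on the
assembler, so treat the commented lines as an assumption; in MARS you
can compare them against the Basic column of the Execute pane after
assembling.

```mips
        .text
main:   li   $t0, 21
        bgt  $t0, 20, done     # pseudo-instruction: immediate second operand
        # An assembler might emit something like this in its place:
        #   addi $at, $zero, 20      # constant into the assembler temporary
        #   slt  $at, $at, $t0       # $at = 1 if 20 < $t0, else 0
        #   bne  $at, $zero, done    # branch if the comparison held
        li   $t0, 0            # skipped here, because 21 > 20
done:   li   $v0, 10           # exit
        syscall
```

The register $at is reserved for the assembler for exactly this kind of
job, which is one reason to avoid using it in your own code.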
4.2 Task 3
Modify the above program to print out some other ranges, to print
backwards etc.
4.3 Task 4
Modify the program wk3syscalls.asm
to print out the error message "Silly user, that number is too
small" instead of the normal output, if the user enters a negative
number, i.e. add an IF .. ELSE construct to the program using branches.
4.4 What about multiple loops?
Question: if you need multiple loops or multiple IF .. ELSE statements
in your assembly program, you can't re-use labels such as top_of_loop:
or end_of_loop:, so what labels should be used?
Compilers, which translate high-level languages down to assembly code,
typically give each label a number: L1, L2, L3, L4 etc., and output
the assembly code to match the labels. For us humans, it makes sense
to use descriptive labels while making sure that each label is unique.
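As an illustration of unique, descriptive labels, here is a sketch with
two loops in one program: a WHILE-style loop counting up from 1 to 3,
then a DO .. WHILE-style loop counting down from 3 to 1. The label
names (up_top, up_end, down_top) are arbitrary choices for this example.

```mips
        .text
main:   li   $t0, 1                  # i = 1
up_top: bgt  $t0, 3, up_end          # while (i <= 3): leave when i > 3
        move $a0, $t0
        li   $v0, 1                  # print_int(i)
        syscall
        addi $t0, $t0, 1             # i++
        b    up_top                  # back to the test
up_end:
        li   $t0, 3                  # j = 3
down_top:
        move $a0, $t0
        li   $v0, 1                  # print_int(j)
        syscall
        addi $t0, $t0, -1            # j--
        bgt  $t0, $zero, down_top    # do .. while (j > 0)

        li   $v0, 10                 # exit
        syscall
```

The numbers are printed back to back because no separator is printed
between them. Prefixing each loop's labels with the loop's purpose
keeps them unique and makes the branches easier to follow.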
5 MIPS Addressing Modes
So far we have been able to read, work on and write individual data
items from memory. But what about arrays and large data structures
like objects, linked lists etc.? To understand how we deal with complex
data structures, we need to introduce the MIPS addressing modes. This
week, we will look at arrays.
5.1 Immediate and Direct Addressing
So far we have seen two addressing modes: immediate and direct modes.
In immediate mode, the literal value in the instruction is used. For
example:
li $t0, 78 # Load value 78 into $t0
With direct mode, a literal value is treated as the address of some
data, and that data is loaded. For example:
.data
mynum: .word 78, 85, 90 # Values 78, 85 and 90 in RAM. Assume mynum is the location 0x1000
.text
la $t0, mynum # Immediate mode: load the actual mynum address into $t0
lw $t1, mynum # Direct mode: load the value at mynum into $t1, i.e. 78
Here, the assembler chooses an address at which to store the mynum values, 0x1000.
If we use the la (load address) instruction, the address 0x1000
is loaded, which is immediate mode. If we use the lw (load
word) instruction, then the value at that address is loaded. In fact,
the assembler lets us do these instructions:
la $t0, 0x1000 # Load 0x1000 into $t0 # Immediate
lw $t0, 0x1000 # Load 78 into $t0 # Direct
What this implies is that the assembler simply substitutes the address
0x1000 for the label when it converts the program into machine
code.
5.2 Indirect Addressing
This still doesn't help us with arrays. We want a way of specifying
an array, and an index into that array. One way is to use a pointer.
A pointer is an in-memory variable, or a register, which points at
(specifies the address of) a value that we are interested in. Let's
re-load the mynum address into $t0.
la $t0, mynum
$t0 is now a pointer to the first data item at mynum. If we
simply did print_int($t0) at this point, then the address
would get printed out. However, if we do this:
lw $t1, ($t0)
the parentheses mean to follow the pointer and load the value at that
location. If we now did print_int($t1), then the number 78
would be printed. This type of addressing is known as indirect addressing.
Even better, if we do these instructions:
la $t0, mynum # Set $t0 up as a pointer to mynum
lw $t1, ($t0) # Get 78 into $t1
addi $t0, $t0, 4 # Move the $t0 pointer up to the next number
lw $t1, ($t0) # This time we fetch the second number in the list, 85
$t0 can now be made to point at different word values: we just have
to adjust the address that it holds.
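Putting the pointer inside a loop gives a simple way to walk over the
whole list. The following sketch adds up the three mynum values by
advancing the pointer one word at a time. The register assignments and
the label sum_top are assumptions made for this example.

```mips
        .data
mynum:  .word 78, 85, 90

        .text
main:   la   $t0, mynum        # $t0 = pointer to the first word
        li   $t1, 3            # number of words left to visit
        li   $t2, 0            # running sum
sum_top:
        lw   $t3, ($t0)        # follow the pointer and load the word
        add  $t2, $t2, $t3     # sum += word
        addi $t0, $t0, 4       # advance the pointer by one word (4 bytes)
        addi $t1, $t1, -1      # one fewer word to go
        bnez $t1, sum_top      # loop while words remain

        move $a0, $t2
        li   $v0, 1            # print_int(sum): 78 + 85 + 90 = 253
        syscall
        li   $v0, 10           # exit
        syscall
```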
5.3 Indexed Addressing: Arrays
We will come back and consider pointers in more detail later. For
arrays, we are used to languages like Java which specify the array
and an index into the array, e.g. int x = myarray[7];
MIPS provides this type of addressing too; it is called indexed
addressing. The syntax is similar to pointers, but we specify both
the base address of the array and the offset (or index)
into the array in a single instruction. Re-using the mynum
data from the above examples:
li $t0, 0 # Set $t0 up as the offset into the array, initially 0
lw $t1, mynum($t0) # Load the value at address mynum+$t0, i.e. load 78
addi $t0, $t0, 4 # Increase the offset by 4 to move to the next word
lw $t1, mynum($t0) # This loads 85 into register $t1
This time, $t0 isn't being used as a pointer, but as an offset on
the array. Effectively, the last line is doing this:
- adding the mynum address to the value in $t0, creating a new address
- loading the word at that address into $t1
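The same idea can be put in a loop to visit every element. The sketch
below prints each mynum value on its own line using indexed addressing.
The labels idx_top, idx_end and newline are made up for this example,
and the limit of 12 is simply 3 words times 4 bytes per word.

```mips
        .data
mynum:   .word 78, 85, 90
newline: .asciiz "\n"

        .text
main:   li   $t0, 0              # byte offset into the array
idx_top:
        bge  $t0, 12, idx_end    # stop after 3 words (3 * 4 bytes)
        lw   $a0, mynum($t0)     # load the word at address mynum + $t0
        li   $v0, 1              # print_int
        syscall
        la   $a0, newline
        li   $v0, 4              # print_string
        syscall
        addi $t0, $t0, 4         # move the offset to the next word
        b    idx_top
idx_end:
        li   $v0, 10             # exit
        syscall
```

Note that the offset is measured in bytes, not elements: a high-level
index i into an array of words corresponds to an offset of 4 * i.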
5.4 Task 5
Download the program wk3addressing.asm
and load it into MARS. This program demonstrates both the use of indirect
addressing, where $t0 is a pointer, and indexed addressing, where
$t0 is an offset into an array. Read through the program, understand
what it is doing, and then execute the program in single-step mode
to see what each instruction does.
6 Strings in MIPS
There are many different ways of storing strings in computers. The
Java way is to create an object which contains an array of characters
and an integer which is used to record the string's length. In other
languages like C, the string is simply stored in consecutive byte-sized
memory locations, and the string is terminated with a special character.
Typically this is the byte value 0x00, also known as the NUL
character.
With NUL-terminated strings, the string's length is not stored and
has to be recalculated each time, which is a disadvantage. The advantages
are that we save memory by using a NUL instead of storing the length,
and it is impossible for the string's actual length to be inconsistent
with the value stored in a length field.
In effect, C strings are still characters stored in an array of byte
values, but the last value in the array is the NUL character.
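Recalculating the length of a NUL-terminated string is itself a short
loop: walk along the bytes with a pointer until the NUL is found. The
following sketch counts the characters in a string; the label names and
the example string are assumptions for illustration.

```mips
        .data
greeting: .asciiz "Hello"      # .asciiz appends the NUL terminator

        .text
main:   la   $t0, greeting     # pointer to the current character
        li   $t1, 0            # length so far
len_top:
        lb   $t2, ($t0)        # load one byte (one character)
        beqz $t2, len_end      # NUL reached: stop counting
        addi $t1, $t1, 1       # length++
        addi $t0, $t0, 1       # characters are bytes, so advance by 1
        b    len_top
len_end:
        move $a0, $t1
        li   $v0, 1            # print_int(length): prints 5
        syscall
        li   $v0, 10           # exit
        syscall
```

Because characters are one byte each, the pointer moves by 1 rather
than by 4 as it did for words.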
6.1 Task 6
Download the program wk3tolowercase.asm
and load it into MARS. This program prints a string out, then converts
all the uppercase letters to lowercase using a loop, then prints it
out again. Some things to note here are:
- the print_string system call is given the address of the first
character in the string as its argument, i.e. a pointer to the string
is passed as the argument.
- in a high-level language, the loop can be written as
for (int i=0; string[i] != NUL; i++)
- uppercase ASCII letters are in the range 65 to 90; lowercase letters
are 97 to 122. We can convert an uppercase letter to lowercase either
by adding 32 (65 + 32 = 97), or by ORing the character with 32, which
does the same job because the bit with value 32 is always clear in an
uppercase letter.
- the if (ch >= 'A' && ch <= 'Z') test is performed by two
consecutive branches which skip the work if the character does NOT
match the range.
Read through the program, understand what it is doing, and then execute
the program in single-step mode to see what each instruction does.
7 Status Check and Some Exercises
At this point in your learning of assembly language, you have seen:
- the use of registers and main memory.
- basic assembly instructions for maths.
- the basic data sizes, and the signed/unsigned difference.
- branch instructions using several comparisons.
- the equivalence of branching and high-level constructs such as IF
.. ELSE, DO .. WHILE, WHILE and FOR.
- several addressing modes: immediate, direct, indirect and indexed
addressing.
This is about the same amount of knowledge as half of our introductory
programming subject in Java. At this point, you should take stock
of what you have learned and reinforce it. There are several ways
to do this:
- read back through the example programs so far.
- read through other assembly programming guides such as those by Britton
and Ellard.
- write as many assembly programs as you can. The only way to learn a
language is to write code in it!
To help you with the latter, here are some programming exercises from
I2P which you should now be able to write in MIPS assembly language.
- Write a program to prompt the user for and read in three integer values.
Print out the largest of the three values.
- Write a program that defines a string in the .data area,
and which counts the number of digits in the string. Print out the
number of digits.
- Write a program which reads in a set of integer values from the user,
until the user enters the number 0. Calculate the running sum and
running count of the numbers entered (and keep them in two registers).
When the user enters 0 to stop, calculate the average of
the numbers as an integer, and print out the average. Don't calculate
or print the average if the running count is zero.
- Modify the last program to ensure that the user can only enter numbers
in the range 0 to 100 inclusive. If they enter a number outside the
range, print out an error message and loop back.
- Write a program that reads a number in from the user, and prints out
a triangle of stars with two sides having the length entered. For
example:
Please enter a number: 7
*
**
***
****
*****
******
*******
- The .space directive tells the assembler to set aside a specified
number of bytes in memory for data, e.g.
.data
emptystring: .space 100
The contents of the memory are not initialised. The read_string
system call reads characters from the user into the memory: the arguments
are the base address of the space and its length, e.g.
la $a0, emptystring
li $a1, 100
li $v0, 8 # Read string syscall
syscall
Write a program which reads a string into a space in memory, and then
counts the number of ASCII digits in the string.
- Write a program which has an array of 26 integers, all initially 0.
The program reads a string from the user, and counts the number of
each lowercase letter in the string, storing the results in the array.
Then, the array is printed out. For example:
Please enter a sentence: My name is Bond, James Bond
Number of a's: 2
Number of d's: 2
Number of e's: 2
Number of i's: 1
Number of m's: 2
Number of n's: 3
Number of o's: 2
Number of s's: 2
Number of y's: 1
8 Outlook for the Next Lab
In the next lab, we will look at calling functions, how to pass parameters,
how to return a value, and how to deal with the limited set of registers
on the CPU.
File translated from TeX by TTH, version 3.85, on 25 Nov 2011, 11:15.