System Architecture

(INFT12-212 and 72-212)
Lab Notes for Week 3: System Calls, Branches, Control Structures, Addressing Modes

1  Introduction

This week we are looking at the MIPS instructions which change the flow of execution, and which allow us to perform decisions, loops, and function/method calls. We will also consider how to translate high-level Java-esque pseudo-code into assembly code. Finally, we will look at the addressing modes available on the MIPS CPU.
Before you can proceed to do this lab, you must have fully picked up all the material from the first two lab sessions.

2  System Calls

A CPU by itself can do things like arithmetic and logical operations, read/write memory etc. There are no instructions, however, to perform high-level I/O like printing or reading values from the keyboard. To make this happen, we need some software which does all the hard work for us: libraries and the operating system.
To access the operating system, all CPUs provide a way of performing system calls. On the MIPS, the syscall instruction does this. We have to load the $v0 register with a number which indicates what type of operation is required, and put any argument into an argument register (usually $a0, or $f12 for floating-point values). Here is a list of the main system calls simulated by MARS.
    Syscall         $v0   Args                        Result
    print_int       1     $a0 = integer
    print_float     2     $f12 = float
    print_double    3     $f12 = double
    print_string    4     $a0 = string
    read_int        5                                 integer in $v0
    read_float      6                                 float in $f0
    read_double     7                                 double in $f0
    read_string     8     $a0 = buffer, $a1 = len
    sbrk            9     $a0 = amount                address in $v0
    exit            10
For now, we are interested mainly in the print_int, print_string, read_string and read_int system calls.

2.1  Task 1

Download the program wk3syscalls.asm and load it into MARS. This performs the following pseudo-code:
   int age;
   print_string("Please tell me how old you are: ");
   age= read_int();
   print_string("You told me that your age is ");
   print_int(age);

Note that both print_int and print_string require an argument in $a0, which is the thing to print. For strings, we pass the base address of the string using the la (load address) instruction. The read_int syscall returns the integer typed by the user as a twos-complement word in $v0. Also note that we had to copy the age entered by the user from $v0 into $t0, or it would be clobbered when $v0 is loaded for the next system call.
Play around with the program: run it, modify it. Make sure you understand how to perform these system calls.
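The file wk3syscalls.asm is not reproduced in these notes, so the following is only a minimal sketch of how the pseudo-code above might be written for MARS. The label names prompt and reply are assumptions made for this illustration, not names taken from the downloaded file.

```mips
        .data
prompt: .asciiz "Please tell me how old you are: "   # hypothetical label
reply:  .asciiz "You told me that your age is "      # hypothetical label

        .text
main:   la   $a0, prompt      # $a0 = base address of the prompt string
        li   $v0, 4           # 4 = print_string
        syscall

        li   $v0, 5           # 5 = read_int
        syscall               # the integer typed by the user arrives in $v0
        move $t0, $v0         # save the age before $v0 is reused

        la   $a0, reply
        li   $v0, 4           # print_string
        syscall

        move $a0, $t0         # argument for print_int
        li   $v0, 1           # 1 = print_int
        syscall

        li   $v0, 10          # 10 = exit
        syscall
```

Every call follows the same pattern: argument into $a0, call number into $v0, then syscall.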

3  Branches and Jumps

The key thing that makes computers useful is decision-making: depending on the values of data, they perform alternative operations or decide whether or not to loop back. For high-level languages like C or Java, we have control structures like IF .. ELSE, FOR, WHILE and DO .. WHILE.
At the CPU, there are more primitive constructs. Based on the values of data, we can either branch to a labelled instruction elsewhere in the program or carry on with the next instruction in sequence. We can also branch unconditionally.

3.1  Constructing IF .. ELSE Statements

Let's see how we can synthesize IF .. ELSE, FOR, WHILE and DO using branches. We will start with a high-level IF .. ELSE:
    # Print error message if user's age is negative: high-level version
    int age;
    print_string("Please tell me how old you are: ");
    age= read_int();
    if (age >= 0) {
       print_string("You told me that your age is ");
       print_int(age);
    } else {
       print_string("Silly user, that number is too small");
    }

Here is how we can do it with branches and labelled instructions:
                 # Print error message if user's age is negative: branching version
                 int age;

    main:        print_string("Please tell me how old you are: ");
                 age= read_int();
                 branch to else_clause if (age < 0);
                 print_string("You told me that your age is ");
                 print_int(age);
                 branch always to end_if;
    else_clause:
                 print_string("Silly user, that number is too small");
     
    end_if:      # Rest of the program

Note the two branch operations. Here's how they work:
  1. Age is OK: the comparison at the first branch fails, so we drop into the good code. When that's finished, we have to branch past the else_clause code to continue the rest of the program.
  2. Age is bad: the comparison is true, we branch past the good code to the else_clause code, print out the error message, and continue on with the rest of the program.
Note that the comparison in the first branch instruction is OPPOSITE the test in the high-level IF statement: we have to branch to skip the good code! Note also the branch to stop us dropping into the else_clause.
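As a rough illustration of the same branching pattern in real MIPS code, here is a sketch of a different IF .. ELSE: if (a >= b) print a, else print b, i.e. print the larger of two numbers. The values 7 and 12 and the choice of registers are assumptions made for the example; blt and b are two of the branch instructions listed in section 4.

```mips
        .text
main:   li   $t0, 7                  # a = 7 (sample value)
        li   $t1, 12                 # b = 12 (sample value)

        blt  $t0, $t1, else_clause   # opposite test: skip the IF body when a < b
        move $a0, $t0                # IF body: a is the larger (or they are equal)
        b    end_if                  # do not drop into the else clause

else_clause:
        move $a0, $t1                # ELSE body: b is the larger

end_if: li   $v0, 1                  # print_int: prints 12 for these values
        syscall
        li   $v0, 10                 # exit
        syscall
```

The high-level test is a >= b, but the branch tests a < b, because the branch is what skips the IF body.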

3.2  Constructing WHILE Loops

With loops, we have to branch backwards as well as forwards. With a WHILE loop, we decide at the top whether to keep going or break out of the loop, and we always branch back to the top from the bottom. Consider this program to print out the numbers from 1 to 20.
    # Example of a while loop: high-level version
    int i;
    i=1;
    while (i <= 20) {
      print_int(i);
      i++;
    }

Now here is the same loop written with branches and labels.
                 # Example of a while loop: branching version
                 int i;

                 i=1;
    top_of_loop:
                 branch to end_of_loop if (i > 20);
                 print_int(i);
                 i++;
                 branch always to top_of_loop;
    end_of_loop: # Rest of the program

What you can see is that we are making explicit the flow of control which is only implicitly shown in the WHILE loop. Again note that the branch condition is the opposite of the high-level condition, as the branch is breaking us out of the loop.
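To show the shape of a WHILE loop in real MIPS code without repeating the number-printing program, here is a sketch of a variation: the loop body adds i to a running sum instead of printing it. The use of $t1 for the sum is an assumption made for this example.

```mips
        .text
main:   li   $t0, 1                  # i = 1
        li   $t1, 0                  # sum = 0

top_of_loop:
        bgt  $t0, 20, end_of_loop    # opposite of (i <= 20): break out of the loop
        add  $t1, $t1, $t0           # sum = sum + i
        addi $t0, $t0, 1             # i++
        b    top_of_loop             # always go back to the test

end_of_loop:
        move $a0, $t1
        li   $v0, 1                  # print_int: prints 210, the sum of 1..20
        syscall
        li   $v0, 10                 # exit
        syscall
```

The two branches correspond exactly to the two branch lines in the pseudo-code: a conditional one at the top and an unconditional one at the bottom.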

3.3  Constructing DO .. WHILE Loops

DO .. WHILE loops always enter the loop at least once, and they only branch backwards, so we only need one branch instruction. Here is a DO .. WHILE loop to print out the numbers from 1 to 20.
    # Example of a do .. while loop: high-level version
    int i;
    i=1;
    do {
      print_int(i);
      i++;
    } while (i <= 20);

Now here is the same loop written with branches and labels.
                 # Example of a do .. while loop: branching version
                 int i;

                 i=1;
    top_of_loop:
                 print_int(i);
                 i++;
                 branch to top_of_loop if (i <= 20);
    end_of_loop: # Rest of the program
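
A DO .. WHILE loop might look like the following sketch in MIPS, here printing 1 to 5 with one number per line. The nl string is an assumption added so the output is readable. The ble (branch if less than or equal) pseudo-instruction is accepted by MARS although it does not appear in the basic list in section 4.

```mips
        .data
nl:     .asciiz "\n"                 # hypothetical newline string

        .text
main:   li   $t0, 1                  # i = 1

top_of_loop:
        move $a0, $t0
        li   $v0, 1                  # print_int(i)
        syscall
        la   $a0, nl
        li   $v0, 4                  # print_string("\n")
        syscall
        addi $t0, $t0, 1             # i++
        ble  $t0, 5, top_of_loop     # same test as the high-level loop: i <= 5

        li   $v0, 10                 # exit
        syscall
```

Because the single branch keeps us in the loop rather than breaking us out, its condition is the same as the high-level condition, not the opposite.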

3.4  Constructing FOR Loops

FOR loops are just WHILE loops repackaged, with the loop initializer and loop modifier written next to the loop decision. So, here is the FOR version to print out the numbers from 1 to 20.
    # Example of a for loop: high-level version
    int i;
    for (i=1; i <= 20; i++) {
      print_int(i);
    }

Unsurprisingly, the loop written with branches and labels is the same as the WHILE loop.
                 # Example of a for loop: branching version
                 int i;

                 i=1;
    top_of_loop:
                 branch to end_of_loop if (i > 20);
                 print_int(i);
                 i++;
                 branch always to top_of_loop;
    end_of_loop: # Rest of the program

4  MIPS Branch Instructions

Now that you have got the concept of branching down, let's see some branches in real life. The list of basic MIPS branch instructions is given below.
    Instruction           Result                         Comment
    beq  Rs, Rt, label    Branch to label if Rs == Rt
    beqz Rs, label        Branch to label if Rs == 0
    bge  Rs, Rt, label    Branch to label if Rs >= Rt
    bgeu Rs, Rt, label    Branch to label if Rs >= Rt    Unsigned comparison
    bgez Rs, label        Branch to label if Rs >= 0
    bgt  Rs, Rt, label    Branch to label if Rs > Rt
    bgtu Rs, Rt, label    Branch to label if Rs > Rt     Unsigned comparison
    blt  Rs, Rt, label    Branch to label if Rs < Rt
    bltu Rs, Rt, label    Branch to label if Rs < Rt     Unsigned comparison
    bltz Rs, label        Branch to label if Rs < 0
    bne  Rs, Rt, label    Branch to label if Rs != Rt
    bnez Rs, label        Branch to label if Rs != 0
    b    label            Branch to label always

4.1  Task 2

Download the program wk3whileloop.asm and load it into MARS. This is a MIPS program to print out the numbers from 1 to 20. Read through and understand how it works. Note the two loop labels: one at the top of the loop, where the break-out decision is made, and one immediately after the branch back at the end of the loop. Also note that the second argument to the branch instructions can be a small integer like 20: the assembler converts this pseudo-instruction into a few real instructions.
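To give a feel for what "a few real instructions" means, here is a hand-written sketch of one way bgt $t0, 20, end_of_loop could be expanded. It is an illustration, not necessarily the exact sequence MARS emits (you can check that in the Execute pane after assembling). The sketch uses $t9 as the scratch register, whereas the assembler would use its reserved register $at.

```mips
        .text
main:   li   $t0, 21                 # sample value of i, chosen so the branch is taken

        # Hand expansion of:  bgt $t0, 20, end_of_loop
        addi $t9, $zero, 20          # load the constant 20 into a scratch register
        slt  $t9, $t9, $t0           # $t9 = 1 if 20 < $t0, otherwise 0
        bne  $t9, $zero, end_of_loop # branch when the comparison was true

        li   $t0, 0                  # only reached when $t0 <= 20

end_of_loop:
        move $a0, $t0
        li   $v0, 1                  # print_int: prints 21 here, as the branch was taken
        syscall
        li   $v0, 10                 # exit
        syscall
```

Changing the first li to load 20 or less makes the branch fall through, and the program prints 0 instead.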

4.2  Task 3

Modify the above program to print out some other ranges, to print backwards etc.

4.3  Task 4

Modify the program wk3syscalls.asm to print out the error message "Silly user, that number is too small" instead of the normal output, if the user enters a negative number, i.e. add an IF .. ELSE construct to the program using branches.

4.4  What about multiple loops?

Question: if you need multiple loops or multiple IF .. ELSE statements in your assembly program, you can't re-use labels such as top_of_loop: or end_of_loop:, so what labels should be used?
Compilers, which translate high-level languages down to assembly code, typically give each label a number: L1, L2, L3, L4 etc., and output the assembly code to match the labels. For us humans, it makes sense to use descriptive labels while making sure that each label is unique.

5  MIPS Addressing Modes

So far we have been able to read, work on and write individual data items from memory. But what about arrays and large data structures like objects, linked lists etc.? To understand how we deal with complex data structures, we need to introduce the MIPS addressing modes. This week, we will look at arrays.

5.1  Immediate and Direct Addressing

So far we have seen two addressing modes: immediate and direct modes. In immediate mode, the literal value in the instruction is used. For example:
              li $t0, 78     # Load value 78 into $t0

With direct mode, a literal value is treated as the address of some data, and that data is loaded. For example:
              .data
    mynum:    .word 78 85 90 # Values 78, 85 and 90 in RAM. Assume mynum is the location 0x1000
              .text
              la $t0, mynum  # Immediate mode: load the actual mynum address into $t0
              lw $t1, mynum  # Direct mode: load the value at mynum into $t1, i.e. 78

Here, the assembler chooses an address, 0x1000, at which to store the mynum values. If we use the la (load address) instruction, the address 0x1000 itself is loaded, which is immediate mode. If we use the lw (load word) instruction, then the value at that address is loaded. In fact, the assembler lets us write these instructions:
              la $t0, 0x1000  # Load 0x1000 into $t0  # Immediate
              lw $t0, 0x1000  # Load 78 into $t0      # Direct

What this implies is that the assembler simply substitutes 0x1000 for the label when it converts the program into machine code.

5.2  Indirect Addressing

This still doesn't help us with arrays. We want a way of specifying an array, and an index into that array. One way is to use a pointer. A pointer is an in-memory variable, or a register, which points at (specifies the address of) a value that we are interested in. Let's re-load the mynum address into $t0.
              la $t0, mynum

$t0 is now a pointer to the first data item at mynum. If we simply did print_int($t0) at this point, then the address would get printed out. However, if we do this:
              lw $t1, ($t0)

the parentheses mean to follow the pointer and load the value at that location. If we now did print_int($t1), then the number 78 would be printed. This type of addressing is known as indirect addressing.
Even better, if we do these instructions:
              la $t0, mynum      # Set $t0 up as a pointer to mynum
              lw $t1, ($t0)      # Get 78 into $t1
              addi $t0, $t0, 4   # Move the $t0 pointer up to the next number
              lw $t1, ($t0)      # This time we fetch the second number in the list, 85

$t0 can now be made to point at different word values; we just have to adjust the address that it holds.
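Putting the pointer together with a loop, the following sketch walks $t0 across all three mynum values and adds them up. The use of $t2 as a countdown and $t3 as the running sum are assumptions made for the example.

```mips
        .data
mynum:  .word 78, 85, 90

        .text
main:   la   $t0, mynum              # $t0 points at the first word
        li   $t2, 3                  # number of words still to visit
        li   $t3, 0                  # running sum

next:   lw   $t1, ($t0)              # follow the pointer and load the word
        add  $t3, $t3, $t1           # add it to the sum
        addi $t0, $t0, 4             # move the pointer up to the next word
        addi $t2, $t2, -1            # one fewer word to visit
        bnez $t2, next               # keep going until the countdown reaches 0

        move $a0, $t3
        li   $v0, 1                  # print_int: prints 253 (78 + 85 + 90)
        syscall
        li   $v0, 10                 # exit
        syscall
```

The pointer moves in steps of 4 because each word occupies four bytes.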

5.3  Indexed Addressing: Arrays

We will come back and consider pointers in more detail later. For arrays, we are used to languages like Java which specify the array and an index into the array, e.g. int x = myarray[7]; MIPS provides this type of addressing too; it is called indexed addressing. The syntax is similar to pointers, but we specify both the base address of the array and the offset (or index) into the array in a single instruction. Re-using the mynum data from the above examples:
             li $t0, 0          # Set $t0 up as the offset into the array, initially 0
             lw $t1, mynum($t0) # Load the value at address mynum+$t0, i.e. load 78
             addi $t0, $t0, 4   # Increase the offset by 4 to move to the next word
             lw $t1, mynum($t0) # This loads 85 into register $t1

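This time, $t0 isn't being used as a pointer, but as an offset into the array. Effectively, the last line loads the word stored at address mynum + $t0, which is 0x1000 + 4 = 0x1004 under the earlier assumption about where mynum lives, and that word is 85.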
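The same idea extends to a loop over the whole array. In this sketch, $t0 is a byte offset that steps through 0, 4 and 8, and each element is printed followed by a space. The sep string and the hard-coded limit of 12 bytes (3 words of 4 bytes) are assumptions made for the example.

```mips
        .data
mynum:  .word 78, 85, 90
sep:    .asciiz " "                  # hypothetical separator string

        .text
main:   li   $t0, 0                  # byte offset into the array, initially 0

top:    bge  $t0, 12, done           # 3 words * 4 bytes: past the end of the array
        lw   $a0, mynum($t0)         # load the word at address mynum + $t0
        li   $v0, 1                  # print_int
        syscall
        la   $a0, sep
        li   $v0, 4                  # print_string(" ")
        syscall
        addi $t0, $t0, 4             # move the offset to the next word
        b    top

done:   li   $v0, 10                 # exit
        syscall
```

Note that the offset counts bytes, not elements: a Java index i corresponds to the offset i * 4 for an array of words.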

5.4  Task 5

Download the program wk3addressing.asm and load it into MARS. This program demonstrates both the use of indirect addressing, where $t0 is a pointer, and indexed addressing, where $t0 is an offset into an array. Read through the program, understand what it is doing, and then execute the program in single-step mode to see what each instruction does.

6  Strings in MIPS

There are many different ways of storing strings in computers. The Java way is to create an object which contains an array of characters and an integer which is used to record the string's length. In other languages like C, the string is simply stored in consecutive byte-sized memory locations, and the string is terminated with a special character. Typically this is the byte value 0x00, also known as the NUL character.
With NUL-terminated strings, the string's length is not stored and has to be recalculated each time, which is a disadvantage. The advantages are that we save memory by using a single NUL byte instead of storing the length, and that the string's actual length can never be inconsistent with a stored length field, because there isn't one.
In effect, C strings are still characters stored in an array of byte values, but the last value in the array is the NUL character.
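To illustrate what "recalculating the length" involves, here is a sketch that counts the characters in a NUL-terminated string by walking a pointer along it until the 0x00 byte is found. The string "Hello" and the label msg are assumptions made for the example; .asciiz is the directive that appends the NUL for us.

```mips
        .data
msg:    .asciiz "Hello"              # hypothetical string, stored with a trailing NUL

        .text
main:   la   $t0, msg                # pointer to the current character
        li   $t1, 0                  # length counted so far

count:  lb   $t2, ($t0)              # load one byte (one character)
        beqz $t2, done               # NUL terminator reached: stop counting
        addi $t1, $t1, 1             # one more character
        addi $t0, $t0, 1             # characters are one byte apart, not four
        b    count

done:   move $a0, $t1
        li   $v0, 1                  # print_int: prints 5
        syscall
        li   $v0, 10                 # exit
        syscall
```

This is a WHILE loop over bytes: lb replaces lw, and the pointer advances by 1 instead of 4.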

6.1  Task 6

Download the program wk3tolowercase.asm and load it into MARS. This program prints a string out, then converts all the uppercase letters to lowercase using a loop, then prints it out again. Some things to note here are:
Read through the program, understand what it is doing, and then execute the program in single-step mode to see what each instruction does.

7  Status Check and Some Exercises

At this point in your learning of assembly language, you have seen:
This is about the same amount of knowledge as half of our introductory programming subject in Java. At this point, you should take stock of what you have learned and reinforce it. There are several ways to do this:
To help you with the latter, here are some programming exercises from I2P which you should now be able to write in MIPS assembly language.
  1. Write a program to prompt the user for and read in three integer values. Print out the largest of the three values.
  2. Write a program that defines a string in the .data area, and which counts the number of digits in the string. Print out the number of digits.
  3. Write a program which reads in a set of integer values from the user, until the user enters the number 0. Calculate the running sum and running count of the numbers entered (and keep them in two registers). When the user enters 0 to stop, calculate the average of the numbers as an integer, and print out the average. Don't calculate or print the average if the running count is zero.
  4. Modify the last program to ensure that the user can only enter numbers in the range 0 to 100 inclusive. If they enter a number outside the range, print out an error message and loop back.
  5. Write a program that reads a number in from the user, and prints out a triangle of stars with two sides having the length entered. For example:
          Please enter a number: 7
          *
          **
          ***
          ****
          *****
          ******
          *******
    
    
  6. The .space directive tells the assembler to set aside a specified number of bytes in memory for data, e.g.
                 .data
    emptystring: .space 100
    
    
    The contents of the memory are not initialised. The read_string system call reads characters from the user into the memory: the arguments are the base address of the space and its length, e.g.
                 la $a0, emptystring
                 li $a1, 100
                 li $v0, 8      # Read string syscall
                 syscall
    
    
    Write a program which reads a string into a space in memory, and then counts the number of ASCII digits in the string.
  7. Write a program which has an array of 26 integers, all initially 0. The program reads a string from the user, and counts the number of each lowercase letter in the string, storing the results in the array. Then, the array is printed out. For example:
         Please enter a sentence: My name is Bond, James Bond
         Number of a's: 2
         Number of d's: 2
         Number of e's: 2
         Number of i's: 1
         Number of m's: 2
         Number of n's: 3
         Number of o's: 2
         Number of s's: 2
         Number of y's: 1
    
    

8  Outlook for the Next Lab

In the next lab, we will look at calling functions, how to pass parameters, how to return a value, and how to deal with the limited set of registers on the CPU.


File translated from TEX by TTH, version 3.85.
On 25 Nov 2011, 11:15.