INFT12/72-212 Systems Architecture
121 Endterm Exam Answers

Question One

The correct answers are shown in bold font.
  1. When a process writes to a page with read-only protections,
    1. the OS will mark it read-write if it is a page containing machine code.
    2. the OS will copy it and make it read-write if it is a COW page.
    3. the OS will fork() a child process to read from the page instead.
    4. the OS will terminate the process as all writes to a read-only page are illegal.
    Comment: Not all writes to a read-only page are illegal: copy-on-write pages are logically read/write, but they are marked read-only until the first write, at which point the OS makes the copy and both copies can then be marked read-write. Forking has nothing to do with read-only pages, and machine code pages simply have to be marked read-only.
  2. On Unix, a process in the lowest priority ready queue (127)
    1. is a CPU-bound process that continues to exceed its timeslice.
    2. is an I/O-bound process that never uses all of its timeslice.
    3. will be the first process chosen to run, no matter what processes are in the other ready queues.
    4. will never, ever be chosen to run, no matter what processes are in the other ready queues.
    Comment: In Unix, CPU-bound processes that exceed their timeslice have their priority lowered by 1 at the end of each timeslice, so a process that has the lowest priority is continually exceeding its timeslice.
  3. When a file of size 5,000 bytes is stored in a FAT filesystem,
    1. no disk space will be used up as FAT supports holey files.
    2. 10,000 bytes of disk space will be used up as there are two FAT tables.
    3. exactly 5,000 bytes of disk space will be used up.
    4. more than 5,000 bytes of disk space will be used up.
    Comment: Like all filesystems, FAT uses multiple blocks to store file contents. It is highly likely that the last block is not completely filled with file data, which implies that more than 5,000 bytes of disk space are allocated for the file. In fact, the block size is 512 bytes, so we need to allocate 5,000/512 rounded up = 10 blocks, giving 5,120 bytes of allocated space.
  4. A system has two page table entries:
    Page   Frame   Loaded   Access   A   M
      3      71      200      250    0   1
      4      23       95      100    1   0
    The loaded and access times are in seconds since the machine was booted. Choose the correct statement below.
    1. FIFO would choose to replace page 3 instead of page 4.
    2. NRU would choose to replace page 3 instead of page 4.
    3. LRU would choose to replace page 4 instead of page 3.
    4. Second chance would choose to replace page 4 instead of page 3.
    Comment: NRU would choose page 3 (A=0, M=1, class 1) over page 4 (A=1, M=0, class 2). FIFO would choose page 4 as it was loaded the earliest. Second chance would choose page 3 (A=0) over page 4 (A=1). LRU, as approximated by the A bit, would similarly choose page 3 (A=0) over page 4 (A=1).
  5. The MIPS ISA is considered to be a RISC architecture because:
    1. all of the registers are 32-bits wide.
    2. all the instruction formats are the same width.
    3. there are only a handful of instructions in the architecture.
    4. the architecture provides support for recursion.
    Comment: The size or uniformity of the registers is not relevant, and RISC is not about having only a handful of instructions. All decent ISAs, CISC and RISC alike, support recursion. One prime indicator of a RISC architecture is a fixed instruction width across all instruction formats.

Question Two

The MIPS architecture has the following three instruction formats:
R-type Instruction
op rs rt rd shamt func
6 bits 5 bits 5 bits 5 bits 5 bits 6 bits

I-type Instruction
op rs rt address/immediate
6 bits 5 bits 5 bits 16 bits

J-type Instruction
op target address
6 bits 26 bits
For each of the addressing modes listed below, consider whether the MIPS instruction set architecture uses it, could use it, or could not use it; choose exactly one of those three explanations for each addressing mode. List of addressing modes:
  1. Register indirect mode
  2. Indexed absolute mode
  3. Base plus index plus offset mode
  4. Register pre-increment mode
  5. Memory indirect mode

Answers to Question Two

If the instruction formats are already defined, we could make an addressing mode available only if there are fields in the instructions to hold the operands necessary for that addressing mode.
  1. Register indirect mode: The MIPS architecture already has this mode, and the I-type instruction format is used to provide this mode. Example: lw $t1, ($t2)
  2. Indexed absolute mode: The MIPS architecture already has this mode, and the I-type instruction format is used to provide this mode. Example: lw $t1, 300($t2)
  3. Base plus index plus offset mode: With this mode, a single operand is specified using 2 registers and an immediate value. The MIPS architecture could support this addressing mode using the I-type instruction format, but the instruction would only have a single operand. The only example I can think of would be an instruction that increments (adds 1 to) a location in memory specified by the 2 registers and an immediate value, e.g. a hypothetical inc 45($t1)($t2). It would be impossible for MIPS to support this mode with more than 1 operand in the instruction.
  4. Register pre-increment mode: we only need to specify which register to pre-increment. The MIPS architecture could support this addressing mode using both the R- and I-type instruction formats. Example: add $t0, $t1, ++$t2.
  5. Memory indirect mode: For this one to work, we need to use a literal value as a pointer, and access memory through the pointer. The only problem here is that the literal fields in MIPS are either 16 or 26 bits wide, and neither covers the whole of the 32-bit address space. Even with the 26-bit field, an instruction using this addressing mode could only reach 2^26 bytes, i.e. 1/64 of the 4G address space, and so it would be useless. Therefore, the MIPS architecture could not support this mode.

Question Three

The following MIPS assembly language program receives the starting address of a C string as its single argument, and it returns a single value. Read the program and answer the following questions.
     1	function:    subu $sp, $sp, 24
     2              sw $ra, 20($sp)
     3              sw $t0, 0($sp)
     4              sw $t1, 4($sp)
     5	  
     6              beqz $a0, error
     7	  
     8              li $t1, 0
     9	loop:        lb $t0, ($a0)
    10              beqz $t0, notfound
    11              beq $t0, $t1, found
    12              move $t1, $t0
    13              add $a0, $a0, 1
    14              b loop
    15	 
    16	error:       li $v0, -1
    17              b return
    18	found:       li $v0, 1
    19              b return
    20	notfound:    li $v0, 0
    21	return:      lw $ra, 20($sp)
    22              lw $t0, 0($sp)
    23              lw $t1, 4($sp)
    24              addu $sp, $sp, 24
    25              jr $ra
  1. What possible values can the function return?
  2. What input value(s) will the function treat as an error? Give the name of the input as it is known in the C language.
  3. Describe the purpose of lines 1 to 4 and of lines 21 to 25.
  4. For each loop iteration, what part of the input do registers $t0 and $t1 hold?
  5. What is the purpose of line 13?
  6. Given the input string "beer" which is stored starting at address 5,000, what value will the function return?
  7. Given the input string "flute" which is stored starting at address 6,000, what value will the function return?
  8. Describe the purpose of the function, i.e. what job does it do, what types of inputs will cause it to return its different values?

Answers to Question Three

  1. The function sets $v0 with a return value in several places before branching to the return label to return from the function. The values that are set into $v0 are -1 (line 16), 1 (line 18) and 0 (line 20). These three values are the values returned by the function.
  2. Line 6 branches to the error label if the $a0 argument is 0. As the input is a pointer to a C string, an input of 0 is known in C as a NULL pointer.
  3. Lines 1-4 build a new stack frame for the function, storing the return address and saving the values of some registers. Lines 21-25 reload the registers and the return address, and remove the stack frame from the stack.
  4. Line 9 gets a byte from the input string into $t0 on each iteration of the loop, and line 12 copies that byte into $t1 before looping back to get the next character. Therefore, on each iteration $t0 holds the current character and $t1 holds the previous one, i.e. two adjacent characters from the input string.
  5. Line 13 moves the string pointer up by 1 address, i.e. it now points at the next character in the string.
  6. Line 11 says if the 2 adjacent characters are the same, then branch to found, which returns 1. The string "beer" has 2 adjacent characters the same, so the branch on line 11 will be taken and the function will return 1.
  7. Line 10 exits the loop when the NUL character is found in the string. As there are no adjacent characters the same, we won't leave the loop because of line 11, but leave when the NUL character is found. We branch to the notfound label and return 0.
  8. Given a pointer to a NUL-terminated string, return 1 if the string contains any adjacent characters which are the same, otherwise return 0. If the input is a NULL pointer, -1 is returned.

Question Four

Both Windows 7 and Linux provide "soft links": small files which hold the name of another file. When you attempt to open a soft link, the operating system follows the name in the file and opens the real file instead.
Linux also provides "hard links": a file (its i-node attributes and actual contents) can have any number of filenames, all of equal worth. Using any of the file's names in a file operation will work on the same file.
Discuss the advantages and disadvantages of both approaches. Is either approach definitely better than the other?

Answer to Question Four

Hard links: A file has multiple names, but only 1 set of attributes and contents. The advantages are that all names are equal, so no name gets special treatment. All names share the same attributes, so there is consistency with permissions and things like time of last access or modification. One disadvantage is that hard links cannot cross filesystem boundaries: both names have to be within the same filesystem. In other words, if you mount a USB key in Linux as /media/usbkey, then all the names of the files on the USB key have to appear below /media/usbkey; you can't have one name in /home/fred, as that is on a separate filesystem.
Soft links: There is one real file, and a second file internally points at the real file. The big advantage here is that you can point to a file on another filesystem, so /home/fred/myfile can point to /media/usbkey/Thingy.java. But there are some disadvantages. If the real file is removed or renamed, the soft link stays the same and now points to "nowhere": you try to use the soft link and the action will fail. Similarly, if the mounted filesystem is unmounted, the soft link doesn't know about the change. Another problem is that there are now two files, each with their own permissions. You need to get both sets of permissions aligned to ensure access to the real file. In fact, it gets worse than that as the soft link just stores the name of the file, e.g. /media/usbkey/Thingy.java. You now need read/search access through all the directories from the top down through to the real file to access the file, i.e. /, /media and /media/usbkey. With hard links, each filename points directly at the i-node, so as long as the i-node grants you access, you can access the file.

Question Five

Define the terms internal and external fragmentation. What memory management systems suffer from each type? Explain why the memory fragmentation in partitions can or cannot be reduced. Explain why the memory fragmentation in a paged architecture can or cannot be reduced.

Answer to Question Five

Internal fragmentation: when a resource is allocated in fixed-size units, and not all of the units are completely filled up. External fragmentation: when there are gaps between resource allocations which cannot be used immediately, even if the sum of the gaps would be sufficient to satisfy a request.
Partitions suffer from external fragmentation, as there are gaps between the partitions which may be individually too small to allocate to a new process. This can be reduced, but only by copying the contents of all the allocated partitions to defragment the physical memory, leaving a single gap which could then be allocated.
A paged architecture suffers from internal fragmentation, as each page is fixed in size, and memory allocations are not exactly multiple pages in size, so the last page is not entirely used. There is no solution to internal fragmentation. We cannot allocate the leftover memory on the page to another process, as then they would share the page and this would violate the principle that one process cannot read from or write to any part of another process' address space. The only way to reduce internal fragmentation would be to design new hardware with a smaller page size, but this makes the page map much bigger and would cause slower context switches.

Question Six

On some systems, the OS does not know of the existence of threads, and so-called "user threads" have to be implemented in user mode. On other systems, the OS knows about threads, and has Thread Control Blocks rather than PCBs. What are the advantages of user threads over a thread-aware OS? What are the disadvantages?

Answer to Question Six

If the kernel knows about threads, then each thread can be scheduled independently of the other threads and processes on the system. This is an advantage over user threads. With user threads, the kernel only knows about the process (i.e. the address space), and can only schedule the process. If the process contains 4 threads, then if one of them blocks e.g. doing a read() system call, then the kernel will block the entire process, and the other 3 threads will also be blocked.
Another advantage of a thread-aware kernel over user threads is that the kernel can schedule threads on multiple CPUs if the system has multiple CPUs. With user threads, the kernel can only schedule the process, and so it is limited to a single CPU.
There are not many advantages of user threads over a thread-aware kernel. User threads have to schedule themselves, and this is usually done co-operatively, which means no thread pre-emption, i.e. a disadvantage. About the only advantages I can think of are that user thread switches happen entirely in user mode, with no system call overhead, and that the kernel has less state to manage: it only keeps process control blocks, rather than a control block for every thread.

