MOSS File System Simulator
User Guide
Contents
This document is a user guide for the MOSS
File System Simulator.
It explains how to use the simulator and describes the
programs and the various input files used by and output files produced by the simulator.
The MOSS software is designed for use with
Andrew S. Tanenbaum,
Modern Operating Systems, 2nd Edition
(Prentice Hall, 2001).
The File System Simulator and documentation were written by
Ray Ontko
(rayo@ontko.com).
This User Guide assumes that you have already installed and tested
the simulator.
If you are looking for installation information,
please read the
Installation Guide for
Unix/Linux/Solaris/HP-UX Systems or the
Installation Guide for
Win95/98/Me/NT/2000 Systems.
The file system simulator shows the inner workings of a UNIX V7
file system. The simulator reads or creates a file which
represents the disk image, and keeps track of allocated
and free blocks using a bit map. A typical exercise might be for
students to write a program (in Java) which invokes various
simulated operating system calls against a well-known disk image
provided by the instructor. Students may also be asked to
implement indirect blocks, list-based free block managment, or
write a utility (like fsck) to check and repair the file system.
The MOSS File System Simulator is a collection of Java classes
which simulate the file system calls available in a typical
Unix-like operating system. The "Kernel" class contains
methods (functions) like "creat()", "open()", "read()",
"write()", "close()", etc., which read and write blocks
in an underlying file in much the same way that a real
file system would read and write blocks on an underlying
disk device.
In addition to the "Kernel" class, there are a number of
underlying classes to support the implementation of the kernel.
The classes FileSystem, IndexNode, DirectoryEntry, SuperBlock,
Block, BitBlock, FileDescriptor, and Stat contain all data
structures and algorithms which implement the simulated
file system.
Also included are a number of sample programs which can
be used to operate on a simulated file system. The Java
programs "ls", "cat", "mkdir", "mkfs", etc., perform
file system operations to list directories, display files,
create directories, and create (initialize) file systems.
These programs illustrate the various file system calls
and allow the user to carry out various read and write
operations on the simulated file system.
As mentioned above, there is a backing file for our simulated
file system. A "dump" program is included with the distribution
so that you can examine this file, byte-by-byte. Any dump
program may be used (e.g., the "od" program in Unix); we include
this one which is simple to use and understand, and can be
used with any operating system.
There are a number of ways you can use the simulator to
get a better understanding of file systems. You can
- use the provided utility programs
(mkfs, mkdir, ls, cat, etc.) to
perform operations on the simulated file system and use
the dump program to examine the underlying file and observe
any changes,
- examine the sample utility programs to see how they use the system
call interface to perform file operations,
- enhance the sample utility programs to provide additional
functionality,
- write your own utility programs to extend the functionality of the
simulated file system, and
- modify the underlying Kernel and other implementation classes
to extend the functionality of the
In the sections which follow, you will learn what you need to
know to perform each of these activities.
The mkfs program creates a file system backing file.
It does this by creating a file whose size is specified by the
block size and number of blocks given. It writes the superblock,
the free list blocks, the inode blocks, and the data blocks
for a new file system. Note that it will overwrite any existing
file of the name specified, so be careful when you use this program.
This program is similar to the "mkfs" program found in
Unix-like operating systems.
The general format for the mkfs command is
java mkfs file-name block-size blocks
where
- file-name
- is the name of the backing file to create (e.g., filesys.dat).
Note that this is the name of a real file, not a file in simulator.
This is the file that the simulator uses to simulate the disk device
for the simulated file system.
This may be any valid file name in your operating system environment.
- block-size
- is the block size to be used for the file system (e.g., 256).
This should be a multiple of the index node (i-node) size (usually 64)
and the directory entry size (usually 16). Modern operating systems
usually use a size of 1024, or 512 bytes. We use 128 or 256 byte block
sizes in many of our examples so that you can quickly see what happens
when directories grow beyond one block. This should be a decimal
number not less than 64, but less than 32768.
- blocks
- is the number of blocks to create in the file system(e.g., 40).
This number includes any blocks that may be used for the superblock,
free list management, inodes, and data blocks. We use a relatively small
number here so that you can quickly see what happens if you run out of
disk space. This can be any decimal number greater than 3, but not greater
than 224 - 1 (the maximum number of blocks), although you may not
have sufficient space to create a very large file.
For example, the command
java mkfs filesys.dat 256 40
will create (or overwrite) a file "filesys.dat" so that it contains
40 256-byte blocks for a total of 10240 bytes.
The output from the command should look something like this:
block_size: 256
blocks: 40
super_blocks: 1
free_list_blocks: 1
inode_blocks: 8
data_blocks: 30
block_total: 40
From the output you can see that
one block is needed for the superblock, one for
free list management, eight for index nodes, and the remaining
30 are available for data blocks.
Why is there 1 block for free list management? Note that 30 blocks
require 30 bits in the free list bitmap. Since
256 bytes/block * 8 bits/byte = 2048 bits/block, clearly
one bitmap block is sufficient to track block allocation
for this file system.
Why are there 8 blocks for index nodes? Note that 30 blocks could
result in 30 inodes if many one-block files or directories are created.
Since each inode requires 64 bytes, only 4 will fit in a block.
Therefore, 8 blocks are set aside for up to 32 inodes.
The mkdir program can be used to create new
directories in our simulated file system. It does this
by creating the file specified as a directory file, and
then writing the directory entries for "." and ".." to the
newly created file. Note that all directories leading
to the new directory must already exist.
This program is similar to the "mkdir" command in Unix-like and
MS-DOS-related operating systems.
The general format for the mkdir command is
java mkdir directory-path
where
- directory-path
- is the path of the directory to be created (e.g., "/root", or
"temp", or "../home/rayo/moss/filesys"). If directory-path
does not begin with a "/", then it is appended to the
path name for working directory for the default process.
For example, the command
java mkdir /home
creates a directory called "home" as a subdirectory of the root
directory of the file system.
Similarly, the command
java mkdir /home/rayo
creates a directory called "rayo" as a subdirectory of the
"home" directory, which is presumed to already exist as a
subdirectory of the root directory of the file system.
The ls program is used to list information
about files and directories in our simulated file system.
For each file or directory name given it displays information
about the files named, or in the case of directories, for
each file in the directories named.
This program is similar to
the "ls" command in Unix-like operating systems, or the "dir"
command in DOS-related operating systems.
The general format for the ls command is
java ls path-name ...
where
- path-name ...
- is a space-separated list of one or more file or
directory path names.
For example, the command
java ls /home
lists the contents of the "/home" directory. For each file
in the directory, a line is printed showing the name of the
file or subdirectory, and other pertinent information such
as size.
The output from the command should look something like this:
/home:
1 48 .
0 48 ..
2 32 rayo
total files: 3
In this case we see that the "/home" directory contains
entries for ".", "..", and "rayo".
The tee program reads from standard input and
writes whatever is read to both standard output and
the named file. You can use this program to create
files in our simulated file system with content created
in the operating system environment.
This program is similar to the "tee" command found in
many Unix-like operating systems.
The general format for the tee command is
java tee file-path
where
- file-path
- is the name of a file to be created in the simulated
file system. If the named file already exists, it will
be overwritten.
For example,
echo "howdy, podner" | java tee /home/rayo/hello.txt
causes the single line "howdy, podner" to be written
to the file "/home/rayo/hello.txt".
The output from the command is
howdy, podner
which you should note was the same as the input sent
to the tee program by the "echo" command.
Note that the "|" (pipe) is almost always used with the
tee program. Users of Unix-like operating systems
will find the "echo", and "cat" commands useful to produce
input for the pipe to tee. Users of MS-DOS-related
operating systems will find the "echo" and "type" commands
to be useful in this regard.
If you wish to simply enter text directly to a file, then
you may use tee directly (i.e., without the pipe).
Users of Unix-like operating systems will need to use
CTRL-D to signal the end of input. Users of MS-DOS-related
operating systems will need to use CTRL-Z to signal the
end of input.
The cp program allows you to copy the contents
from one file to another in our simulated file system.
If the destination file already exists, it will be overwritten.
This program is similar to the "cp" command in Unix-like
operating systems, and the "copy" command in MS-DOS-related
operating systems.
The general format of the "cp" command is
java cp input-file-name output-file-name
where
- input-file-name
- is the path-name for the file to be copied (i.e., the
source file, and
- output-file-name
- is the path-name for the file to be created (i.e., the
target file.
For example,
java cp /home/rayo/hello.txt /home/rayo/greeting.txt
creates a new file "/home/rayo/greeting.txt" by
copying to it the contents of file "/home/rayo/hello.txt".
The cat program reads the contents of a named file
and writes it to standard output. The cat program
is generally used to display the contents of a file.
This program is similar to the "cat" command in Unix-like
operating systems, or the "type" command in MS-DOS-related
operating systems.
The general format of the cat command line is
java cat file-name
where
- file-name
- is the name of the file from which data are to be
read for writing to standard output.
For example,
java cat /home/rayo/greeting.txt
causes the file "/home/rayo/greeting.txt" to be read,
the contents of which are written to standard output.
In this case, the output from the program might look
something like this
howdy, podner
While you are working with the file system simulator,
you may wish to dump the contents of the backing file
to see if it contains what you think
it contains. The dump program shows the contents
of a file in the operating environment, one byte at a
time, in various formats (hexadecimal, decimal, ASCII).
Note that dump dumps the contents of a real file,
not a file in our simulated file system.
The general format of the dump command line is
java dump file-name
where
- file-name
- is the name of the file to be dumped. This should
generally be the name of the backing file for the file
system simulator (e.g., "filesys.dat").
The general format of the dump output is
addr hex dec asc
where
- addr
- is the decimal address of the byte,
- hex
- is the hexadecimal value of the byte,
- dec
- is the decimal value of the byte, and
- asc
- is the corresponding ASCII character if the
value is between 33 and 127 (decimal).
Each line of dump output corresponds to a single byte
in the file.
To keep the listing brief, dump only displays
non-zero bytes from the input file.
For example
java dump filesys.dat | more
causes the contents of the file "filesys.dat" to be
displayed, one line per byte. The "| more" causes
you to be prompted for each page of the output.
The first page of the output should look
something like this:
0 1 1
5 28 40 (
9 1 1
13 2 2
17 a 10
256 1f 31
512 40 64 @
515 3 3
523 30 48 0
527 ff 255
528 ff 255
529 ff 255
530 ff 255
531 ff 255
532 ff 255
533 ff 255
534 ff 255
535 ff 255
536 ff 255
537 ff 255
538 ff 255
539 ff 255
540 ff 255
541 ff 255
You should notice, for example, that the first block
(the super block) contains a few numeric values corresponding
to the block size (the 1 in the 0 byte means 256),
number of blocks, etc. The second block (starting at byte 256)
contains a few bits that are set, indicating that the first few
blocks are allocated. The third block (starting at 512)
contains a few index nodes; the FF/255 values indicate that
a direct block is unallocated. A little further down you
will see ".", and ".." for the directory entries for the
root file system, and other data blocks.
Each file system simulator program must call Kernel.initialize()
before calling any of the other Kernel methods. The
initialize() method reads a configuration file
("filesys.conf" is the default),
opens the backing file for the file system ("filesys.dat" is the default),
and performs other initializations.
This section of the user guide describes the
various options which may be set in the configuration file.
Name |
Description |
Default Value |
filesystem.root.filename |
The name of the file containing the root file system for the
simulation. |
filesys.dat |
filesystem.root.mode |
The mode to use when opening the root file system backing file.
The mode should either be "rw" for reading and writing, or "r" for
read-only access. |
rw |
process.uid |
The numeric user id (uid) to use for the default
process context.
This should be a number between 0 and 32767. |
1 |
process.gid |
The numeric group id (gid) to use for the default
process context.
This should be a number between 0 and 32767. |
1 |
process.umask |
The umask to use for the default process context. This should
be an octal number between 000 and 777. |
022 |
process.dir |
The working directory in the simulated file system to be used
for the default process context. This should be a string that
starts with "/". |
/root
|
process.max_open_files |
The maximum number of files that may be open at a
time by a process.
When a process context is created, this many slots are created for
possible open files. |
10 |
kernel.max_open_files |
The maximum number of files that may be open at one time by
all processes in the simulation. When the simulator starts, this
many slots are created for possible open files. |
20 |
In addition to the standard configuration file, "filesys.conf",
the distribution also includes a smaller sample configuration
file, "sample.conf". This is shown below to illustrate a typical
configuration file.
!
! my personal filesys configuration file
!
filesystem.root.filename = rayo.dat
filesystem.root.mode = r
process.uid = 1000
process.gid = 1000
process.umask = 002
process.dir = /home/rayo
In this particular example, the file system is contained in the
backing file "rayo.dat", which is here being opened for read-only
access. The working directory for the default process context
is "/home/rayo", with the uid, gid, and umask shown.
The default configuration file is named "filesys.conf" and is
included in the application distribution. You may modify this
file directly to set various options, or you may create your
own configuration file and specify the name of this new file
when you launch your simulator programs.
If you choose to create your own configuration file, you
will need to define a system property "filesys.conf"
which contains the name of file. For example, suppose you
wanted to run the "ls" program using "my_filesys.conf" as the
configuration file. Your java command would look something
like this:
java -Dfilesys.conf=my_filesys.conf ls /home
If there is no value set for the "filesys.conf" system property,
then the name "filesys.conf" is used as the default configuration
filename.
Writing programs that use the File System Simulator
requires the use of the Kernel class,
and may involve the use of the classes
Stat and DirectoryEntry.
If you're writing ordinary programs that use the
standard file system calls, you should not need to reference
any other classes.
These three classes are described briefly here. For more
information, follow the link for the class to the javadoc
for that class.
- Kernel
-
sets up the simulator environment and defines all the
system calls. This class defines: the method
initialize(), which is used to initialize
the file system simulator; the creat(), open(),
read(), write(), close(),
and other methods which simulate the work of a file system;
and constants like EBADF, S_IFDIR, and
O_RDONLY which are used to represent parameter or
return values for the system calls. All the methods and
fields of Kernel are static; you do not instantiate a
Kernel object. For examples, see any of the sample
programs (i.e., cat.java, cp.java,
ls.java, etc.)
- Stat
-
is a data structure that represents information about a
file or directory. This intends to faithfully represent
the Unix stat struct. You may reference fields
within a stat object directly (e.g., stat.st_ino),
or using JavaBean-style accessor/mutator methods (e.g.,
stat.getIno() or stat.setIno(). Stat
objects are updated by the methods
Kernel.stat() and Kernel.fstat().
For examples, see mkdir.java.
- DirectoryEntry
-
is a data
structure that represents a single record in a directory
file. This intends to faithfully represent a Unix
dirent struct. It contains an index node number and
a file name. You may reference the fields directly (e.g.,
dirent.d_ino), or using JavaBean-style accessor/mutator
methods (e.g., dirent.getIno() or dirent.setIno()).
However, Java programmers my find it more convenient to use
the getName() and setName()
(which use String)
instead of the field d_name (which is byte[]).
DirectoryEntry objects are updated by
the method Kernel.readdir(). For examples, see
mkdir.java and ls.java.
For more information about Unix system calls and the
stat and dirent structs, refer to a
Unix system manual. Users of Unix-like systems may
find the commands "man -S 2 creat",
"man -S 2 open", etc. to be helpful.
All programs that use the File System Simulator should
adhere to the following guidelines:
- Invoke the method Kernel.initialize()
before any other File System Simulator calls.
- Use Kernel.exit() when you wish to
terminate processing in your program.
- Check for errors after each system call (e.g.,
creat(), open(), read(),
write(), etc.).
Nearly all the system calls return -1 if an error
occurs.
- Use Kernel.perror()
to print the message associated with an error.
- Use Kernel.getErrno()
to determine which error occurred, if needed. Note that in standard
Unix programs you would reference the static process
variable "errno".
For examples, take a look at the following sample programs
in the distribution:
Collectively, these sample programs invoke all of the core methods
(system calls) of the file system simulator.
Adding new features to the File System Simulator
is an excellent way to probe your understanding of
file system operation, and to investigate new features.
Enhancements will almost certainly require changes
to the class Kernel, and may necessitate
changes to the sample programs described above.
This section describes the other classes that
implement the functionality of the simulator so
that you may understand the intended organization
of these components when making a proposed enhancement.
The following are the internal classes for
the file system simulator:
- BitBlock
- is a data structure that views a device block as a
sequence of bits. The methods setBit(),
resetBit(), and isBitSet() are used
to set, reset, or check a bit in the block.
This structure is used to implement
bitmaps, and is used by the file system simulator to
track allocated and free data blocks in the file system.
BitBlock extends Block.
- Block
- is a data structure that views a device block as a
sequence of bytes. The field bytes is an array
of byte, and is directly accessible. Included
are methods to read() and write() the
block to a java.io.RandomAccessFile, which
simulate the action of reading or writing a device block.
- FileDescriptor
-
is a structure and collection of methods that represent
an open file. It includes a number of get and
set methods for various tidbits of information
about the open file, and provides readBlock
and writeBlock() methods for reading and writing
the blocks of the file.
- FileSystem
-
is a structure and collection of methods that represent
an open (mounted) file system. It includes a few get
and set methods for various fields about the file
system, but more importantly, includes methods to open()
the file behind the file system, to read() and
write() blocks of the device, to manage blocks
(allocateBlock() and freeBlock()) and
to manage inodes (allocateIndexNode()). In general,
Kernel methods should call FileSystem
methods when they want to read or write data in the file system.
- IndexNode
-
is a structure and collection of methods
for representing an index node. This is
meant to reflect the exact structure on disk for an index
node. It includes get and set methods
for each of the fields in the index node. Also included
are read() and write() methods which
are used to copy data to and from byte arrays (not disk files).
- ProcessContext
-
is a structure and collection of methods to represent
a process. This is where the simulator stores the
uid, gid, umask, dir, and other information for the
current process. It includes get and set
methods for each of the fields in a process.
- SuperBlock
-
is a structure and collection of methods for representing
the superblock on the disk. In our implementation, the
superblock contains information about the block size,
number of blocks, offsets to the first block of the free
list, inode block, and data block areas of the device.
It includes get and set methods
for each of the fields in the superblock. Also
included are methods to read() and write()
the superblock.
Of course, you should look at the code and plan your enhancements
carefully.
- Use mkfs to create a file system with a block size
of 64 bytes and having a total of 8 blocks. How many index nodes will fit in a block?
How many directory entries will fit in a block? Use dump
to examine the file system backing file, and note the value in
byte 64. What does this value represent? Use mkdir to
create a directory (e.g., /usr), and then use dump
to examine byte 64 again. What do you notice? Repeat the process of
creating a directory (e.g., /bin, /lib, /var,
/etc, /home, /mnt, etc.) and examining with dump.
How many directories can you create before you fill up the file system?
Explain why.
- Enhance ls.java to display for each file the
uid and gid as decimal numbers,
and the 9 low-order bits of mode as a 3-digit octal number
(i.e., 000..777).
- Write a program find.java which, given a path name, checks to
see if the path exists, and if so lists that path name and all files in all
directories (and sub-directories, and sub-sub-directories, etc.) under it,
one path name per line.
For example:
java find /home
might produce the following output:
/home
/home/nathant
/home/nathant/bar.txt
/home/nathant/foo.txt
/home/rayo
/home/rayo/homer
/home/rayo/homer/odyssey.txt
/home/rayo/homer/iliad.txt
/home/rayo/virgil
/home/rayo/virgil/aeneid.txt
/home/rayo/virgil/eclogues.txt
/home/rayo/virgil/georgics.txt
under the right circumstances, of course.
Hint: Your program may include a recursive method or an array for keeping
track of each directory as you open it. What is the maximum directory
tree depth to which your program will work?
- Enhance the file system simulator to include a new method,
Kernel.chown(), which, given the name of a file, a uid,
and a gid, sets the file's uid and gid to the values given.
Note that only the owner of a file (or the super-user) may
change the gid of a file. Only the super-user may change the
uid of a file. To test your new method,
write two new programs chown.java and
chgrp.java which accept a uid or gid (respectively)
and a list of file or directory names.
- Enhance the file system simulator to include a new method,
Kernel.chmod(), which, given the name of a file and a
mode, sets the file's mode to the value given. Note that only
the owner of a file (or the super-user) may change
the mode for a file, and that only the 9 low-order bits of
mode are affected. To test your new method,
write a new program chmod.java which accepts a
mode value (000..777) and a list of file or directory names.
- Enhance the file system simulator to include a new method,
Kernel.umask(), which, given a umask, saves that
umask in the process context and returns the previous
umask value. Note that only the 9 low-order bits of the
umask are used. Further, modify Kernel.creat()
so that the mode for newly created files is the logical
and of the given mode and the compliment of
the current umask value from the process context.
- Enhance the file system simulator to include a new method,
Kernel.link(), which, given two path names, creates
the second path as a (hard) link to the first path. link()
should find the inode number for the first file, and then write a
directory entry for the second path which references the same
index node. Don't forget to increment nlink on the index node.
To test your new method, write a new program, ln.java,
which, given two path names, performs the link() operation.
Assume that creating a link to a directory is not allowed.
- Enhance the file system simulator to include a new method,
Kernel.unlink(), which, given the name of a file, removes
the directory entry for that file and decrements nlink
for the index node. If nlink is decremented to zero,
free all the blocks of the file. If the file is currently open
by any process, mark the file so that the blocks will be
freed when the file is closed by the last process.
To test your new method, write a new program rm.java
which accept the names of files to be unlinked.
Assume that unlinking a directory is not allowed.
- Enhance the file system simulator to include a new method,
Kernel.access(), which, given the name of a file and a
mode, determines whether the current process can access the file
in the desired mode. In this case, mode may be 4 (read), 2 (write),
1 (execute), or any combination thereof. If the mode is 0, only
the directories leading to the file are checked for read access.
- Enhance the file system simulator
to keep track of atime, mtime,
and ctime for file and directory operations.
Consider using java.util.Date(long),
java.util.Date.getTime(), and
java.util.Date.setTime(long) as part of your
solution.
- Enhance the file system simulator to support
indirect blocks, and double- and triple-indirect
blocks.
- Enhance the file system simulator to support
a simplified form of file sharing. Assume that a
file may not be opened for writing if it is already
open by any process; a file may be concurrently open
any number of times for reading.
- Enhance the file system simulator to allow the use
of a list instead of a bitmap for keeping track of free
data blocks. Modify mkfs.java to allow the user
to enter a fourth parameter to choose between
free-list and free-bitmap.
- Write a program called fsck.java which
checks a file system for internal consistency. For
example, it should verify that all the inodes mentioned
in directory entries have the correct number of nlink,
and that all blocks mentioned in the inodes are marked
as allocated blocks, and all blocks NOT mentioned in the
inodes are marked as free blocks. For a more complete
description of fsck, consult the text.
- Attempting to create or open for writing a file on a
read-only file system should return EROFS,
not an IOException.
- The cp program doesn't check to make sure that
the input and output files are not directory files, but should.
- The cp and tee programs are not careful
about the mode used for the output file, but should be.
© Copyright 2001, Prentice-Hall, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program (see copying.txt);
if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Please send suggestions, corrections, and comments to
Ray Ontko
(rayo@ontko.com).
Last updated: October 3, 2001