Document Processing Requirements for a Ph.D Thesis
Warren Toomey, 18th June 1998
This is an on-line version of a talk I gave to the current Ph.D students
in the School of Computer Science at ADFA. I have left the bulleted points
of the talk untouched, but I have added a section with hyperlinks to the
tools mentioned in the talk.
If you have any questions about the presentation, or the tools I used,
please email me at wkt@cs.adfa.edu.au.
- I could only tell once I had finished the whole thesis!
- Separate files, e.g. per chapter, with the ability to produce one document.
- Chapters, sections, subsections.
- Markup & layout ability, fonts, bold/italic, quotation, computer code,
math equations.
- Prefaces, appendices.
- Tables of contents, diagrams, references, etc.
- Graphics: diagrams, graphs, artwork, photos.
- Internal references: chapter, section, subsection, page. Done
automatically.
- Citation: of direct quotes, paper references. Flexible citation style.
Production of bibliography and reference sections.
- A reference database: what details can be stored, ability to add more
fields and use them.
- Document management: backups, versioning, visualisation of differences.
- Preview of output. Printing of selected pages.
- Spell checking.
- Tools you are comfortable with.
- LaTeX only processes the input files, it doesn't display them.
- LaTeX outputs documents in a special format known as DVI, or
`device independent' format.
- Several DVI viewers available. I use xdvi for X Windows.
- To print these files, I use dvips to produce PostScript
files which can then be printed on our laser printers.
- Other DVI display and print tools are available.
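The LaTeX to DVI to PostScript steps above can be sketched as a small shell function. This is only a minimal sketch, not necessarily the commands used for the thesis: the function name `build_draft` and the file name `thesis` are hypothetical, and it assumes `latex` and `dvips` are on your PATH.

```shell
#!/bin/sh
# build_draft BASE: run LaTeX on BASE.tex, then convert the DVI to PostScript.
# BASE (e.g. "thesis") is a hypothetical file name used for illustration.
build_draft() {
    base=$1
    # Two passes, so that cross-references and the table of contents settle
    latex -interaction=nonstopmode "$base.tex" || return 1
    latex -interaction=nonstopmode "$base.tex" || return 1
    # Convert the device-independent output to a PostScript file
    dvips -o "$base.ps" "$base.dvi"
}

# Preview on screen:        xdvi thesis.dvi
# Print pages 3 to 7 only:  dvips -pp 3-7 -o pages.ps thesis.dvi
```

Calling `build_draft thesis` should leave `thesis.dvi` for previewing and `thesis.ps` for printing.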
- LaTeX's built-in drawing capability is terrible.
- However, there are `standard' extensions to include external
graphic files. I use the EPSF extensions. Both the xdvi
and dvips tools understand these extensions.
- I use xfig as my tool for drawing diagrams. It produces
EPSF files.
- To convert data to graphs, I use Gnuplot. It can produce
EPSF files directly, but I normally convert to xfig
format, so I can add/move labels, and then convert to EPSF.
- For screen-shots, artwork etc., xv can be used to convert
from nearly any bitmap format to EPSF format.
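The data to graph to EPSF route described above might look roughly like the following. It is a sketch under assumptions: the function name `plot_to_eps` and the file names are hypothetical, the data file is assumed to hold two whitespace-separated columns, your gnuplot build must include the `fig` terminal, and `fig2dev` (from the transfig package) is assumed to be the converter from xfig format to EPS.

```shell
#!/bin/sh
# plot_to_eps DATA OUT: plot a two-column data file, via xfig format, to EPS.
# DATA and OUT are hypothetical names chosen for illustration.
plot_to_eps() {
    data=$1
    out=$2
    # Ask gnuplot for xfig output, so labels can still be edited in xfig
    gnuplot <<EOF || return 1
set terminal fig
set output "$out.fig"
plot "$data" using 1:2 with lines
EOF
    # (Optionally open $out.fig in xfig at this point to add or move labels.)
    # Convert the xfig drawing to Encapsulated PostScript
    fig2dev -L eps "$out.fig" "$out.eps"
}
```

For example, `plot_to_eps results.dat graph` would write `graph.fig` and `graph.eps` in the current directory.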
- The reference database tool which comes with LaTeX is BibTeX.
- Again, based on text files, which you have to hand-edit.
- Citation style within LaTeX is completely malleable. I tweaked
the `scribe' format to suit my thesis. You can define new fields
as well.
- Other standard citation formats available. Many journals
give out LaTeX templates for the format they require.
- There are some BibTeX database tools which give you a GUI front-end,
instead of manually editing the files.
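For readers who have not used BibTeX, the usual way to get citations resolved is to alternate LaTeX and BibTeX runs. The sketch below shows that standard cycle; the function name `resolve_citations` is hypothetical, and it assumes the document names its database and style with `\bibliography` and `\bibliographystyle`.

```shell
#!/bin/sh
# resolve_citations BASE: run the usual LaTeX/BibTeX cycle on BASE.tex.
# BASE is a hypothetical file name used for illustration.
resolve_citations() {
    base=$1
    # Pass 1: LaTeX records the \cite keys in BASE.aux
    latex -interaction=nonstopmode "$base.tex" || return 1
    # BibTeX reads BASE.aux and the .bib database, and writes BASE.bbl
    bibtex "$base" || return 1
    # Pass 2 pulls the bibliography in; pass 3 resolves the citation labels
    latex -interaction=nonstopmode "$base.tex" || return 1
    latex -interaction=nonstopmode "$base.tex"
}
```

Fields that the chosen bibliography style does not use are ignored by BibTeX, which is what makes it possible to keep extra fields in the database and use them only once a style has been adapted.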
- A crucial aspect of thesis production. Versioning later.
- You need backups! Use whatever tools you can use. Do it regularly!
- Home/work document migration: a pain to keep duplicates in sync.
- I have Unix at home & at work. I use rsync to synchronise
trees of (any) files joined by a network. Rsync only sends
differences where required, and also has built-in compression.
- With a 14.4K modem, I can usually rsync my work/home Ph.D
area (roughly 40 Megs) in under 5 minutes, often faster.
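A home/work synchronisation of this kind can be sketched as follows. The function name `sync_tree`, the directory `phd` and the host `work.example.edu` are all placeholders, not the actual setup used for the thesis.

```shell
#!/bin/sh
# sync_tree SRC DEST: mirror SRC into DEST, sending only the differences.
# SRC and DEST are placeholders, e.g. "$HOME/phd/" and "work.example.edu:phd/".
sync_tree() {
    # -a  recurse and preserve permissions, times and links
    # -z  compress data in transit (helps on a slow modem link)
    # A trailing slash on SRC means "the contents of SRC"
    rsync -az "$1" "$2"
}
```

For example, `sync_tree "$HOME/phd/" work.example.edu:phd/` pushes the home copy to work. Adding `-n` to the rsync options gives a dry run that lists what would be transferred without changing anything, which is a cheap safety check before syncing in the wrong direction.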
- You need to be able to find out when you
edited a chapter, why, and what the changes were.
- I use RCS for document versioning. When you check-in a file,
you can add a comment describing why you checked it in. Checked-in
files are read-only, and each check-in gets a new version number.
- When you check-out a file for editing, it becomes writable.
- You can check-in or -out many files at the same time.
- You need to be able to print version numbers on drafts.
I modified LaTeX's page style to do this for me.
- Every document I modified as part of my Ph.D went into RCS:
thesis chapters, source code, log of activities etc.
- LaTeX and a number of other tools gave me the document processing
environment I required to write my thesis.
- It wasn't as user-friendly as current word processors, but it
had the ability to be moulded to my requirements. That was
very important!
- Even better, all the tools are freely available.
- Finally, I expect to still be able to read and use my LaTeX
documents in 10 years with few changes.
Here are some hyperlinks to information about the tools I used.
The current version of LaTeX is LaTeX2e, which differs from the LaTeX
described in Leslie Lamport's book, published in 1985. A Nutshell book
by O'Reilly and Associates, Making TeX Work, was more up to date, but
is no longer being maintained by the author.
I'd welcome any other hyperlinks to good, up-to-date LaTeX books.
On-line information about LaTeX and LaTeX2e, including documentation,
can be found at the LaTeX Encyclopedia site.
LaTeX itself, and more styles, extensions & associated tools than you can
poke a stick at, can be obtained from any of the Comprehensive TeX Archive
Network sites, also known as CTAN.
If possible, you want to get a pre-compiled binary set of the LaTeX tools,
to save you the trouble of building them. If you run FreeBSD or Linux,
you can obtain pre-compiled binary packages. The same exists for
Windows 95, but I don't have any hyperlinks at hand for them.
The BibTeX bibliographic tools come with LaTeX, and you can get many
reference styles from the CTAN.
The DVI tools xdvi and dvips can be obtained through the CTAN.
xdvi has its own home page. I haven't found one for dvips. Both are
pretty easy to compile, and both are available as binary packages for
FreeBSD and Linux.
I used some home-grown tools written in perl to separate colour pages
from black & white pages in the PostScript output from dvips,
so I could send them to different printers.
The main tools I used were xfig to draw figures, Gnuplot to do plotting,
and xv to work with bit images. All three can produce EPSF files. I used
an old LaTeX extension, epsf.sty, to include EPSF figures. There
are newer extensions to work with EPSF files, but I haven't used them.
Check on CTAN for more details.
Xfig doesn't have a web page, but is software contributed to the X
Windows system, and is available at ftp://ftp.x.org/.
Gnuplot has its own home page.
Xv has its own home page.
Again, binary packages are available for FreeBSD and Linux, and all
three are easily built on most Unix platforms.
Rsync is a great tool for synchronising entire trees of files
between two Unix systems connected by a (possibly slow) Internet connection.
You can find out more about rsync from its home page.
There are many document revision systems: RCS, SCCS, CVS, and I hope
there are some systems for Windows 95 (anybody got some hyperlinks?).
Here is the original RCS paper by Walter Tichy.
Here are the basic commands:
ci -u file           | Check-in a new/existing file. Makes read-only, gives new version number.
co -l file           | Check-out file, makes writable.
rlog file            | Shows log of check-ins.
rcsdiff file         | On checked-out file, shows differences from last checked-in version.
rcsdiff -rX -rY file | At any time, shows differences between versions X and Y.
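The basic commands can be strung together into one editing cycle, sketched below. The function name `rcs_demo`, the description and the log message are made up for illustration; the `-q`, `-t-` and `-m` options are used here only to keep `ci` and `co` from prompting, and the RCS tools are assumed to be installed.

```shell
#!/bin/sh
# rcs_demo FILE: walk one file through check-in, check-out, edit, check-in.
# FILE is a hypothetical chapter file that is not yet under RCS control.
rcs_demo() {
    f=$1
    # Initial check-in; -t- supplies the description so ci does not prompt
    ci -q -u -t-"Thesis chapter" "$f" || return 1
    # Lock and check out for editing; the working file becomes writable
    co -q -l "$f" || return 1
    echo "% a new paragraph" >> "$f"
    # Show what changed since the last checked-in version
    # (rcsdiff exits non-zero when it finds differences, hence "|| true")
    rcsdiff -q "$f" || true
    # Check in again with a log message; this creates the next revision
    ci -q -u -m"Added a paragraph" "$f" || return 1
    # Review the history of check-ins
    rlog "$f"
}
```

Run on a fresh file, this should leave two revisions in the log and a read-only working copy.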
Finally, you might be interested to know that this talk was written
in LaTeX and translated to HTML using latex2html. Similarly,
I have converted my Ph.D thesis to HTML and it is now on-line.
Warren Toomey
6/18/1998