Introduction
I asked myself:
"Crazy idea: what's the smallest number of 7400-style chips
needed to build a CPU?". This is the result, my "crazy
small" CPU. I've implemented it both in Logisim and also with
real chips on two breadboards. The chip count ended up as
three ROMs, one RAM, three registers, two counters, one multiplexer
and one timer for a total of eleven chips.
The CPU has a 4-bit
data path with an 8-bit address space (256 4-bit words). There are
two 4-bit data registers, A and B, which can be loaded across the
data bus from the RAM. The A and B values are sent to the ALU, which
can perform sixteen different operations. The ALU result can be placed
back on the data bus for writing back to RAM.
Using the
multiplexer, both the A and B registers can be loaded either from RAM or from
part of the current instruction. This allows constant literal
values to be used in the program.
The instruction
sequence is controlled by the 8-bit Program Counter (PC) which
normally increments. The PC's value chooses a location in the top and
bottom ROMs. The bottom ROM's value is the RAM location to access on
this instruction. The top ROM's value encodes the six control signals
for the CPU:
ALUop (3 bits): which ALU operation to perform
PCincr (1 bit): increment the PC (high) or load the PC (low)
Aload (1 bit, active low): load the A register from the multiplexer (low)
Bload (1 bit, active low): load the B register from the multiplexer (low)
Asel (1 bit): when loading a register, get data from the instruction (low) or from RAM (high). This control line
serves a dual purpose. It also acts as an ALUop bit, giving the ALU sixteen operations.
RAMwrite (1 bit, active low): when high, read from RAM onto the data bus;
when low, read from the ALU onto the data bus and write it to RAM
Finally, the ALU
outputs four result flags: negative (N), zero (Z), overflow (V) and
carry (C). These are buffered in the Flags register at the end of each
instruction to be used on future instructions.
The CPU can perform
jumps. When the PCincr control line is low, the address from the
bottom ROM is loaded into the PC, "jumping" to a new
instruction. To do conditional jumps, the two ROMs receive and
combine the PC's value and the NZVC value as the ROM address
to look at. This means that there are sixteen ROM "pages",
one for each combination of NZVC flags.
So here is where it
gets tricky. If you want the CPU to keep running one instruction
after the other, you need to write the instruction sequence in all 16
pages on the ROMs. Thus, regardless of what the NZVC flags are, the
correct next instruction will be found.
So, how is a
conditional jump like JCS (jump to location X if carry is set) done?
Assume that this instruction occurs at location 12 in the instruction
sequence. At location 12 on the eight ROM pages which would be
accessed when the C flag is set, a "jump" instruction is
encoded. At location 12 on the other eight ROM pages, a "nop"
instruction is encoded. Only when the C flag is set will the ROMs
access the pages with the "jump" instruction and jump.
Otherwise, the CPU will perform the "nop" instruction and
move onto the next instruction.
The ALU Operations
The ALU can perform
sixteen operations based on the A and B inputs:
A + B decimal
A - B decimal
A & B
A | B
A ^ B
A + 1
Output 0, flags set to B's value
Output 0
A + B binary
A - B binary
Output A
Output B
A * B binary, high nibble
A * B binary, low nibble
A / B binary
A % B binary
The negative (N), zero (Z), overflow (V) and carry (C) flags
are set after each ALU operation except the decimal operations.
The "Output A"
and "Output B" operations allow the ALU to pass the A and B
registers through so they can be written out to RAM.
The ALU is
implemented as a 8K x 8 bit ROM. As input it takes four A bits, four
B bits, a carry in bit and the four ALUop bits (including Asel). The output is the four
result bits and the four flag bits. The ROM simply looks up the result based
on the input.
The binary and decimal addition and subtraction operations take and use
the carry in bit. This allows N-bit or N-digit additions and subtractions.
Instruction Format
Taking the top and
bottom ROMs as a single 16-bit wide ROM, here is the instruction
format.
RAMwrite (active low)
|
Asel
|
Bload (active low)
|
Aload (active low)
|
PCincr
|
ALUop
|
RAM/ROM address
|
X
|
X
|
X
|
X
|
X
|
XXX
|
AAAAAAAA
|
This means that the
CPU is already hard-wired to perform certain instructions: we just
need to determine which instructions are actually useful. For example:
Load A with constant from ROM
|
LCA AAAA
|
10101 xxx xxxxAAAA
|
Load A from memory
|
LMA AAAAAAAA
|
11101 xxx AAAAAAAA
|
Load B from memory
|
LMB AAAAAAAA
|
11011 xxx AAAAAAAA
|
Store A to memory
|
SMA AAAAAAAA
|
00111 101 AAAAAAAA
|
Store B to memory
|
SMB AAAAAAAA
|
00111 110 AAAAAAAA
|
Put A+B to memory
|
ADDM AAAAAAAA
|
00111 000 AAAAAAAA
|
Put A-B to memory
|
SUBM AAAAAAAA
|
00111 001 AAAAAAAA
|
Jump to ROM address
|
JMP AAAAAAAA
|
10110 xxx AAAAAAAA
|
No operation
|
NOP
|
1x111 xxx xxxxxxxx
|
Example Program
Here is an example
program which calculates the first Fibonacci numbers which fit into
four bits. The columns shown are: instruction number, instruction, operand,
hex encoded instruction, list of active control lines, comment.
00: LCA 1 a801 Aload # Store 1 in locations 0 and 1
01: SMA 0 3d00 RAMwrite ALUpassa
02: SMA 1 3d01 RAMwrite ALUpassa
03: LMA 0 e800 Asel Aload # Load RAM location 0 into A
04: LMB 1 d801 Asel Bload # Load RAM location 1 into B
05: ADDM 2 3802 RAMwrite ALUadd # Add them and store in location 2
06: JCS end b00c # Result too big, exit the loop
07: LMA 1 e801 Asel Aload # Copy RAM location 1 down to 0
08: SMA 0 3d00 RAMwrite ALUpassa
09: LMA 2 e802 Asel Aload # Copy RAM location 2 down to 1
0a: SMA 1 3d01 RAMwrite ALUpassa
0b: JMP loop b003 # Go to instruction 3 to do it again
0c: JMP end b00c # End: do an infinite loop
If you run
this in the Logisim version, you can see the last three values of the
Fibonacci sequence in the first three RAM locations.
A ROM for an ALU?
Is implementing the
ALU with a ROM fair game, or is it breaking the rules? You could
argue either way. I could have hunted down a 74LS181 ALU to make the
design really authentic. I'd need two more control lines for the five
select/mode bits. Anyway, it's still one chip for the ALU, so the
choice doesn't have
too much of an impact on the design. A ROM also allows the designer
to choose what ALU operations can be performed.
Design Changes
The chip count can be reduced. Right now I'm using two 74LS161 chips for
the PC, but there is an obsolete 74LS469 chip that could replace both of
them. That would bring the chip count down to ten. Next up, replace the
two control ROMs with an AT27C1024 64Kx16 ROM chip (the only x16 ROM I could
find): that would bring the chip count down to nine. Finally, the A and B
registers could be replaced with one CD4508B dual 4-bit register as it
has separate load lines: that would bring the chip count down to eight chips.
I'll probably never do this as I would have to rewire the whole thing!
Design Limitations
It's a Harvard architecture, so instructions and data are separate.
There is only direct memory addressing: no pointers, no indexed addressing.
The available instructions are determined by the control lines, so there
isn't much that you can change. Despite this, we have been able to implement
functions, a stack and a sort of recursion.
There is only one carry bit input and output by the ALU; this really limits any
shift left/right operations to one bit shifts.