Introduction

I asked myself: "Crazy idea: what's the smallest number of 7400-style chips needed to build a CPU?". This is the result, my "crazy small" CPU. I've implemented it both in Logisim and also with real chips on two breadboards. The chip count ended up as three ROMs, one RAM, three registers, two counters, one multiplexer and one timer for a total of eleven chips.

The CPU has a 4-bit data path with an 8-bit address space (256 4-bit words). There are two 4-bit data registers, A and B, which can be loaded across the data bus from the RAM. The A and B values are sent to the ALU, which can perform sixteen different operations. The ALU result can be placed back on the data bus for writing back to RAM.

Using the multiplexer, both the A and B registers can be loaded either from RAM or from part of the current instruction. This allows constant literal values to be used in the program.

The instruction sequence is controlled by the 8-bit Program Counter (PC) which normally increments. The PC's value chooses a location in the top and bottom ROMs. The bottom ROM's value is the RAM location to access on this instruction. The top ROM's value encodes the six control signals for the CPU:

Finally, the ALU outputs four result flags: negative (N), zero (Z), overflow (V) and carry (C). These are buffered in the Flags register at the end of each instruction to be used on future instructions.

The CPU can perform jumps. When the PCincr control line is low, the address from the bottom ROM is loaded into the PC, "jumping" to a new instruction. To do conditional jumps, the two ROMs receive and combine the PC's value and the NZVC value as the ROM address to look at. This means that there are sixteen ROM "pages", one for each combination of NZVC flags.

So here is where it gets tricky. If you want the CPU to keep running one instruction after the other, you need to write the instruction sequence in all 16 pages on the ROMs. Thus, regardless of what the NZVC flags are, the correct next instruction will be found.

So, how is a conditional jump like JCS (jump to location X if carry is set) done? Assume that this instruction occurs at location 12 in the instruction sequence. At location 12 on the eight ROM pages which would be accessed when the C flag is set, a "jump" instruction is encoded. At location 12 on the other eight ROM pages, a "nop" instruction is encoded. Only when the C flag is set will the ROMs access the pages with the "jump" instruction and jump. Otherwise, the CPU will perform the "nop" instruction and move onto the next instruction.

The ALU Operations

The ALU can perform sixteen operations based on the A and B inputs:

  1. A + B decimal

  2. A - B decimal

  3. A & B

  4. A | B

  5. A ^ B

  6. A + 1

  7. Output 0, flags set to B's value

  8. Output 0

  9. A + B binary

  10. A - B binary

  11. Output A

  12. Output B

  13. A * B binary, high nibble

  14. A * B binary, low nibble

  15. A / B binary

  16. A % B binary

The negative (N), zero (Z), overflow (V) and carry (C) flags are set after each ALU operation except the decimal operations. The "Output A" and "Output B" operations allow the ALU to pass the A and B registers through so they can be written out to RAM.

The ALU is implemented as a 8K x 8 bit ROM. As input it takes four A bits, four B bits, a carry in bit and the four ALUop bits (including Asel). The output is the four result bits and the four flag bits. The ROM simply looks up the result based on the input.

The binary and decimal addition and subtraction operations take and use the carry in bit. This allows N-bit or N-digit additions and subtractions.

Instruction Format

Taking the top and bottom ROMs as a single 16-bit wide ROM, here is the instruction format.

RAMwrite (active low)

Asel

Bload (active low)

Aload (active low)

PCincr

ALUop

RAM/ROM address

X

X

X

X

X

XXX

AAAAAAAA

This means that the CPU is already hard-wired to perform certain instructions: we just need to determine which instructions are actually useful. For example:

Load A with constant from ROM

LCA AAAA

10101 xxx xxxxAAAA

Load A from memory

LMA AAAAAAAA

11101 xxx AAAAAAAA

Load B from memory

LMB AAAAAAAA

11011 xxx AAAAAAAA

Store A to memory

SMA AAAAAAAA

00111 101 AAAAAAAA

Store B to memory

SMB AAAAAAAA

00111 110 AAAAAAAA

Put A+B to memory

ADDM AAAAAAAA

00111 000 AAAAAAAA

Put A-B to memory

SUBM AAAAAAAA

00111 001 AAAAAAAA

Jump to ROM address

JMP AAAAAAAA

10110 xxx AAAAAAAA

No operation

NOP

1x111 xxx xxxxxxxx

Example Program

Here is an example program which calculates the first Fibonacci numbers which fit into four bits. The columns shown are: instruction number, instruction, operand, hex encoded instruction, list of active control lines, comment.

00: LCA 1	a801	Aload 		   # Store 1 in locations 0 and 1
01: SMA 0	3d00	RAMwrite ALUpassa 
02: SMA 1	3d01	RAMwrite ALUpassa 
03: LMA 0	e800	Asel Aload 	   # Load RAM location 0 into A
04: LMB 1	d801	Asel Bload 	   # Load RAM location 1 into B
05: ADDM 2	3802	RAMwrite ALUadd    # Add them and store in location 2
06: JCS end	b00c	                   # Result too big, exit the loop
07: LMA 1	e801	Asel Aload 	   # Copy RAM location 1 down to 0
08: SMA 0	3d00	RAMwrite ALUpassa 
09: LMA 2	e802	Asel Aload 	   # Copy RAM location 2 down to 1
0a: SMA 1	3d01	RAMwrite ALUpassa 
0b: JMP loop	b003	                   # Go to instruction 3 to do it again
0c: JMP end	b00c			   # End: do an infinite loop

If you run this in the Logisim version, you can see the last three values of the Fibonacci sequence in the first three RAM locations.

A ROM for an ALU?

Is implementing the ALU with a ROM fair game, or is it breaking the rules? You could argue either way. I could have hunted down a 74LS181 ALU to make the design really authentic. I'd need two more control lines for the five select/mode bits. Anyway, it's still one chip for the ALU, so the choice doesn't have too much of an impact on the design. A ROM also allows the designer to choose what ALU operations can be performed.

Design Changes

The chip count can be reduced. Right now I'm using two 74LS161 chips for the PC, but there is an obsolete 74LS469 chip that could replace both of them. That would bring the chip count down to ten. Next up, replace the two control ROMs with an AT27C1024 64Kx16 ROM chip (the only x16 ROM I could find): that would bring the chip count down to nine. Finally, the A and B registers could be replaced with one CD4508B dual 4-bit register as it has separate load lines: that would bring the chip count down to eight chips.

I'll probably never do this as I would have to rewire the whole thing!

Design Limitations

It's a Harvard architecture, so instructions and data are separate. There is only direct memory addressing: no pointers, no indexed addressing. The available instructions are determined by the control lines, so there isn't much that you can change. Despite this, we have been able to implement functions, a stack and a sort of recursion.

There is only one carry bit input and output by the ALU; this really limits any shift left/right operations to one bit shifts.