A 16-bit CPU, 128 Kilobytes of RAM, a 4K
monitor, all in one Excel file? No Visual Basic Scripts, no plugins, just a regular
spreadsheet. Is it even possible? It's the best kind of possible, theoretically possible.
This video brought to you in part by Brilliant.
Spreadsheets are just fancy calculators. Data in
data out. And if we go by the literal definition of "something that computes" then Excel
is already a very competent computer. But a computer like your PC or other device is a bit
more complex. That has a Central Processing Unit, Gigabytes of Random Access Memory, and some
kind of pixel based display. But essentially, it's still just a calculator, with the CPU
being made of different components that do a more advanced version of data in data out.
And at a low level, it isn’t hard to emulate that in Excel. For example, this is a 3-bit to 8-bit
decoder chip, it takes in a 3-bit binary signal, representing 0 to 7, and gives the one
corresponding output signal. The magic, or should I say the logic, of everything happens
here, in the formula bar. If you think Excel is boring it's because you've never known about
the power of Excel formulas. The formula for each output pin cell uses a combination
of binary logic and integer conversion to correctly compute the output. And I can also
reference other cells in this formula as well, say on a 8-bit to 3-bit encoder, where the input
here is referencing the output of the first chip, that's like virtually wiring these pins together.
Then with three more functions for the output pins here, I have two chips that have the opposite
function, operating in plain old Excel.
But before we take a deep dive into spreadsheets,
a word from this video's returning sponsor Brilliant. If you, like me, love learning, but
don't have the time or sometimes motivation to dedicate as you'd like, then Brilliant may be
just the thing for you. Brilliant has thousands of interactive lessons for beginners and experts
alike. You can learn about Computer Science, Math, and Data Analysis, with only 15 minutes
of interactive problem solving a day, through bite-sized lessons that make learning fun
again. That’s the thing about Brilliant, I enjoy going through the lessons, and I’m learning at
the same time. Try Brilliant for free for thirty days by clicking on the link in the description
below or by visiting www.brilliant.org/Inkbox. And if you're one of the first 200 people
to sign up you'll even get 20% off an annual subscription. That's www.brilliant.org/Inkbox
Now the medium of Excel is a bit different from the typical resistors and transistors of a
physical computer, and so there are a few quirks I have to deal with. Some things have simple
fixes, like in building a clock signal. This runs off a simple formula that sets itself
to 0 if it's 1 and 1 if it's 0. Now, Excel is smart enough to not just run this calculation
in a loop forever, but it's also smart enough to let you do that if you want to anyways. By going
to options I can turn on iterative calculation, and set it to update only one cycle at a time.
Then by pressing F9 or updating any cell, the sheet recalculates and the clock signal updates.
There are going to be lots of alternatives to what I’m doing here by using scripts and stuff,
but if I stoop to that level then I’m not actually using the full potential of Excel,
I’d just be writing a Visual Basic Program.
Anyways, to test this clock signal
out further I built a JK flip flop, one of the fundamental components in low level
digital logic. But it didn't work. I know it should work though because Wikipedia said it
would. But eventually I figured out what the problem was. Excel updates from left to right,
top to bottom, so the physical alignment of the values in the chip are critical to it working
properly. So, for the most part formulas should only reference cells above, not below. Hopefully
it doesn't get any more complicated than that.
So, this is the new design for the JK flip
flop after I flipped it 90 degrees, and it now works as intended. I’ve wired up four of
them together with a couple of binary AND gates, and now I've got a working 4-bit counter that
counts based off the universal clock signal.
Now I could design the whole CPU at this level,
but there's one thing hodling me back from that, my sanity. It'll be much easier to design the
CPU on a higher level based off of a custom Instruction Set Architecture. An ISA describes
how a CPU should work, it's the rule book for the data in data out process. It defines a list of CPU
instructions, and other features of the processor including a set of registers, units of memory
used to store and manipulate data within the CPU, mainly located within what's called the register
file. The main registers in a 64-bit CPU are, you'll never guess, 64-bits long. My registers
will be 16-bits long, making this a 16-bit CPU.
While a typical CPU based on the x86-64
architecture has 981 instruction mnemonics like MOV, ADD, and OR, there are actually
3,684 total variants. Meaning that ADD for example has 6 unique opcodes used for working
with different parts of registers or memory. Writing high level Assembly code, the programmer
doesn't have to worry about these differences, but both the assembling process and
the CPU clearly distinguish the two.
But since the most common instructions are used
an order of magnitude more often than instructions like PAVGB, I decided to not include several
thousand of them in my ISA, instead I boiled everything down to a list of 25 opcodes, and 23
instruction mnemonics. Including loading, storing, transferring from one register to another,
arithmetic operations, bitwise operations, rolling, comparing, jumping, setting flags, and
NOP. I'm even including things like multiplication and division, which aren't strictly necessary, but
it'll make programming things much easier later.
Speaking of programming, a program is a list of
instructions stored in memory. Each instruction is carried out by fetching it from memory,
decoding it and producing necessary signals, executing the operation, and storing the
output either back in the register file or in the computer's memory. I can keep track
of the location of the next instruction using a special register called the Program Counter,
and that's more or less the basic CPU design.
Following that cycle, the first thing to build
is the Fetch Unit, which reads the memory at the address pointed to by the PC register.
The Instructions in my ISA are not a fixed length. Some are 16 bits long, and others are
32 bit long, but the Fetch Unit here always retrieves a full 32-bit value, both the PC and
the PC+1 value. But since both the PC Unit and the Memory Unit haven't been built yet, I'll
have to skip most of that logic here for now.
After getting the instruction, it's passed on to
the Control Unit. Here the instruction is broken into the specified opcode, the first register
operand, the 2nd register operand, and the 16-bit immediate attached value. So even though
a full 32-bit value was retrieved from memory, the second 16 bits won’t affect anything for
most instructions. Each of this unit’s output pins has a unique formula that, based on the
opcode, sets these signals according to this chart here. These signals get carried off to
different parts of the CPU like the Arithmetic Logic Unit, the beating heart of the CPU.
The ALU preforms some kind of operation with the two operands. The ALU operation and operand 1
come directly from the Control Unit, but operand 2 actually comes from an above multiplexer, since
it can be one of 6 different values, either 0, the value of the second register, the memory value
of the address in the second register, the 4-bit immediate value, the 16-bit immediate value,
or the memory value of that 16-bit immediate address. Again, the Memory Unit isn't built yet,
so I’ll fill things in by hand temporarily.
The ALU operation is itself a 4-bit value, so
there are 16 different functions it can run, including some non-arithmetic
actions, that result in a 32-bit output, 16 high bits and 16 low bits.
These are the most important formulas in the whole spreadsheet as it determines what the output of
the processor will be. Most of the operations are straight forward and don’t affect the high 16-bits
at all, but the MULT instruction does result in a full 32-bit result, so the low 16 bits will be
stored in the first specified register and the high 16 bits in the second. Division will cause
the result to be stored in the first register, with the modulus result in the second register.
Rolling the bits left or right was a bit tricky to figure out. The second operand here is the
4-bit immediate value, meaning that the bits can roll left or right up to 15 times. Thankfully
Excel has a BITLSHIFT and BITRSHIFT function, and with a little more algebra I figured it out.
Before the next unit, I need to pass the two results here through another multiplexer, one
that selects between the high result and the low result, this also gives me an excuse to break
down each result into binary, just for flare, I think it's cool to look at. This output is wired
to the input for Register 1 in the Register File, where the 16 General Purpose Registers are kept.
Register 2 input is always the high 16-bit output from the ALU and the REG1, REG2, and the two
Write signals come straight from the Control Unit, if either write signal is set to true, then the
specified register is changed based on the input.
Inside of the Register File I've kept
the four system flags, the carry flag, zero flag, sign flag, and the overflow flag.
The carry flag is set when the ALU low 16-bit result is greater than 2^16. Here's the trick
through, remember that all important ALU formula, what if I told you, it wasn't the final output
formula? Sneakily I've put that long formula one cell above and hid the text by setting the
color to be the same value as the background, then I use modulus division to get a result
that fits within 16 bits for the final result, if the result of this above cell is greater than
2^16, it'll set the carry flag to true. The other flags are simpler, ZF is set if the result of that
low 16-bit value is equal to 0, SF is equivalent to the top bit in the low-16 result, and OF
is set through the conditions of overflow.
The last unit in the CPU is the Program Counter.
When the clock signal is high, the PC checks if it needs to reset to 0, take a two-byte or four-byte
step (which is equal to one or two memory units), or if the PC Set Immediate Flag is set, then,
based on whether the jump conditions are met, it will be set the PC to the 16-bit
Immediate value as specified in the instruction. These are typically used after the
CMP instruction, and help with creating loops and other branches in programs, this will make
more sense when I get to writing some code.
And that's the whole CPU, not too bad and
it comes in a reasonable package. But that's about to change when I start building
the RAM unit. With a 16-bit address bus, there are 65,536 addressable 16-bit memory
units, that’s 128KB of RAM in total. I've fit them into a 256x256 table, and installed
a Memory Management Unit on top. The key signals here are Memory Write from the Control
Unit, the Address from the first multiplexer, and the value that comes from the ALU. This
address is converted into an X and Y coordinate, based off the Excel Space, if the Memory Write
signal is set to high, then the cell at this coordinate is updated to hold this new value.
Of course, one cell can't dictate that another cell be written to. Each cell has to have its own
defined formula, but again I'm not crazy, I didn't write 65,000 different equations, I wrote one
equation that would work for every single cell, highlighted everything then after writing the
function in the formula bar hit Ctrl-Enter and it was automatically applied to every cell.
To read from the RAM table based off of a single 16-bit address, I've used two Excel functions,
Address, to specify the row and column of the desired cell in digits, and Indirect, which
grabs the value of the specified address.
Going back to the Fetch Unit, I can finally finish
it to grab two units of memory from the address in the PC register. And now I've also added two
buttons on top next to the clock. One of them is Reset, which resets the PC and RAM all back
to 0, and the other is Manual. In manual Mode, the instruction is specified by the user in
the override slot in the Fetch Unit. I found this came in handy when designing and testing
things for the CPU, so I've included it as a feature now. The first multiplexer also reads
from two different spots in memory, from the address in the second specified register, and the
address in the 16-bit immediate value. Though, I did make one mistake here, I realized that I
didn't have an instruction that used this 3rd option, so I had to add a 26th instruction.
It’s another LOAD instruction, but this could be useful for indirect addressing in programs.
Now the only thing missing is the 4K screen. And by that I of course mean it uses 4KB of memory.
Looking at the memory map here, it will use the last 4KB of the 128KB of RAM, with the rest being
free for you to do whatever you want with it, it’s your RAM. The screen is 128x128 cells with
each cell representing one pixel, but I've resized these cells to be square rather than rectangle.
The screen will have a 16-color display, that's 4-bits per pixel, or one word in memory
controlling four pixels. That added another layer of complexity to the Excel formula, as I'm reading
data from a 256x16 table onto a 128x128 table, and again I'm only writing the one formula to be
applied to the whole table here, because I find it much more fun to spend three hours writing and
testing one complex function that does everything, than spend one hour mindlessly applying simpler
functions to each cell. After I have that worked out, to add color all I have to do is something
I've been doing this whole time already, apply conditional formatting. I've added 16 different
rules for the 16 different colors, and with the text color being the same as the background,
the cells look and act like regular pixels.
So, this is it now, it’s not a system on a chip,
it’s a system on a spreadsheet. And it all comes down to this. Here's a real simple program
that I wrote that should change the first four pixels of the screen to red, blue, green,
and yellow. First, set register 0 to $F000, alright. Then set register 1 to $48C7, okay. Now
this should be it right here, store register 1 to the address in register 0. Three, two, one…
This is so cool. This is a working CPU in a regular Excel spreadsheet. But I’m not done
yet because right now I’m only running this in manual mode. But I want to run much larger
programs from memory now. And for that I'll need a compiler. Remember when I said my sanity stood
in the way? I’ve designed a new Assembly Language based on my ISA and I'm calling it EXCEL-ASM16, it
features not only 23 different instructions, but also the ability to define variables in the data
segment, for numbers to be defined using decimal, hexadecimal, or for certain instructions,
with @ to signify the memory location, it has support for labels, comments, and
an additional ORG instruction which sets the address for the next instruction, and the
ability to include binary files directly into the program. Now this is all pretty basic for an
Assembly language, and certainly there are more bells and whistles I could have added, but it's
good enough for a made up Excel CPU language.
I wrote the entire compiler in EXCEL-ASM16 and
it runs inside this Excel CPU. No, it doesn’t actually, I’m not insane. I wrote it in Python
so that it'd be cross platform compatible. But, how do I get compiled programs into memory?
Because if I just try to write directly to the cell, that overwrites the cell's formula.
So instead let me tweak a few things. I'm going to come up top and add a few more buttons, a
separate button for reset memory, and a read ROM button. I can't actually keep the ROM in this
file though, because Excel doesn't like formulas with iterative calculations to be saved by an
outside program. So, my many attempts to keep the ROM below the RAM only resulted in many,
many broken spreadsheets. Kids, remember never to run experimental tests in your main file.
When you run the compiler, first specify your program file, then you can specify that you
want the compiled result saved in an Excel spreadsheet called ROM, which will look like
this after it compiles. Then go back to the Excel CPU file and flip the Read ROM switch.
I've already adjusted the memory function to read from the ROM file when this is turned
on, then flip it off before you start your program. Reset the PC to 0, and let her rip.
This is amazing, and I'm able to make all kinds of super cool programs with this, and the best part
is that this is still just a regular spreadsheet, but the worst part is that it’s also super slow.
I'm updating the clock cycle by hand by pressing the F9 key, and it takes such a long time to run
each program. The CPU working its hardest isn't running any faster than 2 or 3 Hz, the footage of
all these programs I’ve been showing you has been very much sped up. But still, I think this could
be a useful tool to show the inner workings of a processor one clock cycle at a time, and maybe
it'll run faster on your machine. Everything, the CPU, the compiler, the ROM, my sample programs,
and all the documentation I wrote is available to download, so hit that subscribe button if you
want to see more projects like this. And thanks once again to Brilliant for sponsoring this video,
check them out down below, and until next time…