I built my own 16-Bit CPU in Excel

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

A 16-bit CPU, 128 Kilobytes of RAM, a 4K monitor, all in one Excel file? No Visual Basic Scripts, no plugins, just a regular spreadsheet. Is it even possible? It's the best kind of possible, theoretically possible. This video brought to you in part by Brilliant. Spreadsheets are just fancy calculators. Data in data out. And if we go by the literal definition of "something that computes" then Excel is already a very competent computer. But a computer like your PC or other device is a bit more complex. That has a Central Processing Unit, Gigabytes of Random Access Memory, and some kind of pixel based display. But essentially, it's still just a calculator, with the CPU being made of different components that do a more advanced version of data in data out. And at a low level, it isn’t hard to emulate that in Excel. For example, this is a 3-bit to 8-bit decoder chip, it takes in a 3-bit binary signal, representing 0 to 7, and gives the one corresponding output signal. The magic, or should I say the logic, of everything happens here, in the formula bar. If you think Excel is boring it's because you've never known about the power of Excel formulas. The formula for each output pin cell uses a combination of binary logic and integer conversion to correctly compute the output. And I can also reference other cells in this formula as well, say on a 8-bit to 3-bit encoder, where the input here is referencing the output of the first chip, that's like virtually wiring these pins together. Then with three more functions for the output pins here, I have two chips that have the opposite function, operating in plain old Excel. But before we take a deep dive into spreadsheets, a word from this video's returning sponsor Brilliant. If you, like me, love learning, but don't have the time or sometimes motivation to dedicate as you'd like, then Brilliant may be just the thing for you. Brilliant has thousands of interactive lessons for beginners and experts alike. You can learn about Computer Science, Math, and Data Analysis, with only 15 minutes of interactive problem solving a day, through bite-sized lessons that make learning fun again. That’s the thing about Brilliant, I enjoy going through the lessons, and I’m learning at the same time. Try Brilliant for free for thirty days by clicking on the link in the description below or by visiting www.brilliant.org/Inkbox. And if you're one of the first 200 people to sign up you'll even get 20% off an annual subscription. That's www.brilliant.org/Inkbox Now the medium of Excel is a bit different from the typical resistors and transistors of a physical computer, and so there are a few quirks I have to deal with. Some things have simple fixes, like in building a clock signal. This runs off a simple formula that sets itself to 0 if it's 1 and 1 if it's 0. Now, Excel is smart enough to not just run this calculation in a loop forever, but it's also smart enough to let you do that if you want to anyways. By going to options I can turn on iterative calculation, and set it to update only one cycle at a time. Then by pressing F9 or updating any cell, the sheet recalculates and the clock signal updates. There are going to be lots of alternatives to what I’m doing here by using scripts and stuff, but if I stoop to that level then I’m not actually using the full potential of Excel, I’d just be writing a Visual Basic Program. Anyways, to test this clock signal out further I built a JK flip flop, one of the fundamental components in low level digital logic. But it didn't work. I know it should work though because Wikipedia said it would. But eventually I figured out what the problem was. Excel updates from left to right, top to bottom, so the physical alignment of the values in the chip are critical to it working properly. So, for the most part formulas should only reference cells above, not below. Hopefully it doesn't get any more complicated than that. So, this is the new design for the JK flip flop after I flipped it 90 degrees, and it now works as intended. I’ve wired up four of them together with a couple of binary AND gates, and now I've got a working 4-bit counter that counts based off the universal clock signal. Now I could design the whole CPU at this level, but there's one thing hodling me back from that, my sanity. It'll be much easier to design the CPU on a higher level based off of a custom Instruction Set Architecture. An ISA describes how a CPU should work, it's the rule book for the data in data out process. It defines a list of CPU instructions, and other features of the processor including a set of registers, units of memory used to store and manipulate data within the CPU, mainly located within what's called the register file. The main registers in a 64-bit CPU are, you'll never guess, 64-bits long. My registers will be 16-bits long, making this a 16-bit CPU. While a typical CPU based on the x86-64 architecture has 981 instruction mnemonics like MOV, ADD, and OR, there are actually 3,684 total variants. Meaning that ADD for example has 6 unique opcodes used for working with different parts of registers or memory. Writing high level Assembly code, the programmer doesn't have to worry about these differences, but both the assembling process and the CPU clearly distinguish the two. But since the most common instructions are used an order of magnitude more often than instructions like PAVGB, I decided to not include several thousand of them in my ISA, instead I boiled everything down to a list of 25 opcodes, and 23 instruction mnemonics. Including loading, storing, transferring from one register to another, arithmetic operations, bitwise operations, rolling, comparing, jumping, setting flags, and NOP. I'm even including things like multiplication and division, which aren't strictly necessary, but it'll make programming things much easier later. Speaking of programming, a program is a list of instructions stored in memory. Each instruction is carried out by fetching it from memory, decoding it and producing necessary signals, executing the operation, and storing the output either back in the register file or in the computer's memory. I can keep track of the location of the next instruction using a special register called the Program Counter, and that's more or less the basic CPU design. Following that cycle, the first thing to build is the Fetch Unit, which reads the memory at the address pointed to by the PC register. The Instructions in my ISA are not a fixed length. Some are 16 bits long, and others are 32 bit long, but the Fetch Unit here always retrieves a full 32-bit value, both the PC and the PC+1 value. But since both the PC Unit and the Memory Unit haven't been built yet, I'll have to skip most of that logic here for now. After getting the instruction, it's passed on to the Control Unit. Here the instruction is broken into the specified opcode, the first register operand, the 2nd register operand, and the 16-bit immediate attached value. So even though a full 32-bit value was retrieved from memory, the second 16 bits won’t affect anything for most instructions. Each of this unit’s output pins has a unique formula that, based on the opcode, sets these signals according to this chart here. These signals get carried off to different parts of the CPU like the Arithmetic Logic Unit, the beating heart of the CPU. The ALU preforms some kind of operation with the two operands. The ALU operation and operand 1 come directly from the Control Unit, but operand 2 actually comes from an above multiplexer, since it can be one of 6 different values, either 0, the value of the second register, the memory value of the address in the second register, the 4-bit immediate value, the 16-bit immediate value, or the memory value of that 16-bit immediate address. Again, the Memory Unit isn't built yet, so I’ll fill things in by hand temporarily. The ALU operation is itself a 4-bit value, so there are 16 different functions it can run, including some non-arithmetic actions, that result in a 32-bit output, 16 high bits and 16 low bits. These are the most important formulas in the whole spreadsheet as it determines what the output of the processor will be. Most of the operations are straight forward and don’t affect the high 16-bits at all, but the MULT instruction does result in a full 32-bit result, so the low 16 bits will be stored in the first specified register and the high 16 bits in the second. Division will cause the result to be stored in the first register, with the modulus result in the second register. Rolling the bits left or right was a bit tricky to figure out. The second operand here is the 4-bit immediate value, meaning that the bits can roll left or right up to 15 times. Thankfully Excel has a BITLSHIFT and BITRSHIFT function, and with a little more algebra I figured it out. Before the next unit, I need to pass the two results here through another multiplexer, one that selects between the high result and the low result, this also gives me an excuse to break down each result into binary, just for flare, I think it's cool to look at. This output is wired to the input for Register 1 in the Register File, where the 16 General Purpose Registers are kept. Register 2 input is always the high 16-bit output from the ALU and the REG1, REG2, and the two Write signals come straight from the Control Unit, if either write signal is set to true, then the specified register is changed based on the input. Inside of the Register File I've kept the four system flags, the carry flag, zero flag, sign flag, and the overflow flag. The carry flag is set when the ALU low 16-bit result is greater than 2^16. Here's the trick through, remember that all important ALU formula, what if I told you, it wasn't the final output formula? Sneakily I've put that long formula one cell above and hid the text by setting the color to be the same value as the background, then I use modulus division to get a result that fits within 16 bits for the final result, if the result of this above cell is greater than 2^16, it'll set the carry flag to true. The other flags are simpler, ZF is set if the result of that low 16-bit value is equal to 0, SF is equivalent to the top bit in the low-16 result, and OF is set through the conditions of overflow. The last unit in the CPU is the Program Counter. When the clock signal is high, the PC checks if it needs to reset to 0, take a two-byte or four-byte step (which is equal to one or two memory units), or if the PC Set Immediate Flag is set, then, based on whether the jump conditions are met, it will be set the PC to the 16-bit Immediate value as specified in the instruction. These are typically used after the CMP instruction, and help with creating loops and other branches in programs, this will make more sense when I get to writing some code. And that's the whole CPU, not too bad and it comes in a reasonable package. But that's about to change when I start building the RAM unit. With a 16-bit address bus, there are 65,536 addressable 16-bit memory units, that’s 128KB of RAM in total. I've fit them into a 256x256 table, and installed a Memory Management Unit on top. The key signals here are Memory Write from the Control Unit, the Address from the first multiplexer, and the value that comes from the ALU. This address is converted into an X and Y coordinate, based off the Excel Space, if the Memory Write signal is set to high, then the cell at this coordinate is updated to hold this new value. Of course, one cell can't dictate that another cell be written to. Each cell has to have its own defined formula, but again I'm not crazy, I didn't write 65,000 different equations, I wrote one equation that would work for every single cell, highlighted everything then after writing the function in the formula bar hit Ctrl-Enter and it was automatically applied to every cell. To read from the RAM table based off of a single 16-bit address, I've used two Excel functions, Address, to specify the row and column of the desired cell in digits, and Indirect, which grabs the value of the specified address. Going back to the Fetch Unit, I can finally finish it to grab two units of memory from the address in the PC register. And now I've also added two buttons on top next to the clock. One of them is Reset, which resets the PC and RAM all back to 0, and the other is Manual. In manual Mode, the instruction is specified by the user in the override slot in the Fetch Unit. I found this came in handy when designing and testing things for the CPU, so I've included it as a feature now. The first multiplexer also reads from two different spots in memory, from the address in the second specified register, and the address in the 16-bit immediate value. Though, I did make one mistake here, I realized that I didn't have an instruction that used this 3rd option, so I had to add a 26th instruction. It’s another LOAD instruction, but this could be useful for indirect addressing in programs. Now the only thing missing is the 4K screen. And by that I of course mean it uses 4KB of memory. Looking at the memory map here, it will use the last 4KB of the 128KB of RAM, with the rest being free for you to do whatever you want with it, it’s your RAM. The screen is 128x128 cells with each cell representing one pixel, but I've resized these cells to be square rather than rectangle. The screen will have a 16-color display, that's 4-bits per pixel, or one word in memory controlling four pixels. That added another layer of complexity to the Excel formula, as I'm reading data from a 256x16 table onto a 128x128 table, and again I'm only writing the one formula to be applied to the whole table here, because I find it much more fun to spend three hours writing and testing one complex function that does everything, than spend one hour mindlessly applying simpler functions to each cell. After I have that worked out, to add color all I have to do is something I've been doing this whole time already, apply conditional formatting. I've added 16 different rules for the 16 different colors, and with the text color being the same as the background, the cells look and act like regular pixels. So, this is it now, it’s not a system on a chip, it’s a system on a spreadsheet. And it all comes down to this. Here's a real simple program that I wrote that should change the first four pixels of the screen to red, blue, green, and yellow. First, set register 0 to $F000, alright. Then set register 1 to $48C7, okay. Now this should be it right here, store register 1 to the address in register 0. Three, two, one… This is so cool. This is a working CPU in a regular Excel spreadsheet. But I’m not done yet because right now I’m only running this in manual mode. But I want to run much larger programs from memory now. And for that I'll need a compiler. Remember when I said my sanity stood in the way? I’ve designed a new Assembly Language based on my ISA and I'm calling it EXCEL-ASM16, it features not only 23 different instructions, but also the ability to define variables in the data segment, for numbers to be defined using decimal, hexadecimal, or for certain instructions, with @ to signify the memory location, it has support for labels, comments, and an additional ORG instruction which sets the address for the next instruction, and the ability to include binary files directly into the program. Now this is all pretty basic for an Assembly language, and certainly there are more bells and whistles I could have added, but it's good enough for a made up Excel CPU language. I wrote the entire compiler in EXCEL-ASM16 and it runs inside this Excel CPU. No, it doesn’t actually, I’m not insane. I wrote it in Python so that it'd be cross platform compatible. But, how do I get compiled programs into memory? Because if I just try to write directly to the cell, that overwrites the cell's formula. So instead let me tweak a few things. I'm going to come up top and add a few more buttons, a separate button for reset memory, and a read ROM button. I can't actually keep the ROM in this file though, because Excel doesn't like formulas with iterative calculations to be saved by an outside program. So, my many attempts to keep the ROM below the RAM only resulted in many, many broken spreadsheets. Kids, remember never to run experimental tests in your main file. When you run the compiler, first specify your program file, then you can specify that you want the compiled result saved in an Excel spreadsheet called ROM, which will look like this after it compiles. Then go back to the Excel CPU file and flip the Read ROM switch. I've already adjusted the memory function to read from the ROM file when this is turned on, then flip it off before you start your program. Reset the PC to 0, and let her rip. This is amazing, and I'm able to make all kinds of super cool programs with this, and the best part is that this is still just a regular spreadsheet, but the worst part is that it’s also super slow. I'm updating the clock cycle by hand by pressing the F9 key, and it takes such a long time to run each program. The CPU working its hardest isn't running any faster than 2 or 3 Hz, the footage of all these programs I’ve been showing you has been very much sped up. But still, I think this could be a useful tool to show the inner workings of a processor one clock cycle at a time, and maybe it'll run faster on your machine. Everything, the CPU, the compiler, the ROM, my sample programs, and all the documentation I wrote is available to download, so hit that subscribe button if you want to see more projects like this. And thanks once again to Brilliant for sponsoring this video, check them out down below, and until next time…

Info

Channel: Inkbox

Views: 1,093,980

Rating: undefined out of 5

Keywords:

Id: 5rg7xvTJ8SU

Channel Id: undefined

Length: 16min 27sec (987 seconds)

Published: Sat Jan 27 2024