When we use a modern computer, we're
standing on the shoulders of giants so tall that the details are obscured by clouds.
It's just far too complicated for any one person to truly understand at any level of
detail. But that wasn't always the case. Today in Dave's Garage we're getting medieval
with an old computer to investigate some of the most basic technologies that we
all overlook in our daily computing: the things whose workings you should know,
but probably don't.
We're going back in time to revisit some of the
details that we all take for granted today.
You might know how a keyboard works, but do you
know how the system scans 100 or more keys with just a few I/O lines? Do you know why a computer
designer might favor static RAM over dynamic RAM? Do you know how the most basic green phosphor
monitor actually puts text characters on the screen? Do you know how to add two floating
point numbers in assembly language?
Well, in about 15 minutes you're going to know all those
things and a lot more, so stick around!
[Intro]
Hey, I'm Dave, welcome to my shop! I'm
Dave Plummer, retired software engineer going back to the MS-DOS and Windows 95
days, and today we're taking a deep dive into an old machine in order to go back to
basics and make sure you know how a computer actually works. I don't mean in general - I
mean the details, like when the machine boots, what's the first instruction it executes and how
does it even know where to find it? How does a character definition in memory become a picture of
a letter on a screen? The basics.
I've been doing this long enough that I can remember the days
when you could, for all intents and purposes, know all there was to know about a particular
system. It's an awesome feeling, really, and it's a luxury that's pretty much disappeared
in modern computing, and I think young developers in particular are worse off because of it. Dave
Cutler once said of Windows that it was so big, so much code, that "No one mind could comprehend it
all". And that was 30 years ago in 1992, when it was literally one tenth the size it is today. And
he was the chief architect!
I'd say that MS-DOS on the IBM PC was about the last time that one mind
could comprehend the whole thing. But it was never true for me personally - though I worked on MS-DOS
itself, and wrote a lot of x86 assembly for it, I stayed pretty much entirely on the I/O side
of things with disk compression, copying, and caching. So I knew a lot about a little
slice of the system. Then I moved to Windows NT, where again I knew a lot about a few pieces,
but had only a general knowledge of components that I never touched, like audio or printing.
The last time I think I knew a whole system inside and out would be the 8-bit days, working
on Commodore hardware like the PET and C64. Now I do not claim to know everything
there is to know about a C64, for example. While there's nothing in the Programmer's
Reference manual that would likely surprise me, I'm sure there are lots of weird edge case
behaviors for the SID and VIC chips that I've never even heard of. Nor do I mean
that I know more than anyone else - I'm sure if Jim Butterfield and I were on Jeopardy
together and the category was the Commodore PET, I'd only win by virtue of the fact that Jim died
15 years ago now.
For me, it's not about knowing every obscure piece of technical trivia, but
rather, it all really boils down to knowing how every aspect of the machine works so that
you can explain each part without hand-waving. For example, when you press the down arrow key on
a PET, a lot happens. It's not enough to know that there's a function called CHRIN that you can
call to retrieve the character code. Ideally, you should be aware that there's a keyboard
scan using multiple chips on the motherboard, ROM routines to handle the screen editor, zero
page locations to keep track of the current cursor position after updating screen memory, and KERNAL
functions you should know that control all of it. And that's for one key on the keyboard. So you
may not know every detail, but when you have a firm grasp of the big picture, the details
that you do know can be seen in context.
It's clear, then, that I can't tell you
everything there is to know in 20 minutes. So, here's what I'm going to do. I'm going to
paint you a mental picture of the PET just like AOL used to download images from the web. A blurry
overview at first, then progressively refining the details as the picture comes into view.
I
absolutely guarantee that when we're done, you're going to understand a computer at a completely new
level. If I'm successful, then by the end of this episode you'll have a much crisper picture in your
head of not only what a simple computer is made up of but also how it works.
But which computer? I
was tempted to do the Commodore 64 due in no small part to its massive popularity. But between sound
and video, there's just enough going on there that I don't think I could do it justice in one
episode. So, let's go one step even further back, to the Commodore PET, which is very much like a
C64 without the fancy sound and video hardware. Otherwise, the machines are remarkably similar
and even largely compatible with one another. Almost everything I say about the PET will also be
true for the C64 as well. Sure, screen memory and other things move around in the memory map and so
on, but most of the PET architecture survives on directly in the C64, so if there's sufficient
interest in this episode, I can simply do the C64 episode as a specialization of the PET.
If
your brain works anything like mine, it helps to have a mental overview of the whole machine. And
I'm going to give you two visuals with which to anchor your knowledge: the system memory map
and the motherboard itself. Let's start with the motherboard. I took this photo of my own PET,
and added labels to what I think are the important parts. At the top we have the CPU and I/O chips,
in the middle are the ROMs, and at the bottom, the system RAM. There's no sound chip and no video
chip! But clearly there's video circuitry of some kind, because it has a crisp 40x25 character
display, so how did they do that? How does a simple computer even display characters on a
screen? It's not magic, so how do they do it?
It's both simple and genius. As programmers,
the part we care about the most is the character generator ROM, the heart of the system. They call
it a character generator, but it's simply a ROM with circuitry to point the source of the electron
gun at the right place in that ROM. There is no special fancy hardware to create characters, no
sprites, no special effects. Each character on the screen is just an 8x8 grid that is effectively
copied directly from the ROM to the screen, with the beam on when the bit is set, and the
beam off when it's clear. But how does it "copy" the character to the screen?
It simply looks
at the right place in a particular ROM for the bytes that define what the character looks like.
In other words, the job of the video hardware is really to point the electron gun's source at the
right bit in the character definition in the ROM. Then whatever pattern is in the ROM will be lit on
the screen phosphor.
As the electron beam scans, the current X/Y position determines which byte of
screen memory is being displayed. Working within the margins of the 40x25 screen, with each
character cell being an 8x8 grid, our actual resolution is thus 320x200.
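To make that mapping concrete, here is a minimal Python sketch of how a beam position could be turned into a screen memory address. It assumes screen memory is laid out row by row, 40 bytes per row, starting at 8000 (the location given later in the memory map); the function name is made up for illustration.

```python
SCREEN_BASE = 0x8000    # screen RAM location, per the memory map
COLUMNS, ROWS = 40, 25  # character cells across and down
CELL = 8                # each character cell is 8x8 pixels

def screen_address(x, y):
    """Return the screen RAM address of the cell containing pixel (x, y)."""
    if not (0 <= x < COLUMNS * CELL and 0 <= y < ROWS * CELL):
        raise ValueError("pixel outside the 320x200 display area")
    column = x // CELL  # which character cell across
    row = y // CELL     # which character cell down
    return SCREEN_BASE + row * COLUMNS + column

print(hex(screen_address(0, 0)))      # top-left cell
print(hex(screen_address(319, 199)))  # bottom-right cell
```

The last cell lands at offset 999, which is why 1000 bytes of screen memory are enough for a 40x25 display.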
The hardware fetches
that byte of screen memory, and let's say it's the letter E, which is screen code 5. That means the
letter E is entry number 5, counting from zero, in the character ROM. Every letter is an 8x8 monochrome grid, meaning
each letter takes up 8 bytes of 8 bits in the ROM. Thus, to find our offset in the character
ROM table, we simply take the character code and multiply by 8. And then here's the genius part
- they simply take the lower three bits of the current scanline and that's how many bytes deep
you are into the current character definition.
Let's look at a concrete example: The letter
E in the top left corner of the screen. We start at scanline 0. When the electron gun
reaches the left edge of the character, we need to fetch the byte from ROM that corresponds to the
first, or top, byte of the character definition. Since the code for E is 5, we multiply by 8 bytes
per character to get to byte offset 40 in the ROM. Since we're on scanline 0, we don't have to
index in any further, and hence we fetch the byte directly. The bits are then clocked out to
the gun from highest to lowest, which is left to right in the character definition.
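The lookup just described can be sketched in a few lines of Python. This is only a model of the idea, not the actual circuit: the bitmap for E below is a made-up stand-in for whatever the real ROM contains, and the function names are hypothetical.

```python
BYTES_PER_CHAR = 8
E_CODE = 5  # screen code for the letter E

# Hypothetical 8x8 bitmap for E; the real ROM contents may differ.
E_BITMAP = [
    0b01111110,
    0b01100000,
    0b01100000,
    0b01111000,
    0b01100000,
    0b01100000,
    0b01111110,
    0b00000000,
]

# Model the character ROM as 256 characters of 8 bytes each (2K).
char_rom = bytearray(256 * BYTES_PER_CHAR)
char_rom[E_CODE * BYTES_PER_CHAR:(E_CODE + 1) * BYTES_PER_CHAR] = bytes(E_BITMAP)

def rom_offset(screen_code, scanline):
    """Character code times 8, plus the lower three bits of the scanline."""
    return screen_code * BYTES_PER_CHAR + (scanline & 0b111)

def pixel_row(screen_code, scanline):
    """Clock the byte out from highest bit to lowest, left to right."""
    byte = char_rom[rom_offset(screen_code, scanline)]
    return "".join("#" if byte & (0x80 >> bit) else "." for bit in range(8))

for scanline in range(8):
    print(pixel_row(E_CODE, scanline))
```

Because only the lower three bits of the scanline are used, scanline 11 in the second text row indexes the same way as scanline 3 in the first.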
It's ironic
that despite all its simple elegance, there's one glaring design shortcoming in the PET that I've
never understood. The character ROM is fixed. Had they allowed you to specify a location in RAM as
the base address of where the video circuit went to look in memory for those character definitions,
you could define your own characters. Each of the 256 characters is its own little 8x8 bitmap and
can be reused on the screen as often as you like. This is how the C64 works, and along with sprites
and smooth scrolling, it's one of the graphics technologies that allowed the 64 to become a
gaming phenomenon in its day. But it's curiously absent on the PET.
Imagine you want to define
a custom character, like the copyright symbol. You would simply copy the existing character
data from the ROM to somewhere in RAM. Then you would modify the definition for an unused
character, like the back arrow perhaps, to become the copyright symbol. Then you modify the base
address of the character table to point to your customized copy. You can customize as many or as
few as you like, and it would also be trivial to load a custom character set from disk. Changing
fonts would be one LOAD command and two pokes.
I'm not a hardware guy so I don't know how
much it saved them in circuitry, speed, or complexity to have the character ROM effectively
dedicated to the video lookup circuit on the PET. The ROM is only wired up to the video circuit
and isn't even exposed to the CPU or present in the memory map. In other words, you can't
get there from here. If you have a theory as to why they couldn't let the video circuitry
pull character definitions from RAM instead, please let me know in the comments, as I'm curious!
My best guess at this point is that the character definitions take up 2K, so perhaps on a 4K machine
it wasn't practical enough to be a big priority. But you could write some killer games on a 16K
PET had it allowed for custom character bitmaps.
One curiosity of the PET is that it uses static
RAM, not dynamic RAM, or DRAM as we usually call it. Because they were dealing with an incredibly
small amount of memory at only 4K in the original config, and because 4K of RAM of any kind was
expensive to begin with at the time, they opted to go with static RAM. What's the difference?
Static RAM holds its contents for as long as power is supplied to the chip. Store a number in SRAM
and come back an hour later and it's still there. Dynamic RAM, however, requires additional
circuitry to continually refresh the contents of the RAM to keep them intact. Without that refresh,
it loses its contents. So static RAM seems preferable, and it is in a number of ways, but it
does cost more.
On the C64 they would add custom logic to the VIC chip to do the DRAM refresh, but
there is no comparable chip in the PET, so static RAM avoids the need to add a refresh circuit.
It's time to jump into the memory map so that we can place everything where it belongs. Being able
to see the whole thing from a bird's-eye view is invaluable in understanding the entire system.
All
RAM appears in the memory map starting at address 0. On a 4K PET your RAM thus appears from 0000
to 0FFF. The PET is built around the 6502 CPU, and that chip dictates that the first page
of memory, known as zero page, is special. It's essentially a set of 128 16-bit pointers.
The second page of memory is also special, as it's the location of the stack. It can't grow
outside that page, so it's limited to 256 bytes. RAM can be added up to a limit of 32K max,
meaning the top of memory would then be 7FFF.
Regardless of how much System RAM you have,
the PET has an additional 1000 bytes of static screen RAM mapped into memory at location 8000.
That memory is in addition to the base System RAM, so your 4K PET really has 5K! Bonus!
Above the
Screen RAM are the ROM sockets. Each one is 4K, except for the ROM at $E000, which is
only 2K, but more about that one later.
At 9000 and A000 are two 4K ROM sockets
that are delivered empty on the PET. They're considered expansion ROMs. The biggest
problem is that most of the available expansion ROMs all competed for the same addresses because
6502 code is not relocatable - the addresses for loads and jumps, for example, are hardcoded
into it.
B000 is another 4K empty ROM socket. I mention it separately only because later PETs
that had BASIC 4.0, which was bigger, contained a ROM in this location. My own machine has indeed
been upgraded to BASIC 4 which is why we see a ROM installed in that location in the picture.
At C000
and D000 we find two 4K ROMs for Microsoft BASIC. I think it's fairly impressive that
BASIC fits in 8K, to be honest!
As mentioned, the ROM at E000 is only 2K. That
leaves the top half of what would otherwise be a 4K ROM as a 2K block of unused addresses where
the system I/O chips are mapped into memory. In fact, only a single page, from
E800 to E8FF, is actually used. In this area we find the registers for two
Peripheral Interface Adapters, or PIA chips. One is dedicated to the keyboard and cassette
ports while the other is intended for talking to the IEEE bus that early Commodore disk drives
would use.
Next comes the VIA chip, or Versatile Interface Adapter. It primarily controls the user
port, timers, and the second cassette interface. There are a number of PORT registers exposed, and
programming it reminds me a lot of programming something like an Arduino Nano, in that you set
the data direction appropriately and then read and write bits manually. This chip has 16 GPIO lines,
and half of them are available for your use. Those 8 are exposed at the user port. The most
common use of those lines is to implement RS-232C-style serial, running at 5V logic levels
instead of the 12V of true RS-232. You could use the port to communicate with a microcontroller, PC, or modem.
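If it helps to see the whole map in one place, here is a small Python table of the regions just covered, with a lookup function. The names are made up for illustration, and one entry is an assumption: the walkthrough only says the KERNAL lives at the top of memory, so treating F000 to FFFF as a single KERNAL ROM is a guess on my part.

```python
# (start, end, description); ranges are inclusive.
MEMORY_MAP = [
    (0x0000, 0x00FF, "zero page"),
    (0x0100, 0x01FF, "stack page"),
    (0x0200, 0x7FFF, "system RAM (up to 32K; a 4K machine ends at 0FFF)"),
    (0x8000, 0x8FFF, "screen RAM area (1000 bytes used)"),
    (0x9000, 0x9FFF, "expansion ROM socket"),
    (0xA000, 0xAFFF, "expansion ROM socket"),
    (0xB000, 0xBFFF, "empty socket, or BASIC 4.0 ROM on later PETs"),
    (0xC000, 0xCFFF, "BASIC ROM"),
    (0xD000, 0xDFFF, "BASIC ROM"),
    (0xE000, 0xE7FF, "2K ROM"),
    (0xE800, 0xE8FF, "I/O registers (two PIAs and the VIA)"),
    (0xE900, 0xEFFF, "unused addresses"),
    (0xF000, 0xFFFF, "KERNAL ROM (assumed range)"),
]

def region(address):
    """Return the description of the region containing a 16-bit address."""
    for start, end, description in MEMORY_MAP:
        if start <= address <= end:
            return description
    return "unmapped"

for probe in (0x0005, 0x8000, 0xC000, 0xE810, 0xFFFC):
    print(f"{probe:04X}: {region(probe)}")
```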
Given a mental picture of where things live in the memory map, let's turn our attention to the system
mainboard, where you'll discover that there's a method to my madness, as it were: the physical
layout of the mainboard follows the memory map layout in the same order we just covered it.
For
example, at the bottom of the board is System RAM. Next above that is Screen RAM. Off to the side of
that, because it's not even visible to the system bus, is the character generator ROM. Above that
are found the system ROM chips. The CPU and I/O are up near the top of the board in the same way
that the I/O chips are near the top of memory.
Now that we know all the major components in the
system and where they live in the memory map and on the board itself, it's time to ask how they
work! When you turn the machine on, how does it start up? For that, we turn to the 6502 datasheet,
which tells us how the boot sequence happens. First, the CPU loads the address out of memory at
FFFC-FFFD. That's true for every 6502 ever made; it's how every one starts up. Which means that
there must be ROM up at the top of memory for it to read, and as we saw in the memory map, the
KERNAL lives up at the top of memory. If we look at the source code for the PET KERNAL, we can see
the three important vectors at the top of memory: two interrupt vectors in addition to our reset
vector. The reset vector is set to point to the label start, so let's take a look at that code.
The start code is fairly straightforward - it initializes the stack pointer, disables
interrupts, turns off decimal math mode, and proceeds to initialize I/O and get the
system running before it hands control off to the screen editor and BASIC interpreter.
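The reset fetch itself is easy to model. The 6502 stores addresses low byte first, so the byte at FFFC is the low half and the byte at FFFD is the high half. In this Python sketch the 64K array stands in for the address space, and the start address is a made-up value, not the real PET entry point.

```python
RESET_VECTOR = 0xFFFC

# A 64K address space; a real machine would have ROM at the top.
memory = bytearray(0x10000)

START = 0xFD16  # hypothetical address of the start label
memory[RESET_VECTOR] = START & 0xFF    # low byte at FFFC
memory[RESET_VECTOR + 1] = START >> 8  # high byte at FFFD

def reset_target(mem):
    """Return the address the CPU jumps to on reset (little-endian fetch)."""
    return mem[RESET_VECTOR] | (mem[RESET_VECTOR + 1] << 8)

print(f"CPU begins executing at {reset_target(memory):04X}")
```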
But what is
the KERNAL? Just as a BIOS is a set of routines that allow an operating system to communicate with
and control the underlying hardware of a PC, the KERNAL routines provide all of the system-specific
functionality that BASIC needs to get its job done. If you've ever called "JSR $FFD2" on
the C64 to output a character to the screen, that's an example of a KERNAL call. On
the PET the goal was strictly to provide a set of routines that would provide Microsoft
with a way to access the Commodore hardware, screen editor, and so on. It wouldn't truly
become like a public system API until the VIC-20.
Clearly, the bulk of the functionality that the
PET provides is thanks to the code included in the KERNAL and BASIC ROMs. I should be clear that
I'm lumping the screen editor in with the KERNAL. In fact, the delineation is more practically
between Commodore code and Microsoft code.
One important feature that I assume found
itself on the Commodore list of things to do was the keyboard scanning routine. The ROM code
uses one of the PIA chips to talk to the keyboard as a matrix. Several times per second the ROM
cycles through rows 1 to 10 of the keyboard. The current row number is sent in binary to a decoder chip
with 10 output lines, and those lines are used, one at a time, to power rows on the keyboard.
If the signal from row 8 comes back on column 3, then the system knows exactly what key is pressed.
Let's say the key in question is the E key. The code for E, which is 5, is stored at the position
in screen memory at the current cursor position. Then the cursor is updated by moving it right one
position, which will update the position values down in the zero page.
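Here is a rough Python simulation of that matrix scan. It is a sketch under a few assumptions: the number of columns (8) is my guess, the position of the E key at row 8, column 3 simply reuses the example above, and read_columns stands in for what the hardware reports back through the PIA.

```python
ROWS = range(1, 11)  # rows 1 through 10, as described above
COLUMNS = range(8)   # assumed column count

KEYMAP = {(8, 3): "E"}  # hypothetical position of the E key

def read_columns(row, pressed):
    """Simulate the hardware: bitmask of columns active while `row` is powered."""
    mask = 0
    for key_row, key_column in pressed:
        if key_row == row:
            mask |= 1 << key_column
    return mask

def scan(pressed):
    """Power one row at a time and collect every (row, column) that responds."""
    hits = []
    for row in ROWS:
        mask = read_columns(row, pressed)
        for column in COLUMNS:
            if mask & (1 << column):
                hits.append((row, column))
    return hits

for position in scan({(8, 3)}):
    print(position, KEYMAP.get(position, "?"))
```

The payoff of the matrix is in the line count: a 4-bit row number into the decoder plus the column lines covers 80 key positions in this model, instead of one line per key.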
The screen editor was
provided by Commodore and was quite innovative for its day. Rather than accumulating a single line
of input and then executing it when you pressed Enter, the PET went about it in a very unique
way.
You could type anywhere on the screen. When the READY prompt emerged, the
cursor would be placed directly below it, but you were free to use the cursor keys to
move the cursor anywhere and type anything. Only when you pressed enter did the system snap
to attention and attempt to execute something. Whatever line the cursor happened to be
on would be passed in its entirety to the BASIC interpreter. So you could enter a line like 10
GOTO 20 and if you realized you had made an error, instead of retyping or editing the line you could
simply cursor back up and change the 20 to a 30. As long as you pressed Enter on that line, the new
version would be passed to the BASIC interpreter. It would see the line number and replace line 10
with the "new" version you had just submitted.
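The replace-by-line-number behavior can be modeled in a few lines of Python. This is a toy version of the idea only: the program dictionary and the submit function are invented for illustration and say nothing about how the ROM actually stores a program.

```python
program = {}  # line number -> statement text

def submit(screen_line):
    """Handle one screen line as if Enter had been pressed on it."""
    text = screen_line.strip()
    digits = ""
    while text and text[0].isdigit():
        digits += text[0]
        text = text[1:]
    if not digits:
        return "immediate: " + text   # no line number, run it right away
    program[int(digits)] = text.strip()  # same number replaces the old line
    return "stored"

submit("10 GOTO 20")
submit("10 GOTO 30")  # cursor back up, change the 20, press Enter again
print(program)
```

The interpreter never needs to know the line was edited on screen; it just sees a fresh line 10 arrive.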
This explains why, when you press ENTER on a
screen line that contains the READY prompt, it dutifully reports
?OUT OF DATA ERROR
The
reason? You've pressed ENTER on the text READY., which thus gets passed to the BASIC interpreter.
It sees it as the command READ Y., but when it tries to execute it, no matching data statements
have yet been executed, so there's no data to be read. Want to verify that? Try it after providing
a DATA 10 line and removing the period at the end. It will READ Y from your data statement and
you can print Y to confirm it!
Let's have a quick look at the BASIC ROM to see what's included
within it. Of course, it has the tokenizing BASIC interpreter, but it has many support routines that
you can make use of, including a very optimized floating point math library.
Let's have a quick
look at how to call the math libraries on the PET. Imagine we want to add two 16-bit quantities and
print the result. First we need to load an integer quantity into the floating point accumulator.
There are two such accumulators, which are really just memory addresses used as storage by the math
routines. To load our value into the first one, we put the least significant byte into the Y register
and the upper byte into the A register. When we call INTFP, the first Floating Point Accumulator
is loaded with the 16-bit value from those two 8-bit registers. Next we would use the FAC12 routine to move the value
from FAC1 to FAC2, then load the second value just as we did the first, using INTFP. Next we
simply call ADD, and accumulator 1 is now the sum of accumulator 1 and accumulator 2. We can then
call FPOUT to print the result to the console.
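To show the order of operations, here is a Python model of that call sequence. It mirrors the routine names used above (INTFP, FAC12, ADD, FPOUT) as methods, but everything else is an assumption: the accumulators are plain Python floats, the 16-bit value is treated as unsigned, and FPOUT returns a string instead of printing to the console.

```python
class MathModel:
    """Toy model of the two floating point accumulators."""

    def __init__(self):
        self.fac1 = 0.0
        self.fac2 = 0.0

    def intfp(self, a, y):
        """Load FAC1 from A (upper byte) and Y (least significant byte)."""
        self.fac1 = float((a << 8) | y)

    def fac12(self):
        """Copy FAC1 into FAC2."""
        self.fac2 = self.fac1

    def add(self):
        """FAC1 = FAC1 + FAC2."""
        self.fac1 = self.fac1 + self.fac2

    def fpout(self):
        """Return FAC1 as text (the real routine prints it)."""
        return format(self.fac1, "g")

def split16(value):
    """Split a 16-bit value into (upper byte for A, lower byte for Y)."""
    return (value >> 8) & 0xFF, value & 0xFF

def add_and_print(first, second):
    math = MathModel()
    math.intfp(*split16(first))   # load the first value into FAC1
    math.fac12()                  # move it to FAC2
    math.intfp(*split16(second))  # load the second value into FAC1
    math.add()                    # FAC1 now holds the sum
    return math.fpout()

print(add_and_print(1000, 234))
```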
There's more to it, like Commodore's I/O system
of devices and channels, but quite honestly, it's largely the same as what you do through Commodore
BASIC. The only catch is that some commands take more arguments than you have registers, and as
a result a BASIC command like OPEN gets broken down into a couple of discrete steps: you load
the device number and the file number in one step and call LISTN, then call SECND to set the
secondary address.
We've covered a great deal, from the video system and I/O chips to the two
types of RAM as well as the two major sections of ROM: KERNAL and BASIC. And yet even so, we never
even touched Cassette I/O or the IEEE interface. But those things are quite specific to the PET
itself. We covered everything you need to make a working computer with a CRT and a keyboard
interface, and that's the main goal I set out to accomplish today!
If I've made any mistakes, which
is almost inevitable in an episode like this where it'll be seen by people that know more than I do
about the subject, please do let me know and I'll update the description with any errata. If you
don't see any, that means I achieved perfection, or forgot to add them.
If you enjoyed
this little intense tour of the PET, I'd be honored if you'd consider subscribing
to my channel so that you get more like it. I'm really only in this for the subs and likes,
so please leave me one of each before you go!
Share it, like, comment, and if there's
interest in this kind of topic, I'll keep right on going with more detail on the PET and
perhaps a similar dive into the Commodore 64.
In the meantime and in between time, I hope to
see you next time, right here in Dave's Garage!