A long time ago I was playing Tempest on an
arcade game emulator. I wasn't very good at it, so it was exciting
when I finally made it far into the set of red levels. However, after passing a few more levels and
sacrificing a bunch of imaginary quarters, something interesting happened. The spaceship flew off of the level and the
vector lines making up the game started flailing all over the place. And shortly after, the game completely reset
and lost all of my progress. I could never get that far into the game another
time, but it never crashed like that ever again. So I just figured it was a once-in-a-lifetime
bug that could never be reproduced. Old games are like that, right? Well, it turns out I was triggering some code
in the game to fight against piracy. The emulator wasn't accurate enough and so
the software was sabotaging my play session by purposely glitching it out and causing
it to reset. In this video, we'll look into not one, not
two, not three, but four different piracy checks that Atari put into the Tempest ROM. And we'll also get to see what happens when
each of them gets set off. Let's get into it! There are two main ways that Tempest checks
for potential piracy. It checks for software modifications, and
it checks for genuine hardware presence. Let's start with the software side of things. The basis of all of the software checks is
the checksum. A checksum is one way to verify data integrity--that
is, tell if it may have been modified. A very simple type of checksum is to add up
all of the data values you have. Suppose you had a padlock combination you
needed to keep track of. You can add up the numbers and get some result. Now you can tell if one of the numbers is
wrong. If you add them up and you don't get this
checksum value, you know the data has been tampered with. It doesn't help you figure out what the real
data should have been, but at least you know it's wrong. Usually the checksum is limited to a certain
range of values, especially when there can be a lot of data. In this case we can say that our checksum
is limited from 0 to 99 (we just wrap around at 100). In practice, a checksum is often limited to
a certain number of bits. Another way of implementing a checksum is
to use a complement. The complement is one extra value appended
to the data to ensure that the checksum is always the same value, usually zero. In our case, since our checksum was 14, we
would need to add 86 to it to reach 100 and wrap back to zero. This is our complement; it's not an extra
number for the padlock combination, but at least we don't need to keep track of the checksum
any more--it's built into the data. Now, checksums do have some drawbacks. If I add 1 to one of the numbers, but then
subtract 1 from a different number, the checksum is still the same. Just because the checksum checks out doesn't
mean the data has not been messed with. In fact, if someone knows in advance that
the data is going to be checksummed, they could change the data into whatever they want
as long as the checksum is identical to the original. If the data uses a checksum complement, then
this becomes even easier, since they could adjust the complement to be whatever it needs
to be. Now, a lot of arcade games use checksums to
verify all of their code. For example, Pac-Man runs a checksum on each
of its four ROM chips. If any one of them fails the check, the game
will not boot and will display an error instead. However, Tempest doesn't do this, and instead
only checksums a few small excerpts of its code. And they all have to do with this string of
letters right here. All that matters to Atari is that they get
their copyright message displayed correctly, apparently. Within all of the data in the game's ROM,
there is a large table of strings that includes each of the phrases that appear throughout
the game. At offset $D575 you can find the string for
the copyright message. The first byte denotes the X position the
string is drawn at on the screen, and each byte after that is an individual letter. So here, the $50 is the copyright symbol,
$00 is a space, $2E is the letter M, $1A is the letter C, and so on. The last character in the string is denoted
by having its highest bit set. The $A6 here is the letter I, which is just
$26 but with the highest bit set. Here is the code that calculates the checksum
for this copyright string. The checksum complement is the $85 here, which
is stored in the code rather than in the string itself. That gets added together with each of the
values for the letters in the string, as well as that X position byte. In fact, the X position of the string following
the copyright string is unintentionally included as well. The copyright string data is 16 bytes (1 byte
for position and 15 letters), but the maximum index value used in the code is 16. Since tables use a zero-based index, this
actually grabs the 17th byte from the start of the string--one more than its length. That's not really an issue though, it just
means the X position of the word "credits" can also not be modified without tripping
the piracy detection. One last thing to point out here is that the
add instruction here is an ADC instruction (since 65xx processors don't have a standard
ADD instruction). This means the carry flag is also added to
the running total each time this instruction is run. The carry flag is cleared at the start of
this function, but it will get set whenever the running total wraps around from 255 to
0. If you start with the complement of $85, and
add each of these 17 bytes together, you end up with $4FC. Well, if you take the lower byte and add
all the carried bits, you get $FC plus 4, which is $100. Now looking at only the lower eight bits you
get zero, which tells us that the checksum is valid and that the string has not been
modified. The game then writes this result to a byte
in memory at $B5 to refer back to later. This should always be zero, but if it's not,
then the game knows that it has been modified and can take any action necessary. What does it end up doing? Well let's look at another bit of code. This gets run regularly during normal gameplay. The first instruction checks if the value
held in $B5 is zero. If it is, everything gets skipped. Otherwise, it performs another check. If player 1 is currently on level 10 or earlier,
then the next few instructions also get skipped. Otherwise, for level 11 or later, a value
of $7A is written to memory address $53. This memory address contains a helper timer
that sort of functions as the game's heartbeat--it throttles the game so that it runs at a consistent
framerate. It's not really important to know how exactly
that works, but it is important to know that it only ever counts up to 9, which means this
byte should only ever be in the range of zero to eight. This code sets it to $7A, or 122 in decimal. This is way outside the normal range, and
it causes the game to force reset instantly. If you try to start on level 11 or later,
it resets immediately. And if you complete level 10, the moment you
transition from it to the next level, it resets. So why would the programmers make the game
reset on level 11, instead of just failing to even start the game in the first place,
like with Pac-Man? Well, it's sort of a cruel reason. Suppose you wanted to hack Tempest, and make
your own version of it. (This exists by the way, there was a somewhat
popular romhack called Tempest Tubes that just changed the shapes of the levels--the
copyright string stays intact so as not to trip the piracy detection.) Once you are done modifying the game, you
would have to program a new ROM chip and insert it into the motherboard. If it doesn't boot up right away, then clearly
something is wrong, and you can try to troubleshoot the problem. But if it starts up just fine, then maybe
you would presume that it works! You may even play from the beginning of the
game, and even convince yourself that the game plays fine for the first 10 levels, so
you pack everything up and show it off to your friends. But then, someone gets to level 11 and suddenly
the game crashes. What was the cause of the crash? Was it faulty hardware? A software bug? It was working just fine for you while you
were testing it, so you may not even suspect that the game is performing a piracy
check in the first place. And if you did figure that was the case, having
to play up to level 11 each time just to make sure it doesn't crash would be sort of annoying. Then you have to be confident that that was
the only piracy check and that there weren't others just waiting to be tripped. Which, of course, there were! The game converts these text strings into
lists of vector generator instructions so that the vector generator can actually draw
them to the screen. I have a video about the vector generator
and its instruction set if you want to learn more about that. But here is an example of what the code snippet
looks like. First the beam is centered on the screen,
and the size of the text is set. This draw instruction moves the beam to the
X and Y position where the text string will be drawn, and this instruction sets the proper
text color. All these call instructions draw each letter
in the string and move the beam over to prepare for the next letter. It turns out that the bytecode of these instructions--seen
here on the left--is also checksummed. This means if you didn't touch the copyright
string itself, but instead modified how it converts that string into drawing instructions
(maybe just disabled it altogether), this checksum would fail instead. Now, due to how vector RAM works, this bytecode
may not appear in the same location in memory each time. It would depend on what other elements are
currently being drawn to the screen. Therefore, there's this small section of code
that runs whenever a text string is being converted to vector generator instructions. If the ID of the string is exactly $2C (which
is that of the copyright message), then the current vector RAM address is copied to a
temporary location. This just saves the location of where the
copyright string instructions are. Then, later, this checksum calculation is
performed. This is similar to the last check, where all
the bytes are added together using the ADC instruction, and the checksum complement is
$F2. This time the proper number of bytes is checked--40. This is important to get right in this case
since it's not guaranteed that the bytes following this code will be the same every time--again
it depends on what elements are going to be drawn to the screen. Now, there's actually a problem with how this
integrity check works. The copyright string is drawn in four different
places. Once during the attract mode demo, once while
the high score screen is displayed, once during the title screen, and once during the level
select screen. Since the string is drawn at a different location
and in different colors, the vector code will be different in all four cases. So the code I just showed you is only responsible
for checking the version of the copyright string during the attract mode demo. A completely different function is run to
checksum the other three versions of the string. Here's what that looks like. Now, this function actually uses a different
checksumming method entirely--it uses subtraction rather than addition to combine all of the
bytes. The checksum complement for the high score
screen version of the string is $0E, which is what you see here. This results in a final tally of zero for
this version, but it results in $E5 for the level select screen. If the result is not zero, then it gets exclusively
OR'd with $E5 next--which will give exactly zero for the level select screen. If it's still not zero, it may be $29 at this
point, which would be the case if it were checking the title screen variant. So finally the result is exclusively OR'd
with $29, which will give zero for the title screen, and anything but zero if the chunk
of bytes doesn't match any of these three exactly. This is sort of a lazy way to do it, since
any of the three versions of the string can match any of the three checksum complements
for the test to pass, instead of checking each one individually. The first checksum function writes its final
result to address $011B, while the second function writes its result to $0455. Now let's look at what happens if these integrity
checks fail. Here is the function responsible for this,
which runs during gameplay, even though the copyright string is not visible while in the
middle of a game. The two flags are OR'd together--they should
both be zero. If either one of them isn't zero, then the
game takes action. This time, instead of waiting for the player
to reach a certain level, it waits for a certain number of points. It checks if the highest byte of player
1's score is higher than 17, which equates to 180 thousand points. If they have at least that many points, then
a certain memory address gets incremented by one. Which memory address exactly is dependent
on the lowest byte of the player's score. Since the spikes in the playfield give you
just a single point when you hit them with a single bullet, this memory address can easily
be anything from $00 to $99 in binary coded decimal. Some of these locations hold game variables
which are responsible for enemy positions and the camera's location. And some of them hold important values that
keep the game running altogether. Haphazardly incrementing various values in
this range can cause a bunch of weird stuff to happen. The game can totally glitch out, and even
softlock. Though most of the time, it does end up crashing
and resetting itself. Now for the last software check. Both of the previous checks looked at the
data that made up the copyright string. But what if you just prevented the game from
even drawing the copyright string altogether? Well, they thought of that too. Here is the small section of code that is
responsible for drawing the copyright string (specifically on the attract mode demo and
high score screens only). Well, the bytecode of this function is also
checksummed, so if this code is tampered with, it will set off a flag as well. Here is the code responsible for that. This time, the exclusive OR operation is used
to combine all of the bytes, and the checksum complement is $A7. Just like the first check we looked at, it
seems that it checks one more byte than intended, but again, there's nothing inherently wrong
with that. The flag for this check is stored at memory
address $016C. What happens when this check fails? Well, let's see. If the checksum result is not zero, and the
current player is on level 21 or later, then this one single instruction gets executed:
SED. This instruction turns on decimal arithmetic
mode. Essentially, this allows the processor to
add and subtract numbers in base 10 instead of binary. Normally if you add 1 to 9, you get 10. But if you convert this into hexadecimal,
you get $0A. If the decimal arithmetic flag is enabled,
adding 1 to 9 gets you 16. This is so when you convert this into hexadecimal,
you get $10, which looks like what 1 plus 9 is in decimal. This is useful when you want to store numbers
in decimal instead of binary--generally when numbers are going to be displayed on the screen,
like the player's score. Tempest actually uses the decimal arithmetic
flag properly in several places in its code. Every time it does though, it makes sure to
turn it off when it's done doing math in decimal, since it does make math really messed up if
you aren't expecting it. However, in this anti-piracy check, the game
turns on decimal arithmetic and never turns it off. This really screws with pretty much all math
operations the game needs to function, which causes it to softlock pretty much immediately. The last anti-piracy check in the game has
to do with detecting genuine hardware. ROM chips are very common, and any person
can just go buy some. And it's trivial to copy data from one ROM
chip to another. Using this method, you could duplicate a ROM
chip, and essentially duplicate a game. However, if the game requires other chips
in order to run, you would need to get a second copy of all of those chips too. If those are as easy to get your hands on
as ROM chips, then there's no problem for the pirate. However, what if one of those chips was copyrighted
by the same company whose code you're trying to copy, and they only sell those chips to
licensed manufacturers of arcade cabinets? Then you might be out of luck. This is sort of what Atari did with Tempest. The chip in question is the Atari C012294
chip, also known as the pot keyboard integrated circuit, or POKEY chip, which is responsible
for audio output, keyboard and other controller input, timers, IRQs, and a random number generator. They were found in a lot of Atari's home computers,
game consoles, and arcade games. So they weren't impossible to get your hands
on. Tempest uses two POKEY chips for sound, button
and spinner inputs, as well as the RNG functionality. The developers check for the presence of a
pair of genuine POKEY chips by auditing their random number generation. As a pseudo random number generator, the stream
of random numbers the chips produce is completely deterministic. Hence, if you know exactly how the chips produce
their random numbers, and you read one random number, you should be able to infer what number
or numbers are possible next. This code does just that--it fetches two random
numbers from each POKEY chip and confirms that they are following the exact pattern
any POKEY chip's RNG output should. But before we look at the code, let's look
at how the POKEY random number generator works. This chip uses a hardware implementation of
a 17-bit linear feedback shift register. LFSRs have shown up a few times on this channel
already, but let me quickly explain how this works. Inside the chip are 17 bits that represent
the state of the random number generator. Every time a new number is to be generated,
all of the bits shift over one position. The bit at the end gets discarded, and an
empty spot at the start appears. This bit is filled in by taking the exclusive
or of two of the bits in the shift register itself. In this implementation, it is the 12th and
17th bit from the start that get XOR'd together. The result of this bit operation is shifted
in to the start. The POKEY chip then exposes eight of these
bits to be used as the output of the random number generator. This provides a number from 0 to 255 that
can be used by the software. How often is a random number generated? The POKEY chip generates a new number (and
consequently shifts the LFSR) on every machine cycle of the main processor on the motherboard. Tempest is clocked in at 1.512 MHz, so we're
looking at about 1.5 million numbers per second. This is very fast in comparison to any software
LFSR like we've seen before. To put it in perspective, a single load instruction
like this takes four machine cycles to execute. This means in the time it takes to load in
one byte from memory into a CPU register, the POKEY chip has already created four random
numbers. It's always grinding away in the background,
producing random numbers for free. (It uses these random numbers for itself in
some cases, particularly for audio noise generation.) A big problem with random number generators
is output correlation. A random number generator may be good at uniformly
providing a range of numbers, but it may be awful at providing multiple numbers in quick
succession. Just take this LFSR for example. If we generate a random number and we get,
say, $7C, what can the next number be? Well, we know all of the bits get shifted
to the right once, so we'll end up with $3E. But that's if a zero gets shifted in to the
top. If a one gets shifted in, then we might get
$BE. But that's it. If we get the random number $7C first, we
know the next number must be $3E or $BE. When there are 256 possible numbers, but we
can only really get 2 of them, that's not very random at all. Thankfully, since this is a hardware-based
RNG, it can generate numbers so fast that it's not hard to make sure we don't generate
random numbers too quickly. All it takes is 8 machine cycles to see a
completely new set of bits, which is only about 2 or 3 CPU instructions worth of time. However, we can also use this correlation
to our advantage. Since we know exactly how the POKEY random
number generator works, we can verify its existence via software by specifically checking
for that correlation. If it doesn't exist--that is, the numbers
are too random--then we know that something other than Atari's POKEY chip is on board
supplying random numbers. And that would mean there's a good chance
this arcade cabinet has been tampered with. That's exactly what this chunk of code does. The first thing it does is load two random
numbers back to back. This load instruction takes four machine cycles
to execute, so these two random numbers are four LFSR shifts apart from each other. This means, due to the correlation, that we
can expect the upper four bits of the first number to be identical to the lower four bits
of the second number. Just to make this more generalized, if we
load into A the number with bits "ABCDEFGH", then the number we load into Y should be "JKLMA'B'C'D'". We expect the bits A and A' and so on to be
identical, but it's good to tell them apart here just to make it clear what this function
is doing. The second number gets stored into a temporary
memory location, and the first number is shifted right four times to get those special bits
on their own. This EOR instruction takes the exclusive OR
of the byte in $29 and the byte in the A register and keeps the result in A.
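The comparison described so far can be sketched in Python. This is a minimal model under stated assumptions, not Tempest's actual 6502 code: the function names and the seed value are made up for illustration, and exposing the first eight bits of the register as the random byte is a guess (the correlation holds for any eight adjacent bits). It models the 17-bit register with the 12th and 17th bits feeding back, reads two bytes four shifts apart, and checks that the upper four bits of the first match the lower four bits of the second.

```python
LFSR_BITS = 17  # size of the shift register, as described above


def lfsr_step(state: int) -> int:
    """Shift once; the new first bit is (12th bit) XOR (17th bit)."""
    bit12 = (state >> (LFSR_BITS - 12)) & 1
    bit17 = state & 1  # the bit at the end, which gets discarded
    return (state >> 1) | ((bit12 ^ bit17) << (LFSR_BITS - 1))


def random_byte(state: int) -> int:
    """Assumption: the exposed byte is the first eight bits of the register."""
    return (state >> (LFSR_BITS - 8)) & 0xFF


def read_pair(state: int, cycles_apart: int = 4):
    """Read one byte, let the register shift, then read a second byte."""
    first = random_byte(state)
    for _ in range(cycles_apart):
        state = lfsr_step(state)
    second = random_byte(state)
    return first, second, state


def low_nibble_check(first: int, second: int) -> int:
    """Shift the first byte right four times and XOR it with the second.

    Only the lower four bits are kept here, since those are the ones
    expected to cancel out to zero.
    """
    return ((first >> 4) ^ second) & 0x0F


if __name__ == "__main__":
    # The $7C example: one shift later the byte is $3E or $BE.
    print(hex(random_byte(lfsr_step(0x7C << 9))))        # 0x3e
    print(hex(random_byte(lfsr_step((0x7C << 9) | 1))))  # 0xbe

    # Two reads four shifts apart always cancel in the lower four bits.
    state = 0x1ABCD  # arbitrary nonzero seed (hypothetical)
    for _ in range(1000):
        first, second, state = read_pair(state)
        assert low_nibble_check(first, second) == 0
```

For a pair of made-up readings that don't follow the pattern, say $7C followed by $35, the same check returns a nonzero value, which is the kind of mismatch the game treats as a failed test.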
Exclusively ORing a bit with zero doesn't change it, and exclusively ORing a bit with
itself always results in zero. Again, if these bits are identical like we
expect, this would make the bottom four bits all zero, but let's just leave these four
bits written as A XOR A' and so on for now. This result is written to the temporary memory
location. Then, remember how I said there are actually
two POKEY chips? Well, both of them are checked in this function. Here, we generate two numbers back to back
just like before, but from the second chip. They are also separated by four LFSR shifts. Oh yeah, these SEI and CLI instructions are
for disabling and reenabling interrupts. Since this code is actually timing-critical
(the load instructions need to take exactly 4 machine cycles), interrupts could mess this
up, which is why they are disabled while reading the POKEY chip RNG values. Anyway, suppose the number we get in A is
"WXYZNPQR", and the number in Y is "STUVW'X'Y'Z'". Now we exclusively OR the value at $29 with
A again, strip out the lower 4 bits by ANDing with the value $F0, and then XOR with $29
again. This is a clever set of instructions, and
it essentially transfers the lower 4 bits of the temporary value into A without touching
the upper 4 bits. With that we were able to combine both sets
of special bits into the same byte; we have WXYZ and ABCD together now. This now gets stored to memory, and then we
transfer the byte in Y to A so we can shift it to the left 4 times. This lines up the W'X'Y'Z' to the upper 4
bits, and then one more exclusive OR operation merges everything together. This was a lot of delicate bit manipulation,
but the end result is a single byte where our RNG number bits WXYZABCD are XORed with
W'X'Y'Z'A'B'C'D'. If we are indeed grabbing random numbers from
a genuine POKEY chip, then this result will always be zero, since any bits exclusively
OR'd with themselves always gives zero, and we expect each of A and A' and so on to be
identical. This result is put into a different memory
location at $011F to be read by the game later. Here is the last small bit of code that uses
that information to toss a wrench in the works if the test failed. It simply loads the result of the bitwise
operations from before, and if it is zero, it just returns right away. This one waits until player 1 has at least
150 thousand points before triggering. Similarly to one of the other checks, when
it goes off, it will increment a particular byte in memory dependent on the last two digits
of player 1's score. This one modifies memory in the range of $200
to $299 instead of $00 to $99, which means different things can happen. This region of memory is responsible for how
many enemies spawn in the level, where they currently are, as well as where the player's
ship is. A lot of weird graphical glitches can happen
when this area of memory is messed with. This was the exact bug that happened to me
a long time ago. The emulator did not emulate the POKEY chip's
random number generation accurately, and so this code was able to pick up on that and
kill any games anyone played up to this point. It has been fixed since then, so at least
now you can get more than 150 thousand points without worrying about your game crashing. Now, since all of these checks are implemented
via software, it's not very difficult to just find the code and delete it or otherwise disable
it. It's also very easy to do nowadays since we
have the tools to easily analyze all the code and figure out what is where and what does
what. But back in the day it would have taken a
lot of effort and a lot of failed ROM burns to get something usable. Funnily enough, in the present day, it's more
difficult to get an unmodified version of a Tempest arcade cabinet up and running due
to needing the genuine hardware, if you can find some that still work. As usual, thank you so much to everyone who
watches and supports RGME on Patreon. Patreon is the main way I get paid, and is
the only reason I am able to make these videos full time. If you like my work, consider joining--every
little bit helps out! Until next time.