The Arcade Game that Crashes Itself for Anti-Piracy Reasons

Captions
A long time ago I was playing Tempest on an arcade game emulator. I wasn't very good at it, so it was exciting when I finally made it far into the set of red levels. However, after passing a few more levels and sacrificing a bunch of imaginary quarters, something interesting happened. The spaceship flew off of the level and the vector lines making up the game started flailing all over the place. And shortly after, the game completely reset and lost all of my progress. I could never get that far into the game another time, but it never crashed like that ever again. So I just figured it was a once-in-a-lifetime bug that could never be reproduced. Old games are like that, right? Well, it turns out I was triggering some code in the game to fight against piracy. The emulator wasn't accurate enough and so the software was sabotaging my play session by purposely glitching it out and causing it to reset. In this video, we'll look into not one, not two, not three, but four different piracy checks that Atari put into the Tempest ROM. And we'll also get to see what happens when each of them gets set off. Let's get into it! There are two main ways that Tempest checks for potential piracy. It checks for software modifications, and it checks for genuine hardware presence. Let's start with the software side of things. The basis of all of the software checks is the checksum. A checksum is one way to verify data integrity--that is, tell if it may have been modified. A very simple type of checksum is to add up all of the data values you have. Suppose you had a padlock combination you needed to keep track of. You can add up the numbers and get some result. Now you can tell if one of the numbers is wrong. If you add them up and you don't get this checksum value, you know the data has been tampered with. It doesn't help you figure out what the real data should have been, but at least you know it's wrong.
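As a rough illustration of the padlock example, here is a minimal Python sketch of an add-everything-up checksum. The combination values and the function names are made up for this sketch; the captions do not give the actual numbers shown in the video.

```python
def checksum(values):
    """Add up every value; the total acts as the checksum."""
    return sum(values)


def is_intact(values, expected):
    """True when the values still add up to the recorded checksum."""
    return checksum(values) == expected


# Hypothetical padlock combination (not taken from the video).
combination = [38, 52, 24]
recorded = checksum(combination)

print(recorded)                            # total recorded alongside the data
print(is_intact([38, 52, 24], recorded))   # unchanged data passes
print(is_intact([38, 53, 24], recorded))   # one wrong number is detected
```

The check only reports that something changed. It cannot say which number is wrong or what it should have been.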
Usually the checksum is limited to a certain range of values, especially when there can be a lot of data. In this case we can say that our checksum is limited from 0 to 99 (we just wrap around at 100). In practice, a checksum is often limited to a certain number of bits. Another way of implementing a checksum is to use a complement. The complement is one extra value appended to the data to ensure that the checksum is always the same value, usually zero. In our case, since our checksum was 14, we would need to add 86 to it to reach 100 and wrap back to zero. This is our complement; it's not an extra number for the padlock combination, but at least we don't need to keep track of the checksum any more--it's built into the data. Now, checksums do have some drawbacks. If I add 1 to one of the numbers, but then subtract 1 from a different number, the checksum is still the same. Just because the checksum checks out doesn't mean the data has not been messed with. In fact, if someone knows in advance that the data is going to be checksummed, they could change the data into whatever they want as long as the checksum is identical to the original. If the data uses a checksum complement, then this becomes even easier, since they could adjust the complement to be whatever it needs to be. Now, a lot of arcade games use checksums to verify all of their code. For example, Pac-Man runs a checksum on each of its four ROM chips. If any one of them fails the check, the game will not boot and will display an error instead. However, Tempest doesn't do this, and instead only checksums a few small excerpts of its code. And they all have to do with this string of letters right here. All that matters to Atari is that they get their copyright message displayed correctly, apparently. Within all of the data in the game's ROM, there is a large table of strings that includes each of the phrases that appear throughout the game. At offset $D575 you can find the string for the copyright message.
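Going back to the padlock example for a moment, the wrap-around checksum, the complement, and the two weaknesses mentioned above can be sketched in Python. The combination is a made-up one chosen so that its checksum comes out to 14 and its complement to 86, as in the explanation; the function names are likewise only for illustration.

```python
MODULUS = 100  # the checksum wraps around at 100, so it stays in 0..99


def wrapped_checksum(values):
    """Sum of all values, wrapped into the range 0..99."""
    return sum(values) % MODULUS


def complement_for(values):
    """Extra value that makes the wrapped checksum come out to zero."""
    return (-sum(values)) % MODULUS


combo = [38, 52, 24]                          # hypothetical combination
print(wrapped_checksum(combo))                # 14
print(complement_for(combo))                  # 86
print(wrapped_checksum(combo + [86]))         # 0, the data now checks itself

# Weakness 1: +1 on one value and -1 on another goes unnoticed.
print(wrapped_checksum([39, 51, 24]))         # still 14

# Weakness 2: with a complement, any forged data can be made to pass
# by recomputing the complement to match.
forged = [1, 2, 3]
print(wrapped_checksum(forged + [complement_for(forged)]))  # 0
```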
The first byte denotes the X position the string is drawn at on the screen, and then each byte afterwards is an individual letter. So here, the $50 is the copyright symbol, $00 is a space, $2E is the letter M, $1A is the letter C, and so on. The last character in the string is denoted by having its highest bit set. The $A6 here is the letter I, which is just $26 but with the highest bit set. Here is the code that calculates the checksum for this copyright string. The checksum complement is the $85 here, which is stored in the code rather than in the string itself. That gets added together with each of the values for the letters in the string, as well as that X position byte. In fact, the X position of the string following the copyright string is unintentionally included as well. The copyright string data is 16 bytes (1 byte for position and 15 letters), but the maximum index value used in the code is 16. Since tables use a zero-based index, this actually grabs the 17th byte from the start of the string--one more than its length. That's not really an issue though, it just means the X position of the word "credits" also cannot be modified without tripping the piracy detection. One last thing to point out here is that the add instruction here is an ADC instruction (since 65xx processors don't have a standard ADD instruction). This means the carry flag is also added to the running total each time this instruction is run. The carry flag is cleared at the start of this function, but it will get set whenever the running total wraps around from 255 to 0. If you start with the complement of $85, and add each of these 17 bytes together, you end up with $4FC. Well, if you take the lower byte and add all the carried bits, you get $FC plus 4, which is $100. Now looking at only the lower eight bits you get zero, which tells us that the checksum is valid and that the string has not been modified.
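The carry behaviour described above is easy to get wrong, so here is a small Python model of a running total built with an add-with-carry instruction. It is only a sketch of the technique: the function name is invented, and the byte values in the examples are placeholders rather than the real 17 bytes from the ROM, which the captions do not list in full.

```python
def adc_checksum(seed, data):
    """Running 8-bit total where each add also adds the previous carry.

    seed is the checksum complement the total starts from; the carry
    starts cleared, as described for the Tempest routine.
    """
    a = seed & 0xFF
    carry = 0
    for byte in data:
        total = a + (byte & 0xFF) + carry
        a = total & 0xFF       # keep only the low eight bits
        carry = total >> 8     # 1 if the total wrapped past 255, else 0
    return a


# Placeholder bytes, not the real copyright string.
print(hex(adc_checksum(0xF0, [0x20, 0x05])))   # 0x16: the carry adds one extra
print(hex((0xF0 + 0x20 + 0x05) & 0xFF))        # 0x15: a plain sum would differ
print(adc_checksum(0x85, [0x7A, 0x01]))        # 0: this data would pass the check
```

Because every wrap-around feeds one extra count back into the total, a complement computed with a plain modulo-256 sum would generally not match what this loop produces.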
The game then writes this result to a byte in memory at $B5 to refer back to later. This should always be zero, but if it's not, then the game knows that it has been modified and can take any action necessary. What does it end up doing? Well, let's look at another bit of code. This gets run regularly during normal gameplay. The first instruction checks if the value held in $B5 is zero. If it is, everything gets skipped. Otherwise, it performs another check. If player 1 is currently on level 10 or earlier, then the next few instructions also get skipped. Otherwise, for level 11 or later, a value of $7A is written to memory address $53. This memory address contains a helper timer that sort of functions as the game's heartbeat--it throttles the game so that it runs at a consistent framerate. It's not really important to know how exactly that works, but it is important to know that it only ever counts up to 9, which means this byte should only ever be in the range of zero to eight. This code sets it to $7A, or 122 in decimal. This is way outside the normal range, and it causes the game to force reset instantly. If you try to start on level 11 or later, it resets immediately. And if you complete level 10, the moment you transition from it to the next level, it resets. So why would the programmers make the game reset on level 11, instead of just failing to even start the game in the first place, like with Pac-Man? Well, it's sort of a cruel reason. Suppose you wanted to hack Tempest, and make your own version of it. (This exists by the way, there was a somewhat popular romhack called Tempest Tubes that just changed the shapes of the levels--the copyright string stays intact so as not to trip the piracy detection.) Once you are done modifying the game, you would have to program a new ROM chip and insert it into the motherboard. If it doesn't boot up right away, then clearly something is wrong, and you can try to troubleshoot the problem.
But if it starts up just fine, then maybe you would presume that it works! You may even play from the beginning of the game, and even convince yourself that the game plays fine for the first 10 levels, so you pack everything up and show it off to your friends. But then, someone gets to level 11 and suddenly the game crashes. What was the cause of the crash? Was it faulty hardware? A software bug? It was working just fine for you while you were testing it, so you may not even think that the game is even performing a piracy check in the first place. And if you did figure that was the case, having to play up to level 11 each time just to make sure it doesn't crash would be sort of annoying. Then you have to be confident that that was the only piracy check and that there weren't others just waiting to be tripped. Which, of course, there were! The game converts these text strings into lists of vector generator instructions so that the vector generator can actually draw them to the screen. I have a video about the vector generator and its instruction set if you want to learn more about that. But here is an example of what the code snippet looks like. First the beam is centered on the screen, and the size of the text is set. This draw instruction moves the beam to the X and Y position where the text string will be drawn, and this instruction sets the proper text color. All these call instructions draw each letter in the string and move the beam over to prepare for the next letter. It turns out that the bytecode of these instructions--seen here on the left--is also checksummed. This means if you didn't touch the copyright string itself, but instead modified how it converts that string into drawing instructions (maybe just disabled it altogether), this checksum would fail instead. Now, due to how vector RAM works, this bytecode may not appear in the same location in memory each time. It would depend on what other elements are currently being drawn to the screen. 
Therefore, there's this small section of code that runs whenever a text string is being converted to vector generator instructions. If the ID of the string is exactly $2C (which is that of the copyright message), then the current vector RAM address is copied to a temporary location. This just saves the location of where the copyright string instructions are. Then, later, this checksum calculation is performed. This is similar to the last check, where all the bytes are added together using the ADC instruction, and the checksum complement is $F2. This time the proper number of bytes are checked--40. This is important to get right in this case since it's not guaranteed that the bytes following this code will be the same every time--again it depends on what elements are going to be drawn to the screen. Now, there's actually a problem with how this integrity check works. The copyright string is drawn in four different places. Once during the attract mode demo, once while the high score screen is displayed, once during the title screen, and once during the level select screen. Since the string is drawn at a different location and in different colors, the vector code will be different in all four cases. So the code I just showed you is only responsible for checking the version of the copyright string during the attract mode demo. A completely different function is run to checksum the other three versions of the string. Here's what that looks like. Now, this function actually uses a different checksumming method entirely--it uses subtraction rather than addition to combine all of the bytes. The checksum complement for the high score screen version of the string is $0E, which is what you see here. This results in a final tally of zero for this version, but it results in $E5 for the level select screen. If the result is not zero, then it gets exclusively OR'd with $E5 next--which will give exactly zero for the level select screen. 
If it's still not zero, it may be $29 at this point, which would be the case if it were checking the title screen variant. So finally the result is exclusively OR'd with $29, which will give zero for the title screen, and anything but zero if the chunk of bytes doesn't match any of these three exactly. This is sort of a lazy way to do it, since any of the three versions of the string can match any of the three checksum complements for the test to pass, instead of checking each one individually. The first checksum function writes its final result to address $011B, while the second function writes its result to $0455. Now let's look at what happens if these integrity checks fail. Here is the function responsible for this, which runs during gameplay, even though the copyright string is not visible while in the middle of a game. The two flags are OR'd together--they should both be zero. If either one of them isn't zero, then the game takes action. This time, instead of waiting for the player to reach a certain level, it waits for a certain amount of points. It checks if the highest byte of player 1's score is higher than 17, which equates to 180 thousand points. If they have at least that many points, then a certain memory address gets incremented by one. Which memory address exactly is dependent on the lowest byte of the player's score. Since the spikes in the playfield give you just a single point when you hit them with a single bullet, this memory address can easily be anything from $00 to $99 in binary coded decimal. Some of these locations hold game variables which are responsible for enemy positions and the camera's location. And some of them hold important values that keep the game running altogether. Haphazardly incrementing various values in this range can cause a bunch of weird stuff to happen. The game can totally glitch out, and even softlock. Though most of the time, it does end up crashing and resetting itself. Now for the last software check.
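Before moving on, the XOR cascade and the score-based sabotage just described can be modelled in a few lines of Python. This is a sketch under assumptions, not the game's code: it assumes the score is stored as six binary coded decimal digits across three bytes, it takes the raw subtraction result as an input instead of recomputing it, and all function names are invented. The raw value $CC for the title screen is derived from the description ($CC XOR $E5 is $29).

```python
def xor_cascade(raw):
    """Fold the raw subtraction result through the two XOR steps."""
    result = raw & 0xFF
    if result != 0:
        result ^= 0xE5   # a level select screen result becomes zero here
    if result != 0:
        result ^= 0x29   # a title screen result becomes zero here
    return result


def to_bcd(two_digits):
    """Pack a number from 0 to 99 into one binary coded decimal byte."""
    return ((two_digits // 10) << 4) | (two_digits % 10)


def sabotage(memory, score, flag_a, flag_b):
    """Increment one byte of memory when a check failed and the score is high.

    Returns the address that was incremented, or None if nothing happened.
    """
    if (flag_a | flag_b) == 0:
        return None                          # both checks passed
    high = to_bcd(score // 10_000 % 100)     # assumed highest score byte
    low = to_bcd(score % 100)                # assumed lowest score byte
    if high <= 0x17:
        return None                          # fewer than 180,000 points
    memory[low] = (memory[low] + 1) & 0xFF
    return low


print(xor_cascade(0x00), xor_cascade(0xE5), xor_cascade(0xCC))  # 0 0 0
print(hex(xor_cascade(0x29)))                # 0xe5, so this input is rejected

ram = bytearray(0x100)
print(hex(sabotage(ram, 180_037, 0, 1)))     # 0x37 gets incremented
print(sabotage(ram, 179_999, 0, 1))          # None, score still too low
```

Because the target address follows the last two digits of the score, the damage lands somewhere different almost every time it triggers, which fits the unpredictable glitches described above.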
Both of the previous checks looked at the data that made up the copyright string. But what if you just prevented the game from even drawing the copyright string altogether? Well, they thought of that too. Here is the small section of code that is responsible for drawing the copyright string (specifically on the attract mode demo and high score screens only). Well, the bytecode of this function is also checksummed, so if this code is tampered with, it will set off a flag as well. Here is the code responsible for that. This time, the exclusive OR operation is used to combine all of the bytes, and the checksum complement is $A7. Just like the first check we looked at, it seems that it checks one more byte than intended, but again, there's nothing inherently wrong with that. The flag for this check is stored at memory address $016C. What happens when this check fails? Well, let's see. If the checksum result is not zero, and the current player is on level 21 or later, then this single instruction gets executed: SED. This instruction turns on decimal arithmetic mode. Essentially, this allows the processor to add and subtract numbers in base 10 instead of binary. Normally if you add 1 to 9, you get 10. But if you convert this into hexadecimal, you get $0A. If the decimal arithmetic flag is enabled, adding 1 to 9 gets you 16. This is so when you convert this into hexadecimal, you get $10, which looks like what 1 plus 9 is in decimal. This is useful when you want to store numbers in decimal instead of binary--generally when numbers are going to be displayed on the screen, like the player's score. Tempest actually uses the decimal arithmetic flag properly in several places in its code. Every time it does though, it makes sure to turn it off when it's done doing math in decimal, since it does make math really messed up if you aren't expecting it. However, in this anti-piracy check, the game turns on decimal arithmetic and never turns it off.
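To make the difference between the two modes concrete, here is a simplified Python model of an 8-bit add with and without the decimal flag. It is a sketch that only handles operands that are already valid binary coded decimal; what a real 6502 does with other inputs in decimal mode is not modelled, and the function name is invented.

```python
def add_with_carry(a, b, carry=0, decimal=False):
    """Return (result, carry_out) for an 8-bit add in binary or decimal mode."""
    if not decimal:
        total = a + b + carry
        return total & 0xFF, total >> 8
    # Decimal mode: treat each 4-bit nibble as one decimal digit.
    low = (a & 0x0F) + (b & 0x0F) + carry
    high = (a >> 4) + (b >> 4)
    if low > 9:
        low -= 10
        high += 1
    carry_out = 0
    if high > 9:
        high -= 10
        carry_out = 1
    return (high << 4) | low, carry_out


print([hex(v) for v in add_with_carry(0x09, 0x01)])                # ['0xa', '0x0']
print([hex(v) for v in add_with_carry(0x09, 0x01, decimal=True)])  # ['0x10', '0x0']
print([hex(v) for v in add_with_carry(0x99, 0x01, decimal=True)])  # ['0x0', '0x1']
```

With the flag left set, code that expects binary results, such as a counter or an address calculation, gets decimal-adjusted values instead.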
This really screws with pretty much all math operations the game needs to function, which causes it to softlock pretty much immediately. The last anti-piracy check in the game has to do with detecting genuine hardware. ROM chips are very common, and any person can just go buy some. And it's trivial to copy data from one ROM chip to another. Using this method, you could duplicate a ROM chip, and essentially duplicate a game. However, if the game requires other chips in order to run, you would need to get a second copy of all of those chips too. If those are as easy to get your hands on as ROM chips, then there's no problem for the pirate. However, what if one of those chips was copyrighted by the same company whose code you're trying to copy, and they only sell those chips to licensed manufacturers of arcade cabinets? Then you might be out of luck. This is sort of what Atari did with Tempest. The chip in question is the Atari C012294 chip, also known as the pot keyboard integrated circuit, or POKEY chip, which is responsible for audio output, keyboard and other controller input, timers, IRQs, and a random number generator. They were found in a lot of Atari's home computers, game consoles, and arcade games. So they weren't impossible to get your hands on. Tempest uses two POKEY chips for sound, button and spinner inputs, as well as the RNG functionality. The developers check for the presence of a pair of genuine POKEY chips by auditing their random number generation. As a pseudo random number generator, the stream of random numbers the chips produce is completely deterministic. Hence, if you know exactly how the chips produce their random numbers, and you read one random number, you should be able to infer what number or numbers are possible next. This code does just that--it fetches two random numbers from each POKEY chip and confirms that they are following the exact pattern any POKEY chip's RNG output should. 
But before we look at the code, let's look at how the POKEY random number generator works. This chip uses a hardware implementation of a 17-bit linear feedback shift register. LFSRs have shown up a few times on this channel already, but let me quickly explain how this works. Inside the chip are 17 bits that represent the state of the random number generator. Every time a new number is to be generated, all of the bits shift over one position. The bit at the end gets discarded, and an empty spot at the start appears. This bit is filled in by taking the exclusive or of two of the bits in the shift register itself. In this implementation, it is the 12th and 17th bit from the start that get XOR'd together. The result of this bit operation is shifted in to the start. The POKEY chip then exposes eight of these bits to be used as the output of the random number generator. This provides a number from 0 to 255 that can be used by the software. How often is a random number generated? The POKEY chip generates a new number (and consequently shifts the LFSR) on every machine cycle of the main processor on the motherboard. Tempest is clocked in at 1.512 MHz, so we're looking at about 1.5 million numbers per second. This is very fast in comparison to any software LFSR like we've seen before. To put it in perspective, a single load instruction like this takes four machine cycles to execute. This means in the time it takes to load in one byte from memory into a CPU register, the POKEY chip has already created four random numbers. It's always grinding away in the background, producing random numbers for free. (It uses these random numbers for itself in some cases, particularly for audio noise generation.) A big problem with random number generators is output correlation. A random number generator may be good at uniformly providing a range of numbers, but it may be awful at providing multiple numbers in quick succession. Just take this LFSR for example. 
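As a point of reference, here is a small Python model of the 17-bit shift register described above. The tap positions (12th and 17th bit from the start) come from the description; which eight bits the chip exposes is an assumption in this sketch (the eight nearest the start), and the seed value and function names are arbitrary.

```python
STATE_BITS = 17
STATE_MASK = (1 << STATE_BITS) - 1


def step(state):
    """Advance the register by one machine cycle.

    Position 1 (the start, where the new bit enters) is bit 16 of the
    integer; position 17 (the end, which is discarded) is bit 0.
    """
    tap12 = (state >> (STATE_BITS - 12)) & 1
    tap17 = state & 1
    new_bit = tap12 ^ tap17
    return ((state >> 1) | (new_bit << (STATE_BITS - 1))) & STATE_MASK


def output(state):
    """Eight exposed bits (assumed here: the eight nearest the start)."""
    return (state >> (STATE_BITS - 8)) & 0xFF


def advance(state, cycles):
    """Run the register for a number of machine cycles."""
    for _ in range(cycles):
        state = step(state)
    return state


state = 0x1ACE5                     # arbitrary non-zero 17-bit seed
first = output(state)
second = output(advance(state, 4))  # what a second four-cycle load would see
print(f"{first:08b} then {second:08b}")
```

In this model, two reads that are four cycles apart share four bits: the upper half of the first byte reappears as the lower half of the second.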
If we generate a random number and we get, say, $7C, what can the next number be? Well, we know all of the bits get shifted to the right once, so we'll end up with $3E. But that's if a zero gets shifted in to the top. If a one gets shifted in, then we might get $BE. But that's it. If we get the random number $7C first, we know the next number must be $3E or $BE. When there are 256 possible numbers, but we can only really get 2 of them, that's not very random at all. Thankfully, since this is a hardware-based RNG, it can generate numbers so fast that it's not hard to make sure we don't generate random numbers too quickly. All it takes is 8 machine cycles to see a completely new set of bits, which is only about 2 or 3 CPU instructions worth of time. However, we can also use this correlation to our advantage. Since we know exactly how the POKEY random number generator works, we can verify its existence via software by specifically checking for that correlation. If it doesn't exist--that is, the numbers are too random--then we know that something other than Atari's POKEY chip is on board supplying random numbers. And that would mean there's a good chance this arcade cabinet has been tampered with. That's exactly what this chunk of code does. The first thing it does is load two random numbers back to back. This load instruction takes four machine cycles to execute, so these two random numbers are four LFSR shifts apart from each other. This means, due to the correlation, that we can expect the upper four bits of the first number to be identical to the lower four bits of the second number. Just to make this more generalized, if we load into A the number with bits "ABCDEFGH", then the number we load into Y should be "JKLMA'B'C'D'". We expect the bits A and A' and so on to be identical, but it's good to tell them apart here just to make it clear what this function is doing. 
The second number gets stored into a temporary memory location, and the first number is shifted right four times to get those special bits on their own. This EOR instruction takes the exclusive OR of the byte in $29 and the byte in the A register and keeps the result in A. Exclusively ORing a bit with zero doesn't change it, and exclusively ORing a bit with itself always results in zero. Again, if these bits are identical like we expect, this would make the bottom four bits all zero, but let's just leave these four bits written as A XOR A' and so on for now. This result is written to the temporary memory location. Then, remember how I said there are actually two POKEY chips? Well, both of them are checked in this function. Here, we generate two numbers back to back just like before, but from the second chip. They are also separated by four LFSR shifts. Oh yeah, these SEI and CLI instructions are for disabling and reenabling interrupts. Since this code is actually timing-critical (the load instructions need to take exactly 4 machine cycles), interrupts could mess this up, which is why they are disabled while reading the POKEY chip RNG values. Anyway, suppose the number we get in A is "WXYZNPQR", and the number in Y is "STUVW'X'Y'Z'". Now we exclusively OR the value at $29 with A again, strip out the lower 4 bits by ANDing with the value $F0, and then XOR with $29 again. This is a clever set of instructions, and it essentially transfers the lower 4 bits of the temporary value into A without touching the upper 4 bits. With that we were able to combine both sets of special bits into the same byte; we have WXYZ and ABCD together now. This now gets stored to memory, and then we transfer the byte in Y to A so we can shift it to the left 4 times. This lines up the W'X'Y'Z' to the upper 4 bits, and then one more exclusive OR operation merges everything together.
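The sequence of steps above can be followed in Python by treating the A register and the temporary byte at $29 as plain variables. This is a sketch of the described bit manipulation only: the four input bytes are passed in as arguments rather than read from hardware, and the function name and sample values are invented.

```python
def pokey_check(a1, y1, a2, y2):
    """Combine two reads from each chip into one byte; zero means 'genuine'.

    a1, y1: first and second read from the first chip (4 shifts apart).
    a2, y2: first and second read from the second chip (4 shifts apart).
    """
    temp = y1                 # second number saved to the temporary byte
    a = a1 >> 4               # first number shifted right four times
    a ^= temp                 # low nibble is now ABCD xor A'B'C'D'
    temp = a                  # written back to the temporary byte

    a = a2 ^ temp             # exclusive OR with the temporary byte
    a &= 0xF0                 # strip out the lower four bits
    a ^= temp                 # upper nibble is WXYZ, lower nibble kept from temp
    temp = a                  # stored to memory again

    a = (y2 << 4) & 0xFF      # second read shifted left four times
    a ^= temp                 # upper nibble is now WXYZ xor W'X'Y'Z'
    return a


# Consistent pairs: upper nibble of each first read equals the lower
# nibble of the matching second read, so the result is zero.
print(pokey_check(0x7C, 0xB7, 0xA5, 0x3A))   # 0

# One mismatched bit in the first chip's pair leaves a non-zero result.
print(pokey_check(0x7C, 0xB6, 0xA5, 0x3A))   # 1
```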
This was a lot of delicate bit manipulation, but the end result is a single byte where our RNG number bits WXYZABCD are XORed with W'X'Y'Z'A'B'C'D'. If we are indeed grabbing random numbers from a genuine POKEY chip, then this result will always be zero, since any bits exclusively OR'd with themselves always give zero, and we expect each of A and A' and so on to be identical. This result is put into a different memory location at $011F to be read by the game later. Here is the last small bit of code that uses that information to toss a wrench in the works if the test failed. It simply loads the result of the bitwise operations from before, and if it is zero, it just returns right away. This one waits until player 1 has at least 150 thousand points before triggering. Similarly to one of the other checks, when it goes off, it will increment a particular byte in memory dependent on the last two digits of player 1's score. This one modifies memory in the range of $200 to $299 instead of $00 to $99, which means different things can happen. This region of memory is responsible for how many enemies spawn in the level, where they currently are, as well as where the player's ship is. A lot of weird graphical glitches can happen when this area of memory is messed with. This was the exact bug that happened to me a long time ago. The emulator did not emulate the POKEY chip's random number generation accurately, and so this code was able to pick up on that and kill any games anyone played up to this point. It has been fixed since then, so at least now you can get more than 150 thousand points without worrying about your game crashing. Now, since all of these checks are implemented via software, it's not very difficult to just find the code and delete it or otherwise disable it. It's also very easy to do nowadays since we have the tools to easily analyze all the code and figure out what is where and what does what.
But back in the day it would have taken a lot of effort and a lot of failed ROM burns to get something usable. Funnily enough, in the present day, it's more difficult to get an unmodified version of a Tempest arcade cabinet up and running due to needing the genuine hardware, if you can find some that still work. As usual, thank you so much to everyone who watches and supports RGME on Patreon. Patreon is the main way I get paid, and is the only reason I am able to make these videos full time. If you like my work, consider joining--every little bit helps out! Until next time.
Info
Channel: Retro Game Mechanics Explained
Views: 332,174
Keywords: video, game, programming, code, glitch, trick, explain, description, hack, tas, arcade, tempest, atari, crash, piracy, checksum
Id: ewoDLDDgHkI
Length: 29min 57sec (1797 seconds)
Published: Fri Jul 28 2023