Hey, I'm Dave, welcome to my Shop. I'm Dave Plummer, a retired operating systems
engineer from Microsoft going back to the MS-DOS and Windows 95 days, and I'm here to
answer the big important historical questions, like why are blue screens blue? After all, someone had to decide, but who,
and why? I tracked down the actual developer responsible
and the reason might surprise you. I know I wouldn't have guessed, and you won't
either. But did you also know that bluescreens are
sometimes green? And would you like to make yours red to impress
your friends? I'll show you how! Then I'll take you on a tour of bluescreens
going all the way back to Windows 1.0 and show you what they used to look like. All that, plus what causes them in the first
place and how to fix them on Windows 10, coming right up on Dave's Garage. [Intro] Before we worry about why bluescreens are
blue, and when they're not, we need to talk about why they exist at all. Then we'll look at some infamous bluescreens,
like the one where Windows 98 blew up live onstage for Bill Gates, and I'll relate a
few bluescreen stories from the trenches. And so first, we should at least talk about
what blue screens are, why they happen, whether they are avoidable in the first place, and
how to fix them if not. As you likely know, a bluescreen is the end
of the line. It's a screen displayed to you when the system
instantly locks up and shuts down. If you're not up to speed on the issues surrounding
them, it might seem rather drastic of the operating system to completely shut down and
lock things up the moment the slightest thing goes wrong inside the kernel but that's precisely
what it does! But what's the alternative, and what would the implications
be? And by the way, I never once heard the term
"blue screen of death" at Microsoft, only on the Internet afterwards. We called it primarily just a blue screen
or, more frequently, a bug check. That's the name of the API down in the kernel
that allows you to raise a bluescreen in the unfortunate event one is needed, and so that
was the internal name we generally used. Like all the great words, it can be both a
verb and a noun. When it comes to software demos, a bluescreen
is synonymous with serious failure. The last thing you want is for your demo to
end in a bluescreen at a big conference or up on stage. Let's have a look at a famous bluescreen. Back in 1998, a nervous-looking Chris Capossela
was ready to plug a scanner into a Windows 98 desktop and demonstrate how plug and play
would automatically detect it. They cut away quickly, but you'll notice Bill's
first reaction is actually to smile. He seems more chagrined than angry, at least
on stage. His quip about not being quite ready to ship
was prescient and kind of funny. It happens. I've heard a ton of rumors that the poor guy
in the demo was summarily fired and so on, but it's not true. Chris went on to do pretty well at Microsoft,
and is currently the CMO, or Chief Marketing Officer, of the entire worldwide consumer
business. So clearly, even bluescreening the system
during a live Windows demo with Bill on stage turns out not to be a career-limiting move! I wasn't there, but I'm told another bluescreen
happened when Bill was demoing Forza, the racing video game. Reportedly, it was still very early in the
product development cycle and not everyone was excited about the prospect of using the
early code in an important demo, but the Program Manager, or PM, was insistent. The PM tested the demo nine ways from Sunday,
and it worked flawlessly each and every time. A few hours before the demo they tested it
a couple of times more to confirm it was still ready to go, which it was, and soon enough
the time came for the actual demo. Bill grabbed one controller and the PM grabbed
the other but rather than a driving Battle Royale, the machine bluescreened. But how? Why, after testing it successfully so many
times, did it crash on stage? A little investigation turned up a simple
explanation: the early code had memory leaks, so it'd work the first time through, but if
the machine had been up for too long, or they had tested the demo too many times, the memory
leaks would have accumulated and eventually a bluescreen would result. That's fine in a very early beta, perhaps,
but what causes a blue screen when you're using a shipping version of the operating
system? Well, as I said, literally it happens when
code in the kernel decides it's appropriate or unavoidable for whatever reason and it
calls the KeBugCheck API. The Ke is for Kernel. Calling that API is a little like pulling
the fire alarm in a theater. It's a serious thing, the operating system's
way of tearing the tent down around itself. But what kinds of things go wrong that the
operating system is so excited about that it's willing to throw away your work in progress
and lock up irretrievably? Well, rest assured it is only done truly as
a last resort when there are no other safe options. In the simplest case, let's say the kernel
commits an access violation, by which I mean it reads from or writes to memory that it
doesn't own. As soon as the attempt to write to memory outside
the kernel's control is made, a part of the processor known as the memory management unit,
or MMU, detects it and raises an exception. Windows catches that exception and ultimately
bugchecks the system, putting up the dreaded bluescreen. Note importantly that this all happens BEFORE
the memory is actually corrupted. In this case, any such illegal writes are
not only detected, but they are also prevented from happening at all. Philosophically, that's important, because
it means that the system might come to a premature halt, but it's never in a corrupted state. It's alive or it's dead, but it's never a
zombie. It's unfortunate that you can lose work in
progress, but it's better than corrupting already saved work in the form of data loss. Data loss is always the much greater sin. You can contrast that with the 16-bit days. Back in Windows 3.1, for example, and even
continuing on through Win95, the fatal exception screen was also blue, but it allowed you to
continue at your option. The system would tell you that it was borked,
but if you wanted to keep on a drivin' old Bessie on five cylinders, that was up to you. You at least had a shot at trying to save
your work before the machine burst into flames or did whatever it was going to do. With NT, it's not so simple. It's supposed to be robust and secure and
trustworthy, meaning "borked" was not really in our vocabulary. And it really has to be that way, if you stop
and think about it. What if the system is running inside of an
ATM, and the memory that's being illegally cleared is also the flag that keeps track
of how many hundred-dollar bills have been dispensed? Or perhaps it's some important function in
a healthcare monitor? I don't even know if the Windows license allows
for use in situations where life and limb are at risk, but if you want to know what
the reality is, consider this: when the Fukushima power plant melted down not that long ago,
48,000 of their PCs were running Windows. But not just any Windows. They were still running XP even though it
was already well past End of Life Support. The reality is, or at least it used to be,
that by the time Windows ships, the kernel itself is quite rock solid. I've seen a lot of bluescreens out in the
world and even debugged my fair share out there, and it's never wound up being caused
by an actual kernel bug in a shipping version of Windows. Like compiler bugs they do exist, of course,
but in my real-world experience it's always turned out to be something else, and most
often it's a bad device driver. And that's a weakness in most all traditional
operating systems - the device drivers run at the same privilege level as the kernel. A device driver acts as though it were part
of the kernel itself, and its bugs are the same as kernel bugs. A simple access violation in a graphics driver
will immediately bluescreen a system. And it has to, because it has access to the
same memory that the kernel does, which is to say, every byte of the entire system. A poorly written or poorly tested driver can
truly be the Achilles heel of an otherwise solid system. One fix Microsoft has come up with is to move
progressively more and more drivers out of the kernel space and into plain old user space. It's more work for the driver authors, but
a big win for stability. Printers, for example. It was a painful transition at first but a
worthwhile one. Now if your printer driver crashes, it does
so in user space and worst case brings down the process that was doing the printing. Sound drivers have similarly been moved. Changes like that have gone a long way towards
making software bluescreens far less common than they used to be. Ironically, one step backwards in stability
came when the NT video subsystem was moved INTO the kernel for performance reasons. In the very early days even the video drivers
were in user mode. If your thread wanted to make a call to your
graphics card, a kernel thread would copy all of the data over to the kernel side and
then do the system call for you. It meant the video driver could even crash
without bringing the system down, but at a significant performance penalty, so they moved
those drivers INTO the kernel around the XP timeframe. One big improvement came with Windows 2000
and the introduction of what is known as the Driver Verifier, a version of the kernel that
stresses and abuses drivers in strange ways in an attempt to get them to fail. These two changes - the Driver Verifier and
the user mode driver model - likely went further than anything else in reducing the number
of Windows bluescreens. Let's take a quick look at bluescreens over
time. We'll start with the Windows 1.0 screen, which
as you can see, does little more than simply show you what appears to be a memory buffer. Perhaps if you're lucky you'd be able to see
whatever corruption was written to it, but it's clearly not much use other than to confirm
that the system has indeed crashed. That screen was used on through Windows 2
as well. It was finally replaced with something a little
more informative by Windows 3.1. And it is in Windows 3.1 that we find the
first evidence of a proper crash screen - more correctly known as the control-alt-delete
screen. With white text on a blue background and a
text message explaining the crash, it turns out that the Windows 95 bluescreen was the
creation of Raymond Chen. It further turns out that Raymond was responsible
not only for the Windows 95 bluescreen but also the Windows 3.1 bluescreen. I should note that he also correctly identifies
them as the control alt delete screens, but I'll let you mentally lump it in with the
other bluescreens for now. But what's more surprising is that Raymond
does not take credit for the text that you find on the control alt delete screen. Because as he explained, the actual author
of the control-alt-delete text for Windows 3.1 was none other than Steve Ballmer himself. As it turns out, Steve was visiting the systems
division one day and saw an early prototype that was displaying the fatal information
screen, but he didn't like the wording he saw. Someone suggested that if Steve thought he
could do better, he should do it himself, and so of course, he did. He went home and wrote it out and then emailed
Raymond the text that should appear on the screen, and Raymond dutifully put it into
the product almost verbatim, where it was enshrined for the ages. All I had to do, then, was find the author
of what is now the Windows 10 bluescreen code and confirm that he or she had been inspired
by the Windows 3.1 control-alt-delete screen, and the circle would be complete. A tidy story wrapped in ribbon and tied in
a knot with a bow and all of that. And that is why I must now tell you the tale
of how I dug through the source code comments and version control logs in order to track
down the information on who actually wrote the code for the Windows NT bluescreen. The developer's alias was jvert, short for
John Vert. Searching for the blue screen dev whose name
means green, I ultimately located him at his new home, at the Redmond Town Center for Peaceful
Living. The staff directed me to his little cabin,
where I found a grizzled old man rocking in a chair, stroking his long white beard, mumbling
to himself. I approached him as if he were a skittish
deer, lest he be spooked, and when I got close enough, I said only three words: but why blue? He tipped his head back and cackled at me
in a way only someone truly mad could do, and as he raised a bony finger to point at
the sky, that's when I realized that I probably had the wrong John Vert, as there were several
in the phone book. So, I went home and sent a message on Facebook
to someone who looked a lot more like I remember him. The correct John got back to me promptly and
quickly dispelled two assumptions I had made about the bluescreen. For years I had assumed that bluescreens were
blue to make them easier to spot in the lab. By the mid-90s we had a few large rooms that
contained perhaps a few dozen PCs on shelves running stress tests each day on the latest
operating system builds. You could walk into a lab and simply by glancing
around easily spot where the troublemaker was. Somehow, I long assumed that this was the
reason as if I had heard it from someone authoritative. But now I didn't have to guess, I could hear
it first person from the fellow who actually wrote the code to turn the screen blue. It turns out that while perhaps convenient
later, John said labs were not a factor in his original decision, since the bugcheck
screen was done long before there were any large labs running NT stress. A second assumption I made, one that also
turned out not to be true at all, was that because both Windows NT and Windows 3.1 had
fatal error screens that were white on blue, NT had simply followed the Windows 3.1
precedent. John said this was also not true - in fact,
most of the early NT guys likely had never used Win 3.1. I'll also point out that Windows was not the
original UI for NT, the OS/2 Presentation Manager would have been. So, for Windows NT at least, the bluescreen
is actually older than Windows NT as we know it. If NT and Windows 95 did not have a shared
heritage, then that means that John Vert is the father of the modern Windows 10 bluescreen. Now we had but a singular father, and finally,
he could tell me why it was blue. The first contributing factor is that the
lowest common denominator for the video hardware that NT could run on only offered rudimentary
color text modes. The number of color choices was not infinite,
and among them, white on a dark color is a pretty natural choice. But even so, why blue? Put simply, because John's dev machine was
a MIPS RISC box, and the firmware on that machine was white on blue. Using the same color scheme led to a consistent
experience. And in fact, his favorite editor at the time
was SlickEdit, and the default text colors for SlickEdit were also white on blue. You could boot, code, and crash all in the
same color scheme: white on blue. And so that is why bluescreens are blue. Because the MIPS firmware was. If you want to go even further back, I worked
on the MS-DOS setup program which, from about MS-DOS 5.0 on, used a light grey on blue color
scheme. But I think pointing to any of these as the
progenitor of the bluescreen would be a pretty big reach. If you want to be super pedantic, bluescreens
are not even pure blue anymore, as in Windows 10 they've adjusted the color tone so as to
technically be caerulean, which is a shade of blue ranging between azure and a darker
sky blue. And now the next time your friend's gaming
rig crashes, you can drop a caerulean screen reference. Because now you know why bluescreens are caerulean. Hang in there for a minute and I'll tell you
how to impress your friends and frighten your enemies by making your bluescreens red. If you enjoyed this particular story but you're
not yet subscribed to my channel, I'd be honored if you took a moment right now to do so. That'll also let me know I'm going the right
direction with this episode, so I'll make more like it, and if you turn on the bell icon,
you'll even be notified of them when I do. It's a win-win. As always, remember I'm not selling anything,
and I don't have any Patreons, I'm just in this for the subs and likes, so if you did
enjoy the episode, please be sure to leave me one of each before going. YouTube apparently really does care if you
like the video or not: they call it engagement. Speaking of which, I'm trying to grow the
channel so if there's somewhere you can share this episode, like an appropriate place to
list it on Reddit or a forum, or a BBS or that sort of thing, please do so! It's kind of a niche interest but odds are
if you were interested then you know someone else who's enough like you to be interested
too, so send them a link. And don't forget to head on over to Dave's
Garage at the end of the month for a Livestream on Sunday the 28th of February at 10 AM Pacific,
1PM Eastern. All questions will be answered, all inquiries
addressed, and you can help me plan for future episodes! The more the merrier, so bring a friend. Our first ever livestream had 1,000 folks
show up and it was a lot of fun, so please do stop by on the 28th. Now, on the topic of how to make your bluescreens
red, I was about to tell you how to hack the 386 Enhanced section of the system.ini file,
which is the manual way. But in talking to Raymond, as is par for the
course, he pointed out an easier way that I didn't know about. And that's to download an app called NotMyFault
by Mark Russinovich, one of the guys from Sysinternals. As long as you run it as administrator, it
will install a driver that will intentionally bluescreen the system with the bugcheck color
of your choice. You can pick red or even a lovely shade of
fuchsia if you like... pretty much any color. I'll link to the app's homepage in the video
description. I can't promise it'll work on the latest Windows
10 without some registry tweaks to get you back to a traditional bluescreen display without
the smiley and all that, but it works well on anything that actually shows the STOP message. Thanks for joining me here in the shop. In the meantime, and in between time, I hope
to see you next time, here in
Dave's Garage.
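
As a footnote to the access violation discussion earlier in the episode, here is a toy sketch in Python of the "alive or dead, never a zombie" idea. None of this is Windows code: `ToyMemory`, `BugCheck`, and `demo` are made-up names for illustration, and the bounds check merely stands in for what the MMU does in hardware. The only point it tries to show is the ordering: the illegal write is rejected before anything is stored, and after that the system refuses to do any more work.

```python
class BugCheck(Exception):
    """Hypothetical stand-in for a fatal stop; not a real Windows API."""


class ToyMemory:
    """A tiny fake address space where only some addresses are 'owned'."""

    def __init__(self, size, owned):
        self.cells = [0] * size   # every cell starts out as zero
        self.owned = owned        # range of addresses we are allowed to touch
        self.halted = False       # once True, the toy system is dead for good

    def write(self, addr, value):
        if self.halted:
            raise BugCheck("system already halted")
        if addr not in self.owned:
            # The check runs BEFORE the store, so the cell is never modified.
            self.halted = True
            raise BugCheck(f"access violation at address {addr}")
        self.cells[addr] = value


def demo():
    mem = ToyMemory(size=8, owned=range(0, 4))
    mem.write(1, 42)              # legal write, inside the owned range
    before = list(mem.cells)      # snapshot taken before the illegal write
    message = None
    try:
        mem.write(6, 99)          # illegal write, outside the owned range
    except BugCheck as stop:
        message = str(stop)
    return before, mem.cells, mem.halted, message
```

Calling `demo()` returns the memory snapshot from before the bad write, the memory afterwards, the halted flag, and the stop message. The two memory lists compare equal, which is the whole argument in miniature: you lose the ability to keep going, but nothing already stored gets scribbled on.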
I knew a guy who'd fucked up his install so bad the BSODs were yellow with grey text.
Hey Dave, don't mean to hijack your thread, but this video just came out yesterday https://www.youtube.com/watch?v=cwyH59nACzQ and I know you worked on product activation around the XP days.
Do you have any insight into why the product keys for 95 and other products around that era were so simple? Finding out 25 years later that the CD keys were just "numbers that add up to 7" is kind of mind blowing.
I figure it was just enough "security" to convince investors that MS cares about stopping piracy.
Wish there was a transcript for these things :( More a reader than a watcher
Did Microsoft ever regret not adopting the far more logical message "FLAGRANT SYSTEM ERROR"?
My personal theory about the white-on-blue text is that was what the Borland IDEs used in the DOS days, and it was just "inherited"
Amazing how a personal preference can affect so many. In my industry my choices might be seen by 20 people at most.
Also thank you for the inside scoop. I stumbled on your channel last week, really enjoying it so far.
I always assumed it was to reduce the rage resulting from a system crash due to blue being known as a color with a psychological calming effect.
I noticed that windows 10 blue screens were so much prettier than older windows.
Which just makes me wonder... why don't they make the whole operating system out of that screen?
I wish he nailed the transatlantic accent. Btw, what's his current accent?