The Day Microsoft Campus Crashed - Bedlam at Microsoft!

Video Statistics and Information

Reddit Comments

That was a wonderful story. Poor sysadmins on that day.

u/ZombieLannister, Jun 18 2021 (2 upvotes)

Nice sharing.

u/CheKidsBuddy, Jun 19 2021 (1 upvote)
Captions
Hey, I'm Dave. Welcome to my shop. I'm Dave Plummer, a retired operating systems engineer from Microsoft going back to the MS-DOS and Windows 95 days, and I'm here to tell you the story of what's come to be known as the Microsoft Bedlam DL3 incident. When it comes to email and Microsoft, October 14th, 1997 is the anniversary of a day that will live in technical infamy for time immemorial. Not to oversell it, but if you were there at the time, you remember it well. I was, and I do, and that's how I'm able to finally tell you the story of the world's first email storm. [Music]

Now, before we get into the epic tale of the digital storm that brought the Microsoft corporate network to its knees, I've got one quick announcement, and that's that the upcoming Dave's Garage live stream is on February 28th. I've set this one for 10 AM PST, or 1 PM EST, which makes it 6 PM GMT and therefore much friendlier to the European folks out there. We had over a thousand people on the last live stream, and I had a great time answering your questions about Microsoft history, products, and events. It also gave me a great opportunity to pick your brains for future episodes. The all-powerful Glenn will be back to moderate, so don't miss it at the end of the month. If February 28, 2021 has already passed, check the channel info for the next live stream.

And now, on to that email storm. Believe it or not, it all started with a single email. Today, when you have a group of people that you need to update promptly with a single piece of important news, you'll likely go about it with some kind of message distribution list, commonly known as a group chat. You simply send a text message and it informs, or spams, as the case may be, some potentially large number of recipients: one sender, many receivers. In the days of yore, which generally can be taken to mean anything before about 1993, we did not yet have cellular text messaging. You could send a text message to a pager, but they were originally intended for paging individual doctors and so
weren't great for paging many people at once. As a result, we had to have different ways of solving the problem of disseminating information widely to lots of people at once.

Before there were text message distribution lists, there were phone trees. Each person would have a list of perhaps three to ten other people that they were responsible for phoning, and the tree would branch out from there. The root node might be the local high school trying to announce a snow day, for example. It would seed the tree by phoning five important parents, each of whom had a list of five others to call, and each of those had a list of five more to call, and so on. The first set of calls, of course, reaches only that set of five people, but then 25 more in the second generation and 125 in the third generation. Your call tree can be pretty shallow and small and still be reaching about a thousand people fairly easily. But of course, what if someone isn't home or can't be reached? Not only do they not get the message, but anyone who is downstream from them also misses the message, even if they're home, because of the missing first call. By the third generation, 625 people could be missed due to a single outage, so as a method it's hardly foolproof.

So what if instead we just drove the fire truck down Main Street with a loudspeaker on top announcing the school is closed for the day? Word would get out promptly in a somewhat widespread fashion, depending on how far you lived from a main artery that the fire truck planned to announce on. Of course, the reality is that if you did this, all you've really accomplished is to reinvent the town crier. For those that don't know, perhaps because you don't predate widespread literacy or movable type, the town crier really was a thing. It was an occupation, presumably part time, where a town official with a truly loud and booming voice would literally walk down the street like the human fire truck, announcing whatever the important news was, in a tradition that began in England and
quickly spread to America and the rest of the world at large. The crier would often carry a large bell so that their news would alert more people and have that official feel.

In the early days of computers, they didn't really bring much in the way of modern solutions to these problems. Perhaps a database of some kind could be employed to make your phone tree faster or more robust, but the actual communication was still being done by one human phoning another human and reading a voice message out loud. By the 1980s, the notion of the computer bulletin board system, or BBS, had taken hold. While potentially great fun and an interesting way to meet other nerds who at least had a high probability of being in your local area code, it didn't really solve the problem of disseminating news to a large group, because with a typical BBS only one or at most a handful of people could be logged in at any one time. Even if they had a powerful computer that could manage many users simultaneously, the fact that each of them had to connect through a dedicated hardware modem and a copper telephone line limited the number of people that could reach the system at once. One person could easily leave a message for all the others, but unless and until each of the recipients happened to get around to logging into their local BBS, one at a time, they weren't going to get the message.

Even though the simple text message might be the preferred communication method today, the real inflection point in the power of mass messaging came not with a telephone text message or even with its progenitor, the pager. That's because all of the methods that we've discussed so far make it easier and more efficient to receive a message but did little to improve the delivery of the message. Being able to send and deliver to multiple recipients at once is what really changed things, and that came about with email. And once email met its cousin, the distribution list alias, magical things could start to happen. With email, it was as easy as
adding your friends to the To line, or to the CC line if you're being fancy, and everyone you listed would get the message, or at least the message would be sent to them. Email wasn't necessarily reliable, and read receipts were neither universal nor standard, but at least you would typically get a bounce message back if it couldn't be delivered, so you had some sense of your success or failure.

A distribution list is like a power tool for email. It allows you to send a message to three or ten or a thousand or ten thousand recipients all at once. A distribution list acts as an alias for a group of people, and that's why sometimes you hear them referred to as email aliases. Within a company like Microsoft we had many aliases. Some would be monitored by just a few people, like the HR alias for human resources, while others, like the NT dev alias, were a way of connecting everyone on the NT dev team, which could be a thousand people. One would typically only ever send email to such a large address if you had a piece of information that was truly of interest to everyone on that alias. For example, if you fixed a significant bug or made a breaking change or needed to announce anything in general to everyone on the entire team, you could do so by sending to a single alias, ntdev. You just did so carefully, because you knew you were bugging a lot of people and they would not at all appreciate it if you took to sending spammy or unnecessary posts.

When the receiving email system got a message for ntdev, it knew that that didn't mean there was a person or developer by that name. It'd be recognized instead as a special case, and the message would be duplicated as many times as there were recipients. If a thousand people were on the list, when you sent one email you were really causing the server to send a thousand emails on your behalf. Most, but not all, would still likely be internal.

Now, with that preamble out of the way, we can finally turn our attention to the historic day at Microsoft. The day was the 14th
of October 1997, and as I said, if you were there at the time, you likely remember it. You also might remember playing some hacky sack or maybe some Nerf football that day, and maybe the next, because beyond a certain point it was almost impossible to get any real work done. The corporate network was completely logjammed and the email system was even worse, and yet it all started with that single email.

At that time, Microsoft had something close to a hundred thousand mailboxes, though many of those were simply aliases. At some point the ITG group, which handles the internal networking setup on the Microsoft campus, was developing tools to split the large number of mailboxes in half and then half again, yielding four partitions. Each partition was then placed on a numbered distribution list called Bedlam 1 through Bedlam 4, and thus every employee was on one of the lists. These lists were intended just to organize the users, not to actually deliver any email. No one would intentionally send email to these lists, but if you happened to check to see which lists your email address was on, it would be listed among those that you belong to.

And that's what happened on October 14th. An employee noticed that their email address was on one of these new distribution lists and quickly sent an email off to the list itself. To: Bedlam DL3. From: the user. Subject: Why am I on this mailing list? What is it? And that email went to all 25,000 users on that email partition. A certain number of those users also had delivery receipts or out-of-office notices turned on, so the actual number of messages created was even greater.

That was the fuel. The 25,000 employees on the list were like waiting oxygen, and all it needed was a spark. Soon enough an unwitting accomplice would be along to oblige ignition by sending the first pointless response: "Me too." And then another, and then another. The chimes of email delivery bells rang up and down the halls until the chorus of me toos and why am I on this alias and take
me off this list immediately drowned them out. A few brave souls took to the airwaves and other aliases to try to get the news out to stop sending email, but the servers were already overloaded. A few pedantic fools sent emails to point out that sending emails asking people to stop sending emails only made the problem worse, before they fell where they stood, crippled by their own recursive logic. Like the Titanic easing itself into the icy North Sea or, paradoxically, an old man settling into a warm bath, the email servers were only going one way from there: down. At some point a very frazzled-looking man came down our hallway like the town crier of old, loudly begging us not to send any more email messages to corpnet. He might as well have been driving the fire truck too, because by that point nobody could send anything anyway.

Email was new and there were very few rules and precedents, so let's just say that one in every couple of dozen people, or about a thousand in total, got involved in the fun, originally by replying in some form or another to that broadcast. That's a thousand messages to 25,000 recipients, or 25 million emails. But then the read receipts and delivery receipts started kicking back and forth, and the way Exchange was architected at the time meant that each read receipt and each delivery receipt caused another email to be sent. If only 10% of the users had them turned on, that's 100 users each receiving 25,000 confirmations, or another several million messages. Let's say we wind up with 30 million email messages total.

Even on the hardware of the day, and with the slower networks available back then, Exchange still could have handled the load. According to Larry Osterman, then of Microsoft, this process was significantly complicated by a bug in the message transfer agent, which is similar in function to Unix's sendmail. If a message had 10,000 recipients, the MTA would crash upon processing recipient number 8192. Worse, it would then come back up and restart over at the beginning of
the list, creating that many more messages. The first fix was, of course, to repair this bug in the MTA. Then, even though it would take some time, if nothing else went wrong the servers would dutifully chew through the remaining email messages and get them delivered throughout the course of the day.

Complicating matters was the fact that a lot of the corporate network wasn't on TCP yet but still used NetBEUI, a broadcast protocol without routing. Every PC on that network segment had to ingest and analyze every packet to determine if it was intended for them, and that's a hugely inefficient way to communicate. It didn't take long before there was so much mail coming into the mail servers that none could go out. You effectively had a total email logjam where the server queues were full, most of the email messages in there were just useless noise, and an entire company whose culture was based around email suddenly couldn't communicate with itself.

Around 4 PM, the last known email message was received by an Exchange server by the name of Lusitania. It contained only six words: "Take me off this damn list." And then after that, nothing more. Corpnet had slipped beneath the waves.

Around the company, people didn't know what to do. They rubbed their eyes and stumbled forth into the daylight, and the great gears of Microsoft ground to a halt. The LANs were so clogged with packets that you couldn't even access source control servers to check code in or out. Unless you had work queued up on your dev machine, ready to work on offline, which almost nobody did, no one was getting any work done, because you couldn't. It was like a power outage, but without the surreal sense of silence that rules over a computer company like Microsoft when the power goes out. Everything was still humming along, but nothing worked. Some folks just chose to go home. Others unwittingly added to the problem by sending emails to larger and larger distribution lists asking what's going on. Those that did would be promptly reprimanded by the few who
managed to connect, telling them not to send any more email.

ITG's networking team convened crisis meetings and ultimately decided they would need to pull the plug and start over: bring each mail server down, flush its queue, and slowly restart the system. At that point, the odds are that most of the messages clogging up the system were useless nonsense, as I said, and the business cost of losing the few valid emails in transit was likely lower than no one being able to get productive work done. So they pulled the big flush handle, and with one massive roaring sound the corporate network groaned, rolled over, and, in a process that would take about two entire days, slowly started to right itself.

A number of steps were likely also taken at that point, ranging from simply asking people by word of mouth to stop, a variation of the old phone tree, to limiting who could send to public aliases. Over the course of the next day or two, the pipes cleared and the sun came back out. If I remember correctly, I think a rainbow came out and settled over the campus, which I'm told was Bill's symbolic way of reminding us that as long as we promise never to reply all unnecessarily to a large alias again, he wouldn't flush and reset the corporate network again. And that's how Bedlam was brought under control that day in 1997.
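The round numbers quoted in the story can be checked with a few lines of Python. This is only a back-of-the-envelope sketch, not a model of how Exchange actually behaved: the function name `storm_estimate` and its parameters are made up for illustration, and it assumes that 10% of the roughly one thousand repliers had receipts turned on and that each of them got one confirmation back per list member.

```python
# Back-of-the-envelope estimate of the Bedlam DL3 storm, using the round
# numbers quoted in the story. All names here are illustrative only.

def storm_estimate(list_size=25_000, repliers=1_000, receipt_share=0.10):
    """Return (reply_copies, receipt_messages, total) for a reply-all storm."""
    # Each reply sent to the list is expanded into one copy per list member.
    reply_copies = repliers * list_size

    # Assumption: receipt_share of the repliers had receipts enabled, and
    # each of them receives one confirmation per recipient of their reply.
    senders_with_receipts = round(repliers * receipt_share)
    receipt_messages = senders_with_receipts * list_size

    return reply_copies, receipt_messages, reply_copies + receipt_messages


replies, receipts, total = storm_estimate()
print(f"reply copies:     {replies:,}")
print(f"receipt messages: {receipts:,}")
print(f"total:            {total:,}")
```

With the default figures this gives 25 million reply copies plus 2.5 million receipts, which is in the neighborhood of the "30 million" rounded up in the story, and that is before counting any duplicates caused by the MTA restarting its recipient list.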
About 10 years later, just to see how deep the memories went, a friend ran a brief experiment in the lunchroom one day at Microsoft, in about 2004. During the middle of his meal, he stood straight up and in a clear, booming voice called out, "Why am I on this list?" Voices around the cafeteria then started coming out: "Me too!" "Yeah, me too!" before converting slowly to an awkward form of laughter. The memories still run deep there.

If you enjoyed this particular story but you're not yet subscribed to my channel, I'd be honored if you took a moment to do so right now. That'll also let me know that I'm going the right direction with this episode and I'll make more like it, and if you turn on the bell icon, you'd even be notified of them when I do. Be a win-win. As always, remember, I'm not selling anything and I don't have any Patreons. I'm just in this for the subs and likes, so if you did enjoy the episode, please be sure to leave me one of each before going. YouTube apparently really does care if you like the video or not. They call it engagement. More is better.

And don't forget to head on over to Dave's Garage at the end of the month for a live stream on Sunday the 28th of February at 10 AM Pacific, 1 PM Eastern. All questions will be answered, all inquiries addressed, and you can help me plan for future episodes. The more the merrier, so bring a friend. Our first ever live stream had over a thousand folks show up to it and it was a lot of fun, so please do stop on by on the 28th. Thanks to all the regular viewers, like Mr. Bud out there, for joining me here in the shop. In the meantime, in between time, I hope to see you next time right here in Dave's Garage. This little chair will be waiting for one of you, and a rocking chair for another who likes to rock, and a big armchair for two more to curl up in, all next time on Dave's Garage.
Info
Channel: Dave's Garage
Views: 66,877
Keywords: email storm, microsoft email, microsoft email alias, satya nadella, microsoft, windows, email, emails, email marketing, email address, email account, email etiquette, microsoft email contact, Microsoft teams, Windows 10, Spam, microsoft campus
Id: pBmuY6qFMPQ
Length: 14min 42sec (882 seconds)
Published: Fri Feb 19 2021