I Tried to Break a Million Dollar Computer - IBM Z16 Facility Tour!

Video Statistics and Information

Reddit Comments

THIS, THIS is the LTT content I want, Bringing Enterprise into a context I have a chance of digesting. Thank you LTT. Moar of this, please!

👍︎︎ 47 👤︎︎ u/Summer-LL2022 📅︎︎ Apr 06 2022 🗫︎ replies

im surprised someone from IBM invited linus to one of their test labs to play around with this stuff. usually they dont market this stuff to consumers.

👍︎︎ 82 👤︎︎ u/randomkidlol 📅︎︎ Apr 05 2022 🗫︎ replies
Captions
- Just yank it? Ah, come on, you bastard. Oh, wow. Now it's really mad. Whole thing's still running. Meet Z16, the latest member of IBM's mainframe family. That's right, not only do mainframes still exist, they're powering high-frequency transactional industries like global banking and travel. But as of today, you can get a brand spanking new mainframe configured with up to 256 cores built on the latest seven-nanometer node from Samsung, and with up to 40 terabytes of memory. Very few outsiders have ever been invited to IBM's 100,000 square foot Z test lab, so we are gonna pack as much as we can into the very short time that we get to spend on site today. Just like I pack our videos with sponsors.

- Ridge Wallet, time to ditch that bulky wallet. Ridge Wallet can hold up to 12 cards, comes with a cash strap or money clip, and is available in tons of different colors. Use the code "Linus" to save 10% and get free worldwide shipping at the link below. (upbeat music)

- The core pillar of Z is right in the name: it's short for zero, like zero downtime. Their target is seven nines of reliability, or about three seconds of downtime a year on average, and achieving that takes an architecture that's built from the ground up for resiliency. Now, out of the hundreds of Z systems that they're beating the heck out of here in the lab, there's just one next-gen Z16 that they got all gussied up for us with the fancy covers and everything, only for us to immediately ignore them and start digging around in its guts.

Starting at the bottom, I initially assumed that these heavy black chunguses here were cooling lines. But as it turns out, Z16 does not support being plumbed into a building-wide cooling system. Instead, they opted for a self-contained system where every rack that has a compute drawer gets this pump and radiator unit down here at the bottom. And if you look closely, you can actually see the two redundant cooling pumps; that's in case one fails. There's actually also an empty spot where it looks like one of the pumps is missing. That's because in the validation phase they weren't sure if these pumps were gonna meet their reliability standards, so the plan was to just tuck three of them in there, just in case.

These are actually power lines, then, routed through the three-foot crawl space that's underneath us, covering the entire test facility. And each of these two is capable of carrying 60 amps of 208-volt three-phase power on this particular config. Power that needs to be distributed, and that's where these come in. Each of these power distribution units, or PDUs, down the side here is meant to be fed from a separate breaker in the event of a power loss. Basically, you can think of these as the enterprise-grade equivalent of a power strip. Using the onboard ethernet port, a technician can monitor power usage, turn off particular plugs for maintenance, or even update the firmware. Does your power strip have upgradeable firmware? No? Gross.

Next up, things get really spicy. Each Z16 system can be configured with up to four of these compute drawers, and each compute drawer contains up to four of their new Telum chips. And these things are really cool. Let's go take a closer look. What a monster. Telum is a chiplet-like design, and you can actually see the two separate dies here. It's got support for DDR4 and a total of 16 cores and 64 lanes of PCI Express per socket. Which is neat, but that tells only a fraction of the story.
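A quick back-of-the-envelope check on those headline numbers, using only the figures quoted in the tour (seven nines of availability, 16 cores and 64 PCIe lanes per Telum socket, four sockets per drawer, four drawers per system). This is a rough sketch, not IBM's spec sheet:

```python
# Back-of-the-envelope numbers from the tour, assuming the figures quoted
# in the video: seven nines, 16 cores and 64 PCIe Gen4 lanes per socket,
# 4 sockets per compute drawer, 4 drawers per system.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60          # 31,536,000

def downtime_per_year(nines: int) -> float:
    """Allowed downtime in seconds for a given number of nines of availability."""
    availability = 1 - 10 ** -nines
    return (1 - availability) * SECONDS_PER_YEAR

cores_per_socket = 16          # 8 cores per die, 2 dies per Telum package
sockets_per_drawer = 4
drawers_per_system = 4

print(f"seven nines -> {downtime_per_year(7):.2f} s of downtime per year")            # ~3.15 s
print(f"max cores   -> {cores_per_socket * sockets_per_drawer * drawers_per_system}") # 256
print(f"PCIe lanes  -> {64 * sockets_per_drawer} Gen4 lanes per drawer")               # 256
```

Seven nines works out to roughly 3.15 seconds a year, which matches the "about three seconds" figure above.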
While a consumer x86 CPU might contain dedicated hardware for, let's say, decoding popular video codecs, Telum brings some very different specializations to the table. Each one of these cores, so there's eight per die, has a co-processor to accelerate storage, crypto, and IBM's own compression algorithm. Then each die gets its own gzip compression accelerator. And this is one of their biggest announcements: an AI accelerator.

The reason for bringing that on die is that customers like financial institutions have complex machine learning models that they've built for, for example, fraud detection. But data scientists care about accuracy, not necessarily the downstream performance impact. So what they found was that they had these great models, but they didn't have the performance to actually apply them to every transaction. Or, rather, they could apply them to every transaction if they had all the time in the world, but they don't. So if your bank has an SLA, or a performance guarantee, of nine milliseconds on a transaction and there's not enough time to apply the fraud detection model, they either have to just let it go, fraudulent or not, or decline it, risking the customer just pulling the next card out of their wallet, effectively giving that business to their competitor. By putting the AI processor on die, they're giving it direct access to the transaction data that is sitting in the CPU cache, rather than forcing it to go out to memory for it. And compared to last-gen Z, IBM is figuring on a 20 to 30X improvement in performance with, and this is critical, better consistency and accuracy.
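To make that trade-off concrete, here is a minimal sketch of the decision a transaction pipeline faces when inference may not fit inside the latency budget. The `model` object, its `estimated_latency_ms` attribute, and the 0.95 threshold are hypothetical stand-ins for whatever scoring stack a bank actually runs, not an IBM API:

```python
SLA_MS = 9.0   # per-transaction latency budget quoted in the video

def score_transaction(txn, model, budget_ms=SLA_MS):
    """Illustrative only: run the fraud model only if it fits in the budget.

    `model.estimated_latency_ms` and `model.predict` are hypothetical
    stand-ins for whatever inference stack a bank actually runs.
    """
    if model.estimated_latency_ms > budget_ms:
        # No time to score: approve unchecked or decline blind, per bank
        # policy. This is the dilemma the on-die accelerator is meant to
        # remove by making inference fast enough to always fit.
        return "approve_unscored"
    risk = model.predict(txn)   # on Telum, this runs next to the data in cache
    return "decline" if risk > 0.95 else "approve"
```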
The cache on this chip is really special too, but to properly talk about that, we need to zoom out a little bit. Now we're looking at a full compute drawer here, each of which contains four Telum chips. Actually, this one's only got three in it so far. I'm assuming I have this ESD strap on so I can install the last one, right? - [PJ] All you. - Is it this one? - [PJ] Yep. - Wait, so you guys handed me a working chip before? Oh, well, that was your first mistake. Okay. Where's the nifty install tool? Ah, yes. Okay, cool. So I've seen this demo once, which should be more than enough for me to perform it for you. The Telum chip goes in a little something like that. Then this doohickey (indistinct) lines up with the dots on the thing, then you grab the chip, right? You make sure it's not gonna come off over something soft-ish. Okay. Holds on a little something like that. We really hope the CPU doesn't come out. If it comes out in the socket, this boy's done. It won't, right? (PJ laughs) I just have a bit of a reputation and I don't need it right now. I don't need the sass from my camera operator here. Then I poke this in here, I think. I poke in there and then I squeeze to lock it. Well, unlock it. And boom, it's in. We give a little wiggle. Heck yeah. Now we're halfway.

The next thing we gotta do is install our cooling, and this is super cool. One of the first questions I asked them when I walked up to this compute drawer was, "What the heck kind of thermal compound is that?" My initial gut feeling was that it looked like some kind of liquid metal. I mean, it's not like it's not being used in commercial products; the PlayStation 5 is using liquid metal. But actually, it's solid metal. So rather than being a gallium-indium mix of some sort, this is just an indium thermal pad. And what's great is you can actually see how, there you go, they've got the size just exactly right to cover where the dies are creating hotspots under the integrated heat spreader. So these are gonna be the interface between the integrated liquid cooling system in the chassis, there we go, and our processors. And I guess, I can't even tell... Is the latch on? Kind of looks like it. Where's my torque wrench or screwdriver? Oh, is this it? - [PJ] That's it. - Look at that. It's all configured for me and everything. Yeah, this feels like a lot. I mean, I guess it's a lot of pins, but still. Oh, this really feels like a lot. Is that... Okay. Whoa. Putting that kind of pressure on that kind of spicy, expensive stuff is stressful. I'm sure you guys are used to it though. This is adorable. Little 3D-printed hose holders. Do you know how many times I've been working on a system and wished I had one of these to hold a block out of the way while I'm doing stuff? I love it. I don't like this screwdriver. We're gonna have to make a better torque screwdriver for you guys. We're working on our own screwdriver, lttstore.com. Yeah, he knew I was gonna say it. Bloody hell. That looks freaking awesome. Like, that is hardware porn right there if I've ever seen it.

This four-CPU config gives us a total of 64 cores in this drawer and a whopping 256 PCI Express Gen 4 lanes. But you might have noticed, the motherboard that they're installed on is really unusual looking. Where are the memory slots? Where are the VRMs? Let's answer the second one first. This is called a POL card, or a point-of-load card. They haven't shown me how to take it out, but I'm sure I can figure it out. Ah, yes, there we go. These are super cool. These take the bulk 12-volt power that comes from the power supplies over on the end here and step it down to the approximately one volt that we need for the CPUs and the DRAM modules. And the craziest thing about these is that they've got 14 power phases each, and they can be dynamically assigned depending on where they are needed, with two of them actually being completely redundant. So these three do all of the work, and these two, including one of the ones that I took out, are doing absolutely nothing unless something fails. I love seeing that old-school IBM-ass logo on like cutting edge (beep). Total combined output not to exceed 660 amps. Okay, I'll be sure not to. Now, they get some additional help from... Ah, yes. Here they are. Come on out, little buddy. There we go. This is called a voltage regulator stick, or a VRS. It's smaller with fewer phases, but it contributes to the step-down for IO cards and things like that.
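Here's a toy model of the N+2 redundancy described for those point-of-load cards: five cards, three carrying the load and two idle spares that get promoted when an active card fails. The card counts come from the video; the class names and failover policy below are illustrative, not IBM's firmware logic:

```python
# Toy model of N+2 redundancy for the point-of-load (POL) cards: three
# cards carry the load, two sit idle until something fails. Counts come
# from the video; the failover policy is purely illustrative.

class PowerDomain:
    def __init__(self, active=("POL0", "POL1", "POL2"), spares=("POL3", "POL4")):
        self.active = list(active)
        self.spares = list(spares)

    def card_failed(self, card):
        """Retire a failed card and promote a spare so delivery never drops."""
        if card in self.active:
            self.active.remove(card)
            if not self.spares:
                raise RuntimeError("no spare POL cards left: drawer degraded")
            self.active.append(self.spares.pop(0))

domain = PowerDomain()
domain.card_failed("POL1")
print(domain.active)   # ['POL0', 'POL2', 'POL3'] -- still three cards sharing the load
```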
Oh, one other really cool thing I wanna show you guys is the oscillator card in the front of the... Oh, I think this thing is locked. There we go. I'm moving it. In the front of the compute drawer here. We talked in more detail about what cards like this do in our timecard video, but essentially they take an external signal and ensure that all the machines within a data center are running at exactly the same time, to avoid wasting clock cycles and operate more efficiently.

Now let's move on to memory, and this is where things get really funky, 'cause I ain't never seen a memory module that looks anything like this thing. There's at least three crazy things going on here. Number one is that even though this is DDR4, some of the power delivery is actually on the module itself, like we've seen with DDR5. Second up is the IC configuration. We've got 10 of these DRAM chips per side for a total of 40, which doesn't really correspond with any kind of ECC error correction that I've ever seen in my life. And it turns out that that's because it's not done on the module. Each memory controller in the system actually addresses eight of these memory modules, and the ECC is handled more like RAID, with the parity data striped across the eight different modules. It's all managed at the memory controller level. Finally, we've got this bad boy right here. What's this thing called again, with the copper chip set? - [PJ] Explorer. - It's the Explorer. This is a proprietary buffer chip that adds a bit of a latency hit, but allows the memory controller to address vast amounts of memory, with each of these drawers able to handle 10 terabytes of DDR4 memory. Absolutely flipping insane. Papra, pa, pa, pa, pa.

Now let's talk about why I needed all four of these chips to explain Telum's cache configuration. Where a normal CPU would have a tiny, lightning-fast level one cache sitting right next to the processing cores, followed by a larger, slower level two, and then a larger, slower level three, and so on and so forth, Telum has an already aggressive, by consumer standards, 256-kilobyte private level one cache per core. So they've got eight of those per die, 16 per socket. That is then backed up by a whopping 32 megabytes of level two cache per core. To give you some idea of what this means in practical terms, look at this die shot: this thing has more cache than compute by area. And it does away with level three cache altogether. Now, IBM's engineers probably could have said, "Well, the level two cache is so big, we probably won't need level three anyway." Fair enough. But they didn't. Instead, when a line needs to be evicted from one of the cores' private level two caches, it will actually look for empty space in another core's level two and then mark that as virtual level three cache. Now, for that core to then fetch it from another core's cache does add latency compared to if it had been in its own level two right next to the core, but that would be true of a shared level three cache anyway. And this approach affords them an incredible degree of flexibility, particularly for cloud customers who might have very different hardware needs from one to the next. And it gets even crazier. Evicted lines can also be pushed off of the chip to another CPU altogether in the Z16 system, allowing a single core to conceivably have access to up to two gigabytes of virtual level four cache from the other CPUs in the drawer, and up to eight gigabytes if we go beyond.
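The virtual L3 behaviour described above can be sketched in a few lines: every core has only a private L2, and a line evicted from one core's L2 is parked in spare space in a neighbour's L2 and tagged as virtual L3 instead of going straight back to memory. Capacities, the placement policy, and the class names are toy simplifications, not Telum's real cache controller:

```python
# Much-simplified model of the "virtual L3" idea: evicted L2 lines are
# parked in another core's L2 instead of a dedicated shared L3.
# Capacities and placement policy are toy values.

class Core:
    def __init__(self, name, capacity=4):
        self.name = name
        self.l2 = {}                 # tag -> "own" | "virtual_l3"
        self.capacity = capacity

    def has_room(self):
        return len(self.l2) < self.capacity

class Chip:
    def __init__(self, n_cores=8):
        self.cores = [Core(f"core{i}") for i in range(n_cores)]

    def evict(self, core, tag):
        """Evict `tag` from `core`, trying to keep it on-chip as virtual L3."""
        core.l2.pop(tag, None)
        for other in self.cores:
            if other is not core and other.has_room():
                other.l2[tag] = "virtual_l3"     # parked in a neighbour's L2
                return other.name
        return None                              # truly evicted off-chip

chip = Chip()
victim = chip.cores[0]
victim.l2["0xBEEF"] = "own"
print(chip.evict(victim, "0xBEEF"))   # e.g. 'core1': the line stays on-chip
```

The same parking trick, extended across sockets and drawers over the cross-drawer links described next, is what yields the virtual level four.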
Is this the beyond? I believe I died and went to hardware heaven. To my right is a fully kitted out Z16, and now we can both continue and actually expand on the tour that we started before with that single rack unit. This one has all four possible compute drawers populated. You can see there's one here, and there's actually three in the second rack here. And in order to connect them, IBM uses these super cool SMP-9 active cables. These things are super badass; they've actually got heat sinks on them. And inside the sheath you'll find 10 plus one fibers for redundancy, with each fiber carrying 25 gigabit per second for a total of 250 gigabit per second. Oh, and of course, all of the links are redundant: two of them between each one of the compute drawers. Now, over these links, the system allows the CPUs in the separate drawers to share that level two cache I talked about before as a virtual level four. Which apparently can, in some cases, actually be faster: pulling from another CPU's cache over SMP-9 versus pulling from your own DRAM in your own compute drawer. It's insane.

I'm just getting permission to pull a coupling link card out, but I realized we didn't even talk about the reservoir. How sick is this? Just wanna fill up your mainframe? No big deal. Got your, like, giant... looks like stainless steel res going on here. Absolutely gorgeous. I'm 104! - [PJ] '04! - [Chris] Okay! - Just data center things! How's it going? - [Chris] Good. - Awesome. Look at these chunky quick connects, man. That is sick. Do you want me to pull the collar back? It'll be fine, it'll be fine. You'd have to actually link them together to let the liquid flow. Hey, the light's on, look at this. And one, and... I don't know. Sure, like that. Hey, there we go, suckers.

This is a coupling link card. It's using multi-mode fiber, 12 lanes with four redundant, for short-range, high-bandwidth linking of multiple Z systems within the data center, and it uses PCI Express. The idea here is that, depending on the customer's resiliency strategy, which is often government mandated, they could have a whole web of Z systems configured for high availability, which could allow an entire drawer to go down with immediate failover and no data loss. That's why these plug directly into the compute drawers.
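A minimal sketch of why that topology can lose a whole drawer without losing data: a transaction is only acknowledged once a peer system holds a synchronous copy over the coupling link, so failover just promotes the peer. The class and method names are hypothetical simplifications, not IBM's actual clustering stack:

```python
# Toy model of synchronous mirroring over a coupling link: an ack is only
# returned once both systems have the transaction, so a drawer or system
# failure loses no acknowledged work. Names are hypothetical.

class ZSystem:
    def __init__(self, name):
        self.name = name
        self.log = []

    def apply(self, txn):
        self.log.append(txn)

class CoupledPair:
    def __init__(self, primary, secondary):
        self.primary, self.secondary = primary, secondary

    def commit(self, txn):
        self.primary.apply(txn)
        self.secondary.apply(txn)      # synchronous copy over the coupling link
        return "ack"                   # acknowledged only once both have it

    def failover(self):
        """Primary lost: the peer already has every acknowledged transaction."""
        self.primary, self.secondary = self.secondary, self.primary

pair = CoupledPair(ZSystem("site_A"), ZSystem("site_B"))
pair.commit({"id": 1, "amount": 40.0})
pair.failover()
print(pair.primary.log)    # [{'id': 1, 'amount': 40.0}] -- nothing lost
```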
For less bandwidth-intensive IO devices, we rely on these. You can fit so much PCI in this bad boy. The code name "Hemlock" makes them sound a little sexier than they are, but they're still pretty cool, using these direct attached copper cables. Actually... ah, cool, I've got one of these right here. They have a 16x PCI Express link to the compute drawer, and then you can stack these puppies up with a ton of IO cards: NVMe storage, crypto accelerators, network cards. I mean, you can actually run Linux on these things. So pretty much whatever add-in board the customer desires can go in one of these. And the PCI Express link between the Hemlock right here and the compute drawer goes a little something like that. Pretty cool.

Now, you guys might have noticed that only the center two racks contain compute, while the outer ones are all IO. In fact, this one over here on the right is just an entire tower of IO. That's normal. And most of the cards that you're looking at in here are either FICON or CICSPlex cards. And these work together to handle everything from connecting to storage racks within the data center to offsite synchronization and backup. They can go up to 10 kilometers unboosted, and 100 kilometers with boosters. But in order to test that, we're gonna have to take a little field trip. It's not for hairstyle; I've got a mainframe up my (indistinct) here.

Welcome to the patch room. In here, you will find 50,000 ethernet connections alongside 200,000 IO connections. The blue ones are OM3 multi-mode fiber for carrying PCI Express links across the data center. Those are the ones we saw before, and often these would end up at an IBM DS8K, which is their enterprise-class storage. While these ones are almost all FICON. And this is super cool. If normal Fibre Channel drops a packet, the origin is gonna retransmit it to the destination. "No big deal," says I. "No," says IBM. "That is a big deal." So FICON actually contains enough storage on each adapter in the link that only the last hop of the journey needs to be retransmitted. This apparently has both performance and security implications. Sounds expensive.
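To see why buffering the frame at every adapter matters, here is a toy comparison between classic end-to-end retransmission and the per-hop retry described above. Hop counts and the loss probability are made up for illustration; this is a sketch of the idea, not the FICON protocol itself:

```python
import random

def send_hop(loss_prob=0.1):
    """One hop attempt; True means the frame made it across this hop."""
    return random.random() > loss_prob

def end_to_end(hops=4):
    """Classic behaviour: a drop anywhere restarts the whole path."""
    attempts = 0
    while True:
        ok = True
        for _ in range(hops):
            attempts += 1
            if not send_hop():
                ok = False
                break          # drop -> retransmit from the origin
        if ok:
            return attempts

def per_hop_buffered(hops=4):
    """Buffered behaviour: each adapter retries only its own hop."""
    attempts = 0
    for _ in range(hops):
        attempts += 1
        while not send_hop():
            attempts += 1      # only this last hop is resent
    return attempts

random.seed(0)
print(end_to_end(), per_hop_buffered())   # hop-sends used by each strategy
```

On average the buffered strategy wastes fewer hop-sends per delivered frame, which is the performance benefit claimed above.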
Then over... Oh crap, I lost it. There it is. Check this out. Inside this box is 50 kilometers of real, actual fiber that they can combine with another 50-kilometer roll, you can actually see it right there, to test out their offsite capabilities under real-world conditions. And while we're over here, this is super cool. Each of these racks is set up kind of like a site A and a site B. And what these do is take hundreds of channels of 1310-nanometer fiber, convert them to different wavelengths, transmit them hundreds of kilometers away or whatever, then split them all back out on the other side, like with a prism, into your separate channels. Freaking bananas, right?

Then there's the brains of the operation: the support element, or as I call it, The My Wife Unit. Although, because it's Z, there's two redundant ones, and I don't know how much she'd like that. The support element acts as a management interface for the system, making sure that everything is running smoothly by monitoring this intra-rack ethernet network, checking temperatures and functionality and reporting on failure events.

Bringing us to the big question: what does it actually do? Well, the biggest draw for IBM's Z customers is reliability and lightning-fast, accurate processing of transactional data. That's what it does way better than x86. And Z16 aims to improve over last gen by introducing tools that will help it do all of that, and faster, thanks to a machine learning boost, as well as adding new tools that will help customers identify and streamline security and compliance obstacles in the organization. Sounds fancy.

So I'm sure you're wondering how much one of these costs. Well, the truth is that the number in the title there, assuming they haven't changed it by now, is more of a guesstimate. A bare cage actually starts in the neighborhood of $250,000, but every single system is custom built to the spec of the purchaser. And the IBM folks here laughed and said, "Well, by the time you've got all the hardware and software to actually run it, I'd say we're pretty comfortable with that million dollar number as an approximation." So you won't be having any of these in your basement lab anytime soon. But what you might have is FreshBooks.

Thanks, FreshBooks, for sponsoring this video. Now, I'm gonna go ahead and guess that you're not an accountant, which is why you're gonna love this software. FreshBooks is built for freelancers and small business owners who don't have time to waste on invoicing, accounting, and payment processing. In fact, FreshBooks users can save up to 11 hours a week by streamlining and automating admin tasks like time tracking, following up on invoices, and expense tracking, with features like their new digital bills and receipt scanner. Over 24 million people have used FreshBooks and love it for its intuitive dashboard and reports. It's easy to see at a glance exactly where your business stands, and it's even easier to turn everything over to your accountant come tax season. 94% of FreshBooks users say it's super easy to get up and running, and with award-winning support, you're never alone. So try FreshBooks for free for 30 days, no credit card required, by going to freshbooks.com/linus. Huh? My voice is shot.

So if you enjoyed this video, maybe go check out our tour of SFU's supercomputer. It was super cool. Feel like this is like a scene out of "Gremlins." Don't get it wet. (laughs)
Info
Channel: Linus Tech Tips
Views: 2,004,730
Keywords: tech, server, mainframe, gaming, ibm, ibm z, z16, facility tour, tour, networking, custom, hardware, pc, linux
Id: ZDtaanCENbc
Length: 24min 0sec (1440 seconds)
Published: Tue Apr 05 2022