They Told Me NOT to Do This... - Building a Node of the $1,000,000 PC

Video Statistics and Information

Captions
- Ow! (Jake laughing) Damn it! What is that? A total of 128 cores? - 64, but good try. - And a terabyte of memory. - That's 256 gigs. - Oh. Supermicro told us that we're not allowed to build this server ourselves. They have to build it for us. Naturally, we said no. So we are gonna be taking one of the thin 1U storage servers from the petabyte flash project and see just how fast we can drag race it with a handful of Kioxia CD6 drives, Optane acceleration, 64 EPYC cores, the fastest we can get, and is this 200 gigabit per second networking? - It's 200 gigabit. - And 256 gigs of memory. - The Optane's just for boot. - Thing's gonna be crazy fast, almost as fast as I can take you to our sponsor. - GlassWire. Are you having poor quality video meetings? Use GlassWire and instantly see what apps are wasting your bandwidth during your meeting and block them. Get 25% off today using code 'Linus' at the link below. (upbeat music) - In a way, a server is a lot more like a laptop than it is like a commodity desktop made of off-the-shelf components, because they tend to be way more tailored to a specific use case, and they're not really as flexible. Say you'd wanna build a storage server. There's a dozen different ways to skin that cat. (cat meowing) Pardon the expression. For example, here at Linus Media Group, our primary concern is getting as much capacity as possible at the lowest possible price. - Yeah. - So we're willing to give up some compute in favor of stuffing more drives into a single chassis. That's why our storage servers tend to be this thick or this thick. The reality is that 4K or even 8K video editing is pretty demanding, especially if you've got 10 or a dozen editors working off the server at once, but compared to enterprise or scientific applications, it's not even close. So, that is why any good server deployment starts with the chassis. This right here is the Supermicro SuperServer 1124US. And it is all about density. Not storage density, because if we went with a 2U, remember that dual-layer one? - Oh yeah. Old Whonnock. - They absolutely could pack in more drives, but they choose not to, because you're going to run into performance bottlenecks if you don't have enough compute. And that's the density that we're increasing here by going with these 1Us. Each layer of this, 12 drives, yes, but two CPUs. So no bottlenecks, right? - That's the idea. - Supermicro only sells these as a complete system these days, meaning that it must leave their warehouse with a minimum of two CPUs, four sticks of RAM, and at least one storage drive. And the intention there is for them to be able to ensure quality and compatibility. And then as a side benefit, obviously, they make some money off the parts. But because of the petabyte flash project, we were able to get our hands on some bare-bones ones. So let's take a closer look. Wow, built in on board, you've got dual SFP ports. Are those 10 gig or? - [Jake] All four of those are 10 gig. - All four of these ports, RJ45 and SFP, are 10 gig. Dual USB 3, we've got an IPMI management port, serial, VGA, I love that VGA, as well as two PCIe x16 slots back here, and what have we got for power? - [Jake] There's more, there's three. - There's three. Oh, there's a third one. Oh, look at that. - Oh, wait, there's actually four. There's one more like hidden inside. We'll see that later. - Oh, cool. Okay. Let's have a look at our power supply. Obviously, dual up to 64-core CPUs. Wow. 1200 watt power supply. - [Jake] Geez. 
- Strictly speaking, this, they didn't actually send us a bare bones, they sent us a completed one and we took it apart. - Oh, do I ever have the story for you. On a call for this project, the Supermicro guy was like, "You know, taking out a PCIe card, that's easy, "but you know, get to a CPU there's, "there's pins and thermal paste." I'm like, "Bro, "I have probably taken out slash installed "at least a thousand CPUs." - Yeah. Is it wrong for me to just love looking at thermal solutions for super thin systems like this? - Really, you know what I'm looking at? The RAM slots, it's like 60% of the width of the server is just RAM slots. - It's a forest of memory slots. Why aren't we putting more memory in then? - Erm... - Fetch me more memory. - No, no, no, no, no. The thing with EPYC is we wanna have all of the channels filled out, so that's eight per CPU. - Yes. - But once you add more, it can be harder to hit the same speeds and the same latency, so- - And speed and latency of your memory is super, super important if you're running software RAID, which is exactly what we're gonna be doing with ZFS. We're using ZFS, right? - Well, for now. Just to test it. But the actual deployment's gonna be using WekaFS, which is a different thing that costs hundreds of thousands of dollars, but seems to be software RAID too, so yeah, I guess. - Whoa! - But, oh, what the hell? - That's cool. - You dropped something. It comes out as one big fat mammoth of a module. - [Jake] Here. - I love it. - I don't know where that is from. - Raaaaaaaaa! Now, this is a fun fact. Small fans, not great at moving a ton of air because they've got little bitty tiny blades. But what they are really good at is generating a ton of static pressure, which is really important in a deployment like this. See, look at the front of this chassis. It's gonna be all full of drives in there, right? And in order to fill it with drives, you've gotta have a back plane for them to connect to. Well, that back plane is a hard PCB, and you can see that there's only tiny little gaps in it, wherever they were able to get a little hole to draw air through the front of this chassis. They need to generate enormous static pressure in order for there to be enough airflow to force over the CPUs, memory, power supplies, and PCI Express cards. Did I say power supply? Power supplies, yes, they're redundant in the event that one fails, and it's also super useful for connecting your server to two independent power sources, in case one power source fails. Side note, Jake, I think this might be the thickest PCB I've ever seen. - Holy (beep). - I mean you want rigidity obviously, especially somewhere where there's gonna be mechanical strain on the device. - That's like two sticks of RAM. - Yeah. - Thickness. - Here, let's get a shot of this, just for context. Here's a stick of memory. I think it's more like three, Jake. - What? That PCB is almost 1/8 of an inch thick. - That's crazy. (Jake laughing) Oh, this is interesting. You can see that in order to avoid recycling any of the hot air back to the other side, they've got these little like rubber curtain things anywhere where cables have to pass between the front of the chassis and the back. And that's not the only cable management trick it's got up its sleeve. Power runs up this side, but the front enclosures also need a PCI Express connection for the NVMe drives. And all of those are flat connectors. 
Check this out, they run right in between these memory slots, to these sick freaking PCIe connectors that go into the motherboard, and do they have any cards for them? No, they just all come directly off the board. - Yeah, there's the little ones over here too. - Of course, you can add even more NVMe storage if you wanted to. There's the three, excuse me, four PCIe slots here at the... Wow, this is- - See this little guy, he is right here. - Oh, there it is. - That's where we're gonna put our Optane. Oh wait, actually no, it doesn't fit. - Ah. - Oh God. - No, it's fine. Oh, cool. Okay. So this is a dual riser, on this side. You've got a simple PCIe x16 to x16 slot, and knowing AMD EPYC, it's probably running at full speed. - It's running at full speed. - Actually, all of this is just gonna be full speed PCIe gen four. And then over on this other side, we've got, I believe this is a PCIe x32 slot. Oh, that is crazy. Jake, it is an x32 slot, and you can see they've actually got the pins that correspond to each of the x16 slots silk-screened onto the PCB. - Look, if you think that one's crazy, look up here. - That's amazing, I love it. - There's one x16, another x16, and then a, what is that? An x8. - That's an x8. Right over here. - So I think some of the NVMe is run off of this. - Oh, you know what? The x8 is running these SFP ports at the back. - Ooh! - No bottlenecks. - And then one slot, and then those are two more x8 NVMe connections, different ones. - Running to the front, yep. There's so much PCIe in these servers. - And again, very purpose built, right? - Yeah. - You can put a GPU in here though. Oh look, GPU power. If you wanted like an A100 or something. Again, if you had a very purpose built, specific use case. Oh my God. - I mean, we did see a storage deployment recently where GPU acceleration was used for RAID parity data. - I have a bit of an update on that one. - Wendell informed us that there could be some issues with that particular solution. - We tested it. There is an issue. - Ooh. (Jake laughing) - Basically, we stopped the array, edited one of the drives, I think we edited 32 bytes of it to be something different, started it up, and it didn't fix it. - It just has no error handling whatsoever. - It is. It's depending on the drive to tell it that there's an error. - Jake, I just realized something. I was trying to figure out why the front of the slot was over here, and I was like, "Right, that's where the power pins are." - [Jake] Yeah. - So it's got normal size power pins, and then these itty-bitty higher density data pins. - Now the goal today is to see how the system would perform if you were just to set it up yourself with something like ZFS. But even with a file system that's more optimized and actually specifically built for NVMe, like WekaFS, you still need a lot of CPU compute to handle things like networking, the actual connection to the NVMe drives themselves, and any sort of networking overhead. Fortunately, AMD stepped up to the plate and provided 12 of their 7543 EPYC processors. So those are 32 cores each, for a total of 64 cores in each of our six servers. These are configurable to a max TDP of 240 watts and a max boost clock of 3.7 gigahertz. They're not quite as fast as the 75F3s we had in the Graid server, but they're still plenty potent for what we're trying to do here. So, let's get them installed. I gotta prove Supermicro right here. I know how to do this, David. I swear. I swear I can put a CPU in. - [David] I don't know, Jake. - Watch me screw this, no. 
- [David] Oh. (Jake laughing nervously) - You saw nothing. You know, my hands aren't quite what they used to be. (upbeat music) All right, David, I'm doing the most dangerous part here, thermal paste. Oh, don't wanna mess this up. Oh, oh, I already messed it up. - [David] Is there treasure under that? - Look at these bad boys. It's crazy to think that this could handle a 280 watt CPU. Like, just underneath here, there's gonna be a massive vapor chamber that just spans the entire thing. - Now it's time for the tedious process of installing all of these sticks of memory. The DIMMs are made by Samsung. They are 16 gigs each, and they run at 3,200 megatransfers per second, but the most important thing about them is that they're qualified by Supermicro for this particular server. And I get it, you know? Who would wanna run unqualified memory in their mission critical server? It's like drinking from a non LTT Store qualified water bottle. Crazy. I think the craziest thing about this memory setup is that it's not even that crazy. 256 gigs of ECC error correcting memory would be mind blowing for a desktop, but for a server, this is pedestrian. This is a storage server. We don't actually need to put enormous data sets in memory for these CPUs or GPUs to crunch away at. These are to make sure that each of our CPUs gets two full fans' worth of dedicated airflow blowing through them. In our final deployment, as part of our petabyte flash storage project, oh, we're gonna have six of these acting as NVMe over fabrics hosts. And it's kind of similar to iSCSI in that your storage is in one box over here, and then it's connected via networking to your compute box. But NVMe over fabrics was designed specifically with NVMe devices in mind. So, it's way more performant, but to push that kind of speed, you need to make decent use of the drives, right? And that requires a lot of networking. A hundred gig? Ha, can I get a ha? - I don't have my mic on. Hold on. Ha. - 400 gig is what we're targeting with dual NVIDIA ConnectX-6 series cards, and I couldn't help noticing that one of these has a half height bracket on it. Oh, you want it on this side? - Yeah. - It's for cooling, for more cooling. Get them separated. - It's just gonna hang there. - The teamwork in it. I'll go at it from one side, you go at it from the other. - Yeah, I think they called that a (beep). - [David] Ew, no. (Jake laughing) - I thought you wanted to put this one on this one. - It doesn't fit. - Oh. - Yeah. It's fine. We can just stick it- - Too much, too much cable. - Ooh, careful. - Yeah, we really need to not break any of these if we're gonna hit our petabyte flash storage, - Well, and also, - Target. - We don't want them to be able to say, "I told you so." - They did tell us not to bother them. (Jake laughing) This may be the most overkill boot drive of all time. (Jake laughing) - Like doesn't.. - It's not redundant though. So it's like actually not that great (laughs). - In fact, many server motherboards, most even, have an internal USB port that is exactly for that. That is what it's for. - Really? - Yeah. It's for just running an OS off of USB, but just using a cheap thumb drive and plugging it in. A lot of them also have an internal, like, little powered insert thing, and that's what we'll be using for this one. Where is that? - In the real deployment. - Yeah. - Let's take it out. - Oh, well where does it go? Oh, is it the SuperDOM one here? - Yeah. Something about that doesn't look right. - Yeah. I did not put this in right. Yeah. - Oh, oh God. Every time. You didn't screw this in? - Oh, I forgot. Oh, I can still access it. Where'd the screws go? Just gonna- - Is there a through hole? - Yeah, so like a- - Oh, God. - Just a- - What about this side? - There's a bit of flux. No, that one I screwed. Oh, I didn't screw that one in either. 
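As an aside on the NVMe over Fabrics setup described above: from the compute node's point of view, attaching a remote flash box looks roughly like the sketch below, which wraps the standard nvme-cli discover and connect commands in Python. The transport, address, port, and subsystem NQN here are hypothetical placeholders, not values from this deployment.

```python
# Minimal sketch of attaching an NVMe-oF target from the compute side,
# using the standard nvme-cli tool via subprocess. All target details
# below are hypothetical placeholders, not values from the video.
import subprocess

TARGET_ADDR = "192.0.2.10"                        # hypothetical storage-node IP
TARGET_PORT = "4420"                              # conventional NVMe-oF port
TARGET_NQN = "nqn.2022-04.example:flash-node-1"   # hypothetical subsystem NQN


def discover_and_connect():
    # List the subsystems the storage node is exporting over RDMA
    # (e.g. RoCE running on the ConnectX-6 cards).
    subprocess.run(
        ["nvme", "discover", "-t", "rdma", "-a", TARGET_ADDR, "-s", TARGET_PORT],
        check=True,
    )
    # Attach one subsystem; its namespaces then appear locally as /dev/nvmeXnY
    # block devices, ready to be pooled or formatted like local drives.
    subprocess.run(
        ["nvme", "connect", "-t", "rdma", "-a", TARGET_ADDR, "-s", TARGET_PORT,
         "-n", TARGET_NQN],
        check=True,
    )


if __name__ == "__main__":
    discover_and_connect()
```

Once connected, the remote namespaces show up as local NVMe block devices, which is why it fits the same mental model as iSCSI while carrying far less protocol overhead.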
- Last but not least, storage. The actual deployment of this cluster is gonna be making use of 12 CD6 15-terabyte drives per server. But because those drives already have like specific demo data, - Right. - Assigned to specific slots, we had to be very careful about taking them out. Did you see my little diagram? - Oh no, I didn't. - Oh. Oh yeah, baby. - Oh my gosh. - [Jake] It's perfect. - [Linus] Okay. I mean... - I labeled the drives. I didn't wanna screw it up. - That's fair. (Jake laughing) That's fair. - Because that would be catastrophic. So instead we're gonna be using these seven terabyte CD6s that we already had laying around, and we're gonna be installing TrueNAS to run ZFS on them. Just to see, like, if you were to buy this server and these drives, - Yeah. - How much could you get without spending $400,000 on a file system? - Yeah, that. I mean, we're expecting really impressive results even without the fancy file system, because these are PCIe gen four drives that are capable of in excess of... What is it? - There's like six gigs. - Like over six gigs a second? Gigabytes. - These are exactly six gigabytes. - Six gigabytes a second of throughput. - So put together that's around 75 gigabytes a second. The interesting thing is the 15 terabyte ones are a little bit slower. I think they're five and a half gigabytes a second, so it works out to be closer to like 65 gigabytes a second, which is a lot closer to the 50 gigabytes a second that our network can do. 
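To sanity-check that throughput math, here is a quick sketch in Python using only the figures quoted in the video (roughly 6 GB/s per 7 TB CD6, about 5.5 GB/s per 15 TB CD6, 12 drives per server, and 2 × 200 Gbit of ConnectX-6 networking):

```python
# Rough aggregate-bandwidth check using only the figures quoted above.
DRIVES_PER_SERVER = 12


def aggregate_drive_gb_per_s(per_drive_gb_per_s: float) -> float:
    """Total sequential throughput if every drive ran flat out, in GB/s."""
    return DRIVES_PER_SERVER * per_drive_gb_per_s


def network_gb_per_s(total_gigabit: float) -> float:
    """Convert a line rate in gigabits per second to gigabytes per second."""
    return total_gigabit / 8


if __name__ == "__main__":
    print(aggregate_drive_gb_per_s(6.0))   # 7 TB CD6s: ~72 GB/s of raw drive bandwidth
    print(aggregate_drive_gb_per_s(5.5))   # 15 TB CD6s: ~66 GB/s
    print(network_gb_per_s(400))           # 2 x 200 Gbit ConnectX-6: ~50 GB/s on the wire
```

Either way, the drives in aggregate comfortably outrun the roughly 50 GB/s the network can carry, which is why the concern in this build is CPU and networking rather than raw drive speed.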
- Shout out Supermicro, by the way, for these toolless sleds, this was so fast compared to when I built that Simply Double server that had- - Did you have to screw them all in? - All of them. - Oh. - Even with one screw, boy, does that ever add a lot of time. - But you kind of have to do two. So that's like, oh, 96 screws. He's not even doing it right and he's complaining. (Jake laughing) Oh, are we done? - That was it. - That's it? Wow, it's so cute. - Fricking crazy. - I mean, cute is like not giving it enough credit. - 400 gigabit per second. Okay. - Shall we plug her in, captain? It's gonna complain that I only plug in one of these. - And then we'll tell it to shut up. - This one doesn't have a shut up port, but it does have a shut up function called unplugging the power supply. (server humming) - Oh, well we could just plug it in. Here we're gonna plug the power supply into our server. Oh, that was a little rough. Oh, I got the wrong power cable. One moment please. When I was living at Yvonne's house. - Like with her parents? - Yeah. I had a GPU make that noise when I forgot to plug in the PCIe power connector. - Oh yeah, like an 8800 or something. - And it made her dog throw up. (Jake laughing) This thing is surprisingly quiet for a 1U. I mean, I know it's idling, but still. - I guess if it's not doing anything. - Yeah. Well, it's nice to not have to hear it just whining. - Some of them, their power supplies... (server humming) - No, nevermind. I spoke too soon. - That's still not bad. Two 32-core processors. All right, we see 13 drives, looks good. We've got our boot drive and our 12 seven-ish terabyte drives. It's time to make our pool. Should we do realistic or should we do full send? I actually don't think a stripe is gonna be that much faster, honestly. We might be best to just do like two RAIDZ1s. - Two RAIDZ1 vdevs would allow us to have two drive failures, one per vdev, before we actually experienced any data loss, and the way Jake's going to configure it is with two six-drive vdevs that we will then combine into a single pool. - [Jake] All right. What do you wanna call this? IAMSPEEDD, 69 tebibytes. You know what, it's even dot 84. Like that's double 420, you know? (Jake laughing) We gotta make a couple tweaks here. They've actually updated it so atime is off by default. - What is atime? - It's like, it records the access time of the data. You only need that for like very specific use cases. - Or diagnostics, I would think. - Yeah, but you don't want it. It's not good for performance unless you actually need it. We're gonna go from a 128K record size to one meg, 'cause that's kind of closer to our use case of, like, the video, which is big files. - Yeah. - If you were to host like a database or something, where you have lots of random reads that are small, like especially like a text based database, you would probably want a smaller record size. But for us, one meg is good. - 128K is the default for a reason. - Yeah. - That's excellent for a mixed use case. - Yeah. Okay, and we wanna do one more thing. We're gonna set the ARC, that is the RAM cache of ZFS, to just be metadata only. If you use it for files as well, when you have such fast backend storage, you can actually lose performance. - Yeah. - So setting it to metadata only gives us a little bit of acceleration from it, but not the same kind that ARC would for hard drives. So we're running an iodepth of 32, which is somewhat unrealistic, but two threads per NVMe. So 24 threads total at a 128K block size. - You make the server go fast your way, I'll make it go fast my way. Here we go, ready? Ah. - Did you just unplug the network? - Oh. Maybe. - Maybe? - But I did it really fast. - Yeah, I think you did just unplug the network. Cool. Well, it's fine now. 15, 17, 20. - Now we've done, you know, 18, 20 gigabytes a second on a ZFS pool before, but what you have to consider now is that we've done that on servers that were generally double the thickness. So, in a clustered deployment where density is key, you're able to get effectively double the performance of your drives by having two 1Us, by adding all of that compute. That's the point of this, and that's what made these ideal for our petabyte flash project. - Whoo, we're ramping up baby, almost 30 gigs per second. - Oh wow. - Look at the CPU usage, those cores are, - They're going. - Ahead. I bet you if we switch our test to a 128K block size, leaving the array at one meg, this is gonna go even faster. Yeah, there's 22, 20. See if it ramps up even higher. - Five threads at 100%. - Well, there's more than that if you look at it like realistically. So this is a write test? Sequential write? We're looking at around 20 gigabytes a second as well. - What that tells us is that we are still CPU limited, because in theory, these drives don't write as fast as they read. These are more read-optimized data center drives. That's actually really impressive considering that we're dealing with parity data here though. - And ZFS. Bump the threads up a bit. - Man, I am excited to see what this thing can do when there's another five of them in a cluster. - Yep. Okay, so this is a random read, 4K block size. We're doing four threads per drive and a 64 queue depth. 
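For anyone trying to reproduce this at home, the pool layout and the fio runs described above could look roughly like the sketch below, assembled with Python's subprocess. The pool name and device paths are hypothetical placeholders, and this is plain OpenZFS plus fio from a shell rather than the TrueNAS workflow used in the video; only the properties and test parameters mentioned in the transcript (two six-wide RAIDZ1 vdevs, atime off, 1M records, metadata-only ARC, a 128K sequential test with iodepth 32 and two jobs per drive, and a 4K random test with four jobs per drive at a 64 queue depth) come from the video.

```python
# Hypothetical reconstruction of the pool layout and fio runs described above.
# Device names, pool name, and paths are placeholders, not from the video.
import subprocess

DEVS = [f"/dev/nvme{i}n1" for i in range(12)]   # 12 CD6 drives (hypothetical names)
POOL = "tank"


def run(cmd):
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)


def create_pool():
    # Two 6-wide RAIDZ1 vdevs striped into a single pool.
    run(["zpool", "create", POOL,
         "raidz1", *DEVS[:6],
         "raidz1", *DEVS[6:]])
    # Tweaks from the video: no access-time updates, 1M records for big video
    # files, and ARC caching metadata only since the NVMe backend is so fast.
    run(["zfs", "set", "atime=off", POOL])
    run(["zfs", "set", "recordsize=1M", POOL])
    run(["zfs", "set", "primarycache=metadata", POOL])


def fio_sequential(rw: str = "read"):
    # 128K sequential test: iodepth 32, two jobs per drive (24 total).
    run(["fio", "--name=seq", f"--rw={rw}", "--bs=128k", "--iodepth=32",
         "--numjobs=24", "--ioengine=libaio", "--size=10G",
         "--group_reporting", f"--directory=/{POOL}"])


def fio_random(rw: str = "randread"):
    # 4K random test: four jobs per drive (48 total) at a 64 queue depth.
    run(["fio", "--name=rand", f"--rw={rw}", "--bs=4k", "--iodepth=64",
         "--numjobs=48", "--ioengine=libaio", "--size=10G",
         "--group_reporting", f"--directory=/{POOL}"])


if __name__ == "__main__":
    create_pool()
    fio_sequential("read")
    fio_sequential("write")
    fio_random("randread")
    fio_random("randwrite")
```

TrueNAS exposes most of these knobs as pool and dataset options in its web UI; the command-line form is shown here only to make the individual settings explicit.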
This is not only gonna be a petabyte of flash. It's gonna be the highest performance setup that I, - Look at how, - Might ever see. - Dog crap that is. - Oh, that's a shame. 150,000 IOPS individually. These drives, they'll do more than that. - Even a million. - Yeah. - They'll do a million each. That's that one meg record size kind of hurting us. Geez. Look at our cores, they're just pegged. - Wow. - Yep. - 47 threads at 100% right now. Poor thing. - And what about our random write? These poor drives, we're just abusing them. Oh, that's embarrassing. - 20,000 IOPS? This is literally slower than a hard drive, but - No, it's not. - A hard drive sequentially. If we were doing 4K random writes to a hard drive, it would be, - You'll get like 10 IOPS. - Way slower than this. So that's something you gotta keep in mind about these numbers. It's not as simple as just megabytes a second. The kind of data that you're hitting your storage device with is what makes an enormous difference. Really, that brings us back to what was kind of the whole point of this video, doesn't it? - Yeah. - That servers have to be designed for the application that they're intended for. And these are absolutely perfect for what we will be doing with them, but not perfect for what we do with our regular servers here. Like, we wouldn't replace New New Whonnock with one of these. - That's the thing with software, man, random reads and writes are just not it. But you know what is it? (Jake laughing) - Our sponsor. - Manscaped. The new Manscaped Ultra Premium Collection is an all in one skin and hair care kit for the everyday man and covers you from head to toe. There's the two in one shampoo and conditioner, their body wash with cologne scent, hydrating body spray, deodorant, and a free gift, moisturizing lip balm. Ooh. (lip licking) Your man maintenance just got easier. And best of all, all Manscaped products in the Ultra Premium Collection are cruelty-free, paraben-free and vegan. Visit manscaped.com/TECH or click on the link below for 20% off and free shipping. - If you guys enjoyed this video, go check out part one where we got into more depth about the complete configuration, including taking a close look at the 8-GPU server that is gonna act as the head controller for the six of these that we're gonna have stacked up- - Oh, I wanna do a video like this on that server. - You should. - Like booted into Windows just for (beep). - Nvidia specifically told us not to mine on it, but - I might just do it. - Like could they hate us any more at this point? - Yeah.
Info
Channel: Linus Tech Tips
Views: 2,304,495
Keywords: Server, NVMe, Flash, Petabyte, Million Dollar Unboxing, Supermicro, NVIDIA, AMD Epyc, Epyc, Installation, Server Installation, Fastest, Jake, 1124US
Id: iA_B2oaqqvg
Length: 23min 15sec (1395 seconds)
Published: Sat Apr 02 2022