128-Core AMD Epyc Rome Server Tear-Down, ft. Level1Techs

Video Statistics and Information

Captions
Hey everyone, we're back with another cool enterprise-type video, except this one is also interesting for the enthusiast space, because at GN we've talked a lot about EPYC over the past couple of years as AMD has rolled it out; it's a lot of the reason that Ryzen exists in the capacity it does on desktop. I have Wendell from Level1Techs here; you may have seen him in some of our other videos. We've got a server build going up, which is why he came out as a server expert, and now we're going to talk about the expensive way to do a server, not the GN way to do a server. Before that, this video is brought to you by Linode cloud computing. We've trusted Linode as our web host since 2012 and recommend it for excellent technical and customer support, reliable uptime, and a clean interface. Aside from cloud hosting, Linode recently added GPU hosting plans for machine learning and neural net use, built with RTX 6000 GPUs and 10 gigabit per second network speeds. They're also starting to deploy EPYC CPUs in their servers. Sign up for Linode cloud computing with code GAMERSNEXUS20 for a $20 credit, or click the link in the description below to visit linode.com/gamersnexus.

This is exciting, because you guys do teardowns of the best of the best, and this is the fastest computer mere mortals can lay their hands on. It's the world record holder: 64 cores per CPU, two CPUs, 128 cores, 256 threads, up to 4 terabytes of memory. It currently holds, I think, 11 world records, and you've got it in a Gigabyte chassis. We'll go through the cooling system too. Right now it's powered on and actually really quiet, but when you first power it up, it's deafening. It's got standby power, so the baseboard management controller is on and we can remote into it over the web; that's what we set up with your server too, the IPMI. The ASPEED AST2500 is the same implementation.

So this is the Gigabyte R282-Z93, and when you first turn it on, it sort of gets angry. That's fine, that's reasonable. Andrew, is this picking up on the mic? I want to make sure they can hear this. Oh, they can hear it. It'll calm down eventually, but it has some cool load balancing on the fans: if you remove one, or if one dies, it compensates. So we're holding an anemometer right next to it, and wow, it's already at 1,500 linear feet per minute of flow. It knows when you pull a fan, and it will compensate for that. Literally everybody in the building can hear this. So, 4,000 linear feet per minute. If that means nothing to you: as an example, 200 millimeter fans, at the opposite end of the spectrum, will be in the 400 range of linear feet per minute of flow, so ten times the velocity. That is insane.

I really like these chassis from Gigabyte, so I picked up some of their first-gen EPYC chassis, from EPYC Naples, and I work on a lot of Dell and HP systems. It should quiet down; let's see. There we go. So you're saying you've worked on a lot of Dell and HP systems? Oh yeah. This is a Gigabyte chassis with a Gigabyte motherboard they've designed; it's got PCI Express 4 risers. A lot of the modern EPYC Rome-design servers are a mix of PCI Express Gen 3 and Gen 4, depending on what the devices are. It's a popular misconception that they were able to just drop EPYC Rome into the old Naples servers; that basically didn't happen, not for Dell or HP or anybody. The boards required a little bit of rework. It is true that a lot of the world-record-holder systems were on pure PCI Express 3; that used all of the PCI Express power budget to give a little bit more push for the CPU, to push it over the edge.

But this chassis is designed for a mixed workload. I don't think you would normally use 128 cores in this kind of chassis, because the storage density is not super high, and it's also got some room for GPU compute: you could deploy up to three GPUs in here, like Tesla V100s or 2080 Tis, although not every video card under the sun is qualified for use in a GPU chassis like this.

When you got the fans to go super crazy, did you remove two or one? Just one. Do you know the RPM of these? We could find out; I have a tachometer. I want to say it's like 3,000 or 3,500, but it could be more; these are small fans. Let's show the label: 12 volts, 7 amps. A lot of your consumer fans will be between 0.1 and maybe 0.85 amps at the really high end, and a lot of headers on motherboards max out at 1 amp, though there are higher-amperage headers out there. It might draw less in practice, but a 7 amp rating is quite high. That's insane.

So let's explain that. I saw a brilliant analogy in the least expected place, on Reddit actually. I think there was a critical comment from a DIY enthusiast about why servers like this just have a bare heatsink, "this is worse than my home system," and the analogy I saw on Reddit (I apologize, I don't remember your username; it was probably something very inappropriate, it being Reddit) was: you have to think of it like driving along the highway and sticking your head out the window with your mouth open. In terms of throughput from the speed of the vehicle, that's really all you need. So for the video card or the CPU coolers, they don't need elaborate cooling; it's heads out the window at 60 miles an hour, forcing all the air possible through. It's a straightforward solution, and these go in racks where you can potentially have tons of them: a 42U rack, high density, ultimate compute per rack. That's also one of the world records that AMD holds, the efficiency of the compute per watt.

If you have a system like this, in terms of density, take Amazon AWS: if you're just using non-reserved instances, the default processing that you get, and you were to slice this server up as configured for Amazon Web Services, it would generate about five to six thousand dollars a month in Amazon-equivalent web services for the number of VMs it can run. And the processors are only seven grand each, so we're talking about an ROI of like three and a half to four months. Very good.

And then the board is pretty unique too; let's roll in and get a closer shot of this. Is this a standard form factor, or is this custom? It's standard in the sense that pretty much all servers follow the general formula that we have here, in terms of power supply layout and that kind of thing. There are also standards like OCP 3, the third-generation Open Compute Project interface: your low-profile 100 gig Ethernet cards that aren't going to consume an expansion slot will slide in here underneath. I don't have this unscrewed, so this is going to take me a second. Do I want to remove power? No, it's okay. And we should talk about the power supplies too. Here's OCP Gen 2. These are all modular, and the thing that I like about these chassis is that because they're super modular, this one is configured for 3.5-inch drives in the front, but you could dump all of your PCI Express lanes into NVMe in the front, twenty-four NVMe slots, not a problem, or you could do crazy amounts of compute, whatever works. Generally this kind of layout is pretty typical for a server.

Another alternative layout: you've got those low-profile cards. Like this Intel solid state drive, which is strictly speaking a consumer drive: have you noticed that certain PCI Express cards are only this high? That's designed to be stuck in here vertically. So an alternative motherboard layout would be a ton of expansion slots, but all vertical, half-height slots, as opposed to what we've done here, which is enough room for three Tesla V100-sized two-slot graphics cards. And I should note that these Teslas, stock, do not have a fan on them; they just breathe the air coming in through the front of the case.

And for the motherboard: VRMs, let's talk about that. I think this should be pretty obvious to anyone who watches our channel, but there's no heatsink on the VRMs, no heatsink on the MOSFETs. As you'll know from our build videos, if you've seen me rant about different VRMs of different qualities: if it's either a very good VRM or there's a ton of airflow, you don't necessarily need a heatsink, and that's certainly true here. First of all, these are right up against the fans, so there's a lot of airflow; you've got some other VRMs back here too, but a heatsink is not necessary, and this is the evidence. Yeah, they've taken the strategy of just distributing the VRMs uniformly throughout the motherboard. Air solves a lot of problems, and it turns out to be a very effective strategy, because the benchmark numbers from this system are so high that it's difficult to benchmark in any truly meaningful way; you end up having to run four or eight virtual machines that are all pegged, and then the software has trouble keeping up. You have to start creating scenarios for benchmarks.

Let's talk about the front end, the drives; we can show a shot of the front later, or pop it in here a bit. You've got removable drive bays, similar in concept to the disk shelf we had, but with that cool pop-out action. This is really nice for hot-swap. This is the 3.5-inch configuration. If I were building a server for storage or for compute density, I would probably opt to spend the money on a bunch of NVMe devices, but if you were going to build a storage server, or a mixed-duty workstation, you've got a ton of 3.5-inch bays here. These are relatively small 8 terabyte hard drives; 14 terabytes is what you have in yours, and I think 20 terabytes are on the horizon. So you could get a ton of local storage in this. It's mechanical storage, but you've got so much PCI Express connectivity that you can mix it with NVMe, and all of the compute is local. It's a good everything server.

And what's the board behind the drives, what's the name for it? Is it a patch board? This board is just designed for SAS and SATA; the interfaces here are SATA. Generically this would be called a backplane. You can have smart backplanes and dumb backplanes; this is a dumb backplane, but the disk shelf that you have is a smart backplane. So what's your delineation between the two? A smart backplane will handle routing and failure much more intelligently. This backplane really just provides one-to-one connectivity with the motherboard, and it's up to the motherboard to recover from fault situations, whereas a smart backplane will aggregate all of the traffic from all the disks into as few links as possible and then deliver that to the motherboard in aggregate. And are these the wires going in here, SAS or SATA? SATA. They go into what's on the other end; where does it terminate? A similar connector, just on the backplane. I see, gotcha.

These are our fans, and the cool thing about the fans, too: does it just pull up? Yep. The cool thing about the fans is they're just sockets, so it's all pre-routed; you drop one in. And I guess, thinking about it not as a consumer but as an enterprise business: if you have a component go down, you really just want to be able to quickly remove the bad component and replace it. Yep, and fans and power supplies are the most common failures. It makes sense, especially for a fan. So how much of this stuff is hot-swap? These you can drop in and out while it's running? It's really just the fans and the power supplies. So not the drives? Well, the drives depend on the controller. In this setup the drives would actually be hot-swap, but that's not necessarily true depending on what you're rolling with, because these are direct CPU interfaces. That was a little sketchy on EPYC Naples, but it's much better on EPYC Rome. Ideally some of that is handled by the backplane; a smart backplane especially will do a better job with that than a dumb backplane, but in this particular setup, basically, they say it's okay.

At this point, and I'm not familiar with the enterprise world at all, for this type of application, how much do the cases really vary? If you're looking at chassis from Gigabyte versus someone else, are there meaningful differences in their design, or is it standardized? The layout is standardized, but the overall design will be unique to each OEM; I think Dell and HP spend a lot of time customizing and optimizing their designs. One of the big differences is in the software bundle: the IPMI that's on your system is very similar to this one, but on a Dell or HP system that typically costs $300 retail. If you're just buying a server on the Dell website and you want the whole enchilada for the IPMI options, it's a $300 upgrade; usually you can haggle with them and get a better deal than that, but on this system, and on your system, out of the box you get full functionality: full screen share, upload virtual media, the whole nine yards. From Gigabyte standalone, or do you buy it through someone? Usually you buy it from a reseller, and the reseller will give you the warranty: an 8x5 warranty, or a four-hour response time, or whatever sort of service plan you're looking for. Usually if you're going to buy 10 or 15 servers at a time, then maybe you go direct, but usually your reseller would handle all that.

Is there anything else before I ask about specific parts? The power supplies; let's talk about those. These are very small, 2,200 watts, and you've got two of them. It's designed to be redundant, so worst-case scenario, they think this system could draw up to 2,200 watts: that's 15,000 RPM drives, with the fans running at the full 7 amps, with the CPUs pegged, with the full 4 terabytes of memory, with the OCP cards pegged, and three V100s. Is it 7 amps at 12 volts for those fans? Yep. And you've got how many fans? Four. Is that really true, that they can run at like 330 watts? Yeah: 7 amps times 12 volts times four fans is 336 watts. That's a hell of a lot of power for fans, but it's 4,000 linear feet per minute, granted, measured sitting in front of the nozzle of, I guess, two of them merging. That's impressive flow. A lot of modern data centers are not like the olden days, when the data center would be 68 degrees and that level of refrigeration was a real challenge; modern data centers are made to run warmer, but as a result you have higher-RPM fans. Right, but you're not fighting with the AC as much, I guess, which is valuable. So these are redundant for each other? Yep, like most enterprise products, and hot-swap, like the drives. And these are just 40 mm fans on the back of these? Yep, and I guess we don't know the speed of those, but they're also fast, and very loud when running. Do I just slide this right in here? It'll click; keep going. I don't want to be responsible for anything.

It's really the most standard thing. Internally there's this OCP 2 slot, so you can get 40 gig Ethernet OCP 2 cards on eBay for like 40 bucks, but 25 gig and 100 gig are the new standard, and OCP 3 is the new peripheral standard. Dell, HP, Supermicro, Gigabyte, they're all going to have OCP 3 and 2 slots, and these are really popular because no matter what your other configuration is, whether you're doing the half-height half-length cards or it's configured like ours is for the V100s, you're always going to have those OCP 2 and 3 slots, and they don't really count against your expansion slots. So if you've got really high density, say Gigabyte's other server that's four 2-socket nodes in 2U, so 2, 4, 6, 8 CPUs in a 2U chassis, the only real expansion you have is those OCP-type cards.

We should point out this too: power cables. We were talking about this the other day. Those Teslas take EPS 12-volt cables; they're not PCIe. Normally you would get those power cables, but I got these because I was trying to rock the 2080 Tis, because I'm a cheapskate. Real cheap: a $7,000 Tesla V100, or $1,600 2080 Tis. In contrast to the rest of the system, yeah.

Anything else we should point out? The cooler, I guess; I'm a little curious about that. Is this TR4, the same mounting holes? Yes, and this heatsink is totally standard: it's a copper plate, and it's exactly the same cooler that you would find in a 1U server. But the reason this is 2U is so that you've got more room for your PCI Express cards. So 1U would cut off right at the top of the power supplies. You'd have shorter fans in that instance, I guess; do they double up, so you end up with twice as many 40 mm fans or something? Usually, but not always.

And what about water cooling? I know Asetek does enterprise products; how common is water cooling, and is it typically closed loop? Right now it's not super common, but Asetek has a whole-rack water cooling solution with a bunch of quick-release hoses, so they've got a standard they're trying to create. Nobody really does it yet, but the idea is that for those super-dense machines you can do water cooling, and truth be told, I think that's really for Intel, because their 56-core CPU is like 500 watts, so if you want the density of four nodes in a 2U cluster, water cooling is your only option. These CPUs are relatively power-sippy; you don't really need water cooling even if you're packing eight of them. How does the water cooling rack work? If you've got, say, a rack with a normal amount of these, say 10 or something, do they stack them all and run water cooling externally out the back, routed to some radiator somewhere else? Usually the rack is a little wider than a normal rack, and there's a distribution port on the side, and everything hooks into the distribution. It's just rubber hoses with quick-release fittings; if you've ever seen one of those, they sort of pee a little bit when you pop them on and off, but that's far enough away from the server that you're not really going to have a fluid problem, and I think it's a non-conductive fluid anyway, probably.

And then the unit at the bottom, is that where the tank and the refrigeration and all that is? I saw two configurations. One was at Computex, and it routed the hoses as if designed for a data center with a raised floor, so it would route all the heat through the floor and deal with it some other way. The other one actually had the refrigeration unit in the bottom. When you say refrigeration unit, do they use a radiator and fan at all? I think they use a chiller and a pump, and there's a heat pump somewhere outside. That makes sense, out of the building, or in a different closet or something. So clearly not super common, but necessary for the really high-thermal-density scenarios: not so much on the AMD side, but definitely on the Intel side with the 56-core CPU, or for insanely high density. Some of the networking gear, like 100 gig networking gear, isn't all on the really advanced lithography, so it produces a lot of heat. But it comes down to the premium for floor space. At the New York Stock Exchange, floor space is at a super premium per square foot, so you'll probably see that a lot there, but at most other data centers square footage is not really at a premium, so those kinds of technologies are not really necessary. As I understand it, a lot of the time those data centers are built in cheaper areas anyway, if they can, and if they have the network infrastructure.

Last thing I can immediately think of, other than an open-ended question for you at the end: I know that Roman [der8auer] did a video overclocking EPYC at some point. Does any of that work on Rome? It doesn't work the same way, but it is possible to overclock. Have you done it a bit? I have not pushed the clocks far. The upper clock on pretty much all EPYC CPUs currently is 3.35 GHz, and it is possible to get that 3.35 GHz clock on all of your cores, but this motherboard seems like it's designed with only a little bit of padding, so you really don't want to push the CPUs past about 280 watts; beyond that, things start getting sketchy. So is this an instance where LN2 obviously brings your temperature floor down for the CPU and whatever is immediately adjacent to it, but exotic cooling doesn't really help, because you still have other barriers to overcome? Yeah, you would still have the power delivery problem, and then that gets into PCB design, so basically you'd need Gigabyte or someone to make a special solution just for that. There's a funny video circulating on the internet from some HPE engineers. Intel did a demo of their 56-core CPU that was slightly faster than AMD's demo of this system, and then the HPE engineers published a video, unofficially, that was not supposed to go on the internet: they were looking at it and laughing, and they just configured the CPUs for like 25 more watts each, and it leapfrogged the Intel CPU by like 1%. "If you want to play that game, Intel..." And it was still less wattage. Cool.

Alright, open-ended question for you then: is there anything else here that you think is really interesting that you want to point out, or any other projects you're working on with this that you want to bring attention to? Benchmarking this is really hard, so if you want to see the benchmarks, you can look for the Level1Techs video on this. I love bleeding-edge hardware, and this really is sort of the pinnacle of human civilization in terms of computer engineering: 128 cores, and it's commodity; this is a commodity part. We taught sand to do billions of operations per second across 128 cores. What's not to love about that? It's so crazy. And the PCI Express 4 routing is so fast. Look at the ridiculousness they had to deal with for PCI Express 4: all of the spacing on Rome, as we were talking about, and the grounds. Yeah, they had to double up the ground pins here on the edges of the card connector. PCI Express 4 is a huge pain, and you guys are going to have to deal with that on the gaming side: "I want to run my GPU vertically." Now you have another reason to recommend not doing that. Right, and if we needed one, we have another. I think there are two riser interposers that claim they can do PCIe 4 right now; I haven't validated that, but it's sketchy, so I question it. I like that even if you're running a two-slot GPU, you've still got another slot available; there are sometimes six, seven, eight slots, which is more slots than you get in a desktop. That's a super-tower desktop. Very cool.

So if you want to see more of this stuff: first of all, we'll have additional videos on this channel with Wendell, including the one where we built our own light version of this, also with AMD technology. It's got an R5 3600. Which is really good: it's technically server technology in a lot of ways. People are going to laugh at that, but if you buy an over-$1,000 NAS off of Amazon, that's typically a four-core Atom. And Atoms have had a lot of issues lately; Intel had some issues with the Atom processors specifically. So that's a great point for the DIY approach: you end up with a solution that's not complete out of the box, but is cheaper for the power you get. Anyway, we'll have that video on the channel, if it's not up already, and at Level1Techs you can get a lot more of this; definitely go subscribe, and I'll link it below. I guess you already have a video up benchmarking this, or will, probably by the time this goes up, I hope; it's been hard. If not, go subscribe and you'll catch it as soon as it's up. So you can get more there, and check back for the rest.

Thanks for watching. You can go to store.gamersnexus.net to support us directly, like by buying one of the modmats; we can now save this modmat forever, because it has touched a 128-core system. We'll auction it off or something. And you can go to patreon.com/gamersnexus for behind-the-scenes. We'll see you all next time. Thank you; thanks for having me.
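The back-of-envelope numbers quoted in the conversation can be sanity-checked with a quick sketch. This is purely illustrative arithmetic using the approximate figures from the transcript (12 V / 7 A fan rating, four fans, 4,000 vs. ~400 linear feet per minute of airflow), not measurements of the actual system.

```python
# Sanity-check the rough figures quoted in the video.
# All inputs are approximate numbers from the transcript, not measurements.

FAN_VOLTS = 12.0   # rated fan voltage (V), per the fan label
FAN_AMPS = 7.0     # rated current per fan (A), per the fan label
NUM_FANS = 4       # fan count in this chassis

# Worst-case fan power budget: P = V * I per fan, times the fan count.
fan_watts = FAN_VOLTS * FAN_AMPS * NUM_FANS
print(f"Fan power budget: {fan_watts:.0f} W")  # 336 W, matching the conversation

# Airflow comparison: measured server exhaust vs. a typical 200 mm
# consumer fan, both quoted in linear feet per minute (air velocity).
server_lfm = 4000
consumer_200mm_lfm = 400
print(f"Velocity ratio: {server_lfm / consumer_200mm_lfm:.0f}x")  # 10x
```

As the conversation notes, the 7 A figure is a worst-case rating; a consumer motherboard fan header at ~1 A could not safely drive even one of these fans.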
Info
Channel: Gamers Nexus
Views: 399,161
Keywords: gamersnexus, gamers nexus, computer hardware, epyc rome, epyc rome server, epyc server, amd epyc server, amd epyc server tear down, epyc rome benchmark, 128 core cpu benchmark, amd epyc rome review, server tear down, amd epyc benchmark
Id: la0_2Kmrr1E
Length: 28min 59sec (1739 seconds)
Published: Tue Nov 12 2019