Talking About Mellanox 100G

Captions
300 bits per second. This is number 1000 off the line, by the way. This is a modem from the 80s; it was made in Korea, but it's from General Electric, and you would actually plop the phone down here on the top. This is a hundred billion bits per second. A hundred billion versus three hundred. We've come a ways, haven't we? This is the Mellanox ConnectX-5. This is our Tyan Transport; there are four EPYC nodes in this thing, and the Tyan chassis it came from can rock up to 512 EPYC threads and two terabytes of memory. It's time we talk network interfaces for clusters.

See, these blades have two PCI Express slots, which is awesome. I can throw in a 100 gigabit card like this one; this is potentially even a dual 100 gigabit card. This type of connector is QSFP, and this is a QSFP-to-fiber interface, so it will take that quad interface and turn it into a single transmit/receive fiber pair and operate at 100 gigabits. You don't even necessarily have to use a fiber transceiver like this; loopback cables will also work over short distances. But it's not Ethernet, so keep that in mind.

All of these chassis actually have another OCP 2.0 slot underneath the standard PCI Express slots where you can install a module like this one. This one is a very nice Intel X550-AT2, which is two 10 gigabit interfaces. 10 gigabit for a server like this is basically okay: you're running your server workload and it's self-contained. We've got four 4-terabyte U.2 drives (these are available in up to 16-terabyte capacities), so that's four NVMe interfaces on the front. We're doing pretty well on PCI Express lanes, given that the platform has 128 of them. I've also got an Optane M.2 cache right here, and my operating system M.2 is back here, so we've got the dual M.2 slots plus U.2. There are also four SATA connections, so if you'd rather run four SATA drives up front, you could, or you could probably use a proprietary breakout cable and run some SATA disk-on-module stuff if you really, really wanted to. I don't know that that's a supported configuration from Tyan, but you do have the SATA interface as well as NVMe, so you could rock SATA up front if you really had to.
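To put a rough number on that lane budget, here's a quick back-of-the-envelope tally in Python. The per-device lane widths are typical values I'm assuming (x4 per NVMe device, x16 per PCIe slot, x8 for the OCP 2.0 mezzanine), not figures from Tyan's spec sheet, so treat it as an illustration of the headroom rather than an exact accounting.

```python
# Rough PCIe lane budget for one blade. Lane widths below are typical
# assumptions (x4 per U.2/M.2 NVMe, x16 per slot, x8 for OCP 2.0),
# not Tyan's published spec.
devices = {
    "U.2 NVMe drives (4x)":      4 * 4,  # four front U.2 bays, x4 each
    "Optane M.2 cache":          4,
    "Boot M.2":                  4,
    "PCIe slot 1 (100G NIC)":    16,
    "PCIe slot 2 (spare)":       16,
    "OCP 2.0 mezzanine (2x10G)": 8,
}

total_lanes = 128  # a single-socket EPYC platform exposes 128 PCIe lanes
used = sum(devices.values())

for name, lanes in devices.items():
    print(f"{name:28s} x{lanes}")
print(f"{'Total used':28s} x{used} of {total_lanes} "
      f"({total_lanes - used} lanes to spare)")
```

Even with every bay and slot populated, this rough count lands around 64 lanes, which is why a fully loaded blade still isn't lane-starved.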
This platform is amazing and flexible, and it's really pretty cool. This is a self-contained server, so you get four servers in a 2U chassis. That's really awesome. A 10 gigabit interface is fine too: if you're running virtual machines, all your storage is here, and your VMs hit the network at 10 gigabit, that's pretty good. You're running VMware or something like that, it's self-contained, that's fine. Where you run into problems is when you deploy a cluster. If we're talking about VMware and vSAN (and this is universally true of any vSAN cluster, not just one running in a single chassis), you really want the interface between nodes to be as fast as possible, and 25 gigabit is really not expensive anymore. I know the big OEMs really want you to think that it's expensive, but Mellanox ConnectX-4 and ConnectX-5 cards like this one are typically under $500; you can pick them up now for two, three, four hundred dollars, which is not an unreasonable price.

Okay, okay, what about the switch? Yeah, you can get the cards cheap, but what about the switch? Well, Dell has the S5212F-ON. See, there's a secret in the switch market: the bottom is basically falling out of it. The hyperscalers are buying switches in such volume, and they've become such sophisticated buyers, that they're dictating how the switch is to be built. So there aren't really a lot of differentiating features anymore, unless you're Cisco or Juniper or one of those. When you're Facebook or Amazon or Google and you're specifying a 100 gigabit switch, there's this open networking ecosystem, and when you buy a switch like the Dell S5212F-ON, it's what's called an ONIE switch (O-N-I-E, the Open Network Install Environment), and you run a custom operating system on it depending on what solution you're deploying it as. And so the differentiator actually becomes software. Now, the software that Dell will sell you is actually based on Debian, and you have to log into Dell's license portal to get it, but it's also kind of open, so it's kind of hard to stop it from just leaking onto the internet. You're not supposed to run it that way, though, because it requires a license; there's also a subscription version versus a non-subscription version. The reality is that between Open Compute and ONIE and all of the other stuff, and what happened with Arista (remember, a couple of years ago we bought that Arista 10/40 gigabit switch for basically nothing), it's like, hey, this is based on x86. The x86 hardware in there, or the UEFI hardware in the case of the Dell, is just control plane stuff. There's still special silicon in there that actually does the switching; it's just that the control plane for flipping all the knobs and tunables gives you a familiar interface on Linux. The x86 CPU in those switches is not doing the heavy lifting, the custom silicon is. But you as the operator get to decide which ports go where, what VLANs, what interface speeds, and so on, and in order to flip those knobs and tunables you need some kind of convenient interface. Well, nobody wants to write an operating system, ergo Linux becomes the de facto choice for the control plane of those switches. So in the case of the Dell S5212F-ON it's Debian, and there's a how-to guide on the Level One forum for getting it up and running, assuming you get the software from your Dell license portal.

That switch has 12 SFP28 interfaces, just like the ConnectX-4 card I've installed in this Tyan blade, as well as three 100 gig interfaces, which are similar to the 100 gig interface on our Mellanox card here. And this is a switch that can switch at wire speed, so we've got three 100 gig ports and twelve SFP28 ports. Now, under the hood, what Dell has done to make this switch relatively affordable (street price is about $1,200; it'll fluctuate a little bit because of this video, because a bunch of you will run out and buy one, because hey, it's a good learning platform, full disclosure) is that it's actually a bunch of 100 gig interfaces that are pre-broken-out into SFP28 interfaces. You see, this interface, in this form, is four channels: four 25 gig channels, for a total of 100 gigabits. The next step up in networking is 400 gigabit. Can you guess what they did with 400 gig? Yeah, four channels that are 100 gigabit each instead of four channels of 25. We're not here to talk about that; that stuff is still a little bit unobtainium, because the hyperscalers are buying all of it at crazy inflated prices. But what we can buy is stuff like this, and nobody ever got fired for buying Mellanox, which is probably part of why the whole NVIDIA thing happened. They're really, really good cards: ConnectX-3, ConnectX-4, ConnectX-5 are all still very viable, even though I think this card is from, I think, 2019.

Oh, it's, like, three years old, so it's totally obsolete? It's 100 gigabits. It's 100 gigabits with low overhead, but it's not just that it's 100 gigabits; it's also that it can offload a lot of processing and computation for things like RDMA. This is why it's very important to pair it with a switch like the S5212F-ON. Now, because those switches are basically built to hyperscaler spec, they don't do everything; they don't solve every use case. You see, when Dell sells you a switch, it has to sell you a switch that basically works even if you're an idiot, and the most dangerous kind of idiot is the idiot that doesn't know they're an idiot; they're at the top of Mount Stupid, and they really make everyone's life a living hell. The short version is that when you build products around that, they have to work no matter what. With the S5212F-ON from Dell, if you were to plug an SFP+ cable into an SFP28 port, it will not work. In fact, out of the box it doesn't do anything, because there's no operating system. It's like, oh, it's broken. Ah, no, just put the operating system on it. But even when you put the operating system on it, it doesn't just work. Remember how I mentioned that it's actually three 100 gig ports and twelve SFP28 ports? Well, those twelve SFP28 ports are wired internally in groups of four. So if you want to run an SFP28 port at 10 or 1 gigabit, you can, but it goes in four-port groups. Now, some of these seem to be able to configure individual ports, and some of these S5212s seem to only be able to configure them in groups of four. I've got one that won't let me configure it, for whatever reason; maybe it's my operating system version, because I don't have the very latest. It won't let me configure just one port to run at 10 or 1 gig; I have to configure a group of four. And that makes sense, because internally it's a 100 gig interface that they've just broken out into four SFP28 ports. You can do that with the 100 gig ports too: the three 100 gig ports that are physically this QSFP connection, you can get a cable that breaks one out into four SFP28 cables, and that's totally fine, that works, there's no problem with it. You can go into the switch interface and tell it, hey, this is actually four interfaces, not just one. You can configure it to mix 10 gigabit and even 1 gigabit stuff alongside your 100 gig interfaces, but you really don't need to.
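To make that four-port grouping concrete, here's a little Python sketch. It is not Dell's actual CLI or API; it just models the constraint that changing the speed of any one SFP28 port drags its whole group of four along with it, with the groups laid out as 1-4, 5-8, 9-12 (my assumption about the numbering, matching the "broken-out 100 gig interface" wiring described above).

```python
# Illustrative model of the S5212F-ON's SFP28 grouping: twelve SFP28 ports
# wired as three groups of four, each group backed by one internal 100G
# interface, so speed changes apply to a whole group. Not Dell's CLI/API.
GROUP_SIZE = 4

def port_group(port: int) -> list[int]:
    """Return the group of four SFP28 ports that must share a speed."""
    start = ((port - 1) // GROUP_SIZE) * GROUP_SIZE + 1
    return list(range(start, start + GROUP_SIZE))

def set_speed(speeds: dict[int, int], port: int, gbps: int) -> dict[int, int]:
    """Setting one port's speed reconfigures its entire group."""
    for p in port_group(port):
        speeds[p] = gbps
    return speeds

speeds = {p: 25 for p in range(1, 13)}   # all twelve ports default to 25G
set_speed(speeds, 6, 10)                 # ask for 10G on port 6...
print(speeds)                            # ...and ports 5-8 all follow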
In something like this, what you really want to do is set up your ConnectX interface to run at 25 gig, or 2x 25 gig, connect that to the Dell switch, and have everything on this side run at 25 gig; then maybe your storage server, your all-flash SAN or whatever, is connected to two of those 100 gig interfaces, and you're good to go.

Now here's the other really awesome thing about the S5212F-ON. That's showing 12 ports plus three, but you're supposed to have two of them; that's why it's half the width of one U. You're supposed to put two of these side by side and link them together with one or two 100 gig cables, so that you've got 100 (or 200) gig of capacity between the two sides. Then, when you've got something like this blade with its dual SFP28 connection, one side goes to one switch and the other side goes to the other switch. The reason for that is, if one switch fails completely, your cluster remains fully connected.
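Here's a small sketch of why that dual-homed layout is resilient, assuming four blades with a 25 gig link to each switch and a storage box with a 100 gig link to each switch. The host names and speeds are made up for illustration; the point is just that losing either switch leaves every host reachable.

```python
# Dual-switch topology sketch: every host has one uplink to each switch,
# so losing either switch still leaves every host on a common fabric.
# Host names and link speeds are illustrative, not a real inventory.
links = {
    "blade1":  {"switch-a": 25,  "switch-b": 25},
    "blade2":  {"switch-a": 25,  "switch-b": 25},
    "blade3":  {"switch-a": 25,  "switch-b": 25},
    "blade4":  {"switch-a": 25,  "switch-b": 25},
    "storage": {"switch-a": 100, "switch-b": 100},
}

def survivors(failed_switch: str) -> list[str]:
    """Hosts that still have at least one live uplink after a switch dies."""
    return [host for host, uplinks in links.items()
            if any(sw != failed_switch for sw in uplinks)]

for failed in ("switch-a", "switch-b"):
    alive = survivors(failed)
    print(f"{failed} fails -> {len(alive)}/{len(links)} hosts still connected")
```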
So imagine four of these blades in our 2U chassis, all connected with two SFP28 connections, so 25 gig to each side, and then imagine another storage server connected at 2x 100 gig, because it's all flash and we need to rock 10 gigabytes per second. I mean, 10 gigabytes per second is barely faster than a single Gen 4 NVMe drive. So that's how this should work, right? You have one 100 gig interface from your all-flash SAN going to one side and another 100 gig interface going to the other side, and that's how everything works, and the connectivity is really good.

Let's step back for a second and do a thought exercise, because as developers and people who work on higher-end local machines, let's say you've got a really awesome Threadripper workstation with really fast storage and all that. You've set up your solution locally, you've got your database running locally, everything is really good, you've got that really nice Gen 4 SSD. Everything on your local workstation is going to be insanely fast, because you've got all the memory bandwidth, all the cores, and all the disk speed. When you deploy that solution to the cloud, it's going to be an order of magnitude slower, because the cloud computers just aren't that nice. Think about all the stuff we've talked about here: 25 gigabit is only about 2.5 gigabytes per second of interface speed from storage to your switch. So when we're talking about something like vSAN, if the storage is local, yeah, I'm going to be able to run local NVMe storage at local speeds, and that's great; but if some of the storage for the VM that's running locally on this CPU and this memory needs something from over the wire, it's going to be constrained to whatever speed it can get it from the other machine.
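The arithmetic behind those numbers is easy to sanity-check: dividing the line rate by 8 gives the raw byte rate, and real-world throughput after protocol overhead lands a bit below that, which is how 25 Gbit/s works out to roughly 2.5 to 3 GB/s usable.

```python
# Line rate to rough byte rate: divide gigabits by 8. Real-world usable
# throughput lands somewhat below this once protocol overhead is counted.
for gbit in (10, 25, 100, 2 * 100):
    print(f"{gbit:>4} Gbit/s line rate ~ {gbit / 8:5.1f} GB/s raw")

# For scale: a single PCIe Gen 4 x4 NVMe drive peaks around 7-8 GB/s,
# so even a 100 Gbit link (~12.5 GB/s) is not even two drives' worth.
```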
10 gig is not fast enough for vSAN anymore. If you're deploying a solution this year, or even last year, and you deployed your vSAN solution on 10 gig, I'm sorry, but you've made a mistake, because 25 gig would not have been that much more expensive, and it's dramatically faster, not just because of the bandwidth but also because of the offload capability. That's the other feature these Mellanox NICs have: remote DMA, handling packet flows, handling local storage. The drivers for Linux, Windows, and VMware for these network cards, the software part of it, are extremely sophisticated. These cards will absolutely use their x16 interface, and they really do take a lot of load off the CPU when you deploy them in a solution like this.

We've got a video coming up with OpenShift: we're simulating a cluster on a single AMD EPYC server. It's based on the GIGABYTE MZ72-HB0, a monstrous motherboard that supports 280-watt EPYC CPUs, and my goodness, running Red Hat OpenShift on that thing is crazy; it can basically pretend to be three nodes of a data center. We've maxed it out with NVMe storage and a 100 gig Mellanox interface, and the mind just boggles at how fast OpenShift is and how efficiently it uses resources. The plan is to do some content around those kinds of clusters with systems like this, and when I went to do that with our Tyan cluster, I found out that, yeah, 10 gig is my bottleneck. I'm going to have to move up to 25 or even 100 gig, so I've got some 100 gig PCIe cards as well as the dual 25 gig OCP 2.0 cards, which is about the limit of what I'm going to do on OCP 2.0. I'm really excited about this. I love this Tyan chassis, I love the GIGABYTE motherboard we're rocking in our system; that's in the Fractal Torrent case, which is one of the only desktop cases that gives you enough airflow to satisfy a server motherboard. Things are bananas. And also, big thanks to AMD for letting me have all these CPUs, because, oh my gosh, have you seen all this insanity?

We've been just having a blast on the Level One forums. This is Level One, I'm signing out, and you can find me in the Level One forums, where we talk Mellanox 100 gig.
Info
Channel: Level1Linux
Views: 69,730
Keywords: technology, science, design, ux, computers, linux, software, programming, level1, l1, level one, l1Linux, Level1Linux
Id: lAk89Id-5RU
Length: 14min 25sec (865 seconds)
Published: Thu Dec 30 2021