4 Pis on a mini ITX board! The Turing Pi 2

Video Statistics and Information

Reddit Comments

mini cluster ok, but "supercomputer" is very overrated; for that it should do 100 petaFLOPS at least.
This one should be about 50 GFLOPS, so 2 million× overrated

👍︎︎ 8 👤︎︎ u/vilette 📅︎︎ Dec 01 2021 🗫︎ replies
Captions
I was in downtown St. Louis during Supercomputing 21, a conference about, well, supercomputers. Why? Well, I just so happen to have this board, which runs four high-performance servers, at least in terms of gigaflops per watt. This is the Turing Pi 2. It's a $200 mini ITX motherboard that holds up to four Raspberry Pis or Jetson Nanos, and it has a lot of other tricks up its sleeve. Anyways, at the Arch I ran into someone unexpected, and he gave me a challenge. "You are never going to guess who I met here in St. Louis at the Arch today. This is Patrick from ServeTheHome. He actually told me that he built a four-node server in a box a few years back." "Oh yeah." "And he's challenging me now to do the same thing. Today, in 2021, I'm gonna do it with four of these little guys, and he's gonna build one too, but it might be a little bit different. What are the ground rules here?" "Okay, so basic ground rules of this thing: we should have at least four Arm nodes, and it has to be in one box. And I think also we should get together in, let's call it two weeks or so, and maybe we'll compare notes and see what we both built." "All right, that sounds like a plan. Let's see what happens." I literally have no idea what Patrick's gonna build, but I'm guessing his four nodes will use a lot more power than my little Pi cluster that could. But it'll be interesting to see how it compares. Well, I have this board and four 8-gig Compute Module 4s; let's build a Pi cluster. The thing I like most about this board is that it's mini ITX, which means it will fit in tons of standard PC cases. As an example, I have this BC1 Mini build platform, and I can just use the Turing Pi with this out of the box; I don't have to build any adapters or do any 3D printing. The first thing you'll notice if you compare the new board to the old one is that there's a lot more going on, and that's because this new board exposes each of the Pis' PCI Express lanes to an individual function on the board. I'll go over those later, but also
there's a lot of I/O around the edges of the board that's new for a Pi. For example, this is the first board that has a full 24-pin ATX power header, so you can power this board from any standard PC power supply. It also has front panel headers for the power switch and LEDs, it has a front panel USB 3.0 header, it has USB 3 on the back, and it has dual gigabit Ethernet controlled through this RTL8370 Ethernet switch, which means I don't have to have an external switch to connect all these different Pis over the network. So it's really cool for all that. And up here it has UART, and it has a JTAG header. This STM32 chip allows the board to run some custom firmware that wasn't present on the original Turing Pi. It has better cold start capabilities, and it staggers the boot of the Pis; all these different things can be controlled in the firmware. It also gives you an interface through to things like the RTC clock and the Ethernet switch, so that is a huge quality of life improvement. Now, as I mentioned, you can use any standard PC power supply, and I went way overkill with an SF600 from Corsair. This is a 600-watt, 80 Plus Platinum power supply. The main reason I'm choosing it for this build is because at some point I'm planning on using the same power supply in a build that will have a graphics card in it, and the graphics card is going to need that power. But this board can be powered in many other ways. In fact, I also have a picoPSU, and this is probably the more practical way of powering the board: it's a little power supply that has all the cables you need, and all you need is a barrel plug power adapter, and I have a few different ones. This will power everything on the Turing Pi just the same as the big power supply, but that one has a little bit more heft and beef to it, so I'm going to mount it up here with some of the included thumb screws. Okay, so we have our power supply. Another thing I like about this build platform is it has some cutouts in here so I can route power
cables through it, and I'm just going to stick these guys through here. The good thing is they don't come out easy, but the bad thing is you usually cut your fingers on these things. So we have power for the board. I'm also going to be installing this hard drive; it's a two terabyte Crucial SSD, and I have a little SATA cable that I'll plug into the board. But it's going to need power too, so I'm going to route this SATA power cable as well. I'll go ahead and mount up this SATA drive, and I'll plug in the data cable. All right, and now I'm ready for the board itself, so I'm going to mount it up on these four posts. Now, the BC1 actually has a post for snap mounting too, but this board is going to have a lot of things plugged into it, so I don't want any potential of it falling off; I'm going to screw it down on here. All right, so I got that in, and I'm going to connect up power; I'm going to need to get a little bit more slack out of here. All right, so we've got power to the board, and I'm going to plug this into the SATA port. "SAY-ta", not "SAT-a". The next thing is to install the Raspberry Pis. Now, you might be wondering, if you look at these slots: these are not Compute Module 4 slots. They're actually SO-DIMM slots, and if you have something like an NVIDIA Jetson, you can plug it straight in. You could build a hybrid cluster with NVIDIA Jetsons and Compute Module 4s if you wanted, or do all of one or all of the other; it's up to you. But in my case, I'm going to use some Compute Module 4s, and they need a special adapter board, and this adapter board is kind of like its own Compute Module 4 board on its own. So we're gonna install the Compute Module 4s on here. Now, I have four 8-gigabyte Compute Module 4 Lites, and they get kind of hot when you're doing clustered computing or running Kubernetes or something like that, so I also have four heat sinks for them. I'm going to put the heat sinks on before I put them on these
daughter cards, and then I'm going to put all the cards into the slots, and then we'll have our cluster, and we can boot it up and see what happens. So the Pis actually run okay without heat sinks, but to be safe it's nicer to have the heatsink, just to get that heat off of the Pi's SoC quicker. And if you put a fan in whatever case you're going to put this in, it'll get that heat right out of there very fast and keep your cluster running at full performance. These heat sinks come with thermal pads for the power chip as well as the SoC, which is in the middle of the board. I think people who build full-size PCs have it easy; they have screws that are actually visible to the naked eye. I tested these heatsinks with this setup earlier, so we'll see if they actually fit or not. While I finish up this last one, I should also mention that the Pi's system on a chip does get pretty hot in operation. I actually took a thermal camera (I'll put the overlay on the video right now so you can see what it looks like) and took some thermal images of the board itself, and it seems like, of all the processors on the board, that Ethernet switch will probably get the hottest. I know on the old Turing Pi board I had to put a heat sink on it to keep it a little cooler, so I might have to do the same thing on here. We'll see; I'll be testing it. All right, so we have our four Pis, we have our four cards; let's get them all loaded onto the board. You can actually mount them up if you have all the right adapters and things, but I'm just going to plug them in and rely on the friction fit of these 100-pin connectors. And that's all there is to it; stick it in the board like so. All four of these slots have their own PCI Express lane exposed on the board in some different way. I actually have a couple cards here. This one is a Google Coral TPU, which, as I said in my previous video, the drivers for the Raspberry Pi still aren't quite there, but it might work with some other boards. And these
cards should be compatible with Radxa's CM3 and with Pine64's SOQuartz, so I'm going to try those out at some point, but for now I'm just going to see if it shows up. This one I will put into this slot over here on the side, which I believe is connected to node 2. I also have this mSATA card, which is also mini PCIe, and I'm going to put that in over here. Sometime I would like to test out LTE networking and have this be a completely portable wireless Pi cluster. We'll see if I can make that happen or not, but that would be, I think, a fun project. Let me know if that's something you'd like to see me do; it's very expensive, so that's why I'm not committing to it yet. And then you might be wondering, where are all the other PCI Express lanes exposed? Well, one of them goes to these two SATA ports (I believe that's on node 3), and then node 4 has a VLI VL805 USB 3 chip that goes out to these USB 3 ports on the back and the front panel USB 3 header. So you might be tempted to think that the SATA ports can somehow magically be seen by all the Pis, but that's just not how it works. The idea being that you could have one Pi be kind of like the storage controller with SATA, another Pi could be the wireless network connectivity, and you can build your cluster that way, so they share the resources. That's actually a good pattern for the Pi, because the Pi only has PCIe Gen 2, so you can't get a ton of bandwidth. This is a healthy compromise, because you do have a little bit of freedom with these mini PCIe slots, and you can get adapters, like I have the A+E key adapter for the Coral that goes into mini PCIe. The last thing I want to do is make sure that this stays cool, and I was going to use this fan. I've been talking to Turing Pi about this board, and this is a 12-volt PWM fan that plugs directly into the fan header on the board; however, the firmware currently doesn't have PWM fan control set up in it, so for now this one is actually
powered off. So what I'm going to do instead is use this Noctua PWM fan that's 5 volts, so you can plug it into USB, and I have Noctua's PWM fan controller so I can control the speed of it, so it's not just going full blast all day. Because I've lived like that before, and I don't like having my right hand get frozen while I'm doing these projects. For airflow, this location is not exactly perfect, but it is good; it'll get air past all of these. You'd want airflow going this way to get all the heat out of those heat sinks. All right, so it's going to go into USB, cable managed. I already loaded up these four cards using the Raspberry Pi Imager, and with the Pi Imager you can actually set a hostname, so I set these as turing-node-1, turing-node-2, turing-node-3, and turing-node-4, and I put my SSH keys on them, so that when I boot them up I should be able to log right into them without having to do any discovery or any passwords or anything like that. So I'm going to put each one of these into the microSD card slots on these boards. Okay, and I'm going to go ahead and plug in power, and then when I switch this on, the board should come on, and once it comes on, it should start its boot process; I believe it boots each Pi in succession. I'm also going to connect it to HDMI so I can see what happens on the monitor. The HDMI port is connected to node 1, so you can only see node 1's video output; you can't see any of the other nodes through this connection. And Ethernet: this is set up by default in a bridged configuration, so don't plug both of these jacks into your network at the same time, or mayhem results. No sparks, that's good. Here comes power to each of the Pis, and if I look at the back of the boards, I can see the power LEDs on each of those Pis as well. All right, so it looks like they're all booting up, and if I turn on my monitor, I should be seeing the video output from the first Pi. And it looks like it's already booted completely. Now, I have to admit, I cheated and booted it once before
to make sure that they actually boot, so it already did its disk expansion and all that stuff. What I can do now on my computer is log in to node 1 and see what I can see on it. So, ssh pi@turing-node-1.local, I believe it is. And if I say lspci, now I can see that there's a SATA controller, and that is this SATA controller, the one that I just plugged in here. So you can have all four slots populated with different PCI Express devices, but the last two also have the built-in PCI Express devices. If I log out of here and log in to node 4 and run lspci, I can see that there's a VL805 USB 3 host controller, and that's plugged into these ports and the front panel header, like I mentioned earlier. One thing that I really like on this version of the board that wasn't on the original is all the blinking lights; they're very helpful for status. There are these red ones, which mean that the slot has power to it (the microcontroller controls whether there's power to the slot or not). On the other side there are link and activity lights for each slot's Ethernet activity, and of course on the back there's link and activity on each of these network jacks. Then there's a board power light (that's the first green LED), and then system power, which means that the microcontroller is powered up and can control all the boards. So all those are on there, and on the back of each of these cards there's a power and activity LED for each Pi itself. Blinking lights are one thing, but I want to see how this cluster performs, and to do that I need a way to interact with all four Raspberry Pis without tearing my eyes out. I'll use Ansible, since, well, I wrote a book on it. I set up this basic inventory file to tell Ansible how to find the nodes, and I put them into a few groups, like control plane nodes and cluster, which will be helpful when I set up Kubernetes in a future video. With this, I should just be able to run ansible all -m ping and get a response from all the nodes. Let's see if
that works. Nice. Now, the other thing I wanted to do was run HPL, or High-Performance Linpack. This is the benchmark the TOP500 supercomputer list uses, and since they just updated the list at SC21, I thought it'd be nice to know how my Pi cluster stacks up. So I built this playbook that builds MPI, ATLAS, and HPL so I can benchmark the cluster, and of course I put it up on GitHub. It's open source, like everything else I do, so if you have your own cluster, you can run the exact same benchmark. I ran the playbook, but I had trouble getting the benchmark to run on the whole cluster; it would just hang instead of running. As with almost every problem I encounter these days, the problem was DNS. Hey, I have a shirt for that: redshirtjeff.com. Anyways, after I figured out I had to add all the node IP addresses to each node's hosts file so they could all see each other, I ran the HPL benchmark, and the cluster put out a respectable 45 gigaflops. This cluster would have qualified for the November 1999 TOP500 list. Not too shabby, considering those clusters were running 64 or 128 CPU cores, and this one only has 16.
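A minimal sketch of that setup, combining the grouped inventory idea with the /etc/hosts workaround for the DNS hang. The hostnames follow the ones set in the Pi Imager, but the file names, group names, and IP addresses here are illustrative assumptions, not the exact contents of Jeff's repo:

```yaml
# hosts.yml — an assumed play (run with: ansible-playbook -i inventory.ini hosts.yml)
# that adds every cluster member to each node's /etc/hosts, so MPI/HPL name
# resolution doesn't hang on .local discovery. IPs below are made up.
- hosts: cluster
  become: true
  vars:
    cluster_hosts:
      - { ip: 10.0.0.11, name: turing-node-1 }
      - { ip: 10.0.0.12, name: turing-node-2 }
      - { ip: 10.0.0.13, name: turing-node-3 }
      - { ip: 10.0.0.14, name: turing-node-4 }
  tasks:
    - name: Ensure every node can resolve every other node by name
      ansible.builtin.lineinfile:
        path: /etc/hosts
        line: "{{ item.ip }} {{ item.name }}"
      loop: "{{ cluster_hosts }}"
```

With an inventory that defines the `cluster` group, `ansible all -i inventory.ini -m ping` is then the quick connectivity check mentioned above.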
But what's more interesting to me is the potential for more energy-efficient computing. The Turing Pi 2 is billed as a potential edge server, and a lot of places where they'd be deployed might be running off solar or have a limited power budget. It's great to put out hundreds of gigaflops, but if doing that causes a brownout, it's not really a good solution. And here's where this little Pi cluster does pretty decent: if we rank it in the current November 2021 Green500 list, it would actually rank somewhere around 150, getting 1.83 gigaflops per watt. Granted, the other machines on the list were pumping through over 2,000 teraflops, but still. By my calculations, I'd only need around 150,000 more compute modules to make it to the TOP500 list. And that's just the thing: the Pi isn't going to be a compute monster. Its CPU cores are pretty power efficient, but even compared to an M1 Mac, they're just not that fast. But they are small and relatively cheap, and that's why places like Los Alamos National Laboratory built a 750-node Raspberry Pi cluster in 2017.
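As a back-of-the-envelope check on those Green500-style numbers (my arithmetic from the two figures quoted above, not a measurement from the video): 45 GFLOPS at 1.83 GFLOPS/W implies the whole four-node cluster drew roughly 25 watts during the HPL run.

```python
# Implied power draw from the two reported figures.
hpl_gflops = 45.0      # measured HPL result for the 4-node cluster
gflops_per_watt = 1.83 # reported efficiency (the Green500 metric)

implied_watts = hpl_gflops / gflops_per_watt
print(f"Implied power draw during HPL: {implied_watts:.1f} W")
```

That sub-25 W envelope is the whole point of the edge-server pitch: the same math on a TOP500 machine involves megawatts.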
They realized they could still learn and test on a smaller, more efficient cluster, and then run their final workloads on the big, beefy production servers. The Pis can be densely packed without eating megawatts of power, and that beats out a lot of other types of servers if quantity is more important than raw performance. But let's be real: for most people, the laptop or tablet you're using to watch this is already faster than my Pi cluster. My M1 MacBook Air uses less power than the cluster and still puts out about 40 gigaflops using only the CPU. But it can't run Linux yet, so there's that. Now that we know how this little Pi cluster performs, I think it's time to see what Patrick's been up to. "All right, so it's been a couple weeks. Have you finished your build?" "Yes, I did." "All right, well, I want to see it. How about we both reveal our builds at the same time, and then, you know, we'll see what they look like." "All right, I think that's an awesome idea. Are you ready?" "Yeah. Three, two, one, ta-da!" "I notice you're not holding yours. Can you turn it? Is it something you can hold in your hand?" "Yeah, it's a little bigger than yours, I guess, and it even has, like, a fancy glass window." "Oh my." "Yeah, so this is just kind of a pretty basic cluster, all in one box, in a single nice Fractal Design chassis. And I see yours is all together; you don't have a case or anything on it, but it's all put together." "Well, for this first build, I built it on this BC1 Mini benchtop ITX frame, and I wanted to build it here because I wanted to be able to do everything to it and kind of experiment on it. Right now there's a SATA board and a Coral TPU and stuff, but I'm going to do some other things before I put it into my rack." "I kind of want to guess yours is maybe a little bit bigger?" "So this is just kind of a pretty standard cluster-in-a-box these days. So here what we have is an AMD Threadripper Pro, a 64-core processor. There are eight DIMM modules, of course, for that
processor, with 64 gigs each, so we have half a terabyte of memory. And then over here, these are actually the Arm nodes, because we said we have at least four Arm nodes. There are seven slots, so I figured, well, why don't we just do seven? These are actually the Mellanox BlueField-2 DPUs, and they're the VPI cards, so I can run either InfiniBand or I can run Ethernet. There are two 100-gig ports on each, and both the host Threadripper system and the BlueField-2 DPU (each little Arm core complex) can actually access both of those, so we have about 1.4 terabits per second of networking built into this. And then in terms of each card, you get a total of eight 2-gigahertz Arm Cortex-A72 cores, you get 16 gigabytes of memory per card, and then you also get 64 gigabytes of storage. So they're all running Ubuntu right now, just to kind of make it easy, as is the main Threadripper system. And then of course, because this is a really kind of cool ASUS board, we actually have another Arm processor on here as well, which is the ASPEED BMC, so you have another thing that provides all the out-of-band management features that you'd want." "So you mentioned that you're getting terabits of bandwidth through their network connections?" "Yeah, this is about 1.4 terabits per second, because you have seven cards, each with... you can spin this. So yeah, see, each card is a PCIe Gen 4 x16 card, so they actually have 200-gig ports each, because you can run 200-gig networking off of a PCIe Gen 4 x16 slot." "Well, I'm proud to say that this has two gigabit Ethernet ports, so we can get a whole gigabit of network connectivity out of this. But I mean, I think that brings up an interesting point here..." "Oh, Jeff, there's also two 10-gig ports here, and then an out-of-band one-gig management port as well." "I think that brings up a good point here, though. Like, the Turing Pi 2 is not meant to be some sort
of I/O monster, and the Raspberry Pi only has PCIe Gen 2. And that one is not built for, necessarily, edge computing, where you're power limited, or you want wireless connectivity, and you might be limited to under a gigabit of total bandwidth." "Right, so these are of course actually supposed to be used in servers, but I just kind of wanted to show that you could go build a cluster in a single box using this. And you know, we have 56 Arm cores, you have 112 gigs of RAM just on the Arm side, and I think a little over 440 gigs of storage just on the Arm side of this. And then we also have, you know, the Threadripper Pro, so we have another 64 cores, another half terabyte of memory, and a couple terabytes of SSD storage as well." "I'm also interested to know, do you have a ballpark estimate of how much that whole system, with everything, would cost?" "I'm gonna guess a little bit more than that system. But on the other hand, I can't get one of those Turing Pi boards, so... priceless." "You basically have a priceless system. This is one that you can just order." So Patrick's cluster might be a little more beefy than mine. It has hundreds of cores, terabits of I/O, and PCI Express Gen 4; of course, it has the price tag and power requirements to match. But that's not the point. Both of our builds approach the same problem: building a cluster of Arm-based computers. The end results are radically different, but the cool thing is, maybe you're just starting out with clustering, or maybe you have a limited budget or limited power. The Turing Pi 2 is great for that. Not everyone needs 100 CPU cores and 1.4 terabits of bandwidth, and those who do usually have a budget to match, plus a much more realistic chance of getting near the top 500 supercomputers. But even without that muscle, the Turing Pi 2 offers a lot, and coming in around 200 bucks for the board itself and 10 bucks for the Compute Module 4 adapter cards, it's not too expensive. It should be released early in 2022, and the
board I'm testing is a late prototype, not the exact final version. I'll be covering more aspects of the Turing Pi 2, like rack mounting it in this mini ITX enclosure from MyElectronics, so make sure you subscribe. Until next time, I'm Jeff Geerling. [Bloopers] "I need to think... PC power... p-power supply... that's it, power supply!" "You got it." "This is like a Vengeance-quality STH Yeti right here. I don't think they're selling this yet." "Each of these heat sinks comes with its own little screwdriver, so we'll see how many screwdrivers I end up with at the end of this." "I believe you're supposed to try to get as many of your finger oils on these things as you can." "Now it's stuck. Oh, that's weird."
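For reference, the aggregate Arm-side totals Patrick quotes in the comparison are straightforward to reconstruct from the per-card specs he gives (seven BlueField-2 cards, each with eight Cortex-A72 cores, 16 GB of RAM, and one 200 Gbit/s-capable Gen 4 x16 link); a quick check of that arithmetic:

```python
# Reconstructing the quoted totals for the BlueField-2 side of Patrick's build.
cards = 7
cores = cards * 8        # 8 Arm Cortex-A72 cores per DPU
ram_gb = cards * 16      # 16 GB of memory per DPU
net_gbps = cards * 200   # one 200 Gbit/s-capable port per Gen 4 x16 card

print(f"{cores} cores, {ram_gb} GB RAM, {net_gbps / 1000} Tbit/s aggregate")
```

These match the 56 cores, 112 GB, and ~1.4 Tbit/s figures from the conversation.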
Info
Channel: Jeff Geerling
Views: 359,895
Rating: 4.9184737 out of 5
Keywords: raspberry pi, turing pi, turing pi 2, v2, cluster, k3s, kubernetes, edge, iot, supercomputer, supercomputing, serve the home, homelab, sc21, st. louis, arch, saint louis, stl, build, mini itx, atx, power supply, corsair, sf600, nodes, controller, setup, self-hosted, hosting, router, platform, compute, computing, computer, pci, pci express, pcie, hpl, xhpl, linpack, top500, green500, efficiency, efficient, power, budget, gigaflops, gflops, tflops, teraflops, list
Id: IUPYpZBfsMU
Length: 22min 35sec (1355 seconds)
Published: Wed Dec 01 2021