The Petabyte Pi Project

Video Statistics and Information

Reddit Comments


This entire project is stupid but it's stupid in a 'Mythbusters strapping JATO rockets to a car' kind of way. It's about the adventure and I love it. :)

πŸ‘οΈŽ︎ 100 πŸ‘€οΈŽ︎ u/AshleyUncia πŸ“…οΈŽ︎ May 18 2022 πŸ—«︎ replies

Blog post with a couple more pictures is here, but I haven't had time to do a proper write-up (the YouTube video linked from this post has taken a considerable amount of time to the exclusion of all else ;).

But basically:

  • Storinator XL60 chassis
  • Yanked the Xeon CPU / 256 GB ECC RAM / Supermicro motherboard
  • Installed Raspberry Pi CM4 with 8 GB non-ECC RAM
  • Patched the Pi's Linux kernel so the mpt3sas driver works on the Pi
  • Connected 60x Seagate Exos X20 drives via LSI 9405W-16i HBAs
  • Formatted it many different ways; the only reliable way I could get the full 1.2 PB addressable was mdadm in linear mode (no RAID), sketched below
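(Roughly, that linear-mode assembly looks like the following; bash brace expansion stands in for the 60 device names, and this is a sketch rather than the exact commands used.)

    # Concatenate all 60 disks end-to-end -- linear mode: no striping, no redundancy
    sudo mdadm --create /dev/md0 --level=linear --raid-devices=60 \
      /dev/sd{a..z} /dev/sda{a..z} /dev/sdb{a..h}
    sudo mkfs.ext4 /dev/md0
    sudo mkdir -p /mnt/petabyte && sudo mount /dev/md0 /mnt/petabyte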

So... inherently unstable, but extremely fun project. I'll be pulling the Pi and re-installing the Xeon setup soon, and this will go into production as a ZFS-based archive vault for all my footage (I'm now hitting 100-200 GB/week, and will probably expand a bit as I do more video projects in 4K).
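(For the curious, a production ZFS layout on 60 drives might look something like the sketch below; the pool name, RAIDZ2 vdev width, and device names are illustrative guesses, not the final design.)

    # Hypothetical layout: six 10-disk RAIDZ2 vdevs, each surviving two drive failures
    sudo zpool create vault raidz2 /dev/sd{a..j}
    sudo zpool add vault raidz2 /dev/sd{k..t}
    # ...repeat for the remaining four vdevs; /dev/disk/by-id paths are safer in practice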

Now I'll need to figure out how to incorporate a petabyte into my backup plan!

Huge thanks to 45Drivesβ€”I contacted them last year and through a long process, we worked out the best way to do this, and they provided this hardware so I could do this video and then also continue to use it as my production archive storage.

πŸ‘οΈŽ︎ 231 πŸ‘€οΈŽ︎ u/geerlingguy πŸ“…οΈŽ︎ May 18 2022 πŸ—«︎ replies

Just when you think you're smart, you watch a video that begins with "I worked with Broadcom engineers to patch the Linux kernel"

πŸ‘οΈŽ︎ 144 πŸ‘€οΈŽ︎ u/bigdon199 πŸ“…οΈŽ︎ May 18 2022 πŸ—«︎ replies

One thing I would have done differently: plug everything in the way they shipped it and make sure all the drives and the system work as expected before switching to the Pi.
You never know how shipping has affected the system. :)

πŸ‘οΈŽ︎ 17 πŸ‘€οΈŽ︎ u/Liwanu πŸ“…οΈŽ︎ May 18 2022 πŸ—«︎ replies

Now do it with SSDs. It'll only cost $2m.

πŸ‘οΈŽ︎ 19 πŸ‘€οΈŽ︎ u/ShowerVagina πŸ“…οΈŽ︎ May 18 2022 πŸ—«︎ replies

I'm kinda curious about the power usage here. The power supply is beefy, but what's the power draw at idle or all drives seeking at full tilt?

πŸ‘οΈŽ︎ 15 πŸ‘€οΈŽ︎ u/slatsandflaps πŸ“…οΈŽ︎ May 18 2022 πŸ—«︎ replies

I really hope the Compute Module 5 will have x16 PCIe.

πŸ‘οΈŽ︎ 6 πŸ‘€οΈŽ︎ u/TheFuzzball πŸ“…οΈŽ︎ May 18 2022 πŸ—«︎ replies

Hol up

πŸ‘οΈŽ︎ 7 πŸ‘€οΈŽ︎ u/vijaysingh94 πŸ“…οΈŽ︎ May 18 2022 πŸ—«︎ replies
Captions
This is 1.2 petabytes of hard drives, and this is a Raspberry Pi. Using this giant server, I'm going to plug all 60 hard drives into this one Raspberry Pi and build the Pi-tabyte, or a Pie-tabyte, or Peta-Pi. Let me know in the comments what you want to call this thing; for now I'm calling it the Petabyte Pi Project. A lot of people said this will never work, but there's only one way to find out.

To get to today, I had to solve a lot of problems. The chip in the Raspberry Pi was never meant for this kind of thing. It only has a tiny bit of bandwidth, and the thing it uses to communicate with the hard drives, its PCI Express bus, might not even be able to work with all the drives. Last year I worked with Broadcom engineers to patch the Linux kernel to get 16 drives working, but getting 60? Well, that adds another layer of complexity. And what about power and space for the drives? Luckily, I have the Storinator for that.

You might notice I'm in the workshop. I couldn't even fit this thing in my office where I normally record, so I had to boot out Red Shirt Jeff and film here. But since I felt sorry for kicking him out, I made sure 45Drives threw him a bone with the front panel design. I think he approves.

Once we get all the drives working (if we get all the drives working), we'll also have to deal with the Pi's networking bottlenecks. I guess I should make it obvious right away: don't try this at home. Not that many people have 60 hard drives and a Storinator sitting around, but there's a reason 45Drives doesn't sell a Storinator Pi. It might work, but it's gonna bottleneck.

Before we get into hardware, I need to put into perspective how massive a petabyte is. I could download all of Wikipedia on here 50,000 times. My NAS, the one I loaded up with eight-terabyte hard drives, is 40 times smaller than this petabyte of storage. Of course, it's going to be a while before normal home users, or heck, even S-tier data hoarders, deal with petabytes. Just having somewhere to plug in all the drives is hard, and that's why I called up 45Drives. They build a range of storage servers, and their biggest one is this one, the Storinator XL60, and it is massive. I mean, look at this thing. Full disclosure: somehow I convinced 45Drives to send me the server and all these hard drives. They told me they want to see what wild and crazy things I can do with it, but I had to promise the insurance company I wouldn't let Red Shirt Jeff near it.

I was thinking about unboxing it on camera, but FedEx already did. Luckily, nothing was damaged, in spite of the box completely missing one of its corners. If I pop the top cover off, you can see slots for 60 hard drives, and if you just pop the drives in, you don't even need a tray to hold them. Looking closely inside, it looks like they even have some 3D-printed parts in here. After I pop off the back cover, you can see a 26-core Xeon CPU, a Supermicro server motherboard with seven PCIe slots, a dual 10-gigabit Ethernet card, and 256 gigs of RAM. And we're gonna rip it all out. Sorry to whoever cable-managed this thing; it looks great right now, but it's gonna get a little messy.

First, I took out the boot SSDs, after making sure one had an identifying mark from the factory. Then, when I pulled the network card, I found a nice surprise. One thing I do like: it looks like all the screws in this chassis are the same thread, so I can mix and match them. Standards, it's nice to have them. Phillips heads, it's nice to have those too. If Apple built this thing, there'd probably be five different kinds of screws, and half of them would be pentalobes.
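(For reference, enabling that driver in a custom Pi kernel build goes roughly like this; the repository branch, defconfig target, and cross-compile toolchain below are assumptions, and the actual step-by-step instructions live in Jeff's GitHub repo.)

    # Fetch the Raspberry Pi kernel sources and enable the Broadcom/LSI HBA driver
    git clone --depth=1 https://github.com/raspberrypi/linux.git
    cd linux
    make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- bcm2711_defconfig
    ./scripts/config --enable CONFIG_SCSI_MPT3SAS
    # (the bus-address patch would be applied here before building)
    make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j"$(nproc)" Image modules dtbs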
I unplugged the ATX and CPU power connections, plus a little purple connector that I have no idea what it's for. Then I went to unplug all the HBA drive connections and realized they were in a weird order: it looks like they use C, A, B, D for the order of these cards, which is not as intuitive as you'd hope. It's probably important for a Storinator, since 45Drives has a fancy dashboard that can show where the drives are physically located in the system, but the ordering won't matter on the Pi.

Look at that, even the motherboard screws are the same. How nice. And it's especially nice because there are so many screws here; server motherboards seem to have more screws in them than standard ones. Captain Obvious there, but it is interesting: a lot of PC builds I've seen only have like six screws holding the motherboard, and there can be an awful lot of board flex. Protocase, the division that actually stamps out these cases, embeds studs in every position, so there are nine screws total. Oh, somewhere, someone's screaming at me to use my static protection: "Where's your anti-static wristband? You don't even ground the anti-static mat you're using!"

So instead of all that enterprise-grade hardware, we're going with this. This Raspberry Pi Compute Module 4 has a four-core Arm CPU, one PCI Express lane, one gigabit Ethernet port, and a paltry eight gigabytes of RAM. I'm going to put it on this IO board, then use this PCI Express switch board to plug in four of these. This is an enterprise-grade RAID card, and what's funny is this one is actually a little newer than the ones that came with the Storinator. Some people told me I should go the easy route: instead of using four of these HBAs, I could use a thing called a SAS expander and plug all the drives through that. But the Storinator's custom backplane boards down here, the ones that all the drives plug into, might not work that way. Plus, I wanted to see if a Pi could handle enterprise-grade storage, and if you want the best performance, you're going to use four of those fancy RAID cards, not a bunch of expanders. Of course, if you're only interested in performance, go check out Linus's video on a petabyte of flash.

I 3D-printed an ATX adapter plate for the IO board and drilled out the ATX mounting holes to a quarter inch so they'd slip over the Storinator's motherboard standoffs. That plate lets me mount the Pi so all the IO goes out where a normal motherboard's IO is. I also 3D-printed an IO shield, but because of a little threaded screw that was stuck in the case, the IO board stands a little proud and the IO shield won't fit, so I just tossed it. More airflow, right?

But when I went to put in the PCI Express switch, I realized the thing is actually pretty big, with the wide spacing between the slots. This will be interesting: I'm literally going to have it on top of the Pi, but that's okay; I grabbed a little piece of cardboard to insulate the boards from each other, at least temporarily. I also had to get a Molex power connection to each slot on the switch board, and that means the power supply cable management had to go. All right, we're going to snip these. The power supply had two Molex power connectors, but that was it, so it didn't have enough power connections. I hoped I had an adapter, and it looks like I have a couple, so I'm going to go SATA to Molex for two of these, since that's what this board needs. I finished wiring up the power, then plugged in the USB 3 cable adapter that goes from the Pi's x1 slot to the switch board's input.
As long as it doesn't blow up, we'll be good. And indeed, after I got that sorted, I started plugging all the backplanes into the HBAs. The fun thing is, for the Pi, the order doesn't matter too much; what matters is getting them plugged in. I tried to keep them organized, but if they aren't, it's not the end of the world. In the end I didn't really get them organized; I just made sure they all clicked into one of the free ports. No click, but it's in. Well, maybe except for that one.

The next thing I needed was a power switch to turn everything on, because the IO board doesn't have an ATX power header. To turn on the power supply, I could either jump the right pins on its ATX header or use a fancy switch like this one. Unfortunately, the leads on the switch are soldered, so I couldn't get it to slide into the power button hole; I'll just have to leave it dangling. Oh well. The last thing left was to install the actual Pi: the Compute Module 4 goes here. I think that's everything.

I should make it clear: before I started this build, I did some testing at my desk with four HBAs and one hard drive plugged into each one. And if you think that looks messy, just think what it would look like if I plugged in 60 drives that way. I did that as a drive test, to make sure the drives would at least power up, be recognized, and work together in a RAID 0 array. With that setup I got about 416 megabytes per second, so I think this will work. But like I said, the only way to know is to test it at scale.
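(That kind of four-drive smoke test boils down to something like this; the device names, mount point, and benchmark command are assumptions, not Jeff's exact script.)

    # Stripe one drive per HBA into a RAID 0 array -- placeholder device names
    sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
    sudo mkfs.ext4 /dev/md0
    sudo mkdir -p /mnt/test && sudo mount /dev/md0 /mnt/test
    # Simple sequential write test; oflag=direct bypasses the page cache
    sudo dd if=/dev/zero of=/mnt/test/bench bs=1M count=4096 oflag=direct status=progress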
All right, here we go. This is the fun part. Well, it's fun, but the problem is, to move this, I'll need to take them all back out again. Now, the hard drives 45Drives sent are Seagate Exos X20 drives. You might think about jamming any old drive into a system like this, but you shouldn't, even if you're using a Pi. The performance and warranty are typically better on enterprise drives, but that's not usually what matters most for someone building massive storage arrays. The two most important features are reliability and how many drive bays are supported.

These drives have a 550-terabyte-per-year workload rating and a 2.5-million-hour MTBF, assuming they don't fail right away. They only have a 5-year warranty, though, so don't focus too much on the MTBF; in fact, Seagate says they focus more on AFR, or annualized failure rate, instead. I love reading Backblaze's hard drive reliability reports, and sure, Seagate isn't always on top, but most of the models are pretty reliable. But even if we had the worst drive ever, that's the whole point of a system like this: there are 60 drives in here, and we should set them up with redundancy, because with hard drives it's not a question of if one fails, but when. That's why I'm going to need to work out a new plan for backing up a whole petabyte: even if all this expensive hardware works great, I still need a 3-2-1 backup if I want my data to be safe. Check out my video on that from last year.

The second important thing is how many drive bays are supported. Not every hard drive can run inside a Storinator. If you look at the specs on a desktop drive, they don't even say how many drive bays are supported, because desktop drives just aren't built to handle the vibrations you get in a server like this. Even IronWolf Pro drives are only rated for up to 24 bays per server, so they wouldn't be a good choice. The Exos drives? You can put in as many as you want; there's no limit, except for maybe the size of your rack. And as an added bonus, these enterprise drives are like party balloons: they're filled with helium instead of air. They're still pretty heavy, though, but the helium reduces friction and heat with so many drive platters spinning around inside. So when you build your first data hoarder setup, make sure you choose the right hard drives; at petabyte scale, it actually matters a lot.

But enough about hard drives; they're all installed, we tidied up the back of the system a little bit at least, and it's time to see if this thing boots. I'm going to boot Raspberry Pi OS on here, and I've already applied the patch you see on screen right now to the open source mpt3sas Linux driver that makes these HBAs work on the Pi. I have instructions for how to do that on GitHub, in a link below. You might have noticed I had to swap out the fancy power button from 45Drives for this little power switch: since the Compute Module 4 IO board doesn't have an ATX power input, I have to use a separate switch to turn on the power supply. But that's wired up, the drives are in, and the Pi is ready. I hope. Let's see what happens.

That's pretty loud, but I guess you kind of need this much airflow when the hard drives by themselves can eat up to 600 watts of power. I guess the redundant power supply works, and it's a good thing both of these tiny power supplies can do, let's see, 1200 watts. I was also worried about the startup surge when all these hard drives boot at the same time. I know the fancy RAID cards I'm using are supposed to stagger the hard drive spin-up, but I wasn't sure whether it would work with this setup. It looks like it did, because I could hear the different drives spinning up at different times.

All right, we'll give it a few minutes to boot, and hop over to the terminal to see what happened. First, I made sure I could see all the HBAs using lspci. I also checked that the system was using mpt3sas, the patched driver I had to compile into Pi OS's kernel, and the system log showed it had initialized a bunch of disks. But when I checked how many drives the system could see with lsblk, it was only showing 45 drives. I mean, that'd be great for a sponsor segue, but I really wanted to see all 60.
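(Those checks amount to a few one-liners; the exact flags and grep patterns here are an illustrative sketch.)

    lspci                                  # all four HBAs should enumerate through the PCIe switch
    lsmod | grep mpt3sas                   # confirm the patched driver is actually loaded
    dmesg | grep -i mpt3sas                # watch the controllers initialize their disks
    lsblk -d -o NAME,TYPE | grep -c disk   # count drives the kernel sees (should be 60)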
So I shut down the server and noticed all the HBAs still had blinking green lights, so they were all at least getting power. That one wasn't all the way in. Oh, look at this: the card's not all the way in. Yeah, it came out as I plugged all the devices in. Let's try this again. This time I watched as all the disks were getting initialized, and it was fun listening closely to hear the different sets of hard drives spinning up. You can hear all the drives slowly spinning up. Oh, look at all these! This time, when I ran lsblk, it showed me all 60 hard drives. I also checked the model for all the drives, and it's showing the right model number. So the next step was to put the drives into an array and see how they perform. Because I'm not a masochist, I pulled out my Mac, plugged the Pi into my network, and started managing it over SSH. I grabbed the IP and tried logging in, then realized I never put my MacBook's SSH key on the Pi, so I headed back to the office and did the rest of my testing from there.

First I tried RAID 0, which spreads data out over all 60 drives. This is a terrible idea for redundancy, because if any one drive fails, all your data goes poof. But it's also the simplest and fastest RAID level, and it doesn't require any special data calculations to run on the Pi's CPU, which just doesn't have the horsepower for fancy things like parity on a 60-drive array. But I kept getting failures during the last part of the formatting process; it kept dying with "input/output error while writing out and closing file system". Checking the system log, I saw a bunch of buffer I/O errors, and I also noticed some fault state messages with error codes like 0x2623 or 0x5854. I also rebooted a few times while debugging, and sometimes not all the disks would show up; whenever that happened, the last message would be something like "start watchdog error". So sometimes during boot the driver would just die for no particular reason, but about half the time, at least, all the drives would show up. I also noticed mdadm would report the array as broken after formatting failed, and at that point one or two hard drives would just be gone until I rebooted. If I couldn't format the RAID 0 array, I couldn't mount it either, so that was a dead end.

At this point I was wondering if maybe the vibration when all the drives started writing could be an issue. I mean, it shouldn't be, but exploring that idea: I actually have a seismograph, my Raspberry Shake, running in my basement, and sure enough, when I turn on the Storinator, I can see the fan noise up in this band, and whenever a group of drives spins up, it registers pretty clearly. And I remember, around a decade ago, there was this masterpiece of a video: don't shout at your JBODs, they don't like it! But no, these Exos drives are built to handle the vibration in a chassis like the Storinator, and I'm not actually yelling at them, so I don't think that's it. Vibration is definitely something to think about, but I don't think it's my main issue.

So at this point I switched tracks and installed ZFS. I've actually tested it on the Pi before, and it usually runs faster than standard RAID on the Pi. But then I remembered you have to have the kernel sources to install ZFS, and I didn't, because of the custom kernel I built, and getting that set up would take a bit of time. So I moved on to plan C, which was Btrfs. Btrfs is already enabled in Pi OS, so I just installed the extra package to manage volumes and used mkfs to create a Btrfs RAID 0 file system. And it worked: I got a 1.07-pebibyte, or 1.2-petabyte, RAID 0 storage volume.
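(Creating a volume like that is a single mkfs invocation; the brace-expanded device names below are placeholders for the 60 disks.)

    # Stripe data across all 60 disks (RAID 0); metadata defaults to raid1 on multi-device volumes
    sudo mkfs.btrfs -f -d raid0 /dev/sd{a..z} /dev/sda{a..z} /dev/sdb{a..h}
    # Mounting any one member device mounts the whole multi-device volume
    sudo mkdir -p /mnt/petabyte && sudo mount /dev/sda /mnt/petabyte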
I mounted it and ran my benchmark script, but it was a little disappointing. Earlier, with one drive on each HBA, I could get around 416 megabytes per second; now, with all 60 drives in Btrfs RAID 0, I was only getting 213 megabytes per second, almost half the performance, and random writes were less than 20 megabytes per second. But the end goal is to see how well this thing performs on the network, so I installed Samba, connected from my Mac, and copied over 70 gigabytes of video files. At first it was getting over 100 megabytes per second, but after a few seconds it dropped to 30, then nothing, then 30 again. I used atop to monitor the Pi, and there were a few spikes in CPU usage, but after copying about 3.5 gigabytes, the copy just stopped completely and failed. Btrfs said some devices were missing. That's oddly similar to the problem I had with md RAID 0 earlier, and the system log showed a similar error too: that fault state with the 0x5854 code.

At this point I shot an email off to Broadcom about it, but in the meantime I also remembered Btrfs has a single mode, which puts the drives together but not in RAID: instead of splitting writes up across all the drives, it stores complete files on different drives. So I set up a single-mode Btrfs volume and did the network copy again, and the same thing happened: error messages in the log, some devices missing in Btrfs, and looking at the array, one of the drives, device 23, had just poofed out of existence.

So I went for a hail mary: back to the simpler Linux RAID setup using mdadm, this time in linear mode, which is similar to Btrfs's single layout, with the drives set up sequentially instead of all together in a stripe. It was interesting seeing each drive get formatted one at a time, like here, when it was formatting sdaj: where RAID 0 had failed, this wrote to each drive in sequence instead of all at once, and that seems to have made it work. I mounted the drive, and of course, since this array is so big and the Pi is so slow, it took six minutes just to mount. But it worked, so I ran my benchmark on it. The benchmark was better, getting 250 megabytes per second, but still nowhere near the performance I got with just four hard drives; that's because of the way a linear array works, it's really only benchmarking one drive at a time. So I did another network copy, expecting it to fail, but no, it actually worked this time. There were a few times when the copy slowed down to 30 megabytes per second, but overall it averaged 72. That means this storage setup can't even max out a one-gigabit network, but hey, at least the copy did finish: 74.31 gigabytes in 17 minutes and 8 seconds. Extrapolating that out to 1.2 petabytes, it would take 192 days to fill, and that's kind of the limit, at least with all 60 drives.
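(That extrapolation holds up as rough arithmetic; a quick sanity check, not anything from the video.)

    # 74.31 GB in 17m08s (1028 s) -> sustained throughput in MB/s
    echo "scale=1; 74.31 * 1000 / 1028" | bc    # ~72 MB/s
    # Days to fill 1.2 PB (1,200,000,000 MB) at ~72 MB/s
    echo "1200000000 / 72 / 86400" | bc         # ~192 days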
At this point I talked to the Broadcom engineers. Now, I should mention that none of this setup is a supported configuration: we're using a Pi in ways it was never meant to be used, and we're putting enterprise storage cards in a situation they were never made for. But the engineers did have a couple of ideas. First, the firmware on the cards was version 5, but the latest version is 22, so a firmware upgrade could help a little. When I tried it, though, it failed; it looks like that has to be done on a machine with an x86 processor, and the Pi's little Arm CPU couldn't cope. Second, they looked up the errors I was getting, and that 0x2623 error apparently means the driver detected data corruption during a storage operation. Seeing as I have a bit of a hacked-together solution, I wouldn't doubt it. My best guess is that when the PCI Express bus is sending tons of data to and from all 60 hard drives through all four cards at the same time, something is flaky. It might be the power delivery to one of the cards, or one of the connections on the PCI Express switch, or maybe even that USB 3 adapter cable that goes from the Pi to the switch. I don't know.

So yeah, we got 1.2 petabytes on the Pi. It worked, at least with no redundancy and no speed-up. But would I recommend this setup? No, not at all. I took the Pi to the bleeding edge, and it started bleeding out. There's a reason 45Drives uses a Xeon CPU and a server motherboard in their servers: they can handle 60 drives easily, and even do it with 10-gigabit networking, or more than that. The Pi couldn't even saturate a one-gigabit connection. But heck, if you have an old storage server sitting around and a ton of hard drives, you could run it on a Pi; just don't expect the experience to be pain-free.

Make sure you subscribe. I'm going to follow my own advice and swap back to the 45Drives hardware. I'm also going to review this thing against some other enterprise options, like the 100-drive server Patrick reviewed on ServeTheHome, and see if I can mount this massive 300-pound server in my new rack. Oh, and if you want to see me build a new rack, head over to Geerling Engineering, where my dad and I just posted a new video about it. Until then, I'm Jeff Geerling.
Info
Channel: Jeff Geerling
Views: 1,524,740
Keywords: raspberry pi, compute module, cm4, hba, raid, hardware, nas, storinator, xl60, 45drives, storage, hard drive, hdd, ssd, speed, performance, slow, network, attached, seagate, x20, exos, backblaze, pod, rackmount, rack, server, houston, motherboard, xeon, removal, replace, upgrade, downgrade, sbc, rpi, pi os, sas, sata, nvme, pci, pci express, pcie, switch, 10g, fiber, benchmark, 60, massive, heavy, power, spin up, vibration, shake, earthquake, yell, juggling, red shirt jeff
Id: BBnomwpF_uY
Length: 22min 26sec (1346 seconds)
Published: Wed May 18 2022