Accelerating FreeNAS to 10G with Intel Optane 900P

Reddit Comments

It doesn't make any sense to use Optane for the workload he's describing. It's a read cache of mostly static content for a file server, so something like a 960 EVO would be perfectly fine for this; hell, he could spend the same amount and buy a couple of 960 EVOs and bundle them together for even greater performance than what he's getting from the Optane drive.

👍 15 · u/Thotaz · Nov 11 2017
Captions
Hi, this is Jon Bach, president of Puget Systems. In this video we'll be taking a look at using SSDs, specifically Intel Optane SSDs, to accelerate network-attached storage, in this case with the FreeNAS platform.

We do a lot of testing on the workstations that we build here, and every time we test one of these workstations we have to copy over dozens of gigabytes of benchmark tools, utilities, and software. That's for every single PC that goes through our production line, so that's a lot of data to move around, and we use network-attached storage for it, on a server in our server room that we put in place years ago. It's been there a few years, and it's time we gave that system a shot in the arm: we want to speed it up and be more efficient with those tools. Recently we wired up all of our install stations, where we image operating systems and do all of our testing, with 10-gigabit network lines, and every PC out on those benches gets a 10-gigabit line. But of course all that networking speed is only good if our storage server can keep up and actually serve out the data that quickly. So today I'll be looking at accelerating that network-attached storage, that FreeNAS server, with SSDs. We use the FreeNAS platform for our storage here, and in the process of accelerating it with SSDs I learned a lot.

The first thing we need to do is benchmark the server before we make any changes, so we know what our baseline is. The storage array we have now holds four Western Digital RE 4TB drives. Those are enterprise-class drives, good drives, but they are platter drives. As I copy files from the server to my local machine, I'll be showing you four things: on the bottom left is the network throughput I'm monitoring on my machine, so you can see how much I'm putting through the network; on the bottom right are stats from the server, where I'll focus on the ZFS stats to monitor cache population (that will matter more later); on the top right is a real-time view of reads and writes on the data volume, so we can see how much data is actually being pulled from the platter drives themselves; and on the top left are the file copies being done.

We've started the copy now. It's pulling large ISO files, Linux install discs, from the server, and as the copy starts you can see quite a bit of data being read from the platter drives. Looking at the Ethernet speed, I'm pulling in one gigabit, two gigabits, sometimes even three, and that's about what we'd expect from a server like this: we have four platter drives in there, each one can do about 150 megabytes per second, and if you do the math, we're pulling the speeds I would expect off all four platter drives. As this test runs I'll speed things up so we don't have to watch paint dry, but the end result is that after copying 57 GB of data off the server, I averaged about two gigabits per second, roughly one-fifth of what our network connection can do. And no matter how many times I run this test over and over again, I get the exact same two gigabits. We just installed a 10-gigabit network, so that's not fast enough.
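The panels in the video are FreeNAS GUI dashboards; none of the commands below appear on screen. As a rough sketch of how you might watch the same numbers from a shell on a FreeBSD-based FreeNAS server, assuming a pool named tank (the pool name is an assumption, not from the video):

    # Watch per-device reads/writes on the pool, refreshed every second
    zpool iostat -v tank 1

    # ARC size and hit/miss counters from the ZFS kstats
    sysctl kstat.zfs.misc.arcstats.size
    sysctl kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses

    # L2ARC counters (populated once a cache device is attached)
    sysctl kstat.zfs.misc.arcstats.l2_size
    sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses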
So next up, let's put in a SATA SSD as a cache and see how much faster we can make things. To do that, we go into the volume manager, select the storage volume we want to augment, select my one-terabyte SSD, add it as an L2ARC cache, and wait for it to be added to the array.

The way this works, now that we have an SSD cache on the array, is that every time we read data, that data is also loaded onto the SSD. If we request the same data again in the future, it's already cached on the SSD and can be served from there, without going back to the slower platter drives. That's perfect for us, because we're copying the same data over and over again for every PC that goes through our production line, so caching to an SSD should be a good boost in performance.

As I start the test again, you'll see that on the first run I'm getting pretty much identical results, averaging two to three gigabits just as before. That's what we should expect: nothing has been loaded onto the SSD yet, nothing is in the cache, so it's pulling from the platter drives just like before. But this time, as we go along, keep an eye on the bottom-right panel. You'll see two boxes: the ARC size and the ARC hit ratio. The ARC is our cache, and there are two types. The blue dot is the ARC that's always there in FreeNAS: your system memory. That cache is very fast, but we don't have much of it; I have 24 GB of memory in this server. The L2ARC, the red dot, is what we just added: our one-terabyte SATA SSD, much, much larger. What we want to see is the ARC and the L2ARC fill up with the data we're putting into the cache. Below that, the ARC hit ratio tracks how often requested data actually comes from the cache rather than having to load from the slower platter drives. We want a high hit ratio; ideally 100 percent of our data would come through either the ARC or the L2ARC.

As I fast-forward through the first pass and look at the second, you can see the cache starting to populate: the red line grows as the L2ARC fills. It's still not very exciting, not a lot changes, so let's keep fast-forwarding through another pass, and another. As we go, the cache holds more and more of the data, the hit rate climbs, and down in the bottom left our speed improves too: we've almost doubled it, approaching four gigabits. After that, no matter how many more passes I fast-forward through, we hold at about that same speed. That makes sense: how fast is the SATA SSD we're caching to? It's rated for a little over 500 megabytes per second on reads; multiply that by eight and you get four gigabits, exactly the speed we're seeing. So we are successfully caching to that SATA SSD, but the SATA speed limit just isn't going to cut it. We want ten gigabits, not four, so we have to keep working.

So let's turn to a drive that I think is perfect for this job. Intel just launched their new Optane drive, the 900P. I have a 480 GB drive, and I'll add that to our volume as an L2ARC cache; I think it's going to work really well for this.
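In the video both cache devices are attached through the FreeNAS volume manager GUI. For reference, a minimal sketch of the equivalent swap from a shell, assuming the pool is named tank, the SATA SSD shows up as ada1, and the Optane NVMe drive as nvd0 (all three names are assumptions, not from the video):

    # Detach the old SATA cache device, then attach the Optane drive as L2ARC
    zpool remove tank ada1
    zpool add tank cache nvd0

    # Confirm the cache device now appears under the pool
    zpool status tank

Cache devices hold no pool data of their own, so they can be removed and added freely without risking the array.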
We could get the same throughput from a lot of different NAND-based NVMe drives on the market, but three things draw me to this Optane drive specifically. First, it has much faster write speeds than a lot of the NAND-based drives you'll find out there, and with a faster write speed I can populate the cache a lot faster. Second, it has much more consistent performance over time, and consistent performance at low queue depths, which I think is ideal for a cache drive. Third, it has excellent endurance: as an Optane drive, I believe it's rated for ten drive writes per day (don't quote me on that), but in any case it's higher endurance than you'll find with NAND drives, which is something I really want in a cache solution. This isn't quite at a level where I need to go to enterprise-class drives, but endurance is something I care about, so that's a good fit for me.

Now, as I fast-forward through the file copies over and over again, you'll see that we are populating our L2ARC cache, our ARC hit ratio is getting better, and the Ethernet speeds in the bottom left are improving over time. But I'm having to really fast-forward through this, copying again and again; I had to copy the files about ten times before the cache was fully populated, and even then I was expecting more speed than this. We're only pulling about six gigabits on average, so it's time to investigate a little.

The problem is that in FreeNAS, or really in the ZFS filesystem in general, the default settings for the L2ARC weren't made for something as fast as this Intel Optane drive. By default it only allows eight to sixteen megabytes per second of data to go into the cache, and with a number that low it makes sense that we had to load the data over and over again, and that it took a very long time to populate the cache. There are three settings that are important to change for a setup like this.

The first is l2arc_write_max. That controls how much data per second can be written into the L2ARC cache. The default is 8 megabytes per second; in my case I raised it as high as 400 megabytes per second, so much, much higher. The key is that you need to know your cache drive: you don't want to write so much data to it that it slows down your read speeds, because the reads are what really matter here, serving data out to the clients. So make sure you choose a number you know your cache drive can sustain. I did a bunch of testing with my Intel Optane drive and found it could sustain 400 megabytes per second very easily; for other drives that might be way too high.

Next is l2arc_write_boost. This controls how much more data per second may be written to the cache drive while the cache is cold, when there's no data on the cache drive yet. In my case I raised this one from 8 megabytes per second up to 400 megabytes per second as well. This Intel drive can sustain writes of over 1,000 megabytes per second, so that was no problem, but again, know your cache drive and what it's capable of before picking a number.

Last is l2arc_noprefetch. You need to set that to 0 for a workload like ours: we do a lot of large sequential transfers, big file copies, and that is exactly what we want to have in the cache.
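On a FreeBSD-based FreeNAS build these three settings are exposed as vfs.zfs.l2arc sysctls, added through the FreeNAS UI as tunables. A sketch of the values described above, in sysctl.conf form; the numbers are the ones used in this video, not universal recommendations:

    # l2arc_write_max / l2arc_write_boost take bytes per second
    # 400 MB/s = 400 * 1024 * 1024 = 419430400 bytes
    vfs.zfs.l2arc_write_max=419430400
    vfs.zfs.l2arc_write_boost=419430400

    # 0 = also cache prefetched (sequential) reads in the L2ARC
    vfs.zfs.l2arc_noprefetch=0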
For other people, a cache drive might be less about the speed of transfers and more about freeing up operations per second for other things your storage server is being used for. So really, your mileage will vary; you need to know your own workload to know what settings are right for you. After you put these three settings in place, you do need to reboot the server for them to take effect.

With that done, let's start up the file copies again and see what we see. I'll fast-forward to the second pass, and you can see that even after just one pass the cache is filling much more quickly. By the second pass I'm already faster than I was after ten passes with the previous settings, and if I fast-forward a little more, by the third pass I'm pulling eight, even close to nine gigabits. If you look at the top right, showing reads and writes from the platter drives, it's down to tiny amounts or even zero now. That's great: it means I'm pulling all of my data from the cache.

At this point I've maxed out pretty much everything I can do here, and I'm happy. I'm still not using all of my 10-gig network, but to get all the way up to close to ten gigabits there's a lot more tuning that has to be done: TCP window sizes and buffers and jumbo frames, things that have to be set on both the server and the client. I can't really do that, because this data is copied to customer machines, every single one that goes through the production line, and we're not going to tune every customer machine to our specific network. So I'm really happy with getting close to nine gigabits on our 10-gigabit network.

I really enjoyed this process and enjoyed learning what it takes to get all this speed out of a FreeNAS server. I'm actually going to do a follow-up video on all the different possible bottlenecks that can go into a 10-gig network, because I sure learned a lot in this process. I am by no means a seasoned expert in network tuning, so if you see things in this video that you would have done differently, or things you would have recommended, leave a comment; I'd be happy to check it out and learn from your experience.
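None of that client/server network tuning is performed in the video, but as a sketch of what the server-side half might look like on a FreeBSD-based server (the interface name ix0 is an assumption):

    # Enable jumbo frames on the 10G interface; the client side must match
    ifconfig ix0 mtu 9000

    # Raise the socket buffer ceiling and TCP send/receive buffer limits
    sysctl kern.ipc.maxsockbuf=16777216
    sysctl net.inet.tcp.sendbuf_max=16777216
    sysctl net.inet.tcp.recvbuf_max=16777216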
Info
Channel: Puget Systems
Views: 36,112
Keywords: Intel, Optane, 900P, NVMe, FreeNAS, L2ARC, 10G, network, CIFS
Id: oDbGj4YJXDw
Length: 12min 55sec (775 seconds)
Published: Fri Oct 27 2017