Comprehensive Guide to pfSense 2.3 Part 9: Traffic Shaper

Howdy folks, and welcome to part 9 of my comprehensive guide to pfSense 2.3. In this video I'm going to be going over the traffic shaper.

To give you a bit of an overview of what you'll see in this video: first of all I want to go over what traffic shaping is, and I'm going to go over some broad traffic shaping theory, because I think it's good to know and understand what's going on under the hood. I'm going to go over limiters, because they are the simplest form of traffic shaping you can apply. I'm also going to go over the ALTQ schedulers, and I'm going to go in depth into service curves, because they are required by one of the schedulers. And of course I'm going to go over configuring limiters in the web UI, as well as using the traffic shaping wizard and manually adding custom shaping rules in the web UI.

So we need to define what traffic shaping really is. Ultimately, traffic shaping is the act of controlling the bandwidth or latency of one or more traffic flows. This is usually done for quality-of-service control, but you can also perform traffic shaping to simulate network conditions. The guiding principle we're going to be following is that we want to delay less time-sensitive traffic to improve the overall performance of time-sensitive traffic.

We can control latency quite easily by adding a simple first-in, first-out queue, or buffer: packets are temporarily stored in a queue and then released to their destination at some later time. In a normal network we generally want to minimize this queuing delay, and in fact a theoretical ideal network doesn't have queues at all, but in practice that's not possible; we would need infinite network bandwidth. We can use queues in a couple of different ways. First, we can delay packets until the network is less congested: we hold on to packets while the network is busy and release them later, which allows more important traffic to pass by. We can also use queues to simulate longer network paths, which can be quite useful if you're testing an application or a service. Whether you're a developer or simply testing something, if you need to simulate what happens when you're connecting to a server on the other side of the world, you can reproduce that propagation delay by adding a queue with a delay.

Bandwidth control is a little more complicated, and we need to start by classifying bandwidth somehow. Normally we would describe bandwidth as a data rate: something like bits per unit time, or maybe packets per unit time. I differentiate these because packets are not always the same size, so they're not always directly comparable units. But you can't really use a single data rate to describe bandwidth; it's more complicated than that. As an example, take two different rates: one packet per second versus 60 packets per minute. They sound the same, but they're not; they have the same average rate but very different characteristics. So we ultimately need more than one data point to describe bandwidth.

When we're controlling bandwidth we want to consider a few different parameters. The first is the average data rate: the long-term data rate, what I'll call the "sticker" bandwidth, which is what we usually mean when we say bandwidth. But we also have the burst rate, the maximum rate achievable at any particular instant in time, which is usually higher than the average data rate. When we have bursts, we also have to look at how much data can be in a burst, which we call the burst length, and how often bursts can occur, which is the burst frequency.

I want to make it very clear that bursts do not violate the average data rate. A lot of people think bursts are over and above the average data rate, but that's not really true, and a quick example will explain why. Imagine we have two different traffic flows: flow A, where one packet arrives every second, and flow B, where packets arrive in groups. Both flows have an average rate of one packet per second, but flow B has a burst rate of four packets per second. The average rate is conserved even though the burst rate exceeds it; only the way the packets are arranged in time changes.
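To make that difference concrete, here's a minimal Python sketch (the timestamps and the window are hypothetical) that computes the average and burst rates of two flows arranged like A and B:

```python
# Hypothetical packet arrival timestamps (seconds) over an 8-second window.
flow_a = [0, 1, 2, 3, 4, 5, 6, 7]                      # steady: 1 packet/s
flow_b = [0.0, 0.25, 0.5, 0.75, 4.0, 4.25, 4.5, 4.75]  # two bursts of 4 packets

WINDOW = 8.0  # observation window, seconds

def average_rate(timestamps):
    """Long-term rate: packets per second over the whole window."""
    return len(timestamps) / WINDOW

def burst_rate(timestamps):
    """Peak instantaneous rate: inverse of the smallest inter-packet gap."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return 1 / min(gaps)

print(average_rate(flow_a), burst_rate(flow_a))  # 1.0 and 1.0 packets/s
print(average_rate(flow_b), burst_rate(flow_b))  # 1.0 and 4.0 packets/s
```

Both flows average one packet per second; only flow B's arrangement in time produces a burst rate of four.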
So I want you to think for a second about how we would actually design something that implements bandwidth control taking these parameters into consideration, and that brings me to a little aside on a very common method for doing bandwidth control, known as the token bucket regulator.

In a token bucket regulator we have a bucket and a FIFO queue, we plumb them together, we add a token factory, and bingo: we have a token bucket regulator. Let me explain how this works; it's a very common construct in networking, and token bucket regulators find other uses too, which I'll get to in a moment. First of all, tokens are added to the bucket at the average data rate. If, for example, we wanted our average data rate to be one token per second, we would add tokens to the bucket at a rate of one token per second, assuming the bucket isn't full; if the bucket is full, we don't add anything. The bucket size corresponds to the burst length, so in this trivial example we'll say the bucket can only hold two tokens, and it's currently full.

The basic principle of operation is that packets come in from the network and are placed in the queue. The packet at the front of the queue requires a token in order to be forwarded out to the network. In this case a token is available, so the token is taken from the bucket and destroyed, and the packet is forwarded out. As a result, we can burst at any given moment: if there are two items in the queue, and the bucket holds two tokens, we can burst two packets out of the regulator. However, if packets are constantly coming into the queue, we can only release them at the average rate of one per second, because that's the rate at which tokens are added to the bucket. So the average rate is always conserved, because the token factory always produces tokens at the average rate, but you can burst thanks to the tokens stored in the bucket. It's a very simple regulator, and it's used very commonly in networking.
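Here's a minimal sketch of that mechanism in Python — not dummynet's actual implementation, just the concept: a token factory refilling at the average rate, a bucket whose depth is the burst length, and a FIFO of waiting packets:

```python
import collections

class TokenBucketRegulator:
    """Toy token bucket: tokens refill at the average rate; the bucket
    depth bounds the burst length (one token = one packet here)."""

    def __init__(self, rate, depth):
        self.rate = rate              # tokens added per second (average rate)
        self.depth = depth            # bucket capacity (burst length)
        self.tokens = depth           # start with a full bucket
        self.queue = collections.deque()
        self.last = 0.0               # time of the last refill

    def enqueue(self, packet):
        self.queue.append(packet)

    def release(self, now):
        """Refill for elapsed time, then forward one packet per token."""
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        sent = []
        while self.queue and self.tokens >= 1:
            self.tokens -= 1
            sent.append(self.queue.popleft())
        return sent

tbr = TokenBucketRegulator(rate=1, depth=2)
for p in ("p1", "p2", "p3"):
    tbr.enqueue(p)
print(tbr.release(now=0.0))  # ['p1', 'p2'] -- a burst of two (full bucket)
print(tbr.release(now=1.0))  # ['p3']       -- then only the average rate
```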
One other place where token buckets are commonly used is in embedded systems. This isn't going to be applicable to a lot of people watching this, but if you ever work with a microcontroller or anything that has an interrupt service routine, you can have what's known as an interrupt storm, where some fault condition creates a massive number of interrupts, which can swamp your processor and ultimately cause your device to crash. You can use a token bucket regulator to limit the rate of interrupts coming into your system and prevent an interrupt overload. These are very simple constructs to write programmatically, but they are quite useful and they work very well. Down at the low level in FreeBSD, which pfSense is built on, this concept is definitely used; we won't be dealing with token buckets directly, but it's good to have an idea of what's going on.

With that aside out of the way, let's move back to the main focus of this video and go over limiters, because limiters are ultimately the simplest form of traffic shaping. The principle of operation is that you define one or more of what are known as pipes, and each pipe has a maximum bandwidth associated with it; a pipe cannot use more than that bandwidth. You write firewall rules which classify traffic and assign it to a particular pipe, and that way you can restrict those traffic flows to the bandwidth you've set for the pipe.

I want to make note of something right off the bat: limiters are not work-conserving. What that means is that a pipe can never exceed its bandwidth, even if there is available bandwidth on the parent connection the pipe runs through. Even if you could be sending data faster than the bandwidth you specified for the pipe, it won't happen, and this leads to a waste of your precious networking resources. Generally speaking, you're going to be doing traffic shaping or limiting on a WAN interface — your internet connection is going to be the narrowest and most expensive link you have — and you generally don't want to underutilize it. Limiters don't work terribly well in this case, because if you decide to limit some particular type of traffic, you may end up underutilizing your network. So for general quality-of-service purposes, limiters are not what you're looking for.

In pfSense, limiters are implemented using dummynet, so if you want more information about limiters, I recommend looking that up in relation to FreeBSD. One thing that's very important to note is that pipes don't discern between packet directions, which means you will need two pipes — one for inbound traffic and one for outbound traffic — to get a proper full-duplex connection. There's nothing wrong with assigning both traffic directions to the same pipe with your firewall rules, but you'll end up with a half-duplex connection in that case, which might not be what you're looking for.

Another thing worth noting is that dummynet can implement what are known as dynamic limiters: it creates pipes on the fly for individual hosts, i.e., individual IP addresses. So you can set things up such that every IP gets its own bandwidth limit, without you having to write a separate rule and add a separate pipe for each IP address. I'll show you that functionality in the web interface.
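As a rough sketch of what dynamic pipe creation amounts to (this is just the bookkeeping idea, not dummynet's code; the mask width and bandwidth are hypothetical), each packet's masked address selects — or creates — its own pipe:

```python
import ipaddress

def pipe_key(src_ip, mask_bits=32):
    """Collapse an address to its masked network; /32 means one pipe per host."""
    net = ipaddress.ip_network(f"{src_ip}/{mask_bits}", strict=False)
    return str(net.network_address)

pipes = {}
for src in ["192.168.1.10", "192.168.1.10", "192.168.1.22"]:
    key = pipe_key(src)
    # Every new key gets its own pipe with the same (hypothetical) cap.
    pipes.setdefault(key, {"bandwidth_kbps": 100, "queue": []})

print(sorted(pipes))  # ['192.168.1.10', '192.168.1.22'] -> two dynamic pipes
```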
With limiters out of the way, we now move on to the ALTQ schedulers, and this is where things get far more interesting. pfSense has five built-in options for doing traffic shaping: CoDel, FAIRQ, PRIQ, CBQ, and HFSC. CoDel is kind of the odd one out here, but I don't really have a better place to put it, so I'm going to group it in with the others, because that's how the web interface groups things. All of these are recommended for general quality-of-service tasks, and I'm going to cover them starting from the simplest and moving to the most complex, because we're going to be building our knowledge as we go. I might put timestamps in the description if you want to jump to a particular scheduler, but you may need information from a previous one for things to make sense, so jump ahead at your own risk.

I want to make a note right at the beginning that more complex is not always better. A poorly configured complex scheduler will likely perform worse than a well-configured simple scheduler, despite whatever downsides come with being simple. If you choose a complex scheduler, be prepared to tweak it a lot before it works properly; and on the same grounds, if you choose a simple scheduler, make sure you understand its weaknesses, so you know whether it really is the right choice for you.

One thing we will see a lot of is the notion of queues, so I want to make a few notes on queues before we get started. All network interfaces have a queue for outgoing packets, sometimes referred to as the transmission buffer. Normally packets are treated on a first-come, first-served basis: the first packets to arrive at your pfSense box, no matter where they come from, are the first ones added to this queue; they shuffle through it in FIFO order and are released out onto the wire. At the most fundamental level, all traffic shaping really does is use an algorithm to select the order in which packets are added to that queue, instead of doing it first-come, first-served. That is what traffic shaping does: it reorders that queue. There are a few other things we can do to the queue, as you'll see in just a moment, but generally speaking that's all we're doing.

The first thing I want to talk about is CoDel, which stands for "controlled delay". This algorithm is not really a traffic shaper like the others, but given the way it's grouped in the web interface, I'll talk about it now. Its goal is to reduce bufferbloat, which I'll explain in a moment. It's the absolute simplest thing you could possibly add to your network: there are no firewall rules to set up at all, and as a result it doesn't differentiate between types of traffic. In fact, it has only one option you can set, and that is the available bandwidth on your link. All you have to do to get it working is tell it how much bandwidth your connection has: if you've got a 50 megabit-per-second internet connection, you tell it you have a 50 megabit-per-second internet connection, and that's all you need to do. How simple is that? It can be enabled on any interface, and even if you don't watch anything else in this video, you should probably think about adding this, just because it's so simple.

Of course, it's not perfect.
It doesn't work well on slow links — for roughly anything below half a megabit per second, you should probably not turn CoDel on — and it also doesn't work well for big links. If you have a large amount of bandwidth, maybe you're a company or an organization with something like a data center, you probably shouldn't use it either; you should use something more sophisticated.

So what is bufferbloat, this thing we're trying to reduce? Bufferbloat is basically having consistently full network packet buffers on an interface. Just as I said, all interfaces have a queue for outgoing traffic; bufferbloat is when that queue, that buffer, is consistently full or near its maximum capacity. Having these large queues increases network delay for pretty much all traffic, because all the traffic has to sit in them for a longer amount of time, and it doesn't really increase overall throughput; it just causes congestion. TCP, the Transmission Control Protocol, has some simple congestion avoidance built into its algorithm, but large buffers can defeat that algorithm, and things end up much worse.

What CoDel does is drop packets in those queues to prevent bufferbloat from happening. It works by measuring the travel time of packets through a queue and discarding packets to keep the queue at a reasonable size. Basically, it measures how long a packet takes between the time it enters the queue and the time it exits, and if that time is too long, it starts to discard packets based on a mathematical model. By keeping the queue smaller, it improves the overall latency of the network. As I mentioned earlier, if the network link is slow, large packets — ones with payloads close to 1,500 bytes, or even jumbo frames — can take so long to shuffle through a queue that they cause false positives and falsely trigger packet drops. This is why I don't recommend using CoDel on slow network links: you may end up actually hindering your performance.
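A drastically simplified sketch of the idea in Python (real CoDel uses a 5 ms target with a 100 ms interval and a control law that spaces out drops; this only shows the sojourn-time measurement that drives it):

```python
TARGET = 0.005  # 5 ms: acceptable time for a packet to sit in the queue

def codel_like_dequeue(queue, now):
    """queue holds (enqueue_time, packet) pairs in FIFO order.
    Forward the first packet whose time in the queue is acceptable;
    discard packets that sat too long. (Real CoDel only begins dropping
    once the sojourn time stays above target for a whole interval.)"""
    while queue:
        enqueue_time, packet = queue.pop(0)
        sojourn = now - enqueue_time
        if sojourn <= TARGET:
            return packet   # within target: forward it
        # sojourn too long: drop this packet and examine the next
    return None

q = [(0.000, "stale"), (0.098, "fresh")]
print(codel_like_dequeue(q, now=0.100))  # 'fresh' -- 'stale' sat 100 ms, dropped
```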
The first real traffic shaper is fair queuing, and as the name implies, it's designed to give all connections fair access to a queue's bandwidth. Because this works on a connection-by-connection basis, this shaper doesn't work well for applications which create lots of connections — BitTorrent, for example. It's fair if every person or application has one or two connections, but if one application opens a hundred connections, it can swamp the link, because it now has a hundred times more bandwidth than anybody else. That is one of the big downsides of this type of scheduler. I'll also note that it's not perfectly fair, but that's by design, and I'll mention that FAIRQ is work-conserving, which means it will always transmit data if there is data to be transmitted; unlike a limiter, you won't have an idle network connection while there's data waiting to cross it.

FAIRQ is implemented by creating one or more queues on a particular interface, with firewall rules that group traffic into those queues. Every connection in each queue is then hashed and placed into a hash bucket. When I say every connection is hashed, think of the IP addresses and ports that uniquely identify a connection: that's what gets hashed, so all of the packets are split up into some fixed number of buckets. Those buckets are then serviced round-robin, meaning packets are forwarded by drawing one packet from each hash bucket in turn, which roughly translates to one packet per connection per cycle. Theoretically, then, every connection gets the same amount of service if it has the same amount of traffic. It's not perfect, because there is a finite number of buckets, so it's very possible there will be a hash collision and two connections will end up sharing the same bucket. One of those connections might have a lot of traffic and the other much less, in which case the smaller one may get swamped by the other. So it's not perfect, but it works generally okay in practice.
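Here's a minimal Python sketch of that hash-and-round-robin idea (the bucket count and flows are hypothetical, and real implementations hash more of the connection state):

```python
N_BUCKETS = 8  # real schedulers use many more; kept tiny for the demo

def bucket_for(conn):
    """conn is a (src_ip, src_port, dst_ip, dst_port) tuple identifying
    the connection; its hash picks a bucket, and collisions share one."""
    return hash(conn) % N_BUCKETS

buckets = [[] for _ in range(N_BUCKETS)]
flows = {
    ("10.0.0.2", 51515, "203.0.113.9", 443): ["a1", "a2", "a3"],
    ("10.0.0.3", 40000, "203.0.113.9", 80):  ["b1", "b2"],
}
for conn, pkts in flows.items():
    buckets[bucket_for(conn)].extend(pkts)

def round_robin(buckets):
    """Draw one packet from each non-empty bucket per cycle: roughly
    one packet per connection per turn."""
    while any(buckets):
        for b in buckets:
            if b:
                yield b.pop(0)

print(list(round_robin(buckets)))
# e.g. ['a1', 'b1', 'a2', 'b2', 'a3'] -- interleaved, unless the two
# flows happen to collide into the same bucket
```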
The next scheduler is known as priority queuing, and as the name implies, it attempts to separate traffic based on priority. The architecture is as follows: we have two levels of nested FIFO queues, and there can be no more than two levels, so it's basically a flat architecture. The first level is always known as the root queue, and this queue is set to the total bandwidth available on the link — and by "set" I mean the system administrator, a.k.a. you, tells the system what the total bandwidth on the link is, and the root queue is sized accordingly. Below the root queue you can have two or more child queues, and each of these has a priority from 0 to 7, with 7 being the highest priority.

The principle of operation of the scheduler is that packets are drawn from the queue with the highest priority; in the event that that queue is empty, the next-highest-priority queue is emptied, and so on and so forth. If you make two queues with the same priority, they get emptied in round-robin fashion, assuming traffic is available. Since traffic is always forwarded if it exists, PRIQ is work-conserving: it will always forward something if any of the queues have traffic in them.

Let's go through a graphical example. Say I have three different queues, which I'll call classes A, B, and C, all with different priorities, and of course we have our root queue, which is the parent of all of them; the three child queues flow into the root queue. What our algorithm is really doing is picking which packets from which of these child queues make their way into the root queue, and in what order. Let's add some arbitrary traffic, where each box represents a packet sitting in a queue. Because class A has a higher priority than the other two, packets are pulled from that queue first. Once that queue is empty, we start pulling from class B, because it has the next-highest priority. At this point the root queue is full — we've saturated our link bandwidth — so we can't forward anything further. But say that some time later those packets get forwarded out the link, and more packets arrive in the class A queue: we have to take those higher-priority packets first.

I think you can already see the biggest problem with this scheduler. It's known as starvation: packets in a lower-priority queue can sit potentially indefinitely in the presence of a stream of higher-priority traffic. It's not a fair scheduler, is what I'm really trying to get at, and you can have problems where low-priority traffic sees massive, massive latency.
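The starvation behavior is easy to see in a few lines of Python (a toy model of the drain order, not PRIQ's actual code):

```python
import collections

# Highest priority number wins; three classes as in the example above.
queues = {7: collections.deque(), 4: collections.deque(), 1: collections.deque()}

def dequeue():
    """Always serve the highest-priority non-empty queue."""
    for prio in sorted(queues, reverse=True):
        if queues[prio]:
            return queues[prio].popleft()
    return None

queues[1].append("low")          # one low-priority packet waits...
for i in range(3):
    queues[7].append(f"high{i}") # ...while high-priority traffic keeps arriving
    print(dequeue())             # high0, high1, high2 -- "low" never leaves
```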
The next scheduler addresses the starvation issue, and it's known as class-based queuing. This is similar to PRIQ in that we have nested FIFO queues; however, there can be more than two levels, so you can nest child queues within child queues if you want to build a more advanced scheduler. The first level is always the root queue, same as with PRIQ, and below it you can have two or more child queues, which can have child queues of their own. Similar to PRIQ, each queue has a priority from 0 to 7, with 7 being the highest. The big difference is that every queue also has a bandwidth, set up such that the sum of all the child queues' bandwidths equals that of the parent queue. So if you only have two levels, all of the child queues' bandwidths must add up to the root queue's.

The principle of operation here is that packets are drawn from the child queue with the highest priority until its bandwidth limit is reached, and this is what allows us to get rid of the starvation problem. Child queues can also borrow bandwidth from their parent if it's available, as you'll see in the next example, and since children can borrow bandwidth, CBQ is work-conserving.

Let's go over a simple two-level tree example, since I think most people will probably start out with a simple two-level tree. At the first level we have our root queue; let's assume this is our WAN interface and we have a 50 megabit-per-second internet connection, so that's the amount of bandwidth we have to go around. I'll define three different queues for three different services and give them appropriate bandwidths and priorities. SSH is interactive, so you want it to be as responsive as possible; we'll give it a high priority. HTTP is also user-interactive, so we want that to be relatively high priority as well. And BitTorrent we don't really care about — it's just a thing that happens in the background — so we'll give it the lowest priority. As for bandwidth: SSH is relatively low-bandwidth, so we'll just put in 100 kilobits per second; HTTP could have multimedia going over it and we don't really know, so we'll give it a whole bunch of bandwidth; and BitTorrent gets some small bandwidth so it has something.

Every queue is guaranteed its bandwidth, but queues can borrow from their siblings if bandwidth is available. Assuming HTTP isn't using all of its allocation, BitTorrent or SSH could use whatever is left over, which keeps the scheduler work-conserving. So just because we've said five megabits per second for BitTorrent doesn't mean that's a limit; it means that in the event all of the other services are using all of their allotted bandwidth, BitTorrent is still guaranteed that much. The priority is basically used to let you control latency: the higher the priority, the lower the latency will be during times of congestion, so you want your more real-time services to have higher priorities, which I think makes logical sense.
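Expressed as data, that example tree might look like this (SSH's 100 Kbit/s and BitTorrent's 5 Mbit/s come from the example; the exact priorities and the HTTP figure are assumed so that the children sum to the root):

```python
# Hypothetical two-level CBQ tree for a 50 Mbit/s WAN root queue.
root_kbps = 50_000
children = {
    "ssh":        {"priority": 7, "bandwidth_kbps": 100},     # interactive
    "http":       {"priority": 5, "bandwidth_kbps": 44_900},  # bulk of the link
    "bittorrent": {"priority": 0, "bandwidth_kbps": 5_000},   # background
}
# CBQ's structural rule: child bandwidths must add up to the parent's.
assert sum(c["bandwidth_kbps"] for c in children.values()) == root_kbps
```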
At this point there's only one scheduler left, and it's arguably the most popular and the most complex. Before I go into discussing it, I want to talk about service curves, because they are fundamentally important, and I need to spend more time on this than anything else: if you don't understand service curves, none of the rest will make any sense.

A service curve basically defines when incoming traffic receives service, where by "service" we mean when the traffic is chosen to be forwarded out of the router. I'm going to explain service curves using graphical examples, because I think that's the easiest way to understand this, rather than looking at equations; I'll assume you've taken high-school math and know linear equations and how to calculate slope.

Here is a classic depiction of what's known as a linear service curve. On the x-axis we have time, usually in milliseconds or seconds, and on the y-axis we have service, usually represented in bits. Physical links and limiters have linear service curves like this — a single straight line — and the way you read it is that as time passes, the total number of bits forwarded over the link trends up linearly. If you have a 50 megabit-per-second link and you send traffic constantly at 50 megabits per second, the total number of bits transmitted goes up in a linear fashion. As a quite nice result of this, the slope of that line is the bandwidth of the link: we have bits on the y-axis and seconds on the x-axis, and slope is rise over run, so it's bits per second. The steepness of the line is the bandwidth of the service curve. I'll also mention that the root queues in any of the previous schedulers I've talked about have this type of linear service curve, because they are directly tied to a real link and therefore have a fixed bandwidth.

Let's do a little example, and assume traffic arrives at a rate of 10 packets per second. Its arrival curve — note, this is an arrival curve, not a service curve — looks like a staircase: at ten packets per second, one packet arrives every 100 milliseconds, so if we graph bits over time, every time a packet arrives we get a vertical step where that packet's bits arrive. (The actual number of bits doesn't matter here.) Now we add a linear service curve with the same average slope, meaning the slope of our service curve equals the packet arrival rate. Because packets leave the queue according to the service curve, we can measure the queuing delay by sliding over from where the arrival curve steps up to where it intersects the service curve. In this case, the delay we observe is 100 milliseconds per packet: from the time a packet arrives — the vertical jump in the arrival curve's step function — follow the arrow over to where it intersects the service curve, and that is the amount of time the packet sits in the queue before being forwarded out. If we graphically depict the packet arrivals and departures, with every color representing a different packet, we can see that packets arrive and depart at the same rate — there's no need to drop packets or anything like that — but each packet is delayed by the service curve by 100 milliseconds, which in this case happens to correspond to the arrival rate.

But what if this is a problem? What if the latency we incur — the 100 milliseconds in that example — is too big, and we need a lower delay for something real-time, such as VoIP? If we use a linear curve, a straight line, the only option we have is to increase the slope. That will reduce the delay, as I'll show in a moment, but it has the side effect of increasing bandwidth. Here we have the same packet arrival curve as before but a steeper linear service curve, and you'll notice that the delay is reduced but the bandwidth allocation has increased. We're allocating 15 packets per second — that's the slope of the service curve — but the packets aren't arriving that fast, so we've allocated more bandwidth than we need, which doesn't allow us to provision our network very effectively. If we graph it as we did previously, the queuing delay approaches zero because we've overprovisioned for this flow (it doesn't actually reach zero, but for this example let's just say it does).

The fundamental takeaway is that with a linear service curve, latency and bandwidth are interdependent: you cannot reduce delay without also increasing bandwidth. Of course you may say: what if I have a service that needs low latency but doesn't need high bandwidth — take SSH, for example? Ideally we would want to somehow decouple delay and bandwidth, and we can. What we need is known as a nonlinear service curve, and the key is to use a piecewise, dual-slope linear function. We draw it as follows: one line segment with a slope known as m1, connected to another line segment with a different slope, m2; the x-projection of the point where they meet is a time value known as d. We can characterize the entire function by three quantities — m1, m2, and d — so with those three numbers, the two slopes and the time at which they meet, we can write down or enter any such service curve; we don't need any more equations than that.

If m1 is greater than m2 — the first segment steeper than the second — the curve is known as a concave service curve, and vice versa: if m1 is less than m2, the curve is known as a convex service curve. Concave service curves reduce delay and convex curves increase delay, but both must exist in any traffic-shaping scheme, because of course not everything can be high priority: priority is relative. If you make some things concave, you must make others convex, or else everything ends up with the same priority and nothing changes.
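The whole dual-slope function fits in a few lines, which is why m1, m2, and d are all you ever have to enter (a sketch; the units here are packets and seconds):

```python
def service_curve(t, m1, m2, d):
    """Two-piece linear service curve: slope m1 until time d, then m2."""
    if t <= d:
        return m1 * t
    return m1 * d + m2 * (t - d)

def shape(m1, m2):
    return "concave" if m1 > m2 else ("convex" if m1 < m2 else "linear")

# e.g. 15 packets/s for the first 100 ms, then 10 packets/s:
print(shape(15, 10))                    # concave -> reduces delay
print(service_curve(0.1, 15, 10, 0.1))  # 1.5 packets' service by t = 100 ms
print(service_curve(0.2, 15, 10, 0.1))  # 2.5 packets' service by t = 200 ms
```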
Now let's redo the previous example with a nonlinear service curve and see what happens. We have the same packet arrival curve as before, and now a concave nonlinear service curve. You'll notice that the slope m2 is the desired average bandwidth — 10 packets per second, like in the very first example — but the slope m1 is higher, 15 packets per second, in order to reduce the delay. As a result, the delay in this case is 50 milliseconds per packet. If we graph this as we did before: thanks to the concave curve, our average bandwidth is the same as in the first example — no dropped packets, no wasted bandwidth allocation — but our delay is 50% less than that first example. We have successfully decoupled delay and bandwidth.

Now, as with life, there is no free lunch: if you use concave curves, you must balance them with convex curves. Ideally, the sum of all service curves would be a straight line matching the service curve of the root queue, but in practice this is not required and doesn't really make a lot of sense, because there will be time offsets between when service curves start; they're not going to track perfectly in line with each other. Let's look at an example. Say I have this black service curve and this blue service curve — one concave, one convex — and they both start at the same time, meaning there happen to be two different traffic flows, one for each, and both flows start at exactly the same moment, so the two service curves track with each other. Say the root queue has a service curve represented by this red line. Now say that at time T, some arbitrary time later, another traffic flow starts with a different service curve, shown in green. The problem is that in order to satisfy that green curve, the root queue's service curve would need to follow the dotted line to satisfy the green curve's m1 bandwidth — which of course we can't do; we can't have more than one hundred percent of the bandwidth. So we can't guarantee all of these service curves a hundred percent of the time, and it doesn't make a lot of sense to try to make all of your curves add up to a straight line, because ultimately traffic is going to be starting and stopping all the time; the curves are never really going to be lined up, and the algorithm is constantly going to have to adjust the service curves to make them fit.
With all of this talk of service curves out of the way, we can talk about the last scheduler: the hierarchical fair service curve scheduler, or HFSC. It builds on the architecture of class-based queuing, but extends it by adding nonlinear service curves, using the dual-slope piecewise function we described earlier. I want to go over this one by way of example, so we'll do a two-level example. We have the same root queue as before, 50 megabits per second, and three child queues; but instead of priorities and bandwidths, they have service curves. Service is ultimately given according to these curves, but queues can borrow from their siblings if bandwidth is available, just as in class-based queuing, so HFSC is normally work-conserving.

Note that SSH is convex while both BitTorrent and HTTP are concave. BitTorrent actually has an m1 slope of zero, which guarantees that all of its packets will be delayed by at least 50 milliseconds, because no packets will be forwarded during the first 50 milliseconds of its curve. You may also notice that the m2 values in this example do add up to the root queue bandwidth — the m1 values don't, but the m2 values do — and I recommend following this principle. If you have sustained traffic flows across all of your queues, you want your traffic shaper to have allocated bandwidth according to percentages of one hundred percent, not random values. If your m2 values — your sustained bandwidths — exceed 100 percent of the root queue bandwidth, it doesn't really make a lot of sense; it will still work, because HFSC will compensate and make decisions, but it probably won't make the right ones. So, as I've just said, the m2 values should probably add up to the root queue bandwidth, but the m1 values don't need to do the same, because with the time shifting between flows they can never really be guaranteed anyway.

HFSC generally works on a best-effort basis: it tries to satisfy everyone's service curves while sharing bandwidth effectively. However, it's not perfect, and we can make it better by providing more information to the algorithm through additional service curves. In fact, with HFSC every queue can have three different service curves: the link-share service curve, the real-time service curve, and the upper-limit service curve. The service curve I've been talking about without giving it a name is the link-share curve, because that one is required; the other two are completely optional, but they have benefits if you define them.

The link-share service curve is the normal target for the algorithm — what I've been describing previously. The algorithm will do its best to satisfy it, and a queue can exceed its link-share curve by borrowing bandwidth from other queues and adding that bandwidth on top. The real-time service curve is effectively a minimum amount of service that you reserve for a particular queue: it is guaranteed, and it cannot be borrowed by any other queue. If you do this, just note that adding a real-time curve can make the scheduler no longer work-conserving, because you're now reserving bandwidth that cannot be borrowed by any other queue, regardless of whether it's in use. Generally speaking, real-time service curves are good for real-time services. If you have VoIP, for example, and you know the minimum bandwidth it can operate on, you may want to define a real-time service curve for that queue so that voice always functions and you never have dropouts. If you have telephony with different levels of quality, where call quality degrades as bandwidth goes down, you might make the minimum acceptable call quality your real-time curve and the desired call quality your link-share curve: in the worst case you drop down to the real-time curve and take a reduction in call quality, but you never have drops, because that bandwidth is reserved.

The upper-limit service curve is kind of self-explanatory: it is the maximum amount of service a queue can receive, including anything borrowed from its siblings, so it acts like a limiter on that queue. You could use it to restrict a particular service from constantly flooding your network, for example. Again, if you add this curve, it's possible the scheduler will no longer be work-conserving, because that queue may not be able to use bandwidth that's otherwise available to it.
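Each of those three curves is entered as an (m1, d, m2) triple per queue; as a purely illustrative sketch (all queue names and numbers hypothetical), a configuration might be modeled like this:

```python
# Each curve is (m1, d_in_ms, m2); bandwidths in Kbit/s. A curve given
# only as m2 behaves as a single straight line (no initial burst slope).
queues = {
    "qVoIP": {
        "linkshare": (2_000, 50, 1_000),    # desired level of service
        "realtime":  (None, None, 512),     # guaranteed floor, never lent out
    },
    "qBitTorrent": {
        "linkshare":  (0, 50, 5_000),       # m1 = 0: no service for the first 50 ms
        "upperlimit": (None, None, 10_000), # hard cap, even when borrowing
    },
}
```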
With all of the schedulers out of the way, I want to give a few quick tips on designing and implementing traffic shaping. Some of these are probably common knowledge, but I might as well go over them anyway. First and foremost: know your network. If you don't know what's going on on your network, you have absolutely no hope of being able to shape it, and as a result, Wireshark is your best friend. I recommend observing traffic in production and taking captures of that traffic for debugging later. Once you have those captures and you've seen what your traffic looks like, make some assumptions and educated guesses about how you can improve the performance of whatever it is you're trying to achieve, and then build a shaper with one or more of the shapers I've previously discussed.

Once you've built it, you'll ultimately need to implement and test it, and you can test in one of two ways. You can test in production, which may sound bad, but in certain circumstances it may be viable: just turn it on and see what happens; worst-case scenario, it crashes and burns and you turn it off again. Or you can test in isolation, and this is where those captures come in: you can replay the network captures you took in production and observe what happens to them. You can do that in one of two ways: either replay the traffic and then actually try to use interactive services on the network and observe what happens, or replay the captures and, on another machine on the other side of your pfSense box, take another capture, then compare the two — look at mean latencies and things like that — and see whether things actually got better.

Just note that if you try to test in a virtual machine, on most hypervisors the virtual network interfaces don't operate at their real speeds, and that may skew your results by quite a bit. Virtual network interfaces can run as fast as the CPU and RAM will let them, so a "gigabit" interface in a VM is not actually gigabit: it can do 4 gigabits, it can do 6 gigabits. And the latencies, especially if you're using an internal network on some sort of virtual switch, are unnaturally low, which can also affect how your shapers behave.
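Comparing the two captures can be as simple as matching up packets and differencing their timestamps; a bare-bones sketch with made-up numbers:

```python
# Timestamps (seconds) for the same packets seen on each side of the
# firewall -- hypothetical values standing in for two real captures.
lan_side = {"pkt1": 0.000, "pkt2": 0.010, "pkt3": 0.020}
wan_side = {"pkt1": 0.051, "pkt2": 0.060, "pkt3": 0.072}

delays = [wan_side[p] - lan_side[p] for p in lan_side]
print(sum(delays) / len(delays))  # mean added latency, ~0.051 s here
```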
With all of that theory out of the way, it's time to move on to the web UI, where we'll go over all of the options and how you can implement everything we've covered in the actual interface itself. I'm here at the web interface of the virtual machine I use to screencast these videos, and I want to go over configuring limiters first, because that was the first thing I went over in the video.

You get to the limiters by going to Firewall > Traffic Shaper and clicking on the Limiters tab. As you can see, I have no limiters currently configured, so we want to add a new limiter. The options you see here are basically for the pipe itself. Of course we want to enable it, and we want to give it a name — I'm going to call it something very generic for this video, but you would of course use reasonable names — and we define a bandwidth for this pipe. Let's say 100 kilobits per second — maybe "double dial-up" — for this example. If you had schedules defined, you could attach one here, and you can add multiple schedules: if you wanted this pipe to change bandwidth depending on the schedule, you could associate different bandwidths with different schedules, and as long as they don't overlap each other, the bandwidth of the pipe will change over time.

These two fields here are the masks for dynamic pipe creation. I mentioned that dummynet has the ability to dynamically create pipes based on IP addresses, so you don't have to create pipes manually for each host. If you leave the mask at "none," this remains a simple pipe with no dynamic features. If you change the mask to either source or destination address, then for every packet that passes through the pipe with a different source or destination address, a new dynamic pipe is created for all traffic either to or from that address. Of course, dummynet needs some way to distinguish between hosts, and it does this with the subnet mask you effectively provide: if an address differs outside of that mask — in the host portion of the address — it's treated as a different host and given its own pipe. As usual, I always recommend putting in a description for your future reference, because you're going to forget what this thing does, especially with generic names like this one; but since it's a demo, I'm not going to bother.

Then there are some advanced options down at the bottom. The first field is delay, in milliseconds. Like I said, you can use limiters to simulate poor connections: just as I'm simulating 100 kilobits per second here, I could be restricting some traffic to that bandwidth, or maybe I'm trying to test something. If you leave this blank, it's assumed to be zero — no added delay — but I could put in, say, 200 milliseconds of delay to simulate a very long network path across a country; actually, with that kind of latency, it's more like a transatlantic cable. Similarly, we can simulate packet loss. This is a probability, a floating-point value between zero and one: zero meaning no packet loss, one meaning 100% packet loss, and you can put in a decimal value here if you want some random packet loss to simulate, again, a poor connection. The last two fields are sizes for the queue and for the hash buckets. I recommend not touching these unless you have problems. Their units are "slots," which you'll see throughout the web UI: one slot basically holds one packet, regardless of packet size, in case you were wondering what that means.
Anyway, I don't want any dynamic features, so I'm going to turn the mask off — actually, we'll leave the delay on, why not — and I'll click Save and apply those changes. So here I have one limiter, which I've called limit one, and if I go to Diagnostics > Limiter Info (I'll open that in a new tab), you can see we now have one limiter at 100 kilobits per second with a 200 millisecond delay.

Now, there's one thing I didn't mention earlier, and that is that you can nest queues within limiters. Right now I have a single pipe with 100 kilobits per second of bandwidth, and I could assign traffic directly to this pipe; but I could also create one or more queues that sit within this pipe and share its bandwidth, by going to the bottom and clicking the "add new queue" button. You'll notice I get pretty much the exact same options, except I can't specify a bandwidth anymore, because the bandwidth is already defined by the parent of this queue, the limiter I just created. The idea is that each queue has a weight, which is basically a percentage of the full bandwidth of the parent pipe. You could have several queues, each with a different weight, and each will use up to its share of the pipe's bandwidth when they're all being utilized. It is work-conserving in the sense that if only one queue has traffic, even if it has a low weight — maybe lower than the others — it will still use the limiter's full bandwidth when there's no competing traffic with a higher weight.

So let's enable the queue — and of course we don't want any dynamic features — give it an arbitrary weight of, say, 50, and hit Save. Now you'll notice there's a nested queue below our limiter, and if we look at the limiter info, there's another queue here with a weight of 50. I can continue to add more queues with different weights, and all of these queues are limited to a combined maximum of 100 kilobits per second. Now, if you remember, I mentioned earlier that to get a proper full-duplex connection you need two different pipes: one for incoming and one for outgoing. So I'll create a second limiter, give it some bandwidth, and create a queue below it with an arbitrary weight. The weight doesn't really matter if you only have one queue, because of course it will get a hundred percent of the bandwidth, but I'll put one in anyway.
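The weight arithmetic is easy to model; here's a sketch treating weights as relative shares among the queues that currently have traffic, which is what makes the scheme work-conserving (pipe bandwidth and weights are the demo values):

```python
pipe_kbps = 100                   # the parent pipe's bandwidth
weights = {"q1": 50, "q2": 25}    # arbitrary weights from the demo

def share(queue, active):
    """Bandwidth a queue gets while every queue in `active` has traffic."""
    total = sum(weights[q] for q in active)
    return pipe_kbps * weights[queue] / total

print(share("q1", ["q1", "q2"]))  # ~66.7 Kbit/s while both queues are busy
print(share("q1", ["q1"]))        # 100.0 -- an idle sibling's share is reused
```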
At this point we've created some limiters, but they don't actually do anything, because no traffic flows through them. To change that, we need to create some firewall rules, or modify existing ones, to send traffic through these limiters. If we go to Firewall > Rules, we have two options: we can either add the limiters as an advanced option on an existing rule, or we can create floating rules which match the traffic.

Let's say I want all of the traffic on my LAN — that passes through the router, of course — to be limited by the limiters I just created. I'll pick the "allow LAN to any" rule and edit it, and if we go down to the Advanced Options, you'll notice the In/Out pipe fields I believe I mentioned in my last video; now there are options available for them. The way this works is described in the help text here, but basically the first option is the input pipe: the data that's coming in to the specified interface. Because we're editing a LAN rule, this is traffic coming into the LAN interface, and the second field is traffic leaving the LAN interface; you only specify the outgoing pipe if you've specified the incoming one. There is a weird caveat with floating rules: because floating rules can be given the outbound direction — the opposite direction — these fields become reversed. If you have a floating rule whose direction is out, then the field on the right is actually the incoming pipe and the one on the left is the outgoing pipe, and the labeling doesn't change; you just have to know that implicitly. It does say so in the help text, but it is an odd thing: the interface doesn't change based on what you set the direction to, so you have to do that reversal in your head.

I'll specify the incoming pipe as one of the queues from the first limiter and the outgoing as a queue from the second limiter, so we'll have a proper full-duplex connection, and I'll save and apply the changes. Now all of the traffic passed by that rule goes through those two queues, so we should have effectively a 100 kilobit-per-second full-duplex connection. We can test this by going to basically anything, because I've added this to the "allow LAN to any" rule, so all traffic matched by that rule is affected. There's no traffic flowing through it right now — we can't see any flows being limited — but if I open something, maybe a speed test, everything is going to be incredibly slow, because of course we're simulating 100 kilobits per second; some things will actually be relatively fast because they're cached. We'll see if this even loads; I may have to speed this up. And if we go back, you'll notice we're now dropping packets in both of those queues, as expected, and everything is very, very slow — I don't think I'll even let this finish; you get the idea, it's working — so I'm going to remove those limiters from this rule, just because they make things a little difficult.

If you wanted to create your own rule that doesn't already exist, I'd recommend creating a floating rule. Basically, all you need to do is specify the Match action — which won't change the flow of the traffic, but will flag it so it gets picked up by the limiter — select the rest of the criteria based on whatever traffic you're trying to limit, and set up the pipes as before. And like I mentioned, if you're using an out-direction floating rule, just remember that the pipe fields are backwards. This way you can create a rule which doesn't affect the flow of traffic, and doesn't affect any existing rules under your interfaces, but still shapes the traffic.

So I think that's it for limiters. Let's move on now to the ALTQ schedulers, which is of course what people really want to see, and I think the way I'm going to go about this is to show you the wizard first, show you what it creates, and go from there.
You'll notice there are two wizards: one called "Multiple Lan/Wan" and the other called "Dedicated Links". Pretty much everybody, regardless of how many WANs or LANs you have, is going to select the Multiple Lan/Wan wizard. It has a terrible name — it's just been called that forever, and I don't really know why it's never been changed — but that's what we're going to go through.

The first thing we have to enter is the number of WAN connections we have and the number of LAN interfaces. In this case we have one WAN, and we do in fact have two LANs, because we have the optional interface as well, so these numbers are already correct. Next we get to pick what scheduler we want for each of the interfaces. For our LAN let's say we want HFSC, and for our optional interface we also want HFSC, though of course we could pick class-based queuing or priority queuing if we wanted one of the simpler schedulers. For the WAN we have a few extra options specifying the upload and download speeds, and in this case these numbers are already correct.

If you have a VoIP system, you can choose to prioritize that traffic over everything else, and if you do, you'll have to specify all of your details as well as speeds for your different interfaces for that traffic. I don't have one, so I'm just not going to fill it in; I wouldn't even know what to populate here. Next, you have the ability to set up what's known as a penalty box. This just creates a rule to limit the bandwidth for a specific IP address, which is useful if you have some host whose traffic you want to restrict to a hard percentage. I think it's most useful to turn it on just to see what it generates, and then use that as a template for writing your own rules later on; I'll just leave this as I've already configured it.

Next is peer-to-peer networking. Generally speaking, we'd want to lower the priority of peer-to-peer traffic, so of course you can enable this if you want. The peer-to-peer catch-all basically treats all traffic that is uncategorized as peer-to-peer; I personally don't like this, but depending on the network it may actually be worthwhile. You can also specify which protocols you want rules to be created for, because ultimately what this wizard does is create a floating firewall rule for each protocol you select, so you really only need to select the ones that actually exist on your network. Let's just say I only have BitTorrent on my network and none of these others; it will create a rule — which we'll see after the wizard completes — that pertains to BitTorrent.

The next page is gaming. We can prioritize gaming — of course we want to reduce the latency, which is what we're doing here — and again you can specify which services you want, and it will create rules for each of them. There's also a list of games which it has predefined rules for, and you can add rules for your own games if they're not in this list, which I suspect will be the case for a lot of people, since this is kind of an old wizard; but it's very easy to add your own rules, so don't worry if what you want isn't already here.
The next page is for general applications: you can set priority levels for certain applications this wizard knows about, giving each service a higher or lower priority. I'm not going to change much here; I'll just say that HTTP should maybe be higher priority, CrashPlan backups lower priority, Samba higher, or whatever. I've set just a couple here to show you what this generates. Once you're done, you click the Finish button and it actually creates all of those rules for us. It gives us a little log message showing what it has done and tells us that it's complete.

Now if we go to our traffic shaper, you'll notice that a whole bunch of stuff has been created here in the By Interface view. You can also look at it by queue; for the ALTQ schedulers the two views show basically the same information in different ways, but the By Queue view doesn't let you edit anything, so you have to go to the By Interface view to get panels where you can actually change things. I'll come back to this in a moment, because I also want to show you what has been created in the firewall: if you go to the floating rules, where there were no rules before, you'll now find a large list of rules created by the wizard.

Back in the traffic shaper, at the top level we have WAN, LAN, and OPT1, and these are the root queues for each interface. If I click on WAN, you can see we can enable and disable the scheduler on that interface and change its type between any of the types I mentioned before, and this Bandwidth value is the root queue bandwidth available on the interface; because I have a 10 megabit per second upload, that's the bandwidth available on the WAN. Queue Limit and TBR Size set the queue length and the token bucket regulator size; I'd recommend not changing these, because you will almost never need to.

Because this is HFSC, we can nest queues below the root queue, so we have another queue called qInternet below it, and then a bunch of queues nested below that, giving us two levels in this nesting arrangement. We can enable and disable this queue, give it a name, and then we've got a bunch of options. I want to make note of something right away: what this interface effectively does is generate a bunch of command-line arguments for the low-level FreeBSD traffic shaping utilities. That's really all it does, and it's designed to make it easy to translate what's in these boxes into a set of arguments that can be passed to those utilities, so it's not the most intuitive interface, and there are a few things about it that I really don't like, a few things I believe are actually wrong.
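Before getting into those, since the whole point of these boxes is to be turned into low-level ALTQ configuration, it may help to see roughly what that configuration looks like. The following is a hand-written pf.conf-style approximation of a WAN root queue with the wizard's two-level nesting, not pfSense's literal output; the interface name em0 and all of the numbers are illustrative:

    # root HFSC scheduler on the WAN, dividing up 10 Mbit/s of upload
    altq on em0 hfsc bandwidth 10Mb queue { qInternet }
    # first nesting level
    queue qInternet bandwidth 10Mb hfsc(linkshare 10Mb) { qACK, qDefault, qP2P, qGames }
    # second nesting level; "default" marks the catch-all queue
    queue qACK bandwidth 20% hfsc(linkshare 20%)
    queue qDefault bandwidth 10% hfsc(default)
    queue qP2P bandwidth 5% hfsc(linkshare 5%)
    queue qGames bandwidth 20% hfsc(linkshare 20%)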
The first is the priority help text here. You'll notice it says that for HFSC the range is 0 to 7. I believe this is wrong, because HFSC, if you remember, doesn't have priorities at all; HFSC uses service curves to determine priority on a case-by-case basis. What I believe it should say is that for CBQ the range is 0 to 7, because for CBQ that is correct, so I think this help text is simply wrong; don't let it confuse you. One thing I will note is that you can actually put a number in this box even with an HFSC scheduler and it won't return an error, but that priority will be ignored by the actual scheduling utility, so the box has no effect if you're using HFSC. That's why it's so confusing to me; it doesn't make any sense, so if anyone actually knows what's going on, please let me know, but to the best of my knowledge it's a bug. If we were using CBQ, this is where you would enter the priority for the queue.

Queue Limit is the size of the queue, in packets or slots, for this particular queue; if you leave it empty, a system default is used. Then we have a few options for the queue. First there's CoDel, which we can enable on this queue as I mentioned earlier, and we have RED, random early detection. What RED tries to do is keep the queue from reaching its size limit before it actually gets there; CoDel works on a time basis, while RED works on a size basis. You can enable RED for incoming packets only, or for incoming and outgoing packets (RIO), if you want. ECN, explicit congestion notification, works hand in hand with RED: when congestion occurs, it sets a flag in the packet header to notify downstream routers that will receive the packet that congestion is present on the path. It only does anything if the downstream router supports reading that flag and acting on it, but in my opinion it doesn't hurt to set it, and it is in fact set by default, so you can deal with those flags at your leisure. I also recommend setting a description.

Then we have this Bandwidth field, and this is the second thing I think is really dumb about this interface. Here we have our service curves, which we can specify: the upper limit, the real time, and the link share service curves, exactly as I described earlier, and we can enter our m1 values, our m2 values, and our d value, which is in milliseconds. We don't need anything other than the link share service curve, but we can add the others if we wish. Now, we never needed a separate bandwidth when we were dealing with service curves earlier, so what's the deal with this Bandwidth field? According to the source code, this field is actually the same thing as the link share m2 box. If you put a value in one, you don't need to put a value in the other, and if you do put a value in the Bandwidth field, it overrides whatever is in the link share m2 box. So if I put a different value, say 15, in the Bandwidth field, then even though my link share says 10 megabits for m2, it will actually be 15 megabits. That's not documented anywhere, which is really dumb. My recommendation is to leave the Bandwidth field empty if you can, so there's less confusion and the link share m2 value is used; if you do put something in it, make sure it's the same value, so there's no ambiguity.
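For reference, in pf.conf's ALTQ syntax those three boxes map onto a service curve written as (m1, d, m2), with d in milliseconds, and giving only m2 yields a linear curve. A sketch with invented queue names and numbers:

    # linear link-share curve: only m2 is given, one slope throughout
    queue qOthersHigh bandwidth 10% hfsc(linkshare 10%)
    # nonlinear real-time curve: 20% for the first 50 ms of a backlog,
    # then settling down to 10%; the three values are (m1, d, m2)
    queue qVoIP bandwidth 10% hfsc(realtime (20%, 50, 10%))

Seen this way, the UI's standalone Bandwidth box is just a second way of filling in the link share m2 value.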
The other thing you'll notice is that there are no m1 or d values filled in for these service curves, and that's because you don't necessarily need them. If you want a straight linear service curve, all you have to do is enter the m2 value, and it's assumed to be the slope for the entire curve; only if you want to make the curve nonlinear do you add the m1 and d values. By default the wizard doesn't create nonlinear service curves, only linear ones, so you would have to enter that d value yourself if you wanted to change them.

Now let's look at another queue nested below this qInternet queue; we'll look at qDefault. This one has a priority value, which will be ignored because we're using an HFSC scheduler; I believe these priorities are filled in because the wizard doesn't really care whether you're using HFSC or CBQ and just populates them regardless, so that's why it's there, but it has no effect. You'll also notice that now that we're nested another level down there's a new option called Default Queue. The default queue receives all traffic that is not explicitly put into another queue by a firewall rule, so this is the catch-all for all traffic that is not explicitly shaped. It has all the same options as before, and these values were all generated by the wizard; I haven't entered them, and I actually disagree with some of them. The wizard also created one queue for peer-to-peer, one for games, and two that it calls other applications high priority and other applications low priority.

Then there's this other queue called qACK. I think I've mentioned this in a previous networking video, but whenever you're transmitting data with something like TCP in one direction, there's a constant stream of acknowledgment packets headed in the other direction to inform the sender that the receiver actually got each packet uncorrupted. You have to ensure that those acknowledgment packets get back to the sender in a timely manner, because the sender uses that stream of acknowledgments to figure out how fast it can send packets to the receiver; if it doesn't get many acknowledgments back, it concludes that not many of its packets are making their way across, and it throttles back. We don't want that: we wouldn't want a sender to throttle traffic to us just because we can't push acknowledgments out fast enough. So generally we create a queue specifically for acknowledgments and give it a relatively high priority, because we want to keep telling other servers that we're receiving their data and that they should keep it coming at the maximum rate we can accept. That's why there's an acknowledgment queue on all of these interfaces.

For LAN it's a similar story: HFSC, with our 100 megabits of bandwidth entered in kind of an odd manner, but that's what it does. One difference you'll notice between the WAN and the local interfaces is that the WAN has this qDefault nested below qInternet, while the local interfaces don't have a qDefault; they instead have a qLink, which sits one level up, and that qLink is the default queue for those LAN interfaces. It's given a queue limit, probably to prevent bufferbloat would be my best guess; I'm no expert, so I'm not entirely sure why the wizard does that, but this qLink is effectively the catch-all for both of these local interfaces.
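The acknowledgment trick, by the way, is native to pf: when a rule assigns two queues, TCP ACKs that carry no data, along with packets marked with a low-delay type of service, are placed into the second queue. A minimal pf.conf-style sketch with invented names:

    # bulk TCP rides qDefault, but its empty ACK segments (and packets
    # with a low-delay type of service) are lifted into qACK instead
    pass out on em0 proto tcp from any to any queue (qDefault, qACK)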
Now let's take a look at the firewall rules the wizard created. They're all floating rules, and as you saw before, they're all match rules. The description of each rule is in a kind of machine-parsable format, but it's still human readable, so you can tell what the rules do. Let's pick one; we'll look at the Minecraft TCP outbound rule. All this rule is really doing is matching TCP traffic on the WAN destined for this particular port; I haven't played Minecraft in a while, but I'm going to assume that's the Minecraft port. It has a description, and the most important part of the rule is the queue setup at the bottom. The queue on the right-hand side is the queue the matched traffic will go through, and on the left-hand side is the acknowledgment queue, which is optional; this is the queue through which the ACK packets generated in response to the incoming traffic will be passed back out. If you leave it set to none, the ACK packets go through the default queue, but since we have qACK on the WAN, we can use it to shape those acknowledgment packets accordingly. It makes no sense to specify an acknowledgment queue without specifying the main queue, because shaping just the ACKs doesn't accomplish much, and I don't think doing so is even officially supported.

There are rules here for all of the different things we selected in the wizard, but there are a few things I want to point out. A lot of these rules look like duplicates, but they aren't; for example, this peer-to-peer rule isn't actually duplicated, because the protocol differs: one is TCP and one is UDP. Similarly for Steam there's one for UDP and one for TCP, and this pair is a good example: the UDP rule does not have an ACK queue specified while the TCP one does. That makes sense, because TCP is a connection-based protocol with acknowledgments, while UDP is connectionless; there are no ACKs in UDP, so it makes no sense to specify an ACK queue on a UDP rule.

Another thing I want to point out, if it isn't already obvious, is that there is no deep packet inspection going on here. Traffic is sorted into queues based exclusively on these rules, which almost always use the port to separate traffic. So if you go through the wizard and select that you want to lower the priority of BitTorrent traffic, but your BitTorrent client doesn't use any ports in this range, the rule will have absolutely no effect. Make sure you either create your own rules or edit these ones to change the port range to match the software you're actually running, because otherwise the rules won't do anything. And of course you can add your own rules to match traffic on whatever interfaces you want, specify the source or destination ports, and pick your own queues accordingly; it's very easy to add your own rules.
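In pf terms, each of these wizard entries boils down to a match rule that classifies traffic by port into a queue pair, which also makes it the natural template to copy when your software uses nonstandard ports. A hypothetical example along the lines of the BitTorrent case, with the interface and port range invented for illustration:

    # classify WAN TCP traffic in a BitTorrent-style port range into the
    # peer-to-peer queue, with its ACKs shaped through qACK
    match on em0 proto tcp from any to any port 6881:6999 queue (qP2P, qACK)

And once rules like these are loaded, the classification can be watched from a shell as well as from the web UI, since pfctl reads back the same counters:

    # queue statistics: packets, bytes, drops, and current queue length;
    # doubling -v makes pfctl print updated numbers every five seconds
    pfctl -vvsq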
In the web UI you can view the status of all this under Status > Queues, where you'll see every queue, the packets per second and bandwidth currently flowing through each one, and the length of each queue. The default queue length is 50 slots, 50 packets, except for those queue limit values I pointed out earlier, which the wizard set to 500 slots. You can also see how many packets have been dropped in each queue and how many time slots have been borrowed from other queues through link sharing. It's all quite boring right now because there's essentially no traffic going through this VM, but I recommend that once you make changes you take a look here and see whether things look right. If they do, you can break out Wireshark or something and check whether your changes actually made a difference to your latency or whatever it is you're trying to improve; but if things don't look right here, chances are they aren't right, and checking may save you some time.

I'll also point out that in the traffic shaper you can view all of this by queue as well. That view just offers a summary of the most important parameters: it doesn't show you the service curve info and it doesn't let you change anything, so it's a way to check the key parameters at a glance, but I think you really need to go into the By Interface view to see what's going on.

If you want to get rid of the shaper, you can delete queues and rules manually, but the easiest way is to click the Remove Shaper button right here. If you used the wizard and ultimately decide you don't want any of this, or you want to build everything yourself from scratch, you can just click Remove Shaper and it removes all of the queues, along with all of the floating rules the wizard created. It won't remove any rules you created yourself, only the wizard's, so it's an easy way to wipe everything and start over.

I think that covers everything I wanted to go over today, so as always, thanks for watching.
Info
Channel: Mark Furneaux
Views: 61,189
Keywords: pfsense, guide, tutorial, traffic, shaper, firewall, router, bsd, codel, altq, hfsc, scheduler, priq, cbq, rules, floating, tbr, regulator, queue, token bucket, limiter, fairq
Id: rF46PNid1Mo
Length: 83min 23sec (5003 seconds)
Published: Mon Jul 03 2017