Cisco Firepower - Automating Cellular Failover

Video Statistics and Information

Captions
In my last video, I did a quick overview of how to set up a NetGear modem with Google Fi for LTE cellular backup. And the reason I went through all of that was ultimately to do this - which is to provide cellular backup to my home network. Especially because, as work from home has recently become so much more of an important thing, having a reliable internet connection has suddenly also become super important. And unfortunately my internet connection varies from day to day. So I wanted the ability to just fail over my internet connection to something secondary, in case my internet completely went out or was performing so terribly that it was unusable.
So in the diagram you see right now is a brief overview of what I'm trying to accomplish. I have my internal network, which I'm just representing with a single switch and a single PC as internal clients - in reality I have quite a handful of things. I also have a Cisco FirePower 1010 as the external firewall for my home network. And I currently have two connections going into it - one is outbound to my existing internet connection and the second is going to the new NetGear LTE modem that I just configured recently... and that's riding over a Google Fi cellular network.
So ultimately what I'm looking to accomplish is this: I want some way to monitor my existing connection, using IP SLA or some other route tracking, to measure an external service - in this example I'm using Google DNS at 8.8.8.8. And I want to be able to tell if that connection exceeds a certain amount of packet loss or latency. And if that happens, go ahead and flip the connectivity over to the NetGear modem for now - but should my primary internet connection come back into line with the thresholds that I've specified, fail back over to my primary internet connection.
So in order to accomplish this, I evaluated a couple of different ways of trying to see what I could do using just the FirePower box alone. And unfortunately there's not a good way within the current software base to accomplish exactly what I was looking for. So instead I went the route of writing custom automation using Python in the backend - to do both the monitoring and the route injection and removal in case of failover. So first we're gonna have to go through a set of changes on the firewall itself to get ready for this. So let's go ahead and switch over to our firewall - and we'll log in.
Okay, so the first thing we're going to need to do is configure our interface for the LTE modem. Now as you can see in the diagram, it looks like ports 1/7 and 1/8 provide Power over Ethernet - which will be helpful since I purchased the Power over Ethernet variant of the NetGear modem, to reduce cabling and also benefit from my battery backup. So the first thing we're going to do is go down to interfaces. And we'll go down to interface Ethernet 1/8, which is where I have the modem plugged in. And we'll edit that. We're going to go ahead and change the name.
Keep this as routed, set the status to enabled, and we'll set our IPv4 to DHCP - so it can collect an address from the modem. I will note that on the modem itself, I kept it in routed mode - so that we have an upstream gateway that we can inject a route to. So we will remove the "obtain default route via DHCP" option - since we don't want a default route to push all of our traffic over the modem. Next we'll go over to the PoE tab - make sure that Power over Ethernet is enabled and hit OK.
Now we will need to deploy our changes for that interface to come up. But what we also need to do is add this interface into an existing security zone. So we'll go ahead and go up to objects, and security zones.
And in order to make efficient use of my existing ACL policy and all of the configuration that I have already, I'll just go ahead and add this to the existing 'outside' zone. So it'll benefit from all the same rules that my existing internet connection has. So go ahead and add "LTE backup modem", hit OK and OK. The benefit of doing that is that now I don't have to go through all of my access list policies and change all of the rules - since all of the rules are already allowing traffic from the internal network to the outside zone. But what we will need to do is configure a NAT policy. So go ahead and go up to policies > NAT. And we're gonna add a new manual NAT, which we'll name "NAT_LTE". And we'll go ahead and set this to "after auto NAT rules" and type to dynamic. Our source interface is still going to be our internal network, which I have on a trunk port.
And our translated is going to be the modem. Source address, I'm going to set to my internal network. Destinations: any. Source for the modem, we're gonna go ahead and set as interface. And we'll go ahead and hit OK.
Next we're gonna have to make one other change that's going to help me with my automation. We'll go back up to the device page and go to routing. Now the way that the script I wrote works is by injecting a static route into the FirePower appliance every time I want to fail over to the external internet connection. Now in order to keep monitoring my primary internet connection even when I'm failed over to the secondary connection, I'm going to add a static route for the IP address that I intend on monitoring over my primary internet connection. So we'll go ahead and hit create static route - and we'll name this "failover_monitor". And our interface is going to be our outside interface, not the modem. And for networks we're gonna create a new network.
We'll create a host for Google DNS, which is gonna be 8.8.8.8 - hit OK and add. And so we'll go ahead and type Google DNS, add that in here. And our gateway is gonna be my upstream gateway from my current provider - so we'll add a new network object. And we'll add that address in here, hit OK, go ahead and find it, add it in here - we'll keep the metric at 1 so that it's always preferred and hit OK. Once we're done with all of those changes, we need to go ahead and deploy them to the firewall. So we're going to hit up deployment, review our changes to make sure that we have everything in here that we need, and hit deploy now. This is going to take a minute so we'll come back in just a moment.
Okay - now that our changes have deployed, we'll go ahead and flip over to Visual Studio and look at some of the code. So the first thing we'll see is this options file that I created. And this is going to contain the configuration settings that we'll use for our tests. So first we're going to see the options relating to the host that we're monitoring and the thresholds that we're setting for latency and loss. So you'll see what the ping target is - in this case I'm monitoring 8.8.8.8. We're gonna send 10 ping messages every time we run the script. My max latency is 2000 milliseconds, which I don't want to exceed. And my max loss is no more than 20%. Next we'll have a couple of settings that we want to configure for our FirePower itself. First is the address - in my case I have a host name of just FDM. Then the username and password that we're gonna use as the API credentials to log into the device. We're going to set what our failover route is. Now in a normal case I would just set this to 0.0.0.0/0 - but for the purposes of this test we're just going to inject a host route to fail over to the LTE modem.
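To make the settings just described easier to picture, here is a minimal sketch of what such an options file could look like if it were expressed as a Python dataclass. The real file's format and key names are not shown in the transcript, so `FailoverOptions`, every field name, and the placeholder credentials are assumptions; only the values (8.8.8.8, 10 pings, 2000 ms, 20%, the host name FDM, and the gateway and interface named right after this point in the walkthrough) come from the video.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class FailoverOptions:
    """Hypothetical container for the settings described in the video."""

    # Path monitoring settings
    ping_target: str = "8.8.8.8"      # Google DNS, the monitored host
    ping_count: int = 10              # pings sent on every run
    max_latency_ms: float = 2000.0    # average latency ceiling
    max_loss_percent: float = 20.0    # packet loss ceiling

    # Firewall settings (credentials here are placeholders, not real values)
    fdm_host: str = "fdm"
    fdm_username: str = "api-user"
    fdm_password: str = "change-me"

    # Route injected on failover; the video's test uses a host route instead
    failover_route: str = "0.0.0.0/0"
    failover_gateway: str = "192.168.5.1"     # LTE modem, used as next hop
    failover_interface: str = "Ethernet1/8"   # assumed spelling of the port name


def validate(options: FailoverOptions) -> None:
    """Raise ValueError if any setting is obviously malformed."""
    ipaddress.ip_address(options.ping_target)
    ipaddress.ip_address(options.failover_gateway)
    ipaddress.ip_network(options.failover_route)
    if options.ping_count < 1:
        raise ValueError("ping_count must be at least 1")
    if not 0 <= options.max_loss_percent <= 100:
        raise ValueError("max_loss_percent must be between 0 and 100")


options = FailoverOptions()
validate(options)
```

Validating the addresses up front is a design choice of this sketch, so that a typo in the options fails fast instead of producing a broken route on the firewall.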
We'll also configure what our failover gateway is - this is the IP address of the modem, that we'll be setting as our next hop. In this case that'll be 192.168.5.1 - and the failover interface, which we just configured as Ethernet 1/8. Now this is comprised of two different scripts - there is a path monitoring script and the FirePower script. We'll take a look at the path monitoring script first, because that's gonna be the easiest.
The first thing that we're gonna do is go ahead and load all of our options out of the file. This script performs two primary functions - the first one is going to be running our ping tests. So this is going to go ahead and send the 10 ICMP messages that we had configured already to the host and measure the latency and response time. Then we'll go ahead and calculate what the loss and latency is, and make a determination on whether that exceeds our thresholds or stays within them. Depending on the result, we're going to make a call to our FirePower module - to either add our new static route over the LTE modem, or remove that route.
So let's go ahead and take a look at the FirePower script. Now this script is a little bit more involved than our path monitoring script. This module contains all of the logic for creating network objects, creating gateways, route entries, and automatically injecting & deleting the routing from the FirePower device itself. So you have a bunch of config, including loading the options, the parameters for host headers, and OAuth. We also have our config for authenticating to the FirePower, getting the global routing table, adding our route, and removing a route. So this script is going to check a couple of different things every time it's run. So for example, if our loss and latency is within the thresholds and we're already on the primary connection,
it's gonna make a call to FirePower to make sure that the static route still does not exist - and if it does exist remove it. If it finds it, we assume that we already failed over in the past but the primary internet connection is good now - so we want to go ahead and remove that route. In the event that our measurements exceed the thresholds we've set for the primary internet connection, this script will go ahead and add a static route to our destination - in this case it's a single host, but again it could be just 0.0.0.0/0 for a default route. If our measurements are outside of the bounds that we've specified, this will go to the FirePower, create any network objects and routing objects it needs to, inject the static route over the LTE modem, and then deploy the policy. Now if it runs again and we're still outside the thresholds - what it will do is go back out to the FirePower and just check to make sure that route is still there - and if so make no changes and quit.
Okay, now let's go ahead and test our script. In order to simulate a failure, we'll go ahead and change the max latency to about 15 milliseconds. My primary internet connection usually stays between 20 and 30 milliseconds, so 15 should be well below that. Before we go ahead and run the script, we're gonna run a quick traceroute to check what path we're taking out to the Internet currently. All right, and as we can see that is going out my current internet connection - so now we'll go ahead and run our script.
And we'll see pretty quickly that the average response time is 37.7 milliseconds - which is above our threshold of 15. And our script informs us that that does violate the thresholds that we've set, and it is going to go ahead and fail over to the secondary internet connection. And we'll see that it creates our routing object out to 9.9.9.9 via our gateway of 192.168.5.1.
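The threshold check and the add/remove decision described above can be sketched as a few pure functions. This is an illustration under stated assumptions, not the author's script: the names `summarize`, `violates`, `decide`, and `Action` are invented here, lost pings are represented as `None`, and a run with no replies at all is treated as a violation.

```python
from enum import Enum
from typing import Optional, Sequence, Tuple


class Action(Enum):
    ADD_ROUTE = "add_route"        # fail over: inject the backup route
    REMOVE_ROUTE = "remove_route"  # fail back: delete the backup route
    NO_CHANGE = "no_change"        # already in the desired state


def summarize(rtts_ms: Sequence[Optional[float]]) -> Tuple[float, Optional[float]]:
    """Return (loss_percent, average_latency_ms) for one batch of pings.

    Each entry is a round-trip time in milliseconds, or None for a lost ping.
    """
    if not rtts_ms:
        raise ValueError("at least one ping result is required")
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    loss_percent = 100.0 * (len(rtts_ms) - len(replies)) / len(rtts_ms)
    average = sum(replies) / len(replies) if replies else None
    return loss_percent, average


def violates(loss_percent: float, average_ms: Optional[float],
             max_loss_percent: float, max_latency_ms: float) -> bool:
    """True when the primary link is outside the configured thresholds."""
    if average_ms is None:  # no replies at all counts as a failure
        return True
    return loss_percent > max_loss_percent or average_ms > max_latency_ms


def decide(violation: bool, backup_route_present: bool) -> Action:
    """Pick the change needed so repeated runs are idempotent."""
    if violation:
        return Action.NO_CHANGE if backup_route_present else Action.ADD_ROUTE
    return Action.REMOVE_ROUTE if backup_route_present else Action.NO_CHANGE


# Example: one lost ping out of four, checked against the 20% loss limit
loss, average = summarize([30.0, 40.0, None, 50.0])
action = decide(violates(loss, average, 20.0, 2000.0), backup_route_present=False)
```

Keeping the decision separate from the firewall calls means the same logic covers all four cases from the walkthrough: healthy with no backup route, healthy with a leftover backup route, unhealthy with no route yet, and unhealthy with the route already in place.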
After that's been done and the route has been added, we'll go ahead and deploy the policy. Because policy deployments can sometimes take a minute - the script will continuously check to see what the status of our deployment is, back off for about five to ten seconds, reach back out, and check it again to see if it has been completed.
Okay, now that our deployment has successfully completed - our script does inform us that traffic has been failed over to the backup connection. Let's go ahead and verify that by performing our traceroute again. All right, and as we can pretty quickly see, our second hop in the list is 192.168.5.1 - so we are going over our LTE internet connection. We can also validate this by going back to the FirePower interface, going to our static routing config - and we see that we do have our backup route out the LTE backup modem interface, for the /32 host route that we configured via our gateway. Now as I was mentioning before, if we run this again while it's failed over - we'll see that the loss and latency still violate the thresholds, but the route already exists, so there are no changes necessary. So let's go ahead and fail back over to our primary internet connection, by updating our max latency back to the 2000 milliseconds and running our script again.
Now in this case, we see that our average response time is only 55 milliseconds - which is far less than the 2000 milliseconds configured. So our loss and latency are definitely within the thresholds that we have specified. So next, our script is going to go ahead and reach out to the FirePower device, find the existing static route that we have configured, and delete it. Then we go ahead and deploy the policy changes - which happened pretty quickly this time around. And then we can go ahead and check our traceroute one last time - and we'll see that the connection has failed back over to the primary internet connection.
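The deployment wait mentioned at the start of this part (check the status, back off a few seconds, check again) could look roughly like the polling loop below. It is a sketch only: `wait_for_deployment`, the `get_state` callable, and the state names `DEPLOYED` and `DEPLOY_FAILED` are assumptions made for illustration, and the actual firewall API call is deliberately left to the caller rather than guessed at here.

```python
import time
from typing import Callable, Iterable


def wait_for_deployment(
    get_state: Callable[[], str],
    done_states: Iterable[str] = ("DEPLOYED", "DEPLOY_FAILED"),  # assumed names
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_state() until it returns a terminal state, then return it.

    get_state is supplied by the caller and is expected to ask the firewall
    for the current deployment status. Raises TimeoutError if no terminal
    state is seen within timeout_s seconds.
    """
    terminal = set(done_states)
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state in terminal:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"deployment still in state {state!r}")
        sleep(interval_s)  # back off before checking again


# Example with a fake status source standing in for the firewall
fake_states = iter(["QUEUED", "DEPLOYING", "DEPLOYED"])
pauses = []
result = wait_for_deployment(lambda: next(fake_states), sleep=pauses.append)
```

Passing `sleep` and `clock` in as parameters is a design choice of this sketch so the loop can be exercised without real waiting; in normal use the defaults apply.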
Again we can verify this by going back to the FirePower device - and if we look at our static routing config now, we have no host route out the LTE cellular connection.
I hope this video is helpful, and if you're interested in using the script or learning more about it, I'll go ahead and post the code to GitHub - check the video description for a link. Well, that's all I had for today then - thank you for watching!!
Info
Channel: 0x2142 - Networking Nonsense
Views: 780
Keywords: network automation, python, firepower, fire power, snort, ids, ips, security, lte, modem, backup internet, failover, 300-735 SAUTO, 300-735, SAUTO, DevNet, DEVASC, DEVCOR, DevNet Expert
Id: Mf5zIt9HFxk
Length: 13min 45sec (825 seconds)
Published: Fri Jul 17 2020