Juniper Networks End-to-End Automation with ServiceNow

Captions
Some say that he writes code during his REM cycles, and that the first language he ever learned was C, even as a child. All you know is he's Calvin Remsburg.

That was amazing, thank you for that. But I'm sorry that I didn't warn you at all: it couldn't be further from the truth. I'm a classic network engineer that just happened to learn network automation a few years ago. So again, my name is Calvin, I'm a sales engineer, and what I want to show you today is a demonstration of how to use the extensive APIs from Juniper's Junos solution, as well as the Mist environment, to fully automate your network environment.

I've got a series of demonstrations that we're going to show you today. One, we're going to ZTP an entire site through the APIs by leveraging ServiceNow. Then we're going to add a VLAN to an existing SD-WAN site. Then we're going to address one of the hardest problems that I faced as a network engineer in the enterprise, and that was this concept of a big red button: how to safely isolate a site once it's been identified as a cybersecurity risk within your environment, and how do you safely remediate that, or reintroduce the site back into the environment? Then we'll talk about one of my favorite conversations, which is VXLAN and EVPN, and we'll show you how you can fully automate the configuration aspects of that environment. And if we have enough time, we've got a little special thing up our sleeves for a bonus demonstration.

So I've got one more slide, and that's it; everything else is going to be a live demo. This right here is the network automation architecture that we have. We've got ServiceNow as our configuration change tool, so all configuration changes within our environment will be instantiated from ServiceNow. We're going to pass the information over from the ServiceNow forms into Ansible. Ansible will reach into GitHub and make sure it's got the latest copy of the project. Then it's going to reach out to our network source of truth. In this instance
we're using NetBox, but that can definitely be something like SolarWinds, or some kind of internal CMDB, or Infoblox. It doesn't really matter. In fact, none of these icons on this dashboard actually matter that much: every single component can be ripped and replaced. And that is because Juniper has always been about that automation story, right? We've had an API before we had a CLI, way back in the 90s, when everyone thought we were crazy for it. But that type of focus on automation throughout our entire history has led us to create some really unique opportunities for our customers, and we're hoping to showcase just that today.

So with that, as I promised, that was my last slide; everything else is a live demo. What we have here is my ServiceNow dashboard. Now, this dashboard really has two main purposes in life. One is to perform all those network changes that we just talked about, but the other one's a little bit more exciting, so let's talk about that really quick. What we're able to do with Juniper's APIs and Mist's APIs is actually give you live information as to exactly the things happening within your environment. I can get a status update for how all my access points in my environment are doing right now: I can see I've got three that are disconnected and one that's connected. We'll get to that in just a second. But I can also take a look at my Mist wired inventory, I can look into my SD-WAN environment as well, and I can automatically create incidents inside of ServiceNow whenever a specific event happens within my environment. So for instance, if I'm running BGP in my data center, I need to know whenever I have a BGP outage. These incidents were actually automatically created just through the API integration that we can build with ServiceNow. So when my teammates come in for the day and load the dashboard, they have incidents assigned to them automatically, saying exactly what happened and when it happened, and giving them as much
information from the syslog as we can, so they know exactly what they need to do in order to remediate that problem.

In addition to the APIs providing all this live information, we can also give insights into other things happening within your environment. We can track who's making manual configuration changes. Not necessarily to ostracize them, but in reality, when you start building network automation, you typically start with small wins and small victories, and it's good to know exactly where the automation tools have some gaps and people have to make manual configuration changes. Here we can track who's making changes on what devices at what specific time. And here, one of my favorite things, and we'll definitely get into this, is the ability to have an exact notion of what your data center configuration is: are all of your devices running the golden configurations, or do you have one small variation on any of these devices?

In addition to the networking side, let me just show you that this also applies to security, and that's because Juniper's SD-WAN solution is built on our mature next-generation SRX firewalls. Because of that, we get visibility into things like IPS violations, firewall violations, and content filtering. And again, because everything that we do with Juniper is through the APIs, we can take that information and import it into whatever tool the customer is running within their environment. This is no longer really a conversation of a vendor selling a single pane of glass. This is more like a zero pane of glass, saying: you bring your tool to us, in this case ServiceNow, and we'll send our data to you in a way that you can ingest it and actually help automate your operations.

Okay, so going back to our first use case, which was that we need to zero touch provision an entire site. Well, let's go ahead and take a look at what that site actually looks like right now. If I move over to my Mist portal, I can see that I have a site here named bootstrap to site. Not a lot
of information regarding this: we have no country, no time zone really associated, no address. And if I look at the devices, I can see that they've been added to the Mist portal, they're in my inventory, and they're assigned to the site, but they really have no configuration. Here I can see we've got a virtual chassis, and I've got things connected to it, but as Sujit had pointed out with our configuration, we actually have no port profiles or port configurations assigned to this switch. In addition to that, we have access points that are also connected to these switches. I can see that they all have the name of bootstrap AP, and there's no name for our switches.

Okay, so what do we need to do in order to fully provision the switches, the virtual chassis, the VLANs, the port configurations, the host names, SNMP strings, all that stuff, in addition to creating the wireless networks as well? Well, it's a little bit easier than you might imagine because of the power of our APIs. Let's move back over into our ServiceNow portal, and what I'm going to do is select the wired and wireless operations. Now, from here I've got a couple of different things that I can do from ServiceNow, but I'll simply select Configure Network Devices with NetBox. That's a really interesting point, because what we're doing here is we are not actually having the user pass in any data. They are not logging in, they're not filling in any form fields here. All they need to do in order to fully provision this site is select the name of the site from the drop-down menu. In my case, that site's name is Katy.

But let's just pivot really quick to talk about where that NetBox component really comes in. What we're using, again, is our network source of truth, and in this case that's NetBox. I've got a list of different sites in here, and inside one of these sites, Katy for instance, I have things like the facility, the description of the site, time zone, address, and contact information. I can also see I have the
VLANs provisioned, and here, if I look in the network devices, I've got three different access points, I've got an SD-WAN firewall, and I've got a handful of switches, a couple of which are in virtual chassis mode. This is really important, because all of our data actually lives here inside of NetBox; we're not requiring the user to fill it in manually. What this enables is that your teammates who are a little bit apprehensive about using network automation to do their daily tasks (maybe they don't want to learn Python, maybe they tried to learn Ansible and it just didn't resonate with them) get to use a very easy-to-use and very familiar tool in NetBox. They just put all the configuration data inside of NetBox, and through the API integration we can grab it at runtime and automatically provision all of these sites.

So without any further delay, I'll go ahead and select the site name of Katy, and I will click this Order Now button. Now, from the network engineer's perspective, that's it. My job is done; I can effectively go home for the day, I did a very good job. But in reality, what we've actually created is an approval process. Maybe you want a team member to review your site and make sure that all the data is entered correctly inside of the NetBox environment. Maybe it's an architect or some other person. But we want to have this approval process to make sure that all of our changes have been validated by a teammate before we initiate the automation. So I'm going to go ahead and emulate that approval process by selecting my name. And before I click this Approve button, and this is really important, let me just go ahead and show you Ansible Tower, because as soon as I click this Approve button, what we're going to see back on Ansible Tower (I need to log in) is my jobs. I've got all of my jobs sorted in a descending fashion, and that's really important, because when we click that Approve button, we're going to see this job reach out to GitHub, it's going to grab the latest
copy of the project, then it's going to reach out to NetBox, it's going to talk to all those APIs, and it's going to fully provision our site. So without any further delay, I'm going to click on this Approve button right here. Now let me see, did that get approved? All right. Let me move back over into my Ansible, and I can see that we've created a new job here called ServiceNow Configure a Site with NetBox. Now, on the left-hand side I see the variable that we passed from ServiceNow, which was only the name of the site. And with that, on the right-hand side, what we can see is that we're reaching out to NetBox and grabbing all the information for that site, saying: give me your physical address and contact information, give me the firewalls, the switches, how these are connected, give me the access points, the VLANs, all that information. And oh, by the way, we also sent a message to the team that we have successfully provisioned the site with NetBox. Not only that, but in my Slack client, when I got that message, I can actually see the photo of every one of these devices that we got, sourced either from NetBox or from the Mist cloud. So I can see exactly the name of the device and where it fits within the environment, and everyone on my team just got a notification that we've successfully configured this site.

Let's go back over and see. First of all, that took roughly 31 seconds for us to fully provision that site. If I move back over to the Redtail site and refresh my page here, what you'll see, magically, is that the switches have been renamed. Not only have the switches been renamed, not only have the access points been renamed, but we also have real configurations on there. I see I've got my network trunk ports, I've got my access ports, I've got VMware servers on here as well. And if I open up a VM console to one of those devices, I can see where all my pings were failing until just a few seconds ago. Now this site can actually reach the public internet: I can reload my web pages, and
now I'm getting full internet access back at my site. So just like that, we were able to successfully provision virtual chassis switches, access points, VLANs, host names, etc., simply by clicking a button in ServiceNow.

Now my next trick, my next demonstration, is to show how to add a VLAN to an existing SD-WAN site. For that, let's go ahead and look at my SD-WAN orchestration platform for Juniper; that's our Contrail Service Orchestrator. Here I've got three spoke sites and I've got a hub. Let's go ahead and take a look at one of my sites, named Magnolia. Inside of this site I've got five VLANs: we have VLANs 10 through 14. They're in different VRFs, we've got the IP prefixes, everything's nice and proper. But my task for the day is to add a new VLAN to the environment. So what do I need to do for that? Let's go back to our ServiceNow portal. I'll click on the Network Automation button here, and this time I'm going to be focused on my SD-WAN operations. For this task, we're not going to select Create a New SD-WAN Site, but rather Create a New SD-LAN Network. Now here, another simple form, just a little bit more information for us to fill out. First things first, let's go ahead and select the name of the site; in this case it would be Magnolia. For my VLAN name, let's just do Network Field Day 23, that's fine, and the VLAN ID will also be 23. Oh look, we got some data validation, because I entered the wrong key. There we are. So we're going to do 10.23.1.1/24.
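The kind of data validation the form performs at this point can be sketched roughly as follows. This is a minimal Python illustration, not the actual ServiceNow form logic: the function name `validate_vlan_request`, the field names, and the shape of the returned dictionary are all assumptions made for the example. Only the sample values (site Magnolia, VLAN 23, 10.23.1.1/24) come from the demo.

```python
import ipaddress


def validate_vlan_request(site, vlan_name, vlan_id, gateway_cidr):
    """Check hypothetical form fields and return a normalized payload.

    Raises ValueError when any field is missing or malformed.
    """
    if not site or not vlan_name:
        raise ValueError("site and vlan_name are required")

    vlan_id = int(vlan_id)  # raises ValueError for non-numeric input
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID {vlan_id} is outside the range 1-4094")

    if "/" not in gateway_cidr:
        raise ValueError("expected address/prefix, for example 10.23.1.1/24")

    # IPv4Interface rejects malformed addresses (such as a letter in an
    # octet) and invalid prefix lengths by raising a ValueError subclass.
    interface = ipaddress.IPv4Interface(gateway_cidr)

    return {
        "site": site,
        "vlan_name": vlan_name,
        "vlan_id": vlan_id,
        "gateway": str(interface.ip),
        "prefix": str(interface.network),
    }


if __name__ == "__main__":
    print(validate_vlan_request("Magnolia", "Network Field Day 23", 23, "10.23.1.1/24"))
```

Rejecting bad input before the request ever reaches the automation pipeline means the downstream playbooks can assume well-formed values.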
And we're just going to say nfd23. And the pre-shared key for this: by the way, we're not only provisioning the SD-LAN network, but we're also provisioning the wireless network on the Mist side so that users can connect and associate to it. So for that, I'm going to type in juniper123 as my super secret password. I also have the opportunity to provision any of the access ports on my devices for this new VLAN, but I'll leave that blank, and for now I'll just click on the Order Now button.

Now, before we go through that approval process again, let's take a real quick look at that site, Magnolia. At the site Magnolia, what I've got here is a switch and an access point. If I drill into this, I can see that I've got some trunk ports; everything looks fine and dandy. And if I look back over on the Magnolia employee PC, I can see that we've got someone watching YouTube, I've got some traffic generators going in the background, everything's chugging along, everything is really, really happy, and we've got a continuous ping to the internet. Okay, so everything's good so far. Let's go ahead and create this new SD-LAN network by clicking on the Approve button here, and we'll replicate the order of operations that we saw previously. I clicked Approve, I got the approval notification right here, and we'll move back into Ansible. What I can see now is that I've got a new task that's been instantiated, creating a new SD-LAN network. On the left-hand side you see all the variables that we passed, and there are some other ones that I have hidden for right now. On the right-hand side, again, what you see us doing is compiling the JSON payload and then pushing the configuration through the Mist APIs, I'm sorry, to the CSO and the Mist APIs. If I move back over into my SD-LAN environment, what I'm going to see here magically show up is nfd23, and it's been data modeled. Now this is really important: if you're looking at getting into automation, you may
have heard of data models, you may have heard of things like YANG. Just know that everything that we do at Juniper, again, is API driven. We take care of that data modeling for you, we take care of all the YANG communication, and we make sure that all the data that's been entered in the form is correct; for instance, that no one's put a letter in an IPv4 address. And here the team just got a notification in the Slack channel that we have successfully deployed a new SD-LAN network. We see that the VLAN ID is 23, we see the IP prefix, etc. And if I come back over into my Mist site here and look at my wireless LANs for Magnolia, we suddenly have a new nfd23 that's been added. Just to validate that that just happened: we see that less than a minute ago we created a new nfd23 wireless LAN.

So just like that, we're able to add networks to switches and wireless access points, creating the SSIDs for the WLANs, as well as adding VLANs to a next-gen firewall, which happens to be in an SD-WAN orchestrator, which then provisions things like my firewall policy and my SD-WAN policy, all of those things completely obfuscated from you. Now, when you think about the skill set that is required to operate in an automated world like that, you can start to see the immediate business value for your organization, in that you no longer have to hire your CCIEs to do your menial tasks like this, right? Instead, you can take those critical resources in your environment and put them on harder problems to solve, things like architecture or advanced troubleshooting. So, extremely valuable benefits to your organization, simply through the APIs that we expose at zero dollar cost to your environment.

Now let's go ahead and talk about that big red button. This, again, is one of the hardest problems that I've seen within my environments. It's typically an annual project where you've got someone that wrote a Perl script back in 2007, that person left the organization, no
one really knows how it works, but everyone's got to figure out how to safely secure a site that's been identified as a cybersecurity attack. So for us, we'll just go to the Network Automation tab. This time I'm going to click on the Cyber Security Event, and for this I've got two options available: one is the ability to isolate a site, and one is to recover. I probably don't need to explain which one does what, but let's go ahead and go to that site isolation. Now again, all that's required for me to do here is simply select the name of the site. In this case we're just going to take that site Magnolia offline, and click this Order Now button. Obviously, for something like this you would want to have some kind of approval process, some kind of pipeline with your network security team, to make sure that yes, we really, really want to take this site offline. But in our case, let's just pretend that I've gotten the sign-off from my cybersecurity team. I will go to my request and click this Approve button, and let's see how long it takes for us to isolate this site. We'll go back over to Ansible Tower and look at my jobs that I've got laid out here. I can see that I've got a new job here, Cyber Security Isolate. The only thing that we passed to it from ServiceNow is simply the name of the site, and on the right-hand side you can see that we are configuring the JSON payloads to take this site offline. Now, that all in all took 12 seconds for us to execute. Who said Ansible's slow? Well, it kind of is, but that's beside the point: 12 seconds is a lot faster than our human hands can do things.

So let's go back over to our site Magnolia, switch one, and take a peek at that configuration now. What you see is that on my switch port configurations, I've got a port profile associated called the black hole network. That's on ports two through nine and also eleven. I'm saving my uplinks to my SD-WAN firewall and I'm also saving my
access point. But this isn't just a dead-end VLAN: the port profile also has a setting that disables any physical interfaces on the boxes. So if we return back over to our console here, what I'm going to see is that the pings have stopped coming, and we're going to start to get some ICMP unreachable messages. My YouTube video's now got the spinny wheel of death, indicating that we've got some kind of site operation. And again, we were able to execute this in just under 30 seconds; actually half that, at 11 seconds, we were able to isolate this site.

Okay, now as this goes through, we're going to let those ICMP messages come through, and instead let's talk about what it takes to recover a site that's been taken offline. We're going to return to our ServiceNow portal, and if you've been paying attention, you probably understand the workflow that we've created for this, and it's completely intentional. Again, when you think about network automation, you typically think about a lot of the resistance you get from classical network engineers like myself. The reality is that they don't want to have to learn all these new tools and processes. So what we can develop here is a single pipeline that handles multiple, kind of disparate workflows. Whether you're provisioning SD-WAN sites or data centers, taking things online, or doing software upgrades, it's all the same workflow, so that everyone feels comfortable and familiar no matter what it is that they're doing within the environment. So in our case, we're going to select the Cyber Security Event, this time I'm going to select the Recovery button, and again all I need to do is select the name of the site. And again, we're just going to go ahead and approve it. You definitely want to make sure your cybersecurity team has signed off that a site is available to come back online, so we've gone ahead and created that pipeline. Let me click this Approve button, and we'll see just how long it takes for that site to finish its job so we can
recover from that site isolation. Now I'll look into my template here. Again, the only thing we needed to pass to it was the name of the site, and what we're able to do, again, is just reach out to NetBox, grab all the parameters for my site's configuration, develop that JSON payload, and send it to the REST API. This time it took two seconds longer; it took 14 seconds. But if we return to our site now and take a reminder of what the port profiles are, let me go ahead and refresh my page. Now I can see that we have the VMware servers with the correct port configurations. The black hole port profile is completely absent from the site because it's no longer applicable. We see an IoT device as well, and all of our site's communication is back online and operational. What this ultimately entails is that back here at the site, we see our pings coming back in through the terminal, and our spinning wheel of death has finally gone away. So we were successful in isolating a site and successful in recovering that site, simply by issuing a form within ServiceNow to do all that difficult work for us.

Now, next we're going to talk about, again, one of my favorite conversations, which is VXLAN and EVPN. And oh boy, I love the technology, but it is difficult to configure, that is for sure. So let's first bring some visibility back into this dashboard that we had here. We have this tool called the Data Center Compliance, and this isn't just a pretty graph: this is actually a live understanding of which devices in your environment are not running the golden configuration. If I click on the ones that are out of compliance, the false ones here, I can see I have two devices that are not in compliance: that is my distribution switch one and my access switch one. And if I move over into my data center topology, I can see that with access switch one affected and distribution switch one affected, well, employee PC one is going to have a pretty hard
time pinging its layer two neighbor over on this side of the data center, employee PC two. And if I click into this PC right here, I will see, sure enough, my pings to my neighbor are all unreachable, whereas my neighbor, employee PC two, does not have a problem communicating with its default gateway and above; everything is perfect on that side. So we definitely know that we've got some configuration issues. And if I just drill into this for a second, not to show off the CLI, but if I actually look at the full configuration for my device, I can see it's roughly, you know, maybe 10 lines of config. Obviously not enough to be a VXLAN EVPN configuration.

So let's go ahead and bring the remediation to our data center. We're going to again do this through the ServiceNow portal, because all of our configuration changes are instantiated from there. We'll return to our home page, and from my data center provisioning tool I will simply select the name of the site. We'll go ahead and select the Redtail data center, which is our data center, and I will click this Order Now button. Now let's go ahead and go through that approval process, and I'm going to log in. There we go. Okay, so what we're going to see now is that when I click this Approve button, oops, right down here, that's going to kick off the automated workflow to our Ansible Tower. Now, this one's a little bit different than what we've been seeing before. Data centers have just got to be a little bit different, a little more complicated. But the reality is that this time I'm using the concepts of infrastructure as code to fully manage my data center configurations. What I mean by that is, if we drill into this task, the campus fabric central route bridge build, what we can see is that every single line of every single device's configuration is being generated into individual files, and then we assemble, like Voltron, all of those individual files into one configuration for each device.
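The generate-then-assemble step described here, together with the compare-before-push check the speaker goes on to explain, can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the playbook from the demo: the talk's pipeline uses Ansible and Jinja2, while this sketch substitutes the standard library's `string.Template` and `difflib` to stay self-contained. The fragment names, the variable names, the VNI value, and the short Junos-style `set` lines are all hypothetical.

```python
import difflib
from string import Template

# Hypothetical per-section templates; real fragments would be far larger.
FRAGMENT_TEMPLATES = {
    "10-system": Template("set system host-name $hostname"),
    "20-vlans": Template("set vlans $vlan_name vlan-id $vlan_id"),
    "30-vxlan": Template("set vlans $vlan_name vxlan vni $vni"),
}


def render_fragments(device_vars):
    """Render each configuration section into its own fragment."""
    return {
        name: template.substitute(device_vars)
        for name, template in FRAGMENT_TEMPLATES.items()
    }


def assemble(fragments):
    """Join the fragments, in sorted name order, into one configuration."""
    return "\n".join(fragments[name] for name in sorted(fragments)) + "\n"


def config_diff(running, golden):
    """Return unified-diff lines; an empty list means the device is compliant."""
    return list(
        difflib.unified_diff(
            running.splitlines(),
            golden.splitlines(),
            fromfile="running",
            tofile="golden",
            lineterm="",
        )
    )


if __name__ == "__main__":
    golden = assemble(
        render_fragments(
            {"hostname": "access-switch-1", "vlan_name": "nfd23", "vlan_id": 23, "vni": 10023}
        )
    )
    running = "set system host-name access-switch-1\n"
    drift = config_diff(running, golden)
    if drift:
        # Only a device whose running configuration differs would be pushed to.
        print("\n".join(drift))
```

Numbering the fragment names keeps the assembly order deterministic, so the same inputs always produce the same golden configuration and the diff stays meaningful.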
And then we push that to the remote devices, but only when that device is not running the golden configuration. This is possible because of, again, Juniper's NETCONF API that we've been supporting on our platforms across the environment: your SRX firewalls, your MX routers, your EX switches, and your QFX data center switches as well. Since everything in the Juniper configuration is built upon that data model, we can simply replicate that data model here through the power of Ansible and Jinja2, create the full configuration, and then push it to the remote devices. In addition to that, because we're building the full configuration, there are literally no limitations on the kind of design requirement that you have. If you want to run a three-stage Clos in your data center, great. If you want to run a 72-stage Clos in your data center, we should probably talk to your architect, but it doesn't matter. It doesn't matter if you want OSPF, or RIP next to RIP next generation. It does not matter, because if it's supported on the Juniper command line, we can support it here in our configuration template.

So let's go back and see the actual application of our data center configuration. Again, what we're doing is logging into every one of the QFX boxes in our data center, grabbing the running configuration on that box, and doing a literal Linux diff between what's on the box and what we generated. If that diff is greater than zero, then we know that the device is not running the golden configuration and is now a target for the actual configuration push. So here, not only do we see that only the two switches that were out of compliance got the configuration, we actually get that diff. We know exactly what changed, and because we're using Ansible Tower, this is permanent within our audit log. Now we know exactly what changes happened within the environment: what was added, what was removed, who instantiated the
request within the environment, so you can tack that on to things like your change control numbers, etc. And the last task that we have within our playbook is to just re-run our golden configuration diff task. So I come back over into my data center now and open up employee PC one, and I now see, magically, my pings being restored: I can now ping my neighbor, employee PC two. And if we look at that configuration that we had previously looked at on access switch one, it's now a little bit more than 10 lines of configuration, to say the least. So again, we were able to successfully identify which devices in our data center were not running that golden configuration, and we were able to remediate them through ServiceNow so that we could restore services within the environment.

Now, there's a lot of other things that we can do through the ServiceNow portal; I just don't have enough time, really. There's things like doing software upgrades, or taking a data center device out of production for maintenance or replacement. There's a lot of really great, powerful things that we can do. But instead, what I'm going to do is check my time, and then we're going to see an actual bonus demonstration. This is something that we had alluded to within the previous slides, and it's not going to be using ServiceNow this time. So let's revisit our network automation architecture and remember all the tools that we've been using within our pipeline. One of those tools that we've been using to keep in communication with our team through all these network configuration changes has been Slack. Slack is the pivotal application where all of our teammates get all their update messages for their environment. Let's show you how to flip it on its head. This time, instead of reporting messages to Slack, we're actually going to use Slack to automate components within our environment. And again, all this is completely available and has been
available to all of our Juniper customers. No matter if you run Mist, if you run SRXs or EX switches, it doesn't matter: since the Juniper boxes all run the same base operating system, you get the full automation capabilities exposed to you. So let's go ahead and showcase how to use Slack to help with automation within our environment.

This time I'm going to walk through the environment from the perspective of a field engineer that's installing an SD-WAN site. Again, that's the firewall, the switches, the wireless access points, etc. I need to know whether or not the site that I've been provisioning is online and activated and has been successfully provisioned. But rather than pull out my corporate laptop, hook up a console cable to a switch, and know all the archaic commands in order to figure it out, or even worse, call the NOC to get a report on the current status of my site, I want to automate that reporting capability simply by using the same messaging client that I'm already using today, and that is my Slack client. Here, all I need to do is open up my Slack channel and type in the command sd-wan, and what I'll get is a prompt that comes up and asks me: which SD-WAN site would you like to validate? Well, I'll go ahead and select the site; let's pick on Katy again. We have the option of putting any additional information into this. Remember, this is a report that can be reviewed by all of your teammates, so sometimes a field engineer may want to say: hey, we're installing this firewall, it's underneath the cash register at a gas station, just for that kind of documentation purpose. For our use case, we'll just go ahead and leave this blank, and I will click this Submit button. Now, what we get back immediately is a message from our Slack bot to say that we have received a request to validate the site Katy, which is really important, because what's actually happening
when we submitted that form field right there is that we took the value that we had entered, the site Katy, wrapped it into a JSON payload, and sent it back up to Ansible. Ansible then logs in to our Contrail Service Orchestrator through the API, grabs all the information for my site, formats that response, and sends it back here into the SD-WAN bot. So I can see that for my SD-WAN site, activation has been successful; I see it's provisioned and it's currently up. I also see other characteristics of my site. So I know that it's time for me to pull down my ladder, put my boxes back in my truck, and head to the next site to do my installation. Again, without requiring any kind of command line interface, or doing a site survey, or connecting into the actual network device through my laptop, or calling somebody in a NOC, I'm able to simply use my messaging client.

And it doesn't really stop there. I can also do the same thing for my LTE circuits, right? I can validate the health and status of an LTE circuit. Again, because the Juniper SD-WAN solution is built on that mature next-gen firewall, we can terminate DSL, ADSL, VDSL, LTE, any kind of network transport connection. But that in itself creates some challenges. In this case, from the Juniper side, we've got a lot of data that we need to gather from that LTE modem. Let's actually log back in to Ansible Tower to see exactly what's happening right now. If I move back over to my job, I can see that we've created a new job template here for WAN LTE Report. What we're actually doing is running three separate NETCONF RPCs to the actual box to say: give me all the information on your modem, give me all the information on your wireless health, find out if there's any signal interference, etc. And then we're going to compile that into a nice pretty picture, or actually a nice pretty message, and again send it back to that client. So it's important for us to at least give
them that immediate feedback from the bot to let them know: hey, we've received your request, please stand by. This typically takes about 30 seconds for us to generate. From here we're going to go ahead and build out that report and send that payload back over to the Mist client, so they don't have to open up any of that command line and interface with it from there.

One other thing: let's go ahead and see our Slack bot right here. Here we can see in the LTE channel we've got the connected time, I can see the network that I've been connected to, and the results of the three different RPCs are structured right here in a format that's easy to read. Again, everyone on my team gets this notification, or you could have made it go directly as a direct message to that specific user.

Let me point out one other thing that's pretty neat about this. We talked a lot about dynamic packet captures in the previous session. It's one of those real key, important things that separates Juniper: we can actually capture the problem when it happened, and we can filter out all the other noise from the packet capture so that you only get that specific event. And again, we only capture the headers. So let me show you how we can use Slack to get immediate feedback on the problems within my environment. Here we've written just a quick bot called problems, and here I can select which type of event I would like to see. In my case, I want to look at 802.1X issues within my environment, because everyone knows that if there's ever a problem with access, NAC is the first thing that gets blamed, and sometimes rightfully so. But we want to actually validate whether or not that's the case. So I got a message from my client that says that we have received a request to validate that, and what we're going to get back in my Slack client on my desktop is all the 802.1X issues that have happened at that specific site within the last day. Not only do we see the 802.1X authentication failure, we also
see the SSID that this happened on, the MAC address of the host, and the packet capture. We can actually download that packet capture and open it on my workstation so I can see exactly what happened. I have the packets, and the packets just don't lie. So if I need to go back and do any kind of troubleshooting, think about having this kind of functionality enabled for your network operators. It's amazing, it's incredible, it completely transforms the way that we manage networks today. And here at Juniper, again, we've been doing this since the 90s; this is nothing new for us.

Everything that you see here today is completely open source. It's completely free to run within your environment. If you're interested in it, I've developed this entire thing on a Twitch live stream, and we have the videos up on YouTube. If you're curious about how you can learn network automation, feel free to reach out and join us on these Twitch live streams or watch the YouTube videos, reach out to your local account team at Juniper, and visit the resources that we have exposed to you. Again, my favorite resource of all time is nrelabs.io, where you can learn all these network automation concepts that we've shown today in real time within your browser, without requiring any kind of login or any kind of additional tracking cookies or anything like that. With that, that concludes my demonstration for the day. Thank you for having me. I really appreciate you being here.
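
The chat-driven LTE report flow described in the demo (wrap the site name into a JSON payload, acknowledge the request, then compile the results of three RPCs into one readable message) can be sketched roughly as below. This is a minimal illustration, not the code from the demo: the function names, the section labels, the payload shape, and the sample values are all hypothetical. The only outside detail assumed is that an Ansible Tower job launch accepts variables under an `extra_vars` key; the talk does not show the actual payload.

```python
import json

# Hypothetical labels for the three RPC results. The talk only says that
# three NETCONF RPCs gather modem, wireless health, and signal information.
RPC_SECTIONS = ("modem", "wireless_health", "signal")


def build_request_payload(site_name: str) -> str:
    """Wrap the site name typed into the chat form into a JSON payload.

    The {"extra_vars": {...}} shape is an assumption for illustration.
    """
    return json.dumps({"extra_vars": {"site": site_name}})


def acknowledge(site_name: str) -> str:
    """Immediate feedback sent to the user while the report is generated."""
    return (
        f"We've received your request for site {site_name}. "
        "Please stand by, this typically takes about 30 seconds."
    )


def compile_report(site_name: str, rpc_results: dict) -> str:
    """Merge the per-RPC results into one plain-text chat message."""
    lines = [f"LTE report for site {site_name}"]
    for section in RPC_SECTIONS:
        data = rpc_results.get(section)
        if data is None:
            # Keep the section visible so a failed RPC is not silently dropped.
            lines.append(f"[{section}] no data returned")
            continue
        lines.append(f"[{section}]")
        for key in sorted(data):
            lines.append(f"  {key}: {data[key]}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample values are made up; real data would come from the device.
    print(build_request_payload("katie"))
    print(acknowledge("katie"))
    print(compile_report("katie", {
        "modem": {"state": "connected"},
        "signal": {"rssi_dbm": -71},
    }))
```

Sending the acknowledgement before the report is built mirrors the point made in the talk: the user gets immediate feedback while the slower data collection runs.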
Info
Channel: Tech Field Day
Views: 1,578
Rating: 5 out of 5
Id: j1KJByx7OLI
Length: 38min 16sec (2296 seconds)
Published: Thu Oct 01 2020