NANOG 68 100G+ Data Center Evolution and Challenges

Captions
Let me work that. OK, thank you. Thank you, Philip. Right, so my name is Nicolas Gagnon. You will have been exposed to a lot of French-speaking names this morning [names unclear], so it's a good thing to present this morning. Thanks for the opportunity.

For those who don't know EXFO, we do test orchestration, service assurance, and monitoring: test and monitoring solutions for customers that are looking to test, monitor, and do service assurance over fiber medium and fiber services. I was hearing yesterday Mr. [name unclear] talking about [unclear], and when he was mentioning DWDM, optical white box, things like that, I was very happy, because that's typically my domain. My domain of expertise is not, let's say, your peering, it's not BGP; it's more physical layer to layer 4, depending on which model you use. But I really believe that even if you don't necessarily touch or work with the fiber medium too much, it's definitely where the industry is going. It's not always fiber to the home and things like that, but definitely there's bandwidth pressure, and fiber is there to stay and to face the music, if I can say so. Let's start with that.

So, bandwidth pressure. If we're looking at the data center, metro network, and even data center interconnect network, all these fiber segments are put under pressure. As you see on the slide, you see many links, let's say a combination or a patchwork of 1 gig, but most of the time in major data centers it starts at 10 gig to the port, to the rack, going to the main distribution area, where people leverage 40 gig and, more and more, 100 gig; then getting out of the data center into the wide area network or data center interconnect, where 100 gig, mainly coherent-based transmission, is widely deployed these days. Some of our customers already working in this segment of the network plan 400 gig as we speak. All the speed is very intimately related to the transceiver
technology. The more transceiver technology comes to the market, and this year we're seeing a lot of transceivers introduced: PSM4, which will be good over single-mode fiber, optimized for up to 500 meters; PAM4, which will be used in the DCI up to 80 kilometers to, let's say, make an alternative to the more costly coherent-based transmission transceivers. So as you see more transceivers coming from the industry, people adopt them, because there's a big bandwidth push and increasing requirements.

One thing is interesting to understand also, and probably you will not be surprised about it, but when I speak to people that manage fiber, they are all surprised or impressed. We see, as I said, the big push for bandwidth requirements. Cisco VNI tells us in its latest report that about 5 zettabytes [figure unclear in the captions] of global data center traffic is the amount of bandwidth that is running everywhere on the networks, inside the data center, outside the data center, to the customer, in 2016. This volume will double in the next three years to pretty much 10.4 zettabytes. But the most important thing to understand is that this traffic is seventy-three percent confined inside the data center. It's rack-to-rack transmission. The photo or like you put on Facebook (not to promote only this service): before people see that on their page or their feed, it just updates all your friends' pages, and that's a lot of traffic that stays inside the data center. So of this amount of traffic, seventy-three percent is within the data center, ten percent only is DCI, between data centers, and a little bit less than twenty percent goes from the DC to the user. So that's definitely where we are seeing today more bandwidth pressure: inside the data center. So 100 gig, or migration from 10 gig to 100 gig, will be a very big topic of discussion and activity in the next few years by colos, by web-scale
companies, everybody that runs a data center, which of course needs peering, DNS management, and everything, but which also needs a very proper fiber infrastructure.

OK, this slide shows what I was saying about transceivers. The industry has developed two main categories of transceivers. The first is called, or coined, parallel optics, based on the MPO connector, so MPO to the transceiver. Parallel optics can give some benefits: you can break out and bring some bandwidth to different pods or racks without top-of-rack switching, without aggregation. Everything is done at the physical fiber-optic layer, and with the optical cards or optical ports on the server that we see more and more, you can run up to the server with the parallel link from the main switch at the entry of your data center and break it out. That's scenario number three on the picture. You have also scenario number two, which is a little bit more for 40 gig or 10 gig, that always uses the same MPO backbone cable but that will use an MPO connector to the transceiver in the switch.

The other category is what I call the WDM optics, or the serial duplex. It's going to be the same speed, 40 gig, 100 gig, 10 gig, but the big particularity is that you have an LC connector on the transceiver and you transmit and receive over two fibers. If you leverage, like in the CLR4 scenario, this kind of transceiver, then you will use WDM to have 4 times 25 gig line rate, for instance, and provide a 100 gig interface on the transceiver with an LC duplex connection to the other switching elements in your data center.

Leaf-spine: I believe you know that very well. This architecture has been proven to scale and to be very effective. I present the slide not to educate you about leaf-spine, because I believe that everybody in the audience knows what it is, but just to reinforce the point that you see a bunch of black boxes with patch panels, which makes a lot of connectors that can
be manipulated, and when you manipulate a connector, it's basic, it can be contaminated. Contamination of connectors is, according to our sources, ninety to ninety-five percent of the issues you will find on fiber-optic infrastructure, because most of the time people don't have the proper equipment to inspect or clean, but most importantly they don't necessarily have the right training to know that they can do a lot of damage if they take the connector and just gently swipe it on a finger. Oil contamination is a very big issue in the data center.

Before I expand a little bit on inspection and contamination, I just want to present this slide on the basic tests: measuring insertion loss, or optical return loss, or reflectance. Those are parameters that you need to test sometimes at construction, sometimes at the upgrade of the network, sometimes in troubleshooting because something went bad and you want to know where it comes from. Then you need, I'd say, basic testing tools, physical layer tools, to test the fiber-optic medium, like OLTS or OTDR. Everything I present here is very generic. You see the blue products that are associated with my company, but we're not the only ones that sell OTDRs, OLTSs, or things like that, so I built this slide deck to be as vendor-agnostic as possible, to elevate the discussion and the education.

So, to put everybody on the same page if you're not there already: you have the OLTS and you have the OTDR. First thing, when you use an OLTS you need two units. These units are a light source and a power meter; well, an OLTS by definition integrates the light source and the power meter, both elements inside one tester. Then you can do bidirectional loss measurement. It will also measure the length of the fiber, and according to the standard, because the standard gives you some values for some physical layer standards or IEEE service standards, it will tell you,
and this information is encoded in the tester, that for such a line rate, with multimode or single-mode fiber, you cannot go over such a distance or such a loss budget. Then the unit will tell you, or tell your contractor, a big pass or fail: green pass, red fail. The diagnostic is easy to understand, and then it's easier to troubleshoot the issue after. So again, OLTS: two units, can do bidirectional, and it will give you the total insertion loss and total optical return loss, but it will never tell you where the fault is; it does not map the fault.

The OTDR, on this side, works with one unit. It's like a radar: it sends light and receives the signal that comes back, and the signal comes back because of impurities in the fiber or a bad connection somewhere. A bad connection at the panel may create a big reflectance. Minus 30 dB of reflectance is not good at all; I'll come back to that, it can create issues. With the OTDR, if you want to measure properly the first connection and the last connection, you need to use these launch and receive fiber tails. All the OTDR vendors ask you to do that; it's a proper methodology to characterize the input and output connectors. If you just want to troubleshoot, and you are interested to know where the break can be on the fiber and not to get the precise measurement of the entry and output connectors, you can plug a very short fiber right there and you will get the break, if there is a break somewhere. But if you care to measure the connection there, you need to use this launch and receive.

So the claim to glory of the OTDR is that it can map the issue. One good example is: what is the back reflection at this patch panel, where the link, instead of going to this switch, can go out of the data center? If there is too big a back reflection there, an EDFA can stop working: automatic power reduction mode drops the power for safety reasons if, for example, this back reflection is worse than
minus 30 dB.

OK, so here, what I present, and I still take the leaf-spine example: in this example the blue connections you see on top can be copper, so fiber begins at the leaf switch, the edge switch, and goes to the core switch, where you see the spine at the bottom. This use case represents something very typical that people need to do in the data center: moves, adds, and changes of transceivers. If you change one thousand ports of leaf switches and spine switches in a migration to 100 gig or 40 gig, with a bunch of transceivers, you will find some issues at a certain statistical percentage inside the pool of transceivers. So the best way to achieve this migration effectively is to have a method of procedure.

The method of procedure that we recommend is quite clear and simple. If you touch a connector and you disconnect something, inspect with the probe. Then you will get a pass or fail on the instrument or on the interface. Each time you manipulate, you can contaminate; keep that in mind, and train your people to do that. And before you inspect, you disconnect, but you test the old link with an OLTS. You don't expect an issue; you just want to validate that this permanent link, or this channel link without the patch cord, is good and within the budget. You do that for one reason, and you want to do it very quickly, and that's why the OLTS is the best technology: it tests that link in three seconds at two wavelengths. The use of two wavelengths will help you diagnose whether you have a macrobend, for instance, and it will validate the budget. Once the permanent link is clear and you know it's good, then you move your transceiver, and if you have an issue on the transceiver, you don't ask yourself if it is the fiber or the transceiver: the fiber has just been validated. It's this kind of method of procedure that saves a lot of time in the end.

OK, so that's another use case. This use case is more at construction, when you construct these permanent links and you want
to follow the standard recommendation. The standard recommendation, TIA-568, the -3 for instance, tells you: don't test the total link, test the individual permanent links. There's one reason for that: the standard is written around the OLTS [wording unclear in the captions], and the OLTS cannot map the fault. So if you do something more complex than just one basic link, you have more chance to have an issue that you cannot map. So you can use the OLTS if you follow the standard, as I said, and you test individual permanent links; the OLTS is good. The OTDR is good also, and because you want to test the permanent link including the connections, you need the launch and receive, as I said. But what we promote more and more is, if you go with the OTDR, whatever the vendor by the way, because the OTDR can map the fault at each and every connection, you can take your receive cable, bring it here, and test your complete link. Then you map each and every connector position, and that's the power of the OTDR.

I mentioned MPO briefly at the beginning of the presentation: one of the two categories of transceivers made by the industry uses MPO/MTP. You all understand that MTP/MPO are multi-fiber connectors. The industry uses them right now in either 12 or 24 fibers; 24 fibers is two rows of 12, and this connector is compliant, if I can say, for single-mode and multimode fiber. So if we're talking about multiple fibers aligned precisely in a row, it's very important, when you connect these two elements, the male and the female, to have a good physical contact, and that's the reason why we have guiding pins and guiding holes: to really ensure the perfect physical contact (not physical layer, physical contact) and the alignment of the 12 fibers in between. So that's the point here. Let me do this extra animation here. You have a good physical contact on your MPO if the guiding pin and the
guiding hole are well aligned. If you do multiple connects and disconnects, you will increase the tolerance of the mechanical elements, so you're going to begin to have more insertion loss, and you're going to begin to have return loss that will go back to the laser and can destabilize the transmission.

MPO is a very good connector, because it gives very nice density and it eases the installation of 12 fibers or 24 fibers at a time, as ribbon. When it's in a backbone, behind the rack, connected one time, cleaned properly at the connection, and never disconnected: no problem. If you begin to bring that to the transceiver, and you do a lot of moves, adds, and changes of transceivers on the same channel link that goes into this transceiver, you will have issues. You will begin to see reflectance issues. And one of the big trends with these connectors is the migration to 100 gig. 100 gig, especially with a 25 gig lane rate on each fiber, is less tolerant to reflectance than 10 gig lanes. We are kind of at a cornerstone in the industry: people migrate from 10 gig lanes to 25 gig lanes, and they see some kind of issues of reflectance at connection points [word unclear]. So it requires some kind of testing or troubleshooting consideration.

I was talking MPO. Of course, you see here a nice 24-fiber multimode connector. These connectors need to be inspected. The industry came out with tips on the probes of different vendors that can do that job, and you can inspect each and every fiber, as you see on the picture on your right. Different vendors have different options: you can go up to 400x magnification, and you have very easy to use auto-center, autofocus products on the market. This connector can be inspected the same way as a single LC, with a different tip that costs a little bit more money because there are more mechanical parts, but it's as easy as a single fiber.

MTP/MPO cable configurations: again, they come in different
polarity, or pinout. So this is another test consideration that you have if you use this technology, or that your contractor will have to test, because you need to make sure that if you use a Type A, fiber to fiber, straight, no reverse, it's connected with the same type of cable extension. Your fiber 1 from one side will not reach what it should reach if you connect by mistake a Type A cable to a Type B. So before you connect these components you need to validate the pinout, or the polarity. Simple as that. There are solutions for that on the market.

Another big trend that we see, of course, is for those who go with the parallel optics, which means MPO connector to the transceiver, and who want to do more than 40 gig: 100 gig. That's exactly the PSM4 type of transceiver I was talking about. People buy pre-terminated fiber infrastructure from different cabling solution vendors. They receive that, and most of the time they want to do a pre-installation quality assessment, because this pre-terminated infrastructure comes on a pallet, on a reel, and during transport you can have some issues. So people want to do a pre-installation testing stage using an OTDR to see if something is broken somewhere along the fiber. They install it, and after the installation, to make sure that no fiber has been kinked or bent or damaged, they test again with the same OTDR to see if the installation stage has induced issues.

The industry has come up with a companion for these OTDRs, an MPO switch, as you see here, so you can bring the power of the OTDR to the MPO interface. That's very easy to use, and of course you have the different launch and receive fibers that are used to characterize the input and output connectors. As you may remember, on the slide before there were three big green numbers on
the bottom: it was three connection points. That's what you get when you test with an OTDR and an MPO switch. This is the switch, this is the launch fiber; this is the first connection point, the second connection point, the third connection point. For each connection point you will be able to get the detail of the insertion loss and the reflectance, and you will have the accumulation of the total loss on the link. The OTDR is very powerful: you get all these details, very easy to interpret, and you get a summary of each fiber on this connector that provides you a pass or fail. In that case it was 11 good fibers and one bad fiber. This fiber here, if you see, was good, but you see a fail on the top because of the other fiber on this connector that was failing.

I was talking about back reflection a little bit earlier. In this scenario of migrating from 10 gig to 100 gig, that is an issue that we have seen for a couple of years with people that have started to deploy 100 gig coherent, and it's what we continue to see with those who start the installation of the first shipments, if I can say, of PSM4 over 500 meters of single-mode.

What I present here is the result of an application note that we did; the reference is there. I will explain. We did 19 connections, matings of connectors. For each of the 19 connections we took four samples of data. The samples were contamination states at different wavelengths: clean connector at 1310, clean connector at 1550, dirty connector with finger oil at 1310, and dirty connector at 1550. And we measured the insertion loss, which is a very popular parameter for qualifying the quality of the installation of fiber infrastructure. What we see is a strict correlation for each connection. Connections are not created equal, so each connection can vary from 0.15 to 0.6 [dB]; of course everybody understands that. But for each
connection we have a very strict correlation for the four samples. So, one nice correlation, as we see, presenting different insertion loss performance per connection: 19 samples, 19 items like this. Normal.

What is happening if, instead of looking at insertion loss, we look at return loss, or reflectance? That's the same thing. As I said before, reflectance is more important if you deal with higher line rates, or higher speed interfaces, 100 gig namely, and it will not be different in 400 gig. The same 19 connections, the same samples of contamination, but two different correlations. The clean ones are straight, always better than 55 dB of reflectance: no issue there, all green lights, no problem. And the dirty ones, those with finger oil on the end face of the connector: it's systematically 10 to 12 dB more reflectance, just for finger oil. Again 19 connections, a strict correlation with contamination.

Where you see the red spots, it's because I took the liberty to identify 6 connections out of 19, about thirty percent of the sample, that are worse than minus 45 dB of reflectance. Why minus 45? Well, the component standards, ISO/IEC 11801 or TIA-568, tell us for single-mode (this was single-mode, by the way) that 35 dB is the level of reflectance you need to look at before you cry out loud, and that's a lot of reflectance. Other standards identify 45 dB, which is pretty much what I see as a good target. And some customer discussions told us: if you deal with coherent, at minus 30 dB, as I said before, you have an APR issue, automatic power reduction mode from the EDFA, which will create a lot of issues for the transmission signal. For the direct-modulation type of transceiver, we look at minus 42 to minus 45, pretty much like this. So again, in this experiment, thirty percent of the contaminated connections created big issues based on these thresholds.

Another issue that people face a lot in the data center: I
mentioned that you do a lot of moves, adds, and changes of transceivers. OK, you can use the procedure I recommended before: validate your fiber, and if you have an issue, you will know that it's not the fiber, it's the transceiver. But if you have an issue with the transceiver, you're no further ahead; you don't know what the issue is. So the industry has come up with some solutions based on modules that test 100 gig performance, and software applications that will check the quality of this transceiver or high-speed pluggable, giving a quick pass/fail verdict based on a test of less than three minutes that runs a sequence of six tests, looking at the power consumption, the temperature, the skew. It's not a load test of 72 hours, of course. In three minutes you just want to know if the transceiver is good or bad. And when you have to install thousands of transceivers, it helps, let's say, the communication with your vendor to have a tool like this that gives you more information on what the issues are. Is it, as we see here, one lane that doesn't adhere, or one bad lane, or things like that? The tool will tell you that, providing you a pass/fail verdict on the quality of this transceiver.

I'm more in the DCI section of the talk right now, the data center interconnect, and I bring to the table, or to the presentation slide, one important topic: network latency. It is more and more an issue. Data center people are strategically mapping the construction of different data centers to reduce the distance to their main markets. Why? Because fiber itself induces latency. If you run over hundreds of kilometers of fiber, you will add a native latency you can hardly reduce; you just need to bring your data center closer. We cannot put a data center in the home of each subscriber, of course. And on top of that there are the different applications; one of the best examples is the trading desk, the financial people, where fractions of a millisecond mean millions of dollars of profit in trading.
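As a rough illustration of the point that fiber itself adds latency, the propagation delay can be estimated from the route length. This is a minimal sketch, not something shown in the talk: the function name is hypothetical, and the group index of 1.468 is an assumed typical value for single-mode fiber. Real fibers differ slightly, and real routes are longer than the straight-line distance.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in vacuum
ASSUMED_GROUP_INDEX = 1.468  # assumption: typical single-mode fiber value


def one_way_fiber_latency_ms(route_km: float,
                             group_index: float = ASSUMED_GROUP_INDEX) -> float:
    """Propagation delay of light through route_km of fiber, in milliseconds.

    Only the glass is modeled; transceivers, amplifiers, and switches
    add their own delay on top of this.
    """
    if route_km < 0:
        raise ValueError("route_km must be non-negative")
    seconds = route_km * group_index / SPEED_OF_LIGHT_KM_PER_S
    return seconds * 1000.0


if __name__ == "__main__":
    for km in (10, 100, 1000):
        print(f"{km:>5} km: {one_way_fiber_latency_ms(km):.3f} ms one way")
```

Under these assumptions the delay is roughly 5 microseconds per kilometer, so about half a millisecond one way per 100 km of fiber, which is why the only real lever is moving the data center closer.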
OK, so latency is a big issue. You need to control that, you need to measure that, and you need to reduce that. But before talking about key performance indicators of the service itself, like latency, packet loss, bandwidth, QoS, bit error rate estimation, things like that, you need to make sure, if you deal with your data center interconnect, that the physical layer itself doesn't bring any issue. Especially when you rent fiber from people that cannot tell you when it was constructed, and cannot tell you what type of fiber it is: is it G.652, is it G.657, is it a mix, is it a patchwork of fiber? Ask them, "Do you have chromatic dispersion and PMD data for me?" and look at their face. Our customers ask for this information from the people that rent dark fiber or lease fiber, and most of the time they decide to invest themselves, or contract people, to do the tests themselves to get the information, because at the end of the day it's your subscribers that will feel the issues on the key performance indicators, most of the time at the protocol layer, created by physical layer issues. So you need to do a baseline: CD, PMD, in-band OSNR or OSA characterization if you use DWDM or EDFAs, things like that. And when it's clean, when you know there is no problem, of course, then you go to service activation testing.

Talking about service activation testing: before doing that, one good best practice is to make a bit error rate evaluation. Before the validation of the traffic on the link, you estimate the bit error rate, and then you know that the first layer of your protocol stack is clean, when you have validated before that the physical layer was OK. Then after the BERT, of course, comes the more service-activation-oriented testing: multi-service Ethernet testing. The service activation methodology based on Y.1564 of the ITU-T is a good example. TCP throughput test: well, everybody
has that, but some people, financial and big data users, want to be sure of the throughput, and they want to get some visibility through this testing based on RFC 6349. And for a portion of the people, where timing is very important (it's not one hundred percent of the market, of course), SyncE or 1588-based timing is another type of service that can be validated at the service activation level. And because you have many services on this fiber, you need to test these parameters on each class of service; that's important. One best practice is always to test from 10 meg to the committed information rate, the CIR: if you go at 100 gig, 10 meg to 100 gig; if you go a little lower, 10 meg to that lower rate. And again, the industry has solutions for that.

OK, last two slides, I think, and we're good on time. I'm switching now a little bit more to the monitoring portion of what the industry can benefit from in terms of types of testing. This use case is typically the data center. You can have a very big benefit from doing intra-pod, inter-pod inside the same data center, or inter-data-center monitoring, to characterize, using layer 2 and layer 3 techniques, the mesh performance across all these probes. So that's one way to see the thing. It generates a lot of data, but again the industry comes with dashboards, analytics, and all that, to help you find what really matters inside all the data that is generated. And last but not least (it applies to the next slide too, but I'm going to say it here): you don't necessarily need to install hardware-based probes. The industry came out in the last two years with virtual probes that you can install on your own servers, which reduces the cost of the deployment significantly.

I'm getting a little bit out of the data center here; I'm more in the data center interconnect. Data center operators that have more than one premises, one building, sometimes separated by a serious distance
or a big distance, sometimes intercontinental, things like that, will get a lot of benefit from measuring the performance of, let's say, the services they provide between their infrastructure, outside of their pods, again in different categories of service: voice over IP, IPTV, over-the-top video, IP VPN service. Skype for Business is a good example also, very popular these days inside the data center community that helps Microsoft or other companies provide the service. What we understand from some discussions we had is that the people that, let's say, host the service and provide it to different regions gain from knowing the quality of experience they provide to their subscribers, let's put it that way. And that's where the industry came up with a mix of passive probes and active probes. An active probe helps to perform some emulation of service, so you can emulate voice, video, web transactions, or streams, and capture the quality of experience as your subscribers see it, because it's a kind of test stream that you're monitoring, all generated by a virtual probe or a hardware-based probe inside your network. So accumulating data with passive probes is interesting, but simulating traffic and measuring the performance the way your customers or subscribers see the service is always very good, because at least if some people complain, you can reproduce this kind of behavior in some portion of your network, so you get better visibility.

So that's pretty much it for me, two minutes, or two minutes thirty seconds, before time. I'll just take the opportunity to invite you to a LinkedIn forum that we started, vendor-agnostic, always. We want people to share concerns, ask questions, things like that, and it's open to the competition. It's managed by EXFO, but again, vendor-agnostic. So go there; if you love what you see, I invite you to subscribe. And I hope that this talk was instrumental, that you learned a bit of
things. I'm going to be here for the rest of the week, so if you want to chat about something or ask questions, it's going to be my pleasure to discuss with you later today or tomorrow. And if we have a bit of time for questions, that's pretty much where I am. So thank you so much.
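As a recap of the reflectance figures quoted in the talk, the thresholds can be written down as a small classifier. This is a sketch under stated assumptions: the function name and the category labels are hypothetical, and the three thresholds are simply the values the speaker mentions (minus 35 dB as the component-standard level for single-mode, minus 45 dB as the stricter target, minus 30 dB as the level at which an EDFA may enter automatic power reduction). Check the applicable standard and the equipment documentation before relying on them.

```python
# Thresholds in dB, as quoted in the talk. Reflectance is negative;
# values closer to zero mean MORE light reflected, which is worse.
APR_RISK_DB = -30.0        # coherent links: EDFA may enter APR mode
STANDARD_LIMIT_DB = -35.0  # single-mode level cited from component standards
STRICT_TARGET_DB = -45.0   # stricter target the speaker recommends


def classify_reflectance(reflectance_db: float) -> str:
    """Return a hypothetical verdict label for one connector reading."""
    if reflectance_db >= APR_RISK_DB:
        return "apr_risk"   # at or worse than -30 dB
    if reflectance_db > STANDARD_LIMIT_DB:
        return "fail"       # worse than the -35 dB standard level
    if reflectance_db > STRICT_TARGET_DB:
        return "marginal"   # meets -35 dB but misses the -45 dB target
    return "pass"           # -45 dB or better


if __name__ == "__main__":
    # Hypothetical readings for illustration, not data from the talk.
    for reading in (-55.0, -43.0, -33.0, -28.0):
        print(f"{reading:6.1f} dB -> {classify_reflectance(reading)}")
```

A verdict of "marginal" corresponds to the case the speaker flags in red: acceptable against the minus 35 dB level, but short of the minus 45 dB target he suggests for 100 gig links.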
Info
Channel: NANOG
Views: 1,140
Rating: 4.6363635 out of 5
Id: B3V237QtqNA
Length: 43min 41sec (2621 seconds)
Published: Wed Oct 19 2016