Ask an OpenShift Admin (Ep 23): DNS and cluster networking

[Music] Good morning, good afternoon, good evening. Welcome to another edition of Ask an OpenShift Admin. I am Chris Short, executive producer of OpenShift TV. I'm joined by the main host of the show, Andrew Sullivan, but we're also having our special guest, the one teammate I know that brags about his knowledge of DNS, Christian Hernandez. How are you today? He is DNS man. Yeah, well, you couldn't have an episode about DNS without you. I know, yeah, without me jumping on. I'd be almost offended if you did. I am happy that you are here today with us in person instead of just participating, I say just participating, on chat. I think we do that to each other quite a bit, participating in each other's streams, just not actually in person. So hello, welcome everybody to the Ask an OpenShift Administrator office hour stream. This is one of the office hour series of streams that we have here on OpenShift TV, and the goal here is really for you all, for our audience, the people who are listening and watching, to ask us anything at any time about whatever it is that is at the top of your mind. So please don't hesitate to do that. We more than welcome those questions, again, whether or not they relate to whatever we happen to be talking about today. We've already got a question. Great. That's not a question, but, like, interested in seeing the episode. So yeah. Sorry, speaking of which, I didn't look at the staged YouTube channels to see if anybody... like last week, remember, we had a question four days early. Yes. We did not have that this time, I don't think. I haven't seen it at least. I mean, I always have the Restream chat open, so I feel like I would have seen that if it had popped by, kind of deal, right? All right, well, thank you everybody, welcome, and again, please don't hesitate to ask those questions at any point in time. So today's topic, if you didn't see any of the social media or
other things leading into today: we're going to be talking primarily about DNS, and also a little bit about cluster networking, but primarily about DNS. And the genesis for this is that Christian and I spend what feels like a substantial amount of time... I mean, relatively speaking, a lot of time, yeah... talking about DNS and DHCP. And DHCP, we think, is starting to get out there, right? The requirements, the configuration, how it affects things and all that. But DNS is still largely a black box to folks. Yeah, some of these folks might have read the RFC. Yeah, well... [Laughter] Understanding how it works and knowing how it works in OpenShift and Kubernetes is not the same thing. Exactly. And that's what we often see as a problem. So today we wanted to look at, maybe demonstrate a little, and explore DNS inside of OpenShift. And keep in mind that a lot of this will also apply to Kubernetes as a whole, so it doesn't necessarily matter if you're using OpenShift or not. Hopefully we'll be able to answer some of those questions. DNS is a universal standard. Yep. And it's funny, because I was an infrastructure administrator for a long, long time, and it was always the network's fault until virtualization came along, and then it was storage's fault, right? And now that we've kind of worked past all of those, most folks are using either hybrid or all-flash storage, now suddenly it's back to being the network's fault, and a lot of times it's DNS. Yeah, most of the time it's DNS. I have a nice haiku I could post. Yeah, I was going to say, where's the haiku about that? The DNS haiku, it's posted on people's walls, I would imagine. Yeah, very seriously, my old office. Yeah. I very seriously figured out how to put, or was looking into how to put up, one of those little sidebar things, like the news, that just had an image of that haiku before we did this. Oh, nice. Yeah, I think there's some software
that does that. Oh yeah, it's like a virtual camera and you can post things on it. Yeah, there's a few of them. So before we get started, before I hand over to you, Christian, and start asking you a lot of dumb questions, there were a few things that I wanted to talk about. For those who are regular watchers, regular attendees of our stream, you know that I like to cover a few things that I have seen either internally or externally that I believe are relevant, and also important things that maybe you haven't heard yet. So let me find the right tab here, which is always an issue these days. Always the tabs. I actually have very few open today, I just have a lot of windows. All right, that's the one I'm looking for. Oh, there you go. So first and foremost, a bit of shameless self-promotion: we have been doing a weekly blog post after each one of our shows. They come out on Fridays, so this one happens to be from last week. If you missed anything from last week, not only do we have a link to the stream itself, but we also go through and try to put as many of the questions, as many of the things that we can find, in there. So if you want to quickly skim, because I know one of the big things about doing a video stream is there is no transcript, there's no way to do a Ctrl+F and find a topic of interest, I try to go through, identify the questions, pull them out, and then link to where they're answered inside of the video. So, nice. These are better than our release notes. It's pretty good. It's awesome dedication, right? Yeah, no, it is, that's a lot of work. Yeah, definitely kudos. I gotta... come on. There we go. Yeah, the window gets in my way. Yeah, I know, I'm always trying to figure out how to move it, and I never need to... like, I have three monitors and it needs to just go down, right? Just get out of the way. Yeah. So I will paste a link to that into the chat. The second thing I
wanted to talk about is a bug that's been in place for a while. You can see here it was first reported back in September, late September. I actually had not heard of or encountered this. I don't like to bring up bugs very frequently, because they happen all the time, so usually, until it's something that I feel is more serious or coming up more frequently than any generic old bug, I tend not to bring them up. This one came up because I saw no less than three customers this week that encountered the scenario where their vSphere IPI node had gone offline. So for whatever reason the worker node was offline, right? Maybe it was turned off, maybe the physical node had powered off or failed for whatever reason. Yeah. And the IPI process effectively reaped that node, right? It deleted the node because it was going to provision a new one to recapture that capacity. Yeah. The problem is that virtual node, the VM, had some PVCs attached. Yeah. So the node was running, it had some pods, those pods were using PVCs, the node failed for whatever reason, so it still had those VMDKs attached to it. Okay. And when IPI came along and destroyed the node, it destroyed the PVCs too. Yeah, so that's not good. Data loss is never a good day, at any point in time, ever. So a couple of things to be aware of here. First, and I'm scrolling down, yes, there's a number of missing comments in here, I apologize. I'm not logged into Bugzilla right now because there's customer information associated with these, so I don't want to show that information to the world. But there is a fix for this. I believe it was just committed yesterday evening or maybe early this morning, or a PR, rather, so it should be fixed in a coming release, hopefully sooner rather than later. But for now, just be aware: if you're doing auto scaling, if you're using IPI and the auto
scaler together to reclaim those nodes or to otherwise restore those nodes, you may want to configure the auto scaler with some sort of delay, so that way it will have at least five minutes. That way, in the event of node failure, the Kubernetes scheduler will say, hey, I need to clean these up and I need to move them. Yeah, give the scheduler enough time for the workloads on the node to move to another node before it reaps that VM. Yeah, exactly. So, another one, and man, I always feel bad bringing up two bugs. That makes it seem like a bad week even though... No, well, you know what, you could actually fill a whole show just looking over bugs. But how depressing would that be? Yeah, that would be a really... So this one I've seen come up a number of times, internally especially, and I've seen it come up a few times externally as well. Essentially we have uncovered an inconsistent bug for CoreOS virtual machines deployed to vSphere using OpenShift SDN. We think this has something to do with the vmxnet drivers that are in the RHEL 8.3 kernel, the one used by CoreOS 4.7, and it effectively manifests itself as some packet loss. So if you have upgraded to 4.7, or if you've deployed a new 4.7 cluster on vSphere, and you're starting to see one, two, three, five percent packet loss, this could be the result of that. There is a workaround. Similar to the other one, I'm not logged in, because there's some customer information here, and it doesn't look like it's in the public comments. If anybody needs that, just let me know, andrew.sullivan@redhat.com, and I'll share that with you. Effectively it's turning off the VXLAN offload for the network adapters inside of the virtual machine. So it's just a couple of sysctls that you execute. You can test it by oc debugging, or you can use machine config to set them in a recurring manner. Nice. Okay, getting out of the
things are broken, things are bad, bug reports. Hey, we love failure, we love helping folks fix their problems on this show, this channel. Agreed. Again, I try to preempt, or try to make you aware of things, so that way you don't accidentally stumble into a bad situation. Yeah. So the next one is a good one. I've talked about sizing in a couple of different ways, with three or four different aspects, whether it's sizing the cluster, sizing the nodes, sizing storage, all those things, across a handful of shows here on the Ask an OpenShift Admin hour. So recently, you see March 16th, Simon Delord published a blog post here that walks through application sizing principles. So, a really good read. Share this one with your application developer and application admin peers. It walks through all of the different relevant settings, different capabilities, things like that, to help you size those clusters: both how the application is allocated resources and how those resources are then reflected inside of the cluster. And when I do the show notes for this one, I'll try to link those other episodes inside of there, if I remember to make a note to myself. Yeah, also your episode on resource limits is really good to link there, because going from a VM to a container is very different. The developer is used to getting 8 CPUs, 16 gigs of RAM, because of, like, scale, right? Because, oh, I need to be able to... Now that's different, now you're essentially constraining them. What do you mean I only get half a CPU cycle? It's like, well, it doesn't completely work that way. So that episode, your episode on resource limits and ranges, is good too. I've got the link right here. Yeah, I recommend those users watch that as well. Drop it in chat here for everybody. There you go. All right, so the last
topic I wanted to talk about before we get started here is one, and we were joking before the show started that I'm just gonna kick this hornet's nest, because it's a bit of a sensitive topic, I think, internally, and I know that there are some people who are passionate about it externally as well, and that is reference architectures. Oh, you said it. I know. I'm sure if any of our marketing peers are watching right now, they all just had their stomach knot up because they're afraid of what I was just getting ready to say. [Laughter] You know, when you said it prior to the show I was like, oh boy. [Laughter] So, reference architectures. For a little bit of background here: anybody who has been keeping up with OpenShift for the last roughly two to three years, you'll remember that we used to publish reference architectures more or less every version or every two versions for various... I see somebody asked for the sizing link. It's the second one up, the sizing applications one, right above the last chat. Yeah, scroll up a little bit. That came from, let's see, YouTube. Yeah, I'll post it over there just in case. Okay. So we used to publish reference architectures, Red Hat published reference architectures regularly, and we stopped doing that for a couple of different reasons. The big one was quite simply resources, right? Devoting not just the people resources but the physical resources and all of the things associated with going through and generating those documents was, quite frankly, hard. It really was. And invariably it always led to, you know, if we used one partner's set of resources instead of another partner's, sometimes there would be hurt feelings and stuff like that. So it became a bit overwhelming in a number of different ways. So the decision was made, rather than creating reference architectures... which, the perspective was, and I would be very
welcome to anybody's additional thoughts or perspective on this, but the perspective was that most folks read a reference architecture not to literally implement it as written but rather to glean, right, to determine what information is important, what information is relevant, what decision points do I need to know to deploy an OpenShift cluster in my infrastructure. So it was really, for lack of a better term, a guide. It was a consolidated source of documentation for, here's one way of producing an OpenShift cluster. So the docs team took it upon themselves to take all of that knowledge and put it directly into the documents. So if we look at... can I type here? No, my browser is beach balling on me. Oh, it's grayed out everything. Oh, grayed out. There we go, there we go. All right, so if we look at the documentation here and we scroll down to, for example, this scalability and performance section, you can see that, and this section has grown with each release, they are doing exactly that. They're taking as much of that information as they can get and putting it directly into the documents, right? Recommended host practices is a great one. And allowing you, our audience, the people who want to ingest those reference architectures, to really understand the rationale behind those decisions, right? Not just, oh, I see that in the reference architecture they set this value to X. Well, why was it set to X? Sometimes we explained that, sometimes we didn't. So through the documentation, the goal is to provide all of that background, all of that information, so you can make the right decision for your infrastructure. With that in mind, however, we didn't completely abandon reference architectures, right? It seems that way, and some of you may be thinking, well, I have seen some reference architectures come out, and you are absolutely correct. We now rely on our partner ecosystem for those. So I'm going to post this blog post into the
chat here. So this one, which is coming up on a year old now, so I assume Dave will probably update it... and I've known Dave for a few years now, he works over on our partner team, they do a lot of the reviews of these types of things. And the goal is he and his team work with, for example, if we scroll down here, Cisco and Dell EMC and Hitachi and HPE and so on and so forth, and those partners create the reference architecture, which is then reviewed and contributed to by Red Hat. So if we were to look at, and I'm gonna pull up one of these, this is the Hewlett Packard Enterprise, the HPE site for OpenShift reference architectures. I'm also going to paste this in here. So if you are using HPE hardware and let's say you want to deploy OpenShift onto your DL-whatever servers, right, DL380, I don't even know what the current gen is, we can click on this 4.6 GL, and DL, excuse me, and it walks through all of the different components of their reference architecture. This is all created by our partners. So they still exist, they're still there, they're just not created, not published, by Red Hat. Yeah, it's the vendor that you're trying to put OpenShift on. And that intuitively makes sense, because being a software vendor, Red Hat really can't recommend hardware, because there's just a ton of hardware that we're supported on. Like Andrew was saying, we just don't have enough people to go out and buy every single piece of hardware and vet every possible combination. So we lean heavily on our providers, like HP and Dell EMC, to do these, and we can review them. That seems to be a better use of our time rather than trying to build a team with an impossible task. Yeah, exactly. And together we are stronger, right? So, yes, exactly. I just thought of that Simpsons episode. Yeah, I was thinking of the exact same thing. Okay, so the one where they
mimicked Lord of the Flies. Yes, exactly, exactly. That's great. So I happened to click on HPE over here. You see NetApp and Cisco have one for FlexPod, NetApp has one for their NetApp HCI, there's literally dozens of partners that do these now. You will see one exception to this, and that is OpenStack. Because it's Red Hat OpenStack, we create a reference architecture, and actually it's one of the, or two of the folks on our peer team, the field product managers, yeah, August Simonelli, resox, and company, right? So they create that OpenShift on OpenStack reference architecture, and that's because it's a whole Red Hat stack there. Similarly, and I think I've said this before, CNI plugins and CSI plugins that come from partners, we don't document those inside of the OpenShift documentation. We rely on the partners for those. The exceptions being if it's Red Hat Virtualization or Red Hat OpenStack Platform, those are documented by Red Hat because it's a wholly Red Hat solution. And that's all I've got. These other tabs are for our topic at hand. Wonderful. So I will now stop pontificating about things unrelated to DNS, and essentially I'm gonna start off by playing the role that I was born to play, of dumb guy. Wait, that's my job. Well, we can share, we can all come together. [Laughter] So, Christian, you and I started this conversation last week or maybe the week before, and really it came out of some confusion around how DNS resolution happens for OpenShift. And it turns out that this is a little more complex than you might expect. So can you elaborate on that and talk about DNS in OpenShift and the difference between node-based DNS and pod-based DNS? Yeah, so where do I start, right? You know what I mean, with DNS. So I
originally was gonna start in the beginning, like, what's DNS, but I imagine that at some level, since this is Ask an Admin, there's some fundamental understanding of what DNS is, right, and kind of how it works from a high level. So I'll start with the assumption that you all know what DNS is, what the root servers are, what server authority is, and all that stuff. So DNS in OpenShift is kind of a layered design, right? I think I'm just gonna start with DNS in Kubernetes and what it's used for, then I'll work my way up. So DNS in Kubernetes is used for service discovery, right? It turns out that problem of service discovery was solved a long time ago with DNS. There's been a lot of software that's tried to do service discovery, people writing service discovery, and in the end it just turns out DNS was the answer. Let's pause there for a moment: what do you mean by service discovery? So by service discovery, meaning that as an application or as a service inside of OpenShift, Kubernetes in general, it can reference a service by a name. For example, I have a front-end web... give a stupidly dumb, yeah, two-tier application, right? I have a Node.js front end and some sort of database back end. I want to be able to just say database in my application and it should just work, right? As a developer, or even as an admin or as an end user, I shouldn't have to care about or memorize IP addresses, or what the IP addresses are, or how to connect to that. It should be as simple as just saying mysql, right, or database, or whatever name you decide to deploy your database as. So that's the idea of service discovery: I can reference something by name and it automatically knows the IP address. And essentially, funnily enough, DNS is perfect software
to do that, because that's just what it's been doing. And that was the original intent, right, for Kubernetes service discovery. So in Kubernetes that's kind of the bare, you know, the kernel, the core of the use of DNS inside of OpenShift. Yeah, so pods are deployed that represent some microservice, right, front end, back end, database, whatever that happens to be, and then they have quite literally a Kubernetes service in front of them that says point to these pods, and it discovers those IPs using this front-facing, right, external-facing DNS name, external being still within the cluster, so that other components can say connect me to front end, and the service translates that into the set of pods, the set of IPs for the pods that represent that front-end microservice. Correct, yeah. So you can just reference things inside the cluster in general by a name, right? You can even connect to other pods by name. So everything is managed and everything has a name, and that's managed by the internal DNS server in Kubernetes, right? Before, that was based on kube-dns. They actually upgraded a while back to CoreDNS. Yes, right, and originally it was SkyDNS. I remember SkyDNS, yeah. I remember compiling SkyDNS the very first time I deployed Kubernetes, in like the 0.8 days. Yeah, exactly, it was SkyDNS, and it was also bare bones, dead simple, right? And so, let me... where am I here? I was gonna share my screen a little bit and kind of go over the docs a little bit. So let's... if I can only find, there we go, the button. Make sure it's the right screen. Wherever you threw it, now you need it. Yeah, exactly, now I need this little... there we go. Oh, I can move it. I'll move it over here, how about that? There you go, way over here on the right side, out of the
way. Yeah, so, DNS, right? What's funny about searching for DNS on the OpenShift docs is that it's not the first one. There's the API for networking, right? So it's under networking, that's how I remember how to get to it, and then there is Understanding the DNS Operator. So OpenShift deploys CoreDNS via an operator, right? Let me make this a little bigger. There we go. I can make it bigger if you like. Is that too big? No, okay, 110 seems good. If it needs to get bigger, let us know, please. Yeah, just let us know. You can essentially copy-pasta this. Let me make sure I'm at the right cluster. I upgraded this last night. Okay, so it looks like the upgrade went okay, not that it shouldn't, but you never know. Very, um, of you. Yeah, right, thanks. Yeah, exactly. See, I said DNS, I didn't say OpenShift. [Laughter] So this shows the DNS operator, and just like all things in OpenShift, it's controlled by an operator. So we have a DNS operator here, and you can see the version here, and that's all good. What's really cool here is that when you describe it, it'll give you information about your cluster, and the important information is this information here. So the DNS service for Kubernetes runs on the service address, right? So the service address, from a high level, and I guess now I could play dummy to you, Andrew, is an overlay network. For the most part it's an overlay network that sits on top of your regular network, right, a software-defined network. I don't want to rabbit hole too much, but it's possible to have Kubernetes running without an SDN. If you ever did Kelsey Hightower's Kubernetes the Hard Way, he shows you how to do that. It is hard. It is hard because anytime you add a node you have to update the routing table on all nodes, and it just makes sense to use a
software network. So this is an overlay network. So when you see this, this is my service overlay network, and my cluster domain is cluster.local, which is the default. So cluster.local is the internal domain for Kubernetes. So let's take a step back for a moment. The CoreDNS service inside of OpenShift is for pod resolution, or more specifically, service name resolution for pods. Correct, service name resolution for pods. So this cluster.local is meant to be only for internal resolution and has no bearing on, or no importance for, for example, the name of your cluster, right? Whatever your cluster name happens to be, mine is usually something like something.work.lan or whatever that happens to be. Nor does it have, at this stage, anything to do with your upstream DNS servers, right? At this stage, no. We're at the pod service, we're at the service level, because that's where CoreDNS really shines, right, at the service level. And then just a quick aside: that cluster IP that you see there, the 172.30, that IP address will come out of what you define as the service IP range in your install-config.yaml. So if you change that from the default, you could see a different IP there, and that's perfectly fine. Yeah, so it gleans that information from the range you give it, and it'll take that information and create this IP address. So this is the DNS IP for your pods, and that's the cluster domain. One other quick thing before I actually drill down into the pods: with cluster.local, you shouldn't name your cluster, you know, openshift.cluster.local. You'll have a really bad time. I don't even think anything .local, essentially anything.local. That actually came up with some customers, because they've had local domains. So Kubernetes has essentially ruined that for everyone. So we have
a question, Christian, from Sonics: how does this talk to the router? How does the service talk to the router? I'm assuming CoreDNS and the DNS operator. There you go. Yeah, so I'm actually going to get to that workflow. But in general, the service layer is a non-routable layer, but that doesn't mean things can't get in and out. I'll talk about that workflow in a bit. So yeah, this is the IP address for the DNS service inside the service network, right? So if I go back to the docs, that does a describe, oh, and this gives you a nice JSONPath. There you go. I've never seen that dollar. Interesting. If anyone knows jq, let me know what that dollar does. So that's the service network range. As Sully was talking about, this is the range, and that matches this IP here. Let me clear it. I always like to clear, because people don't like looking at the bottom. I know, I hate it when someone's sharing their screen and they... clear, so my head could go up. Yeah. So I wanted to go through this doc line by line, but I think I'm gonna skip around a little bit because it just makes more sense, right? So CoreDNS has a configuration, and this is stored in a config map, right? And by the way, Andrew and Chris, I can ramble on and on, so please give me time checks so I can get to all the points I want to get to. So this is the configuration file, all right? If you do that... oh, I hate these managedFields. I can't wait for Kubernetes... In 4.7, in the GUI, it automatically collapses those. Yeah. In 1.21, I guess, which would be OpenShift 4.8, it'll take out the managedFields and you have to specifically ask for them. So this is the configuration file for CoreDNS. It's pretty simple. It first gives the
domains it's the authority for, and then it even gives an arpa for IPv6 as well. That's something some of our customers ask for, IPv6 support. It'll send information to Prometheus, so it has that plugin already set, and then it has this line here, forward . /etc/resolv.conf. So let's explore that a bit. All the pods have their resolv.conf set to, let me describe that again, this IP address, right? So that's .10, right, 172.30.0.10. So let's get some pods. I know I have an app running here. There we go, test, just mysql. oc -n test rsh this guy, and then do bash, because for some reason sh... everyone likes... command not found. Clear. Okay, well, bear with me. Can I do this? Yes, I can. Okay. That was Ctrl, by the way. Yes, I didn't see that. If I do a cat /etc/resolv.conf, it has a search field, right? So this search field breaks down pretty easily. First it does the namespace, right? The namespace is a part of the DNS name. So it says, anytime you do a lookup, first look up the name from this domain, then this domain, then this domain, and then finally this domain, right? The last one is your external. Correct, yeah, this one is external, or it's my internal, but to OpenShift it's the external domain. So if I try to... do I have dig? No. nslookup? Ping? This is a very small container. Is host on there? No, this is a very small container. Okay, I think this is Alpine. It does a search, right? If I do an nslookup of foobar.test.svc.cluster.local... oh sorry, if I do a foobar, it'll automatically tack this on and look for that, then tack this on and look for that, and then it'll go down that list. Then it has the name server for this pod, set to the internal DNS server inside of OpenShift. So it says 172.30. This is its main DNS server, and it's the only DNS server this pod
has, so it'll only ask the internal DNS server. So what happens if this pod tries to reach outside the cluster? What if I tried to do a ping of, like, dns1.ocp4.cloud.chx, or if I do a ping of google.com? How does CoreDNS handle that request? And so the answer is... wasn't that the other, uh, was that YAML? The answer is in the next line, right? It said, okay, forward everything that I don't know about to the file, and in the host's resolv.conf file look for everything sequentially, meaning check the first, second, third, fourth. One, two, three, four, five. Yeah, exactly. So in the pod it sends everything to that DNS server. First and foremost, that's the only config in the pod, and then the CoreDNS server will say, okay, I don't know this, I'm just going to forward this on to my host's resolv.conf file. So let's take a look at what that looks like. Oh, this is a Windows server. Interesting. Okay, oc debug. This is a server that I hack on for Windows containers, by the way, so hopefully no blue screens here. So now we're basically remote shelling into the node here. Yeah, that one node. So I do a chroot. Christian, just to, uh, I answered in chat already, but another question: how do you change the order of name servers after the cluster has been installed? And hopefully our answers align. Yeah, what did you answer, and I'll say yes. So changing the order, I would put it in the machine config, or if you're using DHCP, just change the order in DHCP. If you have a particular name server you want to connect to for a certain domain, I would use CoreDNS's plug-in feature, and the only plug-in feature that we support right now is forwarding, and I'll go over how forwarding works in a bit. Yeah, so it's in line with what I said, I just didn't do the machine config part of it. Yeah, well, the machine
config if you want to just actually change the order on disk, or DHCP, right? If your DHCP is handing out your name servers, then just change the order there, and next time the lease happens it'll flip them. So here I'm in the node, and if I do a cat /etc/resolv.conf, notice I'm using nmstate, right? And so here it'll choose this name server, then this name server, and this name server, sequentially. You'll notice that the first entry is my IP address, right? So I could do ip addr. Where is it? That's a lot of interfaces. Yeah, well, in OpenShift, all the pods... yeah, it's all the pods, right, it's all the virtual interfaces like this one. Can you do a find? It's there, line eleven. There you go. Oh, it's the interface, let's do that. So yeah, there you go. So this is for the hosts file, right? It'll look in the hosts file first, then it'll do dns1, dns2 in that file. So that's how that configuration works. If I go back here, it says forward everything I don't know about to the resolv.conf, and then it'll use those name servers as well. So very cool. There are a few things I want to show. So here it tells you the operator status. You just describe it and see. It was degraded earlier because I was doing an upgrade. And then the logs here in the DNS operator, that'll give you information about what's going on. So one thing to note... We have a question, Christian, from Fahad: for vSphere IPI, does the DHCP server have to provide DNS records for the nodes? And I think what that's asking is, does DHCP need to do dynamic DNS updates with the DNS server? So I'm gonna say soft yes, it has to. There is a way to provide DHCP for only the IP addresses and not DNS. That gets a little laborious, I would say, and I would recommend using the DNS forwarder for
configurations like that, and I'll explain that in a bit. The DNS forwarder is probably the catch-all for a lot of these use cases. And I'll add on to your response, saying I don't think it's required, but it is strongly encouraged, and that is specific to on-prem IPI. The on-prem IPIs use mDNS for their local node resolution, so inside of the cluster it will resolve those IP addresses without an issue, because it's using mDNS for that purpose. However, if you wanted to do external resolution, like you saw Christian a moment ago do an ssh, I think maybe it was a debug, but if you wanted to ssh to that node, or you're exposing a node port and you wanted to reference that by the DNS name of the node instead of the IP address, then you would need to have external DNS have those node names, which is where that DHCP updating DNS would be recommended. But as far as I know, with on-prem IPI, so vSphere, RHV, etc., they don't need node names in external DNS. I'll also add on that if you do have DNS names and they're wrong, or they are different, the nodes will actually take those names and use them instead, so reverse DNS will override the name that's given to it. Fifteen minutes. Yeah, okay, cool. So thank you. So here the DNS service runs as a stateful set, uh, not stateful set, daemon set, I think. Yes, so daemon set, meaning I actually have seven nodes, but one of them is a Windows node, and the stateful set selector is set to OS Linux, right? So if I do a... daemon set, sorry, daemon set. I said stateful; I was working with something else with stateful sets and I have stateful set on the mind. And so yeah, I have six. The daemon set does one pod for each host that's added with this label, and it's running in every cluster, right? And you notice that there are three containers per pod, so let's do a describe on that. Uh oh, describe... yeah. What are you
describing? You should know what I want. I should just intuitively pick that up, right? Yeah, it should just know what I want, it should read my mind, right? I used to work with a guy who would make a lot of typos, and I'd always hear him tell the computer, because we always yell at our computers, right: do what I meant, not what I said. [Laughter] So the container that I wanted to call out inside the pod is the dns-node-resolver, and its job is essentially to manipulate the /etc/hosts file, right? So if I do a... is that rsh? And I think that's on the pod... hosts. No, it's not here. So its job is to basically manipulate the /etc/hosts file on the pod. And notice this has the IP of the pod, so this is my IP, meaning the pod's IP address, and all the possible names that are associated with that, and that's the job of this. What's in your resolv.conf for that, on the pod? Yeah, I know we looked at that a moment ago, but just to compare. Yeah, so it has the DNS server, right? So it doesn't have to go and look for the name for the pod itself, because it has that entry in /etc/hosts. For anything else it uses that DNS, and if it doesn't find that, then it just forwards it on to the /etc/resolv.conf file on the node. So, the last remaining five minutes, let's take a look at what I've been promising, the DNS forwarding. Yes, right. So CoreDNS, if you look it up, uh, let's look that up. There it is. I was doing this the other day... it was like a Y or something like that. Yeah, plugins, right? So it has the ability to do plugins. Essentially CoreDNS was made to be extensible, so it goes beyond just name resolution, and one of the things is that forwarding is written as a plugin, and you do that by editing the default
configuration, so let's take a look at that here. Without the dollar. Who put the dollar there? That copied that... tell docs. Okay. So essentially here, let's do this. Okay, so it's not there. Spec: we're deleting the spec and adding these here. So here you're plugging into DNS, saying that anytime someone looks up foo.com, instead of going through that whole chain... So let's kind of recap the chain. The pod looks at its /etc/hosts file locally, then it looks at its own resolv.conf, and if it can't find it, it will then forward it to the resolv.conf file on the node itself. So instead of doing that, the pod will then, uh, sorry, not the pod, CoreDNS: when the pod asks CoreDNS, it'll say, well, for any domain foo.com, I'm just going to forward that request to these DNS servers. In my day I ran clustered DNS servers sprawled for my company all over North America. Sometimes you do DNS delegations. Sometimes, for whatever reason, dumb reasons or, as we say, dollar-sign reasons, you have one that's kind of isolated from the other ones, so they don't know about each other. So if you have stuff out there that you want to reach out to, you may have to ask that DNS server. And this is essentially an array, right? So I say foo-server, the zones, right? I can put baz, anything under baz, and use this DNS server. What I find cool about CoreDNS is that you can specify a port. You can't do that in BIND. In BIND everything's on 53.
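The configuration being edited on screen is not captured in the captions. As a rough sketch only, a default DNS operator object with two forwarding entries might look like the following. The server names and the zones foo.com and example.com come from what is said in the stream; the second zone of bar-server (bar.com) is a guess at the caption's "bar", and the upstream addresses and the non-standard port 5353 are placeholders assumed for illustration, not the values shown in the video.

```yaml
# Sketch of the cluster-wide DNS operator object (edited with: oc edit dns.operator/default).
# Upstream IPs are documentation-range placeholders, not real DNS servers.
apiVersion: operator.openshift.io/v1
kind: DNS
metadata:
  name: default
spec:
  servers:                    # an array: one entry per group of forwarded zones
  - name: foo-server
    zones:
    - foo.com                 # foo.com and anything under it
    forwardPlugin:
      upstreams:
      - 192.0.2.10            # default port 53
      - 192.0.2.11:5353       # an explicit port may be given per upstream
  - name: bar-server
    zones:
    - bar.com
    - example.com
    forwardPlugin:
      upstreams:
      - 198.51.100.10
```

Names that fall under a listed zone are sent to that entry's upstreams; everything else continues along the chain recapped above, ending at the node's resolv.conf.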
Well, in BIND you can listen, but the client can't ask on other ports. And then the bar-server, yeah, bar, example.com, forward it to these DNS servers. So in this configuration, CoreDNS then kind of injects itself in the middle. If CoreDNS can't find the name, instead of forwarding it to resolv.conf it'll try these servers, and then from there, if it doesn't know, then it'll forward it to the resolv.conf locally. So this is a way to add different DNS servers in your cluster, as opposed to the core upstream one from your resolv.conf file. So I have a question for you, Christian, as well as there is a question from Iman. I'll ask mine first because I think it'll be a quick one, which is: CoreDNS, the OpenShift DNS service, also applies to static pods that are deployed, so for example the etcd pods on the control plane, correct? Yeah, so the static pods also get it, because they take part in that SDN, right? So they'll get that same configuration. Yep. And the five-second version is effectively, for pod resolution: if you're in a pod and you do a dig or an nslookup or something, it first looks at its hosts file, it then looks at its resolv.conf, which will point it to the CoreDNS pod, which is a service that's going to be running, there's a pod on every one of the nodes in the cluster, and that will then look across all of the other services and the forwarders that are defined in this DNS config that you have up here, before finally looking to the resolv.conf at the node level. So if it is trying to resolve something external to the cluster, you need to make sure that that resolv.conf on the node is configured correctly, right? Correct, correct. And then also one last thing to interject there: the node's /etc/hosts file also does come into play in that same layer as the resolv.conf file, right? So it'll, um, you
know, we're looking at the host resolv.conf file, and that /etc/hosts there also comes into play. So the question from Iman: can I install OpenShift IPI across multiple network zones, for example one for control plane and one for workers? So you know the answer to that. I do. The answer is, for IPI today, unfortunately no, because of how keepalived works. The way we have it configured is that it requires layer 2 adjacency for it to work. It is possible to configure keepalived to work across layer 3, but that's not how we configure it, and that's beyond the scope of support currently. Yep. Yeah, I have nothing to add. You've seen me answer that like 400 times. Yeah, I know, you've answered it too, so it should be a bot, right? Yeah, maybe I should make one. Yeah, you should have that regex. So, Fahad, another question: can OpenShift span multiple ESXi clusters on the same vCenter when using vSphere IPI? That's an Andrew question. So that is on the roadmap. I know that technically, if you were to look today at the documentation for the upstream in-tree provisioner and all that other stuff, you can create the vsphere.conf and have it understand multiple clusters and all that. We've had a number of customers that are asking about this. We did some internal testing, and we found that it mostly works. With 4.6 I think they were at like 95% success, but they found a few things and they created issues. There is a Jira issue for it; I'll have to dig up the Jira issue, and if I can do that in the next four minutes I'll post it in. So not yet, but multi-DRS-cluster clusters is something that is on the roadmap. Cool, that sounds cool. So another question for you: set up DNS forwarding on my home lab, which works great. If the upstream DNS server becomes unavailable, the authentication cluster operator goes into a degraded state. Any reason for that? So the authentication cluster
operator... So first and foremost, I'm gonna need more information, or to dig into why that would be. So before I go into my long-winded answer, my short-winded answer is: I don't know. But I've had issues, and usually the authentication cluster operator being degraded is a symptom of something else, right? I've always found some weirdness with the router, so if you have a router operator, I would check the router logs, I would look at the authentication logs, and I would look to see whether there might be a breakdown of the DNS chain at that point. The authentication cluster operator probably has the, quote-unquote, upstream DNS name, like oauth dot, you know, clustername.example.com, and that's probably in the pod, and since it can't look that up, it's probably breaking down from there. I'm just guessing at that point. What's happening? Yeah, no such host. Yeah, see, look there. So if you're watching this and not looking at the chat, if you look at the chat it says no such host. That's your external lookup, and I think that's what's failing the health checks for the external lookup. Yeah, I think the authentication operator relies on the quote-unquote well-known endpoint to certify that it's talking to the right thing, and I think that connects externally through a node port, so it looks up the node name, for which it's relying on that external DNS, right? Those aren't stored in CoreDNS. Those will be in either the upstream DNS or in that mDNS responder, depending on your installation type, so I think that's why that's happening. Yeah, I do, yes. So we're just assuming, yeah. Oh, we've got two minutes. Nice. All right. So, Monster Frames: facing an issue... Discord, by the way. Okay, yeah, I don't think we can tackle OpenStack right now. No, I am not an OpenStack expert by any stretch of anybody's imagination. So, uh, please... Chris
doesn't have a cycle too, yeah. So please follow up on either Discord, or you're welcome to send me a message, andrew.sullivan@redhat.com, or if you want to send me a message on Twitter, practicalAndrew, and I'm afraid to say that I believe my direct messages are publicly open, so you can reach out that way. Mine are definitely open, ChrisShort, two S's. Feel free to DM me on Twitter too. Yeah, and we'll connect you with the right people, we'll get an answer for that, so don't hesitate to reach out. So Chris, uh, Christian, my apologies: any last-minute closing thoughts in the last 45 seconds or so? Yeah, so I posted a link to the docs. All I can say is, if you're having trouble with DNS, just follow the chain and see where you're falling down. So many times people blame DNS when it's not DNS, or where DNS is the symptom, so it's good to always troubleshoot with the OSI model. We even had one of those this morning. So thank you, Christian, really appreciate you coming on today. As always, it's a pleasure to have you; I appreciate you sharing. Thank you. So for our audience, thank you for joining us today. As I said, please don't hesitate to reach out at any time with any of your questions. You're welcome to contact me directly via email, andrew.sullivan@redhat.com, or social media, practicalAndrew on Twitter. Christian, do you have social media or other? Yeah, so unfortunately I have a weird social media handle. Not too weird: christianh814, so my first name and last initial, 814. So now you guys know my birthday, and I expect presents. So go ahead, send me a tweet or a DM, DMs are open. It's 2002, right, that you were born? Yes, yes. [Laughter] And as always, Chris, please take us home. Yeah, thank you all. Coming up on the channel here in mere seconds will be an OpenShift Commons Briefing talking about building a multi-cloud provider platform on Kubernetes, so stay tuned. Thank
you all
Info
Channel: OpenShift
Views: 1,378
Rating: 5 out of 5
Keywords: OpenShift, open source, containers, container platform, Kubernetes, K8s, Red Hat, RHEL, Red Hat Enterprise Linux, Linux, OpenShift Architecture, networking, OpenShift networking, OpenShift DNS, domain name system, DNS Operator, pod dns resolution, node dns resolution, CoreDNS
Id: xefHFc5pnJs
Length: 62min 20sec (3740 seconds)
Published: Wed Mar 24 2021