Ask an OpenShift Admin (Ep 22): OpenShift + VMware = BFF!

Video Statistics and Information

Captions
Good morning, good afternoon, good evening, wherever you're hailing from. Welcome to another edition of Ask an OpenShift Admin. Today we are joined by a very special friend from VMware, Mr. Robbie Jerome - if I said that wrong, I'm sorry. Andrew, the cuddly curmudgeon of the company, would you please introduce yourself and what we're talking about today?

Absolutely, thank you Chris. Welcome to the Ask an OpenShift Admin office hour. I still have to glance at the title, because I still want to say "the OpenShift Administrator office hour" - I've decided to put the title in front of me from this point forward. Forgive me if I stumble over the name of the show; this is only the second or third week that we've had the new name. This is one of the office hours series of live streams, which means we are here for you in an ask-me-anything style of interaction. We encourage you to ask us literally anything about OpenShift, about administering OpenShift, about anything that's top of mind, and you can do that at any point across any of the platforms. We have some fancy technology that rebroadcasts everything across all the platforms, so no matter where you are we'll be able to see your question and respond.

In the absence of questions we generally have a topic, and today, as Chris highlighted at the open, that is OpenShift and VMware together. I'm really happy to have Robbie on today. Robbie and I have been working together for two-plus years now, which is kind of funny, because I have spent so much time with him on the phone and on video chats and I don't think we've ever actually met in person. So, Robbie, if you don't mind, please introduce yourself.

Sure. I'm Robbie Jerome, I work at VMware, and I talk about apps and platforms on vSphere in its varying flavors. I've spent probably the best part of the last two years focusing quite heavily on OpenShift and working with Andrew and the engineering teams.

Robbie and I work together quite a bit because we are each other's interface into our respective companies: whenever VMware folks want to know about OpenShift things, he interacts with me, and whenever I want to know about VMware things, I go through Robbie. It's been really beneficial for both of us to learn a lot about each other's stuff.

For anybody watching the stream, please feel free to ask questions. I'm going to spend just a couple of minutes going over the top-of-mind items. For regular watchers of the stream, I tend to spend a few minutes up front on the things Andrew was wrong about last week - nobody sent me any obscene emails telling me I was horribly wrong this time, which is good, though every once in a while it happens. The other part is things that have come up in the last week or so - things I keep seeing internally
and sometimes externally - that I think are worth bringing up to our audience.

The first one: there was a pretty lengthy thread this week about the difference between an OpenShift deployment that mixes operating systems and one that mixes CPU architectures. With 4.7 we announced that, with vSphere IPI, we can run Windows nodes inside an OpenShift cluster: a CoreOS control plane with Windows worker nodes doing everything Windows workers do. That is fully supported, nothing wrong with it, it works great. What we can't do is mix CPU architectures - for example, an x86 control plane with Power nodes or IBM Z nodes. We get that question somewhat frequently, especially from our peers at IBM, but unfortunately all of the nodes have to be on the same CPU architecture. That's a core Kubernetes constraint, and there's not much anybody can do about it at the moment; mixed-architecture clusters are hard, period. I'll post a link into the chat - I have this Kubernetes issue bookmarked because I use it at least once a week. It's the upstream issue about not being able to mix different cloud providers. With "cloud providers" what we're really referring to is infrastructure types: I can't have some nodes deployed to vSphere and some nodes deployed to physical servers unless I do a non-integrated OpenShift deployment - what used to be called "bare metal UPI." That's a Kubernetes cloud provider limitation: it doesn't recognize nodes that can't all be managed the same way, so it doesn't allow them to join the cluster.

Second one: node autoscaling. This question comes up every week or two - can I have a cluster preemptively scale based on some criterion? Say the cluster has hit an 80% CPU threshold, or 70% memory: can we go ahead and provision extra nodes using the IPI mechanism? The answer is unfortunately no - but also maybe. The cluster autoscaler, the node autoscaling mechanism, doesn't work that way; it doesn't take action until pods fail to schedule. If I submit a job that needs 40 pods with some amount of CPU and RAM, and at pod number 26 I'm out of capacity - there is no more CPU available - that's when it takes action and starts spinning up nodes so the rest can schedule. If you want preemptive scaling, there are a couple of ways to do it and they're all basically the same thing: additional automation. Maybe that's a simple pod deployed to the cluster that checks those metrics internally and runs an oc scale (or something like it) against the machine set; or the same thing done externally - your external monitoring system, whatever you happen to use, watches those metrics and takes the action for you.
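There is no built-in feature for this, so any example is necessarily home-grown automation. A minimal sketch of the kind of loop Andrew describes, assuming a hypothetical MachineSet name and that the script runs with cluster-admin credentials, might look like this:

    # Hypothetical preemptive-scaling loop (not a supported OpenShift feature).
    NAMESPACE=openshift-machine-api
    MACHINESET=mycluster-worker-0   # placeholder; list yours with: oc get machineset -n openshift-machine-api

    # Average CPU% across worker nodes, as reported by the metrics API
    AVG_CPU=$(oc adm top nodes -l node-role.kubernetes.io/worker --no-headers \
              | awk '{gsub("%","",$3); sum+=$3; n++} END {print int(sum/n)}')

    # If utilization is above 80%, add one more replica to the MachineSet
    if [ "$AVG_CPU" -gt 80 ]; then
      CURRENT=$(oc get machineset "$MACHINESET" -n "$NAMESPACE" -o jsonpath='{.spec.replicas}')
      oc scale machineset "$MACHINESET" -n "$NAMESPACE" --replicas=$((CURRENT + 1))
    fi

The external-monitoring variant is the same idea, just with the check driven by whatever alerting system you already run.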
The last one I have is extracting Ignition files. This was a fun one: somebody asked, "I accidentally deleted the installation folder for my cluster and I need to add some more nodes. This is a UPI install - how do I get the Ignition files so I can add those nodes?" Because for UPI you have to provide Ignition, how do you get that back out of the cluster? There is a command we can use - let me see if I can share screens here; sometimes it takes a little faith. This is my cluster for today, a vSphere cluster running on vSphere 6.7 U3, I believe. It happens to be IPI, but that really doesn't matter in this instance. The command is oc extract: from the openshift-machine-api namespace we pull secret/worker-user-data, and what we get back is the Ignition. If I'm doing vSphere UPI, this is what I would attach as that machine property; if I'm doing bare metal or something like that, this is what would be hosted on the web server that we pass in. We can do the same thing for the control plane nodes: change "worker" to "master" and pull that one out, and so on. Super straightforward. I'll paste the command into the chat - oops, thank you macOS for being overly helpful.
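For reference, the extract commands Andrew ran look roughly like this; the secret names are the defaults the installer creates, and the output files are just example names:

    # Worker Ignition config, from the user-data secret kept by the machine API
    oc extract -n openshift-machine-api secret/worker-user-data --keys=userData --to=- > worker.ign

    # Same idea for the control plane
    oc extract -n openshift-machine-api secret/master-user-data --keys=userData --to=- > master.ign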
All right, those were the three things I had leading into today. I do want to take a moment to highlight something very important: tomorrow at 10 a.m. we will have the What's Next presentation streamed here on OpenShift.tv and across all the various platforms. What's Next is the roadmap presentation for OpenShift. There are usually two of these each quarter: a What's New, which covers what's coming in the next release, and a What's Next, which covers what's on the roadmap a little further out. I will be on the stream, and I'm sure Chris will be as well, answering questions. Anybody who's interested, please join. If we can't answer a question we'll take it and re-ask it internally - we have multiple ways of harassing the product managers to get answers - so don't hesitate to ask anything about what's coming on the OpenShift roadmap.

All right, that's enough of me rambling; we already have a list of questions. Somebody - and I did not know you could do this - when we schedule these, especially now that we are part of Red Hat's YouTube channel, they get scheduled a week or two in advance, and apparently you can go into that pending stream and chat from the moment it's scheduled. So somebody asked us a question last week, which is pretty interesting to me. I think we'll start with that one and then come down through the rest of the chat.

First: what about Storage vMotion happening on an OpenShift worker node which is carrying VMDKs provisioned by the CSI driver - essentially PVCs from the CSI driver? Some good news: we recently updated the documentation. Let me find the right tab and share it. I'm on the OpenShift 4.7 docs, under installing a cluster on vSphere - go to Installing, then Installing on vSphere, take the first option, and further down there's a section on using vMotion with OpenShift Container Platform. Previously the guidance ranged from "vMotion is not supported" to "vMotion works but it's a support gray area, so you probably shouldn't do it." Now it is officially tested, it works, and it's fully supported - and you can see here it's compute-only vMotion. The "compute only" is the important part; that's what's tested. The docs make some comments about migrating with PVs and how it might change the references. I haven't verified this, but I think that's talking specifically about the in-tree provider. Remember, with an OpenShift deployment - IPI or UPI - we automatically configure and use the in-tree vSphere storage provisioner; it's there out of the box, and that's the one fully supported by Red Hat. The CSI provisioner is different: you deploy it day 2, and it's supported by VMware. So my understanding is that this paragraph - "we don't want you to do Storage vMotion" - is about in-tree PVCs. Robbie and I believe - and Robbie, please speak up if I'm speaking out of turn - that this should work with CSI-provisioned volumes. However, if you want to be absolutely sure and comfortable, I would recommend doing a cordon and drain and performing the Storage vMotion while the node isn't hosting any workload; and if you really want maximum comfort, turn the node off - shut it down so it's a cold migration.

And as you say, Andrew, with the new provider loaded in, I move nodes around all the time in my home lab and never have any kind of issue. But if you want to be absolutely in line with the support statement, that's what's in the docs right now; your mileage may vary, though it does seem to work spot on. The new driver has a lot of changes, and it does have different version requirements, so check your version numbers if you're going to run the new driver instead of the in-tree driver.

I should have asked you before we started, Robbie, whether you have a cluster on vSphere 7 with the CSI driver. One of the things I'm hoping you'll show is some of that vCenter integration that comes with the CSI driver, because we talk about it a lot but we haven't often shown it: as the vCenter administrator, I want to feel comfortable, I want to see the things I have. If you don't have that, that's OK - we'll follow up and include it.

Give me about two minutes so I can magically create one. Magic - gotta love magic.
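The cordon-and-drain step Andrew recommends before a Storage vMotion looks roughly like this; the node name is a placeholder, and you should check oc adm drain -h for the eviction flags appropriate to your version:

    # Placeholder node name; list nodes with: oc get nodes
    NODE=mycluster-worker-abc12

    # Stop new pods from landing on the node, then evict the existing workload
    oc adm cordon "$NODE"
    oc adm drain "$NODE" --ignore-daemonsets --delete-emptydir-data

    # (For a cold migration, power the VM off in vSphere at this point.)
    # Perform the Storage vMotion in vCenter, then return the node to service
    oc adm uncordon "$NODE"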
JP Dade asks: can we do a bare metal IPI vSphere installation of OpenShift 4.7 with Windows worker nodes? "Bare metal IPI" would imply we're doing something like a vBMC, so no - because I think what you're asking is, can I deploy some nodes to vSphere and some nodes that are physical, and some nodes that are Windows, and we can't mix like that. I'll take it further: Windows worker nodes are only supported on Azure or on vSphere IPI; those are the only two installation methods where Windows worker nodes are supported at all. JP Dade clarifies: these are on an HPE Synergy frame, it's all virtual. OK - so unfortunately, or fortunately depending on your perspective, it would need to be a vSphere IPI deployment. Make sure you watch the stream from two or three weeks ago when we had Christian on - he talked about Windows nodes. There are some caveats there: make sure you're using the OVN SDN with hybrid mode so it works across the different operating systems, all that fun stuff.

Next: "What do I get from the vSphere cloud provider other than storage?" That's a great question - Robbie, feel free to interject here as well. The cloud provider is responsible for managing nodes in an IPI deployment, but it can also work with UPI. The machine API and the machine config operator are used to create, destroy, and configure nodes that are part of the cluster, and the cloud provider - "cloud provider" is a bit of an overloaded term; depending on who you're talking to in Red Hat, and certainly across other vendors, we sometimes use it to mean different things - is the bit that talks to the underlying infrastructure provider and provisions and manages those nodes. It is available through UPI; let me dig up the link. There's a KCS article, "How to create a machine set for VMware in OpenShift 4.5," which covers deploying with UPI and then adding node scaling - the machine set paradigm - inside that cluster. This brings some of the IPI requirements back into your UPI cluster: with UPI you can use static IPs, and you create the DNS entries, the load balancer, and all of that yourself, but when you introduce machine sets and start doing dynamic node provisioning you're going to need DHCP - those nodes have to get their IPs from somewhere - and you'll want to be careful about how the routers land across those nodes. The reason is that you could end up with a router on one of these new machine-set-provisioned nodes that hasn't been added to the load balancer, because remember, it's UPI, so the load balancer is manually configured by you. So either use a load balancer that has an operator for integration, so it can be updated dynamically, or create an infrastructure machine set so you always know where the routers are and can keep the load balancer pointed at them. Otherwise it works great, and it's really good for dynamically scaling the cluster up and down while still taking advantage of your external enterprise load balancer instead of the built-in VIP handling that IPI provides. Hopefully that answers your question.
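For illustration, a vSphere MachineSet of the sort that KCS article describes looks roughly like the sketch below. Every value - the infrastructure ID, template, network, and vCenter details - is a placeholder you would replace with your own cluster's values, and the exact fields should be checked against the docs for your release:

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      name: mycluster-abcde-worker-0          # placeholder: <infrastructure_id>-worker-0
      namespace: openshift-machine-api
      labels:
        machine.openshift.io/cluster-api-cluster: mycluster-abcde
    spec:
      replicas: 2
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-cluster: mycluster-abcde
          machine.openshift.io/cluster-api-machineset: mycluster-abcde-worker-0
      template:
        metadata:
          labels:
            machine.openshift.io/cluster-api-cluster: mycluster-abcde
            machine.openshift.io/cluster-api-machine-role: worker
            machine.openshift.io/cluster-api-machine-type: worker
            machine.openshift.io/cluster-api-machineset: mycluster-abcde-worker-0
        spec:
          providerSpec:
            value:
              apiVersion: vsphereprovider.openshift.io/v1beta1
              kind: VSphereMachineProviderSpec
              credentialsSecret:
                name: vsphere-cloud-credentials
              numCPUs: 4
              memoryMiB: 16384
              diskGiB: 120
              template: mycluster-abcde-rhcos        # placeholder RHCOS template name
              network:
                devices:
                - networkName: "VM Network"          # placeholder port group
              userDataSecret:
                name: worker-user-data
              workspace:
                server: vcenter.example.com          # placeholder vCenter
                datacenter: Datacenter
                datastore: Datastore1
                folder: /Datacenter/vm/mycluster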
I think that covered it - if that didn't answer your question, please feel free to follow up. I also see: "Can you talk about machine API machines and machine sets if we're using UPI?" Very quickly: machine API machines and machine sets effectively don't apply with UPI, unless you're doing something like the KCS article above, where you create the machine set after the fact. That doesn't exist for every provider - I think the only one it's documented for, the only one we've really been asked about that I'm aware of, is VMware. Normally, with UPI and without that day-2 machine set configuration, those objects wouldn't apply.

There's a quick one, Andrew, back on the CSI topic from chat: shouldn't we use CSI and forget the built-in driver? Taking out the built-in, in-tree driver is a bad idea - Andrew, you want to speak to that? I'll say it's not possible, because the cluster storage operator will ensure it's always there; it puts it back. And I think it's tied quite closely to the registry. You can have both in, and you can change the default: change the default storage class to the new vSphere CSI driver, which gives you the integrations into vCenter and visibility into the provisioned VMDKs and PVs up in vCenter. But don't spend too much time trying to delete the existing in-tree one, because the operator is going to put it back, and it's going to upset your cluster a little bit if you do successfully manage to remove it. So: change the default, yes; delete the existing one, no.

And Robbie, I think you'll agree with me: I would encourage folks to use the CSI driver - the integration with vCenter, the generally improved capabilities, especially if you're also a vSAN user, the reporting, snapshots, the RWX file volumes. Yes - with the vSAN integration you can use storage policies, so you can set your encryption levels, have different numbers of copies on vSAN, and so on; you've got far more control with the new driver. You also get access to some of the newer capabilities if you're on the VCF platform, like the file access features, which are quite new and I don't think we've spoken about before - maybe we'll do another session on that another time. I have a whole line of VMware shows that I hope to get to someday.

I'm going in order here, so forgive me if you've asked a question and we haven't gotten to it yet; I've got them all taken down as notes.
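Changing which class is the default, rather than deleting the in-tree one, comes down to a pair of annotations. A rough sketch, assuming the in-tree class is named "thin" and the CSI class "thin-csi" (the names vary by cluster, so check oc get storageclass first):

    # Stop marking the in-tree class as the default...
    oc patch storageclass thin \
      -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'

    # ...and mark the CSI-backed class as the default instead
    oc patch storageclass thin-csi \
      -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'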
Daniel asks: with a cluster installed using UPI before static IPs were easily available, can we replace DHCP with static IPs? Yes, but no. Yes, you can; the problem is that doing it is messy. The way to do it inside the cluster post-deployment is either to create a MachineConfig for each node that drops something into /etc/sysconfig/network-scripts or into the NetworkManager directory telling it what its static IP is - and that gets messy, because now I have a MachineConfig for each of my nodes, and potentially a MachineConfigPool for each of my nodes, and it's just not nice. The other way, as of 4.7 in Tech Preview, is the NMState Operator: you could use NMState to go in and reconfigure that interface with a static IP address. The recommended way of doing it, for better or worse, is to reload the nodes: take them down one at a time and redeploy, using the VM property - now that we can use the VM property to pass in that static IP information - and do it that way. I know that's not the friendliest or most ideal approach, but sometimes I have to remind people that we treat CoreOS as an appliance and as disposable. Super-low-level config changes, like the IP address of the primary interface on the machine network subnet, are things we want to treat a little carefully. I hope that answers your question, Daniel.

Waleed comments: it's not just vMotion, it's also DRS that reschedules VMs based on utilization. I know this isn't a question, it's a statement, but I really want to touch on something important here, which Robbie and I have been talking about for basically two years: what is the relationship between OpenShift nodes deployed to vSphere and DRS, resource reservations, and that type of resource management? So I'll ask you, Robbie: how should we configure those things?

Carefully... no, it's actually fairly straightforward. With ESXi and vSphere clusters, the system is looking after CPU and memory for you. When you deploy a VM onto the cluster, it gives it resources and balances them against everything else running on that cluster or on that ESXi host. If the host starts to become resource-constrained - you're using too much memory or too much CPU - workloads get pulled back: you might not get as much CPU, or you might become memory constrained, and that will impact the performance of the application. In this case, if OCP slows down too much it's going to get upset, think there's a problem with a node or a master, and try to resolve that. If you use a resource pool, you can allocate a certain amount of dedicated resources to your masters, for example, or to a particular set of workers, and say: for these virtual machines, I want to guarantee they will always have at least 16 or 32 gigabytes of RAM. That has no impact whatsoever on resource management unless the ESXi host starts to become resource-constrained, at which point the VMs in your pool always keep those 16 gigs, and you're good. A subset of that is the concept of shares, where you can sub-allocate even further and say that of those 16, these particular VMs have shares - but most people these days just stick with reservations
to guarantee a certain amount of resources. So in an OpenShift deployment you would allocate a dedicated resource pool for your masters, your control plane, so that if you do start to run out of resources you know that's protected. And if you have particular workers that you shed particular tasks to, and you want to guarantee that subset of workers has resources, you can put those into a resource pool and allocate it yourself. At the same time you might tag those nodes as a special kind of node so you can schedule workloads straight onto them, and if you're filling up the cluster you can still guarantee those resources. That's for a single ESXi host. Across a number of hosts in a cluster, DRS - the Distributed Resource Scheduler - will vMotion, live migrate, the machines around to make sure you've got the best possible utilization across the entire ESXi cluster, balancing those workloads for compute, memory, and I/O. If you're resource-constrained across the entire cluster, things will move around and resource pools will kick in. So you've got the flexibility to overcommit resources - to say "I need to use 300 gigabytes of the 250 gigabytes I've got on this ESXi host," because I know that the VMs making up my OpenShift cluster aren't all going to be fully utilized all the time; I just want them to think they have the resources, and the workloads will balance out appropriately. It's a way of protecting critical parts: I protect some of my workers and I protect the masters by giving them resource pools, but unless you are actually running out of memory it's not going to kick in.

I'll ask you this, and I think you and I might have slightly different perspectives. In multiple other streams I've suggested that you should never overcommit the control plane nodes, and that for worker nodes overcommit should be handled as close to the application as possible: if I want to oversubscribe my CPU and memory - putting applications that want more CPU and RAM than is available on the worker nodes - that should be done at the Kubernetes, the OpenShift, level, and the resources should be reserved, essentially protected, at the hypervisor level. What you're saying is essentially: let vSphere do what vSphere does.

Yes - and it depends on your workload and how dynamic you want to be. For a development cluster, for example, overcommit makes sense; in a production cluster where you know and understand the workload, maybe you don't overcommit. With Kubernetes, when it schedules a workload it asks, "have I got 50 gigs of RAM and this much CPU?" If yes, it schedules the pod there and off you go; it doesn't go back and check. That resource is allocated to that workload, and if the workload doesn't actually use what it asked for - it uses one gigabyte of RAM and almost no CPU - those resources are wasted.
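That scheduling behaviour is driven entirely by the resource requests in the pod spec. A minimal illustration, with arbitrary names and numbers: the scheduler reserves the requests on a node whether or not the container ever uses them, while the limits are only the ceiling the container may consume.

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-build
    spec:
      containers:
      - name: build
        image: registry.example.com/builder:latest   # placeholder image
        resources:
          requests:        # what the scheduler reserves on a node
            cpu: "2"
            memory: 4Gi
          limits:          # the most the container is allowed to consume
            cpu: "4"
            memory: 8Gi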
Meanwhile, ESXi is constantly monitoring memory and CPU utilization, so it will rebalance and do some clever things, like memory compression, to make sure you get the most out of whatever workload is running on the platform. So you can safely overcommit memory and CPU, and OpenShift is completely unaware - it's quite happy. The only point where it gets interesting is when you actually start to become resource-constrained: ESXi will do things like reduce the amount of CPU you have, the VM slows down, and that slows down the processing of your workload. Linux and Kubernetes are completely unaware of that; things just run a little slower. ESXi does time-slicing to make sure every virtual machine gets CPU cycles, and it does the same with hyper-threading: a hyper-thread is effectively half a cycle, and on ESXi you don't get half a cycle - that thread will get scheduled twice if it lands on a hyper-thread. It does some really clever stuff. You can get super nerdy with CPU cycles and NUMA awareness and really tune the performance of the cluster, or you can just overcommit a little bit and monitor it to see how it's going - but it can drop off very quickly. To your point, Andrew, about being cautious: if your workload really is using all 32 gigs of RAM and all of that CPU all the time, there's no benefit to overcommitting - the ESXi scheduler is still going to try to deliver that capability. If you know the workload is always going to be busy, don't overcommit: say "this is a prod workload, this is how much it needs," put it in a resource pool, and it's fine. But for dev clusters or build clusters, where you've got fairly rapidly changing memory and CPU requirements as things get built and destroyed, you can get more bang for your buck on the hardware if you go with overcommit.

I'd love to get all nerdy and down in the weeds on that, but I don't know if we have enough time, especially since we keep getting more questions - thank you, everybody, we will address those. Robbie and I have talked about creating what's usually referred to as a best practices guide - maybe not "best practices" so much as "here's what you need to know about OpenShift and VMware." That's one of our background tasks that hopefully we'll get to. We should create a little video of overcommitting CPU and watching what it does. That'd be fun.

JP Dade asks, and this question is for you, Robbie: can we use VCF, VMware Cloud Foundation, with OpenShift 4.7? That's easy: yes. [Laughter] VCF is one of the supported platforms. You create a workload domain for your OpenShift clusters, drop them into it - Andrew's finding me some docs - and spin it up as you would any other workload domain. Use the cloud native storage driver to integrate your vSAN, and if you want to do some stuff with NSX and plug that into OpenShift you can do that as well, depending on which version of VCF you're running. It integrates really nicely, Andrew's demonstrated it, and we even have documentation and some best practices on how to do it.
I see a question - I don't know if this one is VMware-specific: "Planning to buy a Ryzen 5000 series CPU. With bare metal CoreOS, can I install OpenShift bare metal?" Yes. I actually use a Ryzen 2000 series in my home lab, so yes, CoreOS supports those CPUs. As for the kernel version, it's the same as RHEL. The simplest way to get the actual running kernel version - let me switch what I'm sharing - is to do an oc get node, then an oc debug against one of the nodes. We'll pick one of these; it pulls down a debug pod, and once that comes up we do a chroot /host, as it says, and then a simple uname -a. You can see this brand-new 4.7.2 node is running kernel 4.18.0-240. So it's pretty easy to find the real, actual running kernel. If it's not a running host that you can ask, it might be in the release notes - I'm not sure, we'd have to check.
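For reference, the kernel check Andrew just demonstrated is simply a node debug session; the node name below is a placeholder:

    # Pick any node name from the cluster inventory
    oc get nodes
    oc debug node/mycluster-worker-abc12

    # Inside the debug pod:
    chroot /host
    uname -a    # e.g. a 4.18.0-240 kernel on a 4.7.2 node, matching RHEL 8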
Let's see - Chris, I think you're working on one. Yeah, I got Roger Moore handled; pointing them to the container catalog was a huge help. If you're trying to run containers as root on OpenShift, you're going to have problems, and that's OK - we want you to have those problems, because running everything as root is not a healthy practice. Yep - the whole security thing we forcefully place upon you, whether you want it or not.

"Are infra nodes declarable in the install-config.yaml?" Not in install-config. You can create them in the manifests and have them deployed when the cluster is set up, but you still have to go in on day 2 and modify the workloads to actually schedule onto them. Christian Hernandez did a great stream on this - actually I think he did one by himself and he and I did one together, but it was last year. We'll dig out the links and put them in the show notes; I want to say it was late summer or early fall last year.

And for anybody who doesn't know, we have a weekly blog post on the openshift.com blog that goes out Friday morning, recapping everything in the show and providing links to everything here. So if you missed a link in the chat, or didn't see something that I typed, you can go back to that blog post and find it. Thus far it has always published early Friday morning - I say early Friday morning, but it would actually be lunchtime for Robbie, since you're over in the UK. I've given up working out time zones; I just sit here forever. I always feel bad because I talk to you a lot in what is effectively mid-afternoon for me, which I know is early evening for you, and you always brush it off with "oh, but I have to deal with the West Coast time zone for VMware stuff" - I still feel bad.

Waleed has more questions around DRS. OK - I think I'm getting the questions in a different order to you guys, but there's one I want to call out here around affinity rules: what should we do about how we manage affinity for the control plane? When you do a default install, IPI or UPI, OpenShift doesn't create any affinity rules at the vSphere level; the VMs get spun up, but there are no affinity rules. If you want to start building resiliency into it and making sure your control plane VMs are spread across physical ESXi hosts, you need to go into vCenter, tag the VMs - this is my first control plane node, my second, my third - and say that none of them can exist on the same ESXi host. If they're already on the same host, vSphere will just vMotion them off and they'll end up on physically different servers, so if you lose a physical piece of tin you're only going to lose one of those control plane nodes at a time. You can do exactly the same thing for workers: if you've tagged different workers for different workloads, you can use affinity rules to spread those across ESXi hosts, and depending on the size of your environment that may extend out to physical racks in the data center. You can have a policy that says, "I'm only going to have one control plane VM per rack, and I'm going to spread these workers across the ESXi hosts." So you can get quite sophisticated with your rule set about where parts of OpenShift are deployed, either in VCF or in a home-built vSphere environment. vSphere takes care of it, and OpenShift doesn't care. The only thing to be aware of is that if you provision a new OpenShift node, or recreate a control plane node that's gone, it doesn't have an affinity tag by default, so you need to go in and recreate that or use automation to do it for you.

The other thing to remember, which is a nice feature: there's a little tick box that says highly available. Tick that box for the control plane nodes - really for any of the nodes. That way, if we lose a node because of a hardware failure, or it just stops responding for whatever reason, vSphere will spin it up somewhere else immediately, and it's really quick at doing that. If you lose a worker because of a hardware failure, vSphere can often put the worker back before OpenShift has even realized it lost that node, and then the Kubernetes layer will schedule pods and handle all of its internal workload magic as well. They work really well together, but out of the box those settings aren't turned on: high availability for the control plane, and affinity rules to make sure you're not landing on particular physical hosts, if you want that level of control.

The only thing I'll add is that I always recommend soft anti-affinity rules, just in case - especially if you have a relatively small number of hosts in the cluster. If your vSphere cluster is only three hosts and you set hard anti-affinity rules for the three control plane VMs, then when one of those physical hosts goes down, vSphere won't restart that control plane VM, because doing so would violate the hard anti-affinity rule. If it's a soft anti-affinity rule it will restart it, and when the physical capacity comes back it will automatically move it.
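Neither the installer nor OpenShift creates these DRS rules for you, so they are usually made in the vCenter UI or scripted. As a rough illustration only - assuming the govmomi govc CLI and hypothetical cluster and VM names, and with the caveat that you should confirm the exact sub-command and flags with govc cluster.rule.create -h on your version - a VM-VM anti-affinity rule for the control plane might be created like this:

    # Hypothetical example; all names and credentials are placeholders
    export GOVC_URL='vcenter.example.com' GOVC_USERNAME='administrator@vsphere.local' GOVC_PASSWORD='...'

    # Keep the three control plane VMs on separate ESXi hosts
    govc cluster.rule.create \
      -cluster /Datacenter/host/Cluster1 \
      -name ocp-control-plane-anti-affinity \
      -enable -anti-affinity \
      mycluster-abcde-master-0 mycluster-abcde-master-1 mycluster-abcde-master-2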
In the interest of time, I'm going to rapid-fire some questions here. Daniel, referring to the CSI provisioner: "the primary benefits come if you're using vSAN, then, right?" My perspective is no: even if you're using traditional datastores backed by whatever storage you have, you still get a tremendous amount of visibility in vCenter into the PVCs and PVs that are inside them. Absolutely - if it's a datastore in vSphere, you can apply the same storage policies that you can with your traditional datastores. We talk about vSAN because it's our thing, but anything that's a datastore in vSphere, CNS is going to take care of for you, and you get that visibility at the VMware level; it's not just tied to vSAN. You said "CNS" there, and I don't think we've explained what CNS is and how it relates: Cloud Native Storage, or CNS, is what VMware calls the vCenter side of the CSI driver integration. Lots of fun with terminology. Let's talk about clusters, Andrew. [Laughter]

"Is it possible to change cloud providers of an existing cluster?" Unfortunately, no. I wish that were possible - it would solve a lot of support headaches for us.

"Why such hate for DHCP?" I don't know; I wish I knew. Andrew's opinion: there's a lot of old-school thinking about DHCP - it was never used in the data center, for "security" reasons: anybody could plug in and get an IP address. That said, that's just my opinion; I find DHCP very helpful, but then again these days I just run a lab, I don't run a data center anymore.

I've got one here for you: "Are infra nodes declarable in the install-config.yaml yet?" We addressed that one - unfortunately, no. To quickly revisit: you can create them at install time by putting the MachineSet YAMLs into the manifests - you do an openshift-install create manifests and then drop them in there - so they get provisioned, but you have to go in after the fact and modify the services to actually deploy onto them.

JP Dade: "I see that you debug into the node; I've used SSH to access the nodes. What is your preferred way of accessing the nodes? Is one more dangerous than the other?" I tend to use oc debug simply because it's easier: I don't have to provide the specific SSH identity, and I don't have to constantly edit my known_hosts - I deploy clusters three or four times a day sometimes, and if I SSH into those nodes I'm constantly having to remove entries. So I use oc debug to get into nodes probably 95% of the time. There are occasionally things, like needing access to devices - say I need to check whether a USB device I plugged in was detected - where I'll SSH in instead. And I see Christian answered that question as well.
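To make the install-time option mentioned a moment ago concrete, a rough sketch of the flow - the directory and file names are placeholders:

    # Generate the install manifests for the cluster
    openshift-install create manifests --dir=./mycluster

    # Drop a pre-written infra MachineSet alongside the generated machine-api manifests
    cp infra-machineset.yaml ./mycluster/openshift/99_openshift-machine-api_infra-machineset.yaml

    # Continue the installation; the MachineSet is created with the cluster,
    # but moving workloads (router, registry, monitoring) onto the infra nodes
    # is still a day-2 task.
    openshift-install create cluster --dir=./mycluster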
"Is there a way to control different cluster architectures in one panel? For example, I have three clusters..." I think we have ACM - sorry, the question broke up a little. Chris, I think you helped answer that one: ACM would be the Red Hat answer. It kind of depends on what you're trying to do - whether you need to give your developers access into all the clusters, versus the admin and ops side. Andrew needs to get more familiar with ACM, but I think what you're asking for is something like vCenter: one interface where I can join all of my clusters - cluster one here with its admin console, cluster two there - and manage them together. The same idea as Cockpit, where one Cockpit host connects to other hosts and can manage them. (I'm showing this and just realized I'm not sharing that window - lovely, thank you.) So no, unfortunately; ACM is the closest thing I know of, but again, I'm not super familiar with ACM, so it might not meet your needs. Christian says his answer to everything is Argo CD - GitOps does solve a lot of problems along that route, and you can definitely get that kind of all-in-one view with Argo across multiple clusters.

I see another question - Robbie, I don't know if you know of anything here: "Are there any Kubernetes enhancements that will have insight into the fact, or danger, that Kubernetes nodes are running on the same physical host?" Excellent question. I know there's a lot of talk in the SIG about it, but I don't know if anything has actually landed. That would be affinity rules in Kubernetes itself, but it would be an interesting RFE for the cloud provider - having the cloud provider report that information, or even configure affinity rules. The latest documentation on this from the Kubernetes side - I'll link it here for posterity - goes through how you would approach it today: node isolation and restriction, and affinity and anti-affinity rules are all in play there.
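As a minimal illustration of the Kubernetes-side rules mentioned above (all names are hypothetical), a deployment can ask the scheduler to keep its replicas on different nodes with pod anti-affinity. Note this spreads pods across nodes, not across physical hosts - which is exactly the gap discussed, since Kubernetes can't see the hypervisor topology underneath.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: example-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: example-app
      template:
        metadata:
          labels:
            app: example-app
        spec:
          affinity:
            podAntiAffinity:
              # "preferred" is a soft rule; use requiredDuringScheduling... for a hard rule
              preferredDuringSchedulingIgnoredDuringExecution:
              - weight: 100
                podAffinityTerm:
                  labelSelector:
                    matchLabels:
                      app: example-app
                  topologyKey: kubernetes.io/hostname
          containers:
          - name: app
            image: registry.example.com/example-app:latest   # placeholder image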
Waleed mentions HA and fault tolerance. HA and FT are not necessarily the same thing. Fault tolerance is the same VM executing on two separate hosts at the same time, so if one of the hosts fails, the other picks it up and runs with it immediately. To your point, Waleed, there is a limit of eight vCPUs per VM for that, and while FT is fine if you need it, be aware that it uses a lot of network throughput to do that replication, there are some very specific requirements on FT, and it's super latency sensitive. I don't think I've ever tested it with OpenShift, or even with Kubernetes, because HA typically puts things back before Kubernetes cares.

"Suggest any tools to deploy an OpenShift cluster in one click, apart from Ansible?" I'm answering that right now - I'm thinking of something small and lightweight, like a Kubernetes-in-Docker kind of scenario. Actually, Robbie, is there anything in the vRealize realm? Yes - we can use vRealize Automation to provide the one-click button that kicks off the deployment for you; there's some stuff up front that you obviously have to provide for the configuration. Andrew, we messed around with it with the UPI installer, creating Ignition files and automating that; IPI I think is fairly trivial to do, actually. Yes - IPI isn't one click, but it's one command: you fill out your vCenter, your credentials, which IPs you want to use, and it goes. And vRealize will let you provide all of that automatically. We don't have anything out of the box, because vRealize is mostly about what your environment looks like and how you want to automate it, but we have customers that have that one click. Normally we use it with some governance wrapped around it, so the one-click deployment is more about "are you allowed to deploy that 32-node OpenShift cluster, or do you want to get someone's permission first?"

We've only got about five minutes left. Robbie, one of the questions I wanted to ask you: we generally encourage folks to use CSI, and for customers who have NSX we generally encourage the NSX integrations. Are there other things, maybe in the VMware portfolio or in the Red Hat portfolio, ways those two things can integrate and deliver more value? Yeah - we spend a lot of time talking to customers about the other solutions in the portfolio that were normally VM-based, and we've been moving everything to be more container-focused over the last few years. I'm going to take the opportunity to share a slide - give me a sec.

While Robbie gets set up, I'm going back to questions here: "Are anti-affinity rules recommended on the entire OpenShift cluster, or should you do it with the VMs in VMware - what's the right answer there?" Andrew's opinion is that soft anti-affinity rules are strongly recommended for the control plane nodes, probably for infrastructure nodes if you're using those - definitely the ones running routers - and for worker nodes it's really going to depend on you, your applications, and a number of other things. One of the related topics - a conversation for another time - is the philosophy of, if I'm deploying OpenShift to vSphere or really any virtual infrastructure, do I have fewer, larger nodes, maybe one VM per physical host, or more, smaller nodes, where I end up with three, four, or five OpenShift worker nodes per physical host, and what the trade-offs are. But I don't want to interrupt what Robbie was going to say.

That's another great topic. Can you see that - is it working? Perfect. So we've been talking mostly about the vSphere cloud provider stuff, which is the stuff at the bottom here: vSphere and VMs. We've talked a bit about CNS and the CSI drivers for vSAN and storage. NSX-T integration gives us the full view into the pod networks, just as it does the VM networks. So the bottom part is very much about giving you the view you already have for VMs, but at the OpenShift level, and making everything a level playing field.
The other thing we've been working on a lot is day 2: how do you see the applications running on OpenShift alongside all of your other workloads? That means feeding all the logs into something like Log Insight or Log Insight Cloud, along with the ESXi logs and the VM workload logs, and getting that end-to-end view - so you can see whether the database sitting in Oracle somewhere, which you haven't touched in a decade, is causing a problem for your microservices in OpenShift. That's where we've been focusing some of these integrations. On the observability side, Wavefront was one of the first operators we got certified on OperatorHub, actually, Andrew: you can go into OpenShift, go to OperatorHub, say "give me Wavefront," and that gives you dashboards with a real-time view of OpenShift alongside vCenter, vSphere, vSAN, and any other VMs in the environment. We focus very much on making sure OpenShift works alongside everything else and plugs in the way you'd expect it to. We don't really have time to go into all of this, but there are lots of other things beyond just the cloud provider interface that we've been working on.

At the bottom of your slide you had OpenShift on VMC, VMware Cloud on AWS. That one's near and dear to my heart because you and I worked on it together and published it: anybody who's a VMC customer and wants to deploy OpenShift onto VMC, that's now a fully supported platform, and Robbie and I went through and wrote the deployment guide for it. That's where I got re-familiarized with Log Insight - the last time I used Log Insight was, let's see, this is 2021, so nine years ago. It was a great product then and I can only assume it's gotten better; it was one of those products I'd forgotten existed. And you can absolutely configure OpenShift - if you don't want to use the OpenShift logging service for whatever reason, or if you've got Log Insight already and don't want to run two systems - to forward only to Log Insight, or to go to both: to the OpenShift cluster logging service and to Log Insight, and so on. It comes up sometimes where you just want the operations folks to have their logs in one place and the dev folks to have their logs somewhere else, and everyone gets that.
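The forwarding option Andrew mentions is configured through the cluster logging operator. As a rough sketch - the syslog URL below is a placeholder for an external ingestion endpoint such as Log Insight's, and the fields should be checked against the logging docs for your release - a ClusterLogForwarder that sends logs externally while also keeping the internal store might look like:

    apiVersion: logging.openshift.io/v1
    kind: ClusterLogForwarder
    metadata:
      name: instance
      namespace: openshift-logging
    spec:
      outputs:
      - name: loginsight
        type: syslog
        url: 'udp://loginsight.example.com:514'   # placeholder endpoint
        syslog:
          rfc: RFC5424
      pipelines:
      - name: forward-app-and-infra
        inputRefs:
        - application
        - infrastructure
        outputRefs:
        - loginsight
        - default        # keep sending to the internal OpenShift logging store as well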
I think we are at the top of the hour. Thank you, everybody, for all of your really phenomenal questions - great session. I'm going to take as many of these as I can, put them into the blog post, and try to link to where we answered them in the video as best I can. If you have any additional questions, or if there's anything we didn't get to, you can always reach out to me or Chris on social media: I am "practical andrew" on Twitter, Chris is at "chris short," and you can also email andrew.sullivan@redhat.com. Robbie, you're on Twitter as well? Yes - hit me up at "robbie j" on Twitter, and I'll work with Andrew to answer as many questions as we can. This was a lot of fun; thank you for having me on. We'll have to do it again - thank you very much, Robbie, we appreciate you joining us. And to our audience, thank you; we will see you tomorrow for the What's Next presentation, and next week at the same great time, same great channel. And I'd like to remind everybody that Red Hat Summit is coming, so please register for that if you are interested in attending. Thank you, and without further ado, I will turn everything off. Thank you, Robbie, for joining us - really appreciate it, buddy.
Info
Channel: OpenShift
Views: 831
Rating: 5 out of 5
Keywords: OpenShift, open source, containers, container platform, Kubernetes, K8s, Red Hat, RHEL, Red Hat Enterprise Linux, Linux, VMware, vSphere, virtualization software, virtual machines, OpenShift Virtualization
Id: a9s-EWpIjYU
Length: 62min 30sec (3750 seconds)
Published: Wed Mar 17 2021