Kubernetes? Database Schema? Schema Management with Atlas Operator

Video Statistics and Information

Captions
How should I manage database schemas in a cluster today? You recommended SchemaHero, right? I did, but please, please, please don't use it. It's the best option, but it's bad. Really bad. I think I found a better one. I'll let you know later. Okay, thank you. [Music] Bye.

Managing database schemas in Kubernetes is hard. No, wait. Hard is not the right word. It is silly, mostly because we do not have the right tools. We can wrap existing schema management solutions into containers and run them in Kubernetes as, let's say, Jobs, but that is silly. That is not how we work in Kubernetes. We want to define resources and let Kubernetes do the rest, and to do that we need custom resource definitions and controllers, or operators.

So, if we exclude those solutions that were just wrapped into containers without following what we consider best practices in Kubernetes, the only option, at least the only one I'm aware of, is SchemaHero. SchemaHero is the best solution for managing database schemas in Kubernetes, but at the same time, SchemaHero is horrible. How can that be, you might ask? How can something be the best and horrible at the same time? Well, that's easy. If you're the only option, you are the best option, no matter how bad you are. If you were to compete in a race as the only competitor, you would get the gold medal for sure. You could run for a few meters, then take a rest, have lunch, have some coffee, and then walk the rest of the track, and you would still win. You're the only one. That's similar to the situation with SchemaHero. It's the only option for those wanting to manage database schemas as Kubernetes resources, hence it is the best option.

And what's the problem with SchemaHero? Well, there are quite a few. It supports only simple scenarios. It does not emit statuses, making it close to impossible to use with GitOps tools or to observe. It does not have a decent way of installation. And so on and so forth. I already explored some of the SchemaHero issues in that video, so I will skip
repeating myself here. What matters, and I have to be honest here, is that I was promoting SchemaHero for a while, but that's because it was the only option I was aware of. Now we have a second contender for managing database schemas in Kubernetes. It's a tool I already explored on this channel, and back then I was complaining that it does not work well in Kubernetes. They listened, and now it does. The tool is called Atlas, and besides its previous methods for managing database schemas, it now has a Kubernetes operator. So the question is whether it is better than SchemaHero. The bar is very, very, very low, so it should not be hard to beat. As a side note, this video has nothing to do with that earlier video. Back then, I explored Atlas as a tool for managing database schemas outside Kubernetes, and I was complaining that it does not work well in Kubernetes. Today, I'm exploring the Atlas Kubernetes operator, which is a new addition to Atlas.

That's enough talk. Let's see it in action. Let's take a look at a few resources. We have an AtlasSchema with a specific name, and further down there are credentials that tell Atlas how to connect to the database. There are a few ways we can provide credentials, and they are all wrong. Or, to be more precise, one type is missing. If I had many AtlasSchemas, there would be too much repetition; those ten to fifteen lines would need to be in each of them. It would make much more sense to have something like an AtlasSchemaCredentials resource that could be referenced from an AtlasSchema. That's not a big deal, though, so let's move on.

Further down, we have the schema itself, which can be HCL or, like in this example, SQL. Now, it makes no sense, no sense at all, to mix Kubernetes YAML with HCL. I understand how HCL might be a good choice outside Kubernetes, but inside Kubernetes manifests? No. It makes no sense. That's why you don't see it in this example. Over here, I'm defining the schema in the SQL format. Now, the SQL itself might be confusing. If you know SQL, as I hope you do, you might
think that this schema would work only the first time. After all, it says that it should create a table, and that would fail miserably if the table already exists. It is not declarative, right? Well, wrong. Atlas will always, and I repeat, always, create the table, but not inside the destination database. Instead, it will create that table in an ephemeral database, compare it with the destination database, calculate the differences, and then apply those differences to the destination database. It's actually a really good way to manage schemas, one that has a lot of potential to avoid the typical pitfalls other tools fall into. Anyway, from the end-user perspective, what matters is that you can think of `create table` as just filler around declarative statements that define fields, relations, and other things we might need in databases.

Would I prefer it if the whole manifest were in a Kubernetes format? Of course I would. There's no good reason to mix a declarative format with an imperative format that acts as declarative. Nevertheless, SQL is a widely accepted and known language, so it's still a better option than mixing YAML with HCL. One argument in favor of SQL is that every single DBA knows it, so there is nothing to learn. That argument is false, since there is everything else in AtlasSchema to learn. Another argument would be that the Atlas team prefers SQL over other formats, but that is also not true. Atlas is on a mission to convince everyone that declarative formats are a better choice than SQL. It's been pushing HCL for a while now, and that is the preferred way to define schemas outside Kubernetes. It's just that Atlas is moving towards Kubernetes, where HCL does not make any sense, so they opted for SQL as the default choice, because the third option, pure Kubernetes YAML, would probably mean rewriting the whole Atlas engine. Now, to be clear, none of those things is a real problem. Everyone knows SQL, so it is quite okay to define the schema that way, even though it might be a bit confusing and might not be the best
choice.

The alternative way to define the schema is to store it in a ConfigMap and then reference it from the AtlasSchema. I think that's silly as well, so I will pretend that there is no such option. No ConfigMap where I define SQL and then reference it from AtlasSchema resources. No. No. Please, no.

Now, going back to the manifest, further down I have a second AtlasSchema. There is nothing special about it. The only thing you need to know for now is that I defined two tables in two different schemas, and that one references the other through a constraint. Let's apply those schemas by executing `kubectl apply` with all the bells and whistles. We can see that two schemas were created. We can always retrieve the schemas if we would like to see their statuses. Now, that READY column alone is a reason big enough to ditch SchemaHero in favor of Atlas. Kubernetes cannot manage resources without statuses. SchemaHero does not have statuses, so Kubernetes does not know what to do with it. I will not go deeper into that; as I already mentioned, watch that other video for some of the issues we face without statuses. Atlas does report statuses, hence that alone already makes it a winner. We can see all that in more detail if we output a schema as YAML. Look at that. That's a proper Kubernetes status, with a message, a reason, the status, and the type.

Next, I will check my PostgreSQL database and see whether the schema was really applied. Both of them. To do that, I will exec into the database container, which, by the way, was created through CNPG (CloudNativePG), which is awesome and you should check it out. I made a video about it as well. The links to all the videos I'm referencing are in the description, so check them out, but don't stop watching this video. Don't.

Next, I will enter the psql shell, switch to the app database, and list all the tables. Now, this is disappointing. Only one of the two tables is there. Atlas failed to create the second table, but there is a reason for that. As I
already mentioned briefly, Atlas creates ephemeral databases, creates the schemas, compares them to those in the destination database, and then applies the differences. That works great when executing the Atlas CLI, but not when running inside Kubernetes as separate AtlasSchema resources. You see, Atlas creates an ephemeral database for each AtlasSchema resource. That means it created one database for the videos table and applied the `create table` statement. Then it created another database for the comments table and applied its `create table` statement, and that one, the second, failed, because the comments table references the videos table, but the videos table is in a different ephemeral database. What that means is that we have to define all the tables that reference each other in the same AtlasSchema resource. Otherwise, well, it won't work. We can see that there is something wrong with the comments table by listing all the AtlasSchemas. It is stuck in its first-run state, and it will stay stuck forever and ever.

Now, let's take a look at a better definition of those two tables, stored in videos-01.yaml. This is almost the same as the previous, silly, demo videos resource, except that both SQL schemas are now in it, and the second resource is completely gone. With both tables defined in the same resource, Atlas will create a single ephemeral database and apply both schemas, and that should resolve the issue of comments referencing videos through the constraint and foreign key.

But before we apply those changes, I want to check one more thing. Will Atlas delete tables if we delete AtlasSchema resources? Let's run `kubectl delete` and see what happens. If I list the tables in the database, we can see the table is still there. Now, I understand why Atlas does not delete tables. It spins up ephemeral databases, applies the schemas, and compares them with the schema in the destination database. It cannot delete tables that are not in the ephemeral database, because it does not know which ones are managed by that Atlas
schema resource, nor which ones might not be managed by Atlas at all. However, this is the part where Atlas should have leveraged Kubernetes webhooks. It could have intercepted the request to delete an AtlasSchema and deleted whichever tables were defined in that schema resource. But that's not how Atlas works outside Kubernetes, so I guess that's not what they thought to do inside Kubernetes either. That's a pity, since it means that I have to delete tables manually or come up with some workaround. Truth be told, tables are not deleted often. Nevertheless, that's a feature that should be added to Atlas. It must be able not only to create and to update or edit, but also to delete stuff.

Now, let's get back to the manifest that contains both tables in the same resource and apply it: `kubectl apply` with videos-01.yaml, then `\dt` to list the tables in the database. The second table is still not there. Let's wait for a few moments and list the tables again and, voila, it's there. This time, with all the tables defined in the same resource, Atlas did not have any issue with the constraint and foreign key in the ephemeral database. It generated the schema, compared it with the destination database, decided that the table was missing, and created it. And that's brilliant. If we describe the AtlasSchema, we can see from the message that it created the table.

So far, we saw that it can create tables and that it cannot delete tables. Let's see whether it can alter tables. I have a modified version of the videos-01.yaml manifest. If we diff videos-01 and videos-02, we can see that the field `description` was added. If we output videos-02, we can see that the videos table now has the field `description`. Before I apply the modified version of the manifest, let me double-check that the field is not already there by switching to psql and describing the table itself. The description column is not there, so let's apply the modified manifest, describe the table again, and the field was added. Great.
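The alter demo above hinges on a one-line change between two manifests. As a hedged sketch, assuming the operator's `db.atlasgo.io/v1alpha1` API and using the demo's table names (the file name and exact column list are illustrative, not taken from the actual repository), the modified manifest might look like this:

```yaml
# videos-02.yaml (illustrative) -- same AtlasSchema, one column added to the
# desired state. Atlas diffs this against the live database and emits an
# ALTER TABLE, rather than re-running the CREATE TABLE statement.
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasSchema
metadata:
  name: videos
spec:
  schema:
    sql: |
      create table videos (
        id varchar(255) not null,
        title text,
        description text, -- the only changed line; becomes ALTER TABLE ... ADD COLUMN
        primary key (id)
      );
```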
If we describe the AtlasSchema, we can see from the message that this time it decided to alter the table by adding the missing column. And here's one more complaint. Even though the message displays the last operation correctly, the events do not. They always say "applied schema", and I would like the events to be more precise. It would be great if I could see things like "table X created" or "table Y altered", and so on and so forth. That would help a lot with observability.

Besides creating, altering, and not deleting tables, Atlas has a few additional features, mostly designed to prevent unintentional changes. One of those features is linting, which allows us to analyze migrations and detect and prevent potentially dangerous and unacceptable changes. Let's take a look at the difference between the manifest we applied and the modified version of it. You can see that I removed the `title` column, but also that I added a lint policy that will treat destructive changes as errors. To be clear, "destructive" in this context applies to destructive changes to columns and other properties of tables, not to destruction of the tables themselves, since, as we already saw, Atlas does not delete tables. So, let's see what happens if we apply whatever is defined in videos-03.yaml and describe the AtlasSchema. We can see that it reported the proposed removal of the column as an error, and if I switch to psql and describe the videos table, the column is still there.

Finally, the last example is yet another modification of the manifest, where I changed the linting policy to not treat destructive changes as errors by setting `error` to `false`. I also added a diff policy, but in the interest of time I chose to comment it out and not show it here. You can check what it does, together with all the other features, in the official documentation. So, let's apply the changes and describe the AtlasSchema. This time, without the safety net of linting, Atlas chose to drop the column. We can confirm that's what really happened
by describing the table through psql, and we can see that the column `title` is now gone.

And that's enough of the demo. Let's talk about the Atlas Kubernetes operator's pros and cons. The good and the bad. I can say right away that Atlas is a much better option than SchemaHero. There is no doubt about that. Zero. Still, it's not perfect, and that imperfection is mostly due to the fact that Atlas was not designed for Kubernetes and that the operator was added recently, so there's a lot of room for improvement.

Let's start with the bad things. To begin with, the syntax was clearly ported from the non-Kubernetes version of Atlas. As a result, we can define schemas as SQL or HCL. While SQL makes a lot of sense, Atlas itself has been pushing HCL as the preferred way to define schemas, so the project itself thinks that SQL is not the best choice. On the other hand, HCL makes no sense embedded in Atlas CRDs. For my part, I will certainly not use HCL, so I'm going with what Atlas believes is the second-best option: SQL. If Atlas had been designed for Kubernetes, we would have SQL and YAML, not HCL, as the options. Now, you might say that SQL is a perfectly valid way to define schemas; everyone knows it, so no one needs to learn anything new. The first part is true: everyone knows it. The second part is not, since there is something to learn, and that something is the AtlasSchema syntax for everything but the schemas themselves. The disadvantage is that I cannot easily use it with other Kubernetes-native tools. For example, if I needed more complex policies than what Atlas offers, I would employ Kyverno, but since we are not dealing with YAML but SQL, that would not work. There are many, many other examples, but AtlasSchema syntax with embedded SQL would prove difficult to combine with other Kubernetes tools.

Next, as the second con, we cannot define separate AtlasSchema resources for tables if those tables are related to each other. As you saw in the first example, that confuses the hell out of Atlas. That means that all
schemas need to be defined in the same manifest, at least when those schemas are related to each other, and that will quickly make manifests too big and too hard to manage. Finally, we cannot delete tables by deleting AtlasSchemas, which means that we have to delete them manually.

Now, that might sound like a lot of cons, but it's not. As a matter of fact, a few months ago my list was much, much larger, so I discarded Atlas at that time. However, the team behind Atlas was listening, and they fixed most of the issues I had, so I can only assume that many of those listed here will be resolved soon as well. After all, the operator is in very, very, very early stages, and it would be silly to expect it to be perfect from the start.

As for the pros, I have only three, but they are big. To begin with, Atlas is much, much closer to proper database schema management tools like Liquibase or Flyway. Atlas itself is a robust and reliable tool that has been proven to work in both simple and complex scenarios. It's not a toy, which is not something I can say for SchemaHero. Spinning up ephemeral databases, comparing schemas, and generating migrations is a great way to manage schemas. Second, the Atlas Kubernetes operator is an attempt to transform Atlas into a Kubernetes-native way of managing schemas. Now, you might ask why it has to be Kubernetes-native, and the answer is simple: database schemas are related to applications, so it makes perfect sense to manage schemas in the same way we manage applications. That means we want to define schemas as Kubernetes resources, together with the other resources that define applications. We might want to use GitOps tools to manage schemas, and we might want to observe them in the same way we observe applications. All that means it cannot be just a container that we run as a Job; it needs to be a proper Kubernetes resource, with statuses and everything else that comes with it. SchemaHero is not that. It is a proper Kubernetes resource, but it does not have statuses,
and that makes it close to impossible to use with GitOps tools or to observe. And finally, it's open source. It's free. There is a commercial version in the form of Atlas Cloud, and even that one has a free plan with a relatively generous offer. Nevertheless, if you prefer to self-manage, you can do it completely free.

The Atlas Kubernetes operator is far from perfect, but it might be the best option we have at the moment, at least for those who want to manage database schemas as Kubernetes resources. Heck, Atlas is a great choice even for those who do not use Kubernetes, at least not for schemas, but that option, running Atlas outside Kubernetes, is not the subject of today's video. The Atlas Kubernetes operator is, right now, my preferred way to manage database schemas. Try it out and let me know what you think. See you in the next one. Cheers.
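To make the resources discussed in the video concrete, here is a hedged sketch of an AtlasSchema that defines both related tables in a single resource, as the demo concluded you must. Field names follow the operator's v1alpha1 API as I understand it; the service, Secret, and database names are hypothetical, so check the official CRD reference before relying on the exact spelling of the credentials block:

```yaml
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasSchema
metadata:
  name: videos
spec:
  # Connection details; in practice the password comes from a Secret.
  credentials:
    scheme: postgres
    host: my-db-rw.default.svc.cluster.local # hypothetical CNPG service name
    port: 5432
    user: app
    passwordFrom:
      secretKeyRef:
        name: my-db-app
        key: password
    database: app
  # Both tables live in one resource so the foreign key can be resolved
  # inside a single ephemeral database.
  schema:
    sql: |
      create table videos (
        id varchar(255) not null,
        description text,
        primary key (id)
      );
      create table comments (
        id varchar(255) not null,
        video_id varchar(255) not null,
        description text,
        primary key (id),
        constraint comments_videos
          foreign key (video_id) references videos (id)
      );
```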
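The linting behavior from the final examples can be sketched as a policy block on the same resource. The destructive lint setting matches what the video describes; the commented-out diff policy field names are an assumption based on my reading of the operator documentation and may differ:

```yaml
apiVersion: db.atlasgo.io/v1alpha1
kind: AtlasSchema
metadata:
  name: videos
spec:
  policy:
    lint:
      destructive:
        error: true # set to false to let Atlas drop columns, as in the last demo
    # diff:
    #   skip:
    #     drop_column: true # assumed field name; see the official docs
  schema:
    sql: |
      create table videos (
        id varchar(255) not null,
        -- title removed from the desired state: a destructive change,
        -- rejected while lint.destructive.error is true
        description text,
        primary key (id)
      );
```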
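The ephemeral-database trick, applying the desired schema to a scratch database and diffing it against the real one, is what lets Atlas treat `create table` as declarative. The sketch below is not Atlas internals; it only models the diffing idea, with tables reduced to column dictionaries, to show why the statements that actually get applied are ALTERs rather than the CREATEs you wrote:

```python
# Conceptual model of declarative schema diffing (not Atlas code).
# Tables are reduced to {column_name: column_type} dictionaries.

def diff_table(table, desired, actual):
    """Return the ALTER statements that bring `actual` in line with `desired`."""
    steps = []
    for col, typ in desired.items():
        if col not in actual:
            steps.append(f"ALTER TABLE {table} ADD COLUMN {col} {typ}")
    for col in actual:
        if col not in desired:
            # This is the "destructive" case a lint policy can block.
            steps.append(f"ALTER TABLE {table} DROP COLUMN {col}")
    return steps

if __name__ == "__main__":
    desired = {"id": "varchar(255)", "description": "text"}  # from the manifest
    actual = {"id": "varchar(255)"}                          # from the live database
    print(diff_table("videos", desired, actual))
    # prints ['ALTER TABLE videos ADD COLUMN description text']
```

The same model also explains the failed first demo: each AtlasSchema got its own scratch database, so the foreign key in one had nothing to resolve against.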
Info
Channel: DevOps Toolkit
Views: 8,232
Keywords: Atlas, Atlas Kubernetes Operator, Atlas database, Devops Toolkit, Kubernetes database schema, Kubernetes operator, Master Database Schema Management, Viktor Farcic, atlas database schema, atlas schema, database, database in Kubernetes, database in k8s, database schema, db, db schema, k8s, kubernetes
Id: 1iZoEFzlvhM
Length: 23min 22sec (1402 seconds)
Published: Mon Sep 18 2023