Webinar: Online Upgrade Using Logical Replication

Captions
Hi everyone, thank you so much for joining us today. We're going to give it just a moment for everyone to jump into the session, and then we'll get started. Okay, great. Hi everyone, thank you for joining us today. My name is Courtney Brown, and I'm a Senior Marketing Manager here at EDB. I will be your host today for our webinar, "Online Upgrades Using Logical Replication." I'm joined by Florin, developer and DBA at EDB.

Before I turn it over, a few housekeeping items. This presentation is being recorded, and we'll share the recording and the slides with all of you after the broadcast. All of your lines are currently muted; if you have a question, feel free to submit it in the question panel. Today's session is scheduled for about an hour. We expect the presentation to take most of the time, with any extra time allotted for Q&A. If we're not able to address all of your questions, we'll be sure to follow up afterwards with any unanswered questions. And now I'll hand it over.

Thank you, Courtney. Hi everyone. Is your production environment in need of a refresh? Soon your PostgreSQL version may no longer be supported. In this presentation I'll explain why you should always update, and how to do it without significantly impacting your production. My name is Florin Irion. I started working at 2ndQuadrant two years ago, and now I'm part of EDB. I'm a member of the pglogical and BDR team as a developer, tester, support engineer, and even consultant on some occasions. I've also been involved in organizing PGDay Italy, and I hope one day to become a PostgreSQL contributor.

So, what's on today's agenda? I'll briefly explain how PostgreSQL releases work and the difference between minor and major releases. I'll discuss the pros and cons of the upgrade options. I'll give a quick introduction to pglogical, because it's the tool I'll use for the upgrade. And finally, I'll show you the steps to actually perform the major upgrade.
First of all: why upgrade? Probably the most important aspect is the security improvements: every PostgreSQL version adds more authentication methods and SSL/TLS version support. More features: every release gives us new, or at least better, options for monitoring, high availability, or disaster recovery. More performance: some versions improve performance massively, for example 9.6 with parallel query, or version 11 with just-in-time compilation; but as a rule, every release improves performance without changing anything at the application level. And there are new SQL standard implementations, functions, and constructs that help developers bring out the best.

Now, a quick introduction to how PostgreSQL releases work. Thanks to its continuous releases over the past 30 years, Postgres has earned a strong reputation for reliability, feature robustness, and performance. You probably all know that Postgres is enriched with new features once a year in what is called a major release. Minor releases, on the other hand, are quarterly and add security and bug fixes, and there are many of them; for example, the latest PostgreSQL 12 minor release resolved four CVEs and more than 50 bug fixes. The PostgreSQL Global Development Group guarantees these minor releases for roughly five years for each major version. This means that when a major version reaches end of life, five years after its release, it no longer receives security updates or bug fixes. From this chart we can see, for example, that time is running out for 9.5, which will receive its last update this February.

Minor releases are always compatible with all the other minor versions of the same major release; that is, you can install 12.5 on a server that had only 12.1, with no need to update consecutively. It is, of course, highly recommended to apply them as soon as possible after they're released. To do this, you shut down Postgres, install the updated packages, and restart Postgres. Simple as that. However, the advice is to have a QA environment, update and test there first, and always check the release notes, which will flag any changes you need to take care of while updating. You should start by updating the standbys first, switch your applications over, and then update the former master. There's no need for anything else; as simple as that.

For major upgrades, instead, there are several options, and they all have their pros and cons. Dump and restore is suitable for any PostgreSQL version, but being very slow, it's not recommended for large databases. pg_upgrade is very fast, especially using the --link option; however, fast is not zero downtime, and you also need to recreate your whole high-availability setup, as the standbys will not be usable with the new version.

When our databases are not suited to those two options, we can use logical replication. This consists of creating a logical replica and waiting for it to synchronize completely with the master. We can even create all the physical standbys we need, and they will get the replication from the new primary automatically. Another pro of logical replication is that you can choose which databases you want to replicate; with pg_upgrade, instead, it's all or nothing. It's recommended to monitor the cluster for a couple of days and check that everything is working as expected. When all the checks are green, turn off the old master and redirect the application to the new PostgreSQL instance, with the help of a connection pooler, for example PgBouncer. This can be performed with zero downtime. As with minor updates, it's advisable to automate the process and test it in the QA environment until you're confident with the procedure.

Postgres already has a logical replication mechanism built in. It's available from version 10, and it's largely derived from pglogical. In this presentation I'll show you the necessary steps to upgrade with pglogical; however, there are other logical replication methods that can do it as well. Please note also that physical replication is not an option, because of the incompatibility of the data directory between major versions.

About pglogical: I'll use pglogical version 2. There's also version 3, but it's not open source yet. pglogical is a fully integrated PostgreSQL extension that doesn't need triggers or external programs for its operation. It's a valid alternative to physical streaming replication for high availability and disaster recovery, and it brings more flexibility, such as replicating between different versions, plus selective replication: only what you want gets replicated. One important thing related to upgrades is that it can replicate sequences. The state of sequences added to a replication set is replicated periodically, not in real time; a dynamic buffer is applied to the value being replicated, so the subscriber actually receives a future state of the sequence. pglogical supports any version of PostgreSQL starting from 9.4.

So, since there's logical replication inside PostgreSQL, why would one want to use pglogical, and why use the built-in replication? What are the differences? The built-in replication can be used in environments where it's not always possible to install extensions such as pglogical, due to strict rules on what can be installed on those servers, so only built-in functionality is available. pglogical, on the other hand, is much richer in features. There will always be features that won't be part of Postgres, for a variety of reasons: the community might think Postgres doesn't need certain features in core, or maybe they just haven't been introduced in the latest version yet, or perhaps some are only required by specific customers. With pglogical we can act immediately where needed and have the functionality available for all PostgreSQL versions.

Now that I've finished the introduction to basic notions around Postgres and pglogical, I'll show you the steps we need to take to create a logical replica and use it as the new master once the synchronization is complete. In my example I'll upgrade from 9.5 to 13. Let's see what we're missing by using Postgres 9.5. Every major release of Postgres introduces new fixes and innovative features. There's a nice website, why-upgrade.depesz.com, where you can specify the versions you're using and it will show you all the fixes you're missing. I counted more than 80 CVEs for this case, and as you can see in the photo, there are more than 2,000 fixes. Allow me to list a few very important missing features: parallel aggregates, parallel sequential scans, declarative partitioning, SCRAM authentication, just-in-time compilation, parallel index creation, stored procedures, REINDEX CONCURRENTLY, and so many others.

Okay, before diving into the steps and requirements for the upgrade, I'd like to explain some terms I'll use. The node is the information Postgres has on the database, ready to send or receive with pglogical. Provider and subscriber are the roles a node can assume: the provider is the node that sends the data, while the subscriber is the one that receives it. Replication sets manage which tables will be replicated, and also which actions on those tables. To ensure that the subscriber receives data from the provider, we must create a subscription and configure its rules through the replication sets. I'll be using pg95 and pg13 as hostnames for the source and destination servers respectively.

There are some requirements. First of all, every table must have a primary key, so that pglogical is able to identify the tuples to be modified on the subscriber. As a side note, on tables without a primary key it's possible to replicate only inserts, but that's not our case: for our upgraded server we need everything to be replicated, both existing data and new changes; we don't want to miss anything. You can use a query to verify this: if every table has a primary key, the query should return zero rows; if not, please fix it before going any further.

Tables must have the same name on both nodes, must belong to the same schema, and must have the same columns with the same data types. The CHECK constraints, NOT NULL constraints, and so on must be the same, or at most weaker (more permissive) on the subscriber than on the provider. In our case of an online upgrade we'll copy the whole schema anyway, so this isn't really an issue, because everything will be identical; in the next slides I'll show how we can do that.

pglogical runs at the database level, so global objects such as roles are not copied. The best way to copy them is by dumping and restoring the global objects only; you can do that with the --globals-only option of pg_dumpall. Unlogged and temporary tables will not, and cannot, be replicated. DDL is not automatically replicated: managing DDL so that the provider and subscriber databases remain compatible is the responsibility of the user. The replicate_ddl_command function allows us to execute DDL on the provider and the subscriber consistently.

You need to install pglogical on both the old and the new server, with the correct version for each: in our case, the PostgreSQL 9.5 pglogical packages for the old server and the PostgreSQL 13 pglogical packages for the new one.

Now that we have the packages installed, we need to configure some parameters in the PostgreSQL configuration file on both servers. Set shared_preload_libraries to pglogical; this makes Postgres load pglogical at startup so you can use the extension. If there are already other values in this setting, just add pglogical, separated by commas. Set wal_level to logical: we need this to make Postgres write WAL with the information necessary for logical decoding. Increase max_worker_processes, which sets the maximum number of concurrent background processes the system can support: we need to allow Postgres to create new worker processes to be used by pglogical. Increase max_wal_senders: we also need WAL senders to take care of sending the decoded data to the other node, at least one per node, so we can put a slightly higher number there. Increase max_replication_slots: pglogical uses replication slots, so we need at least one on the provider node. All these settings require a Postgres restart. Please note also that the PostgreSQL configuration file is not replicated, so you'll have to tune your new server according to your application benchmarks and tests, which are very important to run before the switchover.

In the Postgres authentication file (pg_hba.conf) we must give replication and connection permissions to the user we'll use, specifying the correct database name, or databases if we use more than one. My user is called pgl_user, and this user must have replication permissions and must be a superuser. It's strongly recommended that you use a .pgpass file to manage your passwords. These changes to pg_hba.conf need only a reload.

Those were the preliminary steps. Now, on both servers, in their respective databases, we need to create the extension. After that, we create the provider and subscriber nodes, specifying at least a meaningful name and their connection string. This identifies the databases as nodes ready to be used by pglogical. We then add all the tables and sequences to the default replication set; you can do that per schema. In the example I'm adding the tables and sequences of the public schema. This means that any subscriber that connects to the default replication set will receive all changes in the public schema, and this way it will synchronize all tables and sequences during the initial phase in a single transaction. For very large databases, you can also choose to add the larger tables one at a time, by initially leaving them out of the default replication set and adding them to a new replication set after the subscription is created.

Let's make the subscriber communicate with the provider now. For this purpose we create a subscription from the subscriber node using the create_subscription function. We need at least two parameters, the subscription name and the provider data source name; we can leave the default values for the other parameters. The one parameter we need to be careful with is synchronize_structure. As I said before, we must have the same tables with the exact same definition on both nodes. We can do that manually with a dump and restore, using the --schema-only option of pg_dump, or let pglogical do it by setting synchronize_structure to true, as this parameter defaults to false. The other parameters are optional, and in our case of a major upgrade they're irrelevant: replication_sets would be used to manage which data to replicate to the subscriber, but we need to replicate all the data, so we use the default replication set; synchronize_data decides whether to synchronize the data already present or to replicate only changes from this moment on, and it defaults to true, which is what we need; forward_origins is used to forward changes made by other nodes, not just by the provider, for cases with multiple providers and subscribers; and apply_delay is used to have a replica with a certain delay, needed for example in disaster recovery cases.

Now the existing data is replicating from the provider to the subscriber, and when the initial synchronization ends, pglogical will continue to replicate the new changes as well. We can check the subscription status using the show_subscription_status function: if the result is "initializing", we're still in the initial phase of copying the data; if we get "replicating", we're already replicating the data in real time.

As I already said, pglogical works at the database level, so we need to repeat all the steps, from creating the extension to creating the subscription, for each database we want on the new server. Always test the procedure in a QA environment first, possibly with a similar data set and server capacity, to measure the synchronization time and simulate the cutover. Test it multiple times from scratch in order to practice and master the procedure. This technique also gives application developers a sandbox on the new major PostgreSQL version, similar to the production environment, on which they can also run benchmarks.

When all our databases have a subscription and are all in "replicating" status, we can switch our application: set up a connection pooler in front of the main database if you aren't using one already; pause the connection pool; force the synchronization of all sequences with the synchronize_sequence function; switch the configuration of the pool over to the new system; and resume the connection pool and reload, so it now accesses the new server.

Okay, we can now clean our new database of everything we used to create this replica: drop the subscription, delete the pglogical node, drop the extension, drop the user role, and reset the entries related to pglogical in the configuration and authentication files that we previously modified. And that's all. Easy, no?

So, summing up this presentation: according to our experience at 2ndQuadrant, now EDB, the use of pglogical is the preferred way to upgrade to the newest PostgreSQL major version, especially in physical and virtual machine environments, where a major PostgreSQL upgrade usually goes along with a major operating system upgrade and most likely a hardware upgrade. Don't forget that this approach allows application developers to test and benchmark the new database, and reduces the business risk.

A one-size-fits-all solution doesn't exist; this is just one possible way to perform a major upgrade. Always choose the one that adapts best to your context: it might be that a logical dump and restore works perfectly fine. In any case, always start simple and improve incrementally based on feedback from your tests, so make sure you test first.

As I said before, there's also version 3 of pglogical, which was re-architected to support even more functionality than version 2. It has replication slot failover support, meaning physical standbys can substitute either providers or subscribers. It supports REPLICA IDENTITY FULL, so the limitation that a table must have a primary key is gone. It has transparent DDL replication: pglogical takes care of replicating DDL by default. It offers faster performance; replication to Kafka and RabbitMQ; transparent support for partitioning, including re-partitioning on the fly; better conflict detection, as it can detect and resolve more types of conflicts, with more granular conflict resolution configuration; it doesn't require a superuser for replication; and it provides many more statistics and views into its state, including worker errors visible from SQL rather than from the logs, all very useful for monitoring.

The same approach can be used to perform major online upgrades in containerized scenarios orchestrated by Kubernetes. Think about upgrading, in the future, a PostgreSQL 13 database to a PostgreSQL 16 database without any downtime, just by starting a fresh PostgreSQL cluster using logical replication from an existing one. This is what we're currently doing with our Cloud Native PostgreSQL operator. Well, that's all for me. Thank you for attending. We have some minutes for questions, and you can also reach out on social media.

Thank you so much. I'm going to pull up our questions. There are a few different variations of this question, but people are basically stating that they're running a specific version of Postgres, like 9.4 or 9.3.4, and asking: are there any limitations in using this method to upgrade from one version to another?

There's no limitation, except that pglogical is only available from version 9.4. From 9.4 onward there's no limitation: you can upgrade from 9.4 or 9.5 to 13, or anything in between. We can't go lower than 9.4, but that's the only limitation.

Okay. Someone else is asking: how many concurrent databases can I migrate using pglogical?

There's no such limit in practice. I think pglogical has a hard limit of 1,024 or something like that, which you won't hit with real databases. However, you need to consider your server: if you have hardware that supports all those replication slots and the WAL senders working, there's no problem for pglogical; it can replicate any number of databases.

And someone else wants to know: for minor upgrades, is this the fastest way? Is pglogical always the best way?

For minor upgrades, the best option is, as I said, to just install the binaries and restart Postgres. You don't have to do anything else; you don't need to use pglogical or any other tool, not even pg_dump, which is the simplest one, because Postgres handles it transparently. Just stop Postgres, install the binaries, start Postgres. It's simple, and it's always compatible between minor versions.

Okay, let's see, another question: what happens if the process fails for any reason? What happens to the production database?

If the process fails, nothing happens to the production database; it just fails to create the replica. The only thing we have to monitor and be careful about is the replication slot, depending on when the process failed. If the replication slot was already created, that slot could keep accumulating WAL, and that could be a problem for production if it fills up the disk. So while you're running this process you have to monitor it and check that everything is going well, and if it isn't, the main thing to watch is the replication slots on the primary, that is, on the production database.

There are a few questions coming through comparing pglogical to streaming replication and to other forms of replication. Do you maybe want to talk about that topic broadly?

Well, as I said, physical replication is not suitable for an online upgrade because it's not compatible between major versions; it's a different use case. For an online upgrade you can use logical replication, and there are other types of logical replication: tools like Slony or Londiste, and, if you're on a version that permits it, the built-in replication method, from version 10 onward. It depends on the use case. Physical replication replicates everything and you can't do anything else with it: at most, you can make a copy, run pg_upgrade on that standby copy, and use it later as a master, but it's not meant to be used like that. pglogical, or logical replication in general, can be used as a kind of standby, but physical replication cannot be used for online upgrades.

Okay, let's see what else is coming through. A lot of the questions are really specific to particular use cases, so we'll probably have to follow up with some of those after the session. Someone wanted to know if you could use this cross-platform, so Windows to Linux?

Well, we do build some binaries for Windows, but pglogical is mainly developed for Linux. So if you need to replicate from or to a Windows server, you should probably contact us, and we'll see if we can help you with that, depending on the case.

Right, okay. It looks like those are all the questions we're going to be able to cover live today. As I mentioned, any other questions that we weren't able to cover, we'll follow up on after the fact with each of you and provide our answers. Also, as a reminder, you'll be receiving a link to the recording of this session, as well as a copy of the slides, at the email address you used to register; that will be coming your way later today. So thank you all for joining us, and we hope you have a great rest of your day. Thank you.
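
Editor's note: the talk mentions a query that should return zero rows if every table has a primary key, but the query itself only appears on the slides. A catalog-based sketch of such a check (the excluded schemas are an assumption; adjust as needed) could be:

```sql
-- List ordinary tables that lack a primary key; this should return zero rows
-- before replicating with pglogical.
SELECT n.nspname AS schema_name, c.relname AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (SELECT 1 FROM pg_constraint p
                  WHERE p.conrelid = c.oid AND p.contype = 'p');
```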
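
Copying global objects with pg_dumpall, as the speaker describes, might look like this (hostnames pg95/pg13 are the talk's examples; the file name is illustrative):

```shell
# Dump only global objects (roles, tablespaces) from the old 9.5 server...
pg_dumpall -h pg95 --globals-only -f globals.sql
# ...and restore them on the new 13 server.
psql -h pg13 -d postgres -f globals.sql
```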
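
The configuration changes listed in the talk could be sketched as follows; the numeric values, the subnet, and the pgl_user/mydb names are illustrative assumptions, not from the slides:

```ini
# postgresql.conf on both servers (restart required)
shared_preload_libraries = 'pglogical'  # append to any existing value, comma-separated
wal_level = logical                     # WAL carries enough info for logical decoding
max_worker_processes = 10               # headroom for pglogical background workers
max_wal_senders = 10                    # at least one per node
max_replication_slots = 10              # pglogical needs at least one on the provider
```

```
# pg_hba.conf on both servers (a reload is enough)
host  mydb         pgl_user  10.0.0.0/24  md5
host  replication  pgl_user  10.0.0.0/24  md5
```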
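
Creating the extension and the two nodes, per the steps described in the talk, might look like this (node names and DSNs are illustrative):

```sql
-- On the old 9.5 server (provider), in each database to upgrade:
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
    node_name := 'provider95',
    dsn := 'host=pg95 port=5432 dbname=mydb user=pgl_user');

-- On the new 13 server (subscriber), in the matching database:
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
    node_name := 'subscriber13',
    dsn := 'host=pg13 port=5432 dbname=mydb user=pgl_user');
```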
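
Adding all tables and sequences of the public schema to the default replication set, as in the talk's example, could be done with:

```sql
-- On the provider:
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);
SELECT pglogical.replication_set_add_all_sequences('default', ARRAY['public']);
```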
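
Creating the subscription from the subscriber with synchronize_structure enabled, so that pglogical copies the schema itself, might look like this (the subscription name and DSN are illustrative):

```sql
-- On the subscriber:
SELECT pglogical.create_subscription(
    subscription_name := 'upgrade_sub',
    provider_dsn := 'host=pg95 port=5432 dbname=mydb user=pgl_user',
    synchronize_structure := true,  -- defaults to false; copies the schema
    synchronize_data := true);      -- defaults to true; copies existing rows
```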
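
Checking progress with show_subscription_status, as described in the talk:

```sql
SELECT subscription_name, status
FROM pglogical.show_subscription_status();
-- 'initializing' = initial copy still running; 'replicating' = streaming changes
```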
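
Forcing a final sequence synchronization at cutover time is done per sequence with synchronize_sequence; the sequence name here is purely illustrative:

```sql
-- On the provider, while the connection pool is paused:
SELECT pglogical.synchronize_sequence('public.orders_id_seq');
```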
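
The cleanup steps described in the talk could be sketched as (node and subscription names match the earlier illustrative examples):

```sql
-- On the new server, after the cutover:
SELECT pglogical.drop_subscription('upgrade_sub');
SELECT pglogical.drop_node('subscriber13');
DROP EXTENSION pglogical;

-- On the old server:
SELECT pglogical.drop_node('provider95');
DROP EXTENSION pglogical;
```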
Info
Channel: EDB
Views: 428
Rating: 5 out of 5
Keywords: EDB, EnterpriseDB, Postgres, PostgreSQL, opensource, database, live, webinar, webcast, replication, pglogical, upgrade
Id: Mv5YZMH7YNE
Length: 34min 41sec (2081 seconds)
Published: Tue Mar 02 2021