PostgresOpen 2019: A Look At The Elephant's Trunk

Captions
Okay, it looks like we're good to go. I guess I have the honor of keeping you from lunch by running over, so I'd better get started so I don't run over too much. My name is Magnus Hagander, and I'm here today to talk to you about Postgres version 12. Let me just start out: who has seen me do one of these new-version-of-Postgres talks at PostgresOpen before? Okay, surprisingly few, actually. I tend to do these almost every time, because we do happen to make a Postgres release approximately every year, so the setup will be the same, but hopefully we'll be able to list actual new things, because we haven't released Postgres 12 before.

A few quick words about myself. As I said, my name is Magnus Hagander. I work for a company called Redpill Linpro; we're an open source services and consultancy business in Scandinavia. I'm out of Stockholm in Sweden myself, where I lead our database efforts, which are, unsurprisingly, centered around Postgres, otherwise I probably wouldn't be here. Within the Postgres project I'm a member of the core team, I'm one of the committers who supposedly writes the code, so you can probably blame me for the bugs, and I'm also serving as the president of the board of PostgreSQL Europe, which is the European equivalent of PostgreSQL US, the organization that runs this conference. So I'm, among other things, involved in running the European sister conference of this one, which will be in about a month's time in Milan, Italy. If you feel the need to get to the other side of the ocean, please join us; we still have tickets.

Anyway, enough about me, let's talk about Postgres 12. Who's already running Postgres 12? No one? Okay, one person, and you don't count. Now, as you might know, Postgres 12 isn't actually done yet, as if that keeps people from running it, but it's pretty close to being done, and we hope to have it out pretty soon. We actually made a new beta release of Postgres 12 yesterday. Who has installed that yet? Probably no one, if you weren't already running it.

So, the schedule: what is about to become Postgres 12 started just over a year ago, in July 2018, when we branched off Postgres 11. The way we work with development in the Postgres project is that we do all feature development on the master branch in the repository, and then at some point we make this very bold statement and say, okay, this is now Postgres 11 stable, and by definition it immediately becomes stable, and that turns into Postgres 11. When that happened, in July of 2018, the master branch was renamed to Postgres 12, and features for Postgres 12 started coming in. At this point, if you were to look at the master branch of Postgres, it would say Postgres 13, because we branched again in July for the next upcoming version. Then, of course, we spend some time stabilizing it.

During development in Postgres we work with something that we call commitfests, which is just our term and our method for doing iterative development. The idea is that we spend approximately a month doing feature development, and then we spend a month reviewing those features and getting them committed; that's what we call a commitfest. For Postgres 12 we had five of them: one in July, one in September, one in November of last year, one in January, and one in March of this year. At that point Postgres 12 was, you know, sort of feature-complete.
Since then it's been polishing, it's been bug fixing, it's been finishing the last pieces. As I mentioned, the state right now is that yesterday we released the fourth beta version, and there are discussions ongoing right now about the schedule for the final version 12. It will be out soon. I'm not going to say it will be out before the end of the month, because there's not that much left of the month, but there is nothing saying it's not going to be out before the end of the year, and I'm not going to tell you which year. But we're not going to release on New Year's Eve, I'll promise you that.

So let's go into what we're actually doing, and let's start with the things that we broke, because every new release breaks something, right? We're going to start with something that I hope does not affect you: the data types abstime, reltime, and tinterval have been removed. Does anyone even know what these are? If you don't know what they are, you're probably not affected by us removing them. They've been deprecated since forever, and now they're actually removed, so that should hopefully not hit you. The second breaking thing is that Postgres has now removed the support for disabling strong random numbers. I hope you never did that, but if you had a deployment that relied on building Postgres manually and turning off strong random numbers, you can't do that anymore. But really, that should not be a problem.

We broke some other things too, but this is not like when we did Postgres 10, where about a quarter of my talk was "here are things that are now broken", because we renamed a whole bunch of things to make them more consistent and in doing so broke the entire world, at least if you were a DBA writing scripts against Postgres. That's paying off now, I suppose, because 12 shouldn't be breaking a lot of things; the upgrade process should be fairly painless. Of course, the way to find that out is for you to actually upgrade and let us know if it worked, but it's not supposed to be that painful.

So let's look at the things we've added instead. When I look at this, I tend to group it into four groups: DBA and administration features, SQL and developer features, backup and replication specifically, and then performance, because everyone loves performance. I don't really know, in today's world, what the difference between a DBA and a developer is, and I really hope you don't have independent teams that don't talk to each other; we're all supposed to work together. But I've just decided where I draw the line: if it's exposed at the SQL level, then it's SQL and developer, and if it's a config file or similar, then it's DBA. That's just a definition that's going to last for the upcoming 43 minutes and 12 seconds.

But first, let me start with some tiny things. Who's ever used psql? Okay, and who has ever needed some sort of help with anything in the world, ever? Right. In psql, this is one of those tiny things that people have been talking about for a long time, and then it finally got done. You hopefully all knew already that you could do \h and a command and get the syntax for that command; that has been there forever. The thing we have now is that it will also print the URL, which you can click if your terminal allows it, that brings you to the documentation page on the website with the full documentation. It's a tiny thing, but it turns out to be really useful. Previously you went to the website and then you would search; now you just click and you're done. Very simple, very useful.
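For example, here is roughly what that looks like in a Postgres 12 psql session (output abbreviated):

    \h CREATE INDEX
    Command:     CREATE INDEX
    Description: define a new index
    Syntax:
    CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] name ] ON ...
    URL: https://www.postgresql.org/docs/12/sql-createindex.html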
And speaking of documentation: our documentation now has figures, like diagrams and graphs; it's not just pure text. I think we have about three figures, but there will be more. The big thing here is that we now have the foundation to add more; if you look at the version 13 documentation today, there are actually more figures already. In 12 there won't be that many, but we've started the process of making it a little less text-only.

Those were the tiny things; let's go into the actual features that you'll be able to use. On the DBA and administration side, I'm going to start with the view called pg_stat_ssl, which shows you SSL information about everybody connected to your system using SSL. It now adds columns for the client certificate serial number and issuer DN, which can be really useful if you're using client certificates: from this view you can see which certificate was used to log in on each connection. The big difference is also that we're now masking the data for unprivileged users, and that is something that can actually break things. Just like with pg_stat_activity, if you connect with a non-superuser or non-monitoring account, a lot of fields will just be empty; they'll be null, or say "insufficient privileges". That now also happens in pg_stat_ssl: you have to be a superuser, a monitoring user, or the user making the connection to see the data; for any other user, the view will basically be empty for those connections.

And speaking of SSL, we've also added config parameters that let you set the minimum and maximum allowed SSL protocol versions. You can set them to TLS versions 1, 1.1, 1.2, and 1.3, and the default is that we accept anything from TLSv1 up. Postgres doesn't support anything before TLSv1 at this point; that's just disabled, and you can't enable it. Of course, we'd like you to set the minimum to TLSv1.3, and you should do that if you can confirm that all of your clients have a new enough OpenSSL to actually support TLSv1.3, which a lot of people don't, which is why the default is still TLSv1. But it lets you raise the minimum TLS level if you have the ability to do so.
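As a quick sketch of how these fit together (the column list matches the Postgres 12 view; the protocol value is just an example):

    -- Which certificate was used on each SSL connection; client_serial and
    -- issuer_dn are new in 12, and rows for other users' connections are
    -- nulled out unless you are a superuser or monitoring role:
    SELECT pid, ssl, version, client_serial, issuer_dn FROM pg_stat_ssl;

    -- Raise the minimum accepted TLS version once all clients support it:
    ALTER SYSTEM SET ssl_min_protocol_version = 'TLSv1.2';
    SELECT pg_reload_conf();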
On the topic of connections and encryption: Postgres now supports GSSAPI encryption. We've supported GSSAPI for authentication for a long time, and you could combine it with SSL/TLS for encryption, but if you have a well-working Kerberos environment set up with GSSAPI, you can now get encryption natively in that, without having to set up SSL. A typical example of a GSSAPI environment is, of course, people using Active Directory, which is not exactly uncommon today. If you have that, the system already has all these encryption keys set up, and with GSSAPI encryption you can just use them; you don't have to set up certificates and all that stuff, it'll just work. If you don't have GSSAPI and you're only looking for encryption, use SSL; but if you already have the infrastructure, it's an easier way to get encryption than SSL. Now, unlike SSL, GSSAPI encryption requires you to use GSSAPI for authentication, whereas SSL can be combined with any authentication, since it's a separate layer. Along with this you also get a pg_stat_gssapi view, which works the same way as pg_stat_ssl but gives you the information about the GSSAPI state of each connection.

So, let's talk about this version's improvements to vacuum. Every version improves vacuum in one way or another, right? And everybody loves vacuum, so you run it all the time. The VACUUM command has gained a parameter called SKIP_LOCKED, so you can say: vacuum everything except the things that are currently locked; just skip those instead of blocking and waiting. This is for your manual vacuums, so you can run them more frequently, and eventually everything will hopefully get unlocked at some point. There is also a flag for VACUUM called DISABLE_PAGE_SKIPPING. The way Postgres works in reasonably modern versions is that it looks at the visibility map, and there is data there saying it can just skip vacuuming pages where we know nothing has happened: everything is all-visible, nothing has changed. This flag turns off that optimization. It's typically only needed for debugging, if you have corruption issues with your visibility map, which hopefully you don't have, but the flag lets you do that.

There are a couple of other useful improvements to vacuum. The vacuumdb command, the command-line tool for vacuuming multiple tables, for example your whole database, now accepts --min-xid-age and --min-mxid-age. Basically you can say: vacuum every table in my database that hasn't been vacuumed for 200 million transactions, or a quarter of a billion transactions, instead of having to figure that out separately and feed it a list of tables; mxid is the same thing but for multixacts. I'll just repeat what I usually tell people: you, like most people, probably don't need this, because most people will just handle this with autovacuum. But if you are in a position where you need to complement autovacuum with manual vacuums, this gives you another tool in the toolchest to control exactly when different tables end up getting vacuumed. Because, as anyone who has run a substantially sized Postgres database knows: if your vacuum doesn't work, nothing works.
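A minimal sketch of those options (the table name is hypothetical):

    -- Vacuum, but skip any relations that are currently locked:
    VACUUM (SKIP_LOCKED) my_table;

    -- Debugging only: visit pages even when the visibility map says
    -- they could be skipped:
    VACUUM (DISABLE_PAGE_SKIPPING) my_table;

    -- And from the shell, vacuumdb --min-xid-age=200000000 vacuums only
    -- tables whose transaction-ID age exceeds that threshold.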
The COPY command has received a WHERE clause. What does that mean? This is for copying into the database. When copying data out of the database, you've long been able to give COPY a SQL query with a WHERE clause and copy the result of that query to a CSV file. We can now apply a WHERE clause when reading from a file: in this case we're saying, copy into my table, from this file, with CSV, where a equals 5, and it will read the whole file and apply the WHERE clause to it. Now, you can't put just anything in the WHERE clause; it has to apply to the row being read. You can't do fancy joins and recursive queries on your COPY, but it lets you do simple row-level filtering at an early stage. In a lot of cases, the way you did this before was to load the file into a temp table and then do an INSERT into your main table with the WHERE clause; now you can skip that temp table and apply the WHERE clause directly. Again, a small feature, but very useful.

Related to that, psql has learned to natively do CSV. It has a whole bunch of different output formats these days: it has the standard format, you can get HTML tables, I believe you can get LaTeX output, and all sorts of weird things. Now there's CSV as well: by saying \pset format csv, every SELECT you run will just come out as CSV. I'm not entirely sure when I'd use that myself, but hey, it's there.
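A sketch of both, with hypothetical table and file names:

    -- Filter rows while loading; non-matching lines are simply skipped:
    COPY my_table FROM '/tmp/my_file.csv' WITH (FORMAT csv) WHERE a = 5;

    -- In psql, switch the output format so every SELECT prints CSV:
    \pset format csv
    SELECT * FROM my_table;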
Who's using pg_stat_statements today? Okay, and who has never heard of or never used pg_stat_statements at all? Okay, this is getting better; it used to be that 90% of the people put their hands up on that one, and that makes me sad, because pg_stat_statements is awesome and you should all be using it, or you should have a really good reason for not using it, and "oh, I didn't know that existed" is not a good one. pg_stat_statements has now gained the ability to reset individual query statistics. Previously you could only reset everything; now you can pinpoint one query and say, reset just this one, so I can run my application and see what changes. Maybe you changed your schema, created an index or removed an index, and you want to see the new statistics for this query: you just reset that one entry by passing pg_stat_statements_reset the query ID, that wonderful number that's the internal hash in the form of a 64-bit integer. You can of course still call pg_stat_statements_reset() with no arguments and reset everything, but now you don't have to.

We've also added progress reporting, and I'll put a little disclaimer on this one. Postgres already had pg_stat_progress_vacuum, which tells you how far along a vacuum is. Views have now been added for CREATE INDEX, REINDEX, and CLUSTER, which let you see how far into, for example, a CREATE INDEX we are. There are discussions going on right now about reverting this in Postgres 12 and removing it again, because issues have been found with the implementation that may not be fixable in time. Of all the features I mention here, I think this is the one with by far the biggest risk of not actually making it into the final version. If it doesn't, then hopefully it'll get fixed and added back in version 13, but no guarantees. If it does get in, it is beautiful, because who has never run a CREATE INDEX and wondered: how far along is this thing, how long is it going to take? Did it get halfway in 10 hours, or a tenth of the way in 10 hours? It's very useful information. But there are issues with it: if you happen to create another index while you are creating an index, for example in a different session, the whole progress report gets confused. So we'll see about that one, but if it does get in, I believe it'll be very useful on larger systems.

Another thing that finally got into Postgres 12, which I think has for many years been one of our most requested features, is REINDEX CONCURRENTLY. We've had CREATE INDEX CONCURRENTLY for quite some time, creating an index without exclusively locking the table for a long time and stopping your application. But if you needed to reindex, you couldn't do that, and the workaround has been to use CREATE INDEX CONCURRENTLY to create a new index and then drop the old one. That works in a lot of cases, but it doesn't really work if the index backs a key, because then you end up dropping your primary key, and you have foreign key dependencies on it, and the amount of really ugly workarounds people have done to deal with this is painful. Now you can just say REINDEX CONCURRENTLY; very convenient. Of course, REINDEX CONCURRENTLY can take longer than a regular REINDEX, that's how it works, but that doesn't really matter, because it doesn't lock up your system while it runs.
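Sketches of both calls (the queryid value here is made up; real values come from the pg_stat_statements view itself):

    -- Reset statistics for one specific query only (Postgres 12):
    SELECT pg_stat_statements_reset(queryid => 1234567890123456789);

    -- Rebuild an index without taking an exclusive lock on the table:
    REINDEX INDEX CONCURRENTLY my_index;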
Who is running their database with data-level checksums today? And who is not running them, because you couldn't turn them on once you'd set up your database? Postgres 12 will let you turn them on. Unfortunately it requires your system to be offline, but at least you don't have to dump and reload: you shut down your system, and then you can use the pg_checksums tool and say, enable checksums on my database. It is going to rewrite the whole database, so if you have a large database it will unfortunately mean a long stretch of downtime. There is a way to do it without much downtime: if you have a primary and a standby and you do things in exactly the right order, taking them offline one at a time in a controlled fashion, you can have very little downtime for the cluster as a whole; you will of course be non-redundant while it's running. The pg_checksums tool, by the way, is a rename of the pg_verify_checksums tool you had before, because turning checksums on and off is not verifying; and if you're using its verify mode, you now get a progress report telling you how far along it is. So it's a step on the way; we hope in the future to have fully online enabling and disabling of checksums, but not yet, and offline is at least better than nothing.

And then on this side we also have what might be turning into one of those features that everybody talks about in Postgres, and it's actually completely useless in this version. Well, not completely useless, but you're not going to be able to use it for anything: pluggable table access methods, basically letting you have pluggable storage engines in Postgres. Except you're not going to have workable pluggable storage engines in Postgres version 12; it's not complete enough. It's a lot of infrastructure work, and hopefully it'll be enough so that when Postgres 13 ships, you can actually use it. We've had a number of cases like this in Postgres: anyone who went to Simon's talk this morning about partitioning knows there was a lot of talk in Postgres 10 about partitioning, but in reality it wasn't good enough for most people until version 11. We had parallelism in Postgres 9.6 that very few people could actually use, but in version 10 it suddenly started working well. I think it will be the same thing here: this is not going to help you in version 12, but it's excellent marketing material, and then in version 13 hopefully it pays off. We already have two actively developed community storage engines: one that's undo-based, sort of similar to how Oracle does it, called zheap, and a columnar one called zedstore. We will hopefully have both of those for Postgres 13, but don't get your hopes up thinking this will be useful to you as end users in version 12, because it won't.

So let's take a look at the SQL side, and let's start with things that we break, because it's fun to break things. You can no longer create a table WITH OIDS. Has anyone intentionally created a table using the WITH OIDS clause in the past 15 years? No, not just to test things. So, OIDs: you probably all know what they are, the object IDs; they're all over the system catalogs, all over the system tables. If you created a table and said WITH OIDS, it would add a column to the table called oid, drawn from a global 32-bit counter. Global, and 32-bit, and a counter: not a good combination of things today, right? 25 years ago it was probably fine, because nobody had that many rows in their database, but this would literally put a 32-bit limit on the number of rows across all of your tables, and after that wrapped, bad things started to happen. So this has just been removed. It had been deprecated since 2005, so don't tell us we don't keep things around for you for a long time, but it's now actually gone. It also means that oid is now a regular column; all the magic is gone, including in the system tables. The visible difference is that if you do SELECT * FROM pg_class, for example, you will now actually see the oid column; previously you had to say SELECT oid, * to see it, which was just weird magic. There was also a lot of magic on the backend side, internally in Postgres, that special-cased the OID column; that code has just been deleted, and deleted code is the best kind of code, because it can no longer have any bugs. In the end, I believe the removal was triggered by the pluggable storage API, because otherwise OIDs would have to be reimplemented for every storage engine, and reimplementing a feature that was deprecated in 2005 just didn't seem right.

Another useful thing exposed at the SQL level is generated columns. This is something that's in the SQL standard, and of course the SQL standard is, as always, not entirely complete, and different databases do it in different ways. The idea is a column that is the result of a computation: a column you can't put your own data into, because it's computed. Postgres only supports stored generated columns, where we actually store the computed value. There is also something called virtual generated columns, where the value is calculated every time you select it; a virtual generated column would be sort of a mix between a table and a view, whereas a stored generated column is more like a table with an automatic trigger that always calculates this value, and you can't override it. As a simple example: create a table with a column a, and a column b that is "generated always as (a * 2) stored"; every time you insert a value, it'll multiply it by 2 and store it in the other column, and if you update it, it'll do the same. Now, if your computation is as simple as this, you should probably not use a generated column, because it will still store the value, and that's expensive. But if you're doing an expensive calculation, this makes life easier: say you're storing large bytea objects and you want to store the SHA-1 of each one. That's an excellent case for a generated column, because you can't forget to update the SHA-1; the system just does it for you. You should be able to call any function that's immutable; you obviously can't call volatile functions, and you can't have it computed as current_timestamp or something that changes. I don't believe you can reference other tables either, but any general expression over the row works. Sorry, the question was: are there any plans to support virtual columns? I think so, but I don't know how far those plans are, or whether the work has been started.
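That simple example, written out (Postgres 12 syntax; doubling a value is only for illustration):

    CREATE TABLE t (
        a int,
        b int GENERATED ALWAYS AS (a * 2) STORED
    );

    INSERT INTO t (a) VALUES (21);
    SELECT * FROM t;   -- a = 21, b = 42; b cannot be written to directly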
If anyone's using enums: you know it's been one of the very few things you couldn't do inside a transaction. If you did BEGIN and then wanted to add a value to an enum, that wouldn't work; you had to do it outside of the transaction scope, which was annoying for a lot of the tools that automatically generate migrations. Well, now you can: you can say BEGIN, do your ALTER TYPE ... ADD VALUE, and it works, and you can roll back the added value; it's all very nice. Of course you get a different restriction instead, which is that you can't use the new value in the same transaction. But it turns out that in a lot of cases the old behavior was a blocker for these migration-generating tools, and they are perfectly fine with the new restriction, because they wouldn't use the value in the same transaction anyway; being able to add it within a transaction removes a lot of their limitations.

One of the bigger things we have at the SQL level, which is why I won't cover it in detail here, is something called JSON path. I think there was a separate talk on JSON path during this conference; if it has already happened, I hope you saw it, and if you didn't, well, there is video. JSON path is a SQL-standard way to query JSON. Postgres had JSON long before the SQL standard did, so we've had our own way of querying JSON, with these somewhat weird, though a lot of people like them, syntaxes of matching JSON document against JSON document. JSON path is basically a new query language on top of jsonb; it looks a little more like how you would access things in JavaScript, and a little less Postgres-specific. It still uses the same indexing, and that's the important thing: if you have your jsonb columns with your jsonb indexes, your GIN indexes with the jsonb operator classes, it will just work, and it plugs into those. There's a bunch of new functions and operators: jsonb_path_exists, jsonb_path_match, jsonb_path_query, and, this being Postgres, and particularly in these areas, things like the @? operator; we have all these funky operators that are really interesting, and now we have more of them. As a very quick example, without going into detail: the JSON path expression "$.a" means extract the key a from my JSON document, and the @? operator is jsonb_path_exists in operator form, so a query with it basically asks, does the key a exist in this document, and, with a filter, does it equal 1? It lets you wrap the entire query inside the JSON path expression itself. Again, if you work a lot with JSON, I encourage you to look into this, and in particular, if you haven't seen it already, look at the video from the JSON path talk we had here at this conference.
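Quick sketches of both (the type name and document contents are hypothetical):

    -- Adding an enum value inside a transaction now works in Postgres 12:
    BEGIN;
    ALTER TYPE mood ADD VALUE 'ecstatic';
    ROLLBACK;  -- the addition can be rolled back, but the new value cannot
               -- be used in the same transaction that added it

    -- JSON path: does key "a" exist and equal 1?
    SELECT '{"a": 1}'::jsonb @? '$.a ? (@ == 1)';
    SELECT jsonb_path_exists('{"a": 1}'::jsonb, '$.a ? (@ == 1)');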
Another thing about something we've had for a long time in Postgres: who knew that CTEs, the WITH queries, the common table expressions, were an optimization barrier? More of you should have known, because a lot of people use CTEs; they're a really nice way to structure your queries and make them readable, but they also hide a lot of details from the Postgres optimizer: it couldn't optimize across the boundaries of a CTE. Not only can it now do that, it does it by default. For those of you who knew about the optimization barrier, I saw some well-known faces with their hands up; those are probably the people who would use CTEs basically as query hints. We don't have query hints in Postgres, but we had CTEs that were optimization barriers, and you could use that as a workaround: if the query planner did something you didn't like, you'd wrap things in a CTE, and suddenly, haha, I'm the one deciding here, not this stupid planner. If you do that, the barrier is going to go away, unless you tell us you really want it, because we're adding a way to say so: a new keyword for CTEs.

Here's a really, really silly example, but it works to show this. I've created two CTEs, one called t and one called t2, both exactly the same thing, SELECT * FROM foo, and my actual query is SELECT * FROM t UNION ALL SELECT * FROM t2. If you looked at the query plan for this in a Postgres version prior to 12, you'd see the CTE materialization: Postgres assumed these were two independent things, ran the two CTEs separately, and put in an Append node that takes the output of the CTE scan on t and the CTE scan on t2 and appends the results, getting one row from each, because I used a very tiny test table (but at least I put a row in it, not zero). The new thing in Postgres 12 is that you can now write WITH t2 AS MATERIALIZED. That tells Postgres to treat t2 the way it treated all CTEs in previous versions, whereas t is treated as an optimizable CTE. Otherwise it's the same query, but when you look at the plan, t is completely gone: the optimizer looked at it and said, it's silly to put SELECT * FROM foo in a CTE like that, it doesn't provide any useful feature, so I'll just inline it and run it. But where you explicitly said "I want you to materialize this", you get a full CTE scan.

By default, Postgres will still materialize a CTE if your query references it in more than one place; if it doesn't, it will try to inline it in every case. So if you have places in your codebase where you are intentionally using CTEs to control the query plan, you will need to add the MATERIALIZED keyword, otherwise that optimization barrier goes away. In fairness, you should probably first try running your query without MATERIALIZED, because maybe you don't actually need it anymore; other things in the query planner have also gotten smarter since you put that workaround in. That's the classic problem of query hints in databases that have them: you put the hint in when you need it, but you never take it out. So don't just blindly go in and put MATERIALIZED on everything; test it, and add MATERIALIZED only if you still need it, because it's quite possible you don't.
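The silly example from the talk, reconstructed (foo is any small table):

    WITH t  AS (SELECT * FROM foo),
         t2 AS MATERIALIZED (SELECT * FROM foo)
    SELECT * FROM t
    UNION ALL
    SELECT * FROM t2;

    -- EXPLAIN shows t inlined away (a plain scan on foo), while t2 still
    -- appears as a CTE Scan, exactly as all CTEs did before Postgres 12.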
So, moving on: a few quick things about backup and replication. If you are configuring max_wal_senders, the maximum number of replication connections, it is no longer counted as part of max_connections. Previously, if you had 100 max_connections and 50 max_wal_senders, and you actually used all 50 WAL senders, there would only be 50 connections left for regular users. Now, if you set 100 max_connections and 50 max_wal_senders, that's 100 max connections plus 50 WAL senders, counted from a separate pool dedicated to WAL senders. It just gives you better control, but in particular, if you have a high number of WAL senders, you might want to consider decreasing your max_connections again, because you don't need the extra headroom anymore; if you have a low number, it doesn't really matter.

recovery.conf no longer exists; it's gone. I mean, it probably still exists in your repos, but we don't use it. All the settings from recovery.conf have been merged into the main configuration file, or, if you want, they can go in postgresql.auto.conf via ALTER SYSTEM; so you can now use ALTER SYSTEM to configure your recovery parameters and things like that. If you have any scripts that deal with recovery, or with setting up your replicas, you'll have to redo them. I'd say that's an excellent time to look over the tools that are out there and stop writing your own: for backups, look at tools like pgBackRest and Barman; for replication and bootstrapping, look at things like Patroni and repmgr; and maybe retire some of the old scripts you wrote yourself, because these projects are doing it better, and you have to update your scripts anyway.

Instead of recovery.conf we now have recovery.signal, because the recovery.conf file did two things: it controlled how you would do recovery, and its very existence told the system to do recovery. The recovery.signal file has no content, but if it exists, it triggers the server to enter recovery mode, and the settings for the actual recovery are read from the main configuration file. Again: scripts, update. Similarly, there is now a standby.signal, which does the same thing but for standby mode; creating it is the equivalent of the old recovery.conf with standby_mode = on.

Now, why change this? One of the reasons is that we can now make these settings reconfigurable on the fly. In version 12 there are only two: you can change recovery_min_apply_delay, for delayed replication, and archive_cleanup_command, without restarting Postgres. So if you have a standby server with active users reading data, you can change these without affecting them. What didn't get into 12 is the ability to change primary_conninfo, to repoint a standby to a different IP address without restarting it. Hopefully it'll make it into 13, because then you'll be able to move the primary role between standby nodes by having another standby that is actively running queries repoint itself, without terminating or interrupting those queries. We're not quite there yet, but again: infrastructure for future features.

The recovery parameter recovery_target_timeline has a new default, and that default is 'latest'. That is probably what you wanted before, you just didn't know it; a lot of you will have templates that always wrote recovery_target_timeline = latest into your files. That's now the default, and it's the setting the vast majority of people actually want.

There's also a new function called pg_promote, a SQL function to promote a standby: you connect to your standby, SELECT pg_promote(), boom, you have a new primary. You don't need to run the pg_ctl command anymore.
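Two quick sketches of the above (the delay value is just an example):

    -- Recovery settings are now regular configuration parameters, and a
    -- couple of them can be changed on a reload, without a restart:
    ALTER SYSTEM SET recovery_min_apply_delay = '5min';
    SELECT pg_reload_conf();

    -- Promote a standby from SQL; by default it waits for the promotion
    -- to complete before returning:
    SELECT pg_promote();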
If you are using exclusive base backups, the old style of doing base backups: they're now even more deprecated. Eventually they'll hopefully be removed, but for now we just keep raising the deprecation level. Almost everyone, I will still claim, who has scripts using this has scripts that don't work; they just look like they work. There are a lot of dangers with exclusive base backups: if you, for example, get a crash and the server restarts while you were running a backup, it's not going to come back up; it doesn't start. That's not great. There are two ways of doing non-exclusive base backups: there is the low-level way, which still uses SQL functions, but scripts have to be updated to use them; or you can use pg_basebackup, which does it for you. If you look at any of the big backup tools, like pgBackRest and Barman, they don't use exclusive backups anymore; they did at one point, but they stopped years ago, and they're using the non-exclusive version. So if you're writing your own scripts, you should really use the non-exclusive version; it's much, much safer. And yes, that is also a very good point: you should not be using your own scripts in the first place. There are cases where you have to, but if you don't have to, don't; go look at what things like pgBackRest and Barman do, and then consider how many of those special cases you had thought of, and if it's more than 2 out of 500, you're better than average.

So, final section: performance. A couple of smaller things first. Who's actually using space-partitioned GiST (SP-GiST) indexes? Anyone? One, two, three, okay, that's good. They now support KNN searches. Who's using KNN in regular GiST today? You might not even know it; KNN is the k-nearest-neighbor search. Basically, you can use a GiST index, and now also an SP-GiST index, to do ORDER BY ... LIMIT on things like distance, and have that satisfied from the index. Also, GiST, GIN, and SP-GiST now generate a lot less WAL during index build. It's one of those good performance improvements where you don't have to do anything: nothing changes, it's just faster with less WAL; everybody's happy.

Then there's one whose name I've shortened: partial decompression of TOASTed values. Everyone knows what that means, right? What it means is that previously, if you had a large value in Postgres, such as a large geometry in PostGIS, which is the reason this was written in the first place, then even if you only wanted to read the beginning of it, Postgres would have to read the whole thing, decompress it, and then read just the beginning, because these large values are automatically compressed; the exception being things like substring. The optimization is that the internal APIs can now tell the system "I only need this much", and it will only decompress that much. What we know already is that this can be a big win for PostGIS, because PostGIS often has large structures where only the header is needed; we expect it to be useful for other things as well, but in particular for PostGIS.
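For the SP-GiST KNN support mentioned above, a sketch under assumed names (table, index, and data are hypothetical):

    CREATE TABLE places (name text, location point);
    CREATE INDEX places_loc_spgist ON places USING spgist (location);

    -- Postgres 12 can satisfy this nearest-neighbor ordering from the
    -- SP-GiST index (previously only plain GiST could do this):
    SELECT name
      FROM places
     ORDER BY location <-> point '(0,0)'
     LIMIT 10;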
I'm not going to go as far as saying we fixed the final thing in partitioning, but we've removed some of the really big hurdles. First, we've added the ability to use more flexible partition bounds. Going back: with old-style partitioning you could do anything; then in Postgres 10 we added declarative partitioning, which restricted a lot of what you could do. We've now enabled a bit more, so you can do range partitioning based on expressions and not just values.

Performance-wise, a huge difference in partitioning is how locking is done. In Postgres 11 and earlier, when you ran a query that touched one single partition, the planner would actually lock all your partitions: if you had 10,000 partitions, it would lock all 10,000 of them, and then read one. The locking has now been postponed until we actually read a partition, which means that if we can identify the data as being in a single partition, we just lock that partition and not the rest. Of course we still lock the parent table, so you can't drop it while we're reading, and things like that. This can lead to much, much faster partitioned scans, particularly when a large number of partitions is involved; it can make a really huge difference. Yes, it will always delay the locking until the actual scan, in every single case: if you have ten thousand partitions and you need to scan all ten thousand, it delays the locking until it scans them, but of course it still has to lock them all, so there's no gain in that case.

You can also now attach a new partition to your table without taking an ACCESS EXCLUSIVE lock. There are use cases where that becomes really, really important. And if you COPY data into partitioned tables: previously Postgres didn't use what we call multi-inserts for them; it would actually inject the tuples one by one into the table access manager. It now uses multi-inserts, so it's just faster; that's all you need to care about: COPY is faster.

But the really big thing, I find, is that you can now have a foreign key that references a partitioned table. The point of partitioning is that it should be transparent to your application, and then we said, well, as long as you don't have foreign keys; but you should be able to have foreign keys in your database. In Postgres 11 you could have foreign keys on your partitioned table referencing another table, but you couldn't have foreign keys on other tables referencing into the big partitioned table. Now you can, as long as the partitioned table has a primary key; you can't reference a table that has no key at all, and the existing restrictions on what a primary key on a partitioned table must look like obviously still apply, namely that the partition key has to be part of the primary key, because otherwise it just won't work. But this unlocks a whole bunch of cases where you just couldn't use partitioning before, because, I mean, we don't want to remove our foreign keys from the database; that's kind of one reason why we have a relational database.
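A sketch of such a schema (the table layout is hypothetical; note that the partition key must be part of the referenced primary key):

    CREATE TABLE orders (
        order_id   bigint NOT NULL,
        order_date date   NOT NULL,
        PRIMARY KEY (order_id, order_date)
    ) PARTITION BY RANGE (order_date);

    -- In Postgres 12, other tables may now point foreign keys at it:
    CREATE TABLE order_lines (
        order_id   bigint,
        order_date date,
        line_no    int,
        FOREIGN KEY (order_id, order_date) REFERENCES orders
    );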
The serializable isolation level is now compatible with parallel query. Previously, if you had set your isolation level to serializable, you would never get a parallel query; now you will. So obviously, if your workload is one that benefits from parallel query and requires serializable, it'll be a lot faster.

Expression JIT compilation was in Postgres 11, but turned off by default; you had to go in and turn it on. In 12 it's turned on by default, so keep an eye on your queries: there might be cases where they run slower because of it, and then you'll have to go in and turn it off. But the reason it's on by default is that we believe it will help a lot more than it will hinder, and experience shows that it will.
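If a particular workload does regress under JIT, it can be disabled per session (or per role, or globally in postgresql.conf):

    SET jit = off;   -- the Postgres 12 default is on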
So, that's a lot of features, and there is of course a lot more than this; this is a laundry list of the biggest features, or the best-liked ones. If there are contributors in the room, and I see there are a number, and I forgot to mention your feature: I'm sorry, come heckle me later. I probably should have, but there's a limit to how much you can say. There's always more: a lot of smaller fixes, a lot of performance improvements, and obviously bug fixes too, which have hopefully been back-patched.

In particular, now that we've reached beta 4, we need your help, every one of you, to download and test this version. That's the whole point: you tell us now what's wrong, and we can fix it before we release, which is a lot easier than fixing it after we've released. It'll also help you, because you'll be able to upgrade once we have released, since you've already tested, told us what was broken, and we fixed it. Beta 4 is available in apt packages and rpm packages, for SUSE, as Windows installers; all of these things are available, so it's really easy to install, and you can use pg_upgrade to upgrade your machines. Now, you might not want to upgrade your production to this yet, depending on who you are; Robert is famous for putting versions in production that nobody else would even consider. But if you have, say, CI systems, put beta 4 in your CI systems; it'll run all the time, and you can let us know how it worked. If you have a test suite for your own systems, run it against Postgres 12 beta 4 and let us know if it broke something. Let us know, of course, if we outright broke something and it doesn't work; we're also interested in whether anything ran slower in version 12. Obviously we don't intend for anything to be slower, but it could happen, and we want to know about it. And we're also interested in hearing that you ran it and it worked, because then we know it has actually been tested for real, and that gives us more confidence in what we have. Running those types of tests doesn't have to be hard; just run it real quick, even a small part helps. And yes, if you have developer systems or full-on staging systems and you're willing to run our beta versions on those, that's obviously even better; the more testing we can get, the better.

We don't think there is much left until version 12, but of course there's no guarantee: if you start building something in your application using one of these new features, yes, we may remove that feature before the release; we reserve the right to do that. We have reached beta 4, so it's unlikely, except for that progress-meter thing, but there is always a risk, which is why putting it in production can be slightly scary.

That's all I have. I think I have approximately one minute for further questions, so: any further questions? Robert? Yes, you should leave production at beta 3, don't go to beta 4 yet. Anyone else? Yeah, you can all hear lunch back there, right? That's what it's all about. In that case, I will say thank you for your attention. Feel free to grab me; I'll be here the rest of the conference, just grab me in the hallway if you have any further questions or feedback on version 12 or anything else. Other than that, enjoy lunch. Thank you very much.
Info
Channel: Postgres Open
Views: 622
Rating: 5 out of 5
Id: iUm1OPBa0Wk
Length: 49min 21sec (2961 seconds)
Published: Thu Sep 19 2019