High Availability MySQL using Python - Nathan Coffield

Video Statistics and Information

Captions
My name is Nathan Coffield, and I'm here to talk about how to design a high availability solution for MySQL using Python. In 2010, when I moved to the valley, there was no real out-of-the-box solution for MySQL failover. I joined the Facebook site reliability operations team, and we were doing failover by hand; it was cumbersome, prone to human error, and resulted in extended downtime for our users. Even today, group replication is not widely adopted as a failover solution. In this talk I'll walk you through the thinking that allowed us to design a system for detecting and remediating MySQL primary failures automatically, using Python. A full deep dive on this topic typically takes about an hour, so I'll do my best to get you up to speed in the next 20 minutes. That means I have about 90 seconds per slide, so wish me luck.

A brief agenda: first we'll take a look at how a typical load balancer handles a failure in a stateful service. Afterwards we'll attempt to use those same concepts to implement our failover system for MySQL. I'll show you some example Python that can detect a failure, and dive a bit deeper into the limitations of that solution. After we address those limitations, I'll briefly describe the process for demoting a primary server, and then we can get on with the Q&A. At the end of the slides you'll find links to my lightning talk at Box, which elaborates a bit more on the MySQL-specific details, and a presentation from my friend Jeff Jang at Percona Live, which goes into much greater detail on these systems. Finally, I'll provide my contact information if you'd like to follow up offline with more questions.

This first slide represents a typical load balancer. We have several nodes which all serve the same content, and the load balancer typically has some method to validate the status of those nodes. In this example the first two nodes are obviously healthy, but our third node is experiencing a failure, so the load balancer now knows that it should not serve any additional requests through that connection. Additionally, we have a third column here which can help determine the load on a given node: typically you might have some maximum number of connections your web server can handle, and the load balancer will do its best to equitably distribute load amongst the group.

Let's see if this same data structure can be used for MySQL. In this diagram we try to map the concept of a stateless load balancer onto MySQL concepts. In the first column are the members of a replication group: db1 and db2 are both replicating from db3 and should have an exact copy of the data, only slightly behind the primary. We aren't doing any fancy multi-master replication here; all writes have to go to db3 before arriving at db1 or db2. MySQL has the concept of an I/O thread, so we can use SQL commands over the MySQL protocol to query the status of the replication thread. No such command exists for the primary, since it's not replicating, but we can use a SELECT 1 as a fairly reliable status check. The third column represents the latency of the replicas relative to the primary instance; this will allow us to disable a replica if, for any reason, it begins to lag past our application's latency thresholds. Obviously the primary instance has no replication lag, because that's where the writes originate. So now we know we should be able to get the necessary information to determine whether a replica or the primary is offline.
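The per-node checks described above aren't reproduced in the captions. The following is a minimal sketch of what they might look like, assuming the PyMySQL client; the "monitor" credentials and the lag threshold are placeholder values, not details from the talk.

    import pymysql

    def check_primary(host):
        """Return True if the primary answers a trivial query (the SELECT 1 check)."""
        try:
            conn = pymysql.connect(host=host, user="monitor", password="secret",
                                   connect_timeout=2)
            with conn.cursor() as cur:
                cur.execute("SELECT 1")
                cur.fetchone()
            conn.close()
            return True
        except pymysql.MySQLError:
            return False

    def check_replica(host, max_lag_seconds=30):
        """Return (healthy, lag) based on SHOW SLAVE STATUS on the replica."""
        try:
            conn = pymysql.connect(host=host, user="monitor", password="secret",
                                   connect_timeout=2,
                                   cursorclass=pymysql.cursors.DictCursor)
            with conn.cursor() as cur:
                cur.execute("SHOW SLAVE STATUS")
                status = cur.fetchone()
            conn.close()
        except pymysql.MySQLError:
            return False, None          # can't even reach it: treat the replica as offline
        if not status:
            return False, None          # replication is not configured on this node
        io_running = status["Slave_IO_Running"] == "Yes"
        lag = status["Seconds_Behind_Master"]
        healthy = io_running and lag is not None and lag <= max_lag_seconds
        return healthy, lag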
To implement our failover system, we're going to need some shared topology service to distribute the membership of the group and tell us who the primary is. In a load balancer, nodes might be hard-coded into the configuration, but for database connections we typically store these details in some kind of service like ZooKeeper or Consul. If we can't query for the I/O status at all, the replica is offline. If the I/O thread status is nonzero, we know that the replica is online but its connection to the primary is unhealthy, potentially indicating a primary failure. Unlike the stateless service, clients are not served directly through a load balancer: an application needs to be routed to the primary for writes and to the closest replicas for read queries. And we'll need some kind of algorithm to ensure that a single unhealthy replica does not cause us to assume the primary is offline.

The following is just pseudocode; it's the fewest lines of code I could come up with to adequately describe this process, and I apologize that you cannot copy and paste this and have a failover system. Every replica in the group would run this script as a sidecar alongside the mysqld process. Lines 4 to 8 implement the status checks we described on the earlier slide; it's pretty straightforward, a SELECT 1 on the primary and SHOW SLAVE STATUS on the replicas, and in this way we can collect the status information we'll need for failover. Lines 12 to 20 are where we obtain consensus, and that's really where all the magic is happening: we use a thread pool to talk to every replica in the group, and on line 20 we collect those results; if more than half of our replicas concur that the primary is offline, we'll return false. This ensures that all members of the group, or at least a majority, agree that the primary is offline before we do our failover. In the main function you can see we're just in a constant while loop: if the primary goes offline, we're going to demote it.

This solution has some problems. It's essentially what we had at Facebook in about 2013, when I joined the DBA team, and we needed to address some of those problems. Using I/O thread status we can only verify the link state between the primary and the replica; you can't detect a disk failure, which can be worse than a network failure. Also note that I didn't show the process to disable a replica in the previous script. It's pretty simple: if the replica is offline, or Seconds_Behind_Master exceeds our threshold for latency, we disable the replica from the primary, and our failover system should ensure that a primary is online to perform that operation. Unfortunately, there's no way to verify the write path using the MySQL protocol alone; you cannot make a replica write to a primary, so we would need to create our own protocol to transfer this data between our agents.
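The slide's pseudocode likewise isn't included in the captions. A minimal stand-in for the quorum check and main loop described above might look like the following, again assuming PyMySQL and placeholder credentials; get_topology and demote_primary are hypothetical helpers for the topology-service lookup and the demotion procedure, not the actual code from the talk.

    import time
    from concurrent.futures import ThreadPoolExecutor

    import pymysql

    def replica_sees_primary(host):
        """True if this replica's I/O thread (its link to the primary) is running."""
        try:
            conn = pymysql.connect(host=host, user="monitor", password="secret",
                                   connect_timeout=2,
                                   cursorclass=pymysql.cursors.DictCursor)
            with conn.cursor() as cur:
                cur.execute("SHOW SLAVE STATUS")
                status = cur.fetchone()
            conn.close()
            return bool(status) and status["Slave_IO_Running"] == "Yes"
        except pymysql.MySQLError:
            return False                    # an unreachable replica doesn't get a vote

    def primary_is_healthy(replica_hosts):
        """Ask every replica in parallel; only a majority of 'down' votes wins."""
        with ThreadPoolExecutor(max_workers=max(len(replica_hosts), 1)) as pool:
            votes = list(pool.map(replica_sees_primary, replica_hosts))
        # Treat the primary as offline only when more than half of the replicas
        # report that their link to it is broken.
        return votes.count(False) <= len(votes) // 2

    def main():
        while True:
            # get_topology() and demote_primary() are hypothetical stand-ins for
            # the topology-service lookup (e.g. ZooKeeper/Consul) and the
            # demotion procedure described later in the talk.
            primary, replicas = get_topology()
            if not primary_is_healthy(replicas):
                demote_primary(primary, replicas)
            time.sleep(1)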
If we go back to our previous data structure, we can solve the write-path problem by performing a heartbeat insert from our script, and we'll create an internal table for that purpose; essentially, we just create a heartbeat table. To determine whether our heartbeat inserts were successful, we'll need our scripts to be able to talk to one another, so we'll implement a server which can do just that. At Facebook we used Thrift for this purpose, but gRPC and protobufs can do the same thing. We'll create a thread to perform the inserts and gather status from the members of the group, and we'll write a demoter thread that will perform our consensus check when we detect a local heartbeat failure. (A rough sketch of this heartbeat-based agent follows at the end of the transcript.)

Here you can see our status query has changed from a read-only query to a query which writes into our new heartbeat table; this will be used by the local agent to get its own view of the write path. The get_status function has been changed to connect to our status service so it can get the results of the remote replicas' heartbeat inserts. Finally, we iterate through our group members collecting the appropriate status metadata just like we did before, and that's basically the status thread.

Now that we've collected the necessary status data, we can once again determine consensus. In this example we initiate a quorum vote if the local status thread failed its own heartbeat insert; again, if more than half of the replicas agree that the primary is offline, we can initiate a demotion. We also need to implement get_primary_status so our peers can see the result of our own heartbeat inserts, and that's what you see on line 45. I went a lot faster than I expected.

So how do we demote a replica? If you're not familiar with MySQL, this might seem kind of hand-wavy, and I didn't cover it in the pseudocode because it would be a pretty involved process, but the gist of it is that we need to find the replica with the latest transaction, ensure those transactions are replicated to the other members of the group, and then, if necessary, promote a replica in a preferred region, one with the lowest latency relative to our applications. Once all that is complete, we can change the topology service to point to the new primary. I did it in 10 minutes!

Here you'll find links to some of the public talks we've given on this topic in the past; I'll include the links in the chat for you, in both the Slack and the Remo chat. There's a lightning talk we did at Box, and the talk from Percona Live; those slides have much more in-depth coverage of how all this works and which systems need to interact to make sure this is safe and performant. Finally, I have a link here for MySQL group replication, so that hopefully you don't have to implement your own failover system: if you're using MySQL 5.7 or 8.0, you should be able to get away with just using group replication. And here's some of my contact info if you'd like to talk.
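A rough illustration of the heartbeat-based agent referenced in the transcript above: this is an assumption-heavy sketch rather than the Thrift service from the talk. The monitor.heartbeat table and its columns, the placeholder credentials, and call_peer_primary_status (an RPC to a peer agent's get_primary_status) are all hypothetical.

    import time

    import pymysql

    HEARTBEAT_SQL = "REPLACE INTO monitor.heartbeat (node_id, ts) VALUES (%s, %s)"

    class Agent:
        def __init__(self, node_id, primary_host, peers):
            self.node_id = node_id
            self.primary_host = primary_host
            self.peers = peers                 # RPC endpoints of the other agents
            self.last_heartbeat_ok = False

        def heartbeat_once(self):
            """Write through the primary to verify the full write path."""
            try:
                conn = pymysql.connect(host=self.primary_host, user="monitor",
                                       password="secret", connect_timeout=2)
                with conn.cursor() as cur:
                    cur.execute(HEARTBEAT_SQL, (self.node_id, time.time()))
                conn.commit()
                conn.close()
                self.last_heartbeat_ok = True
            except pymysql.MySQLError:
                self.last_heartbeat_ok = False

        def get_primary_status(self):
            """Exposed over RPC so peers can see our view of the write path."""
            return self.last_heartbeat_ok

        def should_demote(self):
            """Quorum vote: demote only if our heartbeat failed and a majority agrees."""
            if self.last_heartbeat_ok:
                return False
            # call_peer_primary_status() is a hypothetical RPC call to a peer
            # agent's get_primary_status (Thrift or gRPC in practice).
            votes = [call_peer_primary_status(peer) for peer in self.peers]
            failures = sum(1 for ok in votes if not ok)
            return failures > len(votes) // 2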
Info
Channel: SF Python
Views: 84
Rating: 5 out of 5
Keywords: MySQL, SF Python, Nathan Coffield
Id: 8ETR2ENQgaE
Length: 11min 0sec (660 seconds)
Published: Mon Jun 14 2021