[Backend #7] DB transaction lock & How to handle deadlock in Golang

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
Hi everyone, In the last lecture, we’ve learned how to implement a simple money transfer transaction. However, we haven’t updated the accounts’ balance yet because it’s more complicated and require careful handling of concurrent transactions to avoid deadlock. So in this lecture we’re gonna implement this feature to learn more about database locking and how to debug a deadlock situation. Today I’m gonna use a different implementation approach, which is test driven development (or TDD). The idea is: we write tests first to make our current code breaks Then we gradually improve the code until the tests pass. OK, this is the test that we were working on in the previous video. It creates 5 go routines to execute 5 concurrent transfer transactions, where each of them will transfer the same amount of money from account 1 to account 2. Then it iterates through the list of results to check the created transfer and entry objects. Now to finish this test, we need to check the output accounts and their balances. Let’s start with the accounts. First the fromAccount, where money is going out. We check it should not be empty. And its ID should equal to account1.ID Similar for the toAccount, where money is going in. The account object should not be empty. And its ID should equal to account2.ID Next we will check the accounts’ balance. We calculate the difference between the input account1.Balance and the output fromAccount.Balance. This diff1 is the amount of money that’s going out of account1. Similarly, we calculate the difference between the output toAccount.Balance and the input account2.Balance. This diff2 is the amount of money that’s going in to account2. If the transaction works correctly then diff1 and diff2 should be the same, and they should be a positive number. Also, this difference should be divisible by the amount of money that moves in each transaction, The reason is, the balance of account 1 will be decreased by 1 times amount after the 1st transaction then 2 times amount after the 2nd transaction, 3 times amount after the 3rd transaction, and so on and so forth. Because of this, If we compute k equals to diff1 divides by amount, Then k must be an integer between 1 and n, where n is the number of executed transactions. Moreover, k must be unique for each transaction, which means k should be 1 for the 1st transaction, 2 for the second, 3 for the 3rd and so on until k equals to n. In order to check this, we need to declare a new variable called existed of type map[int]bool Then here we check that the existed map should not contain k After that we set existed[k] to true Eventually, after the for loop We should check the final updated balances of the 2 accounts. First we get the updated account 1 from the database by calling testQueries.GetAccount() with a background context and the ID of account 1. This query should not return an error. We get the updated account 2 from the database in the same manner. Now after n transactions, the updated balance of account 1 must decrease by n times amount. So we require the updatedAccount1.Balance to equal to that value. We have an error here because of the mismatched type between n and amount. Amount is of type int64, so we need to convert n to int64 before doing the multiplication. Alright, now we can do the same for the updatedAccount2.Balance Except that its value should be increasing instead of decreasing. And that’s it! We’re done with the test. But before running it, I’m gonna write some logs to see the results more clearly First, let’s print out the balance of the accounts before the transactions Then print out their updated balances after all the transactions are executed. I also want to see the result balances after each transaction so let’s add a log here as well. Alright, we can now run the test. It fails at line 83, where we expect the fromAccount to be not empty but of course it is empty at the moment because we haven’t implemented the feature yet. So let’s go back to the store.go file to implement it! One easy & intuitive way to change an account’s balance is to first get that account from the database, then add or subtract some amount of money from its balance, and update it back to the database. However, this is often done incorrectly without a proper locking mechanism. I’m gonna show you how. First we call q.GetAccount() to get the from-account record and assign it to account1 variable. If error is not nil, we return it. Else, we call q.UpdateAccount() to update this account’s balance. The ID should be arg.FromAccountID, And the balance will be changed to account1.Balance - arg.amount. The updated account record will be saved to result.FromAccount. And if we get an error, just return it. After this, we have moved money out of the fromAccount. Now we can do similar thing to move those money into the toAccount. I’m just copy this block of codes. Then change this variable to account2, This one to arg.ToAccountID The result will be stored in result.ToAccount, This account ID should be arg.ToAccountID, And the new balance should be account2.Balance + arg.Amount OK so the implementation is done. However, I’m telling you it’s incorrect. Let’s rerun our test to see how it goes! The test still fails. But this time the error is on line 94, where we compare the amount of money that goes out of account 1 with those that goes into account 2. In the log, we can see that the first transaction is correct The balance of account 1 decreases by 10, from 380 to 370 And the balance of account 2 also increases by 10, from 390 to 400. But it doesn’t work correctly in the second transaction. The balance of account 2 increases by 10 more, to 410 While the balance of account 1 stays the same, at 370. To understand why, let’s look at the GetAccount query It’s just a normal SELECT, So it doesn’t block other transactions from reading the same Account record. Therefore, 2 concurrent transactions can get the same value of the account 1, with original balance of 380. So it explains why both of them have the updated balance of 370 after execution. To demonstrate this scenario, Let’s start the psql console in 2 different terminal tabs and run 2 parallel transactions. First I will begin the 1st transaction in this tab Then switch to the other tab and start the 2nd transaction. In this transaction, let’s run a normal SELECT query to get the account record with ID 1. This account has a balance of 748 EUR. Now I’m gonna copy this query And run it in the other transaction. As you can see, The same account record is returned immediately without being blocked. This is not what we want. So let’s rollback both transactions and learn how to fix it. I will start 2 new transactions. But this time, we will add “FOR UPDATE” clause at the end of the SELECT statement. Now the first transaction still gets the record immediately. But when we run this query on the second transaction, It is blocked and has to wait for the first transaction to COMMIT or ROLLBACK. Let’s go back to that transaction And UPDATE the account balance to 500 After this update, the second transaction is still blocked. However, as soon as we COMMIT the first transaction, We can see that the second transaction is unblocked right away, And it gets the newly updated account with balance of 500 EUR. That’s exactly what we want to achieve. So let’s go back to the account.sql file And add a new query to get account for update I will just copy this get account query And add FOR UPDATE at the end. Then we open the terminal and run make sqlc to regenerate the code. OK, now in the account.sql.go file, A new GetAccountForUpdate() function is generated. We can use it in our money transfer transaction. Here to get the first account, we call q.GetAccountForUpdate We do the same thing to get the second account. Alright, now we expect this to work. Let’s rerun our test. Unfortunately, it still fails. This time the error is “deadlock detected”. So what can we do? Don’t worry, I’m gonna show you how to debug this deadlock situation In order to figure out why deadlock occured, We need to print out some logs to see which transaction is calling which query and in which order. For that, we have to assign a name for each transaction and pass it into the TransferTx() function via the context argument. Now inside this for loop, I’m gonna create a txName variable to store the name of the transaction. We use the fmt.Sprintf() command and the counter i to create different names: tx 1, tx 2, tx 3, and so on. Then inside this go routine, Instead of passing in the background context, We will pass in a new context with the transaction name. To add the transaction name to the context, we call context.WithValue(), pass in the background context as the parent context, and a pair of key value, where value is the transaction name. Here it says the key should not be of type string or any built-in type to avoid collisions between packages. Normally we should define a variable of type struct{} for the context key. I’m gonna do that in the store.go file because later we will have to use this key to get the transaction name from the input context of the TransferTx() function. Here we declare variable txKey of type empty struct{} This second bracket means that we’re creating a new empty object of that type. Alright, now let’s go back to the store_test.go file and pass the txKey and txName into this function. After this step, the context will hold the transaction name. And we can get it back in the TransferTx function by calling ctx.Value() to get the value of the txKey from the context. Now we have the transaction name, we can write some logs with it. Let’s print out this transaction name and the first operation: create transfer. Let’s do the same for the rest of the operations. This operation is to create entry 1 This one is to create entry 2 This operation is to get account 1 for update This one is to update account 1’s balance The next operation is to get account 2 for update And the last operation is to update account 2’s balance. Alright, now we can rerun the test to see how it goes. But to make it easier to debug, we should not run too many concurrent transactions So I’m gonna change this n to 2 instead of 5. Then let’s run the test! And voila, we still got the deadlock, But this time, we have a detailed logs of what happened. As you can see here, Transaction 2 ran its first 2 operations: create transfer and create entry 1. Then transaction 1 jumped in to run its create transfer operation. Transaction 2 came back and continued running its next 2 operations: create entry 2 and get account 1 Finally the transaction 1 took turn and ran its next 4 operations: create entry 1, create entry 2, get account 1, and update account 1. At this point, we got a deadlock. So now we know exactly what happened. What we have to do is to find out the reason why it happened. Here I have opened the simple_bank database in TablePlus, And at the moment, it has 2 accounts with the same original balance of 100 USD. I also prepared the money transfer transaction with the list of SQL queries that should be run exactly as we implemented in our Golang code. The transaction starts with the BEGIN statement. First we INSERT a new transfer record from account 1 to account 2 with amount of 10 Then we INSERT a new entry record for account 1 with amount of -10 And INSERT another entry record for account 2 with amount of +10 Next we SELECT account 1 for update And we UPDATE its balance to 100-10, which is 90 USD. Similarly, we SELECT account 2 for update And we UPDATE its balance to 100+10, equals to 110 USD. Finally we do a ROLLBACK when a deadlock occurs. Now just like what we did before, I’m gonna open the terminal and run 2 psql console in order to execute 2 transactions in parallel. Let’s start the first transaction with BEGIN Then open another tab and access the psql console. Start the second transaction with BEGIN OK, as we’ve seen in the logs, First transaction 2 must create transfer and entry 1. Let’s copy this INSERT query and paste it to this transaction. The transfer record is created. Next the query to create entry for account 1. Inserted successfully! Now we have to move to transaction 1 And run the first query to create transfer record. Now back to transaction 2 And run its third operation to create entry for account 2. Also successful. Then the fourth query to get account 1 for update. Now we see that this query is blocked. It is waiting for the transaction 2 to commit or rollback before continue. It sounds strange because transaction 2 only creates a record in transfers table while we’re getting a record from accounts table. Why a INSERT into 1 table can block a SELECT from other table? To confirm this, let’s open the browser and search for “postgres lock” This long and complex query allows us to look for blocked queries and what is blocking them. So let’s copy and run it in TablePlus. As you can see, The blocked statement is SELECT FROM accounts FOR UPDATE And the one that’s blocking it is INSERT INTO transfers. So it’s true that queries on these 2 different tables can block each other. Let’s dig deeper to understand why the SELECT query has to wait for the INSERT query. If we go back to the postgres wiki page and scroll down a bit We will see another query that will allow us to list all the locks in our database. Let’s copy it to TablePlus. I’m gonna modify this query a bit because I want to see more information. This a.datname field will show us the database name, Let’s add a.application_name to see which application the lock comes from. The l.relation regclass is actually the name of the table, L.transactionid is the transaction ID, L.mod is the mod of the lock Let’s also add l.lock_type to see the type of the lock. L.granted tells us whether the lock is granted or not. A.usename is the username who run the query, A.query is the query that’s holding or trying to acquire the lock. The time when that query started or its age are not very important so I’m gonna remove them. The last field is a.pid, which is the process ID. As you can see, we’re selecting from the pg_stat_activity table, alias as a, and join with the pg_locks table, alias as l, on the process ID column. It’s ordering by query start time, but actually I think order by process ID is better because we have 2 different processes that are running 2 psql consoles with 2 parallel transactions. So it will be easier to see which lock belong to which transaction. Alright, let’s run it! Here we can see some locks from TablePlus application, which are not relevant What we care about is only the locks that came from psql consoles, So I’m gonna add a WHERE clause here to get only the locks with application name equals “psql”. The database name is also not important because it’s always “simple_bank” in our case So I will remove a.datname as well. OK let’s run this query again. Now we can see, there is only 1 lock that hasn’t been granted yet It comes from the SELECT FROM accounts query of the process ID 3053. The reason it’s not granted is because it is trying to acquire a ShareLock of type transaction ID where the transaction ID is 2442. While this transaction ID lock is being held exclusively by the other process ID 3047, with the INSERT INTO transfers query. But why a SELECT FROM accounts table needs to get a lock from other transaction that runs INSERT INTO transfers table? Well, if we look at the database schema, We can see that the only connection between accounts and transfers table is the foreign key constraint. The from_account_id and to_account_id columns of transfers table are referencing the id column of accounts table. So any UPDATE on the account ID will affect this foreign key constraint. That’s why when we select an account for update, it needs to acquire a lock to prevent conflicts and ensure the consistency of the data. Having said that, now if we continue running the rest of the queries on transaction 1 to create entry for account 1 Create entry for account 2 And select account 1 for update. We will get a deadlock because this query also has to wait for a lock from transaction 2, while transaction 2 is also waiting for a lock from this transaction 1. And that clearly explains how the deadlock happens. But how to fix it? OK let me rollback these 2 transaction first. As we know the deadlock is caused by foreign key constraints, So 1 simple way to avoid it is to remove those constraints. Let’s try comment out these statements Then run make migratedown in the terminal to delete the database schema And run make migrateup to recreate the new db schema without foreign key constraints. Alright, now if we run the test again, It will pass because the constraints are gone, so no lock is required when select accounts for update. And no lock means no deadlock. However, this is not the best solution, because we don’t want to loose our nice constraints that keep our data consistent. So let’s revert these changes, Run migratedown Then migrateup again to have those constraints back. Now the test will fail because of deadlock again. Let’s learn a better way to fix this issue. As we already know, the transaction lock is only required because Postgres worries that transaction 1 will update the account ID which would affect the foreign key constraints of transfers table. However, if we look at the Update Account query, We can see that it only change the account balance. The account ID will never be changed because it’s the primary key of accounts table. So if we can tell Postgres that I’m selecting this account for update, but its primary key won’t be touched, then Postgres will not need to acquire the transaction lock, and thus no deadlock. Fortunately, it’s super easy to do so. In this query, instead of just SELECT FOR UPDATE, We just need to say more clearly: SELECT FOR NO KEY UPDATE This will tell Postgres that we don’t update the key, or ID column of accounts table. Now let’s run make sqlc in the terminal to regenerate golang code for this query. OK the code is updated. Let’s run our test again. It passed! Excellent! So our debugging and fixing is done Let’s remove our debug code. Change this n back to 5 Remove this transaction name No need to add value to this context anymore. Then remove all logs in the TransferTx() function. and this txKey variable as well OK let’s run the test again. It passed. And we can look at how the balance of the 2 accounts are changing after each transaction The balance of account 1 is decreasing by 10 And balance of account 2 is increasing by the same amount. Perfect! Now before we finish, I’m gonna show you a much better way to implement this update balance operation. Currently, we have to perform 2 queries to get the account and update its balance. We can improve this by using only 1 single query to add some amount of money to the account balance directly. For that, I’m gonna add a new SQL query called AddAccountBalance. It’s similar to the UpdateAccount query, except that, here we set balance equals to balance + the 2nd argument Let’s run make sqlc to generate the code. A new function is successfully added to the Queries struct. However this balance parameter looks a bit confusing Because we’re just adding some amount of money to the balance, Not changing the account balance to this value. So this parameter’s name should be amount instead. Can we tell sqlc to do that for us? Yes, we can! In this SQL query, Instead of $2, we can say sqlc.arg(amount) And here, instead of $1, we should say sqlc.arg(id) This amount and id will be the name of the generated parameters. OK let’s run make sqlc in the terminal to regenerate the code. This time, we can see the parameters’ name have changed to what we want. Cool! Now come back to the store.go file, I’m gonna remove this GetAccountForUpdate call, And change this UpdateAccount() to AddAccountBalance This should be AddAccountBalanceParams Change this Balance field to Amount And remove this account1.Balance. Let’s do the same thing for account 2. Remove this GetAccountForUpdate, Change this to AddAccountBalance, This one to AddAccountBalanceParams, This balance should be changed to Amount, And remove account2.Balance here. And we’re done! Let’s rerun the test. Oops, it fails. The error is at line 95, where we compare the diff Expected -10, but actually got 10 instead. OK I know why, Here when we update the balance of the FromAccount, The amount should be -arg.amount because money is moving out. Alright, now this should work Let’s rerun the test one more time. Yay! It passed! Let’s run the whole package test. All passed! And that’s it for today’s lecture about locking in db transaction and how to debug a deadlock. I hope you enjoy it. And stay tuned for the next lecture, because I’m telling you the deadlock issue is not completely resolved yet. There are much more to learn about it. In the mean time, happy coding and I’ll see you very soon!
Info
Channel: TECH SCHOOL
Views: 5,010
Rating: undefined out of 5
Keywords: golang database lock, golang db lock, golang transaction lock, database locking, db locking, database transaction lock, db transaction lock, database deadlock, db deadlock, golang db transaction, golang transaction, golang database, golang db, backend course, backend master class, backend tutorial, golang tutorial, coding tutorial, programming tutorial, tech school, tech school guru, techschool, techschoolguru, postgres lock, golang database deadlock, golang db deadlock
Id: G2aggv_3Bbg
Channel Id: undefined
Length: 28min 20sec (1700 seconds)
Published: Tue Aug 25 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.