AWS Lambda Python functions with a database (DynamoDB)

If you've ever used AWS Lambda functions before, or you've seen one of my video tutorials about them, then you'll know that they are a great way to deploy your code or your API into the cloud for a really cheap cost, and without having to worry about servers. However, unlike traditional servers, Lambda functions disappear after they run. So a common problem I see people running into is data persistence: how do you save data or state when you use a Lambda function? Today we're going to solve that problem by using a cloud database, AWS DynamoDB, together with our Lambda function. We're going to write a simple Python function that increments a count each time we execute it, sort of like a page visit counter, but it's going to save that information into a database. So every time we use the function, it will remember what the number is and increment it. Let's get started.

We'll start off in our AWS console. Go to your console and search for the Lambda service. If it's already on your dashboard, you can click it; otherwise, go to the search box and type in Lambda. We're first going to create a new function, so I'm going to press "Create function" over here, and we'll author it from scratch. I'm going to create a function that counts the number of times a user has visited it, so I'm going to call it "visit-count-function". Then select the runtime, which is going to be the latest Python version, and you can pretty much leave everything else at the default settings. Once the function has been created, let's scroll down and test it really quickly. It's already given us a boilerplate function handler, which we don't have to change yet. We'll just go to "Test", and to test we have to create a new test event. I'm going to call it "hello world", and we can use this template. Save that, hit "Test", and we can see that the function executes. Here is the response: status code 200, and the body is "Hello from Lambda".

Now let's update the function to return a custom message. The event argument that the handler takes in is what we configure as the test event, so let's make the test event have a user, because we want to say hi to the user and count their number of visits. I'm going to call this user "Pixegami AWS" and save that. The object we just configured will now be accessible through the event object that's passed into our function. So if I want to access that user, I write user equals event and then access that key, and then I'll change the JSON message as well so that it says "Hello user", like this. Okay, let's test this again. But before we test it, we have to save it and then click "Deploy" so that it actually updates our function. Now that it's been deployed and successfully updated, I can test it again with my test event, and it should say "Hello Pixegami AWS". So now I know my function works. I'm not going to write the rest of the function in here, though, because I like to edit locally on my device, in my VS Code editor. Instead, I'm going to create a Python file just like this one, call it "lambda_function", and create a function called "lambda_handler" that takes the "event" and "context" objects. So let's go to our IDE and do that locally.
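For reference, here's roughly what the function looks like at this point. The exact message wording is my approximation of what's typed on screen in the video; the "user" key matches the test event we just configured.

```python
import json

def lambda_handler(event, context):
    # "user" comes from the test event we configured, e.g. {"user": "Pixegami AWS"}.
    user = event["user"]
    return {
        "statusCode": 200,
        "body": json.dumps(f"Hello {user}"),
    }
```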
I'm in my IDE now, VS Code, and I'm going to create "lambda_function.py" as my file. Here is going to be my handler: it's called lambda_handler, and it takes an event, typed as "Any", and a context, which is also "Any". I'm pretty much going to start with the same content that I had in my actual Lambda function, so I'm going to get the user from the event and say "hi" to the user. I also want to be able to test this function locally, so I'm going to add an "if __name__ == '__main__'" block and call the function from there. I'm going to change my user to "Pixegami local" so I can tell the difference between using this in the cloud on AWS and using it locally on my computer, and then I'll run lambda_handler with this event and print out the result. Let's run that in our local terminal now. You can see it runs and I get my message, "Hello, Pixegami local", and that's fine.

Okay, so in this function, I actually want to count the number of times I've visited this page. How can we implement that? Well, let's start with a visit count integer, and I'm going to change the message to "hello user, you have visited this page X amount of times". I'll move this panel a little so you can see more of the text, and maybe zoom out. Now let's run this again, and I get the message "Hello, Pixegami local, you have visited this page 0 times". That's good, but no matter how many times I run this, it's always going to be zero, because my local function, and my Lambda function as well, has no way to save or persist the number of times I've actually visited this page. Even if I increment the count inside the function, it's still static: I can run this function as many times as I want, and it will always start at zero and end up at one. That's fine if I use it once, but if I want to accumulate the number of times each user visits this page, I need some way to store that data.
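Before bringing a database in, here's roughly where the local file stands; the count is just a local variable, so it resets on every run. The return shape mirrors the Lambda boilerplate and is my approximation of what's on screen.

```python
from typing import Any

def lambda_handler(event: Any, context: Any):
    user = event["user"]
    visit_count = 0  # just a local variable for now, so it resets on every run
    return {
        "statusCode": 200,
        "body": f"Hello {user}, you have visited this page {visit_count} times.",
    }

if __name__ == "__main__":
    # Local test: call the handler directly with a fake event and print the result.
    result = lambda_handler({"user": "Pixegami local"}, None)
    print(result)
```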
To store that data, I'm going to use a database. With AWS Lambda, we can pretty much hook it up with any kind of database we want. Some of you might be familiar with SQL databases, or MongoDB, but I like using something that's serverless, because Lambda is serverless: it's cheaper to run, and it only charges you each time you use the function. So I want a database that also charges me per use, rather than something I have to pay for on a monthly basis. Amazon actually has a service that lets us do this, and it's called DynamoDB. It's a bit different from SQL or MongoDB, it's more of a simple key-value store, but it will fit the purpose for this app. To create a DynamoDB table, again go to your AWS console and search for DynamoDB. Amazon has maybe nine or ten different database options to choose from, but as far as I know right now, there are only two serverless options: one is DynamoDB and the other is Aurora, which is sort of a SQL database. We don't need a whole SQL database at the moment, so I'm going to go with the simpler option and create a DynamoDB table. Once you're on your DynamoDB dashboard, just click "Create table" to get started. We have to enter a table name, so I'm going to call it "visit-count-table", and then we have to choose our partition key.

You can sort of think of the database as an online dictionary or an online hash map, and the partition key is basically the field we use as the object's hash key. The values we put into the database under that key need to be unique per entry, because that's how we look up items in the database. Since I'm counting visits per user ID, I'm going to use the username as my partition key (my hash key, in this case). It's a string, and I'm just going to call it "user". The sort key is optional. The idea is that if we wanted many items with the same "user" value, for example a database where every item tracks each time a user logged on, I might have duplicate user keys, but I could then use a timestamp as my sort key. So when I want the last 10 times a user logged on, I can look items up by their username and sort them by the 10 most recent or 10 earliest timestamps. We're not going to use that today, so we'll leave it blank, but that's what the sort key is for.

Normally, I like to leave everything on the default settings, but (and I'm not sure why AWS doesn't make this the default) we need to change the read and write capacity. As it stands, we'd be reserving the ability to read and write to this database, essentially a flat usage fee, which goes against what I was saying before about wanting to be completely serverless and pay per use. So that's the only thing we have to change: go to "Customize settings" (everything else stays the same), and under read/write capacity settings, change it from provisioned to on-demand. This simplifies billing by charging for the actual reads and writes, which is what we want because we don't know how our application is going to scale. We don't want to provision read and write capacity for it and either risk an outage if we don't provision enough, or overpay for capacity we don't actually need. So I always recommend picking on-demand so that it fits our use case exactly. We can skip everything else and just click "Create table". Once the table is created, you should see it show up in your tables view. You can click on it and see the actual table: our partition key is called "user", everything looks fine, and it's using on-demand capacity. If we want to see what items are inside, we can go to "Explore table items", but this is a new table, so there are no items yet. That's fine.
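The video creates the table by clicking through the console, but for reference, here's roughly the same thing as a single boto3 call. The table name is my guess at what was typed (DynamoDB table names can't contain spaces), so match it to whatever you actually name yours.

```python
import boto3

# Roughly the console steps as a boto3 call: a single string partition key
# called "user", and on-demand ("pay per request") billing mode.
dynamodb = boto3.client("dynamodb")
dynamodb.create_table(
    TableName="visit-count-table",  # must match the name you use later
    AttributeDefinitions=[{"AttributeName": "user", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "user", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
)
```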
For the next step, we're actually going to interact with this table from our Lambda function, or first, just from our local Python function. Before we can do that, you need to have the AWS CLI installed on your computer. Go to Google and search for "AWS CLI"; it should be the first result that pops up, the AWS Command Line Interface. Click on that and you'll see the AWS CLI documentation. If you haven't done so already, first install the AWS CLI for your operating system. Once you've installed it, you'll also need to configure it. You can follow the "quick configuration with aws configure" guide: you run that command, and you'll be asked to put in an access key ID and a secret access key from your AWS account, which basically gives your desktop access to your AWS account. You shouldn't expose these values to anyone else, because whoever has them gets access to your AWS account and all the services you pay for on there. Once you've set it up, you should be able to test it in your terminal: run "aws sts get-caller-identity", which tells you who you are, which account and which user you're acting as from the AWS CLI. If it shows the account ID and the user ID, and it looks like the account you set up, then everything's working properly and you can proceed to the next step.

With the AWS CLI set up, we can interact with the rest of the AWS services from this Python function. To do that, we need to import the AWS SDK. The AWS SDK for Python is called "boto3", so we'll import that here. You might need to install it in your Python environment; if you don't have it, go back to your terminal and type "pip install boto3". Once that is installed, we can access the DynamoDB table we created earlier. What I'm going to do is first read the table and see if there is already an entry for this user. If there is, I'll load the visit count from there; otherwise, I'll just save whatever the current visit count is back to the table. So let's give that a shot. We need to set our table name, which we called "visit-count-table", so let's copy that and put it here. Then we can create a table object using the DynamoDB resource's Table function with the table name we have, and once we do that, we can look up the user and load the visit count from there. We get the item by calling table.get_item with the key set to the user, and we get back a response. The actual item in the response is under a key called "Item" with a capital I, which is not very Pythonic, but that's okay; the SDK was probably developed in Java first. It doesn't matter. If the item exists in the response, we set the visit count equal to the item's "count" attribute. If the item doesn't exist, this will be None, so nothing happens and the visit count remains zero, and that's fine. The last thing we have to do is write the updated visit count back to the table under this user key, so right after we increment it, we call table.put_item and create our item, which has the user and the count. And that should do the trick.

Now, I know I used a lot of different boto3 APIs here: the boto3 resource, Table, get_item, put_item. I use GitHub Copilot, so all I had to do was type in a comment and it auto-suggested these for me, but I already knew that was the direction I was going. If you want to understand more of the SDK, know which commands you can use, or just read up on them, search for the boto3 DynamoDB documentation on Google and you'll find the full manual. If you type in "boto3 documentation for DynamoDB", you can click through to boto3.amazonaws.com, and you've got the full documentation with all the methods, statements, and parameters there for you to read. So, back to our function. Now let's actually test it.
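Putting those pieces together, here's a sketch of the handler with DynamoDB persistence. The attribute names ("user" and "count") follow the narration, but treat the exact layout, and the table name, as my reading of what's on screen rather than the definitive file.

```python
import json
import boto3

TABLE_NAME = "visit-count-table"  # hard-coded for now; must match the table we created

# Create the table resource once, outside the handler.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(TABLE_NAME)

def lambda_handler(event, context):
    user = event["user"]
    visit_count = 0

    # Look up an existing count for this user; if the item doesn't exist,
    # the response has no "Item" key and the count stays at zero.
    response = table.get_item(Key={"user": user})
    item = response.get("Item")
    if item:
        visit_count = item["count"]

    visit_count += 1

    # Write the updated count back to the table under the same user key.
    table.put_item(Item={"user": user, "count": visit_count})

    return {
        "statusCode": 200,
        "body": json.dumps(f"Hello {user}, you have visited this page {visit_count} times."),
    }

if __name__ == "__main__":
    print(lambda_handler({"user": "Pixegami local"}, None))
```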
I'm going to go back to my terminal and run "python lambda_function.py", which just runs that test event we created. Now I see the message "Hello, Pixegami local, you have visited this page one time". That's what we had before, but let's see if that number goes up when we run it again. And indeed, it does go up. So the Python function is now actually saving this information in the cloud resource we created, this table, and it's incrementing the visit count each time for this user. If I use it again, it's up to three times. If I change the username, it's a new user that hasn't been seen in the table, so it should start back at one. Let's try that: I'll change the user to "Jack", for example, and if I run it again, Jack has only been seen by this page one time. So I think our table is working locally.

Now, let's get it working in the Lambda function we created at the beginning of the video. Before we do that, though, I need to fix something from earlier: I hard-coded this table name, "visit-count-table", which works, but most likely you're going to want to configure things like this, your table name and how your resources talk to each other, through environment variables or some other configuration that isn't hard-coded and doesn't require a code change to edit. So instead, I'm going to read that information from an environment variable, which I'll call TABLE_NAME. We'll have to import os as well, since we're using that now; we import os and read the table name from the environment. Of course, in our local test we haven't set that variable yet, so we have to set it to "visit-count-table". Let's test again to make sure it still works. Okay, that's working correctly, and now, when we put this into a Lambda function, we can specify the table through this environment variable instead.
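Here's a condensed sketch of that change. The variable name TABLE_NAME is my assumption of what was typed in the video, and I've moved the lookup inside the handler so the local test block can set the variable before the handler reads it; the video may structure this slightly differently.

```python
import os
import boto3

def lambda_handler(event, context):
    # Read the table name from an environment variable instead of hard-coding it;
    # the rest of the handler is unchanged from the previous sketch.
    table_name = os.environ["TABLE_NAME"]
    table = boto3.resource("dynamodb").Table(table_name)

    user = event["user"]
    response = table.get_item(Key={"user": user})
    visit_count = response.get("Item", {}).get("count", 0) + 1
    table.put_item(Item={"user": user, "count": visit_count})
    return {
        "statusCode": 200,
        "body": f"Hello {user}, you have visited this page {visit_count} times.",
    }

if __name__ == "__main__":
    # For the local run, set the variable ourselves before invoking the handler.
    os.environ["TABLE_NAME"] = "visit-count-table"
    print(lambda_handler({"user": "Pixegami local"}, None))
```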
Getting this into our Lambda function is really easy. I'm just going to copy the whole function; I don't need the local testing part, so I'll copy everything from the function itself, plus the imports at the top, and paste it into my Lambda console. So go back to the Lambda console, where we have our function with its lambda_handler; delete all of that, replace it with the one we've written in our local file, and press "Deploy" to save the changes. Now I can test it again. Let's look at our test event again: our user is called "Pixegami AWS". Let's see, when we run it, whether it's seen this user before and whether it can actually save the number of times we've called this function. I hit "Test", and we actually get an error. Well, that's expected, because we decided to read the table name from an environment variable, but we haven't set it anywhere on our Lambda function yet. So let's set that: go up to "Configuration", then "Environment variables", where we should have nothing yet. Click "Edit" and add one called TABLE_NAME with the value "visit-count-table", the table we created earlier. Hit save, go back to the code, click "Test" again, and we still have an error, but it's a different error this time.

The error says something like "An error occurred (AccessDeniedException) when calling the GetItem operation: user [...] is not authorized to perform GetItem on resource [this table]", and I can't read the rest of the message, but essentially it means that we never gave this Lambda function permission to read and write items in this table. We created the table, but by default, all AWS resources are secure: if you create them, they don't automatically come with permission to talk to each other or read each other's information. They're siloed resources. So we have to explicitly give this Lambda function the ability to see, edit, or add items in this visit count table. I wish it were really easy to do that, like an "add permission" button or something here, but if you look at the configuration, it's not obvious. If you go to "Permissions", there's a resource summary, and there's a policy statement section where we can add permissions, but this is a bit misleading: clicking there actually adds permission for other resources to use this Lambda function. That's not what we want. We want this resource to be able to do things to other resources, to the table specifically.

This is probably the most challenging part of the learning curve when working with Lambda, but the important thing to understand is that Lambda permissions are modeled through things called roles, AWS IAM roles. Every function has something called an execution role, which is the actor that runs the function, so whatever permissions we give to this actor, we essentially give to the function. In summary, to give this function access to the table, we have to grant this role, which was created for us automatically, the ability to use that table. We can do that by clicking on the role name, which opens a new window where we can edit its permissions. Once I've opened that up, I'm on a page where I can see the role, which was created specifically to run this function, and its permission policies. Its existing policies basically just let it write logs to CloudWatch. We want it to be able to add, edit, and read items from a DynamoDB table, so we're going to add permissions here. We can choose between attaching a policy or creating an inline policy. An inline policy gives us more control: we can specify exactly what it can do, which items it can read, which exact table it can access. Attaching a policy just picks from a preset, which is simpler, so for this video I'm going to do that. Click "Attach policies" and you'll find a list of all the pre-configured AWS policies. We're not going to browse this, because there are almost 40 pages of them; instead, search for "DynamoDB" and press enter, and you should see four policies related to DynamoDB. There's read-only, there's an execution policy, but what we actually want is full access. This is a little too permissive for just this Lambda function: we don't want to give it full access to every single table in the account, and that's not a great idea for production, but it will fix this function for the purpose of this tutorial. So we're going to attach it, and now you can see the policy has been attached.
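The console does this attachment for us with a couple of clicks, but for reference, the same step could be scripted with boto3. The role name below is hypothetical (copy the real execution role name from the function's configuration page); the policy ARN is AWS's managed DynamoDB full-access policy.

```python
import boto3

# Attach the managed full-access policy to the function's execution role.
# This is broader than the function strictly needs; fine for a tutorial,
# not ideal for production.
iam = boto3.client("iam")
iam.attach_role_policy(
    RoleName="visit-count-function-role-xxxxxxxx",  # hypothetical; use your role's name
    PolicyArn="arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess",
)
```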
The role has now been updated with this new AmazonDynamoDBFullAccess policy, so the function will be able to use the table. Let's go back to the function and run the test again, and this time it works. The message I get is "Hello, Pixegami AWS, you have visited this page one time". Let's click "Test" again, and you can see the visit count increasing: now it's two times, three times. So we can now actually use a Lambda function to save information that persists and isn't destroyed as soon as the Lambda function finishes executing. If we go back to the table, we can look at the items we've created as well: go to the table, hit "Scan" and then "Run", and it scans the table and returns a bunch of items. We can see that each of the users we've called this function with is now in our table, with the visit count recorded. If I go back to the Lambda function and keep running it a few times, it's up to five times for this user, and if I go back to my DynamoDB table and refresh it, you can see that the count has changed to five.

So that's pretty much it. That was a really quick tutorial on how to write data from AWS Lambda to a database. We used DynamoDB because it's the cheapest and simplest option for our use case, with really nice key-value tables like this. If you do want a SQL-like database, then I suggest looking at AWS Aurora, which is a more SQL-compatible, but still serverless, database. I've never used it, so I don't know much about it yet, but that's another option. From here, you should be able to build persistent app state or a more complex application with your Lambda function by getting it to talk to your database.

Now, if you've enjoyed this video and you want to take your skills a little further, I recommend checking out CDK as well. I have a video tutorial on that too, which I'll link in the description below. CDK lets you configure all of the stuff we clicked through today, the Lambda function itself, the runtime it uses, the environment variables, even the roles and the table, as code (a rough sketch of what that can look like is included at the end of this transcript). You write it up in a code file, and then you can deploy it, version it, and so on. It's just a much nicer way of setting up all of this infrastructure without having to log into the console and click it ourselves, because even though the console is really good for learning purposes and for doing things the first time, it's not really scalable if you want to manage very large systems of infrastructure. So I recommend checking out the CDK stuff next. Otherwise, I hope you found this useful, and thank you for watching.
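As referenced above, here's a rough sketch of what this whole setup might look like in CDK, assuming CDK v2 for Python. The construct IDs, the "lambda" asset folder, and the specific Python runtime are placeholders I've chosen, not something shown in the video.

```python
from aws_cdk import Stack, aws_dynamodb as dynamodb, aws_lambda as _lambda
from constructs import Construct

class VisitCounterStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # The table: same "user" partition key and on-demand billing as before.
        table = dynamodb.Table(
            self, "VisitCountTable",
            partition_key=dynamodb.Attribute(name="user", type=dynamodb.AttributeType.STRING),
            billing_mode=dynamodb.BillingMode.PAY_PER_REQUEST,
        )

        # The function: runtime, handler, code location, and the TABLE_NAME
        # environment variable all defined in code instead of the console.
        fn = _lambda.Function(
            self, "VisitCountFunction",
            runtime=_lambda.Runtime.PYTHON_3_9,
            handler="lambda_function.lambda_handler",
            code=_lambda.Code.from_asset("lambda"),  # folder containing lambda_function.py
            environment={"TABLE_NAME": table.table_name},
        )

        # Scoped permission: read/write on just this table, rather than the
        # account-wide full-access policy we attached by hand earlier.
        table.grant_read_write_data(fn)
```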
Info
Channel: pixegami
Views: 25,444
Keywords: aws, lambda, dynamodb, table, serverless, database, cloud, tutorial, python, coding, learn coding, learn programming, learn how to code, how to code, how to use aws, tech, tech bootcamp, fullstack, web development, programming, learn aws, coding tutorial, aws tutorial, python tutorial, learn python
Id: CjVPMocEECM
Length: 25min 12sec (1512 seconds)
Published: Thu Aug 25 2022