AWS Glue Tutorial for Beginners [FULL COURSE in 45 mins]

Captions
Hi guys, welcome back to the channel. For those of you who are new here, I'm Johnny Chivers, and in today's video we're going to do a zero-to-hero course on AWS Glue. We're going to cover AWS Glue in full using the console: the AWS Glue Data Catalog, AWS Glue ETL, AWS Glue dev endpoints, and AWS Glue triggers as well, by the end of this lesson.

A bit of housekeeping first. In terms of the AWS console, at the time of filming AWS are in the process of switching a few things around, and you'll notice that the way the console looks changes partway through the course. When I refer to AWS Glue jobs I'm actually talking about things contained in the legacy tab at the moment, and I'll put that on screen so you can see. At some point in the near future AWS are going to merge the legacy tab and the jobs tab back together, and everything I talk about will be available under jobs again. Depending on when you find this video those two things may be separate or they may already be one, but hopefully it's easy to work out from what I've just shown you.

In terms of actual course housekeeping: all the data I use is on my GitHub page, where you can go and download it. The slides are also available for free, but I've hosted those on Buy Me a Coffee; you just need to enter zero dollars, and you don't need any credit or debit card to sign up, so you can get those for free. It's just easier for me to host them there rather than on my website, to keep costs down. In terms of AWS, some basic AWS knowledge would help but it's not essential; you will, however, need to have signed up for an AWS account. AWS Glue does not come under the AWS free tier, but the entire course cost me 60 cents to do, so if you bear that in mind and make sure you switch things off when you're not using them, you should be roughly in the same ballpark. That varies by region and by how long you have resources up, but it's definitely worth the investment if you want to learn AWS Glue. With that being said, I would also appreciate a like and subscribe to the channel; it helps me continue to make these resources for free. I also put them all on my website, and if you want to go check that out, the link is in the description. Now, with all the housekeeping out of the way, let's get into the course. Join me on the computer and we'll do the first slide and start learning about AWS Glue hands-on using the AWS console.

In terms of the course overview, we're going to cover what AWS Glue is and why we use it. Then we'll do a quick bit of setup work on the AWS console: setting up an S3 bucket with a couple of folders and downloading the data from GitHub onto S3. Once that's done we'll cover the theory of the Glue Data Catalog and then look at the Data Catalog on the console: we'll discuss what an AWS Glue database is and create our database, we'll look at tables and create a table, and we'll discuss partitions and how they operate inside AWS Glue (we'll already have a partition on our data). We'll cover AWS Glue crawlers, we'll look at AWS Glue connections, and we'll look at AWS Glue jobs. Once we've got a job up and running we'll look at triggers and how triggers work in AWS Glue, and as a final bonus we'll look at AWS Glue dev endpoints and actually create an endpoint and a Glue script using that endpoint.

So, what is AWS Glue? AWS Glue is a fully managed ETL service. It consists of a central metadata repository, known as the Glue Data Catalog.
That catalog is fundamental to what's going on, and it sits right here in the diagram; as you can see, it's central to everything. Glue also has a Spark ETL engine, which is completely serverless, and it has a flexible scheduler. Looking at the diagram, the Glue Data Catalog sits in the middle with connections, tables, settings, and transformations, and out of the Data Catalog you can create jobs. More precisely, jobs can use the information (i.e. the metadata) in the Data Catalog to create scripts, those scripts can be scheduled, and they move data from point A to point B. You can access the Glue Data Catalog through the AWS Management Console, which is how we're coming in today to do the majority of this work, but you also have the command line as usual.

So why do we use AWS Glue? AWS Glue offers a fully managed, serverless ETL tool. This removes the overhead and the barriers to entry when there is a requirement for an ETL service in AWS; in other words, it's the native AWS ETL service. If you're looking to move data around AWS in a manner where you don't have to manage the infrastructure, then AWS Glue is the tool for you.

OK, setup work. We're going to create an S3 bucket, create a couple of folders under it, upload some data, and then create an IAM role that we can use with AWS Glue for the remainder of this course. So let's log into the AWS console and I'll see you there.

OK guys, that's me logged into the AWS console. We want to go to S3, so type "s3" into the search menu and go to S3. Once there, we want to create a new bucket and give it a name; the name has to be unique across the entirety of AWS, so I'm just going to name mine after myself plus "full course", because hopefully no one else has my name. Accept everything else as the defaults, and create the bucket. Once the bucket is created, click into it, and we need to create a couple of folders. Click Create folder: the first one we'll call data, and that's where we're going to put our data. The next folder we'll create is called temp-directory; this is where the temporary files that Glue creates and needs to persist will live, and we'll supply it as a parameter at runtime. In other words, it's the directory Glue is going to use as a temporary location. The last one we'll call scripts; this is the location where Glue will persist the scripts that we create in it, and we'll supply this again when we're in Glue.

Now let's go into data and create a new folder called customers_database. What I'm doing here is giving the folder a physical name that represents the logical interpretation: because this is called "database", this folder is where our customer database artifacts will sit. In other words, this is just going to be our database folder. Under that we'll create another folder called customers_csv, which is going to be our customers CSV table, and we'll create that. Then we'll click into that one more time and create another folder for dataload, which is going to be our data-load date. Give it the date that you're on currently; I'm keeping it all as integers, year, month, day put together (yyyymmdd), and I'm on 2021-12-30, so that's 20211230. Create that folder, and let's open it up.
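If you'd rather script this setup than click through the console, a rough boto3 equivalent is below. It's only a sketch: the bucket name is made up (yours must be globally unique), and the Hive-style dataload=20211230 prefix is one way of naming the date folder so that a crawler later picks up dataload as a named partition column; adjust it to match whatever you actually create.

    import boto3

    s3 = boto3.client("s3")
    bucket = "my-glue-full-course-bucket"  # assumption: pick your own globally unique name

    s3.create_bucket(Bucket=bucket)  # outside us-east-1 you also need a CreateBucketConfiguration

    # S3 has no real folders; zero-byte keys ending in "/" stand in for them.
    for prefix in (
        "data/",
        "temp-directory/",
        "scripts/",
        "data/customers_database/",
        "data/customers_database/customers_csv/",
        "data/customers_database/customers_csv/dataload=20211230/",
    ):
        s3.put_object(Bucket=bucket, Key=prefix)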
The next thing we need is the data. Follow the link in the description, go to the GitHub repo, and there's a customers.csv file sitting right there. If we click into it quickly, you'll see it's a customer CSV file with customer id, title, first name, middle name, last name, suffix, and a lot more data than that as well: if I scroll across there are password hashes, phones, emails, everything we need for the purposes of this demo. Back out, go to Code, and download it as a zip. Double-click the zip once it's downloaded and open it from your downloads folder so it's ready to go. Back on AWS, click Upload, go to Add files, go into the folder, grab customers.csv, and click OK, then click Upload. As you can see, customers.csv has landed in the folder, so that's the setup work in terms of S3.

The next thing we need to do is create an IAM role, a service role that we're going to use throughout for Glue. Go to IAM (I'm just going to open it in a new tab so I have it somewhere else; you can use the same tab if you want), go to Roles, and click Create role. We want an AWS service role, and we're looking for Glue; it's in alphabetical order under G, so click Glue, and then Next: permissions. Let's be very bad and just give it admin access, because we're creating buckets and other things. The default Glue policy won't work out of the box because it has a resource constraint on bucket names, so to keep things simple I'm just going to use admin access. That's bad practice in production, but it's OK for the purposes of learning, and it's easier for all of us following along to just have all the permissions. Leave the tags, we don't need those, and I'm going to call this one glue-course-full-access and put "delete" on the end so I remember to delete the role once I'm done with it. Then let's just check it exists: glue-course-full-access... I typed it slightly wrong there, but it'll be OK, glue-course-full-access-delete. OK, so that's all the setup work complete.
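For reference, the same service role could be created programmatically. This is only a sketch: the role name follows the one used in the video, and attaching AdministratorAccess is purely a tutorial shortcut, not something to copy into production.

    import json
    import boto3

    iam = boto3.client("iam")
    role_name = "glue-course-full-access-delete"  # "-delete" is just a reminder to clean it up later

    # Trust policy that lets the Glue service assume the role.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "glue.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }

    iam.create_role(RoleName=role_name, AssumeRolePolicyDocument=json.dumps(trust_policy))

    # Admin access keeps the tutorial simple; do NOT do this in production.
    iam.attach_role_policy(
        RoleName=role_name,
        PolicyArn="arn:aws:iam::aws:policy/AdministratorAccess",
    )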
Let's jump on, do a bit more theory, and then we can get hands-on with AWS Glue on the console itself. The first thing we're going to take a look at is the AWS Glue Data Catalog. As I said earlier in the video, this sits at the centre of everything AWS Glue, and at its heart (I've actually pulled this out away from the bullet points on the slide) it is a persistent metadata store. That's really important: its fundamental job is to act as a persistent metadata store. It's a managed service that lets you store, annotate, and share metadata, which can be used to query and transform data. The big question is: what is metadata? Over on the right-hand side I've put some examples: it could be data location, schema, data types, data classification. When we jump onto the console in just a second, this will be broken down into databases, tables, connections, and even classifiers, so you'll see the Glue Data Catalog in action, but at its heart, in theory, it is a persistent metastore of data. There is one per region per AWS account; that's a good question that comes up on the exams a lot, so remember: one metastore per region in an AWS account. You control access to the metastore and the resources it holds through IAM, and you can use it for data governance: you can annotate it and say this data is sensitive, or this data is classified, or this data is incomplete, and you can give things descriptions and names and so on.

So let's jump onto the console and look at where the AWS Glue Data Catalog actually is. Back on the console, type in "glue", and once it's loaded you can see the Glue Data Catalog here on the left-hand side. It's all of these things: it's Databases, and within databases you have Tables and Connections; it's Crawlers, and inside crawlers you have Classifiers; you've got Schema Registries, which we won't be touching in this course (they're used mainly for streaming data; in terms of learning Glue 101 and what you fundamentally need to know, schema registries come after that); and then you also have Settings, which is about encrypting metadata and things like that. So the Data Catalog, even though we say it's a persistent metastore at its heart, is databases, tables, connections, crawlers, classifiers, and settings. Inside Databases you start with the default database, which is just the one AWS creates for you.

Let's jump back into the theory, take a look at what databases and tables are, and then we'll create a database and use a crawler to create a table. We'll do this in a couple of different parts to break up the theory and the practical. So, what is a database? An AWS Glue database is a set of associated Data Catalog table definitions organised into a logical group. It's just a name that we give to a database, represented by the database symbol, that we place tables in, so tables belong to a database. And let's not forget, fundamentally, that the data behind the tables all resides in its original location. The AWS Glue Data Catalog is just a metastore of information belonging to that data; it doesn't actually move the data anywhere, and the data always stays where it is. A database is just a way of organising tables into logical groups; you're not physically moving the tables or the data anywhere, it's just a naming convention to logically organise those tables.

Now let's go and create a database. On the console, under Databases, we want to add a database and give it a name. If you remember back on S3 (I'll just go there quickly on the other tab), inside the bucket we created a folder called data, and under it a folder called customers_database; all the tables belonging to this database, i.e. the folders below it, will sit inside this folder. It's worth bearing in mind that you don't actually have to use a folder for your database; you can leave that part blank when asked and build your tables from anywhere, but I find it easier to keep things logically and physically located the same: the database folder first, and then the next directory down is the tables. Back on the console, we need to give it a name, so it's going to be customers_database. I should also add: always use underscores, because Spark needs underscores for its engine. It will convert hyphens to underscores for you in certain places, and in other places the Glue Data Catalog won't let you enter them at all, so just make sure everything is in lower-case letters and underscores; that's really important. For the location, we already know what it is because I copied it in, and for the description: "This is a database of customer information." OK, create, and you can see that our customers database has been created.
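The same database could also be created through the Glue API. A minimal sketch, assuming the bucket name from earlier:

    import boto3

    glue = boto3.client("glue")

    glue.create_database(
        DatabaseInput={
            "Name": "customers_database",
            "Description": "This is a database of customer information",
            # Optional: points at the folder we created; bucket name is the assumed one from earlier.
            "LocationUri": "s3://my-glue-full-course-bucket/data/customers_database/",
        }
    )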
If I click in, we get a little location here; as I said, it doesn't really mean anything to Glue, but it makes more sense to create a folder as your database and then place your tables under that. OK, now that we have the database set up, we're going to take a look at AWS Glue tables. As I mentioned before, a database is just where we logically group tables, and a table, by definition, is the metadata definition that represents your data. The data resides in its original store; the table is just a representation of the schema. This is really important: the data always stays where it has always lived, and the Glue Data Catalog is just a representation of that data in schema form. If we quickly jump onto the console and go to the S3 bucket: when we add this data to the Glue Data Catalog, the data is still in S3, and it doesn't go anywhere. Similarly, if we were to scan a database, for example, and add it to the Glue Data Catalog, the data still resides in that database; we're only bringing the schema information across into the catalog. The data itself always sits where it lives.

Back on the presentation, there are a couple of concepts we need to understand before we add these tables. Partitions are really important. Partitions are folders where data is stored in S3: physical entities which are mapped to partitions, which are logical entities. Physical entities mapped to logical entities. What that means, if we jump back onto the console and take a look at our customers CSV: as I mentioned before, the customers_database folder is going to be our database, and inside it we'll add another couple of tables before we're done. Right now we have the customers_csv folder, which is going to become a table in the Glue Data Catalog, and underneath it we have partitioned by dataload. This is our physical partition, as a folder: we'll get a dataload column, and the value for all the data under this folder will be 20211230. If we were to carry on and add more data-load folders with more data, then a query that filters on the data-load date would only look at the data in the specific folder for that query.

OK, an example, straight off the AWS website. If we take a look at what's going on here, you can see they're using a sales table where they have partitioned the data into three levels of folders below it: year, month, and day. The way this works: if we were to write a query and we wanted the data where the year was 2019, the month was February, and the day was 1, you'd write the query to say where year equals 2019, where month equals February, and where day equals 1. The query would come into the sales table, go into the year 2019 folder, go down into February, then go down into day 1 and search the two files sitting there. It knows to exclude all the other data because of the way the partitions are laid out. So partitions, which are just folders in S3, help us query the data more efficiently, because the query only goes down into the folders that are specified.
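To make that concrete, here is a hedged sketch of what such a pruned query could look like, run against Athena through boto3. The sales table, database name, and output location are all hypothetical stand-ins for the AWS example, not objects created in this course.

    import boto3

    athena = boto3.client("athena")

    # Hypothetical query against the partitioned sales table from the AWS example.
    # Because year, month, and day are partition columns, Athena only reads the files
    # under sales/2019/Feb/1/ instead of scanning the whole table.
    athena.start_query_execution(
        QueryString="SELECT * FROM sales WHERE year = '2019' AND month = 'Feb' AND day = '1'",
        QueryExecutionContext={"Database": "my_sales_database"},  # assumed database name
        ResultConfiguration={"OutputLocation": "s3://my-glue-full-course-bucket/athena_results/"},  # any S3 path you own
    )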
Right, one last thing to cover before we create a table, and that's AWS Glue crawlers. A crawler is just a wee program that AWS have written which goes in, looks at all the data in S3 (or in databases), and infers the schema for us, saving us from having to manually enter the table schema into the AWS Glue Data Catalog. I'm going to show you how you would manually enter the schema right now, but what we're actually going to do is stop that process halfway through and add the table using a crawler instead. So let's jump onto the AWS Glue console and take a look.

Right, back on the console. What we want to do is go Databases, customers_database, then Tables in customers_database, and now we're in the tables section of our customers database; obviously we don't have any tables yet. As I mentioned, if you go to the left-hand side there and click Add tables, you have three options: you can add tables using a crawler, add a table manually, or add a table from an existing schema if you have one. I'm going to quickly show you how you can add a table manually using the console. The first thing you need to do is enter a name; we'll call this one customer_csv. Select the database, which is the customers database, and hit Next. It's stored in S3, and you specify the path in the account where the data resides: into the S3 bucket, into data, into the database folder, and it's customers_csv. Click Next, select that our data is in CSV and our delimiter is comma, and hit Next. Now you must manually add the columns themselves. As I mentioned, we have a partition column; if we jump back in, our partition column is called dataload, so you would manually click to add it, call it dataload, give it the actual data type, which is smallint, and add it. Then you add the next column. If you quickly jump back onto the GitHub and click into the customers.csv file, you'll see a table representation of what's going on. So the next one would be customer id: paste it in, change it to lower case with underscores, it's a smallint, and add it. Then the next one would be name style, so copy and paste name style in; if we have a look at it, it's a boolean data type, so we'd go and find boolean and add that. Then you'd add another column, and this time it's title, so you'd take title, change it to lower case, it's a string, and add it. You get the picture: you would go through every single one of these columns, the whole way across, and add them in manually. It takes a bit of time, so rather than doing that, AWS have actually added a thing called a crawler. Let's exit out of this without saving.
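Before we move on to the crawler: for completeness, that manual route can also be scripted. The sketch below only fills in the first few columns and types the dataload partition as a string (the video uses smallint on the console); the bucket name is the assumed one from earlier.

    import boto3

    glue = boto3.client("glue")

    # Only the first few columns are shown; the real customers.csv has many more.
    glue.create_table(
        DatabaseName="customers_database",
        TableInput={
            "Name": "customers_csv",
            "TableType": "EXTERNAL_TABLE",
            "Parameters": {"classification": "csv", "skip.header.line.count": "1"},
            "PartitionKeys": [{"Name": "dataload", "Type": "string"}],
            "StorageDescriptor": {
                "Columns": [
                    {"Name": "customerid", "Type": "smallint"},
                    {"Name": "namestyle", "Type": "boolean"},
                    {"Name": "title", "Type": "string"},
                ],
                "Location": "s3://my-glue-full-course-bucket/data/customers_database/customers_csv/",
                "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
                "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
                "SerdeInfo": {
                    "SerializationLibrary": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
                    "Parameters": {"field.delim": ","},
                },
            },
        },
    )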
Back here, to add a table using a crawler, go Add tables, then Add tables using a crawler. Give the crawler a name; I'm going to call it crawler_customer_csv, and click Next. Data stores, crawl all folders, S3, and for the include path go and find that CSV table: it's in data, then the database folder, then customers_csv. Select it and go Next. Add another data store: no. Use an existing IAM role, which is that Glue service role we created at the start (make sure you have this; if you haven't, go back to the setup section and follow the steps to get it added). Frequency: run on demand. For the database, we're going to put this inside the customers database. Click Next, and that's perfect; hit Finish, highlight it, and then click Run crawler. This will take a couple of minutes to start up, run over the data, and find the tables, so I'm just going to pause the video here and we'll pick it up when it's done.

OK, as you can see, that took 48 seconds to run in total, and one table was added. Let's take a look at the table itself. We'll go in the long way: Databases, customers_database, then Tables in customers_database, and you can see that the table has been added; it was updated on the 1st of January 2022 at 4:33 a.m. As you can see, we have the table name based on the folder name in S3; it used the folder name to infer the table name. If we click into it, you'll see very quickly that we have some information: this is the location of where the data is; it has recognised that it's CSV right off the bat; it has put it in the customers database; it has recognised that the delimiter is a comma and that it needs to skip the first line; and it has checked a few other things, like the ordering of columns and whether columns are quoted, etc. Most importantly, it has picked out dataload as the date partition, and it has inferred all the other column names. As I said, it needs them in lower case for the Spark engine, so it has done that, but it hasn't done anything else; it hasn't tried to separate names out, it has just taken them as-is and made them lower case, and that's fine for what we're trying to do. One of the things to do before we go on is to actually view the partitions: if you click View partitions, you can see that it has picked up the partition, and again, that's really important. So that's the first part of the data that we can see.
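The same crawler could be defined and kicked off through the API. A rough sketch, reusing the role and bucket names assumed earlier:

    import boto3

    glue = boto3.client("glue")

    glue.create_crawler(
        Name="crawler_customer_csv",
        Role="glue-course-full-access-delete",    # the service role created in the setup section
        DatabaseName="customers_database",        # where discovered tables are written
        Targets={"S3Targets": [{
            "Path": "s3://my-glue-full-course-bucket/data/customers_database/customers_csv/"
        }]},
    )

    glue.start_crawler(Name="crawler_customer_csv")   # equivalent of clicking Run crawler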
The next thing you want to do is actually look at the data. To do that, we go to Amazon Athena: click in the search bar at the top and go to Athena. I'm going to open that in a new tab, with the new Athena console experience. Let's click on the left-hand side and go to the query editor. If this is the first time you've been in Athena, you'll have to set up a few things: before you can run a query you need to set a query results location in S3. Go back into our bucket, up to the full-course bucket, and I'm going to create a folder here called athena_results and save it. Then jump back into Athena; that's the folder we're going to use to save the results. Go to Settings, Manage, browse S3, go to that Glue full-course bucket, click on athena_results, choose it, and save. Brilliant. Then go back out to the editor, make sure you're on the AWS Glue Data Catalog and the customers database, and if you click the little three-dot menu next to the table and click Preview table, it'll run a SELECT with LIMIT 10 and bring back the data for us. If we have a little look, you can see that all the data is there and present. Excellent.

OK, the next thing we're going to look at is connections, which we'll cover very quickly (you can skip this section if you're not interested in it), and then we're going to look at actually creating AWS Glue jobs and doing some ETL on this data. So, AWS Glue connections. They are part of the Glue ecosystem; they're useful, but not really critical to getting started, so this is good-to-know information for future development. A connection is just a catalog object that contains the properties required to connect to a particular data store. In other words, it's a connection string with usernames and passwords saved, so you can just reference the connection object rather than having to write out the big long string every time. If we quickly jump onto the catalog and go to Connections, then click Add connection: you give it a name, you choose your connection type depending on what you're doing (there's JDBC, RDS, Redshift, DocumentDB, and so on; click Redshift or RDS or whatever you're using, and I'm going to pick Aurora), then you click Next, fill in the connection information as it sits, and click Next again. I'm going to leave it there, because you do have to have a database set up in the background for it to work. I do have one, but setting up databases is beyond the scope of this course; there are other videos on this channel that take you through that process and through adding a database to the Glue Data Catalog, so feel free to go and look at those, where I cover this in a lot more depth. That's all you need to know for now: essentially you add a connection object to the Glue Data Catalog like this, and then you reference that object in your code rather than specifying the username, password, and connection string every time you need to reach out to that data source.
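To give a rough idea of what that catalog object holds, here is a hedged boto3 sketch of creating a JDBC connection. Every value here (name, URL, credentials, subnet, security group) is a placeholder for illustration, not something from the video.

    import boto3

    glue = boto3.client("glue")

    glue.create_connection(
        ConnectionInput={
            "Name": "my_jdbc_connection",
            "ConnectionType": "JDBC",
            "ConnectionProperties": {
                "JDBC_CONNECTION_URL": "jdbc:mysql://my-db-host:3306/mydb",
                "USERNAME": "admin",
                "PASSWORD": "example-password",
            },
            # Networking details so Glue can reach the database inside your VPC.
            "PhysicalConnectionRequirements": {
                "SubnetId": "subnet-0123456789abcdef0",
                "SecurityGroupIdList": ["sg-0123456789abcdef0"],
                "AvailabilityZone": "us-east-1a",
            },
        }
    )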
OK, so next we're going to take a look at AWS Glue jobs, and this is the crux of the ETL work. A Glue job is composed of a transformation script (which we'll take a look at in a second), data sources (which we've already added), and data targets (which we'll look at in a second too). Job runs are initiated by triggers, which can be scheduled or triggered by events, and that event can also be you manually running the script, which is what we're going to do. The first thing we'll do is jump onto the console and take a look at the script; AWS actually automates some of the creation of the scripts and the different parts we need for these Glue jobs.

Back on the AWS console, in Glue, this time we want to go to ETL and then Jobs, and once on Jobs we want to click Add job. We're going to call this customer_csv_to_parquet. The IAM role is the Glue role that we have been using. This is just starting to set up the job under job properties; leave everything else as default for the job type and the Glue version. For the path where the script is stored, we actually set this up previously: go down into our S3 bucket and into scripts (we're just going to store it in scripts) and add that. Then the temporary directory is the temp-directory we also set up: glue full course, temp-directory, and add that. Leave everything else as default and click Next. Our data source is our source table, so click Next; then we want to do a change schema, so click Next; and we want to create tables in our data target. The data store will be S3, the format will be Parquet, we don't have a connection, and we need to put in a target path for a bucket. So back in, and this time it's full course, data, inside the customers database, and click Select. Now, we're actually going to add something more here, because we want a new table: back into the S3 bucket, back into data, back into the customers database, and we want to create a folder right here called customer_parquet, because that's going to be the name of our table. Create that folder, then go back to the console, back into the target path, back into glue full course, back into data, back into the database, and choose customer_parquet, then click Next. As you can see, it has mapped the schema from our source to our target; that's perfect, and we want to click Save job and edit script.

As you can see, Glue quite handily creates a script for us; we don't have to do anything in this case. What the script is doing is taking our CSV data and changing it to Parquet. There's nothing particularly complex, but there are a couple of things I'll point out on screen so you know what's going on. Glue uses a dynamic frame to lift data from the catalog: you can see here it's lifting from the database called customers_database and the table that is our CSV, and you'll see further down that it writes out using a dynamic frame to that S3 location we specified, in Parquet format. So if we follow this code through: it lifts the data from the Data Catalog entry pointing at that S3 location; it maps the column names (it actually keeps them the same, but that's what this step is doing; it's that mapping screen from before); it puts the result through a resolve-choice step; it drops any fields which are null; and at the end it writes down to the S3 location we specified, in Parquet. Brilliant. You want to save, and then run the job: click Run, click Run job, then click the X at the top right. You can go onto the Jobs page, highlight the job, and you'll see that it's starting up. This will take a couple of minutes and cost a few cents, so I'm going to pause the video here and we'll pick it up once it's done.

OK, as you can see, the job has succeeded right here; it took 49 seconds in total to run, and it has put the job output into that S3 location we created, so we can go and view that data. If we go to S3, go to the bucket we've been using this entire time, go into data, into the database, and into customer_parquet, you can see that it has actually created a Parquet file that holds our data.
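For reference, the script Glue generates for a job like this follows a standard pattern. The sketch below is roughly what it looks like, with the database, table, and path names taken from this walkthrough, the column mappings abbreviated, and the caveat that the exact code Glue generates for you may differ slightly.

    import sys
    from awsglue.transforms import ApplyMapping, ResolveChoice, DropNullFields
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    sc = SparkContext()
    glueContext = GlueContext(sc)
    job = Job(glueContext)
    job.init(args["JOB_NAME"], args)

    # Lift the CSV data via the Data Catalog entry the crawler created.
    datasource0 = glueContext.create_dynamic_frame.from_catalog(
        database="customers_database",
        table_name="customers_csv",
        transformation_ctx="datasource0",
    )

    # One-to-one column mapping, as set on the map-columns screen (abbreviated here).
    applymapping1 = ApplyMapping.apply(
        frame=datasource0,
        mappings=[
            ("customerid", "smallint", "customerid", "smallint"),
            ("firstname", "string", "firstname", "string"),
            ("lastname", "string", "lastname", "string"),
            # remaining columns omitted for brevity
        ],
        transformation_ctx="applymapping1",
    )

    resolvechoice2 = ResolveChoice.apply(
        frame=applymapping1, choice="make_struct", transformation_ctx="resolvechoice2"
    )
    dropnullfields3 = DropNullFields.apply(
        frame=resolvechoice2, transformation_ctx="dropnullfields3"
    )

    # Write the result down to the target S3 location as Parquet.
    glueContext.write_dynamic_frame.from_options(
        frame=dropnullfields3,
        connection_type="s3",
        connection_options={"path": "s3://my-glue-full-course-bucket/data/customers_database/customer_parquet/"},
        format="parquet",
        transformation_ctx="datasink4",
    )

    job.commit()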
You may say, hey, I want to see that data in Athena again; how do we do that? Well, that's simple. We go back to the Glue console: Databases, customers_database, Tables in customers_database, Add tables, Add tables using a crawler, and we'll call this one customer_parquet. Go Next, data stores, crawl all folders, S3, leave that as default, and for the include path: go back out onto that folder, up one level to the customers database, and it's the customer_parquet folder we're after. Copy the URI with the actual folder ticked (make sure you're on the Parquet one), paste it in, and go Next. Add another data store: no. IAM role: use the existing role, that full-access service role we've been using the entire time (and which we must remember to delete). Schedule: on demand. Output database: customers database, and don't prefix anything. Hit Next, hit Finish, highlight it, and run the crawler. This should take a couple of minutes to find our table, so I'm going to pause here and pick it up once the crawler has run.

OK, that took 48 seconds... no, sorry, apparently that took three minutes, and the table has now been successfully added to the customers database. If I click customer_parquet you can see that it picked up the entire file, just like we saw earlier, and this table is now available in Athena to be queried. So click on Athena, go to the query editor, AWS Glue Data Catalog, customers database, and this time it's the Parquet table we want to see; if you right-click and generate a preview-table query, you'll end up with the rows, as you can see in the table. So our job has run successfully, we've transformed the data, I've created a new crawler and placed it over the top of the Parquet data, and then using Athena we can query that data. That's how simple it is to use AWS Glue to change data from CSV into Parquet.

OK, so the next thing we're going to take a look at is AWS Glue triggers. A Glue trigger initiates an ETL job, and triggers can be defined based on a scheduled time or an event. Simply put, triggers are a way that we can have Glue run ETL jobs without having to manually click the run button. Back on the console, if we go to Triggers on the left-hand side, we'll take a look at a scheduled trigger first. Go Add trigger and give the trigger a name; I'm going to call this one scheduled_daily_8pm, and, you guessed it, we're going to run it on a schedule. It's going to be daily, so it'll run every day at 8 p.m., so let's set it to 8 p.m. If you have a look here, you can run it hourly or daily and choose the days, weekly, monthly, or even on a custom schedule; with a custom schedule you have full ability to make it run at whatever time and whatever frequency you want using cron syntax. Let's just keep this daily and hit Next, and it asks which jobs we want to run at 8 p.m. every day; I'll say the only job we have. Then go Next, review the steps, and that's it done: this job will now run every day at 8 p.m. The only thing you need to do is enable it, so right-click and then click Enable trigger, and that's it activated. If you want to deactivate the trigger, click it and you'll have the option to disable it, and it will no longer run every day. So you can basically pause and activate the triggers themselves once you have created them.
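Scripted, a scheduled trigger like that looks roughly like the sketch below. The trigger and job names follow the ones used in the walkthrough, and note that Glue cron schedules are expressed in UTC, so adjust the hour for your timezone.

    import boto3

    glue = boto3.client("glue")

    glue.create_trigger(
        Name="scheduled_daily_8pm",
        Type="SCHEDULED",
        Schedule="cron(0 20 * * ? *)",               # 20:00 UTC every day
        Actions=[{"JobName": "customer_csv_to_parquet"}],
        StartOnCreation=True,                        # same as enabling the trigger on the console
    )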
Now, the next type of trigger we want to look at is an event trigger; that's where an event happens and then your job is kicked off. But to do that we need to create another Glue job, so back into Jobs, down on the left-hand side, Add job. We'll call this one customer_job_two, to keep it simple. Find that Glue role that we need to delete at the end, keep it as a Spark job, and change the script location to the same one we had for the other job: glue full course, scripts, and in that one (I'm just going to be lazy and copy and paste the path here). Then come down and find that temporary directory we're looking for; again, it's in that bucket from earlier, temp-directory. Leave everything else as default. We'll use the same data source again, the same change-schema step again, and create a table in our target: Amazon S3, Parquet, and go and find the same location that we used for the first job (we're not actually going to run this job, it's just for the purpose of looking at the trigger): data, database, and then customer_parquet. Next, a one-to-one column mapping, so all the columns map straight across again; nothing special, exactly the same as last time. Then we just save the script and exit; we don't need to run it this time. So now we have a second job.

Back in Triggers, and a bit of a bug of the console I find: if you don't refresh the trigger page it won't pick up the new job that we need, so just do a quick refresh. Go Add trigger, and we're going to call this one event_trigger, keeping it simple. Trigger type: job events this time, watching the customer_csv_to_parquet job, on succeeded. So when our first job, the one we created first, succeeds, we're going to ask it to do something, and that something, when I hit Next, is to run the second job. When the first job succeeds, kick off an event and run the second job. Leave everything else as default, tick this box this time so it's automatically enabled, and click Finish. As you can see, the trigger parameters say: when customer_csv_to_parquet finishes successfully, run customer_job_two. It's as simple as that for triggers. That's how we get a daily or scheduled trigger running on whatever time frame we want, and an event trigger where we can build dependencies. You can see that Glue scheduling, meaning its triggers, is a very powerful thing indeed; I advise you to go in and play around with the different types of events, triggers, and schedules and get used to the interface itself.
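The same dependency can be set up through the API as well. A sketch, again using the job names assumed from the walkthrough:

    import boto3

    glue = boto3.client("glue")

    glue.create_trigger(
        Name="event_trigger",
        Type="CONDITIONAL",
        Predicate={"Conditions": [{
            "LogicalOperator": "EQUALS",
            "JobName": "customer_csv_to_parquet",   # the job being watched
            "State": "SUCCEEDED",
        }]},
        Actions=[{"JobName": "customer_job_two"}],  # the job to kick off when it succeeds
        StartOnCreation=True,
    )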
OK guys, the last thing I want to talk about is AWS Glue dev endpoints. A development endpoint is an environment that you can use to develop and test AWS Glue scripts; it's essentially an abstract cluster, and it allows us to test scripts from local machines or terminals without having to create the jobs we've been creating this entire time. It's a way that you can write code locally, send it up to Glue, and get a response back. A word of warning: it is very expensive. As an individual you're looking at roughly fifty dollars a day; I think we're running one in my work environment right now and we're paying about a thousand dollars a month for it. So as a personal thing it's quite expensive if you don't turn it on and off, but for companies, if you've got 10 or 12 developers using the endpoint, the amount of time they save is worth it. On the Glue console itself, if we just jump back on, they're located down here, and you can add an endpoint with this button. I do have a video on the channel that covers this in depth, far more than you need to know for an AWS exam; all you really need to know for an AWS exam is that they exist. I've put the link on screen now if you want to actually go and set up your own dev endpoint, and I include PyCharm Professional in that video, so if you're using Glue in a work environment, that's one way you could interact with it within your company. Go watch that video if you want to check it out.

Well guys, that's the end of the course. I hope you really enjoyed this AWS session where we went from zero to hero in AWS Glue in under an hour: we've looked at the Glue Data Catalog, we've looked at ETL jobs, we've looked at triggers, and we've even looked at dev endpoints themselves. The next steps I would advise are to go back over the course material, play around by yourself in AWS, and then look at all the resources on YouTube; I have more videos on my channel alone that cover AWS Glue in depth, so you can start there or branch out to other learning resources as you like. Please like and subscribe to the channel, because it really helps me keep producing these resources for free. I've also put a link to my website in the description, where there's plenty more learning material as well. Until next time guys, thanks for watching.
Info
Channel: Johnny Chivers
Views: 232,340
Keywords: aws, aws glue, aws etl, cloud etl
Id: dQnRP6X8QAU
Length: 41min 29sec (2489 seconds)
Published: Sun Feb 20 2022