FULL SQL DATABASE COURSE | Learn SQL in 70 minutes

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hey my name is melchan and in this tutorial you'll learn everything you need to know to get started with SQL which stands for structured query language we'll cover some theoretical knowledge first around databases then we will install our very own database management system then we'll create a database within there some tables within this database and then finally we'll query these tables to get insights from them in this course we'll be using postgresql one of the most popular relational database Management Systems out there it is completely free you can download it for Windows and for Mac if you stick around till the end of the course you'll even be able to write a query that will get you the 30-day moving total all of the resources are in the links in the description below so there's no need for you to type out any code make sure just to follow along and engage with the course and understand everything that I'm going through to be honest mostly I'll just be copy and pasting the code in as well as I'll be spending my time explaining everything to you rather than typing out everything so without further Ado let's get to it okay so first of all I created a simple course outline here so we can go through what exactly you'll be learning so in the first two sections in the databases queries tables and case sections we're gonna go through some theoretical knowledge first and then we move on to the Hands-On bit so in Section 3 we'll create some tables then in section four we're gonna insert into these tables then in Section 5 we're going to move on to some basic select statements and we're also going to explore some where Clauses with and and or now move it on into section six we're going to use the in not in is null and between operators and then in section 7 we move on to order by limit distinct and we're also going to rename some columns then in Section 8 we're going to use the extract function to extract date information from the date columns and then in section 9 we're going to move on to string manipulation so we'll use the upper lower length and trim functions then in section 10 we're gonna use concatenation Boolean expressions and we're also going to introduce wild cards so then moving on to section 11 continuing on the theme of string manipulation we're going to use the substring position and coalesce functions and then we're going to move on to some numerical functions so we'll use the Min the max the average the sum and the count now in section 13 we're going to use a group by and a special where statement the having that you must use when you use Group by now moving on to section 14 we're going to use case statements and we're also going to combine case statements with the group by and we're also going to use these case statements for transposing our data then moving on to section 15 we're going to start working with multiple tables using joins and unions and then finally in section 16 we're going to move on to something a little bit more complicated in the form of sub queries so first of all let's see what databases are what a database management system is and what the two main types of databases are so when you think of a database just think of any collection of related information and a database management system is essentially just a software application or software program that can help you create and maintain these databases now crude is going to be a very popular acronym that you may hear and it stands for create read or retrieve update and delete the main operations of a database but obviously these databases can help you with security backups importing and exporting data as well the two main types of databases are relational databases and non-relational databases so first I'm going to cover relational database Management Systems which is what this tutorial is on so these can help you organize data into tables each table has rows and columns pretty much like a spreadsheet so some popular ones would be postgresql which is what I'll cover in this tutorial or MySQL or Microsoft SQL Server now SQL is the standardized language for interacting with relational database Management systems or rdbms and it stands for structured query language SQL is used for crude operations admin tasks backups security user management database access so on and so forth so SQL code used in one or dbms may not always be transferable to another one but given that they're so similar it's say for example if you can code in postgresql you'll be able to code in MySQL as well so let's quickly look at two tables here I have the employees table on the left and then the shops table on the right you can see that within the employees table we have the employee ID column which is the unique identifier so if I look at the first row here employee ID 501 559 this employee ID uniquely identifies Carson Moscone if I look at the shops table on the right you can see that the coffee shop ID is the unique identifier so if I look at coffee shop ID one this is gonna uniquely identify the coffee shop with the coffee shop name Common Grounds now you can see that both of these tables are organized into rows and columns much like a spreadsheet so moving on to non-relational database Management Systems which stand for nosql or not just SQL you can organize data into anything but traditional tables so think of key value stores jsons xmls graph some popular ones would be Apache Cassandra mongodb or dynamodb there is no set coding language so most non-relational database Management Systems they Implement their own language Apache Cassandra for example uses the Cassandra query language so let's quickly look at some examples you see a Json file here and then we have a graph database on the left it's actually a neo4j database which is quite popular and then we see some key value stores as well so let's move on to queries and queries are essentially just requests made to the database management system for specific information that you want to retrieve so the more complex the database structure the more difficult it is to request this information so I created a very simple illustration here of a YouTube database management system now clearly I don't work at YouTube and the formula is much more complicated than this but at its very core when you interact with youtube.com youtube.com then in turn interacts with the YouTube database management system to perform crude operations and it's that simple at its core obviously so let's move on to tables and keys so like I mentioned before tables are split into rows and columns each row refers to an individual entry a single observation and then each column refers to a single attribute so if we look at the shops table here on the bottom left you can see that the coffee shop ID column is a single attribute and obviously it'll refer to the coffee shop ID you can see that the table is organized into rows and columns so the first row would contain coffee shop ID one coffee shop name Common Grounds and City ID 1. now these tables must have a primary key so this is a special column about an entry that uniquely identifies it so if we look at the shops table this special column is going to be the coffee shop ID which will uniquely identify each row within this table and then the locations table on the right the city ID will be the primary key which will uniquely identify each row so City id1 will uniquely identify the city of Los Angeles in the United States and City ID 2 will uniquely identify the city of New York also in the United States now tables may have foreign keys so this is not a must foreign key is a special column that can link you to the primary key of another table so if we look at the city ID column in the shops table this is a foreign key because it links us to the primary key of the locations table so City ID born key in the shops table links us to the city ID primary key in the locations table now tables May sometimes also have composite keys so these are keys made up of two columns and only together can they uniquely identify each row so let's quickly have a look at the suppliers table on the top right you can see that the coffee shop ID and the supplier name they make up the primary key together because only together can they uniquely identify each row we need to know which coffee shop the supplier supplies into and obviously we need to know which supplier is actually the coffee type supplier so if we look at the supplier name column only there's two entries for beans and barley if we didn't have the coffee shop ID we would only know who supplies the product but we would have no clue which store the supplier actually supplies this product to and that is exactly why we need a composite key here all right so now I'll show you how you can easily install the PG admin tool which will be your postgres equal to on windows so the download link is in the description below so it's just this postgresql.org download link and then make sure you obviously choose Windows because you want to install it on Windows and then from here the easiest way is just to click on download the installer so once you do that it will take you to this page where you can choose between various environments so what you want to do is you want to choose the Windows x8664 operating system here and then click the download button so your download should start now so you can see that mine is actually in the bottom left here so overall it's 305 megabytes so I am just going to wait for the download to finish okay so once your download is done make sure you just click the installer here and then let's start installing the postgres SQL tool and then when you're in the setup wizard just hit next make sure you install all of these so we'll have the postgres SQL Server right here PG-14 stack Builder and command line tools and then hit next next next um yep next everything is default hit next okay so that's it make sure you just hit finish at the end and then for the servers just choose the one that comes by default so this is going to be on Port 5432 hit next and then from the application list um we need I think yeah we do need to select something so I think I'm just going to select the one that's already installed so this one hit next hit next and then skip the installation hit next and then finish and now you have your PG admin tool installed so now if I go into the start menu and I type in PG admin you see here that I have the PG admin tool right here so you can just click on it and then it'll start up so once it started up for the first time you will have to provide a master password obviously just go with something that you will remember and then the moment you hit OK you have everything set up so that's it okay so now I am going to show you how you can easily install the PG admin tool which will be your postgres SQL tool if you have a Mac so just make sure that you go to this website here link is in the description below so postgresql.org download and then obviously after that make sure just to click on Mac OS and then the easiest way to install this one is to go to download the installer at the top left here and then click on it and then you get taken to this page and then make sure obviously you select the Mac OS installer hit the download button and then wait for the download to finish alright so once your download is done make sure you click on the installation file and then click on this installation file double click on it hit open and then obviously just type in your system password and then you can just install whatever you have so let the installer begin hit next hit next hit next again next again next again next and then sit back and wait for the installation to complete okay so once your installation is done hit finish type in your password again and then from here just make sure you choose the whatever server you got given so I'll go with postgres SQL 15 because it's the latest hit next and then I think we need to install something yes okay so let me just select the server that I just chose from here so 15.2 hit next hit next again nice after that skip installation hit next hit finish and then that's it now you have the PG admin tool so if I just search for it you can see that I have it here we can open it up okay so when you start it for the first time you will have to enter a master password obviously just go for something that you don't forget so you can reuse it and then that's it you've installed PG admin tool the postgres SQL tool which you will be using in this tutorial on your Mac okay so now that we have everything installed let's jump into the exercise and first of all you can see that I have PG admin so the postgres sequel to open on the left here and I'm going to have the tables open on the right just so you can see what kind of data we're working with so just to go through the tables we have the employees table at the top we have the shops table the locations table and then we also have the suppliers table within the employees table you can see that the primary key is the employee ID column and then within this table we actually have a foreign key the coffee shop ID column which links us to the primary key of the shops table where the primary key is the coffee shop ID column and within the shops table you can see that we also have the city ID as a foreign key which links us to the primary key of the locations table here and then finally we have our suppliers table with the composite key so you can see that the coffee shop ID is the primary key which is also a foreign key because it also links us to the shops table and then we have the supplier name the other primary key so these two combined they uniquely identify each row within this table all right so a quick tour of our PG admin tool here so let me just click into databases here and you can see by default PG admin comes with the postgres database so we're gonna create our own database by right clicking create database and I will just call this database coffee you can leave the owner as postgres and Save so now that we've created our database you can see that coffee appeared here let's go into schemas so you see that you have a bunch of options here but the ones we're really going to be interested in this course will be schemas and then under schemas we have tables so if you click into it you can see that at the moment I don't have any tables which makes a lot of sense because obviously we haven't created anything yet so let's go to the top and open the query tool so I won't be typing out any code really in this tutorial I'm just going to copy and paste the code in you can do the same obviously using the link below where you can find the code so first of all let's create some tables so I'm going to copy and paste in the first two queries and first of all we're going to create the employees table and then we're going to create the shops table so create table is the keyword for creating tables obviously and then you pass in whatever table name you want in this case we'll call this table the employees table because it'll Store employee data and then we need to put in the column name so we specify them as employee ID first name last name email hire date gender salary and coffee shop ID now next we need to define the data type so employee ID will be of integer data type first name last name and email will be for chart 50 and 50 just refers to the length the maximum length of the characters and then higher date will obviously be of date data type the gender will be a voucher one so a character length of 1 because it's either gonna be mil so n or female now salary again will be of integer data type and then we have coffee shop ID which will also be of integer data type so we're going to create this table and then we're going to create the shops table where again you pass in the column names so coffee shop ID coffee shop name and City ID and then you need to pass in the data type which is integer and then virtual 50 for the coffee shop name and integer for the city ID now you can see here that in the employees table the import ID is the primary key so we Define it by passing in the primary keywords and then in the shops table you can see that the coffee shop ID will be the primary key so we defined this as the primary key here so let me just run all of this code and you can hit the little play button at the very top or you can hit F5 and there we go the query ran successfully you can see it on the bottom here it says query returns successfully in 83 milliseconds all right so that's all good let's move on so I'm sure you noticed that we didn't do anything about the foreign key before so let me just go to the tables here on the right and you can see in the employees table that we have the coffee shop ID as the form key but I didn't actually Define this now why didn't I do that now this is because I couldn't Define it here because I haven't actually created anything in the shops table yet so I cannot Define a foreign key that links to the primary key of a table that doesn't exist yet so that's why let's go back into our PG admin tool here and now that we have the employees table and the shops table I am going to add the foreign key on the coffee shop ID so the keywords are alter table which table you want to alter the employees table add the foreign key on which column the coffee shop ID column in the employees table what does it reference so you say references the coffee shop ID column in the shops table and then on delete set no this is just a constraint and for the purposes of this course I won't go into constraints in too much detail for now just go with on delete Sentinel and let me just run this code and you can see again that the query was written successfully so we've added our foreign key okay so next up we're going to create the locations table and then we'll also going to add a foreign key to the shops table and then create the suppliers table so you can see the code here first of all we need to create the locations table so again create table what table name locations you pass in the column name City ID City and Country you define the data type integer virtual 50 and voucher 50 again and then you set the city ID column as the primary key and you notice here that we're going to add the foreign key to the shots table now so let's just go to the tables and right here you can see that City ID is a foreign key here and once we've created the locations table we can link the city ID to the primary key of the locations table so alter table shops add the foreign key on the city ID column references the city ID in the locations table and again on delete set no and then we can run that as well and then finally we're going to create the suppliers table so this is the one with a composite primary key so create table what the table name is suppliers columns coffee shop ID supplier name and coffee type and then coffee shop IDs of integer data type and then the rest of the columns of varchar now we Define the primary key as coffee shop ID and supplier name so we have a composite key and then the coffee shop ID is also a foreign key so let's not forget that and it links to the coffee shop ID column of the shops table and then here we set on delete Cascade again for the purposes of this tutorial just set it like this I won't go into too much detail about constraints now so we can go ahead and run this query and you can see that the query was returned successfully again so now we've created all of our tables let's move on to inserting values into these tables so first of all we're going to insert some values into the employees table and the shops table and just copy the code in here so we're going to insert the first two rows into the employees table here and then we're going to insert the first row into the shops table so let me just go to my table so we'll insert Carson Moscone Khalil core and then Common Grounds right here so let's go through the code insert into is the keyword and then which table employees and then you define values and then you just pass in the values basically so you see here that I passed in all of the values and you can see here that for the coffee shop ID I set it to null at the moment now this is because obviously we didn't actually create copy shop ID with copy shop id1 so we haven't actually created Common Grounds within the shops table so we cannot actually link this to the primary key of another table that doesn't exist yet so I'm just going to set this to null for now and I will change it later on and then moving on to the shops table we'll insert into the shop table and then we'll insert these values now again we set the city ID here to null because we haven't actually created City ID with City ID 1. in the form of Los Angeles in the locations table so let's have a quick look here so you see City ID 1 is Los Angeles as you look across but we haven't actually created this entry yet so we leave it to null for now so let me just run all of this code and you can see that the query was executed successfully so now we're going to set the coffee shop ID for the first two employees to one so let me just copy this code in and we're going to update the employees table we're going to set the coffee shop ID to one where the employee ID is in these employee IDs so essentially if I go back to the table here you can see that the first two employees have employee ID 501 559 and 144 108 and we're going to change the coffee shop ID from no to one for these employees so let me just run this code and you can see that the query executed successfully again so now next up let's insert into the other tables as well let me just copy this code in and you can see that we're going to insert the Los Angeles values into the locations table so insert into which Table locations and then the values are one Los Angeles and United States so this the one will be the city ID the Los Angeles obviously the city name and United States the country after that we're going to update our shops table and set the city ID to one where the coffee shop ID is equal to one because now we have actually created this Los Angeles value here so we can finally link our foreign key to the primary key of another table and then finally we will insert the first two values where the coffee shop ID is equal to one into the suppliers table so let me just go to the suppliers table and you can see here that the first two values will be pins and barley and cool beans the suppliers so let me just run all of this code and you can see again that the query was returned successfully so all should be good now next up I have all of the code in the file so it's just a bunch of insert statements here you see that I copy pasted and everything but essentially what we will do is we will populate the rest of the tables using these insert statements so if I go to the employees table you will populate these eight entries and the remaining 990 and then you will populate the shops table the locations table and the suppliers table in full so let me just run this code right here and you can see that now we have all of our tables created and the values inserted into these tables all right so now it's time to move on to some basic select statements so I've written a couple statements here and you can see that select star will be very popular which is selecting everything from that table so let's first look at the employees table and you can see that if you only highlight a section of the code here and when you run the code it'll only run that bit of the code so select store from employees essentially just gives us all the values from the employees table so you can see that we have the employee ID first name last name email hire date gender and then we also have the salary and then the coffee shop ID at the end and indeed we have 1 000 rows of observations now let's do select store from shops so you can see that we've inserted all of the values correctly let's do locations next let me just select that and you can see that we indeed have Los Angeles New York and London and we have United States and United Kingdom as countries let's select everything from the suppliers and then you can see the cop Bishop ID the supplier name and then the coffee type now obviously you don't have to select all of the columns within the table so you can specify which columns you want to select which is what I'll do next so I'll only be selecting the employee ID the first name and the last name columns from employees let me run that and you can see that now we only get three columns on the bottom here employee ID first name and last name or we could just select employee ID higher date and then the salary so let me just run this bit of the code here and there we go you see on the bottom we have employee ID higher date and the salary now that we're done with the basic select statements let's combine these select statements with some where Clauses using and and or so here's the code and first of all let's select all of the employees whose salaries are greater than 50 000 US dollars so let me just run this code here and let's quickly Look Down The Columns and you can see the salary column here all of the values are above 50 000. next let's select all of the employees who work in the Common Grounds Coffee Shop so if we look at our tables here you can see that the coffee shop name is Common Grounds here and the coffee shop ID is one so what we will do is we'll select store from employees where the coffee shop ID is equal to one and if I run this bit of code here you can see that the coffee shop ID column only has values of one oh and just quickly by the way you can do this hyphen hyphen to create a note so the moment you do this um SQL won't recognize that line of code AS actual code it'll just be a note you see here now it's in a different color so let me just undo that and then let's move on to selecting all of the employees work in common ground and make more than 50 000 so we'll use the and operator with the where Clause here so we're saying that you need to be making above fifty thousand dollars and be working in common grounds so let's return that and here we go let's just look down the columns you see the salary everything is above 50 000 and then the coffee shop ID is one all the way so next up let's use the or operator so we'll select all of the employees work in common grounds or make more than 50 000. so let's run this bit of code here and now if we're looking down the columns you can see that you know we have people working in coffee shop ID 2 3 and 5 but this is because those people say for example Row three here this person is working in coffee shop id2 but is making more than fifty thousand so the query returned it if we look at the first row for example this person makes less than 50 000 but is working in common grounds so the query returned it let's move on to selecting all of the employees work in common grounds make more than 50 000 and or male so we're using two and operators here we're saying if the salary is greater than 50 000 and the coffee shop ID is equal to one and the Gen is equal to male return that so let's just do that let me look across the rows here so we see gender is only M Coffee Shop ID only one salary all above 50 000. all right next up we'll be going through the in not in is null and between operators so first of all we're going to select all of the rows from the suppliers table where the supplier is beans and barley using the equal sign so select store from suppliers where the supplier name equals to beans and Orly so let me just run this code here and you can see that the supplier name here is only beans and barley now we can also do select everything from suppliers where the supplier name is not equal to beans and barley just using the not operator so let me just run that so you see this line of code here where not supplier name is equal to beans and barley so let me just run this code here and you can see that now the supplier name has everything but beans and barley so moving on you can use the sign as well which is a not equal to sign so this will return the exact same thing let me just show you and there you go the supplier name has everything but beans and barley okay so let's select all of the robusto and Arabica Coffee types using the in operator so all I'm saying in the where class here is if the coffee type is in either Robusta or Arabica then return that row so we see here that the coffee type is only a replica or Robusta and obviously you could use the or operator as well so where coffee type is equal to Robusta or coffee type is equal to Arabica and then it'll return the exact same thing probably using the in one is just a little bit neater and nicer because it's less repetitious but it's a matter of preference I guess so next up we'll select all of the coffee types that are not Robusta or Arabica so obviously you could use the not in operator to do this and let me just run this code here and you can see now that if I look down the coffee type column you can see that I only have Libre Eco and Excelsior coffee types so everything but Robusta or Arabica now next up we'll select all of the employees with missing email addresses so select store employees which is selecting everything from the employees where the email is no so let's see what this returns let's look at the email column and indeed we have returned everyone whose emails are missing and if you're wondering how many there are you can see the total number of reviews return this 209 so we have 209 employees with missing email addresses now let's return all of the employees whose emails are not missing so again you can just use where not email is null let me just run that and then you see now the email column only has populated values and then we have 791 employees whose email addresses are not missing now next up we'll select all of the employees who make between 35 000 and 50 000 using the between operator so we'll only select the employee ID first name last name and the salary columns this time from the employees table and we're saying that where the salary is between 35 000 and 50 000 return the rows so this between operator is inclusive of both ends so if you're making 35 000 you're gonna be returned in this query and if you're making 50 000 you're also going to be returned in this query so let me just run this bit of code here and let's look at the salary column as I scroll through you can see that all of these values are between 35 000 and 50 000. so next up obviously you could use the end operator as well so this will give you the exact same result let me just run this code here let's look down the salary column and you can see that all of these values are between 35 000 and 50 000. okay we're doing really well let's move on to order by limit distinct and will also rename some columns so first of all let's order by the salary ascending so again I'll only select specific columns so I'll select the employee ID first name last name and the salary columns from the employees table and then the keywords are ordered by and then you just specify which column you want to order by so this is salary for me in this query let me just run it and then you can see by default in the salary column the order by happens in an ascending order so next up let's order by the salary descending so the only thing you have to do is after you put in order by and the column name you pass in the keyword desc or descending and let's just do that let's run the code and now if I look at the salary column you can see that you have the highest salary on the top so the maximum salary is 67 724 and then the minimum is 9878 dollars okay so how about we return the top 10 highest paid employees so the only thing we have to do compared to the previous query is after order by salary descending we need to limit what we return so if we limit it to 10 obviously we return the top 10 highest paid employees so we can see that these people are the highest paid people now how about we return some distinct or unique coffee shop IDs from the employees table so we know that within the shops table we have five coffee shop IDs corresponding to the Five Coffee Shop names so we're expecting to see five unique coffee shop IDs in the employees table as well so let me just run this bit of code here and you can see here that indeed we have five unique coffee shop IDs so one two three four and five now how about we return the unique countries from the locations table so select distinct country from locations let me just run that and indeed we have two unique countries the US and the UK so next up let's rename some columns so renaming is very very simple with the as keyword so I am going to select the email column and then the email column renamed as email address then I'll select the higher date column and then the higher date column renamed as date joined and then the salary column and then I'm also going to rename the salary column to pay so let me just run this code here and you can see that now we have six columns on the bottom so email and then we have email address we have higher date and then we have date joined we have salary and then we have pay okay moving on to extract this is going to be a short section so all we're going to be doing is extracting some date information from the date column which is our higher date column so you can see here that the function we use is extract and then we Define what we want to extract from the specific column so first I'll extract the Year from the higher date column and then I will extract the month and then the day and then I am going to rename these columns as year month and day respectively so let me just run this code here and you can see that on the bottom now I have the year I have the month and then I have today and let me just actually pass in the actual column as well so higher date right here and then it's easier just to see what we extracted so if you look at Row one here you can see that the higher date is 2015 August 29th so the year is going to be 2015 the month will be 8 obviously for August and then the day is 29 or let's look at another one say row five here 2016 in March 27th so the year is obviously 2016. the month will be March so three and then the day is 27 and that's extract for you all right so let's move on to some string manipulation so we'll use the upper the lower the length and the trim functions first so first of all I will uppercase the first and the last name so I will select the first name column and then I will uppercase the first name column and name it as first name upper and then I will select the last name and then I will uppercase the last name column and rename it as last name Upper from the employee so let's see what this results so if you look at the bottom now we have the first name and then we have the first name in uppercase and then we have the last name and then we have the last name in uppercase so we can obviously of course lowercase these letters as well so you just need to change the upper to lower so let me just do that in the next query and if I run this one now you can see that we have Carson in the original format then we have Carson lowercase only and then we have Moscone in the original casing and then we have Moscone in lowercase only so next up we'll use the length function and we will return the length of the emails so first of all I will select the email column just so you can see what it looks like and then I will return the length of the email column so essentially this one just counts the number of the characters so let me just do that and let me make the email column a little bit wider here so you can see that the first email has 20 characters the second one 16 the third one 18 so on and so forth next up let's look at the trim function so for this I just created some simple hello text here so you can see that the first one is hello with spaces so you can see that I passed in a bunch of spaces here and then the second length is just for hello with no spaces and then the third one I'm using the length function but within this function I'm actually trimming this hello wet spaces and I'm calling this column as hello trimmed so let me just run this code here and you can see that the hello width spaces has 15 characters the hello no spaces has five characters and the trimmed hello also has five characters so the trim just removed all of the white space alright next up some more string manipulation using concatenation Boolean expressions and wild cards so first of all let's concatenate the first and the last name to create full names so when we're concatenating text we're just joining text one after the other basically so you can see this double vertical bar here this is what you have to use so first I'll select the first name column and then I want to pass in a space between the first name and the last name then I select the last name column and I rename this column as full name so let me just run this code here and you can see that now we have the full names on the bottom and just to make it easier for you to see let me just pass and also the first name and then the last name and then you can see that now it's just easier to see probably so if you look at Row one you have Carson here as the first name last name is Moscone and then you have the full name which is Carson Moscone so we can also concatenate columns to create a sentence or whatever text you wish so in the next one we're going to say that first name space last name makes this much money so let me just run this and now we can see that Carson Moscone makes 32 973 dollars khalila core makes 52 800 and two dollars now let's move on to Boolean expressions and Boolean expressions are essentially just true or false Expressions so first of all we're going to say that if the person makes less than fifty thousand dollars then it's true otherwise it's false so I'm saying here select first name and then we're concatenating the last name as the full name column and if the salary is less than 50 000 then just give me true otherwise give me false and I am going to name this column as less than 50k so let's see what happens here let me just run this code and you can see that within this less than 50k column we just have truth and false this returned and it would probably help if I passed in the salary as well here so let me just find the salary column and then now if we look at it you can see that Carson Moscone makes under fifty thousand dollars here so it will be true but Khalil core on the other hand makes over fifty thousand dollars so it will be false now next up let's go with another Boolean expression so if the person is a female and makes less than 50 000 then true otherwise false so now we have an end operator here so we're seeing that if you're making less than 50k and you're a female then give me a true otherwise give me a false and I am going to name this column as less than 50k female so let's select this and now we can see again we have the truths and falses and again it would probably help if I just chose the actual column so salary as well let me re-run this code right here and you can see now that say for example the first entry here is false because the person is a male and well makes less than 50 000 and then let's find an entry that's true so there we go row four lean tarpe so she's a woman and she's making less than fifty thousand dollars so the condition will be true now let's move on to something a little bit more interesting so Boolean expressions with wild cards so wild cards are just string Jokers and we're going to find emails that have a.com convention and then return that as true otherwise as false so we're going to select the email column and then you can use the like operator and then you can use this percent sign to look for wild cards so basically I'm saying that anything before.com and anythingafter.com just as long as it has the.com convention then give me a true and I will call this column.com flag so let me just run this code here and you can see that everything that has a.com in there will be returned as true so if I look at row four it's true Row three also true but if I look at Row one for example it's dot gov and it's false but obviously we could look for any other naming conventions as well or email conventions so I'll look for dot gov next and it'll be the not the.com but the dot uh golf flag so I'm going to rename this as dot gov and then now if I run this code you can see that the first line will be true the second line will be false because the first line was dot go the second line is dot i o let's continue with the theme of string manipulations and we'll use a substring position and coalesce so let's go through substring first and substring essentially just gets you a text from whatever character you specify so what we're going to do here is substring the email column and we're going to get all of the characters from the fifth one onwards inclusive of the fifth character so let's see what this is going to return and you can see that we have the first row here so C Moscone at C moscone0 at census.com but then in the second column we only have Coney zero so this is because Kony the C in the Coney is the fifth character and we're returning everything after the fifth character so let's just count the character so I can go to the beginning here one two three four five ends the C right here is indeed the fifth character now let's move on to the position function so this is used to find the position of whatever character you want and in this case we are going to find the position of the at in the email column so let me just use that run the query and you can see that in CMOS only 0 at census.gov the at is at the 10th position so the 10th character and then in k core one at github.io the add sign is at the seventh character now next up let's do something a little bit more interesting and probably quite useful so we'll combine the substring and the position functions to find the email client for each of the employees so what we'll do here is within the substring function we'll actually use the position function to find the position of the at sign and then we'll just say substring email column and then from this position so it's dynamic because it'll find it for us let me me just pass this in and let me run this code so you can see now that in the first row instead of C must only zero at census.com we only have at census.gov and then if you look down this column you can see that basically these are all the email clients now this is probably still not the best way because you can still see the add sign so to get rid of that you just need to add 1 to the position here and let me just do that and run the code again and you can see that now it's nice and clear you only have the actual exact email client returned so the next one will be the coalesce function which is used to fill missing values with a custom value so we'll fill the missing emails with a custom value that we specify and in this case I will specify this custom value as no email provided so this is a very simple function you just pass in the function name which column you want to apply this on and then whatever value you want to fill the missing values with so let me just do that and now you can see that all of the empty emails will be filled with no email provided so row 19 row 25 row 26 let me find another one there we go row 42. okay so now that we've covered a bunch of string functions let's move on to some functions some numerical functions so we'll cover min max average sum and count so let's first of all select the minimum salary this is really simple so the function is min you pass in the column name I rename this column as minimum salary from employees and let me run this and we can see that the minimum salary is nine thousand eight hundred and seventy eight dollars again you can find the max salary as well using the max function so if I run this you can see that the max salary is 67 724 dollars now we can do some arithmetics as well so let's select the difference between the maximum and the minimum salary so Maxwell minus the minimum salary from the employees table let me run this and then you can see that the difference is fifty seven thousand eight hundred and forty six dollars you can use the average function to find the average salary so if I run this you can see that the average salary is thirty eight thousand one hundred and eighty one dollars but this looks quite ugly with all the decimals so we can use the round function to round this average salary to zero decimals so I'll do that and when I run the query you can see now that it only returns 38 181 dollars so next up you can use the sum function to sum up all of the salaries which is what I'll do now and you can see that this is the total of the salaries now you can use the count function to count the number of the entries so let me just do that so count star which is counting all of the rows everything and then you can see that indeed in the employees table we have 1000 rows so we have 1 000 employee records now I could count on a different column so the salary column only and then you see that we have 1000 salary values so there's basically no missing values in the salary column but again if I count on the email column you'll see now that we only have 791 values which is correct because we do know that we have missing values in the email column okay we're doing really well let's move on to group by and having so we'll use Group by to group by first of all the coffee shop ID within the employees table to find out the number of employees for each of the coffee shops so let's do that first so I will select the coffee shop ID and then I will count the employee ID from the employees table and then I will Group by the coffee shop ID so if I run this bit of code here you can see now that coffee shop id1 has 187 employees coffee shop ID3 has 214 employees coffee shop ID 5 has 203 employees now whenever you use an aggregated function so like Council min max average here within your select statement you're always going to have to group by whatever column you pass in here so this coffee shop ID has to appear here and here as well now let me show you why because say for example if I only run this bit of code here so let me just copy it here you'll see that this won't work there we go SQL actually gives us an error message and it tells us what we're doing wrong so it's saying that columns employee dot coffee shop ID must appear in the group by clause or be used in an aggregate function so that's why we're using Group by here to group by the coffee shops now next up we can return the total salaries for each of the coffee shops so I am selecting the coffee shop ID I'm summing up the total salary from the employees table and again I'm grouping by the coffee shop ID so there we go and you can see that now we have the total salary for each of the coffee shops now next up let's do something a little bit more complicated so let's return the number of employees the average the minimum the maximum and the total salaries for each of the coffee shops so again we select the coffee shop ID column we count everything and we'll name this column as the number of employees we'll round the average salary to zero decimals we'll select the minimum salary the maximum salary and then the sum of the salaries from the employees table and then we'll Group by the coffee shop ID and then finally We'll order by the number of employees descending so let's run this query here and we can see that coffee shop ID3 has the most number of employees 214 it's got an average salary of 40 119 and a minimum salary of 10 592 with a Max salary of sixty seven thousand five hundred and sixty and a total salary I believe that's 8 585 485 dollars so next up let's use the having Clause so after group buy you must use the having Clause you cannot actually use the where Clause anymore so it's kind of just a special where Clause that you use after the group by so again we have the exact same query all the way until the group by coffee shop ID here the only thing that's different is now we're saying that having so it's kind of just a filter so we want all of the coffee shop IDs with more than 200 employees return so we're saying count star is greater than 200 so let's see what we get now so you can see that if I look into the number of employee column we have 214 210 and 203 so we've only returned the coffee shops that have more than 200 employees now let's move on to another query we can also return only the coffee shops where the minimum salary is less than 10 000. so let's do that let me just select this and now if you look into the minimum salary column we actually only have one coffee shop where the minimum salary is less than 10 000 so this is coffee shop id4 okay we're almost at the very end let's move on to some case statements and we're also going to combine these case statements with Group by and then we're also going to use case statements to transpose our data which is going to be pretty cool so let's start out with the basic case statements here so case statements are kind of like if statements and first of all we're going to say that if the pay is less than 50 000 then give me low pay otherwise give me high pay so I am just going to select the employee ID the first name the last name and the salary columns and then I am going to use a case statement so I'm saying case when the salary is less than 50 000 then give me low pay it reads just like English in my opinion so when the salary is more than or equal to 50 000 then give me high pay else no pay in this case else is not really needed because we've pretty much covered all of our bases the salary can either be less than 50 000 or equal to fifty thousand or more than that and then make sure you you end your case statement with the end keyword and this is from the employees table and then let's just order by the salary descending so let me select this let me run this query and you can see here that if I look into my case column here you can see that obviously I'll only have the high Pace first of all and then if I scroll further down once I hit below 50 000 I will actually only get the low pace now let's move on to another case statement let's introduce another condition so now I'm saying that when the salary is less than 20K give me low pay when the salary is between 20K and 50k give me medium pay and when the salary is above 50k give me high pay else give me no pay so let me just select this let me run the query and oh something didn't work here because I'm missing a then keyword here so let me just run this again and here we go now you can see that if you look into the case column you have high pay and then as I scroll through it I have medium pay between 20K and 50k and if I scroll to the end I only have low pay okay so next up we'll use a case and group by to return the count of the employees in each of the pay categories that we create with the case statement so this is a bit more complicated query and actually we have a sub query right here running within this query but don't worry too much about the sub query we're going to cover this in the next section so for the purposes of this query just focus on the case statement so all we're doing here is we're selecting the pay category from this a table so this a is just an alias so you see here this is my query and I created an alias for this query so I'm selecting the pay category from this query and then I'm counting everything within this query and then finally I'm just grouping by the pay category so the pay category is just this case statement here so all I said was the same as before so give me low pay that medium pay that give me high pay and then call this column as pay category so let's see what happens when I run this code here so you can see now that I have the pay category here and then I have the number of employees within each of the pay categories and this is how you can use the case statement with Group by okay next up and not a cool thing so we will transpose our data so we will transpose this bit of data here transposing is just turning the rows into columns so we will select sum case when the salary is less than 20 000 then give me one else zero so basically we're just going to count the number of employees and then return this as low pay and then we'll do the same but this time we're counting the employees where the salary is between twenty thousand and fifty thousand so give me the one that's the count and then we just add them up and then finally we have high pay again we do the exact same thing but for the salaries over 50 000. so if I run this code here you'll see that now the bottom bit so the low pay the medium pay and the high pay will turn into columns so let me just run this query here and you can see that we have successfully transposed our data all right next up we are going to work with multiple tables using joins and unions but before I go into the joints here actually let me just go through some some theory around the join so let me just search something on Google W3 schools yep that sounds good to me and let me just find an illustration here so let me scroll here so essentially we'll cover inner joins which is basically just the intersection of the two tables left join so everything from the left table even if it's missing in the right table and then right join will take everything from the right table even if it's missing in the left table and then the full outer join will just basically give you everything so let's jump back into our exercise and here we go so first of all let me just insert some values just for these join exercises so I am going to insert values into the locations and the shops tables let me just do that run the code and there we go the query was returned successfully now let's check the values we inserted so I'll select everything from the shops first and you can see row six here happy Brew the city ID I on purpose left it as no and then let me look into the locations table select that and you can see that now we have a fourth location which is in Paris which is in France obviously so first of all let's do the inner join so I am going to select s dot coffee shop name and this s is just an alias for the shops table and then I will select l.city and L again is an alias for the locations table and I will also select the country column from the locations table and then I'm saying inner join locations l so basically I am inner joining on the s dot City ID so the shops.city ID and then the locations.cityid so let's have a look at our shops table you can see that City ID is the foreign key and then we can see that in the locations table the city ID is the primary key so basically this is what we're joining on and inner join or join is exactly the same so by default SQL just assumes that you are doing an inner join so if you pass in join or inner join it makes no difference so let me just run this code here and you can see now that we've combined these tables we've joined these tables so we took coffee shop name from the shops table then we took City from the locations table and then we also took country from the locations table so next up let me just show you that join is exactly the same as inner join let me run this code and you can see that this table is exactly the same as before now let's quickly do a left join so this time I'm joining the shops table and the locations table but this time I'm left joining the locations table on the city ID so let's see what happens now let me run this code here and you can see that we took everything from the shops table so we took happy Brew even though city id4 which is Paris doesn't appear in the locations table so there you go you see the city is null here and then the country is also null so there we go this is left joint we're taking everything from the left table now moving on to right joint so this time again it's the same query but this time we're right joining the locations table so let's see what happens now so you can see that this time what's happening is that we're taking everything from the right table so the locations table and you can see that we have Paris in here indeed so row six for the coffee shop name will be empty because we're joining we're right joining everything from the right table even if the value is missing from the left table and then finally let's do the full outer join here and then after that I will just delete the values that I created just for the join exercises so the full outer join will just take everything so if the value is missing from the right but in the left it will take that if the value is missing from the left but in the right it will also join that so again we're joining on the city ID and we're saying full alter join this time so let me run this code and then now you can see that we have everything so we have the line with the happy brew coffee shop name and then we also have the row of data where Paris and France appears only in the locations table and then finally I'll just run the delete from statements here just to delete these two queries and I can show you that they have been deleted so let me select everything from the shops and you can see that we no longer have the shop that I just created which was I actually forgot its name and then oh yeah happy Brew there we go and then select everything from location so you see that Paris no longer appears in this table now that we've covered joints let's move on to Union so you can use unions to stack data on top of each other so rather than joining data horizontally you'll just be stacking data vertically so let's return all of the cities and countries so we're selecting City from locations and then we're unioning it with selecting the countries from the locations let me just run that and there we go you see that we have all of the cities and the countries what you have to keep in mind is that the union removes all of the duplicates so say for example if I'm selecting country from locations and then I'm obviously selecting country from locations again and then I'm unioning these you can see that I won't have duplicate values so I still only have the United States and the United Kingdom but when you use Union all this will keep the duplicates so let me run the same query the only difference being Union all rather than Union so when I write on this you can see that I have us us UK US US UK so we'll keep the duplicates now we can obviously return all the coffee shop name cities and the countries with the Union statement here so let me just run this bit and now we have all of the coffee shop names all of the cities and all of the countries and we used Union for this and we've finally gotten to the last section which is going to be on sub queries so sub queries are for performing operations in multiple steps they're a bit more advanced in terms of methods and SQL statements but we'll just go through the most basic bits of sub queries so let's start out with sub queries in the from Clause so I will select everything from this sub query here within the from class so I'm saying select everything from and then I'm using the select statement again so I'm selecting everything from the employees where the coffee shop ID is in three or four and that I'm aliasing this query as a so let's see what this returns and we can see that we just selected everything that was in the from Clause but of course we could use this Alias a to specify which column names we would like to select I'll only select employee ID first name and the last name columns let me just run this query here and you can see now that we only have employee ID first name and the last name columns selected so let's move on to some sub queries within the select Clause so first of all I will select the first name the last name the salary and then I will return the maximum salary within the entire table so let me just do that run this query here and you can see that we have the maximum salary return for every single one of the rows here next up we could do this obviously with the average salary as well so we can select the average salary which is rounded to zero decimals here and then let me just do that and there we go now here you can see that the average salary is displayed in each and every one of the rows now say for example you could do some calculations here so you could find the difference between the person's salary and the average salary so let me just run this code here and then now you see that Carson Moscone you know the difference between his salary and the average salary is 5208 meaning that he actually makes less than the average salary now next up let's use some sub queries within the where Clause so how about we return all of the US coffee shops so select everything from the shops and then where City ID is in and this is where we use a sub query so I'm saying that the city ID has to be in and then I'm selecting the city ID from the locations where the country is equal to United States so we connect basically to the locations table using this query so let's look at the shops table you can see that the foreign key here is City ID and then primary key in the locations table is the city ID so I'm saying that only select the city IDs from the locations table where the country is equal to United States so let me just run this query here and let's see what this returns so now we only have City IDs one two one and two so if we go back to the locations table that is correct because city id1 is Los Angeles which is obviously in the US and then city id2 is for this city of New York which is again in the U.S so back into our queries we can return all of the employees who work in U.S coffee shops so we have two sub queries running here actually the first one we'll select the city IDs from the locations table where the country is equal to United States and then the second sub query will select the coffee shop ID from the shops where the city ID is in these City IDs here and then I'm just selecting everything from employees where the coffee shop ID will be in this coffee shop ID here so that's two sub queries here so let me just run this code here and you can see that now we've selected all of the employees work in U.S coffee shops and we can check that quickly so let me go into my table and if I click into the coffee shop ID column here you can see that coffee shop ID one two four five so you see basically everything but three appears because three is the only coffee shop that's not in the US because it's in London which is in the United Kingdom now next up how about we return all of the employees who make over 35 000 and work in U.S coffee shops so the only difference now is that obviously I included a where Clause here so where the salary is greater than 35 000 and the coffee shop ID is in this coffee shop ID here that we defined using two sub queries so if I run this now I am going to have all the employees who make over 35 000 so let's look into the salary column yep that looks right and then in the coffee shop ID column again you don't see number three appear okay last but not least let's do something quite complicated so how about we do a 30-day moving total pay so this time we're selecting the higher date and then the salary columns from employees E1 and then we are going to use the employees table again so we're selecting the sum of the salaries from the employees table I'm going to Alias this as E2 where the E2 higher date so where the higher date within this query here is between the E1 employees table higher date minus 30 because we want the 30-day moving average and obviously between the E1 dot higher date because basically we just want the values before 30 days from the current date to the current date and I am going to say this column is called pay pattern and then I am going to order by the higher date so let me just run this query here and you can see now that if I look down the pay pattern column here you can see that first of all we have 17 691 and then our second observation here is 50 099 which is 17 691 plus 32 408 and then fifty thousand and ninety nine plus sixty five thousand nine hundred and ninety three is equal to 116092 116092 plus 21508 is equal to 15 150 505. and you see how these are on the same date so Row 4 and row five it's the same date 23rd of January so the values will be exactly the same now for the next row row six here you can see that the value now is one seven one eight eight six so it's not exactly equal to one 505 plus the 39072 this is because this is already the 6th of February so this is going to take all of the salaries between the 6th of January and the 6th of February so it's going to exclude row number one here so essentially this row 6 adds up all of these values so row two Row three row four row five and row six and this is how we get to one seven one eight eight six so this is how you do a 30-day moving total pay and that's it that is the end of the tutorial if you like this video and took away something from it you may like some of my other videos right here thank you so so much for watching and I'll see you in the next one
Info
Channel: Mo Chen
Views: 130,095
Rating: undefined out of 5
Keywords:
Id: MvcDM2nLdzI
Channel Id: undefined
Length: 70min 0sec (4200 seconds)
Published: Sat Apr 29 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.