PROC SQL In SAS | Data Science Tutorial | Simplilearn

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] let's start this lesson by defining process QL structured query language or SQL is a generic database language that helps to communicate with databases the park SQL is the base SAS implementation of SQL it allows you to retrieve summarized sort join and concatenate datasets or databases available in SAS proc SQL is used for the following generate reports and summary statistics retrieve and combine data from tables create tables views and indexes update and retrieve data from DBMS modify a podcast ql table by adding modifying or dropping columns the prog SQL allows you to combine the functionality of the data step and the product step into a single step before we begin with the concepts of prog SQL let's understand some terminologies associated with the product SQL the following table lists the equivalent terms that are used in SQL SAS and data processing the proc SQL table is termed as SAS data file and SAS and file and data processing the row in SQL has termed in observation and SAS and record in data processing the column in SQL is termed a variable incest and field and data processing well let's now learn the syntax of proc SQL and its uses let's step into the syntax classroom to learn the syntax of SQL the proc SQL command begins with the keyword proc SQL and ends with the keyword quit the keyword quit is used to terminate the procedure in this demo you'll learn how to retrieve data from a table using the proc SQL clauses in this demo we'll retrieve data of all the products from the electronic data set which have sum of sales greater than 450 in a descending order the data set electronic is imported to the SAS console using the code shown on the screen to achieve data using proc SQL clauses use the keyword proc SQL prog SQL execute the program without using the run statement the columns product sales and order priority are selected from the table electronic using the keyword select in this demo the products that have sale rather than 200 are selected using the where statement the group by statement is used to group data by a specific column here we will group the product column with the group by clause we can also use an aggregate function in the Select clause or in a having clause in this demo the products which have the sum of sales greater than 450 are grouped note that the aggregate function sum is used here the keyword quit is used to terminate the procedure this concludes the demo on how to retrieve from a table using the proc SQL clauses at times you'll need to select all columns or a specific column in a table to select all columns in a table use an asterisk symbol in the Select clause to select a specific column in a table use the column name in the Select clause using proc SQL you can also eliminate the duplicate rows from the output data to do so use the keyword distinct in the Select Clause proc SQL allows you to create a new variable in the query result these columns can either be text or calculations you can add text column to the query result by using a string or literal expression take a look at this example program and its output data set shown on the screen here a new column bonus is created where observations are derived from the sale column the generated output is shown on the screen you can also change the format of the variable and assign a new label to the data set in this example the attribute is used to modify the format of the sales variable and the label is used to name the output data set the generated output is shown on the screen proc SQL also allows you to process conditional data case expression as a valid SQL expression that resolves to a table column where the values are compared to all the wind conditions using case expression in the Select Clause you can extract the data that fulfills the set condition the generated output data set is shown on the screen in this example from the electronic data set the product and discount column are selected and the condition is set to the sales column the end statement is required in the case expression also set the condition in descending order to increase the efficiency because class stops checking the case expression as soon as it finds the first true value the output data set is shown here note that the column order priority 1 is generated calculated enables you to use the results of an expression in the same select cause or in the where cause let's take the previous example and derive net profit to derive net profit create a tax column which is 5% of the sales amount and subtract tax from the profit you must use the calculated keyword with the alias to inform proc SQL that the value is calculated within the query otherwise the SQL code will fail with the method similar to column tax was not found the generated output is shown on the screen using SAS you can also obtain the totals by order Priority One look at the example shown on the screen the sum function returns the sum of each row of the columns specified as arguments the count returns the total number of rows in the group or in a table the generated output is shown on the screen the SQL procedure passed through facility communicates with the DBMS through the SAS access engine the pass through facility allows you to do the following pass native DBMS SQL statements to a DBMS display the query results formatted as a report create SAS data files and views from query results since the database is typically optimized and indexed to handle queries complex joins are handled much faster with the SQL pass-through query take a look at the example program shown on the screen use keyword connect to link the DBMS using the create table statement you can create a new table to define the columns and their attributes you can also specify a columns name type length format and label let's understand with an example in this example the electronic example data set is created this data set will have the data from the electronic data set that has higher order priority the second select statement is used to show the complete electronic data set note that only one table is created using the create table statement the generated outputs are shown on the screen so far you have learned how to retrieve data from a single table now let's learn how to retrieve data from multiple tables if you want to combine multiple tables through SAS code it requires several proc sort data step and emerge function however using proc SQL multiple data sets are combined easily to select data from multiple tables simply join the tables in a query let's step into the syntax classroom to learn the syntax for selecting two tables using proc SQL use the keyboard select to select the table the asterisk symbol selects all the columns from tables 1 & 2 to select the particular column from table simply mention the column name after the keyword select let's now learn how to select data from multiple tables the data that you may need for research can come from different sources to combine them join the tables in a query there are two types of joins inner join and outer join the inner join selects all rows from both tables as long as there is a match between the columns in both tables the outer join returns all matching records from both tables whether the outer table matches or not let's learn about each type of join in detail the inner join selects all rows from both tables as long as there is a match between the columns in both tables it can combine a maximum of 256 tables at a time only rows that satisfy the join conditions are kept let's step into the syntax classroom to see the syntax you can't perform by using a list of table names separated by commas with the where clause or by using the inner join and on keywords let's take an example of how we can join electronic and electronic customer info datasets to attach customer name and customer ID to each other you can select columns from both tables with asterik and utilize the feedback option you can use the feedback option to see exactly how proc SQL is implementing your query in the log session you can see all column names with E and C table aliases the output obtained is shown on the screen you can also customize your query by selecting only required columns in the order you prefer observe the changes made in the code to select preferred columns the output obtained is shown on the screen you can obtain the same results by performing an inner join with where clause and inner join and on keywords both approaches are used interchangeably in practice the output obtained is shown on the screen in contrast with an inner join and outer join keeps rows that match the condition as well as some or all of the unmatched data from one or both tables now there are three types of outer joins left-right and full the left join returns all rows from the left table or table one with the matching rows in the right table or table to the electronic data set and electronic customer information data set is taken as an example the output for the example program is shown on the screen the right join returns all rows from the right table or table to with the matching rows in the left table or table one the electronic data set and electronic customer information data set is taken as an example the output for the example program is shown on the screen the full outer join returns all rows from the left table or table one and from the right table table to the electronic data set and electronic customer information data set are taken as an example the output for the example program is shown on the screen and concatenate the two query results using the Union operator Union operator takes unique observations from the data set and generates a report remember that Union does not return duplicate rows if a row occurs more than once then only one occurrence is returned sometimes you need to return duplicate rows as well in this case you can use the keyword Union all which requires that duplicate rows remain in the output you can also concatenate two or more query results using the operator except intersect and outer Union use the operator except to produce rows that are part of the first query only use the operator intersect to produce rows that are common to both queries use the operator outer Union to concatenate the query results how to concatenate the query results using the operator Union two datasets namely north and south are imported to the sass console in this demo we'll would concatenate the two datasets using the operator Union the table output is created using the keyword create the variables order ID region and sales amount have been selected from the data set north and south using the keyword select you the keyword union is used to can two datasets the union operator produces all unique rows from both queries you note that the variables selected in both the datasets are the same the keyword quit is used to terminate the procedure you this concludes the demo on how to concatenate the query results using the operator union hey once you become an expert in Big Data then subscribe to the simply learn Channel and click here to watch more such videos sinnard up and get certified in Big Data click here
Info
Channel: Simplilearn
Views: 49,643
Rating: 4.9214144 out of 5
Keywords: simplilearn, training, tutorial, certification, course, curriculum, Free resources, data science, data science tutorial, data science and big data analytics, data science for beginners, data science course, data science with sas, data science with sas tutorial, data science with sas training, data science sas, data science sas certification, 2017, proc sql in sas, proc sql tutorial, proc sql basics, proc sql class
Id: hX02sbsmb_w
Channel Id: undefined
Length: 15min 29sec (929 seconds)
Published: Fri Aug 11 2017
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.