External Table Creation and Data Loading In Hive | Day-10 | Hive Tutorial For Beginners

Captions
Hello friends, welcome to RR Digital. Today we are going to discuss a few more Hive queries. In the last session we created an internal table and loaded data into it; in today's session we'll see how to describe a table.

First we'll check which tables are available in the database. For that we give the SHOW TABLES command in Hive. Here it is showing only one table, the employee table. Now I want to see which columns and data types are defined for this table, so we use the DESCRIBE command: DESCRIBE followed by the table name. It lists the columns of the employee table with their data types. Similarly we have one more query, DESCRIBE EXTENDED, which gives the full information: the table name, which database the table belongs to, who the owner is, exactly when it was created, and under which directory it was created. So what is the difference between DESCRIBE and DESCRIBE EXTENDED? DESCRIBE gives you only the column names and data types, while DESCRIBE EXTENDED gives the full details: the columns and data types plus the owner, the creation date and time, the directory the table lives under, and whether it is an external or an internal table. Whenever you want the full information, use DESCRIBE EXTENDED; otherwise plain DESCRIBE is enough. So this is about DESCRIBE and DESCRIBE EXTENDED.
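As a quick sketch, the three inspection commands from this part look like this in HiveQL, using the employee table from the session (Hive also supports DESCRIBE FORMATTED, which prints the same details in a more readable layout):

```sql
-- List all tables in the current database
SHOW TABLES;

-- Column names and data types only
DESCRIBE employee;

-- Full details: database, owner, creation time, HDFS directory, table type
DESCRIBE EXTENDED employee;
```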
Now, how do we rename a table in Hive? Here I have the employee table, and I want to rename it to emp. The syntax is ALTER TABLE, then the old table name, then RENAME TO and the new table name; here I am giving emp as the new name. It renames successfully. How do we check whether the table was renamed or not? Again we use the SHOW TABLES command: the old table name was employee, but now it shows the table name as emp. So whenever you want to rename a table, use ALTER TABLE old-name RENAME TO new-name. And it is not only the table name; the data comes along automatically, so SELECT * FROM emp shows the same data that was available in employee. This is about how to rename a table.

Now I want to add one new column to the existing table. Right now, DESCRIBE emp (remember, the table is emp now, not employee) shows only four columns and four data types, but I want to add one more column, say a department number. The syntax is ALTER TABLE, the table name, ADD COLUMNS, and then the new column name and its data type. The new column is added successfully; if we cross-verify, DESCRIBE now shows five columns and five data types instead of four. So whenever you want to add new columns to an existing table, you must use the ADD COLUMNS command.

Now we'll see how to connect to the RDBMS. As we discussed earlier, the metadata is stored in an RDBMS while the actual data is stored in HDFS. So how do we connect to the RDBMS?
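The two ALTER TABLE statements from this part, written out (the column name deptno and the type INT are my guesses for the department-number column, which the video only names in words):

```sql
-- Rename the table; the underlying data moves with it
ALTER TABLE employee RENAME TO emp;

-- Append a new column (name and type assumed here) to the existing schema
ALTER TABLE emp ADD COLUMNS (deptno INT);

-- Verify: the output should now list five columns
DESCRIBE emp;
```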
On the Cloudera VM the command to connect is mysql -u root -p; it asks for a password, and here the password is cloudera. We are connected to the MySQL database successfully. Now I want to see which databases are available: SHOW DATABASES. There are multiple databases here, but the one that stores our tables' metadata is the metastore database, so: USE metastore. Now we'll see whether the emp table's metadata is available in this database. SELECT * FROM TBLS is the query to check your table-related metadata; whenever you want to check whether metadata exists, use this query. Here it is showing the table-related metadata: the table name, and whether the table is internal or external. Note that "managed table" and "internal table" mean the same thing. In real projects the admin team will generally not give you access to connect to the RDBMS and check the metadata, but since we installed the Hadoop software on our personal laptop, I am explaining how to connect to the RDBMS and check the metadata just for knowledge purposes. If you create one more table, then by default this query will also show that table's metadata. So this is about how to check the metadata.
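On the Cloudera quickstart VM the metastore check goes roughly like this, after connecting with mysql -u root -p (the column names below come from the standard Hive metastore schema, which the video does not spell out):

```sql
SHOW DATABASES;

-- Hive keeps its metadata in the metastore database
USE metastore;

-- TBLS holds one row per Hive table; TBL_TYPE distinguishes
-- MANAGED_TABLE (internal) from EXTERNAL_TABLE
SELECT TBL_ID, TBL_NAME, TBL_TYPE FROM TBLS;
```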
Now we'll discuss external tables. With external tables we have two concepts, with location and without location. First we'll create an external table, then we'll load data into it, and we'll see both concepts. Before creating an external table you need the data file, so open a second session (cloudera/cloudera) and cd into the data directory, where I have some data files. Whenever you want to create a table, you should first know which data file you have and which columns are available in it, so view it with cat employee.txt. Here is the data file: five columns in total, employee number, employee name, phone number, salary, and department number.

So: CREATE EXTERNAL TABLE employee1. When we created the internal table we did not specify the EXTERNAL keyword, and that is how Hive recognises which kind of table it is: if you give the EXTERNAL keyword it is an external table, and if there is no EXTERNAL keyword it is an internal table; by default Hive considers it internal. Here I specify the column names and their data types: phone number as BIGINT, salary as INT, department number as INT, and so on. Then ROW FORMAT DELIMITED FIELDS TERMINATED BY specifies which symbol your data is separated by; in our file each column's data is separated by the pipe symbol, so we give '|'. Here we are creating the table for the without-location concept.

The table is created successfully; we can cross-verify with SHOW TABLES, and you can even verify in the metastore that metadata for employee1 was created. But we have only created the table, we have not loaded any data, so a SELECT gives empty results. Now we load the data: LOAD DATA LOCAL INPATH, because my data file is in the local Linux operating system (pwd shows the folder where employee.txt is available), then the data file path, then INTO TABLE employee1, the table I want to load it into. The data is loaded into employee1 successfully, and the SELECT query now shows the data; you can also cross-verify on the HDFS back end with a Hadoop command that the data was loaded.
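Put together, the without-location flow from this part looks roughly like this (column names such as empno and ename are my guesses; the transcript only describes the columns in words):

```sql
-- EXTERNAL keyword makes this an external table;
-- without it, Hive defaults to an internal (managed) table
CREATE EXTERNAL TABLE employee1 (
  empno  INT,
  ename  STRING,
  phone  BIGINT,
  salary INT,
  deptno INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|';

-- LOCAL: the file path is on the Linux file system, not in HDFS
LOAD DATA LOCAL INPATH '/home/cloudera/data/employee.txt'
INTO TABLE employee1;

SELECT * FROM employee1;
```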
So that is the external table without location. But notice that while loading the data I used LOAD DATA LOCAL INPATH and then specified the file path. Why do we give this LOCAL keyword, when we can also load data without it, as LOAD DATA INPATH followed by a file path? What is the difference between the two queries? If you specify the LOCAL keyword, you are loading your data file from a local path, meaning your Linux operating system, so you must give a local file path. If you do not specify the LOCAL keyword, you are loading your data from an HDFS path: the data file is already there in HDFS, and you are loading it into the Hive table, so you must give an HDFS path, for example under /user/cloudera. So this is the difference: with LOCAL you load the data file from the local file system, and without LOCAL you load the data from HDFS into the Hive table.

Now we'll discuss the with-location concept. Before that, I will copy one data file to an HDFS location. First let me create a directory in HDFS: hadoop fs -mkdir /user/cloudera/employee2. The employee2 directory is created. Now I will copy one file from local to HDFS under employee2; for copying we use the -put command, giving the local source path and the HDFS destination directory.
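The LOCAL-versus-HDFS distinction, side by side (the HDFS path here is an illustrative assumption; the file must already exist in HDFS, for example uploaded with hadoop fs -put):

```sql
-- LOCAL: source is the local Linux file system
LOAD DATA LOCAL INPATH '/home/cloudera/data/employee.txt'
INTO TABLE employee1;

-- No LOCAL: source is an HDFS path; note that Hive moves the
-- file from there into the table's own directory
LOAD DATA INPATH '/user/cloudera/data/employee.txt'
INTO TABLE employee1;
```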
So we copy /home/cloudera/data/employee.txt from local into the employee2 directory in HDFS, and cross-verify that the data file is available under employee2. It is, so now we create the external table with a location. While creating the table I also give the file location, LOCATION with the HDFS directory. Note that we give only the directory, no file name: whatever files are under that directory, even 10 or 20 files, all of their data will be loaded into this table. And see, this one statement creates the table and loads the data at the same time. Earlier, with the without-location concept, we had two separate queries, one CREATE TABLE statement and one LOAD DATA query; with the location concept a single query creates the table and loads the data file as well. So here: CREATE EXTERNAL TABLE employee2, then the columns, then ROW FORMAT DELIMITED FIELDS TERMINATED BY, and LOCATION with the HDFS path and no file name. Whatever files are available under the employee2 directory, all of their data is loaded into the employee2 table. SELECT * FROM employee2 shows that all the data loaded successfully. So this is about the external table with-location and without-location concepts. In the next session we'll discuss the difference between internal tables and external tables. Thanks for watching this session, bye friends.
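The with-location variant from this part, as a single statement (same assumed column names as before):

```sql
-- LOCATION points at an HDFS directory; every file under it
-- becomes part of the table, so no separate LOAD DATA step is needed
CREATE EXTERNAL TABLE employee2 (
  empno  INT,
  ename  STRING,
  phone  BIGINT,
  salary INT,
  deptno INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
LOCATION '/user/cloudera/employee2';

SELECT * FROM employee2;
```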
Info
Channel: RR Digital
Views: 377
Keywords: Apache Hive, Hive tutorial, Hive tutorial for beginners, Hadoop tutorial for beginners, Hadoop Full course, Apache Hive tutorial, Hadoop course, RR Digital, apache hive vs spark, Big data hadoop tutorial, Apache Hive tutorial for beginners, hadoop tutorial, big data, big data hadoop, hadoop, big data course, big data tutorial, hdfs, hadoop spark, hive, spark hadoop, hive external table, hive internal table, difference between internal and external table, data loading in hive
Id: ye8Rpa_9xfA
Length: 20min 40sec (1240 seconds)
Published: Thu Oct 28 2021