9. show() in Pyspark to display Dataframe contents in Table | Azure Databricks | Azure Synapse

Captions
Hi friends, welcome to the WafaStudies YouTube channel. This is part 9 in the PySpark playlist. In this video we are going to discuss the show function, which is available on the DataFrame object in PySpark. This function helps you display DataFrame content in a tabular format, so let me show you that practically.

Let's go to our browser, where we have already opened the Databricks workspace. If I navigate to Compute, I have already created one cluster here. So let's go to Workspace and create a new notebook. The name I will give this notebook is maybe "show notebook", the default language is Python, and it is attached to our cluster, so all good. Let's create this notebook. Now our notebook is created and it is attached to my cluster.

Here let's create one dummy DataFrame, that means a DataFrame with some hardcoded values. If you have watched my previous videos, by this time you already know how to create a DataFrame with hardcoded values. Please watch all the videos in the playlist in sequence order, because each video depends on the ones before it.

Here I am declaring a variable called data, and inside this variable let's store a list of items. Every item in this list I am going to store as a tuple: ID 1, then a value, for which I am entering some dummy string. Let me copy this entire item, from here to here. So this is one item, then the next item, then the next item; the items should be separated by commas. Let's have four rows in total: the first row is ID 1, the second row is ID 2, the third row is ID 3, and the fourth row is ID 4. Let's also change the content instead of having the same thing everywhere, so I'm just typing some value here, and here also I am typing some value, and then let me remove this fourth value also and type
something. So we have a data variable which is a list of tuples, and each tuple contains two items. Now let's create another variable called schema. This is again a list, and it defines the column names, as you know if you have already watched my previous videos. Here maybe id and then comments are the column names, so let's use those.

Then the spark keyword, in Databricks or even in Azure Synapse, gives you the SparkSession object. When I type a dot and press Ctrl+Space I get IntelliSense, and there is a method called createDataFrame. I don't see it at first, but when I press Ctrl+Space it autocompletes. For its data parameter I can supply the data variable I created, and there is another parameter, the schema parameter, for which I can supply the schema variable I created. This code is going to create a DataFrame, so let's store that DataFrame object in a variable called df.

Now on this df, when I type a dot and press Ctrl+Space, there is something called the show function. This show function displays the data in a tabular format. So let me execute this with Shift+Enter. When I execute this code, it creates the DataFrame and also displays it.

If you observe closely, it is displaying the DataFrame, but there is one small catch. The content in the first row is very lengthy, but here we are seeing only a few characters. The reason is that the show function by default shows only 20 characters of a given column. If your column value is longer than 20 characters, it is truncated to 20 characters. That is the first default behaviour. Not only that: right now you can see our four rows, but by default it shows a maximum of 20 rows. So if you have more than 20 rows, you need to explicitly say how many rows you
want. I will show all of that practically now.

The first thing is how to show more than 20 characters here, since it is only showing 20 characters. To do that, let me go to another cell. On the df there is the show function, as I said, so let's pass this entire thing, df.show, into the help function to see the documentation of the show function. Let me hit Shift+Enter to execute this cell. Now if you see here, the show function takes a parameter called n, a parameter called truncate, and a parameter called vertical, and the definition of every parameter is available here. The n parameter takes a number that defines how many rows you want to show. Similarly, truncate takes a Boolean value, or you can supply a number greater than one, which defines the length to which the column content is truncated. The vertical parameter again takes a Boolean, which defines whether to print the data vertically or not.

So let me show you this practically. Let's close this cell. Now, for example, I want to show the full length of the content; I don't want it truncated. For that, in show, let me hit Ctrl+Space, and for the truncate parameter we supply the value False. If you supply False, it won't truncate any column content; it will show it in full. Let me hit Shift+Enter and show the execution. Now you see we are able to see the full content.

But what if I don't want to show the full content, and I don't want to show 20 characters either; maybe I want to show only up to five characters or up to eight characters? How to do that? For the truncate parameter you can supply an integer value for the length you want to truncate to. If I pass 8 and press Shift+Enter, it executes, and now you see it is showing only eight characters. You can count them: one, two, three, four, five, and then three dots, so only eight characters in total are what
it will show you. Not only that: here it is showing four rows, and maybe I want to show only two rows. How to do that? Let's keep truncate as False so the full content is printed. There is a parameter called n, and the value of this n parameter defines how many rows you want to show. I want to show only two rows, so let's pass n=2 and hit Shift+Enter, and you will see only two rows. For example, if your DataFrame has more than 20 rows and you don't supply any value for the n parameter, it will show only 20 rows. That is the default behaviour.

Now let's let all four rows be shown here. What if I want to show this data in a vertical manner? For that we need to supply a value for the vertical parameter, so let me pass vertical=True and see how the display changes. Observe closely: right now it is showing in a tabular format, with the id and comments column names laid out horizontally. Now the column names will become vertical. Let me hit Shift+Enter and see the results in action. Now it says record 0, which means the first row, with the id column and the comments column in a vertical fashion and the values of those columns next to them; similarly for the next record, the id column and the comments column. So vertical means it will just show the data in a vertical fashion.

This is how the show function of the DataFrame class works. I hope you got an idea of how the show function is used. Now let's go back to the presentation. This is what I said here also: if you want to show the full content, you need to use truncate=False, which gives the full content of a column, and you can also truncate the column length to the desired number of characters, like this. Not only that, if you want to control how many rows to show, use n, for example n=1, and as I said, by default it will show only 20
rows. If there are more than 20 rows in your DataFrame, the rest won't appear, so to control the number of rows to show, use the n parameter. Not only that, you can also display the DataFrame in a vertical fashion using the vertical parameter. So that's it for this video. I hope you enjoyed this video; please like it. Thank you for watching. Please subscribe to my channel and press the bell icon to get a notification whenever I add videos. Thank you so much.
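The notebook steps described in the captions can be sketched roughly as follows. This is only a sketch under assumptions: the sample strings are made up (the actual values typed in the video are not in the captions), and `build_df`, `show_variants`, and `truncate_cell` are hypothetical helper names introduced here for illustration. `truncate_cell` does not call Spark; it is a plain-Python imitation of the shortening rule as I understand it, included to explain why `truncate=8` displays five characters followed by three dots.

```python
# Hypothetical sample rows: the strings typed in the video are not in the captions.
data = [
    (1, "this first comment is deliberately longer than twenty characters"),
    (2, "a second, shorter comment"),
    (3, "third row"),
    (4, "fourth"),
]
schema = ["id", "comments"]  # column names as described in the video


def build_df(spark):
    """Create the DataFrame; `spark` is the SparkSession the notebook provides."""
    return spark.createDataFrame(data=data, schema=schema)


def show_variants(df):
    """Call show() the five ways demonstrated in the video."""
    df.show()                               # defaults: up to 20 rows, long strings cut at 20 chars
    df.show(truncate=False)                 # full column content
    df.show(truncate=8)                     # long strings cut to 8 chars
    df.show(n=2, truncate=False)            # first two rows only
    df.show(truncate=False, vertical=True)  # one block per record


def truncate_cell(text, truncate=True):
    """Plain-Python imitation (not Spark code) of how show() shortens one string."""
    width = 20 if truncate is True else int(truncate)
    if width <= 0 or len(text) <= width:
        return text          # truncate=False (width 0) or short enough: unchanged
    if width < 4:
        return text[:width]  # too narrow to fit an ellipsis
    return text[:width - 3] + "..."


print(truncate_cell(data[0][1], 8))
```

In a Databricks or Synapse notebook, where `spark` is already defined, `show_variants(build_df(spark))` would run all five calls one after another, and `help(df.show)` prints the parameter documentation mentioned in the video.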
Info
Channel: WafaStudies
Views: 13,937
Keywords: PySpark for beginners, PySpark Playlist, PySpark Videos, Learn PySpark, PySpark for data engineers, dataengineers PySpark, PySpark in Azure Synapse Analytics, PySpark in Azure databricks, Understand PySpark, What is PySpark, PySpark in simple explanation, PySpark Overview, synapse pyspark, show dataframe in pyspark, df.show(), df.show(truncate=False), dataframe truncate column content, dataframe show all rows, dataframe show vertically, dataframe show full column content, spark
Id: 9VhitO4KFv0
Length: 9min 12sec (552 seconds)
Published: Wed Oct 26 2022