Python Excel openpyxl - Reading Excel files with openpyxl

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hello my name is xander welcome  back to the python excel series   in this tutorial we focus on reading excel  files with open pi excel in this tutorial   we'll look at five sections and we build  up towards getting our data from the excel   workbook working with cells or selecting  data from cells and multiple values   from cells and then we go ahead and we  convert that data into python data structures so we'll make the assumption that you already  work with python or at least you know how to get   and start working with python so i do have  a worksheet here i've got two sheets uh just   some data it doesn't matter you can use your own  excel sheet for the examples in this tutorial   you will find this in the link in the description  the video description if you want to download this   as well as all the notes  associated with this tutorial   so i'm going to work inside of a  virtual environment so just go ahead   and get that started so here i'm utilizing  windows i'm just going to start a new virtual   environment and i'll get straight into that  and activate that so that we can go ahead first   of all and pip install and we're going to need  obviously uh open pi excel so pip install that now of course if you are new to open pi excel  do apologize haven't explained what it is so   if you haven't seen the first tutorial you won't  know that open pi excel is a python library   uh which has all the tools readily readily  available for us to hook into so we can start   reading and writing and performing actions on our  excel file sheet so this is what we're going to   be utilizing this is what we've just installed so  let's go ahead and have a look completely optional   but i'm going to go ahead and create a new file  called example dot pi i'm going to work from here so open by excel a library a resource we can  hook into so let's go ahead and first of all   let's uh from from open pi excel let's go ahead  and import the load workbook so this is going   to allow us to load workbooks there's no surprise  there so let's go ahead now and utilize this and   we're going to open up the excel file sheet you  can use file name equals you don't need to so my   filex my excel file is called example.xlxs xlsx  now it is in the same folder structure here so i   don't need to do anything special here other than  typing the name and the file extension of this   file so this will work with other file extensions  go ahead and have a look at the repository   and the documentation you'll see that we can work  with different file types that might be worth   mentioning okay so now we've loaded our workbook  in we can go straight ahead and let's just go   ahead and print the workbook so we know that it's  worked okay um let's go ahead and run our file   and here we go so here we have an object that's  been returned suggesting now that we have actually   prepared or we've loaded our workbook in and  we can now start working with our workbook   if you head back to the python package index page  for open pi excel head over to the home page that   takes you to the documentation now if you have  a look through specifically at the load workbook   function there are a few different optimized modes  here that we have for example read only and write   only and that might be um some use for you at some  point it'd be well worth having a look through   the different parameters for the load workbook  when you have a little bit of time so now we know   a little bit more about loading the workbook a few  different parameters that we have available for   example read-only write-only data only for example  so let's go ahead now and actually perform some   simple operations here getting used to um some of  the information that we have at hand now so let's   go ahead and print some things here so from the  workbook we can access or we can view for example   all the different sheet names we might have  available so sheet one sheet two so our excel file   has two sheets uh the name of those sheets is x is  sheet one and sheet two so we saw that this excel   file sheet or excel file has two sheets sheet one  and sheet two now if we were to start working here   with this workbook likelihood is that we're gonna  be working with the data that's currently active   or sorry we're gonna be working with the worksheet  that's currently active so let's go ahead and use   wb again and have a look at active to see what  is the current active worksheet so that's sheet2 so let's just go ahead and change that so wb  dot active and then we can now change this to   another worksheet so how this is going to work  value starting from zero left to right so this   sheet here is going to be a zero integer zero this  worksheet is going to be references worksheet one   and so on and so on if we had more worksheets  so if i were to change this to zero for example   we see that the sheet one will be now activated  there we go so the current active worksheet that   we'll be working on working on would be sheet one  so i can also grab the worksheet so for example   if i just wanted to do some operations on a  different worksheet sheet i could uh simply   just refer to it by name for example let's  go sheet2 for example and then from here   or what i can now do is i know basically i'm  working from sheet2 so let's just output that uh some apologies that's a so sheet title sorry  so that will output the title referring to now i   can then access that sheet through this sheet  variable here so if i now were to output that   it now outputs the sheet to title and then  from here i can start grabbing information   utilizing the sheet variable name here if i wanted  to get some data from some cells within this sheet so that's really the principles there of loading  the worksheet hopefully it wasn't too quick   and then working with sheets  we saw that we can select or   we could at this point select multiple  sheets if we wanted to refer to them   through a variable or we can change the  active sheet that we want to work with so now let's take a look at retrieving  cell values from our worksheets no doubt you've used excel before here you  probably already know that we can refer to cells   or data in cells through for example this system  here at the top here we've got columns a b c d f   etc and on the left hand side here we have numbers  so if i wanted to refer to this cell i could use b   three that would then refer to this cell here so i  can utilize that to extract data from my worksheet so here we've selected the sheet1 sheet1 passed  it into variable and now we'll go ahead and select   this cell so this will return the cell object so  for example b3 that's what we're trying to select   so let's have a look at that you can see  that's going to return the cell object here   cell sheet b3 now if we wanted  to actually print out the value   let's just go ahead and extend here and include  value and then that will then actually return   the value in this case it's a string uh the  name of the book a gentleman in moscow a novel so just moving back here let's go back to  returning the object here so this object is going   to hold a lot of information which we can utilize  potentially so let's just go ahead and we'll store   this so let's go for uh new coordinates uh  equals sheet let's just grab for example a1   okay so we're going to put the a1 object inside  of this variable here called nc so what we   can now do for example is take nc we can gauge  some more information so obviously one of them   being the value so we've grabbed the cell a1 now  we've printed out the value we've got the value   from this cell object so what we can also do here  because we know that a1 here is in the corner here   this id let's just go ahead and just uh  grab some more information so for example   if you wanted to get information about the  row so this is row one which is uh correct   um we can do the same thing here  for example we can see what column   this is in so this is in column one row one and  for example we can get the coordinates from this   in addition to that so coordinate and that  would give us in the coordinate of this cell so you can now see that we can use the coordinates  to grab information the cell object and then in   that object we can then extract more information  so let's move on slightly and have a look at a   different approach to selecting data so we  can select data utilizing the the shell so   let's just go ahead and let's just grab this  again sheet1 so we can now refer to sheet1   through sheet and now we can have a look  or select data utilizing cell so from here   we just need to specify what cell we want to  access so for example we want to select row one   and then column one so that's going to return  the cell object again now we can extend this so   outside of these uh parentheses here  so let's go ahead and type in value   so we can grab the value as per normal and  all the other information that we've seen   so it might not be too often you are selecting  individual cells but it just um adds to the   story here we're building up our knowledge slowly  moving now on to retrieving multiple values so   obviously it's probably more likely that you want  to retrieve multiple values or selection of values   from your sheets so let's have a  look at some of the basics here   so we keep the sheet here active we use print  again just to output so we know that for example   earlier we learned an individual  cell we can access the information   utilizing these coordinates a1 etc so we  can also for example if i just selected a   that would actually give us the output of the a  column so that's also something again selecting   a range of uh objects here i can select the  column through the reference in this case a   so in that respects what i can now do is move  this forward slightly and start kind of slicing   so if you're already familiar in python 2 slicing  this is going to come this is going to be second   nature too but of course if you aren't and you  are working with excel data in this manner it'd be   well worth you kind of stopping at some point or  later on just having a look at python uh slicing   or slicing data with python just those principles  although they might not map exactly across here   it'd be well worth you having a look at  that so for example here if i use a colon   so remember a is referring to the column  name so a c so basically i'm going to   extract here as you'll see here columns  a b and c so here for example i have   um all the cells from column a a1 a2 h3 a4 this  here we can see bs b1 b2 b3 and here we've got the   c's so that gives me the opportunity  to select different different columns   so it's the same principle here for if i wanted  to select rows so for example one is referring to   row one here you can now see i have a a1 b1 c1 so  now i'm returning all the cells from row one etc   so same principles here i can select a range  so one two three so there i'm going to return   three rows so a b a b and c yep here in this case  uh obviously it's one two and three so a a1 b1 c1   a2 b2 c2 and a3 b3 and c3 so three rows so you might prefer and it can be a little bit  more selective we can utilize the yittaro's   function and the iter calls function so what we  can now do let's go ahead and just run a for loop   so for row in sheet dot and here we can use uh  it's uh this array function so what we can now   do is just kind of specify uh for example the min  row so the minimum row is one and then the maximum   can be for example two so i'm going to select  uh row one and two and then i can kind of do   the same thing again for the column so  min i mean call equals let's go for it   for one and then the max column we'll just go for two okay so what i can  now do is just go ahead and print that out print row so let's have a look  at that and you can now see that   uh has no attribute it's a row i've  done something wrong here at rose there we go so here i can take a selection from  my workbook so remember here i'm selecting rows   so you can see here that i've got my first row  a1 b1 and then a2 b2 so i can do the same thing   again here with columns if we prefer working  with columns so the same principles apply so whether you select by columns or rows it's  just giving you that option depending on what   you want to do with the data now you can can go  ahead of course if you want to if we for example   were to just select sheet rows that would select  all the rows in your workbook and the same thing   for column so here we could uh lowercase  columns we could select all the columns   so that was just returning all the  data again by columns or by row so what we can do here in actual fact if we prefer  let's just go back to it's a rows just to control   z that there are some other parameters that we  can add here one of them being values values   only so that equals true   so that's going to return the values instead of  the cell objects here so we can retrieve the data so hopefully now we've got the general grasp  of grabbing data uh consecutive data from our   worksheet now let's convert that  into a dictionary a python dictionary   okay let's go ahead and create a new dictionary  we're gonna call books equals empty dictionary   okay um let's now go ahead and create a  full loop save for row in sheet and dot   it rows so i'm going to use it arrays again  function so in here we're just going to define   what rows you want to select now i'm going to  select from row 2 because i don't want the header   the header information is in row  1 so i'm not going to select or   define a max row here and we'll do the same for  the column for now so call equals min cole equals   one so starting from column one and then i do want  the values only in this case so values only equals   true right so that's selecting my range of data so  now what i want to do is i want to set up my index   so book id equals row zero so get the information  so this here is going to select for example   the index number so one two three are the row  number so let's go ahead and now uh book uh let's   say book equals now let's just uh define our  information we want in our dictionary so title   title data so that's going to  be found in row in the first   or the second cell in the row so the first cell  in the row is zero um that's going to be the   if we just go back to our data that's going to be  the id here so here we referring to cell kind of   or information one so that's going to grab this  data and this is going to be two and the author   so we get the title and we'll go ahead  and grab the the author it's a row in and two so that's going to grab  the all for information we're gonna   need a comma there so that just grabs out the  information and then we go ahead and say books and update our book dictionary book id  and then the index and then the data   so that's going to be book right so  let's go ahead and print this out   so i'm going to use let's just use json dumps  um so books let's give it a nice uh indent let's go for three okay so  just make sure you import json   so that's just to kind of output the data  and let's go ahead now and give that a go and there we go so we've just selected  the first three rows of our of our   worksheet you can see they've got the index  one two and three the title and the author so i hope that gives you an initial insight  there's probably lots of questions now being   or how do i select a a range of information  that's not adjacent to each other for example   so those type of questions i will answer those  type of questions in a later tutorial um i just   wanted to get the kind of the basic information  out there and to provide the scaffolding for us   to now move forward any questions leave them in  the comments thank you very much for listening   hopefully this was useful in some sort of way  and i hope to see you in the next tutorial
Info
Channel: Very Academy
Views: 8,182
Rating: undefined out of 5
Keywords: python excel, openpyxl, pandas to excel, pandas excel, python xlsx, python and excel, reading excel in python, read_excel(), read_excel
Id: BZUclSB_DLk
Channel Id: undefined
Length: 20min 0sec (1200 seconds)
Published: Mon Aug 16 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.