Python Download File from SharePoint, Modify and Upload Back to SharePoint

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
what up guys what up what up guys I am Lou and uh got another video guys another another video just from a request from a viewer and ultimately guys we're gonna be talking about how do we download a file let's say from SharePoint in this case gonna be SharePoint but this really applies to any other source doesn't matter download a file from SharePoint we're gonna save this in memory we're gonna modify the file and in my case I'm going to use pandas to modify the file after I finished modifying the file then I'm going to go ahead and take that updated version and save it back to SharePoint so I'll be doing is pulling the file that we want we're going to modify it and then push it back and not everything's being done in memory as well so it's not like we're downloading it saving it locally then we got to read it put it in memory to modify and then we save it locally then upload it again we're not doing all that so this method would work would be very ideal especially if you want to publish this code in a um you know AWS Lambda or Azure functions or things of that nature right so but but guys but but before we get started hit that like button give me a follow give me a follow as well guys on Twitch so I've been putting some effort on Twitch um that channel I'm going to be talking more Hardware related Tech so in this Channel programming on my twitch channel will be more Hardware I'm going to be reviewing uh two mechanical keyboard this Friday um I'm gonna I'm reviewing different stuff headsets mouses keyboards gpus motherboards the list goes on right so I'm very Hardware more specific but go ahead and give me a follow there man I appreciate it uh let's get started all right guys so one of the things that um so we're not going to start from scratch you're right because I already got code that's going to do a majority of what we're trying to do so there's no need to start over from scratch so I'm going to be using this Office 365 API file that I put together you could access this through my uh GitHub account so go ahead and find I am blue coding and there's a repo called python SharePoint Office 365 API so if you take a look at it you'll you know pretty much this is what I have you can just copy and paste this if you need help on how this should be configured your environment variables go ahead and look at the video here's the video right here if you go to my channel I am Lou there is a video that is called python download SharePoint files part one okay so I have a multi-part series you can tell I got part two part three of other stuff but this one is downloading files from SharePoint pile one this part goes very detailed when it comes to the environment variables how to configure kind of how to get that configured how the the site URL looks compared to like my site um the site name all of this stuff right good very detailed so take a look at that guys if you're not familiar with it and that video will kind of give you some guidance over that piece nevertheless if you are familiar with all of that then just literally copy and paste this in your project this is what we're going to be using um but then of course I'm gonna have another file this is where I'm going to be downloading I Just Called It download and uh re-upload so we're downloading we're going to modify it and we're going to literally push back that's that's what we're doing here right so for this if I go back to my um my repo I have if you go to the project under examples click on download files pretty much I'm going to take this but I want to tweak it right now there's a lot of stuff in here that I need because it's exactly what I want to do right so let me go ahead and copy it but I'm not going to use it as is as you could tell if you take a look at this it is saving it I don't want to save it right like we don't want to save it nowhere as we downloaded we want to keep it in memory and then do whatever we need to do with it and then push it back so there's no need to save it anywhere and you could tell I have arguments here where it says argument two it's pretty much doing a um local remote folder destination well this is where I want to save it locally or I'm not saving it locally so this could actually go away right so now we have for arc2 then R3 so I do need to change my values to argument 2 and argument three so these I do want right you could tell first argument is SharePoint I'm going to specify the SharePoint pass so here's one of my examples YouTube 2022 and then of course I have file name this could be set to none um or you could provide if you want a specific file that you want to download right you don't want that on all the files maybe you just want one file so in our case we're actually going to use this because we didn't want to process the one file specifically um and then of course you have patterns you know if you are let's say you're not going to download one file but you're looking at a photo or mini files and and you're trying to access a specific pattern you could do it that way as well and it will give you the list of files of the pattern but again we're going to focus on one file in our case we don't need the save function because we're not saving nothing this is going to go away you can tell we have um you know error here but we'll fix that but ultimately this is going to go away because we we're not going to save it nowhere right and all of this could stay this could stay this is all normal so all of this could stay right these are just some tweaks that we're making um there's a few things that we're going to have to bring in so in my case I'm going to be modifying um that the go here that I'm going to do is I want to download an Excel file specifically from SharePoint so because I'm doing it and I'm going to put it into a data frame right pandas data frame so because of that make sure to install pandas I already have it installed in my virtual environment um and then I'm going to go ahead and bring that in so I'm going to call this import uh pandas as PD all right so that's number one another thing I need to bring in here that this this you know the code that a copy doesn't have I'm going to use the io uh package and there is a a bytes IO class that I'm going to be utilizing and ultimately that's where as I bring in my content I'm gonna put that I'm gonna end up packing that into in memory and to do that use bytes i o class so that's kind of what I'm going to be doing here and I believe this is that guys I believe this is that well I don't need the pure pass because I'm not reading nothing right I'm not reading no local directories or whatever so that could go away this is it so now let's go ahead and uh let's add what we need right that's the other thing now we need to what what are we going to add so the first thing that I'm gonna do is since we know we want to get files well we have a function already here where again because we downloaded it from the download.py file so there they function here to get files and you could tell it's calling the SharePoint class which is up here and then it's calling the download files function and then ultimately it's able to download it right and I'll get back my file object so that that's all good that's gonna stay but I now I do need to upload files right as I download files I'm going to be modifying it then I'm going to be uploading them back so let's go ahead and create that class right quick and we're going to call this uh upload um upload file that's what I'm going to call it right upload file so in this case this is going to take a few arguments it's going to be a file name so this is the actual textual file name whether whether it be sales report dot x y uh uh ethics right is the Excel file extension because I'm doing with files the original CSV it'll be you know sell the report dial CSV you know whatever that name is that's what we're going to have here with the file extension as well uh the next thing that we're going to specify is going to be the folder name or the folder name is where where do I want to upload it to right in our case we're going to upload it to the same folder but it doesn't have to be like we could specify a whole different folder to upload it to but again our case same folder and then the final thing is going to be the context this is going to be the actual data like the file data itself that we're pushing on over and since the SharePoint class that I have here I really have a function for uploading which is called upload files I'm going to end up calling that function so in this case it would be share point oops uh SharePoint upload file you could do it in chunks but in arcade we're going to stick with this upload file like the whole file all at once then I'm going to specify file name all right and then folder name um um and then the context it just happened to be that these these are the same name well uh yeah not context so if we take a look at this this function you could tell we have file name photo name context happening the same name it's a coincidence but it doesn't matter if there's the same name because I'm passing them in the same placement right so even if these this would call new file name um folder only or maybe data again it's it's in the right placement so that's all that matters uh really this is it though guys this is all in the past here the function this function will upload files back to where they need to go which is pretty easy right again taking out what we already have when you're doing some tweaks to it that would get the data back to SharePoint so we have this function to download a file for SharePoint we have the file here to upload the file back to SharePoint so now what we need to do is um we need to create a function to let me make it a bit bigger because I think it's kind of small there it goes so now what we need to do is create another function um and this function is going to be more so to uh modify the file right do something with the file there's some kind of manipulation that we're going to do to it so for now I'm going to call it modify file and this would be whatever action you're doing right if you're downloading a file maybe you're doing some sort of crossed reference of some sort where you're you're iterating over the data by email and maybe you're cross-referencing that email by another data set to retrieve back some kind of identifier and then maybe you're going to add the the identifier to new column I mean again the list goes on or what you could do with it where you're gonna do something basic to kind of give you the concept and the idea so in this case we're only going to have one um one argument which is file object so again if I go back see where we can get our get file we get back this file object well this file object is going to get passed into this function which will come out of five file that's pretty much what we're doing here right so this is where as we get back that context from um SharePoint this is where I'm going to now utilize my uh bytes IO class and I'm gonna end up passing in this file object so by me passing it to the class ultimately what it's doing is now I'm I have this data set this data object that I could read now like I could pass this in through openpy Excel if I want to like read it to that package in my case I'm going to read it through pandas which is ultimately using open py Excel so keep in mind if you're usually doing something like what I'm doing like with pandas and reading Excel files that is a package that you're gonna have to install as well so if you're not familiar with open py Excel you know that is a package that you're gonna have to install um but that's kind of what I'm doing here first so once I do that then what um what I'm gonna end up doing is before I even read the data I want to be able to get a list of worksheets in the file right so this is the kind of this is the key uh I'm gonna call this worksheet list and this is going to call my panda and there is an Excel file class which ultimately takes in a data set like takes in an object pretty much um in this case it could be the file directory pass like if you're working locally it'll be your C drive you know your folder name then your file name right your full pass pointing to your folder your file and ultimately this will read that file in our case where we we're dealing with a an actual object um a um that we're going to read it we pretty much deal with in-memory object so it's going to react the same way they're going to read it no different if you're if we'll be reading it locally and then we're going to call a Sheet's name so Sheet's name is going to ultimately give me back all the sheets in the workbook uh in the list which is why I call this worksheet list so if you have a Excel file that has I don't know maybe 10 sheets but you only want to modify one of them then you're gonna have to specify which one you're going to modify um in our case um we're only doing it with a workbook with one sheet right so just kind of FY I'm gonna make another video later on how to handle workbooks with many worksheets that you're only going to modify one because it's slightly different but I'm not going to do in this video that's all different video all right so once I specify to be able to get my worksheets list now what I could do is I could go ahead and read my um my in-memory data um into a data frame in order to do that I'm going to call this data frame and it will be panda uh read Excel and then see where we have our i o file pass or uh buffer bytes again it takes one or the other it takes a a file pass slash bytes which is ultimately what we're doing here in my case it's going to be we're passing a bytes object which are called Data so I'm passing that in one of the other arguments see what's the sheet name right so for Sheet name I'm going to call my worksheet list index 0. I only have one sheet in here so I'm specifying the first sheet so for me I could just leave this blank and that will be fine because by default it's going to read only the first worksheet anyways right but um let's say you weren't you didn't want the first sheet maybe you wanted sheet number two that you want to specify maybe index number one right because again index 0 is really the first sheet and then one is the second sheet you would have to do that so I'm just doing it more from um just kind of give an example if you want to pull other sheets so this is going to read the first sheet put the information into a data frame which is exactly what we want um do some then here I'm gonna I'm gonna I'm gonna um I'm gonna call this apply mod of um apply changes so this is where I'm going to do something to it right I'm going to be changing something so in my case I'm going to create a new column I'm just going to literally call this new column for testing so we can see what it is and then I'm going to call this um testing new column that's our something basic again in your case this would be completely different right maybe you're adding two columns together you're getting a total uh maybe doing some logic based on if it's greater than 30 you know change the value to this or whatever right the list maybe you you want to iterate and and change all of your values to uppercase values instead of lowercase values I mean again the list goes on to what you're trying to do but you're trying to give the concept so once I do that something simple right nothing crazy we're just doing a basic change um this so the next step is um this step will uh now create the Excel object in damn it man I don't know cannot spell today in memory so now we're going to be the next step is pretty much what as we make changes as you could tell what we did so far we read from a SharePoint site we have our contacts we pretty much package this context now into a a um a bytes object and then we're now able to read it into a data frame now we modify the data frame well now we want to take that data frame and now we want to save that back but before we save it we can just save it directly to SharePoint you know the way like normally to save it you would just do like a save and specify local directory and it would just save there and that's you know so straightforward but now we want to save back to SharePoint so it's not that straightforward what we're going to have to do here is now we're going to create because again think about we have a data frame object we need to create that data frame object into an Excel in memory object because we need it in memory right we want it to be saved in memory not locally and once we do that then we could take that object and now push it back to SharePoint right so that's kind of what we're doing here so in order to do that I'm going to create this this is going to be this is going to call be called out put object that's what I'm going to call it and now I'm going back to my i o so IO bytes and in this case it's gonna be blank so this is just a I'm creating output object that's empty there's nothing in it right that's all I'm doing it's empty for now all right so that's number one um in Panda there is something what's called Excel writer right so we're going to be using calling that class and uh I know to me this is to create Excel objects is what we're ultimately doing here so we have a in-memory object to begin with and now we're going to be creating an Excel object so what I'm going to do I'm going to call this writer and this is going to call it PW Excel writer and I'm gonna my um my see where you could specify your pass whatever but in my case I'm specifying a bytes object and that's what I'm going to be specifying in here which is called output object okay so now that we have that done uh the next thing is we're going to now write to our we're going to export out our data frame so the way you would normally do it in any scenario right if you're going to do it locally it would be like two Excel and then see where it says um it will be 2XL and you kind of specify your pass and ultimately that's pretty much it right you know you're done but in this case it's not we're not writing it to some sort of directory pass we're writing it to our um our Excel writer which ultimately our extra Rider object happens to be an in-memory object so I mean hopefully that's it's not too confusing but it just kind of let's kind of walk through it right uh the next thing is see where I'm going to call it's called index oops index zero so this is key if you don't specify index 0 what that means is it's going to add a column and that column that it adds is going to be for to show like the sequential index of all of your rows so I'll show you an example well I guess I'll try let's say you have you have a worksheet that you have First Column is name second column phone number and then it's age well if you don't put index 0 when they export it out you're gonna have those three columns but column the very first column though in your Excel export it's not going to be name it's going to be um index and it's going to show a sequence of one two three four five pretty much listing them but an actual is taken up which is going to be placed in column A in Excel so if that's something that you want by all means you could leave it there like it will give you that most people I'm gonna assume don't want that so this is where you specify index 0 to remove that get that removed so the next thing too is Sheet's name if you leave it blank what happens is it'll export it out and save it but you're going to save it as default which is sheet1 well in our case we want to save it back to the same name that we got it right again maybe you don't but if you do this is where I'm gonna have to now call my worksheet list index 0. so again this worksheet list is going to give me back a list of worksheet names the worksheet that we're going to be dealing with only has one name anyways right so there's only one name but nevertheless that same name whatever that name happens to be I want it to whenever we save it you know um to excel I wanted to have the same sheet name if that what you want then this is what we would do if you do not want that you could just remove it leave it blank and the default will be shoot once just kind of keep that in mind and once we do that next I'm going to do will be writer save you want to save the changes and then it's going to be output seek zero so this is what we're going to specify 6 0 for our our bytes object um and this is it so this is it what we just did right now we have this output object and that object which is in memory now think of it as it has our um Excel file in there with the same worksheet name with all of the data and all that stuff has everything in there so now we could go ahead and view that object output you know output object um to push back to SharePoint ultimately right um so now see how we have this function which we call upload file so now I'm going to end up calling that upload file and then again it wants a file name in my case I'm going to create I want it to be a unique file name for testing so I'm going to call it um I don't know let's call it what uh testing modify Excel file then the file extension because this is what we're testing out so I'm just again and then calling it that if it's going to be the same name that you had you could do that as well but I'm just calling something different so you kind of see the two right um then the next thing is going to be folder I wanted to save it under the same folder that we pulled the file from if you wanted to go into a different folder then you got to specify where in SharePoint you wanted to go to but in our case same folder folder name so keep in mind this folder name it's actually coming from up here when it's one of our arguments folder name and then the last thing is going to be um output object right which is going to be this but but keep in mind we have to read the context of it as well so we're going to call this read method that's that's part of this object so be the object output dot read so this actually reads the data now it's able to grab all the contacts and everything and that's what's going to get pushed so by us calling this function that we call upload file and then ultimately again it's passing everything into our SharePoint upload file class which is over here and this will know where to save it to you know ultimately pass the context and do all of that um I like to separate my stuff by functions which is why I couldn't easily added this to here but again it gets a little bit more tricky because the purpose of this function is to modify the file it wasn't designed to modify and upload a SharePoint nevertheless though not just the way you know how I've been I've been I've been the more that I have been trying to clean up my code and keep them separated and more manageable you know that just kind of the way I have done it but this is it guys this is it like um oh my bad this is not it so see we're having to get filed function well we need to call this modify file function down here so this is now where we're going to call this it's going to become modify files then we're gonna pass in our file object and the argument um and this is it as we pass this file context over to this function this function now is going to take it you know boom uh now we have it where we'll be able to read it the reason why I'm reading this is to be able to get the sheets name so that's kind of that's the per this is the only purpose why I'm doing this to be able to get the sheets name that's it no other reason I get in the form of a list then this one this is actually going to read that file and put it into a data frame that's what I'm doing here then of course I apply my changes whatever it is that I'm doing right whether it be modification adding whatever it is you do it you know I did it here then once once I finish with that then now I'm going to create another empty object that's called output obj that's it's um it's going to be in memory object but it's empty then that's where I start adding stuff to it which I you know using the Excel writer then uh and then ultimately I start writing to it which is 2XL it starts writing to this object and then you know then I'll save it and then ultimately yeah that's it I'm done and then once I uh send this to my upload file my upload file function which ultimately is doing is passing this information over to my SharePoint upload file which is this function over here it's able to um save it under this name this is a folder path where I want to save it to you then it's going to read the context that we have in memory so again everything I'm doing here is all in memory downloading file it's in memory modifying it put it back in memory and then it's sending it back to SharePoint so now let's look at the file that we're going to be testing out so if I look at my terminal so if I look at uh you know what you can see it there it goes so it's Python and I'm calling this this file my first argument here is in SharePoint so let's take a look at it right in SharePoint so if we go to SharePoint we have data so documents data 2022 sales oops cells and then it's going to be this file right here I know the name says CSV it's not a CSV and you could tell it's an Excel file so if I click on it let's just take a look at it this is what we're dealing with look at our tab you know we have our tab name here and then we have our ID column title name so on and so on right pretty straightforward and then what should happen we run the process should save a new file in here under what name did we call it testing modify Excel this is a new name that it should show up if things work successfully and it's going back as you can tell this is the second argument is when I'm specifying the specific file file that I want to process and then none is because if we look at argument three it's saying patterns well I'm not looking for files and different patterns now keep in mind what I have here for my code and More Design to modify one file specifically and one tab specifically there's a lot of different scenarios where it could be completely different you know maybe you want to modify a worksheet a workbook that has mini sheets and maybe the sheet that you want to modify is sheet number 10. that would be slightly different maybe you're dealing with a scenario where you have 50 filed in SharePoint you actually want to download all 50 and you want to be able to modify all 50 you know all 50 you're doing something to it you want to be able to modify and push all of those back again that would actually be slightly different as well so I'm trying to give at least the core the basic concepts of it but I am going to make some videos related to those two scenarios that we just talked about so there will be videos coming out soon related to that so now let's go ahead and test it out guys if I go ahead and run this code cross fingers hopefully no errors uh boom no errors everything looks good let's come to SharePoint if I hit refresh boom we got a new file testing modify Excel file so again let's take a look at the original file original file these are our columns right that we're dealing with and remember on the new one um let me take a snippet just so we kind of compare the to your right so if I take a snippet of this so now let's go ahead and take a look at the new one so we take a look at it it have the same column name this one slightly has a little format to it because it kind of does it by default it's mode but you have ID title name amount so the amount so this goes all the way to column J well in our case we have column k the reason why we have a k column because this is what we call it the new column and we added some some value here right so this this kind of shows you how it does have the new column that we added which again in my cable the new column in your case it could be um maybe you're adding these two values together and you're creating a new column I mean it could be whatever it is they're trying to do maybe you're moving stuff around right maybe you're like saying okay well I want to move this column this reference column over in column B and you want to shift everything over like the list goes on or what you could do with it but it's just a quick example of kind of what I did so again guys hopefully this helps man this was a request I got from a viewer um again guys any kind of requests you have send it over I do I do get a lot of requests keep in mind do you have do you get a lot of requests um but I am going to be putting effort every week to try to knock out a few of these requests that I get in so it may take a little time just depends but nevertheless I do appreciate the support that y'all been giving me um do you appreciate all of that guys hopefully this helps again um and if you have any questions man just go ahead and leave it in the comments or email me at contact imlu.net and I'll try to get back to you as soon as I can again guys I appreciate all the support and all the love I've been getting and uh talk to y'all later peace
Info
Channel: I am Lu
Views: 28,150
Rating: undefined out of 5
Keywords:
Id: w0pBFo9zpiU
Channel Id: undefined
Length: 34min 0sec (2040 seconds)
Published: Thu Nov 17 2022
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.