Python Download All Files with Subfolders from SharePoint Using Office365 Rest Package Part 7

Captions
Yo, what up guys. This video is part seven of a multi-part series, and we're going to talk about how to connect to SharePoint and not just download all the files from a folder, but crawl through all of its subfolders. We specify a starting directory path, a starting folder, and if it happens to have two, three, or four levels of subfolders with files in them, the script crawls each one and downloads all the files. And here's the kicker: it saves all of those files in the same structure locally, so just like the way you may have a starting point and then a subfolder in SharePoint, that's how it ends up on disk.

There is a repo, Python SharePoint Office 365 API. You can go ahead and pull it down to get the Office 365 code, the requirements, and all the other stuff. The example folder is where you're going to find what we're about to code up. Once we finish I'm going to push it up to GitHub, so you'll find the new file in there. If you're not familiar with this or not sure how it all works, go take a look at part one, because we walked through the process there. I'm taking existing code and adding on to it; that's pretty much what we're doing.

All right guys, there's one thing we need to add to the Office 365 API file. Again, this is the existing code, so go clone it from the GitHub repo. We're adding a new function to our SharePoint class: under get file list we create a new function called get folder list, which takes self and the folder name. In here we have our connection, assigned from one of the class's underscore methods. Next we have our target
folder URL, which in this case is SharePoint underscore doc followed by the folder name, which is this argument right here. Then we have our root folder: the connection's web, get folder by server relative URL, and of course we pass in our target folder URL. Next we expand our root folder, and this is folders only, not files, just folders. Then we call get, and then the execute query method, and that's it. After that we return our root folder object, but we want to return only folders, so from the root folder object we call one of its other properties, folders, which gives us back a list of the folders. That's what we get back once we create our get folders.

Now I'm going to create a new file. I'm going to call this bad boy download files with subfolders. Let me make this a little bit bigger so it's easy on the eyes. All right, cool.

First we need to pull in our Office 365 API, which is this file here: import SharePoint. Again, I'm not going to go into detail about what this does; part one explains all of that. I'm going to import sys, and I believe that's it for now. Next, from pathlib import PurePath. PurePath is what I use to build my directory paths, and it's ideal because it works regardless of whether you're dealing with Windows, Mac, or Linux, where it's all different, your backslashes and forward
slashes, so it kind of standardizes it.

All right, the first argument is the SharePoint folder name [Music], which may include subfolders, so it may be something like YouTube/2022. Let's call this folder name, set from system argument one. It's one, not zero, because any time you run Python in your terminal and provide the file name, which in this case is this file, that's argument zero. Anything after that becomes argument one, two, three, and so on. Just FYI for those of you who are new.

The second argument is the local or remote folder location, which is where we save to. In my case that's locally on my computer, but technically it could be somewhere else. By remote I mean that in a corporate environment you may have an internal network with different servers, and you can provide the path to one of those. Remote in the sense of cloud storage is not going to work here; this is local, or remote within the internal network. This will be folder destination, system argument two.

The third one, argument three, determines whether all subfolders and files need to be downloaded. I'm going to call this crawl folders. It says: do we want to download all the subfolders and the files in them, or do we just want the files in the directory path that we provide? So if we provide a directory path of YouTube/2022, if
this is, let's say, blank, no, or anything that's not yes, it only downloads the files in that one folder. That's it, so keep that in mind. But if we set it to yes, it crawls and downloads all the subfolders and all the files.

We're going to create a few functions that we'll use. The first is a save file function. Its arguments are file name, file object, and subfolder, whatever the subfolder name is. We build our directory path, and this is where PurePath comes in. The first piece is our folder destination, which is where we save locally, and then the subfolder. PurePath joins them into a directory path that we can run a check on or, better yet, save our files to.

Next is the file directory path. I use PurePath again, this time with the directory path and the file name, joining them the same way. Finally we call with open on the file directory path in write-binary mode, because whether it's text, binary, or whatever, we want to write it as is. Then we call write and specify what we're writing. We've provided the directory path with the file name, and of course we gotta write
something, the actual object itself, and that's where the file object comes in: it's what gets saved. [Music] So this saves the file to a local or remote location.

The next thing is to create the directory if it doesn't exist. We create a function for this called create dir, for directory, and it takes a path. It's pretty straightforward. We use PurePath again to join our folder destination with the subfolder, and we check whether it exists. If it does not exist, we create it. So: PurePath, folder destination, and then path, which is the argument coming through.

Now we do the check. You know what, I didn't bring this in: back at the top where we import sys, I need to import os, because we're going to use the os path exists method. It checks whether the path exists, and if it does not, we create it. So, directory path, and, my bad, I forgot the if: if not exists. If it does not exist, we make the directory. If it does exist, there's nothing to do. Pretty straightforward, guys. [Music]

The next function is called get file. Not files, but file: we're creating two, and the other one is called get files. Get file takes the file name and the folder, and inside it file object
equals SharePoint download file, passing the file name and then the folder. I have a video that goes into more detail on this function, so you can look at that. It takes two arguments: the file name we want to download and the folder path where to find it in SharePoint. That's how we're able to download the file.

Once we've downloaded it, we call the save file function we just created up top. We provide the file name, the same name as what we're downloading, then the file object, and last the folder, the same folder we used to find the file in SharePoint. We pass that to save file because we want to mimic how it's structured in SharePoint, subfolders included, and save it the same way locally.

The second function is called get files, with an s. Here we specify the folder and the folder only. So files list equals SharePoint underscore get files list, with the folder. Again, my previous videos go into more detail, but this function grabs a list of the files in the folder we specify. If there happen to be five files in that folder, it gives me back those five files as a list, which is why it's called files list. Once I get that back, I do a for loop, for file in files list, and then I end up
calling the get file function, without the s this time, passing file dot name and then folder. Keep in mind the list I got back holds objects. If I have five files, the five entries in the list aren't string values, they're objects, which is why I take the object and use its name property to get the file name as a string. Watch the previous videos if you'd like more detail, but that's a quick insight into what it's doing.

Now that we've got that, we create our next function, get folders. All we provide is a folder path. Just like the previous one, where we provide a directory path and get back a list of files, this one gives us back a list of subfolders from a specific folder.

Under get folders we have a blank list object, called l. Then we create a new object, folder object equals SharePoint get folder list, and we pass in the folder, which is this argument. Next we iterate over the folder object, because it's a list of folders: for item in folder object. Then subfolder equals, and we use the join method here, and
what we're joining is our folder, which again is this argument, and then the item's name. In reality this item is our subfolder object, and I probably should have called it that, so let me rename it subfolder object; it's a better-suited name. What we're doing is building the subfolder directory, the full path of the folder with the subfolder, by joining them together. Make sure to use the name property as well.

Once we've done that, we append the subfolder to our list. Right now it's blank, but we append on every iteration to build the whole list of subfolders, and then we return that list. And that's it, guys; those are all the functions we need.

Next, let's write the code for what we want to do. I'm going to have if name equals main, which means that if I run the script, any code under here runs. If you've seen this before and didn't know what it means, that's the simple version.

I have some conditional logic that says if crawl folders equals yes, which means I want to crawl all the subfolders and download all the files. The first thing I do is create a folder list object by calling the get folders function we created up here. The argument is the folder name, the one we set from the arguments earlier: the SharePoint folder we want to start at. I'll show you an example in a minute, but ultimately
that's where we start, and it gives me back a folder list, because I'm calling the get folders function.

Next I iterate over the folder list I got back: for folder in folder list. Inside that I do another for loop: for subfolder in, oh, my bad, in the get folders function called on folder. I'm calling the function because it returns a list, which is why I can use it in my for loop, in case you're wondering why I'm doing it this way. Think of the starting folder I gave SharePoint as level one. I'm saying: give me back the folders in that folder. Say it gives me back five. Now I have a list of five folders that I'm iterating over, and for each one I get back another list of the folders inside it, which I called subfolder. From there I do folder list dot append, appending each subfolder to my folder list.

Keep in mind this might initially be just two or three folders. Say the starting folder has three folders right under it, one, two, and three, and each of those has five folders under it, for a total of 15. Now you have level zero, level one, and level two, which has 15 folders in total. The reason this works is that when I iterate and find those three folders, their subfolders are appended to the same folder list object, which means my for loop is going to iterate over
whatever I appended as well. As the list gets bigger, the for loop processes the new folders too, and that's how I get a complete list. [Music]

Once I've done that, I add to folder list at my starting index, index zero, the starting point of the list, and what I add is my folder name. The reason is that I gave it a starting point, but I want to download files from that starting folder as well, if there are any, so I want to iterate over it too.

One thing to keep in mind: this list is in a clean, nice order, and that's key, because when we start creating directory paths it creates things level by level. Say you have a folder called music, then the year 2022, then months, then days. If music is empty and you try to create a folder for 2022, January, and a date, it can't, because (a) 2022 doesn't exist yet and (b) the month doesn't exist yet. The list is in a nice order, so it creates 2022 first, then the month folder, and by the time we're saving a file that belongs in the date folder, the year, the month, and everything else already exist. Just something to keep in mind.

Now the iteration. I iterate over my folder list: for folder in folder list. First we create the folder if it doesn't exist, which is create dir. Remember, if it doesn't exist it creates it, but if it does
then we're all good. Next we get the files for that specific folder location in SharePoint, so this is where we call get files with the folder name. All of this needed the list of the folder structure to be built first. I'm also going to print the folder list so you can see how the final result looks.

That was the if condition, so the else is just get files with folder name. Again, the condition says: if we want to crawl all the folders and download everything, pass this argument as yes. If it's not yes, it only downloads the files in the directory path we provide; it's not crawling any folders. Just keep that in mind.

Okay, this is it, man. Now that we have everything, let me save and go to SharePoint, because I want to show y'all how this looks. In SharePoint, see how I have this data folder, and then I have the years, 2020 and 2022.
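The folder-list building walked through above can be sketched offline as follows. This is a minimal sketch, not the code from the video: `FAKE_SHAREPOINT` is a hypothetical in-memory stand-in for the SharePoint `get_folder_list` call, the folder names are only loosely modeled on the demo, and `build_folder_list` uses an explicit index over the growing list where the video appends to the list inside a `for` loop.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for SharePoint: folder path -> names of its direct subfolders.
FAKE_SHAREPOINT = {
    "data/2022": ["sales", "calls"],
    "data/2022/sales": ["January", "February"],
    "data/2022/calls": ["January"],
    "data/2022/sales/January": ["test"],
    "data/2022/sales/February": ["test2"],
}


def get_folders(folder):
    """Return the full paths of the direct subfolders of `folder`."""
    names = FAKE_SHAREPOINT.get(folder, [])
    return [str(PurePosixPath(folder, name)) for name in names]


def build_folder_list(start):
    """Collect `start` and every nested subfolder, parents before children."""
    folder_list = [start]  # the starting folder goes in at index zero
    index = 0
    # The list grows while we walk it, so newly found subfolders get visited too.
    while index < len(folder_list):
        folder_list.extend(get_folders(folder_list[index]))
        index += 1
    return folder_list


folder_list = build_folder_list("data/2022")
for folder in folder_list:
    print(folder)
```

Because every folder is appended only while its parent is being processed, a parent always appears earlier in the list than its children, which is the level-by-level ordering the video relies on when creating local directories.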
We're going to be dealing with 2022 in our case. In there we have files, and we also have folders called sales and calls. Calls has another folder in it called January, but there's a file in here too, and more files in there. Sales has a lot of different files; it has January, with files in it and a folder called test, and the same for February, with test two and more files. As you can see, there are many subfolders and many files located all over the place, so let's run the process and crawl it.

This is where I want everything to be dumped, where it all gets saved, so I'm going to copy this directory path. Remember our arguments. First it's python and the name of my file, which is download files with subfolders. My first argument, this one, is the folder name, so in this case it's data/2022: download everything in the data/2022 folder. By doing that it's not going to download 2020, but it downloads everything inside the 2022 folder. If I wanted everything I would just specify data, but in my case I only want the files and subfolders from data/2022, so I type data/2022 and a space.

The next argument is where I want to save the files, so I put double quotes and specify the folder where it gets saved. The reason for the double quotes is that if your directory path has any spaces or things of that nature, it throws things off without them. In my case there are no spaces, so I could skip the quotes and it would be fine, but to keep a standard I put double quotes regardless.

My third argument is whether I want to download all subfolders, and I'm going to put yes. If you
don't want that, just type no, none, anything besides yes. In my case I do want all subfolders, so I type yes. Let me make it a little bigger, and let's run it, hopefully with no errors.

All right, it's running. We got a list back, so far so good. Let's look at our folder. Boom, we've got a folder in here. Remember, I just showed this folder and it was empty. data/2022: I see files, I see more folders, calls, sales, January, and there's another file. Is anything else being created? There we go, it's all done. So if I go back: sales, January, February, test. Boom, dude, it downloaded everything, all files, all subfolders, and you can tell the subfolders match what's in SharePoint. If I go back to SharePoint and compare the two, data/2022 and data/2022: calls, sales, file names. If I go into sales here and into sales there, dude, everything matches to a T. It did exactly what we wanted it to do.

Now look at the list. See how we have the directory paths in SharePoint that we want to download files for. There's a pattern here I want you to be aware of. This is the top-level directory; I'll call it level zero. Then you have the level one directories, after the year, which are sales and calls. After this you have level two: inside sales you have January and February, and inside calls you have January. Then after that you have level three, which is test inside January and test two inside February. But look at the order. It matters, and it happens automatically, because as the process creates folders it goes in this order. So
that's why there shouldn't be any issues with the order. I do believe that even if it went in backwards order it would probably still create all these folders. Now that I think about it, I think it would. Either way, I keep it structured in a nice order to avoid issues, and you can see the pattern.

So again, guys, this was a request from somebody, and it's part seven of my multi-part series on SharePoint. This is key, dude. It's beneficial in so many ways, especially if you're trying to archive stuff from SharePoint, move it somewhere else, or zip it up and store it to the side because you're cleaning up SharePoint. It's ideal for that: it crawls all your folders and downloads all your files. Hopefully this helps out. I appreciate all the support I've been getting. Thanks for watching the video, and peace.
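For readers who want to try the local half of the script without a SharePoint connection, here is a minimal sketch of the save-file and create-directory helpers. It is an assumption-laden variant, not the video's code: it uses `pathlib.Path.mkdir(parents=True, exist_ok=True)` instead of the `os.path.exists` check described in the video, the function signatures are hypothetical, and the `downloads` list stands in for bytes returned by the SharePoint download call.

```python
import tempfile
from pathlib import Path


def create_dir(destination, subfolder):
    """Create destination/subfolder if it is missing and return its path."""
    dir_path = Path(destination, subfolder)
    # parents=True also creates missing parent folders, so creation order does not matter here.
    dir_path.mkdir(parents=True, exist_ok=True)
    return dir_path


def save_file(destination, subfolder, file_name, file_obj):
    """Write the downloaded bytes to destination/subfolder/file_name."""
    file_path = create_dir(destination, subfolder) / file_name
    file_path.write_bytes(file_obj)
    return file_path


# Hypothetical downloaded content: (SharePoint folder, file name, raw bytes).
downloads = [
    ("data/2022", "report.txt", b"top level file"),
    ("data/2022/sales/January", "jan.csv", b"a,b\n1,2\n"),
]

with tempfile.TemporaryDirectory() as dest:
    saved = [save_file(dest, folder, name, data) for folder, name, data in downloads]
    # Paths relative to the destination mirror the SharePoint folder structure.
    relative = sorted(p.relative_to(dest).as_posix() for p in saved)

print(relative)
```

Passing the SharePoint folder path straight through as the local subfolder is what keeps the local tree identical to the remote one.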
Info
Channel: I am Lu
Views: 6,328
Id: krNXp0XECHg
Length: 36min 19sec (2179 seconds)
Published: Wed Jan 18 2023