Trade Options Like Nancy Pelosi (Python Tutorial)

Video Statistics and Information

Captions
Hey everyone, I'm walking through Golden Gate Park here in sunny San Francisco. A lot of people ask me, "Larry, what's your secret to making millions in the stock market?" The truth is, I just copy other successful traders, such as Trader Stewie on Twitter, Cathie Wood of ARK Invest, and Nancy Pelosi. [Music]

Hey everyone, welcome back to another video. Just recently I was following this account on Twitter, Unusual Whales, and it said that Pelosi is using deep-in-the-money call options. She just exercised 10 million in Microsoft shares on March 19th, and immediately after that Microsoft was awarded a 21 billion dollar Army contract for the HoloLens. As you can see, Microsoft stock since March has been on quite a tear, as has Roblox and other stocks that Nancy Pelosi has been purchasing. And these aren't just stocks, these are actually options: Nancy Pelosi purchased Tesla options.

So that got me thinking: where can you find this information? Where does it come from? Well, it turns out there's actually a transaction report when a member of Congress purchases stock or options like this, and so I downloaded one of these. In this transaction report you can see, for instance, that Nancy Pelosi purchased 100 call options in Apple and 25 call options in Tesla. And this is not a political party thing. If you'll remember, last year four senators sold stocks before the coronavirus crash: Kelly Loeffler and other senators dumped a bunch of stock right before the COVID crash, around January 24th.

So I've been talking a lot the past few months about how you track the activity of other traders, whether it's someone on Twitter, WallStreetBets, Cathie Wood of ARK Invest, and so forth. That got me thinking, why not do a tutorial on following along with what Congress is doing? After all, we talk about following insiders and the smart money. Well, who has more inside information than someone who has the power to create our laws? So today I'm going to show you how to use Python in order to programmatically
download these financial disclosure reports on house.gov and then parse out the text using this PDF library, in case you want to take this data and do something interesting with it, such as creating an API or some type of app that follows this unusual activity. So let's go ahead and get started with the tutorial.

I know there's probably a lot of other people who have already built apps that do something like this, and someone usually leaves a comment about it below. However, I like to try to do things in my own way, learn a little bit more about the source data, and write my own programs to interact with that data and put it together in my own unique way. So like any other programmer, I just googled it and looked up stock disclosures, and it turns out there's this STOCK Act that requires members of Congress to publicly file and disclose financial transactions. The first result here is the Office of the Clerk, U.S. House of Representatives. There's a Senate side as well; this is the House version of that.

If you get to this website, you'll see there is a section in the overview here where it has all of the years listed. Since it's a government website, they obviously don't always have the top Google engineers working on this, so there's not some sleek JSON API. Usually these sites return data in a weird format, or you have to screen scrape it, or something like that. In this particular case I looked at how the data comes back, and it looks like they have it available to download by year. So I can go to 2021 here, and if I click it, you'll notice this actually downloads a zip file. So step one: we need to figure out how to use Python to download and extract a zip file.

Inside of that zip file you'll see there is a text file and an XML file. The text file looks like it is formatted okay: it looks like it's a tab-delimited file. If I search for Pelosi here, you can see some dates and this last column, and if you scroll
up to the top, this last column is called document ID. There's also an XML file in here, and that's just another format that's not quite as popular as it used to be. XML used to be a big deal years and years ago, but most people now return their data in JSON or use other formats. If you look through this, you see it's structured a little better, since we have these tags, and if I look up Pelosi right here, you'll see it has doc ID wrapped around these numbers.

So what is this document ID? I wanted to know, because from here I can't tell what stocks were actually purchased. So I copied this document ID and then I just googled Pelosi and that number, and if you do that, you'll see it actually pulls up this Periodic Transaction Report. So it looks like all of these PDF files are stored in some directory here.

What I want to do is use Python to programmatically download the zip file, extract that zip file, and get a list of document IDs from this tab-delimited text file. So we'll parse a tab-delimited text file, or you can parse the XML with lxml or another library like that. Then from that we should be able to use this URL format to grab that document ID's PDF, and then we can download all the PDFs. We can also use another library, such as this PyMuPDF library here (there are many PDF libraries), to actually extract text from the PDF document. It's unfortunate that this isn't in the nicest format, but I thought this would be a good exercise to try a variety of libraries to deal with messier data problems, where you don't have data in the nicest format.

So now that we've briefly discussed the steps we'll take to get the information we're looking for, let's go ahead and write a quick Python script to get this information. I have a little folder here called trade like nancy, and I have nancy.py in here, and I'm just going to import a couple of Python packages. So
I'll import csv, which is built into Python. I'm going to import json, because we might need that later, and I'm also going to import zipfile. I'm also going to import a couple of popular packages that are not built into Python, but you might have them installed already. We've used requests before: requests is just a library for making HTTP requests, so we can request a URL and download a file or download some JSON data. I'm also going to import PyPDF2. If you don't have these installed, what you do is pip3 install requests and PyPDF2.

Okay, so what's the first thing we want to do? The first thing I want to automate is downloading this zip file. Over here I can see where it's linking to, so I'll right-click and copy the link address, and I can just say zip_file_url equals and store that as a string. Let's see if I can use requests to download this file. I'm going to do r equals requests.get(zip_file_url). If I run that, it will actually request that zip file, but it doesn't do anything with it. What I need to do is download that zip file and write it to the file system.

So what I can do here is just open a new file with open, and I need to give it a name, so I'll say zip_file_name equals, and let's name the local file 2021.zip. I will use the zip_file_name right here, and I'm going to open this file for writing, so we're essentially creating a new file on our disk called 2021.zip. I'm going to open it as f, so I have a reference to that file, and then I do f.write, and I'm writing r.content. So r is a reference to this request, and once this request finishes I'll have content, and I'm going to write that content to the file system. If I run that, you'll see there's a 2021.zip right there on my file system, so that seems to be working pretty well. Now if I go to my file system, I have this zip file. Let's
see if I can open it up. Okay, and you see inside of it I have my text file and my XML file, so that seems to be working: I successfully downloaded the zip file. But I don't want to have to click to unzip it on my desktop. I want this to all happen programmatically, so the next thing I want to do is unzip a file programmatically.

I imported this package called zipfile, so what I can do is with zipfile.ZipFile. Just so you know, the zipfile package is right here, and if you look for it, it shows you how to work with zip files. You'll notice there is a class called ZipFile, and if you look into it, you can see the ZipFile object here, and you just need to give it a file and a mode, so you can open a zip file for reading. So I'm going to do zipfile.ZipFile and give it my zip_file_name, which I've already specified is 2021.zip, and I'm going to open that for reading. It opens for reading by default. I'm going to open that file as z, and then I can call z.extractall, which will extract the zip file. All extractall needs is a path to extract the files to, so I'll just extract it to the current directory; I think I can just use a dot there.

I deleted the zip file, so let's see if I can programmatically download the zip file and extract it. I'm going to run this one more time, and you see that it automatically downloaded the zip file and also extracted it, so now I have this text file here in the same directory. Okay, so that's good to go.

Now that I have this text file and this XML file, I'm going to parse one of them. Let's just use the csv module to open this text file. To do that I can just do with open, a normal Python open. If I know the name of the file, I can just do fd.txt, or we could open a directory, scan through anything that ends with .txt, and parse all of them. One thing we might want to do is download all of the zip files that are on that page.
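The download-and-extract step described above can be sketched roughly as follows. This is a minimal sketch, not the exact script from the video: it swaps in the standard library's urllib.request for requests so it has no third-party dependency, it extracts straight from memory instead of writing 2021.zip to disk first, and the function names download_bytes and extract_zip are hypothetical. The archive URL is left as a parameter because the transcript never spells it out.

```python
import io
import zipfile
from pathlib import Path
from urllib.request import urlopen


def download_bytes(url: str) -> bytes:
    """Fetch a URL and return the raw response body as bytes."""
    with urlopen(url) as response:
        return response.read()


def extract_zip(data: bytes, dest_dir: str = ".") -> list[str]:
    """Extract an in-memory zip archive into dest_dir.

    Returns the names of the archive members, so the caller can
    find the tab-delimited text file and the XML file afterwards.
    """
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(io.BytesIO(data)) as archive:  # mode defaults to "r"
        archive.extractall(dest_dir)
        return archive.namelist()
```

Usage would look something like extract_zip(download_bytes(zip_file_url), "."), where zip_file_url is the link address copied from the disclosures page.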
So I'll open that file as f; I'll just use f again here, since it's local to this with statement. I'll do for line in csv.reader, and I'll give it the f here, and then I'll say delimiter equals "\t", since it's delimited by tabs and not commas. For each of the lines in there, let's go ahead and just print the line and see what we get. So I run that: it downloads the file, extracts it, opens the text file, and you see we have a list of lists here. So you can see that the CSV parsing worked, and you see they're in alphabetical order.

I can find the ones that have Pelosi in them, and it looks like index 0, index 1 here is the last name. So what I can do is say if line[1] equals "Pelosi". You can pick on any congressperson you want to; I'm picking on Pelosi here just because that street happens to be near my house and it's in the news, but pick on anyone you want. So I'm going to do if line[1] equals "Pelosi", then print the line. Okay, and you'll see that just has Nancy Pelosi's trades and then the document ID. We'll get this document ID, and that's going to be line[8], because that's the 8th index there, and the date is line[7]. So let's say you want the date, you can do line[7].
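Here is a small sketch of the parsing step just described: read the tab-delimited file with csv.reader and keep the date and document ID for one last name. The column positions (1 for last name, 7 for date, 8 for document ID) come from the walkthrough; the function name find_filings and the returned dictionary keys are made up for illustration.

```python
import csv
from typing import Iterable


def find_filings(lines: Iterable[str], last_name: str) -> list[dict]:
    """Return the filing date and document ID for every row matching last_name.

    `lines` is any iterable of text lines, for example an open file
    handle on the extracted tab-delimited text file.
    """
    results = []
    for row in csv.reader(lines, delimiter="\t"):
        # Skip short or malformed rows instead of raising IndexError.
        if len(row) > 8 and row[1] == last_name:
            results.append({"date": row[7], "doc_id": row[8]})
    return results
```

With a real file this would be called as find_filings(f, "Pelosi") inside a with open(path, newline="") as f: block; passing newline="" is what the csv module documentation recommends when opening files for csv.reader.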
Okay, so you could store these values and do whatever you want with them. Maybe you want to send yourself a notification if there's a new document available, or something like that. So I've got that information now, and what we want to do next is download the actual document. We should have the pattern here in our browser: I pulled up this PDF file, and what I can do here is get the URL for the PDF. I'll say pdf_file_url equals, and this will just be the base URL, and then I can substitute in whatever document ID it is. So that's the PDF base URL, and I can do r equals requests.get of pdf_file_url plus the document ID plus ".pdf". Or you can use an f-string, so you can do something like this: the PDF URL and then doc_id followed by .pdf. I believe that will work.

Then I can again write those to the file system. I'll do with open and another f-string with the doc_id and .pdf, and we'll give it a new file handle, since we already used f locally in here, so I'll do as pdf_file. Then I'll do f.write(r.content) here. Let's see if we get it. I run that: "No such file or directory". I need to open this for writing, so I'll do "wb" here. It needs to write a new file, so I need to pass that second parameter. If I run that, I get this error: write must be a type string, not bytes. Oh, I used f right here when really I called it pdf_file. That's the disadvantage of reusing these variable names over and over again: I had f nested up here. So what I want is pdf_file.write(r.content). If I run that, this should write the files, and you'll see I now have three PDF files locally, one for each of the documents that was inside of the text file that was inside of the zip file. Now if I go to my local Finder, I should be able to pull up these documents, and so I can inspect
Nancy Pelosi's trade history for 2021. You see this AllianceBernstein Holding; what is that? Let's check it out. So we have AllianceBernstein Holding here. Has this been a good stock to buy? Let's see how the trades are going. Sure enough, it looks pretty good over the past six months to a year, but everything's kind of gone up, so it looks like it's been a pretty good one to buy. And what's significant about this company? Who's worked there before? Cathie Wood. Cathie Wood actually worked there in 2001, before ARK Invest, so I thought that was kind of an interesting coincidence.

You see Apple, Tesla, and Walt Disney are in here, and then we can open up some of the newer documents. Let's see what's in here: we've got more of this AB stock, so that must be a good one. The last one, obviously, is the one where we found the Microsoft call options that were purchased. This is an exercise of the call options that expire March 19th, and it looks like these Microsoft calls had a strike price of 130. Then there's this Roblox purchase right here.

So that's how we programmatically download the zip file, extract it, open the text file, parse the tab-delimited file, and then programmatically download some PDFs from the web and save them locally.

As a final step, let's say we don't want to read through a bunch of PDF files with our eyes. You know this channel: we want to automate everything. So if we want to extract text from a PDF file with Python, what do we do? Well, Python has a lot of cool libraries. This is one PDF library, and here's another one, PyPDF2. I ended up trying this library right here called PyMuPDF, and it has a variety of examples here, so let's find the documentation and usage. If you go to the documentation for this library, there's this import fitz, so this is called f-i-t-z, and this is the one I'm going to use. I think I tried two of these PDF
libraries, so that's why I imported PyPDF2 earlier. You can use either one; I ended up using this one. To open a document I can just take this part, which is doc equals fitz.open, and you just need to give it a file name. I have a file name right here, and I don't want to download all these files again, so I'm going to comment this all out, so you get the picture. Let's see how to open a PDF file. Let's just isolate this part of the code and put 20018539.pdf. If we want, we could put this inside of the loop and open up all of those files.

Then, according to the documentation, you can load pages. There's a method called Document.loadPage, and if we look at that, it just needs a page ID, so we can do page ID equals zero. Let's try loading the first page, store the result in page, and just print page to see what it gives us. I'm going to run that, and it just says page 0, so that's not very interesting. Let's call another method called getText. Once you have a page object you can call getText on it, and that should get the text of a page. So I'll call getText, and if I do that, look at that: we have some raw text extracted from a PDF file. You can see a symbol like Microsoft Corporation in there and start parsing away on this file.

With just text, it looks like it's still a little bit hard to work with. You need to figure out the description and the location, so you can drill further and further into this library to parse this data. There are some other methods you can use. Instead of just getting text like that, you can say json_data equals getText and give it a type of "json", and this is all in the documentation. If I print json_data, you can see it gives us the coordinates of all the text, these bounding boxes, and all the information you'd find inside of a PDF file. You
see font information and so forth, and inside of there you'll find information about the PDF file, so a lot of metadata.

I just hit the 20-minute mark of the video, and rather than go on and on, I told you I wanted to keep these videos a little shorter and release more often going forward, so like 12- or 15-minute videos. If you want to dive further into the PDF format, play around with this library and dive into the documentation. You can call methods on it to get the text and get all of the pages, and if you want to explore what's in this format, you can load the data as JSON, get the keys, and loop through all the blocks and all the coordinates and whatever you want to do.

Feel free to put this together in your own way. Maybe you want to send yourself notifications, maybe you want to build a follow-Congress web application on top of this, maybe you want to send a text message or a Discord alert. There are a lot of cool ways you can mix and match this type of information and plug the data in. The idea of this channel is that I just show little building blocks of this stuff, and you can go back to other videos in the channel. Maybe you want to store this data in a database, and the list goes on and on of what you can do here.

So that's it. That's what was on my mind: I saw a tweet about Pelosi's stock trades, and I wanted to see where they got that information from and see if I could write a program to grab that information for me. Thanks a lot for watching the video. Hope you enjoyed it. Take it easy, and see you in the next one.
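To tie the PDF steps together, here is a rough sketch of building a document's PDF URL from its ID and pulling text out of a saved PDF with PyMuPDF. It assumes a recent PyMuPDF release (pip install pymupdf), where the snake_case names load_page and get_text are available in place of the loadPage and getText spellings heard in the video. The base URL is a parameter because the transcript does not give it, and the function names are hypothetical.

```python
import json


def build_pdf_url(base_url: str, doc_id: str) -> str:
    """Join the PDF base URL and a document ID into a full PDF URL."""
    return f"{base_url.rstrip('/')}/{doc_id}.pdf"


def extract_page_text(pdf_path: str, page_number: int = 0) -> str:
    """Return the plain text of one page of a PDF."""
    import fitz  # PyMuPDF; imported here so the URL helper works without it

    with fitz.open(pdf_path) as doc:
        page = doc.load_page(page_number)
        return page.get_text()


def extract_page_blocks(pdf_path: str, page_number: int = 0) -> list:
    """Return the layout blocks (with bounding boxes and fonts) of one page."""
    import fitz  # PyMuPDF

    with fitz.open(pdf_path) as doc:
        page = doc.load_page(page_number)
        # get_text("json") returns a JSON string; parse it into a dict.
        data = json.loads(page.get_text("json"))
    return data["blocks"]
```

The plain-text version is enough to spot names like Microsoft Corporation, while the JSON version keeps each block's coordinates, which may help if you want to work out which text belongs to which column of the report.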
Info
Channel: Part Time Larry
Views: 13,181
Keywords: congress trades, financial disclosure reports, python, nancy pelosi, roblox, microsoft, insider trading, options, unusual options activity, programming, script, api
Id: FQH_m-GEkdI
Length: 21min 19sec (1279 seconds)
Published: Wed Apr 14 2021