Company screener with Python and Finviz - Python for Finance

Video Statistics and Information

Captions
Hello everyone, and welcome to this new video tutorial on Python for finance. In this video we are going to learn how to create a screener. A screener is based on criteria that we define, in this example the sector and the industry; it retrieves data for all the companies meeting those criteria and gives it to us in an Excel file. This is particularly useful for identifying stocks: if we are unsure which companies are, for example, cheap, we can set thresholds for the price-to-sales ratio, the price-to-earnings ratio and so on, and the screener will find the stocks meeting those criteria and return them to us in a spreadsheet.

We are going to do this with Python by scraping a website to get the table you see here. In this example it will contain all the companies in the technology sector with consumer electronics as the industry, and for each of them we will get all the metrics you see here: the change in price, market capitalization, price/earnings, price/book value, earnings per share for this year, the volume, the number of shares outstanding, the float, how much of the stock is held by insiders, then return on assets, return on equity, the current ratio and a lot of other ratios, performance over the week, the month and so on. All of these metrics will be returned to us in an Excel file. That is what we are going to build in this video.

To start, we are going to use a website called Finviz. You can find it on Google; it is very well known, probably some of you already use it, and it is great for financial data. It is free to use, with some limitations on the free version, but this code works entirely with the free tier. If you really want the extra functionality offered by the page you can open an account with them, which is not that expensive, but for this video we will use the free version.

This is the screener. You can define any sector or any characteristic you want for a particular company or group of companies, and based on those criteria the filter runs and returns the matching companies in a table. For example, we could ask for mega market capitalization, and it shows the companies bigger than 200 billion. The limitation of the free version is that you only get the results in groups of 20, so here the 40 companies are divided across pages. That is okay; we can still extract the information even when it spans more than one page.

In our example we will keep every market capitalization and analyze one particular sector, technology, and one industry, which I think is consumer electronics; that is where we have Apple, GoPro and so on, so we only have 14 companies. What is important is getting this into Excel, because then we can analyze the metrics and do different things with them. It is even better for analysis to get it into Python as a pandas DataFrame and then either work with the DataFrame directly or export it to Excel.
Now notice the URL. When we apply a filter here, it shows up in the URL: after the f= you can see that, because we selected the consumer electronics industry, there is ind_ followed by the value we picked, and the same for the sector, sec_ followed by technology, which is our criterion. That matters because when we make a request with Python we have to request this exact URL, so that it returns the companies meeting our criteria. It is important, before scraping a website, to understand how it works.

Also notice that this view only shows the overview, so there are not that many metrics. If we move to Valuation we get other metrics related to valuation, under Financial we get metrics related to the financials, and so on. Each of these tabs has a different version number in the URL: the overview is 111, the valuation is 121 and the financial is 161. Why is that important? Because when we make the requests we want to merge each of these tables into a single DataFrame, a single table, so that we have everything together in one overview and do not need to deal with separate tables.

Good. Now that we know more or less how this page works, its structure and how we can extract data from it, let's move to Visual Studio Code and build the script.

Let's start with Python. To get the data we saw on the website we will use a package called requests; it makes a request to the website so we can extract the information from the response. Quickly copy the URL (go to the website, set the filters you want, and copy it), pass it to requests.get, and take .text on the result, because we get back HTML and we want it as text so we can parse it later. Save this in a variable; I will call it screen, though it does not really matter and you can pick a better name, and then print it.

It might not work straight away, because websites often refuse requests coming from a plain script, and in that case you need to send headers as if you were a browser. Let's print it and see. When I run python industry.py I first get an error, requests is not defined, because I missed the s in the module name; after fixing that and running again, the response says "you don't have permission to view this directory or page", so the request was blocked. To solve this we pass some headers. I found mine on Stack Overflow, and you can do the same, just copy and paste; the important one is the User-Agent, which tells the site which browser we are impersonating, like Chrome or Safari. When we send the request with that header, the website will think we are a browser and we will be able to retrieve the information.
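As a rough sketch, this is roughly what that first request looks like in code. The filter string follows the pattern visible in the video's URL (consumer electronics industry, technology sector, overview view v=111), and the User-Agent value is just an example of a browser-like header:

import requests

# Screener URL with the filters used in the video: technology sector,
# consumer electronics industry, overview view (v=111)
url = "https://finviz.com/screener.ashx?v=111&f=ind_consumerelectronics,sec_technology"

# A browser-like User-Agent header, so the site does not reject the script
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

screen = requests.get(url, headers=headers)
print(screen.text)  # raw HTML of the screener page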
To pass our headers we simply add headers=headers to the get call. Now if we run the code again it works, and you can see we are getting actual data; you can spot the table data in there, for example, but it is all HTML, which is difficult to read, so we need to do something with it.

What we can do is use pandas, because pandas has a method called read_html. It takes the HTML, searches it for tables, and returns every table it finds as a pandas DataFrame. When we run it on this page there will be one table we do not want, the filter controls will be another table we also do not want, there will probably be another one we do not want, and then the results table, which is the one we want to keep.

Of course we need to import pandas to use pandas read_html. Save the result in a variable, call it tables, and pass in the request we made, including the headers, with .text; so I remove the print, copy the request, add .text, and that goes into read_html. Next I want to print tables and see what we get; I expect a few tables, and later we will see how to pick the one we are interested in. I clear the terminal, run the code, and pd is not defined, so I need import pandas as pd. Running once again, you see the different tables, and this is the main one, the one we could see on the website, now as a pandas DataFrame with 18 columns. Great.

So our tables variable is a Python list in which each element is one of the DataFrames corresponding to the tables we saw on the website. To extract only the table we care about and get rid of the others, for example that first table we do not need, we can slice the list: the table we want is the second one counting from the end, so if we index with -2 and run the code, we get only the table we wanted and drop all the rest just by slicing the list. Great. Instead of printing it I will keep it in a variable, still called tables.

Looking at the table, the header of the DataFrame is just numbers, which I do not really want; the actual column names are sitting in the first row of the table, so let's use them as the headers of the DataFrame. We can do that by setting tables.columns to the row in index position zero, using iloc. Now we are using the first row as our columns; let's print the table again and see what we get.
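A minimal sketch of this parsing step, assuming screen is the response object from the request above; the -2 index is simply the position that worked for this page in the video and may shift if Finviz changes its layout:

import pandas as pd

# read_html returns every table on the page as a list of DataFrames;
# the screener results are the second table counting from the end
tables = pd.read_html(screen.text)
table = tables[-2]

# promote the first row (the real column names) to be the header
table.columns = table.iloc[0]
print(table)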
You can see that we have managed to use the first row of the old table as the column names and got rid of the numbers, but now there is a duplication: the column names appear twice, as the header and again as the first data row. Let's remove that first row. We can slice the DataFrame again, keeping everything from the first row to the very bottom, and that gets rid of the duplicated row. If I print tables now, we have just our headers and then each row contains the data for one company. Good.

So we have already managed to get the first table. I think this is the overview, which is version 111, but if we check which version we are requesting in our code, it is actually 161, which I think was the ownership or the financial view; so this table here is the financials, and we wanted to start with the overview, which is 111. To get the 111, simply go into the URL, replace the 6 with a 1, and run the code once again; you should see the table is a bit different, because now we are getting the overview table, since we changed the version. You can do the same with any other screen you want: instead of the sector and the industry you can use any of the other filters available, descriptive, fundamental or whatever you want (you can see how many screening options there are), and pass them in the URL just as we did for the sector and the industry. By changing that you can screen companies on your own criteria; for me the industry and the sector are fine, since this is the particular slice I want to analyze. Good.

Now it would be nice to find a way to merge all of these tables together, because at the moment we have to change the URL by hand to get each table, which does not really help. What we can do is create a function: every time we run it, it requests one table; we then request a second table and concatenate the two, and the result is a new pandas DataFrame with the two tables together. So let's create a function here, get_screener (you can call it whatever you want), make sure everything is indented inside it, and return the table at the end. What else can we do? The version is hard-coded in the URL, so let's pass it as a variable: we give the function a version parameter and plug it into the URL, putting curly brackets around the variable, so that we can dynamically retrieve the different tables. For this to work the URL has to be an f-string; if you are using Python 3.6 or higher, f-strings are the way to interpolate a variable into a string. So whatever we pass as an argument to the function gets plugged into the URL, and we can dynamically get the different tables.
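Putting those steps together, here is a sketch of the kind of helper function described; the name get_screener follows the video, and the filter string is the sector/industry combination used in this example:

import pandas as pd
import requests

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

def get_screener(version):
    """Fetch one Finviz screener view (e.g. '111', '121' or '161') as a DataFrame."""
    url = (f"https://finviz.com/screener.ashx?v={version}"
           "&f=ind_consumerelectronics,sec_technology")
    screen = requests.get(url, headers=headers)

    # the second-to-last table on the page is the screener result
    table = pd.read_html(screen.text)[-2]
    table.columns = table.iloc[0]   # use the first row as the header
    table = table[1:]               # drop the now-duplicated header row
    return table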
Good, so let's try this. I will create a variable called table_111, because it will hold the 111 table, which, if you remember, was the overview. The function call stores that table in the variable; we pass "111" as a string, and since the overview was version 111, that value gets plugged into the URL. If I run the code and print table_111, let's see what we get: it seems to work. Now we can pass 161, which was the other table we were looking at, the one with the financials, and you see the table changes automatically. Perfect. So the easiest thing is to request a second table: table_161 will be get_screener with 161, and if I print it, it gives me the other table, which I think was the financials. Let's check: the financial view was 161 and the valuation is 121, so let's also add another table, which will be the 121.

Now our problem is that we have these three tables, but on their own they are not that useful, and we would have to print each of them to look at it. What we can do is merge the tables together, so that everything shows up in a single table, like the spreadsheet I showed before. To do that we can create a variable, consolidated_tables, and use pd.merge, which merges whichever tables we pass as arguments. First we will merge table_111 and table_161, so they end up as a single table in this variable: we pass the two tables we want to merge, and then we need to specify how to merge them, which will be an outer merge; the how argument just specifies the type of merge. Next we need to say what to merge on. The way it works is that pandas looks at the two tables, tries to find a matching key column in each, and merges on it: for example, if two rows have the same ticker, it joins the two tables by the ticker, taking the whole contents of the second table and putting it to the right, so that we have the two tables side by side. So we specify the column to merge on, and we want the ticker, because we know the tickers in the two tables are the same: left_on equal to "Ticker" (I think that is what it is called) and right_on equal to "Ticker" as well. Good.

With that it should work. Now if I print the consolidated table we will have table_111 and table_161 together: our tables had 18 columns and 11 columns before, so I expect more than that. And indeed, the DataFrame now shows a higher number of columns, 28, because we have merged the two tables together, meaning we have the overview table and the financial table in a single DataFrame. Good.
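A sketch of that merge, continuing from the get_screener helper above; the key column is called Ticker in the Finviz tables:

table_111 = get_screener("111")   # overview
table_161 = get_screener("161")   # financial
table_121 = get_screener("121")   # valuation

# an outer merge keeps every company from both views, joined on the ticker symbol
consolidated_tables = pd.merge(table_111, table_161, how="outer",
                               left_on="Ticker", right_on="Ticker")
print(consolidated_tables)  # more columns than either table alone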
But as I mentioned before, we have a third table here that we also want to merge; at the moment we only have these two together. To merge the third table, the 121, with the existing ones, we can use pd.merge again: basically we merge our consolidated table, because it already contains those two tables, with one more, table_121, using exactly the same arguments as before, so we add another pd.merge. If we print the result once again we should have many more columns, and as you see we have now merged another table, because there are 45 columns in total.

This might not be very user friendly to look at, because it is not even showing all the columns, so we can export it to a spreadsheet with consolidated_tables.to_csv, giving the name of the file in quotes between the brackets, for example "test.csv". If I run the code now there will be a new test file on my desktop, or wherever you are running the code from; I am running it on my desktop, so I expect to find a file called test.csv containing our merged table. Let me open it, and as you can see this works great, because for the technology companies in the consumer electronics industry I now have all the metrics together, return on equity, the ratios, performance, price, volume and so on, which are very useful to have in a single DataFrame, so I can compare the companies with each other. For example, we know of course that Apple is the biggest one; then we can see which one has the best return on assets, and this one is not bad, this company here has a return on assets even higher than Apple's; on return on equity Apple of course looks very good. Then we can look at the current ratio, the quick ratio and the debt ratios, or the gross margin, for example, where we see that Apple has a very strong gross margin compared to the others, although this company also has quite a high gross margin. This is a way you can really find good companies, because you can compare across an industry, put everything together in a table and see which one you are really interested in.

As I mentioned before, this table only shows 14 rows. If you have an industry with a lot of companies, as in this other example, you may need to change your code slightly, because this URL only gives you the first 20 companies, so you need a small trick to get the information for the remaining rows. I will not do it here, because otherwise the video would be too long, but the idea is to make a new request for the second page. If you look in Google Chrome you can identify it because the URL contains r= followed by the starting row: row 21 is the second page, row 41 is the third page, row 61 is the fourth page, and so on. So you make a request to that particular URL with r= and the row number, reuse the code we already wrote, because the table comes back in the same format, and then concatenate the table from the first page with the tables from the remaining pages using pandas: with pd.concat you pass your first table and your second table, and that gives you the first 20 rows and the second 20 rows together, 40 rows, and you can keep going for 60, 80, 100 and so on. I leave that up to you; I have shown you the basics and you can extend the code to get the remaining rows, along the lines of the sketch below.
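A rough outline of the pagination idea just described, reusing the same parsing approach and assuming pd, requests and headers from the earlier snippets. The helper name get_screener_page is mine, not from the video, and the r= offsets (1, 21, 41, ...) follow the pattern visible in the Finviz URL; the final to_csv call is the same one the video uses to export its consolidated table:

def get_screener_page(version, row):
    """Hypothetical helper: one screener view starting at a given Finviz row offset."""
    url = (f"https://finviz.com/screener.ashx?v={version}"
           f"&f=ind_consumerelectronics,sec_technology&r={row}")
    page = pd.read_html(requests.get(url, headers=headers).text)[-2]
    page.columns = page.iloc[0]
    return page[1:]

# fetch rows 1-20 and 21-40 of the overview view and stack them vertically
pages = [get_screener_page("111", row) for row in (1, 21)]
full_table = pd.concat(pages, ignore_index=True)

# write the combined table to a CSV file next to the script
full_table.to_csv("test.csv", index=False)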
As I mentioned before, this page is great, so have a look at it, and in case you want to apply more advanced filters or, for example, get more than 20 tickers in a single table, you can take out a subscription with them and enhance the functionality; here I have shown you how to use the basics and the free version. Good. Thanks a lot for watching, I hope you find this video useful, and I would appreciate it if you share it and like the channel. Thanks a lot for watching.
Info
Channel: Coding Fun
Views: 3,390
Rating: 4.95 out of 5
Keywords: Python for Finance, CodingFun, Finance, LearnPython, Fundamental Analysis, Company Screener, Python Company Screener, Price Earnings Screener, Industry Screener, Finviz Screener, Python Finviz
Id: cGL1FJKEfAk
Length: 26min 23sec (1583 seconds)
Published: Sat Feb 06 2021