How to make POST request with SCRAPY and pass form data - FormRequest alternative that always works

Captions
Hey, what's going on guys, Code Monkey King here. I've been so busy with my new chess programming channel that I haven't made videos on web scraping for quite a while. But recently I got an email from one of my subscribers asking me to help him with a POST request within the Scrapy framework, in order to scrape laptop data from jbhifi.com.au.

The approach he was considering was pretty predictable: he was trying to do this using Scrapy's so-called FormRequest, which is a way to make a POST request easily when the form data is passed to the FormRequest as a dictionary. But in this case the form data looks, I don't want to say weird, but not really that common. I tried to apply FormRequest just as he did in order to make an HTTP POST request to this site, but unfortunately FormRequest refused to take this sort of string, even after it had been parsed into a dictionary by the json.loads command.

So I was thinking about what kind of solution we could come up with in this case, and I decided to just try a plain POST request without this fancy FormRequest from Scrapy. In this video I'll provide a simple one-line solution for you guys to actually get this JSON data and even parse it, first to a dictionary and then to a string, just to pretty-print it in the console. Then my subscriber would be able to reference the title, description and all the other stuff, or maybe I should even show this on my own. Well, let's have a look.

Let me open the terminal on my desktop and invoke my Scrapy shell. First we need to say from
scrapy import Request, to be able to fetch a Request object and not the bare URL itself. I would like to make the request and print the response simultaneously in order to fit the one-liner format, so I just enclose them in parentheses to create a tuple where the first element is fetch and the second... oh, I also need to import json. Okay, so first let's just try to say response.text and print this. Then within our fetch call we create a new Request object and specify a bunch of parameters. The first one is url, which is equal to the URL grabbed from here, so just copy this and paste. Then I need to specify the method, which will be equal to POST, obviously. And then instead of formdata, like in that fancy FormRequest, I just want to use the bare body keyword argument here. body will be a string, and now we just need to grab this form data: not the parsed view here, where the browser parses it the wrong way (it just happens), but the raw string. Just copy it, go back to your Scrapy shell and paste it right in here. If I did everything correctly, then after hitting Enter I hope to see the 200 response, and yeah, we also have our data being printed here.

Now the very first thing to consider is to parse this bare string into a Python dictionary. In order to do this I just need to say json.loads around the response text. And to prettify this in a more human-readable way, I can say print, then json.dumps, where the first argument is this json.loads stuff and then indent equals two spaces; close json.dumps, close the print parenthesis, and now it should print in a more human-readable way.
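
A minimal sketch of the one-liner described above, under stated assumptions: API_URL and RAW_BODY are hypothetical placeholders (the real URL and raw form data are copied from the browser's network tab and are not given in the transcript), and the helper names post_request_kwargs and pretty are illustrative only. Request, fetch and response.text are the Scrapy pieces named in the video; they appear in comments so the rest runs outside a Scrapy shell.

```python
import json

# Hypothetical endpoint and payload, standing in for the values
# copied from the browser in the video.
API_URL = "https://example.com/api/search"
RAW_BODY = '{"query": "laptop", "page": 0}'


def post_request_kwargs(url, raw_body):
    """Keyword arguments for scrapy.Request that send raw_body unchanged."""
    # body (not formdata) carries the raw string exactly as copied.
    return {"url": url, "method": "POST", "body": raw_body}


def pretty(response_text):
    """Parse a JSON response body and re-serialize it with two-space indent."""
    return json.dumps(json.loads(response_text), indent=2)


# In the Scrapy shell, where fetch() and response are provided by the shell:
#   from scrapy import Request
#   fetch(Request(**post_request_kwargs(API_URL, RAW_BODY)))
#   print(pretty(response.text))
```

The point of passing body instead of formdata is that the string reaches the server byte for byte as the browser sent it, so no dictionary encoding step gets a chance to reject or reshape it.
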
We can already see that the data is available, but we don't really want to deal with all of it, so we can limit ourselves, as the preview shows, to results, zero, and hits. So I'll now try to specify the appropriate keys here. This is a dictionary, so let's first take care of results. I'm not really sure whether this is a key or what; well, we'll see. Let's first try to reference results, and I guess this is a list, right? So let's take the very first index from results. Will it allow me to do so? Yeah, it seems so. And from here I just want to take hits. Okay, perfect, so now we've got all the stuff regarding our laptops, I guess.

Highlight result... and here, is this a list or what? I'm just trying to figure out the data type. So now we are at hits, and this is a list. Okay, so hits is a list, and if I try to print the very first element within hits, it gives me a single product: facets, category, title. Highlight result... hold on a second, let me just grab this highlight result. These are the list items. Yeah, I can print the highlight result for a single element, but I can't do this for all of them at once. So let me make it work at least for one here: the highlight result for only one element. Okay. In order to print this for every one of them I need to use a list comprehension, so I get rid of this stuff, and from this as well, and try to use my list comprehension. Hold on a sec, let me get rid of this print. So we'll go a slightly different
way: get rid of print, get rid of json.dumps. Make sure this still runs, okay. And now use the list comprehension here and say: for laptop in this list, I want laptop and its highlight result. That should work now, okay. And if I now print json.dumps of this laptop highlight result with indent equal to two spaces, it should pretty-print every product. Okay, so we've got the SKU, title, category. It already seems like more or less reasonable data. I'm not sure which data in particular he needs to scrape from this sort of structure. He would probably like to scrape the title, since every item has a title and a value. Let's try to scrape something else as well. Is there any price? Maybe the price is taken from another API, I'm not sure. Or maybe the price is taken from a higher level, that's also possible. So if I print not the highlight result again but all the stuff available there: product type, online timestamp, category... okay, we've got the price, and we've also got this primary title. Let's try to scrape these guys, because for sure they will be needed. Product facets, category... okay, hold on a sec. So we are at results, zero, hits, and every single hit is a laptop. I saw a title there, so let's see if laptop title would work for us. Actually, let's get rid of this json.dumps for a while, or forever. Okay, let's try to print the laptop title. Yeah, it works, perfect. And let's say we have the laptop price; this should still work. Okay guys, perfect, we've got the price. So let's actually try to store this to CSV now.
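
The navigation described above (results, first element, hits, then title and price per laptop) can be sketched as follows. The sample data is made up and only mimics the shape described in the video; the real response has many more keys, and the exact key spellings are an assumption based on how they are spoken.

```python
# Made-up sample shaped like the parsed response described in the video.
data = {
    "results": [
        {
            "hits": [
                {"title": "Example Laptop A", "price": 999},
                {"title": "Example Laptop B", "price": 1499},
            ]
        }
    ]
}

# results is a list, so take its first element, then the hits list inside it.
hits = data["results"][0]["hits"]

# One small dict per laptop, keeping only the two fields of interest.
rows = [{"title": laptop["title"], "price": laptop["price"]} for laptop in hits]
print(rows)
```

In the shell session, data would come from json.loads(response.text) rather than a literal.
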
I also want to import the csv module here. First let's define the structure in a dictionary: we will have this dictionary, and later we'll replace the print with the csv dictionary writer. Here let's say title, which is equal to the laptop's title, and price, which is equal to the laptop's price. Let's try to run it; now we should see the pairs. Okay, we've got the title and the price for a given laptop.

Now let's write this to CSV. Instead of printing, I want to create a csv.DictWriter and call writerow on it. It takes open, let's call the file laptops.csv, and we want to append to the file stream because we'll be doing this line by line. The second argument to the DictWriter is the field names, so let's use title and price. If I did everything correctly, it should now produce this laptops.csv file here. Okay, let's have a look. I hope to see titles on the left and prices on the right, just two columns, pretty simple. Let's make sure that it works. Yeah, it works perfectly well.

So this is it, guys. I hope the usage of a bare POST request with the body keyword argument specified as the bare string taken from the browser makes perfect sense, compared to using this fancy FormRequest, which is not actually capable of handling this sort of data. Even if I parse this data into a Python dictionary using the json.loads command, it's still not accepted by FormRequest's formdata keyword argument, which gives an error saying that the data provided is kind of wrong. So that's pretty simple. Okay guys, this is it for this video. I'm sorry for not doing web scraping
videos, because I'm really, really busy with my chess programming channel. I'm trying to develop it from scratch, and I'm already maintaining a chess engine that I'm currently working on there. At the same time I'm making lots of tutorials on how to write your own chess engine. I've already done the part regarding the move generator, which is about 20-plus videos, and now I'm making the search part: how one can write a chess engine using the chess engine framework that I've created. It gives you the ability to go straight ahead with writing your chess programming logic without being bothered by creating the move generator. So I would really appreciate it if you had a look at my new chess programming YouTube channel: same Code Monkey King, but a bit different topic. I understand that most of you probably don't really care about this. You can skip liking this video and subscribing to this channel, but I would really appreciate it if you subscribed to the chess programming channel and followed a couple of tutorials to get a feel for the style. So this is kind of it. Well guys, thanks for watching. I really gotta go now, so until next time, take care.
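
The CSV step from the walkthrough can be sketched like this. It is not the exact one-liner from the video: the function name write_laptops is hypothetical, the sample rows are made up, and the demo writes to an in-memory buffer instead of laptops.csv so it leaves no file behind.

```python
import csv
import io

FIELDNAMES = ["title", "price"]


def write_laptops(stream, rows):
    """Write one CSV line per laptop dict, with title and price columns."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    writer.writerows(rows)


# Demo against an in-memory buffer. For a real file, as in the video, pass
# open("laptops.csv", "a", newline="") so each run appends to the file.
buffer = io.StringIO()
write_laptops(buffer, [
    {"title": "Example Laptop A", "price": 999},
    {"title": "Example Laptop B", "price": 1499},
])
print(buffer.getvalue())
```

Like the video, this writes no header row; calling writer.writeheader() once before the rows would add one.
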
Info
Channel: Code Monkey King
Views: 1,102
Rating: 5 out of 5
Id: F0IfuZ2Ghy4
Length: 17min 47sec (1067 seconds)
Published: Tue Aug 11 2020