Finding hidden API of HM.com to web scrape all products


Captions
Let's talk about the page you are going to crawl first. Many of you are looking to do web crawling so that you have data on competitors' prices and stock. I was looking at H&M as my first example, and I quickly found out that you could easily get all their prices and products using their hidden API. So let me first show you how you can easily get a lot of products from a website like H&M, and how you can simplify your project drastically and still get amazing results.

Let's take our example with H&M first. We go to the United States site, go to Women and, let's say, check out the New Arrivals section here, with the Clothes section. Okay, so now the first thing we need to do is open up the Chrome developer tools. On Mac you can press F12; you can also press the three dots up here in the upper corner, then go to More Tools and then Developer Tools. You'll see this little panel opening up on the side here, where you can see the HTML source. You can also see a Console tab and a Network tab. Make sure that you have Preserve log and Disable cache enabled. Preserve log keeps the log even when you navigate around the site, and Disable cache simply disables the cache, so you don't get cached results and you hit the API every time you get results.

So in here, let me zoom in a bit, you can see there are a lot of requests being made. You can see the type over here and the HTTP status code here. You can also see where the request was initiated from, how much time it took, and its size. The type of request we're interested in here, to find the API, is called an XHR request. It's basically used to make Ajax requests between the website and a web server, and it's often used to deliver JSON data. There's no XHR request right now. We also have a filter up here, so we can filter only the image requests, and that shows only the images that we are getting in the Network tab for the website. You can see that when I hover my mouse over these images, they are fetching new images. But we're actually looking for the XHR requests, the API requests, and you can see there's actually no request being made via XHR right now; there's no JSON data being fetched from the website.

If we scroll down, we can see there is one XHR request being made now, but this is from another domain, doubleclick.net. This is most likely some kind of tracking of the customers on the website, so that H&M has some data on their users. I can see there's also a Google Analytics ID inside here, the one starting with UA-.

Anyway, if we scroll down: the case for a lot of these types of sites is that when we click on Load More Products, we actually do an XHR request, or an Ajax request; I'll just call it a JSON request from now on. So now when I click on that, boom, there we go: we have a productlisting.display.json, which sounds very promising for what we are trying to get, which is all the products from the website. I'm going to enlarge the panel here, and now when we click on the request we can see some more info about it: the headers, a preview of the data, and the response as well. In the preview we can see there is a total number. It's JSON data in here, JavaScript data so to say. It has items shown and a total. Let's go and see what the products property here is. Well, that is an array with lots of products, it looks like. Yeah, so we have a link for a product, we have a title, "Embrace High Ankle Jeans", we have a list of images for the product, and we also have a price down here. Let's try and search for this title on the site as well, just to see if it's actually showing there. If I press Ctrl+F and search for it, we can see the product is actually right here. So this is it: when we clicked on Load More Products, we got a JSON data response, and then the JavaScript updated the site's HTML with these new products.

So right now we basically have everything we need in order to get data from this site. I'm also going to show you how to do this request not just in your web browser, which is where it originates from; we also need to be able to do it from another client, or inside of Node.js. So in the next section, let's take a look at how we can do this request inside another client, in this case the Postman REST API client, instead of just a web browser, and how we can get every product that H&M has.
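
As a rough illustration of the idea in the captions (a "load more" button that triggers a JSON request returning a total and a products array), here is a minimal Python sketch of paging through such an endpoint. Everything specific in it is an assumption made for illustration: the endpoint URL, the `offset` and `page-size` query parameters, and the JSON key names are hypothetical and are not taken from the video or from H&M's site. The video itself goes on to use Postman and Node.js; Python is used here only to show the shape of the loop. The HTTP call is passed in as a function so the sketch runs offline against a fake response.

```python
import json
from typing import Callable, Dict, Iterator
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical endpoint; replace with the request URL seen in the Network tab.
BASE_URL = "https://example.com/api/product-listing.json"


def build_page_url(offset: int, page_size: int) -> str:
    """Build the URL for one page. Parameter names are assumptions."""
    query = urlencode({"offset": offset, "page-size": page_size})
    return f"{BASE_URL}?{query}"


def parse_products(payload: str) -> Dict:
    """Extract the total and a trimmed product list from a JSON payload.

    The keys ("total", "products", "title", "price", "link", "image")
    are assumed names for the fields described in the captions.
    """
    data = json.loads(payload)
    products = [
        {
            "title": p.get("title"),
            "price": p.get("price"),
            "link": p.get("link"),
            "images": p.get("image", []),
        }
        for p in data.get("products", [])
    ]
    return {"total": int(data.get("total", 0)), "products": products}


def iter_all_products(
    fetch: Callable[[str], str], page_size: int = 36
) -> Iterator[Dict]:
    """Request page after page until the reported total is reached."""
    offset = 0
    while True:
        page = parse_products(fetch(build_page_url(offset, page_size)))
        if not page["products"]:
            break  # empty page: nothing more to load
        yield from page["products"]
        offset += len(page["products"])
        if offset >= page["total"]:
            break


def fake_fetch(url: str) -> str:
    """Offline stand-in for an HTTP GET: serves five made-up products."""
    query = parse_qs(urlparse(url).query)
    offset = int(query["offset"][0])
    size = int(query["page-size"][0])
    items = [
        {"title": f"Item {i}", "price": "$9.99", "link": f"/item-{i}.html", "image": []}
        for i in range(5)
    ]
    return json.dumps({"total": 5, "products": items[offset:offset + size]})


if __name__ == "__main__":
    for product in iter_all_products(fake_fetch, page_size=2):
        print(product["title"], product["price"])
```

To run this against a real site, `fake_fetch` would be swapped for a function that performs the actual GET request and returns the response body as text; the real parameter and key names would have to be read off the request shown in the Network tab.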
Info
Channel: ReactNativeTutorial
Views: 66,607
Id: 6gtHzj4GMLo
Length: 6min 13sec (373 seconds)
Published: Sat Mar 28 2020