Puppeteer + Node.js = App That Tracks Prices on Amazon
Video Statistics and Information
Channel: Tom Baranowicz
Views: 26,967
Rating: 4.8959107 out of 5
Keywords: webscraping, nodejs, Puppeteer
Id: 1d1YSYzuRzU
Length: 21min 14sec (1274 seconds)
Published: Mon Feb 10 2020
Note that if you do this from different IPs, you get different results.
... also a good way to get yourself IP banned from Amazon, but good luck with that, I guess.
Also, whenever an API is available, use it. Scraping should be your absolute last resort for getting the information.
This looks like a great starting point for learning web scraping as a concept, as long as you don't do it on the likes of Amazon or Google. As others have pointed out, doing so will get you IP banned quickly.
For Amazon, I have used and still use the Product Advertising API heavily for getting product prices as well as other product data.
It's pretty easy to get access to, and the rate limits are allocated fairly, based on how many sales you drive for them. Search for Amazon Associates and you will find everything you need on this.
If you are interested, I shared a case study of one of my blogs making about $2.7k a month from Amazon Associates here:
https://www.bloggingcage.com/amazon-associates-site/
Even that site used the Product Advertising API to display prices inside articles.
You can do something similar with product reviews. Here's my project:
https://github.com/ajbogh/amazingreviews
Very cool. Honey (joinhoney.com) can do this, but I am unsure how long the alert is delayed after the price trigger.
It's mostly because I haven't had a use case for scraping with Puppeteer (yet), but I must admit I hadn't thought of using Puppeteer just to get the page HTML, then parsing it with Cheerio like you would with classic scraping. Thinking about it, there are some advantages to doing it that way for certain cases. Still, for a simple case like this I was expecting him to just use page.$() or page.waitForSelector() or similar.
Awesome video! Would not have thought that this is that easy!