Using a proxy with Puppeteer

Captions
Haha, now you can hear me. Good morning, happy New Year's. Here we are on this beautiful day with this probably short stream. Let's see what we've got. Per request on YouTube, we're going to make a video on how to use a proxy with Puppeteer. As you can see here, so far I'm just planning on Luminati. That's the main proxy I use with Puppeteer. And then we'll look at ProxyCrawl to see if we can do it. I don't know how long this will be. If we look at ProxyCrawl and it's easy, then it'll be really fast. If it's harder, it'll be a little bit longer video.

So first I'm just going to go [unclear]. I'm going to start creating my... this is, yeah, just from the beginning. So index.ts, and it's like this, and we'll say npm init. I will say "Jordan uses a proxy with Puppeteer." Yes. I don't have one yet. I should get one now. Hold on. New repository, and we'll say "proxy with puppeteer," use the same description here.

Who's excited for New Year's? It's fine. I'm equal, fine, excited about it. Not bad, not good. It's been cold here in Idaho.

The keywords: puppeteer, proxy, scraping. What else? JavaScript, TypeScript. There we go.

Okay, now it's npm i --save puppeteer, plus we need dotenv. dotenv allows us to use environment variables here. And then let's get the types for it, @types/... and there's probably more I need, but we'll start with this.

I'm going to be exposing my local IP address here. [unclear] the router or something. Whatever, it's not that big of a deal. I'm not too worried about it. My IP just changes enough that it's not a big problem.

Okay, so here we go. We're going to import Puppeteer like that, and then we're going to import dotenv from dotenv, and then we're going to say dotenv.config(). This is how you start it up, so this will allow you to use environment variables. Like this: async... ah, there we go. const browser equals puppeteer.launch. By default it's headless true, so we're going to pass
it the headless false parameter. browser.newPage, page.goto, I will go like this, and we're going to go to this location right here.

Any background music being picked up? I don't show it registering anything. Not to worry about. You know, just the sweet music of my voice is probably enough.

And then we'll say, oh, page.waitFor, and this is just for... so what does that mean? I mean, is it, like, deprecated? Interesting, look at this, it's struck out. What is that, a possible deprecation warning? That's pretty awesome. And then await browser.close.

Now we go over here to our thing right here, and this is just our start, right? This is just our testing thing. No, no, no. There we go. So I just created this thing that says: hey, run TypeScript, the transpile/compile thing, and then run the index file of JS. So it's gonna... oh, we need to get a tsconfig file. Hold on, hold on, I have that somewhere else, so just copy that from somewhere else. Paste. There we go. Now we've got tsconfig. In our tsconfig settings we have it outputting to dist, so that's what this will do. There we go. Pretty good.

Okay, [unclear] save, although I thought I had it installed globally. Anyway, I'm kind of curious about... oh, there it goes. There we go. Okay, just check. Aha, look, look, look, it's deprecated. Look, isn't that cool? If it's struck out, it's deprecated. All right, cool. I wonder how it's supposed to be. That scares me a little bit though. How's it gonna be instead? waitFor... there's no "wait for time"... well, hold on, this is not the new documentation.

Okay, now for the proxy. Here we go. So here we have it like this. It's really simple. We pass in a separate argument, we say args, and it looks like this. I think this URL... I don't know if this is what everyone uses or if this is specific to me. I'm pretty sure it's something that they always... in fact, let's check. Open this up in incognito, let's see. Yeah, yep.

Then what we do, so here we are like this, a new page, and then before we go anywhere else we go
await page.authenticate right here. We pass a username, and it's going to be process.env... over here, I'm not using them. Let me pass that password, my password. There we go, like that. So now it authenticates with those and it'll go there. Let's check what it does. Last time it was like 65 something. Let's check it again, see what it does now. There we go. In fact, it thinks I'm from country whatever. See that, though? There's 80 whatever. Let's try it again: 82.40.223.205. And you can rotate those around. We can use a residential proxy. I think you can pass it a certain country if you want to.

I'm using the Luminati proxy now. The cool thing about... oh, I don't know what I pushed. The cool thing about Luminati is it's very powerful. It's a little more expensive than some of the other ones. I am an affiliate, and I will put my affiliate link in there. If you want to use Luminati, it wasn't that difficult to get set up. It's a little more complicated than some of the other ones where you just use the URL.

For things that I've used before, like ScraperAPI, which are great, and from what I see of how ProxyCrawl works, they just provide a URL. So normally it'll just be something like some URL, right? Like scraperapi.com/proxy/, and then you add a parameter that says url equals your URL, so wherever you want to go. In that case this would be like this, right? It'd be const url equals this.

Now, this is this URL that we passed in. The problem with this is: what happens if you need the cookie and you need to proceed to the next page? Now, this works great because the API... if the page does any redirects... it's not like if you click a link on there it's going to automatically go through. You have to grab the URL and then go with it, but the cookie is not going to be attached to it as it goes through, because you're routing through
the ScraperAPI. And this does it differently: they just authenticate here and the whole... it's just like a real proxy, honestly. It's not like the super easy ones like ScraperAPI.

Now let's go through ProxyCrawl and let's check it and see what we can do there. Let's see if I can remember. [Music] Let's see. I have this one right here, and I'll try to find my... yeah, see, ProxyCrawl is another example of that, but it should be super simple. So instead of authenticating, we'll get rid of this argument here. This is not looping out anymore.

Actually, I'm going to go like this. We'll say function proxyWithLuminati. Actually, let's do this cool thing. This is neat. We go like this... what's that sound? Like this: right click, and we say Refactor, and we pick "Extract to function in module scope," and we call the new function scrapeProxyWithLuminati. There, see, it built the function down here with all that stuff. This is really cool.

Okay, now we're going to have some similar things. In fact, I'm going to copy this function. We'll change enough of it that it doesn't make sense to reuse it. And we'll say proxyWithProxyCrawl. Now, a lot simpler here. Using these kinds of things is way simple. So, like, this could be our desired URL. I'll just say URL here. And we don't authenticate, but we do go here, and we say that'd be like this. So this is the exact kind of thing, right? Oh, but our token will need to be... I think it's &url, right? It goes like this, and this will be process.env, the ProxyCrawl token, right here. Okay, so it gets our API key, which you'll have to get from ProxyCrawl or Luminati, whatever you're using. This is ProxyCrawl in this case. So up here you'd obviously have to use the API credentials from Luminati. This one you have to get from ProxyCrawl.

Now again, this is super easy, right? All you do is pass in the URL,
the URL you want to go to. So you just go ProxyCrawl, and then use the token, and then you pass this. Incredibly simple, and probably the easier way to go for most processes.

Now, if you're into any kind of login, you're going to need the cookie associated with the Puppeteer session, so this will not be able to handle that kind of situation. At least, I couldn't figure out a way to do it yet. I've tried it multiple ways. But the thing is, if you get redirected... in fact, let's test it.

We'll first test it with the... so we go here. In fact, they were detected here, they were coming through. Let's click it one more time. Okay, yeah, we could implement things to solve captchas, whatever. That's not what we're here about right now. So I'm just going to go in here and... okay, see the different IP address? Let's try again one more time. That's like 97 something. So we get, this time, 95. That makes you think it's coming from Brazil. I don't know, who knows. One more time. I'm sure you could rotate these around more if you did. Okay, that was 15.
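The URL-wrapping approach described above can be sketched roughly as follows. This is not the code typed in the video. The endpoint in `ASSUMED_API_BASE` and the helper names `buildProxyApiUrl` and `wrapLink` are assumptions made for illustration, and only the `token` and `url` query parameters come from what is said on stream, so check your provider's documentation before relying on any of it.

```typescript
// ASSUMPTION: base endpoint of the proxy API. Confirm the real value in your
// provider's documentation; only the `token` and `url` parameters are
// mentioned in the video.
const ASSUMED_API_BASE = "https://api.proxycrawl.com/";

// Hypothetical helper: wraps the page you actually want in a proxy API request.
function buildProxyApiUrl(
  token: string,
  targetUrl: string,
  apiBase: string = ASSUMED_API_BASE
): string {
  const apiUrl = new URL(apiBase);
  apiUrl.searchParams.set("token", token);
  // searchParams percent-encodes the target, so its own query string survives.
  apiUrl.searchParams.set("url", targetUrl);
  return apiUrl.toString();
}

// Hypothetical helper: a link clicked on the proxied page would bypass the
// API, so resolve the href against the original page URL and wrap it again.
function wrapLink(token: string, href: string, originalPageUrl: string): string {
  const absoluteUrl = new URL(href, originalPageUrl).toString();
  return buildProxyApiUrl(token, absoluteUrl);
}
```

With Puppeteer, the result of `buildProxyApiUrl` would be the string handed to `page.goto`, with the token read from an environment variable loaded by dotenv. Each wrapped request stands alone, which fits the limitation noted above: cookies from one request are not carried to the next.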
Whoa, whoa. At the first segment, that was not 15 seconds, that's for sure. Was it? Did I not save it? Well, in any case, we saw it work. I'm not worried about that. But now I want to see what happens... oh, there we go, different IP address.

Okay, I want to see what happens if I go and push a link. I think pushing a link is probably fine... no, no, pushing a link won't work, because it's going to go to the link and it's not going to append this. What you'd have to do is grab the link and then attach it to another request, right? So you go through and you say: hey, go to this place, get the desired link, and then do something like this, right? And this would be whatever link you grabbed, the desired link, here. If navigating around the page, do this. There, like that. Simple as could be.

Now, quick 15-minute video. Did we hit everything? Again, Luminati, that's going to be a more intense one, maybe a little bit more expensive and definitely a little bit harder to implement, but they're solid. I use Luminati a lot. It's very solid for any kind of proxy. Their support's great. And like I said, I am an affiliate, but I do use them.

ProxyCrawl is going to be the easier one to get into. All you have to have is this URL. I've worked with ProxyCrawl quite a bit and I've been really impressed with it, and I'm also affiliated with them. I don't use them as much because I just have Luminati already baked into my stuff. But for any first-time user, like if you're not looking for intense scraping, I would definitely recommend ProxyCrawl. That'd be my first one. But if you're looking for something that's going to scale, Luminati is a good way to go.

Anyway, here's how you do it. Basic proxy: you don't have to do any kind of fancy stuff, you just update your URL. A more intense proxy like Luminati: you're going to pass an argument of where the proxy server is and then just authenticate once you open a new page. Done.

I don't know if you hear music. How do I fix that? Let's see.
I wonder, is this going to a different audio? See, desktop audio, is that... hold on. This is not the time to mess with this. I'll mess with this later. Okay, that's it. Thanks, team.
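The "real proxy" flow recapped in the video (pass the proxy server as a launch argument, then authenticate on the new page before navigating) might look roughly like the sketch below. It is an illustration under assumptions, not the code from the stream: the `ProxySettings` shape and the function name `scrapeThroughProxy` are hypothetical, the proxy host is left for you to supply, and the small `*Like` interfaces only mirror the handful of Puppeteer methods used, so the block type-checks without Puppeteer installed.

```typescript
// Minimal structural types covering only the Puppeteer calls used below.
// In a real project the imported `puppeteer` module would be passed as the
// `launcher` argument.
interface PageLike {
  authenticate(credentials: { username: string; password: string }): Promise<void>;
  goto(url: string): Promise<unknown>;
  content(): Promise<string>;
}
interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}
interface LauncherLike {
  launch(options: { headless: boolean; args: string[] }): Promise<BrowserLike>;
}

// Hypothetical settings object; the values would normally come from
// environment variables loaded with dotenv.
interface ProxySettings {
  server: string; // host:port of your provider's proxy endpoint
  username: string;
  password: string;
}

async function scrapeThroughProxy(
  launcher: LauncherLike,
  proxy: ProxySettings,
  url: string
): Promise<string> {
  // Chromium routes all browser traffic through the server named here.
  const browser = await launcher.launch({
    headless: false,
    args: [`--proxy-server=${proxy.server}`],
  });
  try {
    const page = await browser.newPage();
    // Supply the proxy credentials before the first navigation.
    await page.authenticate({ username: proxy.username, password: proxy.password });
    await page.goto(url);
    return await page.content();
  } finally {
    // Close the browser even if navigation throws.
    await browser.close();
  }
}
```

Because the browser itself sits behind the proxy in this setup, clicked links and session cookies stay inside the same Puppeteer session, which is the difference from the URL-wrapping services that the video points out.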
Info
Channel: Cobalt Intelligence
Views: 817
Rating: 5 out of 5
Keywords: twitch, games, javascript, typescript, web scraping, puppeteer, proxy
Id: nwS6TgXRTQk
Length: 17min 15sec (1035 seconds)
Published: Thu Dec 31 2020