Martin Splitt - Technical SEO 101 for React Developers | React Next 2019

Captions
What am I here to talk to you about today? First things first: lunch, right? Yes? Had a good time? Excellent. After lunch people are normally a bit sluggish, but you have the energy, and I like that. Fantastic.

So today I'd like to talk about something really important, and that's kittens. I think the internet does not have enough cats. I know that's a very debatable opinion, but I believe we need more cats, and actually also more dogs; I'm not biased or anything, dogs are just fine by me as well. So I thought I might build something really revolutionary to disrupt the internet: a really cool website I call Kitten Club. It has, obviously, the best kittens I could find on the internet, all together in one site, and you can upload your own kitten, have it rated, and if it passes the quality threshold it might be displayed alongside the other amazing kittens I've got. The cool thing is, these days that's pretty easy to do. We have React, React has hooks, I just build this page and boom, we're all done. Thank you very much for coming to my TED talk.

Except there's this question that keeps popping up, especially if you're working for a startup or a larger company that produces a new website, or you're helping someone make their way onto the web: do I need SEO? And the answer is clearly: of course not. SEO is easy, you just build a website and then it'll be found. Oh. All right. Okay, that's not good. So I might want to know a little more about what SEO is, and how to get onto the internet and be found by search. Because how do people actually find websites? If you're not one of the larger websites already, with a very well-known brand, people will probably not find your website, especially if you're starting out on the web as a smaller company, or you have a new project or a new product. You want to have people find it,
and you want the people who need it to find it. Search engines are a fantastic channel to make that happen. If you're looking for something, say a car rental or a place to eat, you probably go to the search engine of your choice. So you want to be in the search engine; you want to be found through the search engine. How can you facilitate that?

First things first, I'd like to talk about what SEO is. Who here has worked with SEOs in the past, as developers? Keep your hand up if you enjoyed the experience. Oh, like one person, amazing. I'll give you some Swiss chocolate that you can bring to your SEO, because you want to keep that person.

The thing is, SEO is not necessarily well understood. People joke about developers: "What's an algorithm? It's what a developer says when they can't be bothered to explain what the code does." And SEOs are sometimes similar: they are not necessarily the best people to explain what they're doing. That's not because they don't want to, or because they're terrible people; it's just that a lot of stuff goes into SEO, everyone has a slightly different definition, and everyone does it slightly differently based on what their clients or companies need, and that's perfectly okay. We also develop software differently. Confession: I'm not doing that much React; I'm actually more of a vanilla, Lit, or Vue guy. But I have done some React and some Angular, quite extensively, over a long time, so I know what it does and it's a tool in my tool belt. SEO can be similar. But who here thinks that SEO is basically sacrificing a chicken, running around the computer three times, and hoping that the moon and the stars align? Hands up. Yes, that's more people than said they enjoy working with an SEO. So what is this SEO?
SEO stands for search engine optimization, and in my opinion it's most of the time three things. You start with the content: SEO should figure out, together with sales and marketing, what is it that we have, who is it that's using or needing it, what are these people looking for, what problem does our stuff solve, and how do we describe what we've got to these people. That's not really technical.

What else is there? There's also the strategy side: how do we do this, what do we want people to do once they're here, what do we want to support people with, what questions do they have, what can we suggest to them, and how can we guide them through what is probably a very complicated process? If you've ever booked a train ticket in Germany, you know complicated processes can take many, many steps to successfully complete. Again, that's not really something we can help with; these two things are things that good SEOs do for you.

But where we can help is the technical part, because SEO is also technical. It's a technical process, it deals directly with our code, and crawlers behave differently than users, and sometimes differently than regular browsers. A good technical SEO can help you make the right implementation decisions, and they can help you make decisions while you're designing your system, rather than fixing the problems once they have happened. They can also test the system for you, and they can make you more at ease with what you're building by saying: yes, we're doing fine, don't worry about it, we'll be found, everything is good. And if something goes wrong, they probably also know how to fix it. They might not fix it themselves, but they can tell you where to find the information you need, for your specific tech stack, to fix that problem. So this is where we want to focus today, because that's where I feel most at home, and I think for all of you as well, probably.
Right? Cool. So the first question I keep getting is: you showed us that your site was not in search; does that mean React kills SEO? Can you never have a successful website in Google Search if you're using React? And I say no, that's not true. But it's a technical process on both sides: we are building a technical system, and there's a technical system that interacts with our content online, and we have to make sure that these two play along together quite well. I think the first step to understanding what's happening, and where things can go wrong, is to understand how your page gets into Google.

So how does this work? Other search engines do similar things, but I can't really know that much about them, because other search engines don't normally talk about it to us or to the public, so I can only deduce from what we're doing what the others are probably doing. What I'm telling you is definitely valid for Google Search, but it's probably also valid, to some extent, for other search engines.

Cool. We start with a list of URLs: we either found them somewhere, or someone submitted them to us, or we just know a bunch of web pages. We take these URLs and put them in a queue, and then we take things from this queue. We don't do that manually, we don't pay people to do that for us; we have Googlebot for that. The very first thing Googlebot does is crawl, which means it makes an HTTP GET request to the URL it got from the queue. It's not just one computer doing that, it's many computers doing it at the same time, obviously, but they are making GET requests and they get some HTML or something back, depending on where the GET request goes. So now we have some content, and now we need to understand what it's about. If it's HTML, we probably have some semantic information: this is the headline, this is a link, this is an image.
So we can figure out what this page is about from the text, from the images, from the semantics of the page, from the videos, and all that stuff. We'll also find links here, and once we've found the links, we can put them back in the queue and do the next thing. We put this information, this page is about cats, this page is about dogs, this is about ice cream, this is about React, into our index. If you go to a library, you go to someone who knows where the books are, and that's pretty much what an index is. You say, "I need a book about vegan cooking," you go to the person, or the computer, that lets you look up the index, you look up "vegan," and it shows you all the books that are about vegan cooking. We do pretty much the same thing here: we have an index that we can then query.

The problem, though: if we look at our website from earlier, what is this website about? We made a GET request and got this. Hmm. It's not necessarily clear what this web page is about, right? And the problem is that some crawlers don't even run JavaScript, so that's all they see, and they're done. Again, thank you very much for coming to my tech talk. No, not quite, because Googlebot actually runs JavaScript, and has been doing so for a couple of years now, so we have a bit of experience with how to do that. That's fantastic, because it means we can extend our crawling infrastructure: we crawl, we process, we index, but we also render.

The problem is the web as far as Googlebot knows it. It's a lot of pages, isn't it? It's 130 trillion pages. Now, I don't know if you ever studied computer science and had a professor like mine; the theoretical computer science people are like, "Assume you have a computer with infinite memory, or with an infinite number of cores." This doesn't really exist. "But we have the cloud, Martin!" Yes, but it's still someone else's computers.
There's still someone's actual hardware, and we don't have infinite amounts of hardware. I know, shocking, but that's the reality, and this number just keeps growing; it's just the largest published number I could find.

So how do we do it? Again, a queue. We figure out that we need to render this page, we put it in a queue, and the moment we have computing power available, we actually render it, which means we use a headless browser, in this case Chromium, to run the JavaScript and get the HTML after the JavaScript has executed. Then, again, we process the links that we found, and all the URLs go back into the URL crawling queue. That's how this works, and then it goes into the index, and then there's something I'm not going to talk about: ranking. SEOs love talking about it; I really don't, because it's not my cup of tea and it's a completely different can of worms.

So this is the HTML that we are looking at in Googlebot, the rendered HTML. This page is clearly about the Kitten Club, with the best kitties on the web. This page now has content that we can put in the index, and once we have it in the index we can rank it. But we're not talking about ranking; sorry, this slide confused me, I'm surprised I put that in.

Anyway, you're going to hear some SEOs telling you: "That's all nice and fine, Googlebot can run JavaScript, but Googlebot runs Chrome 41." Why 41? That used to be true, but we have recently updated Googlebot. It's now an evergreen Chrome, and whenever the stable release of Chrome updates, Googlebot will update within a couple of weeks, so you don't have to worry about a very old version of Chrome anymore. If you hear that, point them to the blog post that explains it's no longer true. It was true for the last couple of years; we have fixed that.
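The crawl loop just described, take a URL from the queue, fetch it, extract its links, queue the new ones, and index the content, can be sketched as a toy breadth-first crawler. Here `fetchPage` is a stub serving canned HTML and the link extraction is a naive regex; both are illustrative assumptions, not how Googlebot's real pipeline works.

```javascript
// Toy sketch of the crawl-and-index loop: URLs go in a queue, fetched
// pages go in an "index", and discovered links feed back into the queue.
function crawl(startUrl, fetchPage) {
  const queue = [startUrl];
  const index = new Map();
  while (queue.length > 0) {
    const url = queue.shift();
    if (index.has(url)) continue;            // already processed
    const html = fetchPage(url);
    index.set(url, html);                    // "index" the content
    const re = /href="([^"#][^"]*)"/g;       // naive link extraction
    let m;
    while ((m = re.exec(html)) !== null) {
      if (!index.has(m[1])) queue.push(m[1]); // back into the queue
    }
  }
  return index;
}

// A tiny fake site (an assumption for illustration):
const site = {
  "/": '<a href="/cats">Cats</a>',
  "/cats": '<a href="/">Home</a><a href="/cats/abby">Abby</a>',
  "/cats/abby": "<h1>Abby</h1>",
};
const idx = crawl("/", (url) => site[url] || "");
console.log([...idx.keys()]); // all three pages were discovered
```

Note how a page that is never linked to would simply never enter the queue, which is the point of the linking advice that follows.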
Now that we have it in the index, we have to rank it, which means once someone comes and asks us for the cutest kittens, we have to look at what kind of kitten pages we have and how good they are, and we look at lots of factors for what a good kitten page is in this particular case, where the query is "kitten." For every query we rank pages differently; you don't just rank in general. "My page ranks really well." Ranks really well for what? If you are selling vegan cakes, ranking for the query "cutest kitten" doesn't really help you. But again, ranking is a completely different can of worms, and I'd rather not talk about it. All we're talking about is what we can influence as developers: crawling, rendering, and indexing. Ignore ranking; ranking is not a problem if you have good quality content. If your content is good and you're doing things right, you will be fine.

All right, cool. So there are a few things you should do to make sure Googlebot can process your website properly. One thing: you need to make sure that you're actually linking between your different pages, because what we're doing is getting the HTML, rendering the JavaScript, getting more HTML, and then looking for URLs and links. If you're not linking your pages, we're not going to find the different pages. There are ways around it, but generally speaking you should just make sure that you have links between the different pages of your website. Let's say your home page is a bunch of products that you are selling; if I click on a product, I should probably find links to other similar products, and then we can crawl through your page by following the links.

Also, use actual links. We are not clicking on things. "Oh, but it's a button." Good for you; we're not clicking on a button. Why would we click on a button? We're looking for links. The web has links. Use links. "Oh, but I can have an onclick handler." No. Links. Just links. It's fine, trust me.
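Since crawlers discover pages by extracting `href` attributes from anchor tags rather than by clicking, a click handler on a button or span is invisible to them. A simplified sketch of that extraction (`extractLinks` is a hypothetical toy; real crawlers parse the DOM, not regexes):

```javascript
// Pull navigable URLs out of rendered HTML the way a crawler would:
// only <a href="..."> counts, and fragment-only hrefs are skipped.
function extractLinks(html) {
  const links = [];
  const re = /<a\s[^>]*href="([^"#][^"]*)"/g;
  let m;
  while ((m = re.exec(html)) !== null) links.push(m[1]);
  return links;
}

const page = `
  <a href="/cats/abby">Abby</a>
  <span onclick="goTo('/cats/bella')">Bella</span>
  <a href="#modal">Open modal</a>
`;
console.log(extractLinks(page)); // only "/cats/abby" is discoverable
```

The `span` with the `onclick` handler navigates fine for users, but from the crawler's point of view `/cats/bella` simply does not exist.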
And while we are on the topic: we are looking for URLs, so please do not use hash URLs. The hash hack is something we had to use back in the day, because it was our only way to change content dynamically from a JavaScript event. Now we have the History API; use that. React Router does it by default, so you should be fine, but I have seen some React applications that are best described as legacy. Don't rely on fragment URLs. If you have one, that's fine; we're not saying you can't have any of these, but a fragment does not describe different content. If you're using a hash to load different content, you are hacking the URL, and that doesn't work with crawlers.

You should also make sure that you understand which pages of your website are valuable. If you have really good description pages, really good informational pages, those are the meat of your website, because if I have a question and your website answers it, that's great. If you have user-generated content, you might not know whether that content is good. If you know it's not good, because a lot of the fields you offered as optional fields were not filled in, don't submit it to the index; it just takes up time for us to crawl, and then we might crawl the more interesting, meatier pages of your site later or more slowly. That's not a good thing either.

Optionally, you can create a sitemap and tell us: all these pages are really important to us, the others not so much. Then, even if they're not linked very well, we'll find them through the sitemap. That's what a sitemap is: an XML file with all the URLs that you really, truly care about. You can use some server-side code, or Puppeteer or something like that, to create these. They're not a guarantee that we're going to index the pages; you're just telling us that you care about these URLs, and if we pick it up, it might be one of the signals we use to figure out which URLs to crawl, how often, and when.
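A sitemap, as described, is just an XML file listing the URLs you care about. A minimal sketch of generating one with server-side code at build time (`buildSitemap` is a hypothetical helper and the `kitten.club` URLs are made up):

```javascript
// Build a minimal sitemap.xml string from a list of important URLs.
// Real sitemaps can also carry <lastmod>, <changefreq>, etc.
function buildSitemap(urls) {
  const entries = urls
    .map((url) => `  <url><loc>${url}</loc></url>`)
    .join("\n");
  return `<?xml version="1.0" encoding="UTF-8"?>\n` +
    `<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n` +
    `${entries}\n</urlset>`;
}

console.log(buildSitemap([
  "https://kitten.club/",
  "https://kitten.club/cats/abby",
]));
```

You would write this file to the web root (commonly `/sitemap.xml`) and reference it from robots.txt or submit it in Search Console.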
Also, you don't want to be in this territory: if I'm looking for a recipe for apple pie, where do I click? Are these good search results? Not really, especially if the description is the same for all of them. You see, we show a little bit of description text below the title, and if it says the same thing for all the different recipes on your website, that's not a good experience. Instead, give us specific titles and descriptions. If this says "Barbara's apple pie: this apple pie is really easy to make and quite fun to bake with kids," and that's what I'm trying to do, that's fantastic information that I want while I'm scanning the search results. It's not a ranking factor, let me be very clear, this is not a ranking factor, but it helps your users understand that this is a good search result for them, so you get more clicks that way.

The important bit is easily visible: you can use a plugin called React Helmet to do this in your React pages. You can use all the props, and in the render function you basically specify the title, in this case using the name of the cat, and a description that tells me what this particular cat is doing. Excellent.

But what do I do in a situation like this? I can route people specifically to Abby this way, but I want to be friendly to people who mistype, because phones especially sometimes do weird things with capitalization, so this should be the same page. And because I haven't started from a green field, I actually have a legacy application, so I also support the IDs, and there even used to be a really, really old legacy application with URLs like this. I redirect them all to the same content, but how do we pick? How do we know? You probably want to make sure that we use a consistent URL, and that we're not showing URLs that you don't want to see in search results. Well, this is called a canonical.
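A rough sketch of the per-page title and description idea just described. The talk does this with React Helmet inside the render function; here the same shape is shown as a plain string builder (`catHead` and the cat fields are illustrative assumptions) so the pattern is visible outside JSX:

```javascript
// Build a specific <title> and meta description per cat page,
// instead of one generic title/description for the whole site.
function catHead(cat) {
  return {
    title: `${cat.name} | Kitten Club`,
    description:
      `${cat.name} is a ${cat.age}-year-old ${cat.breed} who loves ${cat.hobby}.`,
  };
}

const head = catHead({ name: "Abby", age: 2, breed: "tabby", hobby: "naps" });
console.log(head.title);       // "Abby | Kitten Club"
console.log(head.description);
```

With React Helmet you would render the same strings as `<title>` and `<meta name="description">` elements inside a `<Helmet>` component in the page's render output.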
Within the React Helmet bit, you can just tell us: link rel canonical, and give us the URL that you would like us to display and use for this specific thing. So you can help us find duplicate cats. We are actually pretty good at filtering duplicates, so if you are in this situation and don't tell us anything, we will pick more or less at random: okay, this one is linked to a lot, so we'll probably go for that one. But maybe you actually want a different one, so just tell us. If we think your canonicals are wrong or not helpful, we might ignore them, but generally speaking, giving us a canonical helps us a lot.

Also, if you want something not to be in the index, you can add a meta tag. We can use this meta robots tag and say: do not index this page. In that case we will crawl it, but then go: okay, this doesn't want to be in the index, so we remove it, and it's not going to show up in search results. If, however, you're okay with being linked to, and maybe showing up in search results sometimes, but you don't want us to crawl your page because your server is really brittle, you can add a robots.txt and tell us not to do that. Then everything under /private will not be crawled by Googlebot. You can also use a user agent of asterisk, and then no bots will crawl this URL.

Be careful with that, though. Here you see my Kitten Club, and there are no images down here, none; it's all just blank, and it says Kitten Club, but why is no image loaded? The problem is, if I check what happens when I call the API, some random API at /cats, it says Googlebot was blocked by robots.txt. I thought I'd save Googlebot the crawling of my API, but my API needs to be crawled to display any content. So don't do that; be really careful with what you put in your robots.txt.
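A sketch of collapsing several spellings of the same page into one canonical URL (`canonicalFor`, the legacy mapping, and the `kitten.club` domain are illustrative assumptions). The resulting URL is what you would put into `<link rel="canonical" href=...>`:

```javascript
// Map every variant of a cat URL (different capitalization, old
// legacy id-based paths) to the one URL we want shown in search.
function canonicalFor(rawPath) {
  // Lowercase so /Cats/Abby and /cats/abby collapse to one path.
  const path = rawPath.toLowerCase();
  // Assumed mapping from an old legacy URL to the new slug.
  const legacy = { "/index.php?page=cat&id=7": "/cats/abby" };
  return `https://kitten.club${legacy[path] || path}`;
}

console.log(canonicalFor("/Cats/Abby"));
// logs https://kitten.club/cats/abby
```

All variants serve the same content, but the canonical tag tells the crawler which single URL to keep.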
We also have something we call structured data and rich results. You've probably seen something like this: here's a product that's highly rated, and here's a bunch of recipes. If you have one of these things, recipes, articles, movies, videos, books, events, all sorts of things, you can get these search results by adding what we call structured data. You give us that, and we'll be happy. We have a Rich Results Test that you can use to find out if your page is eligible, and we have a page that describes all the verticals we support, with lots of documentation on what you need to add to your website.

Also, let's talk about performance. Here's my competitor, a dog site. Was that a good experience? I don't know; if I look at it, the first moment I'm delighted is here, so the time to first doggo is really bad. Luckily, there are things you can do. You should invest in server-side rendering, or hydration, or pre-rendering, depending a little bit on your use case, because it makes things faster. react-snap, for instance, uses headless Chrome to crawl your website and create HTML pages out of it. It looks more or less like this if you want to use hydration as well: we load the static HTML, but then we add the JavaScript interaction on top of it, and then you have a pretty much regular running React app in the browser, but you've also sent the HTML to crawlers that do not run JavaScript, so you kill two birds with one stone.

And server-side rendering does make a difference. If you look at it here, we have the original client-side rendered page, with first doggo here, and the server-side rendered one, with first doggo here. That's the only difference; the rest of the code is the same, just server-side rendering versus client-side rendering, and each gap is about one and a half seconds on a really slow mobile connection.
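The structured data mentioned a moment ago is typically embedded as JSON-LD in a `<script type="application/ld+json">` tag. A minimal recipe sketch, reusing the apple pie example; all field values here are made up for illustration, and the supported types and required fields are listed in Google's structured data documentation:

```javascript
// JSON-LD structured data for a recipe page. The object is plain
// data; on the page it would be serialized into a script tag.
const recipeJsonLd = {
  "@context": "https://schema.org",
  "@type": "Recipe",
  name: "Barbara's apple pie",
  description: "An easy apple pie that is fun to bake with kids.",
  author: { "@type": "Person", name: "Barbara" },
};

// What you would embed in the page's <head> or <body>:
console.log(
  `<script type="application/ld+json">${JSON.stringify(recipeJsonLd)}</script>`
);
```

The Rich Results Test mentioned above is the way to verify that markup like this actually makes the page eligible.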
So that makes a big difference. You can also use one of the frameworks: Gatsby and Next.js are pretty great and actually bring you a lot of this stuff for free, so you don't have to deal with it yourself once you've gotten yourself into them. Gatsby even has SEO documentation, which I think is fantastic and something I'd like to see more of from other frameworks.

If you want to do lazy loading, you totally can. If you use React.lazy and Suspense, for instance, you can load your components in a lazy manner. It doesn't really speed up the first interaction, but it speeds up the following interactions, so that's pretty cool, and it just works with search as well. Be careful to test this properly, though, because there are race conditions where things can go wrong. This is more or less the code for it: we lazily load a specific component, the cats list component, and then we use Suspense from React core to say the fallback for it is the loader, which was this pulsating kitten. So you can use that to lazy-load your components.

There's also a workaround that we call dynamic rendering. Dynamic rendering is a workaround because it doesn't give you the user benefits. Basically (oh wow, can we make it a little darker again so the arrows show up? That'd be cool; if not, that's fine), a browser comes to your server, asks "hi, can I have the thing," and gets back the initial HTML and JavaScript, the client-side rendered app. But if a crawler comes, and crawlers tell you who they are in the user agent, your server branches out and runs a headless browser to pre-render the page into static HTML. That way all the crawlers get the content as static HTML. The downside, and the reason it's a workaround, is that it's only for the crawlers: you don't get the user benefits, and you don't have a faster website afterwards. You can use things like Rendertron, Puppeteer, or prerender.io.
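Dynamic rendering, as described, branches on the user agent: crawlers get pre-rendered static HTML while users get the normal client-side app. A simplified sketch of that branch (`isBot`, the pattern list, and `handleRequest` are assumptions; tools like Rendertron ship much more complete detection and do the actual pre-rendering):

```javascript
// Decide per request whether to serve the pre-rendered static HTML
// (for crawlers) or the regular client-side rendered app (for users).
const BOT_PATTERNS = [/googlebot/i, /bingbot/i, /twitterbot/i];

function isBot(userAgent) {
  return BOT_PATTERNS.some((re) => re.test(userAgent || ""));
}

function handleRequest(userAgent) {
  return isBot(userAgent)
    ? "serve pre-rendered static HTML"   // e.g. produced by headless Chrome
    : "serve client-side rendered app";  // the normal React bundle
}

console.log(handleRequest("Mozilla/5.0 (compatible; Googlebot/2.1)"));
```

In a real server this branch would sit in middleware, with the bot path proxied to a pre-rendering service.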
Those are tools and services that do this for you, and you configure them on your server, so if you don't want to touch your client-side code, that's a way to do it. We also have a codelab if you want to try it out and take Rendertron for a spin; we explain it all in the codelab. There are also videos that explain all of this in a little series, and the series continues, so please stay up to date, because more videos are coming and the first eight are quite useful already.

So what can go wrong? Well, I make a tiny mistake with my arms here, and then this happens. You can fall; that can happen with technology as well. What happens if I go to a URL that doesn't exist? That looks like an error page, right? Except when I check the HTTP status code, it says this is not an error page, this is fine. Normally Googlebot is good at catching that, but it should make you nervous when it happens. Why should it make you nervous? Because you might end up with something like this: "an error occurred" in the search results, because you haven't told us that this is a problem, and this is not the content you would like people to see.

This is how you fix it. Here we are asking the API for a specific cat, and if the cat doesn't exist, we can redirect. The server has answered 200 OK, so we're done on the server side, but on the client side we can say: go to a page that gives us a 404, and we'll be fine. We can also check if there's an error and then use the meta robots tag, set it to noindex, and once we have rendered the page, the processor will go: oh no, this doesn't want to be in the index.

Now, can we do the reverse: ship a meta robots tag that disables all indexing, and then use JavaScript to remove it and say, no, this page actually exists, this is fine? Who thinks this is fine? Smart people, because I did that, and it wasn't fine.
What happens here is not what you think it does. (I have to use my own reaction GIFs, because I don't get copyright clearance for the others.) What happens is that the processing stage sees the noindex and goes: look at that, processing has finished, but we don't need to render and we don't need to index, because it says noindex. So the JavaScript never runs, and our page stays out of the index. Not very cool.

There are more things. For instance, if you rely on cookies: let's say you have a home page with a cookie prompt, and on every other page you check whether that cookie has been set and redirect back to the home page if it hasn't. Then Googlebot gets stuck there, because Googlebot doesn't persist data. We don't set cookies. You can use cookies, local storage, session storage, IndexedDB, but we're not going to persist that data; we throw it away once the page has finished being indexed. So don't rely on these.

Also, be careful with feature detection. Feature detection is a good thing, but sometimes it's not sufficient. Here we check whether the browser has geolocation; if it does, we load local content for the location the user is at, and if not, we load global content. The downside: this is a problem, because what happens if I decline? The browser has geolocation, but I declined, so the success callback never fires, because I declined the permission request. And Googlebot declines permission requests, so we have no content on this page, unless we add a fallback handler for the error case. Now the logic is: does this browser have geolocation? Yes? Cool, then load the local content on success, or load the fallback content on failure. Now we have content in every case, so this is good. If you want to learn more about these specifics, you can go to the troubleshooting guide.
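The feature-detection fix above can be sketched as follows: the error callback of `getCurrentPosition` must also produce content, because Googlebot, like a user who declines the permission prompt, rejects the request. `loadLocalContent` and `loadGlobalContent` stand in for hypothetical app functions, and `navigator` is stubbed so the sketch runs outside a browser:

```javascript
// Feature detection with an explicit fallback: every branch, including
// a declined permission request, ends with some content being loaded.
function loadContent(nav, loadLocalContent, loadGlobalContent) {
  if (nav.geolocation) {
    nav.geolocation.getCurrentPosition(
      (pos) => loadLocalContent(pos),  // permission granted
      () => loadGlobalContent()        // declined or failed: fall back
    );
  } else {
    loadGlobalContent();               // no geolocation support at all
  }
}

// Simulate a crawler that declines the permission request:
const crawlerNavigator = {
  geolocation: {
    getCurrentPosition: (_ok, err) => err(new Error("denied")),
  },
};
loadContent(
  crawlerNavigator,
  () => console.log("local content"),
  () => console.log("global content") // this branch runs for the crawler
);
```

Without the second callback, the crawler (and any user who declines) would be left staring at an empty page.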
If you want to test your site, we have two fantastic tools for it. One is the Mobile-Friendly Test. It tells you if your page is mobile-friendly, who would have guessed, but it also shows you what the page looks like: is it blank, or does it look like the page I'd expect to see? It gives you the rendered HTML, so you can find out what the rendered HTML actually contains, and it gives you JavaScript errors, including stack traces, so if something goes wrong you can figure out why.

The other is Search Console. It tells you how many of your pages are in the index, how many were included and excluded, and how many had errors. It also tells you why something isn't in the index; for instance, it might be a soft 404, because we got what looked like an error on a page where we might not have expected one. It also shows you analytics on how often you show up in search results, which is pretty nice, and you can test URLs live: you can enter any URL on your domain, it does some spinny-spinny, and then it tells you "this is not on Google," and you can see why not; maybe it has been crawled but not indexed yet, so it's somewhere in the middle.

Be warned: these tools haven't updated to the latest Chrome yet, but we are working on that, so stay tuned on our blog and on Twitter; we'll let you know when that happens. Thank you very much, I hope you had a great time. [Applause]
Info
Channel: ReactNext
Views: 16,055
Rating: 4.9790578 out of 5
Keywords: react.js, react, reactjs, react next, reactnext
Id: 3B7gBVTsEaE
Length: 28min 20sec (1700 seconds)
Published: Thu Jul 04 2019