Improving SEO with (Dynamic) Sitemaps in Next.js

Captions
Hey, how's it going, everyone? It's Leigh Halliday, and in this video we're talking about sitemaps in a Next.js application. A sitemap is the XML file that tells Google what all of the different pages on your website are, so that it can find them and index them, giving you better SEO and searchability.

The site we're generating a sitemap for is this one right here. It has a secret page that just says the word "secret", which we're going to exclude from the sitemap, and it has capsules: these are SpaceX capsules, and clicking into one shows the details for that specific capsule. It's a pretty simple site, but it covers a number of different concepts, including dynamic server-side pages that you can't include in your sitemap without doing some custom work.

Quick code tour: we have our home page, which links to the secret page (just rendering a div that says "secret"). We have a capsules index file, the list of capsules, which uses SWR to load them client-side from the SpaceX API. And for each individual capsule there's a dynamic page that takes in an id, and in its getServerSideProps function uses that id to fetch the data from the SpaceX API and return it as props, so it can be rendered inside the actual page-level component.

The reason I chose getServerSideProps is that these pages aren't known at build time, so you can't just build a static sitemap file and call it a day. The data is constantly changing, so we need a static sitemap combined with a dynamic one.

To do this we're going to use a package called next-sitemap. It has great documentation, and we won't cover all of its options here, like splitting a large sitemap into multiple files, so I recommend reading through them. It's been great; I've used it on a number of my websites.

The first step is to add a new script called postbuild that calls next-sitemap. When you deploy your code to, say, Vercel, it runs yarn build, and when that's done it runs your postbuild script if you have one, which generates the sitemap. If we open a new terminal tab and run yarn build, it builds all of the static pages (this takes about 15 seconds), and when it's done you'll see it try to call next-sitemap and fail. It fails because it needs a file called next-sitemap.js.

So let's build next-sitemap.js. It's just a module.exports of an object, and at a bare minimum that object needs a siteUrl. We'll put it into a variable; this isn't a real website, so we'll say it's capsules.com. If we now run yarn next-sitemap (I'll skip the full build from here on out), it's done. What it did is build a sitemap inside the public folder: an XML file that lists the different pages on your website and the last time they were modified.
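For reference, the two pieces from this step look roughly like this. First the postbuild script in package.json (npm and yarn run a postbuild script automatically after build; the other scripts here are just the usual Next.js defaults, not taken from the video):

```json
{
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "postbuild": "next-sitemap",
    "start": "next start"
  }
}
```

And a minimal next-sitemap.js at the project root, using the placeholder domain from the video:

```js
// next-sitemap.js
const siteUrl = "https://www.capsules.com";

module.exports = {
  siteUrl, // the bare minimum next-sitemap needs
};
```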
If we go to /sitemap.xml now, we can see it: three pages, our home page, our capsules index, and the secret page that we want to remove. But notice that none of the individual capsule pages are there, because they're dynamic, and there's no way for this code to know about them in order to put them into the sitemap. That's what we're going to solve today.

There's another option you'll probably want to add: generateRobotsTxt: true, which generates the robots.txt file. If we run next-sitemap again, it now creates a robots.txt. Google's indexer uses this to find out which pages on your website you're allowing it to index and which you aren't, but it also tells Google where to find the sitemap. Here's the one that's been created, available in the public folder, so we're good to go there: it declares the host, and for any user agent it can currently index everything (/). We'll tighten this up a little later.

One thing I like to do right off the bat is add these files to .gitignore: /public/robots.txt and /public/sitemap.xml. The reason you don't want to commit these two files is that when you deploy, after the build runs on Vercel or wherever you're hosting, you want them generated fresh on that server. You can commit them, there's no real problem, but that's my preference.

The next thing to cover is how to exclude that secret page, and it involves excluding it from two places. First, exclude it from the sitemap by adding the exclude option: an array where we list /secret (if it were a folder, you could exclude everything inside it). Run the generator again, check the sitemap: no more secret in there. Maybe it's a page you don't want Google indexing, maybe it's not important to SEO, or it's something private you want to hide.

But what if I also want robots.txt to disallow this page, so that if Google happens to find it even though it's not in your sitemap (maybe somebody linked to it from elsewhere), it won't index it? We can add some options for the robots.txt file under robotsTxtOptions. One of them is policies, an array where each entry has a user agent; we'll target all user agents, disallow /secret, and then add one more policy to allow /, the rest of the website. If we regenerate and look at robots.txt, it now disallows /secret and allows everything else.
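Putting those pieces together, next-sitemap.js at this point looks something like this (a sketch assembled from the steps above; exclude, generateRobotsTxt, and robotsTxtOptions.policies are all documented next-sitemap options):

```js
// next-sitemap.js
const siteUrl = "https://www.capsules.com";

module.exports = {
  siteUrl,
  generateRobotsTxt: true, // also emit public/robots.txt
  exclude: ["/secret"], // keep /secret out of the sitemap
  robotsTxtOptions: {
    policies: [
      { userAgent: "*", disallow: "/secret" }, // tell crawlers to skip it
      { userAgent: "*", allow: "/" }, // everything else is fair game
    ],
  },
};
```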
That takes care of secret, but now we have to talk about the dynamic pages: all of those capsule pages that aren't known at build time, because they're generated dynamically at runtime. We're going to create a new folder called server-sitemap.xml (a weird name for a folder), and inside it put an index file, index.tsx, because we're working with TypeScript today.

The first thing we need to do is make Next.js happy: export a default page-level functional component called Sitemap. It doesn't even return anything, because this component will never actually be used; Next.js just likes to have a default export. Then we implement getServerSideProps, which will work out all the pages from the API that we want to include in this separate sitemap file.

We import two things: the GetServerSideProps type from next, and getServerSideSitemap from next-sitemap. Then we export the getServerSideProps function, typed as GetServerSideProps: an async function that receives the context. The context includes everything about the request, any query params, and a response object for telling the server how to respond; we won't touch that directly. Instead, we return a call to getServerSideSitemap, which wants to receive the context and an array of ISitemapField values. We'll call it fields and create it as an empty array for now. TypeScript complains, because the empty array could be an array of anything while the function wants ISitemapField[], so we import that type, annotate fields with it, and now it's happy.

But what do we put into the fields array? We need to find all the capsules that will have pages on our website, and for that we call the SpaceX API, the same URL the capsules index page uses. We do it in two steps: the response is an awaited fetch call to the API, and the data is the response converted to JSON. We'll type it as an array of any for now; we could call it capsules and type it properly, but let's leave it and see what comes back.

Nothing special so far: we can visit /server-sitemap.xml, and although we haven't given it any fields yet (the page URLs that belong in this separate sitemap), we can see what each capsule looks like. What matters to us is the id, because that's what gives us the page URL for a specific capsule.

So let's put them into the fields array: capsules.map over each capsule, producing an object per capsule with a loc, the full URL to the page, https://www.capsules.com/capsules/ plus the id (switching the string to backticks so we can embed capsule.id). Another field to include is lastmod, when the page was last modified; if you have a way of knowing that, by all means use it, but otherwise new Date().toISOString() will do. Hitting save, Prettier tidies it all up.
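Assembled, the file looks roughly like this. It's a sketch of what's described above: the SpaceX endpoint URL (api.spacexdata.com/v4/capsules) and the /capsules/:id route are my guesses at the exact values used in the video, and the ISitemapField property names are loc and lastmod:

```tsx
// pages/server-sitemap.xml/index.tsx
import { GetServerSideProps } from "next";
import { getServerSideSitemap, ISitemapField } from "next-sitemap";

export const getServerSideProps: GetServerSideProps = async (ctx) => {
  // Fetch every capsule so we can build one sitemap entry per detail page.
  const response = await fetch("https://api.spacexdata.com/v4/capsules");
  const capsules: any[] = await response.json();

  const fields: ISitemapField[] = capsules.map((capsule) => ({
    loc: `https://www.capsules.com/capsules/${capsule.id}`, // full URL to the page
    lastmod: new Date().toISOString(), // use real modification dates if you have them
  }));

  // Writes the XML response for us; no props are returned for rendering.
  return getServerSideSitemap(ctx, fields);
};

// Next.js requires a default export for files in pages/, but it never renders.
export default function Sitemap() {}
```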
If we console.log the fields and refresh the page, we can see all of these pages in the fields array: capsules.com/capsules/ plus each id. Let's make sure one of them actually works by visiting that page on localhost... hmm, what did I mess up? Is it not capsules/id? I'm interested to know why it didn't load; let me stop the server and refresh. Okay, there we go: now you can see it's generating our server-side sitemap dynamically, with the data coming from the API.

So how do we tie this together? How will Google know there isn't just the one sitemap file, but two? Remove the console.log, then go back to the config and add another option to robotsTxtOptions: additionalSitemaps (the completed config is sketched after the wrap-up below). This is an array that points to all of the other sitemaps you have, so you can have the main one plus the dynamic server-side one. We'll embed the siteUrl plus /server-sitemap.xml. Actually, on its own this would only tell Google about the dynamic one, so you'd probably want to add the original as well: siteUrl plus /sitemap.xml. Maybe you don't strictly need it, since sitemap.xml is the default location, but I don't think you'll be penalized for being a little more explicit. Generate one more time and look at robots.txt: there are our two sitemaps, sitemap.xml (which I don't think you need, but added anyway) and the dynamic one, which Google will hit to find all the pages that can't be included in the static file.

So with this tool we've built a static sitemap file, which also generates our robots.txt; we've excluded our secret page both from robots.txt and from the sitemap itself; and we've come up with a dynamic sitemap generator for all of the pages that come from an external system and aren't available at build time. With this, I hope you get fantastic SEO and many people find your website. All right, take care. Bye!
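For completeness, here's a sketch of the finished next-sitemap.js with the additionalSitemaps option added (again using the video's placeholder domain):

```js
// next-sitemap.js
const siteUrl = "https://www.capsules.com";

module.exports = {
  siteUrl,
  generateRobotsTxt: true,
  exclude: ["/secret"],
  robotsTxtOptions: {
    policies: [
      { userAgent: "*", disallow: "/secret" },
      { userAgent: "*", allow: "/" },
    ],
    additionalSitemaps: [
      `${siteUrl}/sitemap.xml`, // the static one (likely implied by default)
      `${siteUrl}/server-sitemap.xml`, // the dynamic one built above
    ],
  },
};
```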
Info
Channel: Leigh Halliday
Views: 3,667
Rating: 4.96 out of 5
Keywords: nextjs, nextjs tutorial, nextjs sitemap, nextjs robots.txt
Id: fOoH9Z5adrg
Length: 17min 29sec (1049 seconds)
Published: Mon Jul 12 2021