strtok() function | C Programming Tutorial

Captions
In this video I'm going to talk about the string tokenization function in C, called strtok, which allows us to take a string and split it up based on some delimiters we give it. To use this function I have to include string.h, because that is the library where the function is found. This library comes with C, so if you've got C, you've got this library.

Then we'll have to make a string, so I'll say char s[] is equal to "this is the way". I'm also going to need some kind of delimiter, so I'm going to say char d[], for delimiter, is equal to " ", a space. It doesn't have to be a space; it could be anything. It could be a comma. I could even have multiple delimiters, so space and comma could both be delimiters. When the function encounters a delimiter, as in any one of the characters in this delimiter string, it stops and gives us that portion of the string in return.

So if we call the function, strtok, and give it s and d, it's going to look at this string, and as soon as it encounters the first delimiter defined in the delimiter string, it stops and returns that portion of the string. We want to keep that portion so we can print it out, so I'm going to say char *portion1 is equal to strtok(s, d). I'll call it portion1 because I'm going to have multiple portions. Then I'm going to say printf with %s, and we'll print out that first portion, portion1.

Let's compile it and see what happens. We get back "this", and that makes sense, because what we wanted was the first portion of the string up until, but not including, this delimiter, because that is where we split the string up. So we got "this" back, and we could
keep calling the function to get the other portions of the string. The way the function works is that when you call it the first time with a string like this, it knows it is being given this string for the first time, so it has to start from the beginning of the string. If you call the function again and give it NULL as the first argument, the function knows to use the prior string it was given. When you call it the first time it keeps a reference to this string, and when you call it again with NULL it takes that as a signal that you want to keep looking at the remaining portions of this string.

So we'll call it again, and this time we'll call it with NULL. I'll say portion2 now, and I'll say NULL here. Because I'm giving it NULL, the function looks at this argument and says: they gave us NULL, they want us to use the string we were given last time, so go get the reference we've got to that string and use it. Now we expect "is" back, because it should be the second portion of the string, from where we left off up until the next delimiter, which is going to be a space again. We could actually change the delimiter here too, by the way. We could give a comma or something else here and that would be okay, but we're going to keep the delimiter as a space in this case. So we'll run this, and I expect to get back "this" and then "is".

I could keep going to get the other portions of the string, so let's do char *portion3 and char *portion4, each calling strtok with NULL and d. That should give us back the remaining portions of the string: we should get back "the" and then "way", which should be the last portion, because we reach the end of the string at that point.
Right, so let's give this a try. We run this and we get back "this", "is", "the", "way". This is the basic idea when we're using this function: we split up the string into these different portions, and we can then access them as individual strings and use them. Maybe we're dealing with comma-separated data, and we want to split up the string based on those commas and deal with each portion individually. That might be a use case for this kind of function.

Now, just so we understand how it's working under the hood, let's explore that. Number one: how do we know when to stop calling the function? In this case we knew to stop because we knew there were four portions: a portion, then the delimiter, then a portion, then the delimiter, and so on. So we knew to stop calling it after four attempts. But what if we didn't know how many spaces there were in our string, or how many portions there were? In that case we have to rely on the function's behavior of returning NULL when it's done. strtok returns NULL when it can't find any more portions.

So if I try it again here, if I say char *again is equal to strtok with NULL and d one more time, let's check what again is equal to. It's going to turn out that again is equal to NULL. So if I say if again is equal to NULL, printf "we're done", else printf "still more to go", we get "we're done", because strtok at this point is done. It has processed the entire string and there's no more to find, so it returns NULL. That's its behavior when it's done working with the string: because it can't find anything
else, it returns NULL. again is going to store that value, we check if it's NULL, and if it is, we print out "we're done". So that's how we know when we're done: it's when strtok returns NULL.

We can write some code to process our string based on this. We'll comment the earlier code out for a second and do it this way instead. We say char *portion is equal to strtok(s, d), so we get the first portion. Then we say while portion doesn't equal NULL, and in the loop we printf the portion and then call the function again: portion is equal to strtok(NULL, d).

Taken all together, we're calling the function at least once, and so long as the portion isn't NULL (it could be NULL right away if the string is empty, for example), we print out that portion. The portion could be the entire string, but we print it out, and then we try to get the next portion of that same string by repeatedly calling the function, this time with NULL. So long as that portion isn't NULL, we keep printing it out.

So now we can programmatically handle strings with varying numbers of portions, because we're relying on the fact that this function returns NULL when it has processed the entire string. Instead of "this is the way" we could say "this is the way again", and we can handle a string of a
different number of portions now. We can clear this out, compile, and run it, and we get "this", "is", "the", "way", "again". So we can programmatically handle strings with different numbers of portions.

One other thing we should go over is what this portion actually is. It's a pointer to a string, but is it a new string? One thing strtok could be doing is creating an all-new string on the heap, with portion then being a pointer to that new string. It turns out that's not the way strtok works. It returns a pointer into the existing string, and it replaces the delimiters in the existing string with the null terminator. So it cuts up the original string into a series of strings.

I'll show you what I mean by that. Let's comment the loop out for a bit and play around with the first call again. We say char *p1 is equal to strtok(s, d), and we know from our past experience that this returns the string "this". But what is p1, what is that actual memory address? Let's print it out with %p, and let's also print out the memory address of s so we can compare. We compile and run it, and we get that p1 is this memory address and s is this memory address: they're the same.

So strtok is not allocating space for all-new strings on the heap and then giving us pointers to those. What it's actually doing is inserting null terminators into s. If we look at s after we call strtok,
that is, after we do a full run of it, we're going to find that there are a bunch of null terminators stuck in there, and all the pointers we're getting back are just pointers to portions of the string: this portion, then that portion, then that portion, and that portion.

So let's printf all the characters of s after we've tokenized the whole string. We'll loop from 0 up to however many characters we've got here. Counting them, I get about 23 characters, maybe more if we want to include the null terminator, and we'll say i++. Let's do an experiment: if the character in the string at i is equal to the null terminator, we print out the null terminator. We've got to do that manually, writing it out as a backslash and a zero followed by \n, because we can't printf the null terminator and have it show up as a null terminator. Otherwise, we print the actual character in the string, so printf %c with s at i. We'll print each one on a new line so we can see what's going on.

So we're going to look at s after we've done this tokenization process, going from the beginning of the string to the end and printing out all the characters. As we encounter null terminators we print them out specially, as \0.
So let's see what we get. We clear this, run it, and look at what we've got: "this", null terminator, "is", null terminator, "the", null terminator, "way", null terminator, "again", and then a null terminator. What's going on is that when we call strtok, it gives us a pointer to the next portion of the string as it goes, but it also sets the delimiter to a null terminator to actually terminate that portion of the string.

So just be aware that when you use this function, it modifies the original string and, depending on your opinion, pollutes it with these null terminators. You might be okay with that, you might not. If you're not, what you might want to do is copy your original string into some other array where you are comfortable with the string being polluted with these null terminators.

So that's the string tokenization function in C. It's a fun one, and I hope you like this video. Check out portfoliocourses.com, where we'll help you build a portfolio that will impress employers, including courses to help you develop C programming projects.
Info
Channel: Portfolio Courses
Views: 161
Rating: 5 out of 5
Id: nrO_pXGZc3Y
Length: 12min 36sec (756 seconds)
Published: Sun Jul 25 2021