How Strings Work in C++ (and how to use them)

Captions
Hey guys, my name is The Cherno, and welcome back to my C++ series. Today I'm going to be talking all about strings in C++. Two videos that you should watch, if you haven't already, are the pointers video and last week's video (or rather the video from two days ago), which was about arrays, because as we're about to discover, strings are very much tied to arrays.

So first of all, what is a string? In general, a string is essentially a group of characters. By characters I mean letters, numbers, symbols, that kind of stuff. It's basically text. It's very common for us as humans to want to represent text in some way, shape, or form on our computers, so in programming we have this problem of wanting to be able to represent text. That text could be a single character, an entire paragraph, a single word, or a bunch of words. All of that is called a string; it's a string of text. We need some way to represent that in our program, and that's what a string in C++ is: a way for us to represent and manipulate text.

In this C++ series, you know that I like to go in depth and share with you guys how things actually work. For me personally, when I'm learning something new, I really like to understand how it actually works, because that helps me understand how I can use it. If you just tell me what to do, to type string in code for example, that's cool, I know what a string is. However, in the future, when I run into something and I'm not sure whether it will work, knowing how the underlying technology works lets me make an educated guess and determine whether what I'm trying to do is even possible in the first place. So to understand how strings work in C++, you first need to
understand how characters work and what characters actually are. Characters, being things like letters, symbols, and numbers, are represented in various forms. This is not going to be a video about the different character encoding systems we have in the world, because there are probably too many to count, they're all fairly complicated, and they all depend on what the specification actually is. We may go in depth about that in a future video, but today we're going to talk about how characters work in general.

You may have noticed already that there is a data type in C++ called char, which is short for character. So far we've used it in the sense that it is one byte of memory, so it's useful for things like casting a pointer to a char pointer so that you can do pointer arithmetic in terms of bytes. It's also useful for allocating memory buffers: if you want to allocate one kilobyte of memory, you can allocate 1,024 chars and there you go. But it's also very useful for strings and text, because the way C++ treats characters by default is as ASCII characters. Again, text encoding: I'm not trying to get into that in this video. But the way we deal with characters in C++ is that a character is one byte, which is all ASCII needs. ASCII can be extended, and there are a lot of other encodings: UTF-8, UTF-16, UTF-32. We have things like wide strings, and we have character sets in which characters are bigger than one byte: two-byte, three-byte, and four-byte characters. For other languages such as Japanese or Chinese, which have different characters from the ones we see in English, we need to be able to use those, because one byte simply isn't enough. If we only have one byte to represent a character, that's 8 bits, which means we have 2 to the power of 8 possibilities for what that number could be, which is 256
possibilities. There are way more than 256 characters once you take into account English letters, numbers, symbols, Japanese characters, Korean characters, all that kind of stuff. So we can't fit all languages into an 8-bit character set; that wouldn't work. So we have UTF-16, for example, which is a 16-bit character encoding, which means each 16-bit unit has 2 to the power of 16 possibilities, which is 65,536. We have all these other things, but C++, just the base language without any libraries, uses the primitive data type char, which is one byte. That's why, when you use a string in C++, as we're about to (not a wide string, which is two bytes per character on Windows, but a normal string, which uses a normal char for each character), we're talking about one-byte characters, and we're primarily talking about just being able to do English. If you need some other language, such as Russian, you're not going to be able to do that using this; you'll have to use a different character encoding. We'll talk about other languages and all that in the future. It is interesting to talk about, though, because strings and text, and maybe later in the game engine series font rendering, are actually a hugely complicated problem that most people overlook. Text in general, and language, is a massively complicated problem.

OK, so that's a pretty big introduction, but let's talk about how strings actually work in C++. I mentioned characters and I mentioned the char data type. A string is basically an array of characters; that's why I want you guys to watch that arrays video (linked in the description). An array is a sort of set of elements, so we have an ordered set of characters which make up a string of text. You may have noticed in this series that we've often referred to strings as char pointers, so let's take a look at how that works. We can declare a C-style string by
writing const char pointer (const char*), then the name of the string, like name, and then setting it equal to some text in double quotes. We'll set it equal to Cherno, for example. This is essentially the C-style way of defining strings. We do have a library in C++ which makes string operations much easier for us than this, but it's still important to know how this works so that you know how the C++ version works. You also don't strictly need to declare it as const char, but the reason people usually do is that you really don't want to be going around changing the value of these. Strings are immutable in the sense that you can't just extend a string and make it bigger, because this is a fixed allocated block of memory. If you want a bigger string, you need to perform a brand new allocation and delete the old string.

Now, just because this is a char pointer does not mean it is actually heap allocated. You do not delete these things, either by calling delete name or delete[] name or anything like that. We haven't talked about new and delete and all this heap allocation stuff yet; that's coming very, very soon, and I know we need to get into it. But the rule of thumb is: if you don't use the new keyword, don't use the delete keyword. In this case we made this Cherno string, and we didn't write new char or something like that; we just wrote Cherno, so no delete is required.

If you do declare it const, that of course means you can't change the contents of it. Again, we'll have a const video in the future. But you cannot, for example, change the third character here, which is the letter e, to be something like a. You won't be able to do that because you've marked it as const. So if you know that you're not going to modify the string, mark it as const. Otherwise some compilers will let you leave it as a plain char pointer, though since C++11 the standard only allows a string literal to initialize a const char pointer.

OK, so the next question is: cool, this is a string, but what does it look like in memory, and how does it actually work? So we'll set
a breakpoint over here and hit F5 to run our program, so that we can look at the memory that makes up name. If I go to my memory view over here and type in name (it's already a pointer) and hit Enter, you can see that we have a bunch of memory here, and if you look over here we have the word Cherno. What this side of the memory view shows is basically an ASCII interpretation of the bytes: these are all individual bytes, and this is what each byte would be if you converted it into ASCII. You can check out a website such as asciitable.com, which has a table of what those ASCII codes actually are. You can see that in this case we have 43; this is in hexadecimal, so if we go to the hex column here and find 43, you can see that it corresponds to a capital letter C. That's what you see over here; that's the first letter of Cherno.

Now, ignoring all these other strings, which are basically debug-only helpers, you can see that we have those six characters which make up the word Cherno, and then we have a byte that is set to zero. This is called the null termination character, and that is how we know where the string ends. Note that we don't know what the size of Cherno is; it's just a pointer, right? So how can we find out what the size is? That's where the null termination character comes in. A string begins at that pointer, at that memory address, and then it continues on until it hits zero. That's how, when we decide to print this out to the console, for example, we can write std::cout << name. If I rerun this, you can see that we get Cherno printed to the console, and yet it's just a pointer. So how does it know where it ends? It runs into that zero and realizes: OK, that's my null termination character, that's the end of the string.

If you were to declare this by yourself, for example, I'll create another string called name2. I'm going to do this fully manually, so I'll just make a new char array of six characters, and then I'm going to initialize
it right here: I'll set it equal to the individual characters. Characters in C++, by the way, are defined with single quotes, not double quotes. If it's double quotes, then by default it becomes a char pointer, OK? Not a string, a char pointer; we'll get into strings in a minute. We have C, h, e, r, n, o. Now, this is an array, not a string, right? Just an array of six characters, and you can see there's no null termination character. So let's try printing name2 to the console (we can inspect the memory as well, but first we'll print it to the console). Here you can see that we get Cherno and then a whole bunch of random characters, which is again the ASCII interpretation of whatever the memory after name2 was. If we go back to the memory view and type in name2, you can see that we have our Cherno written and then a bunch of weird characters, where the memory is set to cc. Those are actually array guards, which let us know that that memory is outside of our allocation. Whenever we allocate arrays and memory in debug mode, we link against the debug version of the C standard library, which inserts things like these guards so that we know when we're writing outside the bounds of our allocation. Again, that's a topic for another video. So because we don't have that zero here, std::cout does not actually know where to stop printing, which is why we get this random stuff here.

However, if we were to expand the array and write a zero at the end, it works. You can express it as backslash zero ('\0'), which is the actual ASCII character (if we go back to the ASCII table, you can see that null is what it actually is); that's how you declare a null, you write backslash zero. Of course we've got seven characters now, right? Or you can literally write the numeric constant zero as well, because that is the actual value of it. If we hit F5, you can see that now we're printing Cherno properly with nothing else. That's how char arrays work, and that's basically how a string works.
A string is a collection of characters. I think that's probably deep enough; if you have any other questions about how that actually works, just leave a comment below and I'll try to answer as many as I can. I think I've mentioned everything.

OK, so C++: how does C++ come into this, and how should we actually be making strings in C++? The standard library in C++ has a class called string. It actually has a class called basic_string, which is a template class, and std::string is basically a template specialization (that's the term I'm looking for) of the basic_string class with char as the template parameter, which means char is the underlying data type for each character. So that is really what you should be using. There is something called wstring, which is the wide string; again, we're not going to talk about that, we're going to keep it really simple here. std::string is what you should be using for strings in C++.

How does std::string actually work? Basically, it's just a char array: an array of chars and a bunch of functions built in to manipulate it. Later on in this series, when we start talking about data structures, we're actually going to write our own data structures, so for all the C++ data structures that you see in the standard template library, we're going to manually write our own versions and see how that works and how we can optimize and all that stuff. Stick around for that if you're interested. But for now, it's basically just an array of characters and functions built to manipulate it.

So let's talk about how we can use std::string. OK, let's turn to our program; we'll change this current setup to use a standard string. The first thing we need to do is include string. iostream actually does have a definition of string, but if we want to
print it to the console, as we'll see in a minute, we actually need to include the string header file. We'll change this char pointer to be an std::string, and that's actually it; we're done. std::string has a constructor that takes in a char pointer, or rather a const char pointer. If you hover your mouse over this literal, you'll see that it is actually a const char array, not a char array. That's one thing I forgot to mention when I talked about const char pointers, and it's why you typically assign it to a const char pointer instead of a char pointer: intrinsically, when you define a string literal by putting a word or multiple words in double quotes in C++, it's actually a const char array, not just a char array. If you need to manipulate the contents of the string, though, copy it into your own char array or std::string first; writing through a char pointer to the literal itself is undefined behavior.

So to print this, again we can just write std::cout << name, and you can see that this is a lot clearer than having a const char pointer. If we hit F5, you can see it works the same way. Now, if I hadn't included the string header file and just went with iostream, you can see we get an error on this output stream operator, telling us that we cannot send a string into std::cout, our output stream, because the overload of this operator that allows us to push a string in there is inside the string header file. That's why I included string, even though iostream actually does have a definition of it.

And of course, because this is a proper class with a bunch of functions, we actually have all these methods, such as name.size(), with which we can find out what the size is. If we just had a const char pointer or a char pointer, we would have to use C functions like strlen, which gives the length of the string, and strcpy to copy strings, and all this stuff. Here we have these functions defined for us inside the string class, which makes it lovely. That is how we use strings.

Now, another common thing that you might find yourself doing is appending strings. I want to do Cherno
plus hello, or something like that. Now, you might get errors here. The reason this happens is that you're actually trying to add two const char arrays, right? This double-quoted thing is a const char array; it's not an actual string. You can't just add two pointers together or two arrays together; it doesn't work that way. So if you want to do something like this, a nice easy way is to split this up into multiple lines, because now you're doing name += hello. What you're doing there is adding a pointer to name, which is a string; you're adding it to a string, and += is overloaded in the string class to let you do that. Or, one thing that I do quite often as well, is just surround one of them with a string constructor, explicitly calling the string constructor. So you're making a string out of this and then appending the other one to it, and that will be totally fine. Sure, this way you might end up with more copies, but for most purposes it's fine.

If you want to find text in a string, you can use name.find and then the string of text that you want to look for. For example, I want to look for the no inside Cherno hello, so I'll do name.find no. If that does not equal std::string::npos, which is basically an illegal position, then it contains that no, because what name.find actually returns is the position of the text. In this case it will return the position of the beginning of it. So if you just want to see whether it contains something or not, you use this, because there's no actual contains function (at least not before C++23).

Anyway, I could go on and describe the entire string API to you. I've put a cppreference link for std::string in the description below, so you can check that out if you want to see everything that the string class offers. But yes, strings are super easy to use, and we'll be using them a lot.
Strings will keep coming up in the rest of this series and in another series that I may be starting soon (wink). So yes, that's pretty much all there is to it.

One other thing that I want to quickly mention is passing strings around to other functions. If I wrote a function called PrintString and I wanted to be able to pass a string in, I would not simply write std::string string as the parameter and then have my std::cout << string. The reason I wouldn't do this is that then it's actually a copy. We haven't talked about this too much, but when you pass a class like this to a function, what you're actually doing is creating a copy of that object and giving it to the function. So if I were then to do something like string += h, it wouldn't affect the original string that was passed in, say, over here. However, this is clearly a read-only function; we're not going to be modifying anything, we just want to print it. So why would we copy an entire string? Copying a string generally means we have to dynamically allocate, on the heap, a brand-new char array to store the exact same text that we've already got. That's not fast. String copying is actually quite slow, and it's a major choke point in some cases, because string operations are so common. So whenever you pass a string like this and it's going to be read-only, make sure you pass it by const reference. We'll add const over here at the front and an ampersand for the reference. What that tells us is that this is a reference, meaning that it won't get copied, and const means we're promising not to modify it here. Again, I say promising because technically we could override that if we wanted to, but of course we're promising not to, so we're not going to. I'll have a video in the future about const references, or maybe a few videos, because there's a lot to them and there are a few tricks we can talk about. But that is basically what strings are.
Make sure you hit that like button if you enjoyed this video. If you really enjoy this series and you want to see more videos like this, you can support me on Patreon at patreon.com/thecherno. There's a really nice community we've got going on Slack. I'm actually thinking of maybe switching to Discord, because I think Patreon offers Discord rewards and it's all built in, so it might be a bit easier to set up. Basically, if you pledge a certain amount you get access to draft videos, as well as a discussion thread in which we can talk about what goes into these videos and plan them together. That's pretty cool, and of course it helps me make more videos, which is always good.

If you have any questions about this episode, or about strings, or anything, just leave a comment below and I'll try to reply to as many as I can. This video was a bit more easygoing; I didn't script anything as much, and it was more of a casual conversation. Let me know what you think of these kinds of videos. They can get a bit rambling, and looking at the time here, a lot has passed, but I kind of like these more laid-back videos. They're easier for me to make as well, and I don't get as stressed, because I've moved to Melbourne now and I'm going to work every day, and coming home and just having a conversation with the camera probably means that I can release more than one video a week, which is good. So let me know what you think about that in the comments below, and I'll see you guys later. Goodbye. [Music]
Info
Channel: The Cherno
Views: 277,704
Keywords: thecherno, thechernoproject, cherno, c++, programming, gamedev, game development, learn c++, c++ tutorial, strings, strings in c++, how do strings work, c++ text, text, font, characters, char, std::string, c++11
Id: ijIxcB9qjaU
Length: 19min 26sec (1166 seconds)
Published: Mon Aug 21 2017