ARRAY - Making DATA STRUCTURES in C++

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hey what's up guys my name is the churno welcome back to my c plus plus series so far in this sequel series we've been mostly learning about various individual language constructs and just things that we can do in the language and how the language itself works but now it's time to take that and actually put it into practice and the way that we'll be doing that is by learning about how to write our own data structures now just to be clear by data structures i'm talking about things like arrays lists sets maps trees that kind of stuff whilst the c plus plus standard template library comes with most of these data structures already implemented for you i think it's very valuable to try and write your own because apart from learning more deeply about how these data structures actually work and why they might be more useful than other data structures you'll also learn a lot about the language having to write kind of efficient versions of these data structures along with data structures another really important part of programming is of course the patterns or the algorithms that will operate upon the data inside these data structures and that's stuff that we'll be covering as part of this series a little bit later on so today we're going to be starting with talking about arrays and we're going to be creating a very very simple array just a static kind of stack allocated fixed size array something that can hold a certain amount of elements of data contiguously and something that will be easy for us to interface with and use as a programmer using that array this is basically going to be the standard array class that is provided to us by the stl in my experience looking through people's code particularly people who are quite new to language or learning the language even some people who have been using the language for a long time this is not something that a lot of people use a lot of people jump into using vectors straight away because they're dynamic they're resizable they're great aren't they well they are but they're also allocated on the heap and that can tremendously slow down your program for no reason because in a lot of cases you could have gotten away with using a fixed size static array no problem and by doing so in the right places you can greatly speed up the efficiency of your program so i think this is definitely something that you should not underestimate and later when we get into algorithms and stuff like search algorithms we can greatly improve the performance of our program by keeping our memory local on the stack instead of continually going to the heap as you'll see in the future but first before we jump into actually writing this i want to thank skillshare for sponsoring this video i've been working with skillshare for quite a while in this channel as you've probably noticed because i think it's a really great platform it's a platform where you can go to to learn a new skill and it's especially useful when you don't have a lot of time the thing is skillshare has a lot of courses they have they have courses on pretty much anything you can think of and they're all really concise and well put together and they don't contain a lot of flop they don't waste your time they just teach you what you need to know and usually they're extremely high quality i love the fact that skillshare has a lot of videos about things like productivity things like marketing i love their art related courses and especially this one about how you can relax and just take a break with art sometimes the programming gets overwhelming sometimes you don't understand something and being able to find a different creative medium in which to express yourself and also relax i think is really really cool skillshare comes in at less than ten dollars per month for an annual subscription and the first 1000 people who sign up using my link in the description below of this video will get two months of free skillshare premium that's two months to learn something you should be able to learn a lot of things in those two months and i expect you to so thank you skillshare once again for sponsoring this video and hooking you guys up with some free learning if you need to learn something other than c plus or programming definitely check out skillshare in the description below and speaking of learning let's learn how to write a fixed size stack allocated array data structure in c plus so just to be clear here what we're trying to create is a stack allocated array that means an array that exists on the stack not the heap let's start off with the most basic example of this where we don't have a class or any kind of helper functions whatsoever i'll create an integer array here i'll call it array and i'll give it the size of five we now have a very simple stack allocated array this is what forms the basis of this entire data structure a heap allocated array would look something like this i'll call it heavy a the key difference here is that we're actually using the new keyword this means that this memory allocation occurs on the heap and we're explicitly using a pointer to address it that part isn't as much different because we can also use a pointer to address this array but the main difference here is that this is allocated on the heap as such it's actually possible to dynamically allocate this meaning that the size does not have to be specified at compile time this is not the case with a stack allocated array where we need to specify the size at compile time so in other words i could have a size variable here that could be taken in from anywhere and then i can simply use that size variable in this allocation i cannot do this with my stack allocated array you can see that gives us an error this has to be a constant that is known at compile time because the size of this memory allocation on the stack has to be known by the compiler the other big difference of course is that with a heap allocated array like this we need to make sure that we properly delete it this is not the case for a stack allocated array as it will simply get removed when the stack frame is popped which in this case will happen at the end of this function here i'm not really going to talk about the benefits of a stack allocated array like this versus a heap allocated array as well as all of the drawbacks the point is that in a lot of cases if you're just allocating memory for a single scope it's a lot better to use a stack allocated array just simply because that memory will be right there on the stack there's no need for any heap allocations any kind of indirection everything is right there available for you and in a lot of cases this can greatly improve the performance of your program so before we create a data structure that essentially wraps this and makes it a lot more usable for us let's take a look at what the c standard library provides for us which is this standard array which is available since c plus 11. i'll create this standard array here with the name collection and i will also add in the required include here which is array that's where this class is actually contained the way that we use this is primarily using template arguments so using templates we're able to specify the type here as well as the size so it's interesting to note here that the size is something that we specify as a template argument now if we actually open this array file here by just going to the document you can see the entire implementation because this is completely a templated class the entire implementation has to be inside this header file it can't be compiled into some kind of library in fact the entire cpus plus standard template library is actually in header files because it's made out of templates so we'll be referring to this quite a bit just to see how they do things whilst also writing our own that is going to look a lot more simple than what we have here because the c plus libraries are usually written in a not very readable way once we've got that the primary benefit here is that we can in fact easily check what the size is of the array we can easily obtain an iterator to the array so in other words if i wanted to iterate through every single integer inside this collection i have a very easy way to do this i can also use a traditional for loop here like so and then simply use it as if it was any other array by like printing the value or doing something like this doesn't really matter this is basically what the api for this standard array class looks like and that's what we'll be creating today okay so now that we've done a little bit of research let's dive in and take a look at how we can make this ourselves what i want to do is start with the first iteration of this how do we do this just templates aside the simple kind of beginner's version of having a stack allocated array that is wrapped inside a class well i'll make a class called array and then i'm going to decide what kind of type of data this array is going to hold so we'll go with an integer i'll now make an actual stack allocated array here with a particular size now this is where our problem comes in we can't really simply make a constructor here that takes in a size like this because if we did that we would have to somehow allocate this memory it's not going to be possible to do this for the same reason why i can't have an array over here that is initialized by a particular variable like this you just can't do that because this is not a constant expression it's not something that is known at compile time so we can't have an array whose size is specified at runtime so in other words we need to actually insert a number into here and this is specifically where it gets difficult because if we're trying to do something like this how do we control this how can i at the time of instantiating this class and using it how can i tell it how big i want my static array to be that's why the standard array class uses a template now as a side note i just want to mention that yes i did say that technically you cannot have a stack allocated array with a dynamic size that's not strictly true because it is possible to allocate memory on the stack dynamically we can do that by turning this into a pointer and then simply saying that m data is equal to endpointer allocate size allocate is a function that lets us allocate memory on the stack this would successfully allow us to instantiate our array here with an actual runtime value here so this could be of course a variable that was set to something and you can see that it works here fine because this just takes in an integer and we're allocating the right thing now there are certain caveats to using allokay whilst this memory is technically allocated on the stack the compiler doesn't really know how big the size will be that can lead to problems and also lack of compiler optimization so technically speaking you probably want to avoid lk especially in this instance i definitely won't tell you to avoid allocate in life in general just because it is extremely useful and it can definitely benefit the performance of your program but it's totally outside of the scope of this tutorial so we're not going to talk about it here at all so going back to this what are our options if we simply have a declaration like this and the only thing we can really do is by using a template unless we want to have a fixed size array which means that we just have a size of 10 and that is just like this is my array that is 10 integers large or something like that which is ridiculous we need to template this so it's really easy to write this if you're new to templates check out my video in the top right corner about templates but all we really have to write here is template size t and we'll call it s now this can be you can write this as an int usually in c plus if you're referring to sizes you want to use the size t that's certainly what the entire library uses for all of the like c plus plus internal stuff okay so now that we've got that we can simply replace this 10 with the size and now if we work to create this array instead of having to write this size here what we need to do is specify a size like 5 as part of the type so now we have a templated class meaning a new version of this class will physically be created by the compiler with our argument every time we create a new array type okay cool so now that we've got a dynamic ish size going on here it's not really dynamic we still have to specify this at compile time there's no way for us to do something like this at least unless it's a constant expression but we're not really going to talk about that here right now we still have to write the size of our array up front during the compilation time like this which is usually fine because of course if we're making sac allocated arrays anyway we just simply write the size and so we're used to that that's fine that's what a fixed size stack allocated array is all about which is what this data structure is so now how do we make this not specific to integers well we just need to add another template argument so i'll write in type name t at the front of this so we have two arguments here type name t which is going to be the actual type of data that we store and then size t s which is going to be the number of elements that we have here or the count so over here instead of int i'm going to replace this with t so now what happens is when i make this array i need to specify the type first and then the actual size so what this will do is create a version of this class which has t set to an int and size set to five and this is the class that i end up with when i compile this template okay cool so now that i've got that let's take a look at some of the more useful features we can actually add to this class because at the moment it's literally just a wrapper over what we could have done without the class anyway now one of the most useful things that we can do here is actually keep track of this size so i can have a function here that returns the size i'll just call it size i'll label it as const here because it's not going to modify this class and it's just going to return s now this is interesting because keep in mind we're not actually storing the size anyway there is no additional variable here called size which is like set to 5 or set to a value that we take in in the constructor during runtime there is no storage for this whatsoever what this actually is is just this templated argument meaning that when this actually when this template actually gets created this s gets replaced with five so we literally have a function whose sole job is to return a constant value of 5 and that's it there's no additional storage and that's one of the cool things about this class as well and by using templates we're able to actually hard code this particular function these particular machine instructions to just simply return 5 every time which is super nice because we're not using additional memory to store that now this is great because it lets us actually check the size of this array we can now call data.size and immediately this is already better than just having a stack allocated array because we now know the size of this array at any given time i could simply pair this with a for loop like this to iterate through all of the members of this actual array simple as that however there are some improvements we can actually make to this function namely the fact that this is clearly a value that we know at compile time so why not let other compile time tasks be able to take advantage of that value one example is a static assert maybe i don't want these arrays to be above a certain size for some reason so i could add a static assert that simply checks to make sure that the size of this array is less than 10 and i can write it like this i'll add a little message here as well and a static assert is an assert that actually gets evaluated at compile time this theoretically should work because i mean we know the size at compile time we're forced to specify the size of compile time however you can see here that it says that the expression must have a constant value and that's because this size function simply returns an integer what it should return is a constant expression by adding the const x per keyword here we're specifying that this actual function can be evaluated at compile time and since it can you can see the assert here works fine it also lets us create new arrays from the size of that first array because again data.size is a function that can actually be evaluated completely during compile time and that's also the reason why inside the c plus plus array function lots of their functions here as you can see actually returns a constant expression or rather most of these functions are constant expressions okay so now that we've done this how do we iterate through this array and actually pull out the values because at the moment even though i've got all of this stuff going on there's no way for me to actually index this data now i could simply return a pointer to this data but a better way to do this is to implement the index operator and i can write a simple version of this by just writing t then operator open and close square bracket then int index and then i'll simply return m data at that index so this is kind of like the most simple straightforward way it has a number of problems which we'll address in a moment but that's basically the idea so now that we've written this we can simply specify the index of the data that we want to read and you can see we now have the ability to read this data and in fact we can even print it to the console if we want to okay great so what are the problems well first of all t is being returned by value this may or may not be what you want the reason it may not be what you want is because in this case we're just returning integers but what if we had a string would we really want to completely copy the string out every time we use this index operator probably not we simply want a read only reference to it the other problem here is that since this does actually return by value and it makes a new copy i can't do things like this because this isn't a modifiable l value we're simply returning a brand new copy of an integer here there's nothing to assign to there's no storage so because of that we want to return a reference to this actual type even if it is something like an end because even though we we're not really interested in preventing copies for integers we might be for the potential other data types but also by returning a reference it lets us actually assign into that index which is something that we typically want to do with an array such as this this poses another problem though if this array happens to be const for whatever reason maybe we're assigning it to like a const reference or something like that we have our array reference maybe we're passing it into a function and if we have this kind of const reference here for the array we obviously can't do this and we wouldn't expect to be able to do this because this is modifying the array and it's constant so we shouldn't be doing that however you can see that we also can't access it which is kind of a problem because const implies that it's read only we should still be able to read the data but not necessarily write the data this we can easily fix by just adding another version of this function which returns a const t reference and is also marked as const this const over here means that we can actually call this function if the instance of array is constant like it is for array reference so this does not work which is what we would expect but this does and if we of course use the original data which is not const then everything works fine the final real problem here is that we're using int as the index this lets us use negative numbers and also might not be the same size on all platforms so typically the sql standard library likes to use size t for these things if you go into array and you take a look at their index operator you can see that they also use size type here and size type is equal to size t they have a lot of their own definitions for a lot of these things at least msvc's stl which is what we're looking at here has all of these various definitions which just add a little bit of noise in my opinion as to what is actually going on here since you always have to look at what these are actually set to okay so now we have a pretty good version of this if we take a look at the version inside array though you'll see that there is also one other thing i mean largely it's the same as our function i mean it's also marked with no accept and you know there's a few other various flags going on here that we don't really care about at least for our basic example and then this is this is identical however we have this preprocessor statement here the checks to see if the container debug level is greater than zero and if it is it runs this stl verify thing that checks the position versus the size so what this is is essentially an assert that checks to see that this index is within the accepted range so if we were to add something like this it would look like an if statement that basically checks to see whether or not index is less than the actual size of the array which is s and if this assert fails then we need to do something like break the compiler on this line this is an msvc way of doing that this protects us because if we accidentally loop over something like this and we run the code then it's actually going to break the debugger on this line and tell us that well we've done something wrong so that we can debug the state of our program this is not something that typically gets included in release mode because this extra check just slows down our program and is something that we would expect to catch in debug mode anyway so that's that's the extra check that we have in the stl version of the array class and something of course that you might want to use however you can see that this container debug level macro actually makes sure that this does not happen in release mode okay cool so we'll we'll just ignore that part now size should also return a size t and because it returns a size t we should also make sure that our for loop actually set to a size t just to avoid some compiler warnings i'll get rid of this and i'll clean up our code a little bit more one other thing that this class really needs is a way to actually access the data within it so in other words this will just return mdata and it will just return a pointer to it you could also add a const version of this that returns the data if you wanted to and it would look something like that what this enables you to do is quickly for example set the memory to be 0 by just calling data.data writing 0 as your value and then the size of the array which is going to be data.size times size of int because that's the data type that we've selected here this will set all of the integers inside our array to be zero by just simply setting all of their memory to zero this isn't really 100 required because you could always just simply take the first index of the array and then grab the address of that since this stack allocated array of course is guaranteed to be contiguous in memory so finally let's see this in action what i'll do is print every single index of our array like so let's try it without the mem set first to see the effect of that all right cn.yet so that we pause our console you can see we have just a bunch of uninitialized memory here pretty boring let's bring in the mem set now all of the values inside our array are set to zero finally i'll maybe set index 1 to b5 and index 4 to b8 so if we run this again you can see we have zero five zero zero eight so this seems to work and of course if we wanted to we could create an entirely different array that maybe had two strings string one we could set equal to journal like so and then string two we could set equal to c plus plus and then if we basically have this exact same for loop that we had before you can see that this array also works with strings so that is our basic array class you can see it's really tiny there are still a lot of features that we would want to add to this namely the the biggest feature i think is iterators so at the moment we're having to write a for loop like this but it would be really nice if we could simply have a for loop that looks something like this where we could get through our entire array just by using a range based for loop like this and to make that happen we actually need to write an iterator which is a topic for another video there are also some other utility functions such as fill and swab which we would probably want to implement just to make dealing with our array slightly easier but that's going to be it for today's video okay so i hope you guys enjoyed this video if you did please don't forget to hit the like button below obviously there's a lot of other things we could add to this array class and we will probably end up extending it in some future videos like the iterators and stuff like that the core mechanics of a stack allocated array are quite simple as you can probably see and this in and of itself is quite a powerful data structure that you can and should use frequently throughout your code if you have any other suggestions for data structures for me to cover obviously i have a long list of all of the popular ones but if there's something specific that you want me to cover or a different data structure that you want me to cover next please let me know in the comment section below i'll take a list probably of all of your suggestions and we'll get around to doing them as soon as possible thank you guys once again for watching do not forget to check out skillshare's two months of free premium using my link in the description below and i will see you next time goodbye [Music] you
Info
Channel: The Cherno
Views: 69,026
Rating: undefined out of 5
Keywords: cherno, thecherno, c++, cherno c++, arrays, array, data structures, writing data structures, making data structures, templates, how to make an array, how to make an array class, std::array
Id: TzB5ZeKQIHM
Channel Id: undefined
Length: 23min 18sec (1398 seconds)
Published: Fri Jul 17 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.