Hello, everyone, and welcome to an introduction
to data structures. My name is Steven, and over the next three hours, we'll be tackling the
topic of data structures in relation to computer science. More specifically, we'll be talking
about what they are, going over some different types of data structures, and discussing how we
can use each of them effectively with examples. This course will be a general overview of
data structures, and so we won't be confining ourselves to one specific programming language.
However, this series will require you to have a basic understanding of programming. If you are
completely new to computer science, I would recommend our introduction to programming series,
which will be linked in the description below. That video will give you all the information
you need to traverse these topics comfortably. Now, we're almost ready to hop into things.
However, there are a few housekeeping items that I want to go over beforehand for those
wishing to enhance their learning experience. If you're not interested in that and want to skip
right to the content, go to the time shown on your screen now. For those of you staying, though, there are just a few things I would like to mention. Firstly, in the description, you will find
timestamps for each major topic covered in this video, as well as timestamps taking you to every
smaller section contained within those topics. So please feel free to skip around if you're
already comfortable with one topic or only interested in a certain data structure.
Next, we've decided to include the script and visuals used for the series also in
the description below. That way you can follow along if you like or simply read the
script if my soothing voice isn't your style. Additionally, for each segment, I'll be including
the references and research materials used to write the script for each topic. That way, if
you ever feel as if I haven't explained a topic well enough, or would simply like more
information about a certain topic, you'll have a readily made list of articles and websites,
which you can use to supplement this series. If you feel like you have any questions throughout
the series about maybe something I said or a certain visual, please ask your questions in the
comments below. I'll try to be as responsive as possible for the first few weeks on questions
you guys may have about the series and such. And finally, in terms of housekeeping, I'd just like to say if you're not already subscribed to our channel, NullPointerException, then consider doing so if you enjoy this type of content, as my co-host Sean and I regularly post videos in this
style. We're trying to hit 1000 subscribers before the fall semester of college begins. So check
us out if you end up enjoying the video. With my shameless plug out of the way, we're finally
ready to tackle the topic of data structures. In this introduction segment, we'll simply cover
a general overview of what data structures are, and then go over what type of information will
be covered throughout the duration of the series. So the obvious question to start with
is what exactly is a data structure? Well, in computer science, a data structure is a
way to store, organize and manage information or data in a way that allows you the programmer to
easily access or modify the values within them. Essentially, it's a way for us to store
a set of related information that we can use for our programs. Data structures, and the algorithms used to interact with, modify, and search through them, provide the backbone for many of the programs that you'll end up writing. I can almost guarantee that in 99% of your programs, a data structure will be involved. Each of the data structures that I'll be talking about is designed for the sole purpose of storing information and allowing the
end user to access and manipulate that information in an efficient, effective way. But each one
differs in the manner that they accomplish this. If you have a basic understanding of programming,
you probably know about a few different data structures already, such as the array and the
array list, also known as the list in Python. But if you're going to be pursuing a career
in computer science, just knowing these two is definitely not going to cut it. While basic data structures such as the array are used frequently by companies throughout their code, more advanced data structures are being put to use all around you: the undo/redo button
in Google Docs, Google Maps on your phone, even the autocomplete feature through your
text messages all require the use of more advanced data structures, which makes them
extremely useful to learn and comprehend. Okay, now, I can't see you per se, but I can just
tell that you're jumping up and down in your seat hooked on data structures and how useful they
can be. But before we jump too far into things, let's discuss the information that we'll be
covering in this series. Now, before we talk about any specific data structures, we're going
to have a brief talk about efficiency. We'll discuss the metrics used to judge the speed and
efficiency of different data structures, which will then give you a better understanding of why
one data structure might be used over another one. From there, we'll start by diving headfirst into
what I believe are the basic data structures, those being arrays and ArrayLists. While you may
already have a good understanding of what these are, I would highly suggest that you still watch
these segments, because we'll be going into a little bit more depth as to why they're so useful
based on differences in how they're stored in the computer's memory. After that, we'll move on to
what I'll call the intermediate data structures. These are a little bit more complicated than the
basics and have a few special attributes which make them stand out from the rest. We'll begin
by taking a look at stacks, which are the backbone for all recursive processes in your computer. Then we'll look at the antithesis of the stack: the queue. Moving on from there, we'll be covering
linked lists and their evolved form in the doubly linked list. Before moving on to the
final of our intermediate data structures, the dictionary which includes
a mini lesson on hash tables. Then we'll wrap up the series talking
about trees and tree based data structures, less linear and more abstract data structures
beginning with the tree itself. We'll then move on to tries, a very useful data structure used for a lot of word-processing algorithms. And finally, we'll end off the series with a discussion on heaps and
graphs. So in total, I'll be taking you through four different segments containing 12 of the most
common data structures that are practically used, as well as providing examples of where and when
to use them. Section one is on efficiency, section two on basic data structures, section three on intermediate data structures, and we wrap it up with section four on tree-based data structures.
With this knowledge, you'll be able to take charge of your programming career and hopefully gain a
competitive edge by implementing them when needed. Like I said, however, before we jump straight
into the different ways to store information, I'd like to have a quick discussion on how we
score the efficiency of these data structures using what is known as Big O
notation. So let's jump right in. Okay, so before we talk about all these crazy
data structures, like maps, and heaps, we want a quantifiable way to measure how efficient a
certain data structure is at the different tasks we might ask of it. If we're going to be
storing extremely large amounts of data, being able to search through, modify, or access
the information within a data structure needs to be fast and efficient. As we briefly mentioned
before, the industry standard for this kind of measurement is Big O notation. So what exactly
is big O notation? And how do we measure it for a specific data structure? Well, that's what we'll
be covering in this segment. For most of the basic and intermediate data structures in this series,
we're going to be spending some time talking about their efficiency using Big O notation. So this
is definitely a topic you're not going to want to skip. With that being said, let's begin. So
because there are so many different ways to store information, like we talked about in the previous
segment, programmers have developed this idea of big O notation as a way to basically score a
data structure based on a few different criteria. These criteria can change based on who you
ask, but for the purposes of this video, we will be using four criteria representing
the most common functions you might want from a data structure. The ability to access a
specific element within the data structure, search for a particular element
within the data structure, insert an element into the data structure and
remove an element from the data structure. Now by measuring how efficiently a certain
data structure can do these four things, we can basically create a report card of sorts,
which measures how efficient a certain data structure is. At a glance, this gives us a
pretty good overview of what a certain data structure is good at, and what it is bad at.
And it can help us decide which one to use. If we need to store data that is easily accessible
to the end user. For example, we might choose a data structure which can access elements the
quickest, and vice versa, if accessing elements isn't the most important thing to us, but we
need a data structure which can be easily added to and deleted from, we would go for one which is
most efficient at that specific functionality. By looking at a data structure's
report card, if you will, we can get a quick sneak peek at what
they're good at and what they're bad at. Now, if we use big O notation to basically create
a report card, like I said, there must be some way to actually grade each of these functions. And
there is: the four criteria mentioned, accessing, searching, inserting, and deleting, are all scored using Big O notation time complexity equations. Now what is a Big O notation time complexity
equation? Well, I'm glad you asked. Besides being an annoyingly long phrase, a Big O notation time complexity equation, or just time complexity equation, works by inserting the size of the data set as an integer n and returning the number of operations that need to be conducted by the computer before the function can finish. The integer n is simply the size, or amount of elements,
contained within the data set. So for example, if we have an array, one of the data structures
we'll get into soon, with the size of 10, we would place 10 into the different efficiency
equations for accessing, searching, inserting, and deleting that represent the array, and returned back to us would be the number of operations that need to be conducted by the computer before completion of that function. It does get a little bit more complicated than this. But for the sake
of keeping this series simple, all you need to know is that these equations help represent
efficiency amongst different data structures. Also, an important thing to note here is that we
always use the worst case scenario when judging these data structures. This is because we always
want to prepare for the worst and know which data structures are going to be able to perform under
the worst conditions. You'll see what this means in greater detail a little later on once we
start putting this into practice. But for now, just keep in mind that when judging data
structures, we always use the worst case scenario. Moving on, the reason that it's called
Big O notation is because the syntax for these particular equations includes
a big O and then a set of parentheses. Inside these parentheses will be some function which correctly returns the number of operations needed to be run by the computer.
So for example, let's say we have a fake function, it can really be anything, the purpose in this
case is irrelevant. Now, for this fake function, let's say its time complexity equation was the one shown on your screen now. We would pronounce this time complexity equation as O of 2, meaning it takes two operations from the computer before our make-believe function can finish. If the time complexity equation was O with a 5 in the parentheses instead, it would be O of 5, and so on. Now, these are examples of time complexity
equations, which run in constant time, meaning no matter the size of our data set, it will always
take the same number of instructions to run. This is usually not going to be the case though,
when it comes to the four criteria we mentioned for our data structures. Most of the time, our integer n, which again contains the amount of elements within the data set, is going to have some effect on how many operations it takes to, say, search through our data structure. This makes sense: a larger data set means it's going to take more operations from the computer to search through that entire data set. Now, we score these four functionalities in number
of operations performed, because measuring by how long the function takes to run would be silly.
If we measured by a metric such as time taken to completion, our results would be highly biased by the hardware used to run the function. A supercomputer used by Google is obviously going to be able to search through a data structure much faster than a laptop. Time complexity equations essentially level the playing field by instead returning the number of operations, eliminating
the bias in processing power that exists. So to sum up everything that we've learned so far, we measure the efficiency or speed of a
data structure based on how well it can perform four basic tasks: accessing, searching for, inserting, and deleting elements within itself. Each of these criteria is modeled by an equation which takes in the size of the data structure in number of elements, n, and returns back the number of operations needed to be performed by the computer to complete that task. By measuring these four metrics, we can get a pretty good understanding of what the data structure is good at, and what the data structure is bad at. Now, it's important to note that this isn't
the end all be all for deciding on which data structure to use in your program. As you'll
see, as this video progresses, many of the data structures are built with specific quirks or features which separate them from the rest and provide additional functionality. Big
O notation is incredibly useful, and something that you should definitely take into consideration
when determining which data structure to implement into your program, but it should not be the only
thing that you use. Cool. Now that we know a little bit about how we measure the efficiency of
a data structure using time complexity equations, along with the four criteria actually used to measure them, let's dive straight into what the actual equations mean in terms of efficiency.
That way, when we start grading data structures, you'll have a good idea as to how good or
bad each one is at that particular metric. Basically, we're going to cover the six
most common time complexity equations from most efficient to least efficient. Now, the
absolute best that a data structure can score on each criterion is a time complexity equation of O of 1. This essentially means that no matter what the size of your data set is, the task will be completed in a single step. If your data set has one element, the computer will finish the task in one step. If your data set has 100 elements, one step. 1 million elements, one step. 800 quadrillion elements, it does not matter; the computer will finish the task in a single step. This is why, when we look at the graph of volume of data versus instructions required for O of 1, the line remains constant at one. No matter the volume of data stored within the data structure, the computer can complete that task in a single instruction. Whether it be accessing, searching, inserting, or deleting, O of 1 is the gold standard, absolute best, top-of-its-class time complexity equation. It is basically the Michael Jordan of big O notation when it comes to time
complexity equations. Now, the next fastest type of time complexity equation is O of log n. While not as fast as instantaneous time, a function having a time complexity of O of log n will still provide you with very fast completion. Now, if you don't fully understand logarithms entirely, just know that this efficiency is slower than instantaneous time, O of 1, and faster than the next level of efficiency, known as O of n. Additionally, because the volume of data
versus time graph follows a logarithmic curve, the larger the data set that you use, the more
bang for your buck that you're going to get. Basically, as the amount of data you try to
perform one of our four criteria on increases, the rate of change of the amount of operations it takes to complete that certain task decreases. So as you can see, at the beginning of the
graph here, when we increase the volume of data, our number of operations skyrockets. But
when we do the same for larger and larger sets of data, our number of operations
increases much more slowly than before. A good example of a non-data-structures-related function which has a time complexity of O of log n is the binary search. If you know what a binary search is, then you'll understand how it is that when the data set gets larger, the number of instructions needed to find the item you're searching for doesn't increase at the same rate.
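As a rough illustration, here is a minimal Java sketch of a binary search; it assumes the array is already sorted, and the names are just placeholders.

```java
// Binary search over a sorted int array: each pass halves the remaining
// range, which is where the O of log n behavior comes from.
static int binarySearch(int[] sorted, int target) {
    int low = 0;
    int high = sorted.length - 1;
    while (low <= high) {
        int mid = (low + high) / 2;
        if (sorted[mid] == target) {
            return mid;            // found the target at index mid
        } else if (sorted[mid] < target) {
            low = mid + 1;         // keep searching the upper half
        } else {
            high = mid - 1;        // keep searching the lower half
        }
    }
    return -1;                     // target is not in the array
}
```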
If you'd like a fuller explanation of binary search in order to better understand O of log n, a link in the description below will take you to that point in our introduction to programming series where you can learn about it. If not, let's move on to the next equation. O of n is the next common time complexity equation
that's going to come up frequently during this lecture. The graph of volume of data versus
instructions needed for this function is linear, meaning that for every element you add to the
data set, the amount of instructions needed to complete that function will increase by the
same amount. So to perform a function with the time complexity of O of n on a data set with 10 elements, it will take 10 instructions; 50 elements will take 50 instructions; 1,000 elements, 1,000 instructions; and so on. O of n is really the last good time complexity equation that exists. Anything above this is considered inefficient and not very practical when it comes to data structures in computer science. The next type of equation that will come up is O of n log n. This equation is the first that
is relatively bad in terms of efficiency. The graph of volume versus operations shows a somewhat linear but steepening curve, meaning that unlike O of log n, it won't get better in terms of efficiency as the size of the data set increases. Instead, the slope actually
increases as the volume of data does. The last two types of equation are O of n squared and O of 2 to the n. These are both incredibly inefficient equations which should be avoided if at all possible. Because they both grow so quickly, which can be seen from their graphs of volume versus operations, the larger the data set that you use, the more inefficient they will become. While there are more time complexity equations
that exist, such as O of n factorial, which is even worse than the two I just mentioned, the data structures we'll be talking about will never have time complexity equations that exist outside of the six we've just covered. So we're going to stop there. Now, the final thing I want to say is
just a reiteration of something I said before, these time complexity equations are not the only
metric you should be using to gauge which data structure to use. As we get deeper and deeper into
this series, you'll see that we might have some data structures which don't seem that efficient
at all based on their time complexity equations, but provide some other functionality or feature,
which makes them extremely useful for programmers. Okay, now that we have some knowledge
on how we actually grade these data structures in terms of efficiency, let's
hop into our very first data structure. The first data structure we're
going to be covering is the array. Now I understand the array is a pretty common
data structure taught in most programming classes, and that many of you might already know
about the properties and uses of an array. But in this segment, we'll be going into
more detail on the array as a data structure, including its time complexity equations,
storage methods, and more. So check the description for a timestamp to skip ahead to that section if you already know the basics. If you're not as versed in the array,
though, or just want a general review, then stick around, because right now we'll
be discussing the basics of an array. An array is fundamentally a list of similar
values grouped together in a central location, which makes sense because if you
remember, we use data structures to store sets of similar information, so
that we can easily use that information. Basically, arrays can be used to store anything,
usernames for an app, high scores for a video game, prices for an online shop, pretty much any list of values which are fundamentally similar, meaning of the same type. So integers, strings, floats, objects, the list goes on and on. Any primitive or advanced data type you
can think of can be stored within an array. Every item contained within the array is
referred to as an element of that array. And we call the collective total of elements,
the array. This is going to be true for almost all data structures we talk about, by the way, so just keep that in mind. An array will also have three attributes associated with
it, the first being a name for the array, the second a type for the array, and the third
a size for the array. Let's start with the name. The name, if you couldn't tell, is just simply a
name for the array, which can be used to reference and interact with it. Say for example, we had
an array called names, which contains the list of company employees. And then we also had an
array called salaries which contained a list of salaries for those employees. If we wanted to
search through the list of salaries, we would do so by referencing the salaries array with its
name. This way the computer knows that you want to search through that specific array containing
salaries, and not the names array. Note that this works vice versa as well, of course, if we wanted to search through the names array instead. Now, another thing to note really quickly while we
have this up is that these two arrays are also an example of parallel arrays. Parallel arrays are
two or more arrays containing the same number of elements and have the property wherein
each element in one array is related to the element in the other array, or arrays, at the same position. So from our example, we can say that the first element in our names array, John Smith, corresponds to the first element in the salaries array, the integer 10,000. This would mean that John Smith has a salary of 10,000. This is really good because, as you'll see later
on, we can't store different types of variables in the same array. So we couldn't have the salary
integers be stored in the same place as the name strings, making parallel arrays extremely useful
for storing different types of data about the same entity. Alright, tangent over, let's get back
to the attributes of arrays, and the second one we'll be talking about is the array's type. Now, an array's type is simply what type of information is stored or will be stored within that array. One of the biggest stipulations about arrays is that an array has to hold all the same type of information. So you cannot have an array containing both integers and strings; it would have to be either an array of
only integers or only strings. This works for any type of information you would want to store within
the array, meaning every element or spot within the array must be of the same type. The third
and final attribute that an array has is a size. This size is a set integer that is
fixed upon the creation of the array. It represents the total amount of elements
that are able to be stored within the array. An array's size cannot be changed. I'll repeat that again because it's extremely important: an array's size cannot be changed once created. Once you write the line of code which instantiates an array, you give it a defined size or fill it with elements until it has a defined size. And then, at that point, the array's size cannot be
changed by conventional methods. This is actually pretty rare for a data structure. And it may seem
really counterintuitive and useless. But later on, when we talk about an array's efficiency, you'll see that this is for a specific reason, and not just to be annoying to the programmer. So there you have it, the three attributes of an array: a name to reference it, a type to fill it with, and a size to control it. Let's now shift focus and talk about how we actually define an array ourselves, how we reference and change values within an array, and then dive into
the idea of two dimensional arrays. There are actually two different ways to
instantiate an array in most languages, you can either populate the array with elements
that you want contained within it right then and there. Or you can set a specific size for the
array, and then slowly populate it later on as the program runs. We'll be discussing how
to do both, starting with the first creating the array with the elements already stored.
Now defining and filling an array as soon as you create it is mainly used for when you already
know which values are going to be held within it. For example, let's say you're creating an
array, which is going to hold the salary of 10 different workers at a company. And that
information is being read in from a text file. Well, you already know the 10 salaries
for those workers from the text file, and so you're able to immediately
populate the array when you create it. This way, you'll already have the information
available to you as the program runs. The way you do this varies amongst
different programming languages, but it's shown here in three different languages, Java, Python, and C#, to give you a basic gist of how this is accomplished. All three lines of code on your screen now will create an integer array of size
three, containing the numbers one, two, and three. You can see that all of them involve
the declaration of an array name, which is then set equal to the values you want
to include within the array encapsulated by some sort of brackets or braces. This isn't 100% how
it's going to be for all languages, but for the most part, it will follow this skeletal outline.
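For instance, here is a minimal Java version of that skeletal outline; the array name is just a placeholder.

```java
// Declares an integer array of size three, already populated
// with the values one, two, and three.
int[] numbers = {1, 2, 3};
```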
As you can see, Java and C sharp require you to define the array type, whereas Python does not.
But for the most part, the syntax is the same. Again, that is name of array set equal to list of
comma-separated values encapsulated by brackets or braces. This also sets the size of the array for you automatically. So since we populate the array with the numbers one, two, and three, the array would automatically have a size of three, as it now contains three integers. If we were to, say, go back and add in a fourth number where we define the array, the size would dynamically increase to four the next time we run the code. Now, the second way in which we can instantiate an
array is by setting an initial size for our array, but not filling it with any elements and then
slowly populating it as the program runs. This would be used in the case that you
don't actually know what information you're going to store in the array,
but will as the program runs through. The most common case in which this happens
is in the case where the end user will enter information into the program. And then that
information will be stored inside the array. Since the information varies based on which user
runs the code and what they input during that particular run, we have no way of populating
the array as soon as it's instantiated, forcing us to add the information later.
Shown on your screen now are two ways in which we can do this in Java and C#. Now, you'll notice that there is no Python section, because creating populate-later arrays is not conventionally possible in the base version of Python without the use of more advanced data structures or packages. You can see, however, with Java and C#, that just as with the other example, there are slight differences in the syntax on how we do this, but it generally follows some set pattern: the type of the array, followed by a name for the array, which is then set equal to a size for the array.
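As a hedged Java sketch of that pattern (the name and the value filled in here are made up for illustration):

```java
// Creates an integer array with a fixed size of 10, to be populated later,
// for example as salaries are read in from a text file.
int[] salaries = new int[10];
salaries[0] = 50000; // filling in the first slot once the data is available
```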
Remember, this size is final and cannot be changed outside of this initial call. If you end up creating an array with the size 10, but then later in the program find that you need more space for it, you can always go back to where you instantiated it and change the size. But after that line of code runs, there's no conventional way to increase it. You may also be wondering what the brackets mean when instantiating an array, and that's just a way to signify to the computer that we are indeed trying to create an array and not just a variable. Now that we know the two different ways
to instantiate an array, we need to know how we, as the programmer actually get information that is
stored within the array so that we're able to use it. And the simplest way to answer that question is through the means of a numerical index. An index is simply an integer which corresponds
to an element within the array. Now, the most important thing to know about indexes is that
for a certain array, they begin at zero instead of one. So if we have an array of 10 elements,
the first index would actually be index zero, the second would be index one, and so
on all the way up to the ninth index. It's a little bit weird at first,
but trust me, you get used to it. Now, to retrieve information from a certain
position within the array, you would reference it using both the arrays name, and then the
index number of the element you wish to retrieve. Say we have our array called numbers here, which
contains the integers one through 10. To print out the element at the fifth index, in this case, the
number six, we would reference the array name, in this case numbers. And then in a set of brackets,
we would place the index we want to retrieve, in this case, the number five. What this piece of
code is basically telling the computer is to print out the fifth index of the numbers array, which
again is the integer six. Now because indexes start at zero instead of one, it's very important
to note that to get the 10th element of the array, you would actually need to reference it using
the number nine since there is no 10th index. If you do end up making this mistake, it
will result in an array out of bounds error, since there is no 10th index in
the bounds of the numbers array. Referencing an array's index number is how we actually replace elements within the array as well. Continuing with our numbers array, let's say we wanted to change the last element, the integer 10, to be the integer 11 instead. What we would do is reference the ninth index of the numbers array, the one which currently contains the integer 10, and set it equal to 11. This tells the computer to take the element at index nine and replace it with the integer 11, essentially swapping 10 for 11.
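Putting those two ideas together, here is a small Java sketch using the numbers array from the example:

```java
int[] numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

// Accessing: prints the element at the fifth index, the integer 6.
System.out.println(numbers[5]);

// Replacing: swaps the 10 stored at the ninth (last) index for an 11.
numbers[9] = 11;
```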
Now, the last thing I want to talk about before we jump into the time complexity equations for an array is the practice of putting arrays inside of arrays. An array with an array at each element
is known as a two dimensional array. Think of a two dimensional array as a
matrix. It's similar to any other array, except for the fact that instead of a primitive data type like an integer housed at each element, we would instead have a whole other array with its own size and indexes. Two dimensional arrays are useful for a variety of purposes, such as if you were programming a chess board, a bingo board, or even an image where each element in the two dimensional array contains an RGB value, which when combined creates an image. Now, referencing an element within a two dimensional array is mostly
the same as with one dimensional arrays, except you now need two indexes, the first number
would be the index of the column that you want to reference, and the second number would be the
index of the row that you want to reference. By combining these two numbers, we can single
out an element within the two dimensional array that will then be returned to us. As an example,
let's create a two dimensional array with 16 names contained within four rows and four
columns. Now to access the name Carl, you'd first single out the column, which
contains the name you're looking for. In this case, Carl is in the third
column, meaning it's in the second index. Next, you would single out the row which contains
the name you're looking for. In this case, the name Carl is in the third row. So we would
again use the second index to reference it. Combining these two pieces of information gives us
the index location two comma two. And if you look, Carl is indeed at index location two
comma two. Just as another example, Adam is in the first column, two rows down. And so to reference Adam, you would simply call upon the element at index location zero comma one.
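To make that concrete, here is a hypothetical Java version of the grid; only Carl's position comes from the example above, the other names are placeholders, and whether the first index means the row or the column is simply a convention you pick and stick with.

```java
// A made-up 4x4 grid of names, with "Carl" sitting at index [2][2].
String[][] names = {
    {"Ann",  "Ben",  "Cleo", "Dev"},
    {"Eve",  "Finn", "Gus",  "Hana"},
    {"Ira",  "Jo",   "Carl", "Lou"},
    {"Mia",  "Ned",  "Omar", "Pia"}
};

// Two indexes single out one element of the two dimensional array.
System.out.println(names[2][2]); // prints "Carl"
```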
Two dimensional arrays are just the beginning. You can also have three dimensional arrays, four dimensional arrays, and so on, for containing large amounts of advanced relational data. But two dimensions is where we're going to stop for today. All right, that concludes the
background information on arrays. Now that we know what they are, let's actually
talk about arrays as a data structure. This is where we're finally going to be using
our big O notation knowledge from the previous segment to judge the array as a data structure. We'll be going through the four criteria we talked about previously, accessing elements within an
array, searching for an element within an array, inserting an element into an array, and deleting
an element from an array, and scoring it based on how effectively it can complete these four tasks using time complexity equations. Now, accessing an element within an array can be conducted in instantaneous O of 1 time. This means that for any element within your
array that you want to know the contents of, you can do so immediately by simply
calling upon its index location. This has to do with the way that
an array is stored within memory. For example's sake, let's look at a fake portion of memory in the computer. You can see we have some information near the
beginning, maybe some integers or strings that we've initialized, that's not important. What's
important is that when we get to the part in our code, which instantiates an array, we already know
exactly how large the array is going to be. Either because we've populated it with the elements
needed, or given it a definitive final size. This means we also know exactly how much space
we need to reserve in memory for that array. Combining these two pieces of knowledge means
we can instantaneously figure out the location of every single element within the array
by simply taking the starting location of the array in memory, and then adding to it the
index of the element that we're trying to find. So let's say our array contains three elements,
each of them integers. And we want to know what the element at the first index is. Well, we
know that the array starts storing data here.
the array is stored contiguously together in memory. And so by taking the starting location
for the array in memory, and adding one to it, we now know that that's where the first index
of our array is going to be, and can return that stored value appropriately. This is the main
reason why array sizes cannot be changed: all of the information within an array must be stored in this one place so that we're able to do this. The contiguous structure of an array prevents you from adding space to it after it's been initialized, because it literally can't grow without breaking up the data and adding the space to store new elements down the road, away from the initial array. This would make accessing elements instantaneously not possible. Hopefully now you can see why both
accessing an element within an array is O of one, as well as why array sizes are final, because
not many other data structures allow you to have instantaneous accessing power, meaning arrays have
a huge advantage on the others in this metric. Next up, searching through an array is O of n. This is because, for the most part, you will be working with unsorted arrays, those being arrays which are not in alphabetical, numerical, or some sort of quantifiable order, meaning that to find an element within the array,
you may have to search through each element before you can find it. This is known as a linear
search. And if you want to know more about it, click the card in the top right corner,
and it will take you to that section in our introduction to programming series. A link
will also be provided in the description below. Now while there are faster ways to search
through an array, those only work when the array is sorted. And so for a basic array, in
worst case scenario, searching through it to find a particular element is going to be O of n.
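Here is what that worst case looks like in a minimal Java sketch of a linear search (the names are illustrative):

```java
// Linear search over an unsorted array: in the worst case we check all
// n elements before finding the target or giving up, hence O of n.
static int linearSearch(int[] array, int target) {
    for (int i = 0; i < array.length; i++) {
        if (array[i] == target) {
            return i;  // found it at index i
        }
    }
    return -1;         // not in the array
}
```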
Now, sure, there are going to be cases where the element you're searching for is the first element
in the array, or even somewhere in the middle. But remember that when working with big O
notation, we always use the worst case scenario. And in this case, the worst case scenario is that
the item that you're searching for ends up being the last element within the array, which for an
array of size n would take n operations to reach. Finally, inserting and deleting elements from an
array both have a time complexity equation of O of n. For inserting, this is because adding an element into the array requires you to shift every element that's after the index you want to insert the value at to the right one space. For example, if you have an array of size five, currently filled with four numbers, and you want to insert the number one into the array at index zero, you would first need to shift every element from the zeroth index onward to the right one, essentially freeing up the space for the new integer, while also retaining all of the
information currently stored within the array. Then, and only then can you actually insert it.
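A rough Java sketch of that shifting process, assuming the last slot of the array is the free one:

```java
// An array of size five currently holding four numbers, with the last slot free.
int[] numbers = {2, 3, 4, 5, 0};

// Shift every element one space to the right, starting from the back.
for (int i = numbers.length - 1; i > 0; i--) {
    numbers[i] = numbers[i - 1];
}
numbers[0] = 1; // the array now holds 1, 2, 3, 4, 5
```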
This requires you to traverse through the whole array of size n, which is why the time complexity
equation is O of n. Now, sometimes you might not have to shift the whole list. Say, in the case you had the same array of numbers, only this time the fourth index was free, and you wanted to insert the number five in that open spot. Since you don't have to move everything to the right one in order to make space for this new element, does that make inserting into an array have a time complexity equation of O of 1? Well, no, because again, when we talk about a
amount of operations that we're going to have to conduct before we can insert an element into an
array is n, the size of the list, making it's time complexity equation O of n. Deleting an element
from an array follows mostly the same idea, you shift every element to the right of the one, you
want to delete down one index. And essentially, you have deleted that element. Again, let's say
we had an array of numbers one through five, only this time, we want to delete
the number one at the zeroeth index, what we would do is set the zeroeth index
to whatever is contained at the first index, in this case, the integer two. And then we
simply repeat this process until we get to the end of the array, finishing it off by setting
the last element in the array to be some no value. This deletes the element from the array by
basically replacing it with a value to the right. And then this process is repeated into it
frees up one space at the end. Essentially, we're just moving the entire array down one.
Now worst case scenario, we're going to have to shift every element in the array of size n, making
deleting from an array also have a time complexity equation of O of n. So there you have it the
four time complexity equations for an array, accessing is O of one and searching,
inserting and deleting are all O of n. Arrays are a really good data structure for
storing similar contiguous data. Its ability to access any item in constant time makes it
extremely useful for programs in which you would like to be able to have instant access to the
information contained within your data structure. It's also a lot more basic than some of the other
data structures that we'll end up talking about, making it both easy to learn and reducing the
complexity of your code. Now, some disadvantages of the array are the fact that the size of the
array cannot be changed once it's initialized. Inserting and deleting from an array can take
quite a while if you're performing the function on larger data sets. And if you have an array
which is not full of elements, you're essentially wasting memory space until you fill that index
location with a piece of data. Overall, the array is a pretty reliable data structure. It has some
flaws as well as some advantages. A program that you write could use the array if need be, and
it would work just fine. But sometimes you might want some extra functionality. And that's where
more advanced data structures come into play. One of those more advanced data structures
is what's known as the ArrayList. The ArrayList fundamentally can
be thought of as a growing array. We just finished talking about the array and how one of its major flaws was the fact that, once initialized, an array's size could not be changed using conventional methods. Well, in contrast, an ArrayList's size expands as the programmer needs. If you take a look at the ArrayList on your screen now, full of four elements, and you decide to add one to it, it will simply expand its size to fit five elements. As you can probably tell, this is extremely useful, which begs the question of why not just always use ArrayLists. I mean, in comparison, the ArrayList seems to provide all the functionality of an array and then some. Well, that's definitely a valid question, and one we will get to later on in this section. But before we can do that, we need to cover
the basics of an ArrayList including some of the properties and methods associated with it.
Alright, let's hop in. So as we said before, an ArrayList is simply a resizeable array,
making them extremely similar in structure. This is furthered by the fact that an ArrayList
is actually backed by an array in memory, meaning that behind the scenes of your code, the
ArrayList data structure uses an array as its scaffolding system. For this series, we need not
go further than that. But for us, it just means that a lot of the functionality will be the same.
Now, this doesn't mean everything is going to be the same, which you'll see later on. But it's
still important to make note of before we get too deep into things. The next thing I want to do
before we talk about functionality is actually go over how we initialize an ArrayList. How to do this, again, is going to vary based on which language you're using. So shown on your screen now are two different ways to do so, in Java and C#. You may notice, again, that there is no Python on the screen, and this is because, in the base version of Python,
arrays and ArrayLists are actually not separate entities. They are frankensteined together into a single data structure called lists. Lists take some functionality from arrays and some from ArrayLists. It's a lot more complicated than that, but for this series, that's all you're going to need to know as to why we are not including Python in this section. We discussed initializing Python lists in the previous section, so if you're
interested, you can go back and look at that. Now going back to ArrayList initializations.
Another thing you may notice is that it looks a little bit awkward in the way
that these statements are structured. And that's mainly due to the fact that the
ArrayList is actually its own separate class outside of the base version
of these two languages. Now, I'm not going to get into class hierarchy
and object oriented programming right now, because that's a whole other topic with
big concepts and even bigger words. For right now, this just means that to create a
new ArrayList, we have to invoke the ArrayList class when defining it, which as you can see is
done at the beginning of both initializations. After that, you can see that we give it a name,
which is then set equal to new ArrayList with a set of parentheses. Now, in these parentheses you have a few options: you can either enter an integer, which will then be used to define a size for the ArrayList, or you can just leave it blank. Leaving the parentheses blank like that will
automatically define a preset size of 10 for the ArrayList. Again, just as a reminder, this
can be dynamically increased as time goes on if we add enough elements, but
it's just meant as a base size. Now, you may be wondering if we can actually populate the ArrayList with elements when initializing it, as we could with arrays. But ArrayLists actually do not support this type of declaration.
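Here is a hedged Java sketch of both initialization options; the generic type and the names are just illustrative.

```java
import java.util.ArrayList;

// Blank parentheses: the ArrayList starts with the default base capacity of 10.
ArrayList<Integer> defaultList = new ArrayList<>();

// Passing an integer: the ArrayList starts with a capacity of 4, like our exampleAList.
ArrayList<Integer> exampleAList = new ArrayList<>(4);
```

Moving on, let's talk about functionality. Now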
the array list can be thought of as pretty much the evolved form of an array. It's a bit beefier
has a little bit more functionality and is overall more powerful than array. That's certainly not
to say that it's better in every case. But for the most part, the ArrayList is going to be
thought of as the older sibling amongst the two. This is attributed to the fact that it belongs
to the prebuilt ArrayList class, which we talked about earlier. The fact that the ArrayList belongs
to a class means it's going to come with pre built functions that are already at our disposal from
the moment we define and instantiate an ArrayList. More specifically, the ArrayList comes with methods we can use to access, change, add to, or delete from one easily. If you were using an array, you would have to program most if not all of these methods by hand, and so having them pre-built into the ArrayList class makes it especially useful. You'll see that this is the case with a lot of the data structures down the road, where having a data structure belong to its own class cuts out a lot of programming time you would have to spend making appropriate methods for the functionality that you might want. Now, the types of methods that you're
going to get will vary based on language. For example, in Java, you'll have a variety of
methods to use, including ones to add elements to the ArrayList, remove them from the
ArrayList, clear the ArrayList entirely, return it size, etc, etc. as well as
tons of other more specific functions. And another language though, such as C sharp,
you'll have some of the same methods as the Java ArrayList. But you might also have some methods so
that the Java version does not and vice versa. The C sharp version might not have some of the methods
that the Java version does. The same is going to apply for any other language you use, which
might implement the ArrayList data structure. Because of the variability surrounding the ArrayList amongst languages, in this series we're simply going to be covering six of the most common methods that are both useful and can be found in 99% of ArrayList classes. These six methods are the add method, the remove method, the get and set methods, the clear method, and the toArray method. This may look like a lot, but remember, all of these are pre-programmed for you, and so you don't have to make them all by hand; all you have to do is call them on a pre-made ArrayList and you're set to go. Speaking of pre-made ArrayLists, before we dive
into each of these functions and how they work, let's first create an example ArrayList we can use to manipulate and show how they work. Let's call it exampleAList and give it a size of four so that it's not preset to 10, and we're set to go. Now, the add method actually
comes in two different types. One, which takes in only an object to add to the end of the ArrayList.
And one which takes in both an object to add to the ArrayList, as well as an index value
representing the index to insert the object at. Let's start with the simpler of the two, one,
which simply takes in an object. This method is for more basic cases where you don't care about where in the ArrayList the object you wish to add is going to end up; it will simply append the object you pass in as an argument to the end of the ArrayList. So let's take our example ArrayList
and run the add method on an integer of two. Now, normally, ArrayLists only hold objects, not primitive types like the integer 2 that we're trying to pass in. However, the computer will automatically convert our primitive integer into an Integer object with a value of two, so that we can still add it easily. This is known as autoboxing, and it's going to be used throughout the
rest of the series to modify data structures with primitive types. So I just thought I'd mention
it now so that you're not confused later on. Okay, so when we run our code, since the ArrayList is empty and we ran the add method which doesn't care about location, it's going to add the integer 2 at the first index, index zero. Now, if we run another add method, and this time pass in the integer 5 as an argument, since the zeroth index is already taken, it will be slotted into the first available open index, that being the first index. So as you can see, it's pretty simple: it takes in an object and will put it at the first open location available. Moving on to the second type of add method, one
which takes in an object to add to the ArrayList, as well as an index to place it. This
one works very similarly to the previous, only it makes sure that the object that you're adding is appended at the index provided. Again, let's say now we want to add the number one to our example ArrayList, but we want to make sure it's placed in numerical order, in this case at the zeroth index. What we will do is call the add method, providing the integer 1 as an argument in addition to the index zero we want to add it at in the ArrayList. Once the code has run, the ArrayList will automatically shift our integers two and five to the right
one, in order to make space for the integer one.
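In Java, the calls made so far would look roughly like this; this is a sketch continuing the hypothetical exampleAList from before.

```java
import java.util.ArrayList;

ArrayList<Integer> exampleAList = new ArrayList<>(4);

exampleAList.add(2);    // appends 2 at the first open spot, index zero
exampleAList.add(5);    // appends 5 at the next open spot, index one
exampleAList.add(0, 1); // inserts 1 at index zero, shifting the 2 and 5 right one
// exampleAList now holds 1, 2, 5
```

This works for any index contained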
within the bounds of the array list. So if we wanted to do it again, only this time,
insert the number three at the second index, so that the list remains in numerical order.
We'd call example, a list dot add, and then in the parentheses passing the integer three and
the index location two. After the code has run, you'll see that we have added
the integer into our ArrayList. Now it's important to note that because there
are two integers being passed in as arguments, you must know which one your computer's treating
as the integer and which one the computer is treating as the index location. mixing these up
could cause the computer to try and insert the attempted index location at a different location
than the one you were attempting to insert at. So just be careful and knowledgeable as
to the order your methods arguments are in. Now the next method that comes pre packaged in
the ArrayList class is the Remove method. And this one also comes with two different types. The first
takes in an integer as an argument, and just as the name suggests, will remove the object if there
is one at the index location provided and return it back to the user. The second takes in an object
and will simply remove the first instance of that object within the ArrayList If present, and will
then return true or false whether an object was removed. So if we wanted to remove the number five
from our ArrayList, we would have two different options. We could call example, a list dot remove,
and inside the parentheses placed the index of the value we want to remove, in this case three, and
the program will remove the object at index three. The other option would be to run another
remove method, only this time passing an integer object of five. It has to be an integer
object because if we were to just use five, the computer would try to remove the fifth index
of the ArrayList, which doesn't even exist. By creating an integer object, we
can ensure that when the code runs, the computer knows that we want to remove
the first instance of the number five in our ArrayList. And running this will return
true and remove the integer from our list. Now, if there is no integer five in the ArrayList,
the Remove method will simply return false. Now I quite like the number five, so I'm
actually not going to permanently remove it from the ArrayList just yet. Up next is
the get method. Now the get method is pretty much the same as referencing an index for an
array, it takes in an index location, and will return back to you the value at that location. So
example a list dot get with an argument of zero would return one example a list dot get with
an argument of two would return three, and so on. The next method is the set method, which
is how we replace elements within an ArrayList. Much like the name implies, it takes
in an index and an object as arguments. And we'll set the index location of the index
you passed in to the object you also passed in. So if we wanted to set the number five in our
ArrayList, to be four instead, so that it matches nicely with the other integers, what we would do
is call example, a list dot set, and within the parentheses pass in the index location of the
element, we want to set, in this case three, and then also the object we want to replace
at that index. In this case, the integer for this method call will override the element
at position three to be four instead of five. Now, you should be really careful when you're
using this method, because you don't want to accidentally override an important element within
the ArrayList. Next up is the clear method for when you simply just hate your ArrayList. This
is perhaps the simplest of them all. It has not taken any arguments, and will simply clear
the ArrayList deleting every element entirely. Calling example a list clear on our ArrayList
would delete all the objects within it. But I don't really want to do that right now, especially
with one more method to go. So for the sake of this series, let's just keep the ArrayList
filled with the values that it currently has. The final method that we'll be covering in this
section is a little bit different from the rest. And that's the to array method, which is
used to convert an ArrayList to an array. I thought I would include this one because it's really useful for combining the strengths
and weaknesses of arrays and array lists. The two array method takes in no arguments, and
will simply convert the array list into an array. Now for this to work, of course, you need to set
it to be equal to the creation of a new array, like shown on your screen now. But if it is done
correctly, you'll end up with a brand new array, which contains all of the contents that used to be
in the old array list. You may notice though, that instead of an array of integers, it's actually
an array of objects. This mostly has to do with that object oriented programming stuff we talked
about in the beginning. But for now, it won't make too much of a difference, we can still treat
it as a regular array, printing out indexes to the console, placing elements within it. Typical
array functionality. The only thing that changes is that it now contains integer objects instead
of primitive integer types. So there they are the six major functions that come with any given
version of the ArrayList class. Having these at your disposal will account for much of the functionality you might use an ArrayList for, making them extremely valuable tools. Let's now
Let's now move on to the ArrayList as a data structure. Again, we're going to be looking at its four time complexity equations for accessing, searching, inserting, and deleting. Now, if you remember back to the beginning of this segment, we mentioned that the ArrayList is backed by an array in memory. And this means that, just like the array, it too will have O of 1 accessing power. Essentially, this means that when we use our get method, which takes in an index location, it will return to us the value at the index provided in instantaneous time. Now you might be wondering,
how is this possible, since the data stored within an ArrayList is certainly not contiguous? This is actually due to a really interesting reason. So interesting that before scripting the series, I actually had no idea it was the case. So, because it is my series, I'm going to take a moment to talk about it. So if we pull up our example ArrayList in memory,
you can see that it looks a little bit different. Let's break down what's going
on. Instead of storing the actual objects contained within it, an ArrayList actually stores references, or pointers, to the locations of those objects in memory. So the zeroth index of the ArrayList is stored at the 87th location in memory, which is currently storing the integer one. Checking back to our example ArrayList, you'll remember that that is indeed what was stored at the zeroth index of the example ArrayList. And this goes for every element within the ArrayList. The first is stored at the 91st memory
location, the second at the 100th, and so on. So as you can see, while the actual data is not
stored contiguously the references to that data are. So when we run our get command, all it
has to do is return the value that's stored at whatever the index location points towards.
It's a bit more complicated than that, especially the way that these references get stored. But
that covers the gist of it. This is the reason our ArrayList can have instantaneous accessing
power without having the data stored contiguously. Alright, tangent over, let's get back to the big O notation time complexity equations for the ArrayList. So we know accessing is going to be O of 1. And searching is going to be O of n. This is for the same reason that arrays were O of n: if we want to find out if an object is contained within our ArrayList, and that
object is at the last index of the array list, we're going to have to search through
each and every element within it of size n to make sure because remember, we
always go based on the worst case scenario. Now, inserting into the ArrayList is going to be O of n, because worst case scenario, if we are inserting an element at the beginning of the ArrayList, we need to shift every element after the index we're inserting at to the right by one, just like we needed to do for the array. This requires a number of operations equal to the size of the array, making inserting O of n. Deleting is O of n for the exact same reason. If we want to delete the first element within the ArrayList, we would then have to shift every element down one to save space. Additionally, if we want to delete an object contained at the last index, we have to search through the whole list to find it. Either way, it will be O of n.
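If it helps to see that shifting spelled out, here's a rough Java sketch of what an insert into a fixed backing array conceptually involves; it's just an illustration of the idea, not the actual ArrayList source code.

```java
import java.util.Arrays;

public class InsertShiftSketch {
    // Insert value at index by copying and shifting: every element at or after
    // the index slides one slot to the right, which is the O(n) work.
    static int[] insertAt(int[] backing, int index, int value) {
        int[] result = new int[backing.length + 1];
        for (int i = 0; i < index; i++) {
            result[i] = backing[i];          // copy everything before the index
        }
        result[index] = value;               // place the new value
        for (int i = index; i < backing.length; i++) {
            result[i + 1] = backing[i];      // shift the rest right by one
        }
        return result;
    }

    public static void main(String[] args) {
        int[] backing = {2, 3, 4};
        System.out.println(Arrays.toString(insertAt(backing, 0, 1))); // [1, 2, 3, 4]
    }
}
```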
Alright, so there are your four time complexity equations: accessing is O of 1, and searching, inserting, and deleting are all O of n. If that sounds kind of familiar, that's because these are the same as the array's time complexity equations. See, I told you they were similar. Now, this does bring back up the question we posed at the
beginning of the episode: why even use an array in the first place? In comparison to the ArrayList, the array just does not seem as useful. Well, let's get into that by sizing these two data structures up against each other, mano a mano. Let's compare. An array is a collection with
a fixed size, meaning it cannot be changed, whereas an array list has a dynamic size, which
can be updated to fit the needs of the programmer. Arrays can store all types of data, meaning both
primitives and advanced types, whereas ArrayLists can only store objects, not primitives. Now, this problem is mostly solved through the autoboxing situation I talked about previously, but the fact still stands. Moving on, an array is built into most languages, meaning it doesn't have any methods pre-made for you to interact with or modify it, whereas an ArrayList is its own class, meaning it
comes with useful methods to help you utilize it. Finally, an array is very primitive in nature and
doesn't require a lot of memory to store or use. Whereas an array list is again a class, meaning
it requires more space and time to use than an array will. Hopefully now you can see that while
the array list is more powerful, it still does have some drawbacks which make using an array
sometimes more appropriate. In general, you want to use arrays for smaller tasks where you might not be interacting with or changing the data that often, and ArrayLists for more interactive programs where you'll be modifying the data quite a bit.
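If you'd like to see that trade-off spelled out in code, here's a small Java sketch; the sizes and values here are completely arbitrary.

```java
import java.util.ArrayList;

public class ArrayVersusArrayList {
    public static void main(String[] args) {
        // An array: fixed size, can hold primitives, no helper methods.
        int[] fixed = new int[3];
        fixed[0] = 1;
        fixed[1] = 2;
        fixed[2] = 3;
        // fixed[3] = 4; // would throw ArrayIndexOutOfBoundsException -- the size can't grow

        // An ArrayList: dynamic size, stores objects (Integer, not int), comes with methods.
        ArrayList<Integer> dynamic = new ArrayList<>();
        dynamic.add(1);
        dynamic.add(2);
        dynamic.add(3);
        dynamic.add(4); // no problem, it simply grows to fit
        System.out.println(dynamic.size()); // 4
    }
}
```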
So that's the ArrayList. As a review, it is a dynamically increasing array which comes with a slew of methods to help you work with it. The ArrayList is a hefty topic in computer science, so if you feel I didn't do a good enough job explaining some of the concepts in the video, the sources used to write the script for this video will be linked in the description below. But if not, let's move on
to our next data structure. The next data structure we'll be talking about
is the stack. Now at this point in the video, we'll be diverging from what are known as random
access data structures, ie arrays and array lists, and diving into sequential access data structures.
What's the difference? Well, if you remember back to our discussion on arrays and array lists, we
were able to access any element within the data structure by calling upon its index location. This
would then result in the computer instantaneously returning the value at that location. Each element
was independent of the one before or after it, meaning obtaining a certain element did not
rely on any of the others contained within the data structure. That basically describes
the gist of a random access data structure, one where each element can be accessed
directly and in constant time. A common, non-computer-science example of random access would be a book. Getting the information contained on a certain page doesn't
depend on all the other pages within that book. And getting an element contained within a
random access data structure doesn't depend on all the other elements contained within that data
structure. In contrast, elements in a sequential access data structure can only be accessed in
a particular order. Each element within the data structure is dependent on the others and may
only be obtainable through those other elements. Most of the time, this means that accessing
a certain element won't be instantaneous. A common non computer science example of
sequential access would be a tape measure. To get to the measurement of 72 inches, you would
first have to go through inches one through 71. There's no way to just instantly get to 72 inches without first breaking every law of thermodynamics
and probably the space time continuum. These are also sometimes called limited access
data structures, because unlike random access, the data is limited by having to obtain it through
a certain way. So there you have it, random access data structures, which allow instantaneous
accessing power and independent elements such as the array and array list. And then
you have sequential access data structures, which only allow accessing through a
particular order with dependent elements. For the next few segments, we'll be covering a few
of the popular sequential access data structures, such as stacks, queues, linked lists, and so
on. Beginning of course, with the stack, we've already danced around the subject long enough. So
let's finally talk about what a stack actually is. Now, the stack, by definition, is a sequential
access data structure in which we add elements and remove elements according to the LIFO principle. The LIFO principle, which stands for Last In, First Out, means that whichever element we added to the stack last will be the first one we retrieve. Hence, last in, first out. Think of this as a stack of books.
you do so by stacking it on top of the others. Then, if we want to get a book in the middle
of that stack, we would first have to take off all the books on top of it, we can't just
grab that book from the stack and expect the entire thing not to fall in on itself like some sort of literary Jenga. The same goes for the stack as a data structure:
we add elements to the top of the stack and retrieve them by taking them off the top of the
stack. There's only one way in and one way out for the data. We can't simply access or modify an
element within the stack all willy nilly like we were able to do with the array and the array list.
This might seem like a weakness, limiting the flow of data to one single point. But remember what I said during the segment on time complexity: many of these data structures have a built-in functionality which gives them an edge over the others, and the stack with its LIFO property is a prime example of that. Next up, let's talk about some of the
methods commonly associated with the stack. Now, the stack class will always come with two methods pre-built into it, those being push and pop. These are the fundamental methods which we use to interact with and modify the stack, making them extremely important to comprehend. In addition to those two, I'm also going to cover two more methods, peek and contains, which are both useful and can be found in the stack class associated with most programming languages. Now, just like we had an example ArrayList, let's also create an example stack to help us show off these
methods. And just like that, we're ready to roll. Now push is a method which pushes an object
onto the top of the stack. All it takes in as an argument is an object to add to the stack
and will return no value. When we do this, that object becomes the forefront of the stack, and its size is dynamically increased by one. Let's start running some push commands on a variety of different strings, and you'll see that the stack slowly gets built: "now", "channel", "this", "to", "subscribe", a series of completely random words seamlessly pushed onto the stack. As you can see, we're only adding information at one spot, the top; there's no insert method where we can just
jam a string into the middle of the stack. Moving on, the pop method is used to remove
an element from the top of the stack, it does not take in any argument and will return
the element that is popped off the stack back to the user. Once a pop method has run, the element
that was at the top of the stack is removed and returned back to the programmer. And the element
which was second from the top becomes the new top of the stack. And so on our example stack, if we
popped off each element, you'd see that each time one of these completely random strings is taken
from the stack and returned back to the user until we are left with a sad, empty stack with no
shameless promotion. Let's add the strings back, obviously, just for the sake of the
next two methods that we need to cover, and not for continuing the free advertising. So push and pop are how we add and remove the
data within our stack. So they're fundamentally the backbone of the data structure. The next
two I want to talk about, peek and contains, are more so used to interact with the data inside
the stack without actually changing it. Not as useful for modifying data, but still extremely
important. The first of these is the peek method. The peek method allows you to get the value at the top of the stack without actually removing it. We talked about before that the only way to access elements in a stack was through the top, and this method is simply a way to look at what the top value is without having to pop it off. It takes in no arguments and will just return back the contents of the top element. In our case, if we ran it on our example stack, "subscribe" would be returned back to the user. Now, if we popped off the top element and ran it again, "to" would be returned; you get the idea. Again, let's push "subscribe" back on at the top for educational purposes. Now the final method I'll be
covering is the contains method. This one is used for searching throughout the
stack. It takes in an object as an argument and will return a Boolean of whether or not that
item is contained within the stack. Essentially, this method allows us to search through the stack
without popping off every element until we find the one that we're looking for, as the contains
method does not modify the stack in any way. So, for example, exampleStack.contains with an argument of "subscribe" would return true, exampleStack.contains with an argument of "this" would also return true, but exampleStack.contains with an argument of "hello" would return false. So there they are: four common stack functions which are going to be vital if you ever want to construct a stack-based program.
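As a quick sketch, here's roughly how those four calls might look using Java's built-in Stack class; the variable name exampleStack and the strings being pushed are just the placeholders from the visuals.

```java
import java.util.Stack;

public class StackDemo {
    public static void main(String[] args) {
        Stack<String> exampleStack = new Stack<>();

        // push adds an element to the top of the stack
        exampleStack.push("now");
        exampleStack.push("channel");
        exampleStack.push("this");
        exampleStack.push("to");
        exampleStack.push("subscribe");

        System.out.println(exampleStack.peek());          // "subscribe" -- look without removing
        System.out.println(exampleStack.pop());           // "subscribe" -- removed and returned
        System.out.println(exampleStack.peek());          // "to" is the new top
        System.out.println(exampleStack.contains("this"));  // true
        System.out.println(exampleStack.contains("hello")); // false
    }
}
```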
Moving on to time complexity: for accessing, the stack has a time complexity equation of O of n. This is because in order for us to reach a certain element within the stack, we first have to pop off every element that's above it. Think of it like
this, if we had a stack of stones, and needed to get to the bottom one, we'd first have to take off
every stone on top of it. So worst case scenario, if the element we want is at the bottom of
the stack, we first have to pop off every element above it. This would require a number
of operations equal to the size of the stack, making the time complexity equation O of n. This is one of the major drawbacks to using stacks. With arrays and ArrayLists, we could easily
access any element within the data structure instantaneously. And with a stack, that's just not
possible because of the way that it's structured. Searching is going to be O of n for the exact same
reason. Worst case scenario, if we're searching for an element that's at the bottom of the stack,
we have to go through the whole thing just to find it. Now, inserting and deleting make up for this
by having time complexity equations of O of 1. This essentially boils down to the fact that using our push and pop methods really only requires one operation. Since the data only flows in and out of a single point, inserting or removing an object from that point can be done immediately. For the push method, we simply add it to the top of the stack, and for the pop method, we just take it off from the top. It's not rocket science. Actually, it's computer science, but that's besides the point. There's no need to rearrange data or move elements around like there was for the array and ArrayList, because we don't have to; the data that we're modifying is right there on top. So there you go, the time
complexity equations for the stack. Now, you might be wondering if there are even uses for a last in, first out data structure, because it seems kind of out there, I mean, limiting your data to a single entry point. But you would actually be mistaken. Stacks are used everywhere, both in the actual writing of other code as well as in real world situations.
In fact, one of the most fundamental processes of computer science uses stacks as a way of
keeping track of active functions or subroutines. Of course, I'm talking about recursion. Now,
I won't get into recursion too much here. But basically, it's the process of functions repeatedly calling themselves. When a function calls itself, that call is
added to a stack of processes. And when that stack finally reaches what's known as a base case,
the functions are all then popped off one by one. It goes much, much deeper than that. But we don't
have time for a full-blown lesson on recursion right now. If you want that, you can click the
card in the top right corner of your screen, or the link in the description below, which will
both take you to that part in our introduction to programming series where we cover it. Either
way, stacks are the backbone for recursion. And recursion is the backbone for a plethora
of computer science related functionality, such as traversing data structures, keeping
track of active subroutines in code, and much, much more. Some examples of stack based functions
outside of computer programming, which you use every day, include the undo/redo button in word processors and the back-paging in web browsers. Both use stacks similarly: they continually add tasks you've completed to a stack, either web pages that you've visited or words that you've typed. And then, when you press undo or the back button in a web browser, all we have to do is pop off whatever the last task was from the stack, and bam, you're right back to where you were a second ago. It's like magic, but better in a way.
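As a loose sketch of that undo idea, here's how a word processor might keep typed words on a stack and pop one back off when undo is pressed; the variable names here are made up purely for illustration.

```java
import java.util.Stack;

public class UndoSketch {
    public static void main(String[] args) {
        Stack<String> typedWords = new Stack<>();

        // every completed action gets pushed onto the stack
        typedWords.push("Hello");
        typedWords.push("world");

        // pressing undo just pops the most recent action back off
        String undone = typedWords.pop();
        System.out.println("Undid: " + undone); // Undid: world
    }
}
```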
As you can see, the stack has a lot of real world applications, both on the consumer side and the client side. You interact with and use stacks every day without even realizing it, and you'll use stacks frequently as you continue along your computer science journey. And so, by learning them, you're opening up a world of opportunities. That
concludes our discussion on the stack. To review: it is a sequential access data structure in which we use the LIFO principle to add and remove elements, and again, LIFO stands for Last In, First Out. Up next, we'll be talking about an equally useful data structure that
functions very differently than the stack while also working very similarly. And
that data structure is known as the queue. So now that we've talked about the stack,
a sequential access data structure which follows the LIFO principle, we need to cover the opposite, which in computer science we obviously call a queue. Now, by definition, a queue is a sequential access data structure which follows the FIFO principle, or first in, first out. The stack and queue are very much a dynamic duo when it comes to the world of comp sci. So
you'll notice a lot of similarities between the two and the way that they're structured and
how they work. In this segment, we'll cover a lot of the same topics as we did with the stack, only for the queue. Now, timestamps can be found in the description correlating to these topics. But if you're sticking around, let's dive deeper into what the queue actually is. Well, the queue, like the stack, is a sequential access data structure, meaning we can only access the information contained within it a certain way. Now, if you remember back to stacks, this certain way was through the LIFO methodology, or last in, first out, where the last element pushed onto the stack would always be the first one we popped off, similar to a stack of books
that we add to and remove from. Now in contrast, the queue follows what's known
as the FIFO methodology, or first in first out, where the first element added to the queue
will always be the first one to be removed. We can think of this as a line to your favorite
amusement park ride. Mine as a kid was always the Ring of Fire. So let's just use that one
as an example. The first one to get there, assuming we don't have any cutters, will always be the first one who gets to go on the ride. The later you show up, the longer you have to wait. This is the strategy used for queues when adding and removing objects: the first element to be added will also be the first one to be removed. Another big difference between stacks and queues
is the location we add and remove elements from. You might remember that with the stack, we added
and removed elements from one spot, the top. With the queue, however, this is different. We add elements to the back, also known as the tail, and we remove them from the front, also known as the head. This allows us to make sure that we 100% follow the FIFO methodology. So there's your background information on the queue: sequential access, FIFO methodology, add elements to the back and remove them from the front. Got it? Good. Now, let's dive headfirst into how we can actually use
a queue by jumping into some common queue methods. So just like the stack, we're going to
have two methods used to add and remove elements from the queue. For the stack, we added
elements with push and removed them with pop. In contrast, with a queue, we add elements using enqueue and remove them using dequeue. In addition to these two methods, we're also going to cover peek and contains, which, if you watched the previous segment, should look pretty familiar. Alright, let's start. Enqueue is the first method, and the one we use to
add elements to the tail of the queue. It takes in an object to put at the end of the queue, and
simply adds that object, while also increasing the size of the queue by one. Let's pull up an example queue, which you can see currently has a size of zero. But say we call enqueue on a completely random string, let's say the string "now"; that would be added to the tail of the queue, and the size would increase by one. Now, because there's only one element in the queue at this point, the string "now" is acting as both the head and tail for us. We can of course fix that by enqueueing a few more completely random strings. If we add the string "video", the size goes to two; we can add "this" and it goes to three; "like" makes it four; you get the idea. And now we have a fully functional queue. As you can see, "like" is acting as the tail, being that it was the last string added, and "now" is being treated as the head, which makes sense considering it was the first string to be added. Moving on, dequeue is the method we use to actually remove elements from the head
of our queue. It takes in no arguments and will return the element that was removed from the queue
back to the user. So if we ran a dequeue command on our example queue, you'd see that "now" is both returned back to the user and also removed from the queue. Additionally, the size of our queue has been dynamically decreased by one. If we run it again, "video" is returned and removed from the queue and the size goes down by one yet again; you get the idea. We can keep doing this
for every element in the queue until it's empty. But the next methods we're going to talk
about need some information to work with. So for now, let's refill the queue back to its
original four elements. The next method that I'll discuss is peek. Now, we've actually covered peek just a few moments ago in our segment on stacks. But if you forgot or just didn't watch that particular part, peek returns the object that's at the forefront of our queue. It doesn't take in any arguments and simply returns the foremost object of the queue without actually removing it, the key word there being without. This method allows you to look at the head of the queue before you actually dequeue. There are a multitude of reasons that you might want to do this, maybe to make sure that the element that you're about to dequeue is the correct one, or to check to see if an element you need is still there, etc., etc. Whatever the case is, we can use the peek method to fulfill our needs. If we were to run it on our example queue, you'd see that the string "now" is returned. But if we dequeue the first two elements and run it again, you'll see that the string "this" is returned. Pretty simple, but extremely effective. Again, let's add "video" and "now" back into the queue for our next and final method. That method, of course,
being the contains method, the name pretty much says it all. The contains method will take an
object and will return a Boolean of whether or not the queue contains that object. Running it on our example queue with an argument of "queue" would return false, because as you can tell, there is no "queue" string contained within our queue. However, if we ran it on a string such as "video", it would return true, because as you can see, the string "video" is indeed in the queue. And there they are, all together now: enqueue, dequeue, peek, and contains, four methods which will help you utilize a queue to its maximum efficiency.
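As a quick sketch, here's how those operations might look in Java, which doesn't literally name its methods enqueue and dequeue; its built-in Queue interface uses offer and poll for the same ideas, and the example strings are just the placeholders from the visuals.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class QueueDemo {
    public static void main(String[] args) {
        Queue<String> exampleQueue = new ArrayDeque<>();

        // "enqueue": add elements to the tail of the queue
        exampleQueue.offer("now");
        exampleQueue.offer("video");
        exampleQueue.offer("this");
        exampleQueue.offer("like");

        System.out.println(exampleQueue.peek());           // "now" -- the head, not removed
        System.out.println(exampleQueue.poll());            // "dequeue": removes and returns "now"
        System.out.println(exampleQueue.poll());            // removes and returns "video"
        System.out.println(exampleQueue.peek());            // "this" is now the head
        System.out.println(exampleQueue.contains("queue")); // false
        System.out.println(exampleQueue.contains("like"));  // true
    }
}
```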
Speaking of efficiency, that takes us perfectly into the time complexity equations for the queue. Now, accessing an element within a queue is going to be O of n. Let's say you had a queue full of three elements. If you want the object at the tail, you first have to dequeue every element off the front until the one you're looking for is the head of the queue. Then, and only then, can you actually get the value contained. Since this may require you to go through the entire queue of size n, accessing is going to be O of n. Remember, now,
queues are sequential access data structures and not random access, meaning we can't just grab an object from the middle of the queue. That's just not how things work. Searching is
going to be O of n for the exact same reason, trying to find an element contained at the tail
of a queue requires you to iterate across the entirety of that queue to check for it. So in that
scenario, we have to check every element within that queue of size n, making the time complexity equation O of n. Inserting into and deleting from a queue are both going to be an instantaneous O of 1. This is because, just like with the stack, we're only ever enqueuing at a single point, and we're only ever dequeuing at a single point. This means that no matter how large the size of the queue is, it will always take the same number of operations to either insert or remove an element. And there they are, in all their glory, the time
complexity equations for the queue. You may notice that they're identical to the stack's, which, if
you've been paying attention, is the truth for most of the properties of a queue. They're very much a yin and yang, one and the same type of deal, differing slightly in terms of the methodology you use to add and remove objects from them. You'll often see these two data structures talked
about together frequently, just because of how similar their functionality is. The final thing
I want to cover on the topic of queues are just some common uses for them within programming.
What are these things actually used for? And the answer is, quite honestly, a lot. On your computer right now, queues are being used for what's known as job scheduling, the process through which the computer determines which tasks to complete for the user and when, like opening up a web page or a computer program. It's used many times in printers to keep track of when multiple print jobs come in, determining whose documents get printed first. Heck, if you're looking for real world examples, Google even uses queues in their Pixel phones to enable what's known as zero shutter lag, in which they strategically use queues to eliminate the time between when you take a picture and when the phone actually captures it. So yeah,
in terms of functionality, the queue can be used in a variety of fields. So it's good now that you
know what they are and how to use them. This also concludes our discussion on the queue. To review: the queue is a sequential access data structure which follows the FIFO principle, or first in, first out, where we add elements to the back and remove elements from the front. Up next, we'll
continue on the sequential access data structures train and talk about one of
my personal favorite data structures. Next, we'll be covering the linked list, and
it's a good one. So strap into your seats. Let's just jump into things by talking about
the elephant in the room. What exactly is a linked list? Well, to answer that a linked list
is a sequential access linear data structure in which every element is a separate object
called a node. Each node has two parts, the data and the reference, also called the pointer,
which points to the next node in the list. Wow, that is a pretty big sentence with a lot of ideas
within it. So let's break it down part by part. The sequential access part of that statement means
that the information contained within the linked list data structure can only be obtained through a certain methodology. We talked about this in our segment on the stack. If you remember, during that segment, we compared it to a tape measure, because similarly, you can only get measurements from a tape measure through a specific method. The specific method for a linked list will be covered a little bit later. For now, moving on: the linear part of the definition
simply means that the information or data is organized in a linear fashion in which
elements are linked one after the other. Now when we state that every element is a separate
object called a node, this means that unlike an array or an ArrayList, where every element
is just let's say a number, each element in a linked list will actually be an object which
can have multiple attributes or variables. And I won't dive too far into object
oriented programming right now. If you want a good introduction to that, you
can check out the link in the description below, which will take you to our introduction
to object oriented programming lecture. For this series, we essentially mean that the
objects or nodes stored within each element of our linked list will hold two separate pieces
of information. These two pieces of information come together to form the node. And these
nodes are what make up our linked list. More specifically, those two pieces of information
are the actual data or information stored within that node, and the reference or pointer to the
next node in the linked list. The data is where our strings or integers or Boolean values are
kept the actual contents of our data structure. And the other piece of the puzzle is the pointer.
This reference or pointer points to where the next node in the linked list is stored in memory.
This helps link all of the nodes in a linked list together to form one long chain of information.
Kind of like a computer science conga line. You should now be able to understand what a linked list is. Again, it's a sequential access linear data structure in which every element is a separate object called a node, and each node contains both data and a reference to the next node in the linked list. This cumulatively creates a string of nodes which we call the linked list. Feel free to rewatch this explanation if you're still confused, because I know it can get pretty hectic pretty fast.
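And if seeing it in code helps, here's a bare-bones Java sketch of what one of these nodes might look like; the class name and fields are just illustrative, not something from a standard library.

```java
// A minimal singly linked list node: one piece of data plus a pointer
// (reference) to the next node in the chain.
public class Node {
    int data;   // the actual information stored in this node
    Node next;  // reference to the next node, or null if this is the tail

    Node(int data) {
        this.data = data;
        this.next = null; // a brand new node starts out pointing at nothing
    }
}
```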
The next thing I want to do in this section is just visualize how one of these linked lists is set up. That way, you can actually see what I mean when I say nodes and pointers and everything else. So every linked list starts with
what's known as the head node of the list. This is just an arbitrary label, which represents
a node containing some form of data, and also a reference to the next node in the linked list. For
now, let's just store the integer one in our head node. Now, since this is the only node so far in our linked list, the head node will simply point towards a null value, which is just to say it's not pointing anywhere. Essentially, it's pointing to nowhere and just being used as a placeholder until we actually give it something to point towards. Let's do that by adding another node to the linked list and storing the integer two inside of it. Instantly, you can see that now our head node points to this second node instead of a null value. You'll also notice that this new node, which I'll call the two node, points to a null reference value, just like the one node used to do, as it is now the last node in the linked list. This is a pattern you'll notice as we continue
adding nodes. The last node in the linked list, also known as the tail node will always point
towards a null value. This is our way of telling the computer that we reached the end of our linked
list and that there are no more nodes to traverse towards. Anyways, let's now create our third node and inside store the integer three. This node now becomes the tail and points towards a null value. And the second node that we created, the one that used to point towards a null value, now points to this new node which contains the integer three. Essentially, we're just adding a node to the tail
of a linked list and then setting the references of the previous node to point towards that new
node. And if you can understand that concept, you pretty much comprehend the gist of how these nodes
interact. The two main takeaways are that every node has both information and a pointer. And the
last node in a linked list points towards a null value. That's pretty much the setup for a linked
list. Definitely more abstract and fluid than, say an array or ArrayList, but hopefully not too
complicated. Next up, we're going to be discussing how we add and remove these nodes from a linked
list. Unfortunately, for us, adding and removing elements from a linked list isn't going to be as
simple as it was with the other data structures such as the stack or the queue. This has to do
with the simple fact that we are actually able to insert and remove elements easily within a linked
list at any location. With a stack or a queue, we weren't actually able to do this because the
data could only flow in and out of a few specified points. This made it so we couldn't remove an
element from the middle of the stack or jam an element in the center of a queue. The way that
they're structured just didn't allow for this. Using linked lists, however, we can easily remove elements from the middle or jam elements in the center. And so today, we'll be covering the different methods used to do so. More specifically, we'll be covering three different ways to both insert and remove nodes from a linked list: adding to and
removing a node from the head, the middle and the tail of a linked list. These are all
going to revolve around those pointers we talked about at the beginning of the episode. Because
whenever we change up a node in a linked list, we also have to change its pointers. And that can
get pretty complicated pretty quickly. Luckily, that's why I'm here. I've set up a basic linked
list on your screen now with three nodes that we can use to play around with. Each node has a value
of course representing the data inside the node, and also a pointer which points to the next node. The green and red coloring on the nodes with the integers one and three simply indicate which node is the head node and which node is the tail node: green means head node, and red means it's the tail node. Now, in an actual linked list, these pointers would be locations in memory, but for the purpose of this series, we'll be representing the pointers visually. Perfect. So let's cut
the chitchat and just jump right into it. The first method we're going to be covering is
adding and removing nodes from the head of a linked list. Lucky for us, this is pretty simple.
To add a new node to the head of a linked list, literally, all we have to do is make that new
node's pointer point to whatever the current head node of the linked list is. By doing this, we simply take away the title of head node from the current head and bestow it upon this new
node that we're adding, it looks a little bit something like this. Let's take a node with the
integer zero and add it to the head of the linked list. All we have to do to do this is set its
pointer to point towards the current head node. Now, you'll see that none of the other nodes have
changed in the slightest. The only thing that has changed is that this new node with integer zero is the head node, and it now points towards the old head node. Extremely simple, and the
best part is that it works in reverse as well. Removing a node from the head of
a linked list is just as simple. All we have to do is set the head nodes pointer
to some null value. Once we do, the head node will get cut off from the flow of information
essentially removed from the linked list. If we did this on our example linked list, you'd see that once we set the zero node's pointer to null, because it no longer points to the one node and no node points towards its location, it has been cut off from the rest of the nodes and exiled from the linked list. The old head node regains its position, and it's as if this integer zero node never even existed. Moving on, the next methods we're going to cover are inserting and deleting
a node from the middle of the linked list. These two methods are definitely the most
difficult of the bunch. And this is because we need to insert the node in such a way that the
pointers get readjusted accordingly without losing any of the information. If we accidentally set
the pointers wrong, or do things out of order, the data could get lost forever. But luckily,
I'm here to teach you how to do it the right way. We'll start with adding a node
to the middle of a linked list. Adding to the middle of a linked
list is a two step process. We first make the pointer of the new node point to
the node after the location we want to insert at. Then we set the node before the location we want to insert at to point towards the new node. So if we wanted to insert a node with the double
value 1.5 after the node containing the integer one, but before the node containing the integer
two, what we would first do is set the new nodes pointer to point to the node containing
the integer two, then we would make the node containing the integer one point towards
our new node which contains the double 1.5. By adjusting the pointers of the nodes before and after the location we want to insert at, we can strategically jam this node into the correct place without moving or modifying the entire list. As you can see, now we have a linked list of length four where the link has not been broken and the list is still fully connected. Removing a node
from the middle of a linked list is even simpler. All we have to do is make the pointer of the node
previous to the one we're removing to now point to the node after the one we're removing. Then, if we set the pointer of the node we want to remove equal to a null value, we again cut the node off from the linked list and it is removed. Simple as that. So, following these steps, if we now wanted to delete our 1.5 node, we'd make the one node again point towards the two node instead of the 1.5 node. Then, if we delete the pointer of the 1.5 node by setting it equal to null, it gets cut off from the flow of the linked list. No changes are externally made to the rest
of the list. Just one lonely node removed. The final type of insertion and deletion we'll be
covering is inserting to and deleting from the end of a linked list. Doing this simply requires us to modify the tail node of the linked list, the one which currently points towards some null value. For adding a node to the tail, you simply make the current tail's pointer, which is currently set to null, point towards the node you want to add. So if we wanted to add a node with the integer four, we would just make the three node point towards this new node. Then, by setting the four node's pointer to point towards null, we've made this new four node the tail of the linked list, and the old tail node with integer three now points to the new tail. Removing the tail of a linked list is just as simple. If we want to remove the tail, we just set the node previous to the tail to point towards a null value instead of the current tail. This leaves the current tail node with no connection to the linked list, isolating it from the pack. Doing this on our list would look like making the three node point towards null. And now, because no node points to our tail node containing the integer four anymore, it gets cut off from the linked list and essentially removed, making the old tail node, the one containing the integer three, the current tail node once again.
And so, after a bit of pointer readjustment, hopefully now you can understand the pseudocode behind inserting and removing elements from a linked list with ease.
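If you'd like to see those pointer moves written out, here's a rough Java sketch of the head and middle operations we just walked through, reusing the illustrative Node class from the sketch earlier; it's a simplified demonstration, not a complete linked list class.

```java
public class LinkedListOps {
    // Add a new node to the head: the new node points at the old head
    // and becomes the new head of the list.
    static Node addToHead(Node head, int value) {
        Node newNode = new Node(value);
        newNode.next = head;
        return newNode; // the new node is now the head
    }

    // Insert a new node after a given node (somewhere in the middle):
    // point the new node at the node after the insertion point,
    // then point the previous node at the new node.
    static void insertAfter(Node previous, int value) {
        Node newNode = new Node(value);
        newNode.next = previous.next;
        previous.next = newNode;
    }

    // Remove the node after a given node: the previous node skips over it,
    // and the removed node's pointer is cleared.
    static void removeAfter(Node previous) {
        Node removed = previous.next;
        if (removed != null) {
            previous.next = removed.next;
            removed.next = null; // cut the removed node off from the list
        }
    }
}
```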
Up next are the time complexity equations for a linked list. Now, accessing an element within a linked list is going to be O of n. This is because linked lists, again, are sequential access data structures.
This should be all too familiar if you watch the sections on stacks and queues. But
if not, let's just do a simple review. Sequential access data structures can only be accessed through a particular way, meaning we can't get any element we want instantaneously. Resulting from this is the stipulation that if we want to get a certain element within a linked list, we need to start at the beginning of the list and cycle through every node contained within it before we can finally access the one that we want. We do this by using the pointers as a little map for the computer: first go to node one; node one's pointer gives you the location of node two in memory; and node two's pointer will take you to the memory location which contains the node with the information that you want. It's like a computer science treasure map, in a way. For a linked list of size n, this means you could have to traverse the entire linked list before finding your information, making its time complexity equation O of n. Searching is O of n for the exact same reason: we
check a node, and if it's not the one that we want, we use that node's pointer to take us to the next node and check that one. We do this for every node until we either find the node containing the value we want, or get to a node which points towards null, indicating that we've reached the end of the linked list and that the value we're searching for just isn't contained within that particular linked list. Inserting and deleting from a linked list
is a little bit complicated, since linked lists usually only store the head node's, and sometimes the tail node's, location in memory permanently. If we want to insert or delete an element at the beginning or end of a linked list, we can do so instantaneously using the methodology we talked about previously. However, if we want to insert a node within the linked list, things become a little bit more complicated. In order to insert a node within the linked list, we must first traverse to the location we want to insert it at. This means following that treasure map until we reach the insertion point, and then, and only then, can we actually follow the instructions to insert or delete a node. So, depending on how and where you want to insert or delete a node, its time complexity equation will be either O of n or O of 1. A little confusing, yes, but a necessary distinction to mention for when it inevitably comes up. So, in review: accessing, searching, and sometimes inserting and deleting are all going to be O of n, and other times inserting and deleting will be instantaneous. Cool. Now let's finish up by talking about some
real world applications of the linked list. Now, something I haven't talked about before, but
would be scolded by people in the comments for not mentioning is that linked lists can actually
be used in the backing of other data structures. What do I mean by this? Well, basically,
we can use linked lists to make stacks, queues and some other data structures we haven't
talked about. This is in the same vein as earlier when I mentioned that the ArrayList uses the
array as a backing support system in memory. Now, this goes a little bit above an introductory series, but it's one of, if not the most important uses of a linked list in computer science, so I thought I'd mention it here. Basically, because of the way it's structured, having a stack or queue use the nodal methodology that we talked about during this episode, the one that comes with the linked list, as the back end of its structure makes a lot of sense. If you're curious, I'll have an article linked below explaining this idea in better detail. I just thought it was a little bit above the complexity for an introductory course.
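Just to hint at the idea without going too deep, here's a tiny sketch of a stack backed by linked nodes, again reusing that illustrative Node class; it's only meant to show why the nodal structure fits, not to be a production implementation.

```java
// A stack backed by linked nodes: pushing and popping both happen at the
// head of the chain, so no shifting or resizing is ever needed.
public class NodeBackedStack {
    private Node top; // the head of the chain acts as the top of the stack

    void push(int value) {
        Node newNode = new Node(value);
        newNode.next = top; // new node points at the old top
        top = newNode;      // and becomes the new top
    }

    int pop() {
        int value = top.data; // assumes the stack isn't empty, for brevity
        top = top.next;       // the next node down becomes the new top
        return value;
    }
}
```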
Now, that's not to say linked lists can't be useful in other ways. Take a queue on Spotify, for instance, where each node in the queue doesn't contain just an integer or a string, but an entire song with WAV data, a title, a length, etc. Then, when the track is done, it automatically points to the next song in the playlist. Another
example could be a photo viewing software where each node is an image, and the pointers
simply point to the next photo in the list. Like I said, the main functionality of a linked
list might be to back other data structures, but it also has tons of uses
in other areas of expertise. In review, a linked list is a sequential access
linear data structure, in which every element is a separate object called a node containing
two parts, the data and the reference pointer which points to the next node that comes in the
linked list. These pointers allow us to easily shift around each node, adding or removing it
without having to move massive amounts of data like we would have to do in the case of
an array or some other data structure. One thing I haven't mentioned yet about linked
lists is that this pointer system does come with one big drawback. With a normal linked list, we
can only ever go forward with our pointers, never backwards. From the computer's eyes, once we follow a pointer to a certain node's location in memory, there's no way to go back or undo to the previous one. Much like a shark, we can and only will ever go forward. This problem is fixed, however, through the use of a data structure known as the doubly linked list, which coincidentally is the subject of the next segment in this video. Small world. All jokes aside, next up, we're going to be covering the doubly linked list, so strap into your seats. A doubly linked list
is almost exactly the same as a linked list. That is to say it's a sequential access data
structure which stores data in the form of nodes, except with doubly linked lists, there's one small difference. With the doubly linked list, we're able to traverse both forwards to
the next node in the list and backwards to the previous node in our list. Again,
using pointers. How? Well, let me explain. With regular linked lists, each element was a node composed of both a data section and then a pointer which would take you to the memory location of the next node in the list. Using this, we were able to traverse a linked list easily to search for information, access data within the list, or add and remove nodes from within the list with ease. A doubly linked list simply builds upon this by also having a pointer which points to the previous node's location in memory. It's an undo button of sorts, which allows you to fluidly go through the linked list in either direction, instead of just limiting yourself to going in one direction. This is great since it allows us
to jump around the linked list and have a lot more flexibility when
it comes to modifying information. Now because this can get pretty confusing
very quickly, I want to do a visualization. Before that though, let's use some lingo to help
consolidate the terminology that I'll be using. When I refer to a node's next, I'm referring to that particular node's pointer which points to the next object in the list, whether that be another node or a null value. Similarly, when I refer to a node's previous, abbreviated to prev on the visuals, I'm talking about its pointer which points to the previous object in the linked list, again, either another node or a null value. Doing this just helps keep things a little bit more simple, because as you'll see, having both a previous and a next pointer makes things a little bit more complicated. Now, on to what
these doubly linked lists actually look like. Just like with a regular linked list, every doubly linked list is going to start with a head node. Since it's the first node in the list, both its previous pointer and its next pointer will point towards a null value. This is, of course, because it can't point to information which isn't there. We can fix this, though, by adding another node. Once we do, you can see that a few things have changed. The head node's next pointer now points towards this new node instead of null, the new node's previous pointer now points to the head node, and the new node's next pointer now points towards a null value. As you can see, it's a little bit more complicated than adding a node to a regular linked list, in which we only had to worry about one set of pointers as opposed to two, but still manageable. Let's add one
more node for demonstration purposes. And you'll see the same thing happens again. The
second node now points to this new third node, and this third node gets initialized with pointers which point both back to the previous second node and also forward to some null value. You can see that with any two nodes, the next pointer of the first and the previous pointer of the second come together to form sort of a cyclical connection which ties the two nodes together. Then, at the head and tail of the list, there's a connection which ties them both towards some null value. Most of this should be common sense, but it's still good to go over. Doubly linked lists are a little scary at first, but once you break them down, you realize that they're just an evolved form of linked lists. Up next, we're going to be talking about adding
and removing nodes from a doubly linked list. Again, using the three different methods
that we talked about in the previous segment, adding and removing from the head, the middle,
and the tail. I've also set up a basic doubly linked list with three nodes containing three
strings to help us out with this next segment. Finally, just like the last segment, the green on
top of a node signifies that it's the head node, and the red signifies it's the tail node. Alright,
with all that being said, let's get into it. Now adding to the head of a doubly
linked list is quite simple. The first step is to take our new node that we want to insert and set its previous to null. Second, we set the new node's next to point towards the current head node of the linked list. Then, all we do is set the current head's previous to point towards this new node instead of a null value, and we're set to go. Doing this rearranges the pointers in such a way that the new node we want to add becomes the head of the doubly linked list. Pretty simple. So, if we wanted a new node with the string "Abe" to be the head of our doubly linked list, we would just follow the steps. First, we set the Abe node's next to point towards the Adam node, then we set its previous to point towards a null value. And finally, by setting the Adam node's previous to point towards the Abe node, we have successfully added that node to the head of the list, making it the head node. Removing a node from the head is even simpler: we
first set the head node's next to point towards a null value, and then, by setting the second node's previous to also point towards a null value, we can easily remove the head node from the list. This is how it would look in our example: we'd first set the Abe node's next to null, then we would set the Adam node's previous to also be null, and then we're done. Now, because the Abe node doesn't have anything to point towards, nor does it have anything pointing towards it, it will be deleted from the list. The Adam node will regain its position as the head node and the program will carry on. Up next is inserting and deleting from
the middle of a doubly linked list. This is where things get really tricky,
so make sure you're paying attention. To insert a node into the middle of a doubly
linked list, we follow a three step process. The first step is to take the node we want to add
and set its previous to point towards the node previous to the position we want to insert at.
Then we take that same node, the one we want to add, and set its next pointer to point towards the node after the position we want to insert at. Finally, we set the next on the node at the position before the one we're inserting, and the previous on the node after the position we're
inserting to both point towards this new node. That is a lot of words, some of which might not
even make sense to you. So let's open up the space on our screen real quick and do an example with
our list. Let's take a new node with the string Chris and add it between the Adam node and the Ben node by simply following our steps. First, we want to set the new node's previous to point towards the node previous to the position we want to insert at. This means setting the Chris node's previous to point towards the Adam node. Simple enough. Second, we set the new node's next to point towards the node after the position we want to insert at. This entails setting the Chris node's next to point towards Ben. So far, so good. Then, the last step is to set the next on the node before the one we're inserting and the previous on the node after the one we're inserting to both point towards this new node. So, in this case, the Adam node's next and the Ben node's previous both get set to the new Chris node. This completes the addition into the list. Obviously, this is a little
bit more complicated than inserting or deleting from the head. But as you can see, we've now
inserted this new node into its rightful place. It may be hard to see since the pointers are a
little bit messy, but the flow of the list has remained constant without any breaks in the
pointers, which is the most important part. Removing a node from the middle of a doubly
linked list is also a three step process. First, we set the next of the node before the one we want to remove to point towards the node after the one we want to remove. Then, we set the previous of the node after the one we want to remove to point towards the node before the one we want to remove. The final step is to set both pointers of the
node we want to remove to point towards a null value. Again, a little complicated, so let's
take it to our example list and try it out. Let's delete the Chris node, just for the sake of keeping things consistent. Following the steps that we've laid out, we would first set the next of the node before the one we want to remove to point towards the node after the one we want to remove. So, in this case, we set the Adam node's next to point towards the Ben node, essentially skipping over the Chris node. Then, we set the previous of the node after the one we want to remove to point towards the node before the one we want to remove. So we set the Ben node's previous to point towards the Adam node, again skipping over the Chris node. Finally, we have to set the Chris node's pointers to point towards null values. Now that we've done this, because no nodes point towards it, and it doesn't point towards any nodes, the Chris node gets deleted from the doubly linked list. The list is back to its original form with no
breaks or errors. Adding a node to the tail of a doubly linked list is also a three step process.
Step one entails setting the next pointer of the current tail to point towards the new node we want to become the tail. Step two is setting the previous of the new node that we're adding to point towards the current tail of the list. And step three is making the new node's next point
towards a null value. Again, let's do an example where we add a new node containing the string
Ethan to the tail of the doubly linked list. Following our steps, we first set the next pointer of the current tail, the Carl node, to point towards the Ethan node. Then we make the Ethan node's previous point towards Carl, and its next point towards a null value, and we're all set. Ethan has been successfully added as the new tail of the list in three quick and easy steps. Removing from the tail of the list is even easier and only requires two steps. We first set the tail node's previous to point towards null, and then we set the second to last node's next to also point towards null. On our example list, it looks like this: we first set the Ethan node's previous to point towards null, then we set the Carl node's next to also point towards null. This isolates any pointers going to or from the Ethan node and deletes it from the list, making the Carl node the tail once again. Adding and removing information from a doubly
linked list might sound like a hassle. But remember, we only have to use this pseudocode to
program these functions once in whatever language we're implementing them in. And then we're
able to reuse them an infinite number of times.
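And just so that pseudocode isn't left completely abstract, here's a hedged Java sketch of a doubly linked node and the three-step middle insertion; the class name DoublyNode and the String data are made up for this illustration.

```java
// A doubly linked node: data plus pointers to both neighbors.
public class DoublyNode {
    String data;
    DoublyNode prev; // pointer to the previous node, or null at the head
    DoublyNode next; // pointer to the next node, or null at the tail

    DoublyNode(String data) {
        this.data = data;
    }

    // Insert a new node between two existing, adjacent nodes,
    // following the three steps from the walkthrough above.
    static void insertBetween(DoublyNode before, DoublyNode after, String value) {
        DoublyNode newNode = new DoublyNode(value);
        newNode.prev = before;  // step 1: new node's previous points at the node before
        newNode.next = after;   // step 2: new node's next points at the node after
        before.next = newNode;  // step 3: stitch the neighbors back to the new node
        after.prev = newNode;
    }
}
```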
Now, for time complexity equations: since the doubly linked list is really just an extension of the linked list, its time complexity equations are going to be exactly the same as they were for the linked list, and for the exact same reasons. O of n for all four cases, and sometimes O of 1 for both inserting and
deleting. If you didn't watch the linked list segment and are wondering how we arrived at these
equations, you can check the description for a timestamp which will take you to the discussion
we had on linked list time complexities. Finally, in terms of doubly linked lists, I just want to talk about some of the uses for a doubly linked list, because there are a ton. The back and forth functionality of a doubly linked list lends itself to being implemented in a lot of stack-like functionality, i.e. cases where you want to go back and forth between information that you're storing for the user. Some examples of this could be the browser caches which allow you to go back and forth between web pages, the undo/redo functionality in many programs, applications which allow you to utilize an open-recent functionality; the list goes on and on. Basically, any case in
which you need to store a list of objects with multiple attributes, a doubly linked list
is going to be a safe bet to get it done. linked lists and their evolved
form and the doubly linked list are a great way to store information because
of the adaptability of the nodal structure. Since we're not working with raw information
like primitive types, and we're storing All information inside of a shell, it makes it
a lot easier to move the information around. This combined with the pointer system allows for
non contiguous but still fast operating speed, making these two data structures a
staple in computer science. Up next, we'll be covering dictionaries. And with that a
little mini lesson on hash tables. This is the last of what I referred to at the beginning of the
lecture as the intermediate data structures, and it is honestly one of my personal favorites. So let's
not wait any longer and just jump into things. Before we get too far, though, we should probably
clarify that when we say dictionary, we're not referencing that thick book you probably have
lying around your house and haven't used in years. Actually, dictionaries are one of the most
abstract data structures which exist in computer science and can be used for a variety of purposes.
Another thing I'll clarify really quickly is that dictionaries are also sometimes called
both maps and associative arrays by the computer science community. This has more to do with
language specifics and preferences and not any functional differences, since all of them work in
almost identical ways. But for this series, we're going to be referring to them as dictionaries,
just to keep things simple and organized. Now a dictionary in computer science, by
definition is an abstract data structure, which stores data in the form of key value pairs. This
means we have two main parts to each dictionary element, the key and the value. Each value within
a dictionary has a special key associated with it. And together they create a pair which is
then stored in the dictionary data structure as an element. Think of a key value
pair like a social security number. Each social security number is a key,
which is then paired with a value, that value being an individual person. These
social security number key value pairs then come together to form a dictionary of every
human being living in the United States. This is very different from many of the data
structures we've talked about previously, because we index dictionaries using
these keys instead of a numerical index. For example, with an array, we would index each
element within the data structure according to a numerical value, which started at zero and ran the
length of the array. With a dictionary, however, we index each element by using its key instead
of some arbitrary integer and obtain information through that index instead. So what exactly
are these key value pairs going to look like? Well, they can be just about anything. The keys
in a key value pair can be any primitive data type that you can think of: we can have a
dictionary which has integers as the keys, one with strings as the keys, one with doubles
as the keys; there's really a lot of flexibility. As for the values, we have even more
flexibility with those. We can have the keys in our dictionary correspond to pretty much
anything: strings, sure; booleans, of course; another dictionary with its own set of key value
pairs, you can even do that too. This allows us to have tons of combinations in the way that we store
our data, making dictionaries extremely powerful. As powerful as they are, though, there are two
extremely important restrictions that we have to cover when it comes to dictionaries. And they
are as follows. Every key can only appear once within the dictionary, and each key can only
have one value. Let's start with the first one. Each key has to be unique and cannot
be replicated, duplicated, cloned, copied or anything else that would cause there
to be two keys of the same value in a dictionary. Think of it like this: when you create a key value
pair, the computer creates a little locked box in memory to store the value. Then the computer
spends days and weeks creating a customized, handcrafted, one of a kind key that
corresponds directly to that locked box. This key cannot be replicated in the slightest.
And that's the same mindset you should use when working with dictionaries, no two keys can
be the same. If you were to try and make a dictionary with two identical keys, you
would be thrown an error by the computer. The second stipulation is that each key can
only have one value. Think back to our custom key example. It wouldn't make sense for this one
of a kind key to be able to open multiple boxes. The same is true for our keys in computer
science: they can only correspond to one value. Now, one rule that we don't have to follow is
that there can be duplicates of values within a dictionary, meaning we can have two separate
keys both point towards equivalent values. As long as the keys are different, the computer does not care.
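Here's a quick illustration of those rules in Python, whose built-in dict type is a dictionary in exactly this sense; the names and values are just made up for the example.

```python
# keys are unique; each key maps to exactly one value
ages = {"Steven": 18, "Shawn": 19, "Carl": 21}

print(ages["Shawn"])   # index by key, not by numerical position -> 19

ages["Steven"] = 20    # re-using an existing key replaces its value
ages["Dana"] = 21      # duplicate *values* under different keys are perfectly fine

# keys and values can be a mix of types, including another dictionary
profile = {"name": "Steven", 42: True, "nested": {"inner_key": [1, 2, 3]}}
```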
Alright, now that we know what dictionaries are and how they work, let's jump into the time
complexity for a dictionary, or at least try to. Let me explain. Now, for a dictionary, the time
complexity equations are a little bit funky. Previously, we talked about linked lists and how
they are sometimes used as the backing of other data structures in memory. For example, a stack
might implement the linked list structure as its way of storing information. Well, the most common
way to initialize a dictionary is to implement it with what's known as a hash table. Hash tables are
a little more advanced than the content I wanted to cover in this series. But they are a huge part
of dictionaries, especially when it comes to time complexity. So we're going to give a little mini
lesson on them today. Doing this will help you better understand the time complexity equations
for a dictionary, as well as give you some background on advanced data structures. Alright,
let's begin with a little thought experiment. Imagine this scenario for a second. say we
have a dictionary which contains a few key value pairs like shown on your screen. Now,
let's just assume that to store all this information in the computer's memory, we use an
array like how it is shown on your screen now, where the position that a cell is in the table
also corresponds to the key in the key value pair stored in our dictionary. So the zeroth
cell would contain a key with the integer zero, which leads to a value. The fourth cell
would contain a key with the integer four, which leads to another value, and so on. Any cell
which we don't have a key for is empty, or what computer scientists refer to as nil. Why does
this not work? Why can't we just use an array like this to back our dictionary in memory, since it
looks like actions such as accessing and searching would run in constant time, since all the keys
correspond to their table values in memory, all we'd have to do to get a value is simply
reference the key at the index we're looking for? Well, this actually isn't the case, because it's
based upon the simple assumption that the size of the array is not too large. Sure, this might work
great for this dictionary where we have 10 or so elements. But let's say instead of 10 elements
evenly spaced out, we want to have one which has 10 values, which are sporadically placed from one
to a billion. We want to keep the keys as being in the same index position as their values, so we
can know exactly where they are and reference them easily. But by my count that is 999,999,990
nil values, just taking up space in memory. And that is not good whatsoever. There's
got to be a better way to do things, right? Well, this is where our hash tables come
in. Hash tables are fundamentally a way to store this information in such a way that we're able
to cut down the amount of nil values, while also allowing for the information to be stored in such
a way that it is easily accessible. Basically, by using hash table, we'd be able to store these
10 keys in our dictionary, which range from one to a billion in a table with only 10 elements,
while also being able to keep constant time. How is this possible? Well, through the use of
a trick programmers use known as hash functions. A hash function will take all the keys
for a given dictionary and strategically map them to certain index locations in an array,
so that they can eventually be retrieved easily. Essentially, by giving a hash
function both a key and a table, it can determine what index location to store
that key at for easy retrieval later. These hash functions can be pretty much anything which
takes in a value and returns an index location. The goal of a good hashing function is to take in
a key, whatever that may be, and reliably place it somewhere in the table so that it can be accessed
later by the computer. So with that being said, let's just jump into an example. Because I know
a lot of you might be confused. Let's say we have our dictionary which contains keys in the form of
integers from one to a billion by a factor of 10. So the keys are 1, 10, 100, 1,000, 10,000, 100,000, and so
on. A good hash function for this data set would be to take the key divided by itself, and then
multiply the result by the number of digits in the key minus one. So to find out where to store
the value corresponding to the 1 million key, we would take a million divided by itself, which
yields one, and then multiply that by the number of digits in that integer minus one, in this
case seven minus one, or six. This means we'd store the 1 million key at index location six.
If we do this for every key in the dictionary, we can see that each key in the key value pair
is stored at some index from zero to nine. We've consolidated the 10 keys from one to
a billion into 10 index slots instead of a billion. A pretty good improvement, in my opinion.
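As a rough sketch, that toy hash function from the example might be written in Python like this. It only behaves sensibly for this particular set of keys; it's an illustration, not a general-purpose hash.

```python
def toy_hash(key: int) -> int:
    # key / key is always 1; multiplying by (number of digits - 1)
    # maps 1 -> 0, 10 -> 1, 100 -> 2, ..., 1_000_000_000 -> 9
    return (key // key) * (len(str(key)) - 1)

table = [None] * 10                       # ten slots instead of a billion
for key in (10 ** i for i in range(10)):  # keys 1, 10, 100, ..., 1,000,000,000
    table[toy_hash(key)] = f"value for {key}"

print(toy_hash(1_000_000))   # -> 6, so the 1 million key lives at index 6
```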
You might think this is a little overcomplicated for storing 10 elements. But remember, sometimes
we might be working with thousands of keys at a time, ranging in the billions, or even strings
as keys, which is a whole other story. Now for the best part about these hash functions:
let's say that we now want to get the value which is paired with the 1 million key. All we need to
do is put 1 million back into our hash function, and it will tell us where the value tied
to that key is stored at within the table, in this case at the sixth index. Doing this
allows us to pretty much instantaneously find where any key value pair is stored at within
the table. All of this information may be pretty confusing, and if you still aren't 100% sure, I
would recommend watching this part over again. However, in layman's terms, and for
this series, all I want you to be able to comprehend is that dictionaries are built
upon these hash tables, and the keys in our key value pairs are stored in these hash tables at
indexes which are determined by hash functions. And if you can understand that, you've got a
pretty good base for understanding hash tables. Now the final thing I want to talk to
you guys about in regards to hash tables, and the reason dictionary time complexity
equations are so funky, has to do with one small problem: what happens when we run two
different dictionary keys into a hash function, and the computer tells us to store them at the
same index location? For example, let's say we have two dictionary keys with the strings Steven
and Shawn, and when we run both of them through our hash function, the computer instructs us to
store both of them at the ninth index location. This should be impossible. How are we supposed
to put both of them at index location nine? This little problem is what's known as a hash
collision and can be solved one of two ways, either open addressing or closed addressing. With
open addressing, we just put the key in some other index location separate from the one returned to
us by the hash function. This is usually done by looking for the next nil value in the table,
i.e. the closest location which contains no key. So with our example, the key Steven would
get hashed to the index location nine, mostly because it's a better name, and the inferior
key Shawn would get stored at index location 10, because it's the closest open location available.
This does make it harder to interact with the data in our dictionary later on, and can lead to
problems if we eventually have a key which hashes to the index location 10, which is why
computer scientists developed closed addressing. Closed addressing uses linked lists to chain
together keys that result in the same hash value. So the keys Steven and Shawn would get stored
together in a linked list at index location nine. The main drawback to this is that whenever we
want to interact with the value stored within a key value pair for either Steven or Shawn, we end
up having to look through the linked list for that piece of data that we want. If there are 200 keys
hashed to one index, that's 200 keys we might have to search through to find the one that we want. And that's not very good.
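To make closed addressing a bit more concrete, here is a minimal Python sketch of a hash table that chains colliding keys together in a bucket. A plain list stands in for the linked list, and everything here, including the table size of 10, is just illustrative.

```python
class ChainedHashTable:
    def __init__(self, size=10):
        self.buckets = [[] for _ in range(size)]  # one chain per index location

    def _index(self, key):
        return hash(key) % len(self.buckets)      # hash function -> index location

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for pair in bucket:                       # key already present? replace its value
            if pair[0] == key:
                pair[1] = value
                return
        bucket.append([key, value])               # otherwise chain a new key value pair

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:  # walk the chain at that index
            if k == key:
                return v
        raise KeyError(key)

table = ChainedHashTable()
table.put("Steven", "host")
table.put("Shawn", "co-host")   # if these two collide, they simply share a chain
print(table.get("Shawn"))
```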
And with that concludes our mini lesson on hash tables. Now we can get
back into dictionaries, and finally talk about their time complexity equations. Now remember back
to the segment on time complexity really quickly. Back then I mentioned that for
these time complexity equations, we generally measure a data structure based
on its worst case scenario. But when it comes to dictionaries, if we are to assume the worst
case scenario, things get a little outrageous. We basically end up assuming that our hash
function makes it so that every key value pair ends up at the same index, meaning that each
of our keys gets stored in a linked list, assuming that closed addressing is
being used. Then, worst case scenario, we have to assume that every operation functions
how it would for accessing, searching for, inserting, or deleting from a linked list, which,
if you remember, is O of n for all four. This is of course preposterous and would probably
never happen with even a bare minimum decent hash function, which is why in addition to the worst
case scenario time complexity equations, I'll also go over the average time complexity equations
for these four operations. Now, lucky for us, they're all going to be O of 1. This has to do
with these hash functions we talked about before. To access, search for, insert, or delete a key value
pair from our dictionary, all we need to do is run that key through our hash function, and it will
tell us what index in the hash table to go to in order to perform that operation. There's no time
wasted looking through preset indexes, because we, the programmers, generate the indexes ourselves. And
that's the power of the hash table in the flesh. Overall, dictionaries are a very useful data
structure when it comes to computer science for a few reasons. They differ from the rest
in quite a few big ways: the fact that they don't use a numerical index to retrieve values,
but rather a preset key; the notion that the keys in the key value pairs can be a range of types,
from strings to integers to chars, and so on; and the implementation of the hash table, which allows
for super quick utilization that is sure to come in handy in any program that you write. In
conclusion, and overall, the dictionary is just a solid data structure that can be used to fill
in the gaps in your data structures knowledge. And with that, we have now completed all of
the intermediate data structures if you will. Next, we'll be moving on to trees and tree based
data structures. This is the point in this series where we will be dropping our discussion on time
complexity equations, because if you thought that the dictionary time complexity equations were
complicated with hash tables and hash functions, trees and their many variants only further
complicate things, to the point where I no longer feel comfortable including them in an introductory
course. We might mention that certain tree based data structures are good for accessing or
searching. But as for discussions on all four time complexity equations, dictionaries are where
we leave things. With that being said, let's move on to the final few data structures as we round
out this introduction to data structures series. Next up on our list of data structures
is the tree. No, not those big green things that inhabit the outside world and
that you can climb on and get fruit from, rather the computer science data structure,
which is way more fun in my opinion. Now, before getting into trees, we need
to talk about data structures in general. Every data structure we've covered up until this
point has been stored linearly. Arrays, stacks, linked lists, all of these had a definitive start
and end to their data, and you could point them out easily. Even the dictionary could be laid
out in such a way to be represented linearly. Everything was so nice and neat and easy to
visualize. On the other hand, trees store data hierarchically as opposed to linearly. What does
that even mean? Well, besides being an extremely hard word to pronounce, it's an alternative way
to store data that we're going to be using for the remainder of this series. It's a little hard to
visualize. So I'm going to start with an example. The most common real world example of
hierarchical data would be a family tree, each person would be an element of the family
tree, and connections wouldn't follow a simple linear fashion. Rather, connections
would be more abstract and can lead to multiple paths or branches, instead of just a single data
point. Each generation is ordered on a hierarchy, which is where the name comes from. And
as you can see, there's no definitive end to the family tree. Sure, there are the ones
at the bottom, which you could argue are the ends of the family tree, but which
one of them is the definitive end? There is none. Another example could be
a file structure system on your computer, you'd have a base folder, such as your desktop.
And then inside of that, you might have multiple other folders for different types of information.
And then inside of those, you might have more folders representing more types of information.
And then finally, you would have your documents. Again, it sets up a network of those files on
different levels, just like how we had with the family tree. Trees, and the tree's many variants,
are all dependent on storing data hierarchically, which of course finally begs the question of what
exactly a tree is. Well, a tree is an abstract data structure which consists of a series of linked
nodes connected together to form a hierarchical representation of data. The actual information or
data is stored in these nodes. And the collection of nodes along with the connections between them
is what's known as the tree. This is sort of like a linked list, only instead of each node only
pointing to one location, it has the option of pointing towards multiple, and also branching
off on its own or pointing to no other nodes. Each of the nodes in a tree can also be called
a vertex, and the connections between vertices, the things which link two nodes together, are called edges.
Now the free flowing structure of a tree lends itself to a lot of different configurations. And
with that comes a plethora of terminology about certain nodes, vertices, and just the structure
of a tree in general. So what we're now going to do is go over a lot of the terms associated
with specific nodes based on where they are on the tree and how they're connected to other nodes
while also visualizing the general structure of a tree at the same Time. Okay, let's start. Let's
begin with the things we've already talked about. A vertices is a certain node in a tree,
and an edge is a connection between nodes. The first new thing is that every tree starts with
what's known as a root node. This is always going to be the topmost node of a tree. Let's add a node
with the integer 50 to serve as our root node. Next up, let's connect two vertices
to our root node using two edges. These two nodes we just added are what's known as
child nodes, since they are both connected to the node containing the integer 50. Child nodes can
thus be defined as a certain node which has an edge connecting it to another node one level
above itself. Vice versa, the root node, the vertex containing the integer 50, is now what's
called a parent node to these two child nodes. Thus, we can define a parent node as any node
which has one or more child nodes. Think back to our family tree. If we were using people
instead of integers, it makes perfect sense that the nodes directly connected to each
other have some sort of familial relationship. Let's continue on by adding two child nodes to the
node containing the integer 30. When we do this, the 30 node becomes a parent node, and the
two nodes we've just added become child nodes. We have now branched off from the 20 node
completely, in that the two child nodes containing the integers 10 and 15, respectively,
share no direct association with it. Speaking of the 20 node, since
it does not have any children, it is what's known as a leaf node, or a node
in a tree which doesn't have any child nodes. In this context, the 10 and 15
nodes would also be leaf nodes. Finally, let's add one more node as a child of
the 10 node and there's our tree in all its glory. As a quick review, the 50 node is the root node,
the 30 and 20 nodes are children of that root node and the root node is thus the parent of
the 30 and 20 node, then the 10 and 15 nodes are children of the 30 node. And the 30 node is
thus a parent node to both the 10 and 15 nodes. The five node is a child of the 10 node and
the 10 node is a parent to the five node. And finally, the 5, 15, and 20 nodes are all leaf
nodes because they do not have any children. As you can see, one node or vertex on
our tree can have many different titles depending on where it is in the tree and what
other nodes are connected to it. For example, the 30 node is both a parent node to the 10
and 15 nodes, but also a child of the 50 node. The 15 node is both a child of the 30 node,
and also a leaf node as it has no children. This terminology really comes in handy when
we start talking about trees which contain 1000s of vertices and edges, and the data
becomes very complicated to order and maintain. Moving on the next two pieces of
terminology I want to go over with you guys are the height and depth of a tree. The
height is a property of a particular tree in and of itself, and the depth is a property of each
individual node contained within a tree. Let's start with the height. The height of a tree
is the number of edges on the longest possible path down towards a leaf. So in our tree
example, since the longest path in our tree which leads to a leaf is from the 50 node to the five
node, and there are three edges in that path, the height of the tree would be three. The depth
of a certain node is the number of edges required to get from that particular node to the root
node. For example, let's take our 30 node. Since there's only one edge connecting it on the
way up to the root node, its depth would be one. For the 15 node, however, since there are two
edges which separate it from the root node, the 15 node would have a depth of two. That's pretty
much all you need to know about the terminology associated with trees in computer science. As a
review, we have the height and depth obviously. Then we have vertices, edges, root
nodes, parent nodes, and leaf nodes. Now there's probably something that's
been scratching at the edge of your mind for quite a while now. And that's why the heck
are these called trees? They look nothing like trees. Regular trees are not upside down like
this data structure would lead you to believe. So who named trees and why? Well,
there's actually a simple answer to this. The tree is said to have been invented during
the 1960s by two Russian inventors. And the last time that a computer scientist got up from
the chair and went outside to actually see a tree is rumored to have happened back in 1954. Ever
since then, they've been grinding away at screens, watching hours upon hours of their favorite YouTube
channel, so please forgive them for their confusion when it came to naming conventions; they
must have just forgotten trees aren't upside down. Now regular trees, like the ones we just created
are great for storing hierarchical data. But their power can really be heightened when you start
messing around with how the actual data is stored within them. By imposing rules and restrictions
on what type of data is stored within a tree, as well as where it is stored, we can effectively use the
tree data structure to its full potential. I could talk about the different types of
trees for a long time, so long that many of them are full segments in this lecture. But for
now, I just want to cover one popular variant. This will be a good introduction to how trees can
vary without simply diving into a huge deviation from what we already know. More specifically,
the tree variant I want to talk to you guys about is the binary search tree. A binary search tree
is a simple variation on the standard tree, which has three restrictions put on it to help
organize the data. The first is that a node can have at most two children. This just helps make
it easier to search throughout the tree, as we don't have to spend time looking through each
of, say, eight children for a particular node. Keeping it limited to two helps us do this. The
second restriction or property is that for any given parent node, the child to the left has a
value less than or equal to itself. And the child to the right has a value greater than or equal
to itself. This might seem weird, but it comes with certain advantages and disadvantages over
using normal trees, which we'll get to in a bit. The final restriction put on binary search trees
is that no two nodes can contain the same value. And this is just to prevent weird things from
happening when we begin searching through the tree. Now, how does imposing these restrictions
on a tree actually help us? Well, the biggest advantage of binary search trees is that we're
able to search through them in logarithmic time. Because there is a natural order to the way that
the data is stored, it makes it extremely easy to search for a given value. Logarithmic time, if you
remember back to the segment on time complexity, is the equation in which we get more bang
for our buck the greater the number of elements or nodes we have in our data structure. It works
like this. All we have to do when searching is tell the computer to go left if the value
we're searching for is less than the current node, and right if it's greater than the current node.
We can then wash, rinse, and repeat this strategy until we find our desired node.
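Here's a minimal Python sketch of that go-left-or-go-right search; the node class and the tiny example tree are just illustrations, not the example tree from earlier in the segment.

```python
class BSTNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left     # child with a smaller value
        self.right = right   # child with a larger value

def bst_search(node, target):
    while node is not None:
        if target == node.value:
            return node          # found the desired node
        elif target < node.value:
            node = node.left     # go left: target is smaller than the current node
        else:
            node = node.right    # go right: target is larger than the current node
    return None                  # target isn't in the tree

# a tiny example binary search tree:   8
#                                     / \
#                                    3   10
root = BSTNode(8, BSTNode(3), BSTNode(10))
print(bst_search(root, 10).value)   # -> 10
print(bst_search(root, 7))          # -> None
```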
This makes binary search trees really popular for storing large quantities of data that need to be easily
searchable. Of course, this also translates to inserting, deleting and accessing elements
within the data structure. But for the most part, searching efficiency is what really sets
the binary search tree apart from the rest. Stepping away now from binary search
trees and into trees in general, let's talk about common uses for them in
computer science. The most common uses for trees in computer science include storing
data with a naturally hierarchical structure. These are like the examples we touched
upon at the beginning of this segment. Data sets such as file structure systems,
family trees, a company's corporate structure, all of these could be stored and implemented
using the tree data structure very easily. These are all general examples though. As I
mentioned before, when we put restrictions on trees, like in the case of the binary search
tree, we can expand its uses even further. A tree's basic structure is incredibly useful,
and it can be modified in so many ways, which only add to its functionality. One of
these ways is through what's known as a trie, and it is the next data structure on our
agenda. So stay tuned for that. Next up, we'll be talking about another one of the
special trees with restrictions. We just finished discussing the binary search tree. And with that
we mentioned how trees usually become even more useful once you start setting restrictions on
how and where the data can be stored within them. Well, a trie is another one of these special trees,
which has special restrictions put in place to help store the data in an effective manner. This
data structure is often overlooked since it's only used in specific situations, but without the use
of tries within those specific situations, some important features of your computing life would
be a whole lot harder. So, we get it, Steven: a trie is a special tree with restrictions. But what are
those restrictions, and how can they help us? Well, let's start with the basics. A trie is
a tree like data structure whose nodes store letters of an alphabet in the form of characters.
We can carefully construct this tree of characters in such a way which allows us to quickly retrieve
words in the form of strings by traversing down a certain path of the trie. These are also sometimes
called digital trees or prefix trees by the computer science community; we'll be calling them
tries for today's lecture. Tries are used in the retrieval of data in the form of words, which is
actually where the name trie comes from, as it's smack dab in the middle of the word retrieval.
Essentially, in layman's terms, we use tries to retrieve words extremely fast by traversing
down a tree of stored characters. This is hard to explain without actually showing you, so let's do
an example of what a trie might actually look like. Now, just like with a tree, every trie is going
to start with a root node only in our case, that root node will be empty with either some null
value or a blank string. Now also stored within this node will be a set of references, all stored
within an array. These references are set to null at the beginning of the trie's existence, but can
be slowly filled with references to other nodes. For example, let's say we were creating a trie
to store words that start with the letter D. The root node would contain an array which
contains a reference to a node containing the character D, signifying the start
of our word. Now imagine for a second that this D node also has an array
containing references. Only this time, it contains references that point towards all
characters in the English alphabet which serve as the first two characters of a word in the
English dictionary. So da serves as the first two characters in a lot of English words, such as dab,
or day, or dad if you really want to go that far. And so the array contained within the D node would
hold a reference to an A node. Now, since there are no English words which start with db that I
know of, we wouldn't have a reference to a B node, since we know we will never have to retrieve
a word from our trie which starts with db. We continue this process for all letters in the
alphabet, only including references to characters which serve as the first two characters in an
English word. But that is a lot to visualize. So for now, let's just put up on the screen two
nodes that this first D node would point towards. More specifically, the node we've already added,
the A node, as well as the E node. Now, to continue building our trie, we would simply repeat this
process for all of the nodes that this D node points towards. So we'd store pointers in the A
node, which would point towards characters in the English alphabet that serve as the first three
characters of a word in the English dictionary. So you'd have a B, a D, a Y, and many, many more
nodes as well. But we're just going to stick with those three for now. For E, you'd have pointers
which point towards nodes containing characters like N and W. Obviously, there'd be more nodes
than the one shown on your screen right now for all levels of nodes. But what we've
created here is a very simple trie. As you can see, it's exactly like I said it was
in the beginning, a tree like data structure, which stores characters that can be used
to make words by traversing down paths. As we traveled down different paths of
our tree, we can make different words, dab, Dad day down the one path,
and then and do down the other. By following the nodes downwards, we
can easily build strings of words, which you'll learn towards the end of
the episode can be extremely useful. We can even take this process further by having
one path contain multiple strings. For example, if we isolate the den path, we could continue the
process to go on to words like dense, or Denver, or dent; the list goes on and on. We wouldn't have
to just stop at the word den, since den is also a prefix for numerous other words, and that's what
makes storing data in character format so useful. One path down a trie can represent multiple words
depending on where you stop along the process. This does provide a bit of a problem
for computer scientists though, how do we know when a word has ended? Let's take
the Denver path as an example. If we wanted to retrieve the word den from this trie, how would
we know to stop at the N node and not continue along to Denver? Well, there's actually a pretty
simple fix to this, and it's known as flagging. We simply mark the end of the word by having it
also point towards a flag to let the computer know that the end of a word has occurred. So in our
Denver example, not only would the N node's array contain pointers to whichever characters follow
it, it would also contain a pointer to a node with a flag to tell the computer to stop there.
In this case, I've just chosen a period as a flag. This way, we can construct tries in such a
way where each word is marked by an ending point, and the different words that
may branch off from that prefix can continue to use that word as a prefix in whichever
retrieval program ends up getting used.
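To tie the whole idea together, here is a small Python sketch of a trie with insertion, word lookup, and an end-of-word flag standing in for the period. A Python dict stands in for the array of references, and none of this comes from a specific library; it's just an illustration.

```python
class TrieNode:
    def __init__(self):
        self.children = {}           # maps a character to the next TrieNode
        self.is_end_of_word = False  # the "flag" marking where a word stops

class Trie:
    def __init__(self):
        self.root = TrieNode()       # the empty root node

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end_of_word = True   # plant the flag at the last character

    def contains(self, word):
        node = self.root
        for ch in word:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return node.is_end_of_word   # den counts only if it was flagged

trie = Trie()
for w in ("dab", "dad", "day", "den", "dew", "denver"):
    trie.insert(w)

print(trie.contains("den"))      # True: the flag stops us at the n node
print(trie.contains("denv"))     # False: it's only a prefix, not a flagged word
```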
Okay, so now it's time to talk about the general use cases of a trie. If you remember back to the beginning of this
segment, I said that the use cases for a trie were limited, but extremely effective. And now we're
going to talk about what that statement means. Now, have you ever used the autocomplete
feature on your iPhone or spellcheck on a Google Doc? Because if so, you have already
experienced the overwhelming power of tries, as they are used for both of
these extremely useful features. This mainly has to do with the fact that
for big programs like iOS or Google Docs, they're not just storing a trie containing a
few words, or even all the words starting with a certain letter like we tried to replicate;
they're storing the entire English dictionary. Storing the entire dictionary in a trie seems
like a tall task. But it can and has been done. Each node would simply have 26 nodes connected
to it. And those 26 would have as many nodes connected to them as needed, and so on. And so
now I can hear you asking in the comments: how does this help us with autocomplete and
spellcheck? Well, let's say you're typing out a word. For the sake of this video, let's
say it's the word subscribe. You start with the S. At this point, the computer has already
eliminated 95% of the words that you could be typing. We know it's not going to be a word which starts
with N, or with A, or O, or H, or L, etc., etc. Then you type the letter U, and bam, another
95% of possible options get deleted. With each character you type, you eliminate millions of
possible words that you could be working towards. Using this fact, as well as popularity data,
which could also be stored within the node, the computer can start making educated guesses
using that information, as well as context from previous words, to make a suggestion as to what
word you're trying to type out. The AI that works this feature is immensely complicated, but it
is backed by the trie data structure, and helps autocomplete work easily on your phone. As for
spellcheck, well when you spell a word wrong, a spell checking algorithm can use the root of
your word and compare it against trie data to make an educated guess as to what you were trying to
spell. Slightly misspelled words will usually have a similar path from the root of the trie. If
you accidentally type chockalotta instead of chocolate, the computer can take the first few
characters of the word you typed incorrectly and see where you may have gone wrong. Basically,
by comparing the misspelled word to certain paths of the dictionary trie, it can pretty accurately
detect which word you were meaning to say and correct you accordingly. So there you
have it: the trie. Like I said, it is often underrated in the computer science world since
it can only be used in certain situations. But as I hope this segment has shown you, those
situations are still important nonetheless. We are nearing the end of our introduction to data
structures series. Now, only two more left before we part ways. So stay tuned, because now we're
going to talk about heaps. Now back in our segment on trees, we talked about the binary search tree,
a special type of tree which has a few properties. Each node can only have two children, the child
to the left must have a value less than the parent node, and the child to the right must have a value
greater than the parent node. And also no two nodes could contain the same value. Well, a heap
is sort of like this, but a little bit different. More specifically, by definition, a heap is a
special tree where all parent nodes compare to their children nodes in some specific way,
by being either more extreme or less extreme, i.e. greater than or less than. This specific way
determines where the data is stored, and is usually dependent on the root node's value. There are two
different methodologies generally used in computer science, and they are known as min heaps and max
heaps. In a min heap, the value at the root node of the tree must be the minimum amongst all of
its children, and this fact must be recursively the same for any and all parent nodes contained
within the heap. So each parent node must have a value lower than all of its children nodes. As
you can see from our example on the screen now, 10 is the root node and also the minimum value in
the tree. In addition to this fact, if we pick any parent node on the tree, and look at its children
and their children and so on, the parent node will have the lowest value of them all. Take 14 for
example: its value is less than 26, 31, 44, 35, and 33. This must be the case for every single
subtree which is contained within the heap. Max heaps, on the other hand, are
the exact opposite. In a max heap, the value at the root node of the tree must be
the maximum amongst all of its children. And this fact must be recursively the same for any
and all parent nodes contained within the heap. If you take a look at the example max
heap we have on the screen now, again, you'll see that this is the case 44 is the root
node and also the largest value within the heap. If you take a look at, say, the sub
tree which is parented by the 35 node, you'll see that 35 is the maximum value amongst all
nodes in that subtree, greater than both 19 and 27. When we store data like this in a heap, whether
that be a min heap or a max heap, it makes it extremely easy to insert or remove from. This
lends itself to a lot of useful implementations within computer science. Now to show you
this, I'd like to build an example max heap, the one which has the greatest integer stored
in the root node. For the sake of keeping things simple, let's pull up an array of seven
elements with integers ranging from zero to 100, and simply convert it into a heap one element
at a time. This can be done in a few easy steps. Step one is we add the first integer in as the
root node. So 70 would get inserted into our tree as the root node. Then, for step two, we add another node
to the tree in the bottom-leftmost position available. So we would first insert a node as the
child of the root node to the left; for our heap, this means adding the integer 4 as a child of
the 70 node. The final step is to recursively go up the heap and swap nodes if necessary. Now, when
we say if necessary, we mean that if the node we just added is more extreme, either greater
than or less than the node above it, depending on the type of heap that we're trying to create,
we swap them to maintain order amongst the heap. Since we're building a max heap, and 70 is greater
than four, no swaps are necessary in this case. Now we just simply repeat steps two
and three until we've built our heap. So next, we want to add the integer 90. And since
the leftmost slot on the tree is already taken by our four node, the right slot ends up being the
next available location on our tree. So we insert it there. And then, since 90 is greater than 70, we
actually end up swapping the 90 and 70 nodes. Doing this helps keep the max heap property
intact, where every level is greater than the one below it. Let's keep going. Next, we add 45
to the heap. Now, since we've run out of space, on the second level, we move on to the third
level and add 45 as a child of the four node. Then we compare it to four; since 45 is greater,
we swap the nodes. Now we have to compare this node to the node above it
again. Because remember, we recursively go up the tree until we reach the root or don't
need to swap. Now 45 is not greater than 90, so it stays put where it's at. Next up,
we add 23 as another child of the 45 node, only this time on the right side, we compare and
since it's not greater than 45, we keep it as is. Moving on, we next insert 76 into
the tree as a child of the 70 node, then we compare and swap the 76 and 70
nodes, as 76 is indeed greater than 70. We then compare 76 and 90, and since 90 is greater
than 76, we keep the 76 node in place for now. The next node we add is the 100 node. We compare
it to the 76 node and see that it's greater, so it gets swapped. And then we compare it
again, this time to the 90 node. And since it's also greater than that integer, it gets
swapped yet again to become the root node, signifying it is the greatest node in the heap.
As you can see, it's a pretty simple concept, you add a node and then keep trying to swap up until
the node you've added is in its rightful place.
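Here's a rough Python sketch of that insert-then-swap-up process for a max heap. It uses the common trick of storing the tree in a flat list, where the parent of the node at index i lives at index (i - 1) // 2; the on-screen example builds the heap as a drawing of nodes, so treat this as an illustration, not the exact visual.

```python
class MaxHeap:
    def __init__(self):
        self.items = []                      # the tree, level by level, left to right

    def insert(self, value):
        self.items.append(value)             # step 2: add in the bottom-leftmost open spot
        i = len(self.items) - 1
        parent = (i - 1) // 2
        # step 3: recursively swap upwards while we're greater than our parent
        while i > 0 and self.items[i] > self.items[parent]:
            self.items[i], self.items[parent] = self.items[parent], self.items[i]
            i, parent = parent, (parent - 1) // 2

heap = MaxHeap()
for value in (70, 4, 90, 45, 23, 76, 100):   # the seven integers from the example
    heap.insert(value)
print(heap.items[0])   # -> 100, the greatest integer ends up at the root
```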
Deleting from a heap is also pretty simple, at least in our case. This is because the type of deletion I want
to talk about is just the process of removing the root node from the heap. And you'll see why
later on. To delete the root node from a heap, you follow a three step process. Step one is
actually removing the root node from our heap. So in our case, we delete the 100 node. What
you do with it is up to the programmer: you can return it back to the user, store it somewhere
else, print it out, etc., etc. Either way, step two is then replacing it with the node furthest to
the right, in this case the 76 node. Finally, for step three, we do what's known as
a heapify to fix up the heap. We start with the root node and compare it to its children to see
if we need to swap any values. So for the 76 node, we compare it to its two child nodes. And since
76 is less than 90, we end up swapping those two. Then we wash, rinse, and repeat for every subtree
that we have. So on the right side, we swapped 90 with 76, but since 76 remains the greatest
integer on that particular subtree, it stays in the same spot. On the left side, we didn't change
anything, but 45 is still the greatest integer amongst the three nodes in that subtree. So no
swapping is necessary on that branch of the heap. And so we've completed our heapify, and
all heap properties are still intact.
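Continuing the sketch from above, removing the root and heapifying back down might look something like this, again assuming the flat-list layout, where the children of the node at index i live at indexes 2*i + 1 and 2*i + 2.

```python
def remove_root(heap):
    """Pop the maximum value off a MaxHeap and repair the heap."""
    items = heap.items
    root = items[0]
    items[0] = items[-1]        # step 2: move the furthest-right node up to the root
    items.pop()
    i = 0
    while True:                 # step 3: heapify, sifting the new root back down
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < len(items) and items[left] > items[largest]:
            largest = left
        if right < len(items) and items[right] > items[largest]:
            largest = right
        if largest == i:        # both children are smaller: the heap is fixed
            return root
        items[i], items[largest] = items[largest], items[i]
        i = largest

print(remove_root(heap))        # -> 100, and the heap property is restored underneath
```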
That is inserting and deleting nodes in a nutshell. Now let's talk about how we can
use this to our advantage. Heaps are most
commonly used in the implementation of heapsort. Heapsort is a sorting algorithm which takes
in a list of elements, builds them into a min or max heap, and then removes the root
node continuously to make a sorted list. Because heaps always start with the minimum
or maximum value contained within them at the root node, we're able to simply just
remove the root node over and over again, re-heapifying the data structure after every pass.
And after every element has been removed, we are left with a sorted list. On your screen, you'll
see we have an unsorted list on the left. In the middle, we've inserted all of those integers, and
in doing so created a max heap. And then finally, on the right, we have continuously removed those
elements from the root node into a now sorted list.
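As a quick sketch of that idea, using the MaxHeap and remove_root pieces from the earlier examples: repeatedly pulling the root off a max heap gives the elements in descending order, so we just reverse at the end for an ascending list.

```python
def heap_sort(values):
    heap = MaxHeap()
    for v in values:            # build the heap out of the unsorted list
        heap.insert(v)
    out = []
    while heap.items:           # keep removing the root until nothing is left
        out.append(remove_root(heap))
    return out[::-1]            # roots came out largest-first, so flip for ascending order

print(heap_sort([70, 4, 90, 45, 23, 76, 100]))
# -> [4, 23, 45, 70, 76, 90, 100]
```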
Heapsort is a really cool algorithm, and it will be part of our upcoming series on sorting algorithms, which is kicking off very
soon. So if you're interested in that, make sure you subscribe to our channel so that you don't
miss it, a link will be in the description below. Another extremely popular use of heaps is
through the implementation of priority queues. Priority queues are an advanced data structure,
which your computer uses to designate tasks and assign computing power based on how urgent a
certain matter is. Think of it like a line at the hospital. You wouldn't want your line to follow a
regular queue methodology which implements first in, first out, since then you could have patients
with extremely urgent matters like heart attacks, waiting behind people coming in for a routine
checkup. In the same way, you wouldn't want your computer to update an application before it
finishes rendering a video; otherwise your entire progress would be lost. Priority queues take care
of all the task scheduling done by your computer, and the heap data structure is used as the
backing system for it. And with that ends our discussion on heaps. To review, they are a
special tree in which each level contains nodes with values more extreme either greater than
or less than the nodes on the level above it. Next up is unfortunately the last segment in the
series on yet another tree based data structure the graph. The graph is arguably the most
dynamic data structure in our introduction to data structure series and an absolute banger
to go out on. So let's just hop right into it. Before we get into the nitty gritty details,
I first want to do a short little exercise. Visualize for a second a few of your favorite
places to eat on a map around your town. For me personally, it'd be places like Five Guys,
Chick-fil-A, Panera, Wawa, and Domino's. Now imagine for a second that you are famished, absolutely starving,
and so your plan is obviously to drive around to each of your favorite places to eat and order
an ungodly amount of food from each location. Each food location has a few possible paths
going to and from it, and so we add those to the map as well. Now you can see that what we now
have looks like a network of delicious foods; we can start anywhere, and all we have
to do is make sure to hit all five. You may not know it, but what we've essentially
done here is set up a simple graph. Basically, graphs are composed of pieces of information, like
the restaurants, and the paths that run between them. Of course, this is just generally speaking; by definition,
graphs are a nonlinear data structure consisting of nodes and edges. There are a finite set of
these nodes or vertices, which are connected by edges. Nodes and edges should be familiar
to you if you watch the segment on trees. The big difference between trees and graphs
however, is that with a tree we had a specific starting point. Sure, there were multiple paths
down the tree that branched off from the initial starting point. But you always had to begin at
the root node. In contrast, with a graph there is no specified starting point; we can begin from
any node and traverse to any node, just like how in our food example we are able to start at
any restaurant. Graphs are a huge concept, and escape the bounds of computer science, often being
used in many places you wouldn't even think of. But before we get into anything crazy,
though, like the difference between a directed or undirected graph, or a cyclic versus
an acyclic graph, let's get down the basics. Now, every graph is composed of these nodes or
vertices and the edges that connect them. Let's pull up a sample graph really quickly and
talk about it. We represent graphs visually like this a lot, because it makes it way easier
to comprehend. But notation-wise, a graph actually looks like this, which is much harder
to comprehend. So let's break it down. First, we have the vertices set, which contains a comma
separated list of all vertices within the graph. That's the simple part. Each comma separated
value simply represents a node within the graph. Then we have the edge set, which is a little bit
more complicated. Each element of the edge set is an ordered pair which describes relationships
between nodes. For example, the first one describes a relationship between the six and four
nodes. The fifth indicates a relationship between the five and two nodes, and so on. Using these
two sets, we're able to visualize a graph pretty easily by laying down both the information
and the connections that fall between them. One final thing I want to mention about the
basics of graphs is about the relationships that occur between two nodes. If we have an edge, which
connects two different vertices, they are known as adjacent to one another. So for example,
the five node would be adjacent to the four, two, and one nodes.
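Here's a small Python sketch of that vertex set / edge set notation, along with the adjacency relationships we just described. The exact graph on screen isn't reproduced here, so the sets below are made up, just kept consistent with the relationships mentioned: an edge between the six and four nodes first, a five-two edge fifth, and the five node adjacent to four, two, and one.

```python
# a graph written down as a vertex set and an edge set
vertices = {1, 2, 3, 4, 5, 6}
edges = [(6, 4), (4, 3), (3, 2), (5, 1), (5, 2), (5, 4)]

# build an adjacency view: which nodes is each node adjacent to?
adjacent = {v: set() for v in vertices}
for a, b in edges:
    adjacent[a].add(b)   # undirected, so the relationship
    adjacent[b].add(a)   # goes both ways

print(adjacent[5])   # -> {1, 2, 4}: adjacent to the four, two, and one nodes
```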
Okay, now that we have the basics down, I now want to jump into the different attributes that a particular graph might
have, starting with directed versus undirected. An undirected graph is one in which the direction
you traverse the nodes isn't important. This is most prominently indicated by a lack of arrows
pointing to specific nodes. Such was the case with our first example graph, or even the food
example from the beginning of the episode. We can hop between nodes or even go back
and forth between them without problem. A good way to visualize undirected graphs
is like a network of friends on Facebook, where each edge indicates
a friendship, if you will. Because when somebody accepts
your friend request on Facebook, you are both added to each other's friends lists, the friendship
goes both ways and direction is unimportant. In contrast, a directed graph is one in which the
direction you traverse the nodes is important. This is usually indicated by arrows representing
which nodes a certain node is able to traverse to. The edges could point both ways but they don't
have to. It's very possible the edge only points one way. A good way to visualize directed graphs
is by thinking of them as a network of friends on Instagram. Sure, I can follow famous celebrity
Will Smith, but the odds that he follows me back are fairly low. And so in that case, the
relationship only goes one way. Undirected and directed graphs both have their uses. As
we discussed with the social media example, both provide different functionality which will be
useful to you and your computer science journey. Just like the next type of property a
graph can be classified as either cyclic or a cyclic acyclic graph is one which contains
a path from at least one node back to itself. So you can see by the example on your screen now
that the four node leads to the three node which leads to the two node which leads to the one node,
which finally leads back to the four node, forming a cycle. Essentially, we're able to follow at
least one path that eventually leads back to our starting point. A small thing to note here is that
all undirected graphs end up being cyclic: the bidirectional nature of edges within undirected
graphs theoretically forms a cycle between any two nodes, so judging by that logic, all undirected
graphs end up being cyclic. An acyclic graph is one which contains no path from any one node which
leads back in on itself. This property can really only apply to directed graphs, like we mentioned
previously. Essentially, this means that for any given node, there is no path which will eventually
lead back to itself. Undirected, directed, cyclic, and acyclic are all properties we can use
to classify types of graphs based on their nodes. But the last property I want to talk about
actually applies to the edges of a graph instead, and it's the process of weighting. Weighting the edges
of a graph means associating a numerical value with each edge, often called the cost of that
edge. Each weight represents some property of the information you're trying to convey. For example,
again, going back to our food location scenario, since the information that we're trying to convey
is a good route which takes us to each location, a good weight for our edges in that scenario
could be the distance between nodes. This comes in handy a lot, especially with navigation,
such as the case with our restaurant example, as we of course always want to find the path of
least cost or weight between the different nodes. So there are the major properties of a graph
that the different nodes and edges can have: directed or undirected, cyclic or acyclic, and
weighted or unweighted. There are a couple more obscure ones out there, but those six are what
we will be covering today. Combining these three properties together leaves us with numerous types
of graphs, which all have different strengths and weaknesses. It would take a while to talk about
each of these and their implementations. So for now, I'll just pick out three types of
graphs which are used in the most popular cases. Now probably the most famous
implementation of the graph data structure is through the undirected
cyclic graph with weighted edges. This one gets a lot of use, especially through its
implementation in Dijkstra's shortest path algorithm. This algorithm, given a graph and a source
vertex within that graph, compiles a list of the shortest possible paths from that source
vertex to all other nodes. As you might be able to tell just from its description, this has a
multitude of uses across the entire tech world. Google uses this algorithm for Google Maps.
It's used in the process of IP routing, and it can even be implemented in telephone networks.
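As a rough sketch of how Dijkstra's shortest path algorithm can be written over a weighted graph stored as an adjacency dictionary; the distances below are made up for the food example, and a heap-backed priority queue from Python's standard heapq module handles picking the cheapest node next.

```python
import heapq

def dijkstra(graph, source):
    """Return the least total edge weight from source to every reachable node."""
    distances = {source: 0}
    queue = [(0, source)]                      # priority queue of (cost so far, node)
    while queue:
        cost, node = heapq.heappop(queue)      # always expand the cheapest node next
        if cost > distances.get(node, float("inf")):
            continue                           # stale entry: a shorter path was already found
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < distances.get(neighbor, float("inf")):
                distances[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return distances

# made-up driving times, in minutes, between a few of the food spots
graph = {
    "Five Guys": {"Panera": 4, "Wawa": 7},
    "Panera":    {"Five Guys": 4, "Wawa": 2, "Domino's": 5},
    "Wawa":      {"Five Guys": 7, "Panera": 2, "Domino's": 1},
    "Domino's":  {"Panera": 5, "Wawa": 1},
}
print(dijkstra(graph, "Five Guys"))
# -> Five Guys 0, Panera 4, Wawa 6, Domino's 7
```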
Another type of graph, which you probably use quite often, is the unweighted
graph, both undirected and directed, as these make up the follower systems of a
majority of social media websites. We already talked about these in the cases of Facebook,
which would use a cyclic representation, as well as Instagram, which would use an acyclic
representation. However, this example encapsulates much more than just those two. Snapchat, Twitter,
TikTok even, all of these platforms can represent your follower and following base through a graph,
and oftentimes do. Facebook even has a Graph API, which you can use to interact with the models that
they use to illustrate each user's web of friends. As you can see, graphs, in their many different
forms, provide a lot of the functionality that you interact with in everyday life, contributing
to almost any facet of the internet. And with that concludes our discussion on graphs.
As a review, a graph is a data structure which is represented by a set of nodes and edges. These
come together to form a visualization of data, whether that data be food locations on a map or
friends on social media. The different types of graphs provide a multitude of implementations
in computer science. And with that concludes our introduction to data structure series. After
around three hours and 12 data structures, you should now have the basic knowledge to take
with you as you continue along in your computer science journey. And if you've made it this long,
you obviously liked what you saw. So stick around for an extra 30 seconds while I pitch to you
why you should check out my personal channel. Now, I introduced myself at the beginning of the
series, but again, my name is Steven. I'm an 18 year old computer science student who runs a
YouTube channel called nullpointerexception with one of my good friends, Shawn. We've been producing
content like this seriously for about four months now. And I've grown a good audience of around 800
subscribers, which we couldn't be happier about. If you'd like to join us, you can check out
our channel linked in the description below and just try out a few of our other
videos. And if you like what you see, you can subscribe. That ends my pitch and with
that I have nothing else to say. I hope you have enjoyed this series as much as I've enjoyed
making it. Thank you so much for watching.