Data Structures - Computer Science Course for Beginners

Captions
Hello, everyone, and welcome to an introduction to data structures. My name is Steven, and over the next three hours we'll be tackling the topic of data structures in relation to computer science. More specifically, we'll be talking about what they are, going over some different types of data structures, and discussing how we can use each of them effectively, with examples. This course will be a general overview of data structures, so we won't be confining ourselves to one specific programming language. However, this series will require you to have a basic understanding of programming. If you are completely new to computer science, I would recommend our introduction to programming series, which will be linked in the description below. That video will give you all the information you need to traverse these topics comfortably.

Now, we're almost ready to hop into things. However, there are a few housekeeping items that I want to go over beforehand for those wishing to enhance their learning experience. If you're not interested in that and want to skip right to the content, go to the time shown on your screen now. For those of you staying, though, there are just a few things I would like to mention.

Firstly, in the description you will find timestamps for each major topic covered in this video, as well as timestamps taking you to every smaller section contained within those topics. So please feel free to skip around if you're already comfortable with one topic or only interested in a certain data structure. Next, we've decided to include the script and visuals used for the series in the description below as well. That way you can follow along if you like, or simply read the script if my soothing voice isn't your style. Additionally, for each segment I'll be including the references and research materials used to write the script for that topic. That way, if you ever feel as if I haven't explained a topic well enough, or would simply like more information about a certain topic, you'll have a ready-made list of articles and websites which you can use to supplement this series. If you have any questions throughout the series, about maybe something I said or a certain visual, please ask them in the comments below. I'll try to be as responsive as possible for the first few weeks on questions you guys may have about the series.

And finally, in terms of housekeeping, I'd just like to say: if you're not already subscribed to our channel, Null Pointer Exception, then consider doing so if you enjoy this type of content, as my co-host Sean and I regularly post videos in this style. We're trying to hit 1,000 subscribers before the fall semester of college begins, so check us out if you end up enjoying the video. With my shameless plug out of the way, we're finally ready to tackle the topic of data structures.

In this introduction segment, we'll simply cover a general overview of what data structures are, and then go over what type of information will be covered throughout the duration of the series. So the obvious question to start with is: what exactly is a data structure? Well, in computer science, a data structure is a way to store, organize, and manage information, or data, in a way that allows you, the programmer, to easily access or modify the values within it.
Essentially, it's a way for us to store a set of related information that we can use in our programs. Data structures, and the algorithms used to interact with, modify, and search through them, provide the backbone for many of the programs that you'll end up writing; I can almost guarantee that in 99% of your programs, a data structure will be involved. Each of the data structures that I'll be talking about is designed for the sole purpose of storing information and allowing the end user to access and manipulate that information in an efficient, effective way, but each one differs in how it accomplishes this.

If you have a basic understanding of programming, you probably know about a few different data structures already, such as the array and the ArrayList, also known as the list in Python. But if you're going to be pursuing a career in computer science, just knowing these two is definitely not going to cut it. While basic data structures such as the array are used frequently by companies throughout their code, more advanced data structures are being put to use all around you: the undo/redo button in Google Docs, Google Maps on your phone, even the autocomplete feature in your text messages all require the use of more advanced data structures, which makes them extremely useful to learn and comprehend.

Okay. Now, I can't see you per se, but I can just tell that you're jumping up and down in your seat, hooked on data structures and how useful they can be. But before we jump too far into things, let's discuss the information that we'll be covering in this series. Before we talk about any specific data structures, we're going to have a brief talk about efficiency. We'll discuss the metrics used to judge the speed and efficiency of different data structures, which will then give you a better understanding of why one data structure might be used over another.

From there, we'll start by diving headfirst into what I believe are the basic data structures: the array and the ArrayList. While you may already have a good understanding of what these are, I would highly suggest that you still watch these segments, because we'll be going into a little more depth as to why they're so useful based on differences in how they're stored in the computer's memory. After that, we'll move on to what I'll call the intermediate data structures. These are a little more complicated than the basics and have a few special attributes which make them stand out from the rest. We'll begin by taking a look at stacks, which are the backbone for all recursive processes in your computer. Then we'll look at the antithesis of a stack: the queue. Moving on from there, we'll be covering linked lists and their evolved form, the doubly linked list, before moving on to the final of our intermediate data structures, the dictionary, which includes a mini lesson on hash tables. Then we'll wrap up the series talking about trees and tree-based data structures, less linear and more abstract data structures, beginning with the tree itself. We'll then move on to tries, a very useful data structure for a lot of word-processing algorithms, and finally end the series with a discussion on heaps and graphs.
So in total, I'll be taking you through four different segments containing 12 of the most common data structures in practical use, as well as providing examples of where and when to use them: section one on efficiency, section two on basic data structures, section three on intermediate data structures, and wrapping it up with section four on tree-based data structures. With this knowledge, you'll be able to take charge of your programming career and hopefully gain a competitive edge by implementing them when needed. Like I said, however, before we jump straight into the different ways to store information, I'd like to have a quick discussion on how we score the efficiency of these data structures using what is known as Big O notation. So let's jump right in.

Okay, so before we talk about all these crazy data structures like maps and heaps, we want a quantifiable way to measure how efficient a certain data structure is at the different tasks we might ask of it. If we're going to be storing extremely large amounts of data, being able to search through, modify, or access the information within a data structure needs to be fast and efficient. As we briefly mentioned before, the industry standard for this kind of measurement is Big O notation. So what exactly is Big O notation, and how do we measure it for a specific data structure? Well, that's what we'll be covering in this segment. For most of the basic and intermediate data structures in this series, we're going to spend some time talking about their efficiency using Big O notation, so this is definitely a topic you're not going to want to skip. With that being said, let's begin.

Because there are so many different ways to store information, like we talked about in the previous segment, programmers have developed the idea of Big O notation as a way to basically score a data structure based on a few different criteria. These criteria can change based on who you ask, but for the purposes of this video we will be using four criteria representing the most common functions you might want from a data structure: the ability to access a specific element within the data structure, search for a particular element within the data structure, insert an element into the data structure, and remove an element from the data structure.

Now, by measuring how efficiently a certain data structure can do these four things, we can basically create a report card of sorts which measures how efficient that data structure is. At a glance, this gives us a pretty good overview of what a certain data structure is good at and what it is bad at, and it can help us decide which one to use. If we need to store data that is easily accessible to the end user, for example, we might choose a data structure which can access elements the quickest. And vice versa: if accessing elements isn't the most important thing to us, but we need a data structure which can be easily added to and deleted from, we would go for one which is most efficient at that specific functionality. By looking at a data structure's report card, if you will, we can get a quick sneak peek at what it's good at and what it's bad at.

Now, if we're using Big O notation to basically create a report card, like I said, there must be some way to actually grade each of these functions.
And there is: the four criteria mentioned, accessing, searching, inserting, and deleting, are all scored using Big O notation time complexity equations.

Now, what is a Big O notation time complexity equation? Well, I'm glad you asked. Besides being an annoyingly long phrase, a Big O notation time complexity equation, or just time complexity equation, works by inserting the size of the data set as an integer n and returning the number of operations that need to be conducted by the computer before the function can finish. The integer n is simply the size, or amount of elements, contained within the data set. So, for example, if we have an array (one of the data structures we'll get into soon) with a size of 10, we would place 10 into the different efficiency equations for accessing, searching, inserting, and deleting that represent the array, and returned back to us would be the number of operations that need to be conducted by the computer before that function completes. It does get a little more complicated than this, but for the sake of keeping this series simple, all you need to know is that these equations help represent efficiency amongst different data structures.

Also, an important thing to note here is that we always use the worst-case scenario when judging these data structures. This is because we always want to prepare for the worst and know which data structures are going to be able to perform under the worst conditions. You'll see what this means in greater detail a little later on, once we start putting this into practice. But for now, just keep in mind that when judging data structures, we always use the worst-case scenario.

Moving on, the reason that it's called Big O notation is because the syntax for these particular equations includes a big O and then a set of parentheses. Inside these parentheses is some function which will correctly return the number of operations that need to be run by the computer. So, for example, let's say we have a fake function; it can really be anything, its purpose in this case is irrelevant. Now, for this fake function, let's say its time complexity equation was the one shown on your screen now. We would pronounce this time complexity equation as "O of 2", written O(2), meaning it takes two operations from the computer before our make-believe function can finish. If the time complexity equation had a five in the parentheses instead, it would be O(5), and so on.

Now, these are examples of time complexity equations which run in constant time, meaning that no matter the size of our data set, the task will always take the same number of instructions to run. This is usually not going to be the case, though, when it comes to the four criteria we want to measure for our data structures. Most of the time our integer n, which again contains the amount of elements within the data set, is going to have some adverse effect on how many operations it takes to, say, search through our data structure. Which makes sense: a larger data set means it's going to take more operations from the computer to search through that entire data set.

Now, we score these four functionalities in number of operations performed, because measuring by how long the function takes to run would be silly. If we measured by a metric such as time taken to completion, our results would be highly biased by the hardware used to run the function. A supercomputer used by Google is obviously going to be able to search through a data structure much faster than a laptop. Time complexity equations essentially level the playing field by instead returning the number of operations, eliminating the bias in processing power that exists.
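To make the "count operations, not seconds" idea concrete, here's a minimal Java sketch (the data and target are made up for illustration): a simple linear scan whose worst-case comparison count equals the size of the data set, no matter what machine it runs on.

```java
// Hypothetical sketch: counting operations instead of wall-clock time.
// A linear scan over n elements performs, in the worst case, n comparisons,
// whether it runs on a supercomputer or a laptop.
public class OperationCounting {
    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        int target = 42;
        int operations = 0;           // our stand-in for "work done"

        for (int value : data) {
            operations++;             // one comparison per element examined
            if (value == target) {
                break;
            }
        }
        // Worst case (target at the end, or absent): operations == data.length,
        // which is exactly what O(n) expresses.
        System.out.println("Comparisons performed: " + operations);
    }
}
```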
So, to sum up everything that we've learned so far: we measure the efficiency, or speed, of a data structure based on how well it can perform four basic tasks, those being accessing, searching for, inserting, and deleting elements within itself. Each of these criteria is modeled by an equation which takes in the size of the data structure, in number of elements, and returns back the number of operations that need to be performed by the computer to complete that task. By measuring these four metrics, we can get a pretty good understanding of what the data structure is good at and what the data structure is bad at.

Now, it's important to note that this isn't the end-all be-all for deciding which data structure to use in your program. As you'll see as this video progresses, many of the data structures are built with specific quirks or features which separate them from the rest and provide additional functionality. Big O notation is incredibly useful, and something that you should definitely take into consideration when determining which data structure to implement in your program, but it should not be the only thing that you use.

Cool. Now that we know a little bit about how we measure the efficiency of a data structure using time complexity equations, along with the four criteria actually used to measure them, let's dive straight into what the actual equations mean in terms of efficiency. That way, when we start grading data structures, you'll have a good idea as to how good or bad each one is at a particular metric. Basically, we're going to cover the six most common time complexity equations, from most efficient to least efficient.

Now, the absolute best that a data structure can score on each criterion is a time complexity equation of O(1). This essentially means that no matter what the size of your data set is, the task will be completed in a single step. If your data set has one element, the computer will finish the task in one step. If your data set has 100 elements: one step. One million elements: one step. 800 quadrillion elements: it does not matter, the computer will finish the task in a single step. This is why, when we look at the graph of volume of data versus instructions required for O(1), the line remains constant at one. No matter the volume of data stored within the data structure, the computer can complete the task in a single instruction. Whether it be accessing, searching, inserting, or deleting, O(1) is the gold standard, absolute best, top-of-its-class time complexity equation. It is basically the Michael Jordan of Big O notation when it comes to time complexity equations.

Now, the next fastest type of time complexity equation is O(log n). While not as fast as instantaneous time, a function having a time complexity of O(log n) will still provide you with very fast completion. If you don't fully understand logarithms, just know that this efficiency is slower than constant time O(1) and faster than the next level of efficiency, known as O(n).
Additionally, because the volume-of-data versus time graph follows a logarithmic curve, the larger the data set that you use, the more bang for your buck you're going to get. Basically, as the amount of data you perform one of our four criteria on increases, the rate of change of the amount of operations it takes to complete that task decreases. So, as you can see at the beginning of the graph here, when we increase the volume of data, our number of operations shoots up; but when we do the same for larger and larger sets of data, our number of operations increases much more slowly than before.

A good example of a non-data-structures function which has a time complexity of O(log n) is the binary search. If you know what a binary search is, then you'll understand how it is that when the data set gets larger, the number of instructions needed to find the item you're searching for doesn't increase at the same rate. If you don't know what a binary search is but would like to, in order to better understand O(log n), a link in the description below will take you to that point in our introduction to programming series where you can learn about it. If not, let's move on to the next equation.

O(n) is the next common time complexity equation that's going to come up frequently during this lecture. The graph of volume of data versus instructions needed for this function is linear, meaning that for every element you add to the data set, the amount of instructions needed to complete that function increases by the same amount. So, to perform a function with a time complexity of O(n) on a data set with 10 elements, it will take 10 instructions; 50 elements will take 50 instructions; 1,000 elements, 1,000 instructions; and so on. O(n) is really the last good time complexity equation that exists. Anything above this is considered inefficient and not very practical when it comes to data structures in computer science.

The next type of equation that will come up is O(n log n). This equation is the first that is relatively bad in terms of efficiency. The graph of volume versus operations shows a roughly linear but steepening curve, meaning that, unlike O(log n), it won't get better in terms of efficiency as the size of the data set increases. Instead, the slope actually increases as the volume of data does.

The last two types of equation are O(n²) and O(2ⁿ). These are both incredibly inefficient equations which should be avoided if at all possible. Because they grow quadratically and exponentially, respectively, which can be seen from their graphs of volume versus operations, the larger the data set that you use, the more inefficient they become. While there are more time complexity equations out there, such as O(n!), which is even worse than the two I just mentioned, the data structures we'll be talking about will never have time complexity equations outside of the six we've just covered, so we're going to stop there.

Now, the final thing I want to say is just a reiteration of something I said before: these time complexity equations are not the only metric you should use to gauge which data structure to pick. As we get deeper and deeper into this series, you'll see that we might have some data structures which don't seem that efficient at all based on their time complexity equations, but which provide some other functionality or feature that makes them extremely useful for programmers.
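Since the binary search came up as the textbook O(log n) example, here is a minimal, hedged sketch of it in Java; the array contents are made up, and it assumes the data is already sorted. Each pass throws away half of the remaining elements, which is exactly the behavior the logarithmic curve describes.

```java
// Minimal sketch of a binary search (requires a SORTED array).
public class BinarySearchDemo {
    static int binarySearch(int[] sorted, int target) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = (low + high) / 2;       // look at the middle element
            if (sorted[mid] == target) {
                return mid;                   // found it: return its index
            } else if (sorted[mid] < target) {
                low = mid + 1;                // discard the lower half
            } else {
                high = mid - 1;               // discard the upper half
            }
        }
        return -1;                            // not present
    }

    public static void main(String[] args) {
        int[] sorted = {2, 5, 8, 12, 16, 23, 38, 56, 72, 91};
        System.out.println(binarySearch(sorted, 23)); // prints 5
    }
}
```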
Okay, now that we have some knowledge on how we actually grade these data structures in terms of efficiency, let's hop into our very first data structure. The first data structure we're going to be covering is the array.

Now, I understand the array is a pretty common data structure taught in most programming classes, and that many of you might already know about the properties and uses of an array. But in this segment we'll be going into more detail on the array as a data structure, including its time complexity equations, storage methods, and more. So check the description for a timestamp to skip ahead to that section if you already know the basics. If you're not as versed in the array, though, or just want a general review, then stick around, because right now we'll be discussing the basics of an array.

An array is fundamentally a list of similar values grouped together in a central location. Which makes sense, because if you remember, we use data structures to store sets of related information so that we can easily use that information. Arrays can be used to store basically anything: usernames for an app, high scores for a video game, prices for an online shop, pretty much any list of values which are fundamentally similar, meaning of the same type. So integers, strings, floats, objects; the list goes on and on. Any primitive or advanced data type you can think of can be stored within an array. Every item contained within the array is referred to as an element of that array, and we call the collective total of elements the array. This is going to be true for almost all data structures we talk about, by the way, so just keep that in mind.

An array also has three attributes associated with it: the first being a name for the array, the second a type for the array, and the third a size for the array. Let's start with the name. The name, if you couldn't tell, is simply a name for the array which can be used to reference and interact with it. Say, for example, we had an array called names, which contains a list of company employees, and we also had an array called salaries, which contains a list of salaries for those employees. If we wanted to search through the list of salaries, we would do so by referencing the salaries array with its name. This way the computer knows that you want to search through that specific array containing salaries, and not the names array. This works vice versa as well, of course, if we wanted to search through the names array instead.

Now, another thing to note really quickly while we have this up is that these two arrays are also an example of parallel arrays. Parallel arrays are two or more arrays containing the same number of elements, with the property that each element in one array is related to the element in the other array, or arrays, at the same position. So from our example, we can say that the first element in our names array, John Smith, corresponds to the first element in the salaries array, the integer 10,000. This would mean that John Smith has a salary of 10,000. This is really useful because, as you'll see later on, we can't store different types of variables in the same array. We couldn't have the salary integers be stored in the same place as the name strings, which makes parallel arrays extremely handy for storing different types of data about the same entity.
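As a quick, hedged sketch of that idea in Java (the names and salary figures beyond John Smith are invented for illustration), the two arrays line up index for index:

```java
// Parallel arrays: the element at index i in one array corresponds to the
// element at index i in the other.
public class ParallelArrays {
    public static void main(String[] args) {
        String[] names    = {"John Smith", "Jane Doe", "Bob Lee"};
        int[]    salaries = {10000, 12000, 9500};

        // Same index, same employee: names[0] goes with salaries[0], and so on.
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + " earns " + salaries[i]);
        }
    }
}
```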
Alright, tangent over, let's get back to the attributes of arrays, and the second one we'll be talking about: the array's type. An array's type is simply what type of information is stored, or will be stored, within that array. One of the biggest stipulations about an array is that it has to hold all the same type of information. So you cannot have an array containing both integers and strings; it would have to be either an array of only integers or only strings. This goes for any type of information you would want to store within the array, meaning every element, or spot, within the array must be of the same type.

The third and final attribute that an array has is a size. This size is a set integer that is fixed upon the creation of the array. It represents the total amount of elements that can be stored within the array. An array's size cannot be changed. I'll repeat that again because it's extremely important: an array's size cannot be changed once created. Once you write the line of code which instantiates an array, you either give it a defined size or fill it with elements until it has a defined size, and at that point the array's size cannot be changed by conventional methods. This is actually pretty rare for a data structure, and it may seem really counterintuitive and useless, but later on, when we talk about an array's efficiency, you'll see that this is for a specific reason and not just to be annoying to the programmer. So there you have it, the three attributes of an array: a name to reference it, a type to fill it, and a size to control it.

Let's now shift focus and talk about how we actually define an array ourselves, how we reference and change values within an array, and then dive into the idea of two-dimensional arrays. There are actually two different ways to instantiate an array in most languages: you can either populate the array with the elements that you want contained within it right then and there, or you can set a specific size for the array and then slowly populate it later on as the program runs. We'll be discussing how to do both, starting with the first, creating the array with the elements already stored.

Defining and filling an array as soon as you create it is mainly used when you already know which values are going to be held within it. For example, let's say you're creating an array which is going to hold the salary of 10 different workers at a company, and that information is being read in from a text file. Well, you already know the 10 salaries for those workers from the text file, and so you're able to immediately populate the array when you create it. This way, you'll already have the information available to you as the program runs. The way you do this varies amongst different programming languages, but it's shown here in three different languages, Java, Python, and C#, to give you a basic gist of how it's accomplished. All three lines of code on your screen now will create an integer array of size three, containing the numbers one, two, and three.
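If you're following along without the visuals, here's roughly what the Java flavor of that on-screen example looks like (the Python and C# versions differ only in syntax details):

```java
public class ArrayLiteralDemo {
    public static void main(String[] args) {
        // Declare the array's type and name, then list the values in braces.
        int[] numbers = {1, 2, 3};           // an integer array of size three

        // The size is inferred from the values supplied, so this prints 3.
        System.out.println(numbers.length);
    }
}
```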
You can see that all of them involve the declaration of an array name, which is then set equal to the values you want to include within the array, encapsulated by some sort of brackets or braces. This isn't 100% how it's going to be for all languages, but for the most part it will follow this skeletal outline. As you can see, Java and C# require you to define the array's type, whereas Python does not, but for the most part the syntax is the same. Again, that is: name of array, set equal to a list of comma-separated values, encapsulated by brackets or braces. This also sets the size of the array for you automatically. So since we populated the array with the numbers one, two, and three, the array would instinctively have a size of three, as it now contains three integers. If we were to go back and add a fourth number where we define the array, the size would dynamically increase to four the next time we run the code.

Now, the second way in which we can instantiate an array is by setting an initial size for our array but not filling it with any elements, and then slowly populating it as the program runs. This would be used in the case that you don't actually know what information you're going to store in the array, but will as the program runs. The most common case in which this happens is when the end user enters information into the program, and that information is then stored inside the array. Since the information varies based on which user runs the code and what they input during that particular run, we have no way of populating the array as soon as it's instantiated, forcing us to add the information later. Shown on your screen now are two ways in which we can do this, in Java and C#.

You'll notice that there is no Python section, because creating populate-later arrays is not conventionally possible in the base version of Python without the use of more advanced data structures or packages. You can see, however, with Java and C#, that just as with the other example there are slight differences in syntax, but it generally follows a set pattern: the type of the array, followed by a name for the array, which is then set equal to a size for the array. Remember, this size is final and cannot be changed outside of this initial call. If you end up creating an array with a size of 10, but then later in the program find that you need more space, you can always go back to where you instantiated it and change the size; but after that line of code runs, there's no conventional way to increase it. You may also be wondering what the brackets mean when instantiating an array, and that's just the way we signify to the computer that we are indeed trying to create an array and not just a variable.
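Here's a small, hedged Java sketch of that second pattern; reading from the keyboard is just one made-up example of filling the array later.

```java
// Sketch of the "declare a size, populate later" pattern.
import java.util.Scanner;

public class PopulateLater {
    public static void main(String[] args) {
        int[] salaries = new int[10];       // size fixed at 10; all slots start at 0
        Scanner in = new Scanner(System.in);

        for (int i = 0; i < salaries.length; i++) {
            salaries[i] = in.nextInt();     // fill one slot per value entered
        }
        // salaries now holds 10 values; it can't grow beyond 10
        // without creating a brand-new array.
    }
}
```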
Now that we know the two different ways to instantiate an array, we need to know how we, as the programmer, actually get at the information stored within the array so that we're able to use it. The simplest answer to that question is through the means of a numerical index. An index is simply an integer which corresponds to an element within the array. Now, the most important thing to know about indexes is that, for a given array, they begin at zero instead of one. So if we have an array of 10 elements, the first index would actually be index zero, the second would be index one, and so on, all the way up to the ninth index. It's a little bit weird at first, but trust me, you get used to it.

Now, to retrieve information from a certain position within the array, you reference it using both the array's name and the index number of the element you wish to retrieve. Say we have our array called numbers here, which contains the integers one through 10. To print out the element at the fifth index, in this case the number six, we would reference the array name, in this case numbers, and then in a set of brackets we would place the index we want to retrieve, in this case the number five. What this piece of code is basically telling the computer is to print out the fifth index of the numbers array, which again is the integer six. Now, because indexes start at zero instead of one, it's very important to note that to get the tenth element of the array, you would actually need to reference it using the number nine, since there is no tenth index. If you do end up making this mistake, it will result in an array-out-of-bounds error, since there is no tenth index within the bounds of the numbers array.

Referencing an array's index number is also how we replace elements within the array. Continuing with our numbers array, let's say we wanted to change the last element, the integer 10, to be the integer 11 instead. What we would do is reference the ninth index of the numbers array, the one which currently contains the integer 10, and set it equal to 11. This tells the computer to take the element at index nine and replace it with the integer 11, essentially swapping 10 for 11.

Now, the last thing I want to talk about before we jump into the time complexity equations for an array is the practice of putting arrays inside of arrays. An array with an array at each element is known as a two-dimensional array. Think of a two-dimensional array as a matrix. It's similar to any other array, except that instead of a primitive data type like an integer housed at each element, we instead have a whole other array with its own size and indexes. Two-dimensional arrays are useful for a variety of purposes, such as if you were programming a chess board, a bingo board, or even an image where each element in the two-dimensional array contains an RGB value, which when combined creates the image.

Now, referencing an element within a two-dimensional array is mostly the same as with one-dimensional arrays, except you now need two indexes: the first number is the index of the column that you want to reference, and the second number is the index of the row that you want to reference. By combining these two numbers, we can single out an element within the two-dimensional array, which will then be returned to us. As an example, let's create a two-dimensional array with 16 names contained within four rows and four columns. To access the name Carl, you'd first single out the column which contains the name you're looking for: in this case, Carl is in the third column, meaning it's at the second index. Next, you would single out the row which contains the name you're looking for: in this case, the name Carl is in the third row, so we would again use the second index to reference it.
Combining these two pieces of information gives us the index location two comma two, and if you look, Carl is indeed at index location two comma two. Just as another example, Adam is in the first column and the second row, so to reference Adam you would simply call upon the element at index location zero comma one. Two-dimensional arrays are just the beginning: you can also have three-dimensional arrays, four-dimensional arrays, and so on, for containing large amounts of advanced relational data. But two dimensions is where we're going to stop for today.

All right, that concludes the background information on arrays. Now that we know what they are, let's actually talk about the array as a data structure. This is where we're finally going to use our Big O notation knowledge from the previous segment to judge the array. We'll be going through the four criteria we talked about previously, accessing elements within an array, searching for an element within an array, inserting an element into an array, and deleting an element from an array, and scoring the array based on how effectively it can complete these four tasks, using time complexity equations.

Now, accessing an element within an array can be conducted in instantaneous O(1) time. This means that for any element within your array that you want to know the contents of, you can do so immediately by simply calling upon its index location. This has to do with the way that an array is stored within memory. For example's sake, let's look at a fake portion of memory in the computer. You can see we have some information near the beginning, maybe some integers or strings that we've initialized; that's not important. What's important is that when we get to the part in our code which instantiates an array, we already know exactly how large the array is going to be, either because we've populated it with the elements needed or given it a definitive final size. This means we also know exactly how much space we need to reserve in memory for that array.

Combining these two pieces of knowledge means we can instantaneously figure out the location of every single element within the array by simply taking the starting location of the array in memory and adding to it the index of the element that we're trying to find. So let's say our array contains three elements, each of them integers, and we want to know what the element at the first index is. Well, we know where the array starts storing data in memory, and we also know that every element within the array is stored contiguously, together, in memory. So by taking the starting location of the array in memory and adding one to it, we now know that that's where the first index of our array is going to be, and we can return the stored value appropriately.

This is the main reason why array sizes cannot be changed: all of the information within an array must be stored in this one place so that we're able to do this. The contiguous structure of an array prevents you from adding space to it after it's been initialized, because it literally can't be done without breaking up the data and storing new elements down the road, away from the initial array, and that would make accessing elements instantaneously not possible.
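To tie the indexing described above back to code, here's a short, hedged Java sketch; the grid contents are invented, arranged only so that the two index pairs from the Carl and Adam example still pick out those names, and each lookup is a single O(1) step for the reason just described.

```java
// Sketch of referencing array elements by index, in one and two dimensions.
public class IndexingDemo {
    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(numbers[5]);   // prints 6: the element at index five
        numbers[9] = 11;                  // swaps the 10 at index nine for an 11

        // A made-up 4x4 grid of names.
        String[][] names = {
            {"Beth", "Adam", "Dana", "Evan"},
            {"Fred", "Gina", "Hank", "Iris"},
            {"Jack", "Kara", "Carl", "Liam"},
            {"Mona", "Ned",  "Olga", "Pete"}
        };
        // Two indexes single out one element; whether you read the pair as
        // (column, row) like the example above or (row, column) is just a
        // convention you pick and stick with.
        System.out.println(names[2][2]);  // prints Carl
        System.out.println(names[0][1]);  // prints Adam
    }
}
```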
Hopefully now you can see both why accessing an element within an array is O(1) and why array sizes are final. Not many other data structures give you this instantaneous accessing power, meaning arrays have a huge advantage over the others in this metric.

Next up, searching through an array is O(n). This is because, for the most part, you will be working with unsorted arrays, arrays which are not in alphabetical, numerical, or some other quantifiable order, meaning that to find an element within the array, you may have to search through each element before you can find it. This is known as a linear search, and if you want to know more about it, click the card in the top right corner and it will take you to that section in our introduction to programming series. A link will also be provided in the description below. Now, while there are faster ways to search through an array, those only work when the array is sorted, and so for a basic array, the worst-case scenario for searching through it to find a particular element is going to be O(n). Sure, there are going to be cases where the element you're searching for is the first element in the array, or even somewhere in the middle, but remember that when working with Big O notation we always use the worst-case scenario. And in this case, the worst-case scenario is that the item you're searching for ends up being the last element within the array, which for an array of size n would take n operations to reach.

Finally, inserting and deleting elements from an array both have a time complexity equation of O(n). For inserting, this is because adding an element into the array requires you to shift every element after the index you want to insert at to the right by one space. For example, if you have an array of size five, currently filled with four numbers, and you want to insert the number one into the array at index zero, you would first need to shift every element from the zeroth index onward to the right by one, essentially freeing up space for the new integer while also retaining all of the information currently stored within the array. Then, and only then, can you actually insert it. This requires you to traverse through the whole array of size n, which is why the time complexity equation is O(n).

Now, sometimes you might not have to shift the whole list. Say you had the same array of numbers, only this time the fourth index was free, and you wanted to insert the number five in that open spot. Since you don't have to move everything to the right by one in order to make space for this new element, does that make inserting into an array O(1)? Well, no, because again, when we talk about a function's time complexity equation, we always refer to the worst-case scenario. The most operations we're ever going to have to conduct before we can insert an element into an array is n, the size of the list, making its time complexity equation O(n).

Deleting an element from an array follows mostly the same idea: you shift every element to the right of the one you want to delete down one index, and essentially you have deleted that element. Again, let's say we had an array of the numbers one through five, only this time we want to delete the number one at the zeroth index. What we would do is set the zeroth index to whatever is contained at the first index, in this case the integer two, and then simply repeat this process until we get to the end of the array, finishing off by setting the last element in the array to some null value. This deletes the element from the array by basically replacing it with the value to its right, and the process is repeated until it frees up one space at the end. Essentially, we're just moving the entire array down one. Worst-case scenario, we're going to have to shift every element in an array of size n, making deleting from an array also have a time complexity equation of O(n).
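Here's a rough Java sketch of that shifting behavior, assuming (as in the examples above) that there is one free slot at the end of the array to absorb an insert:

```java
// Sketch of the shifting that makes inserting and deleting O(n).
public class ShiftDemo {
    // Insert value at index, shifting everything after it one spot right.
    static void insertAt(int[] arr, int index, int value) {
        for (int i = arr.length - 1; i > index; i--) {
            arr[i] = arr[i - 1];      // shift right, starting from the back
        }
        arr[index] = value;           // only now can we drop the new value in
    }

    // Delete the element at index, shifting everything after it one spot left.
    static void deleteAt(int[] arr, int index) {
        for (int i = index; i < arr.length - 1; i++) {
            arr[i] = arr[i + 1];      // overwrite with the value to the right
        }
        arr[arr.length - 1] = 0;      // "null out" the now-free last slot
    }

    public static void main(String[] args) {
        int[] numbers = {2, 3, 4, 5, 0}; // last slot free
        insertAt(numbers, 0, 1);         // numbers is now 1 2 3 4 5
        deleteAt(numbers, 0);            // numbers is now 2 3 4 5 0
        // Worst case, both loops touch every slot, hence O(n).
    }
}
```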
So there you have it: the four time complexity equations for an array. Accessing is O(1), and searching, inserting, and deleting are all O(n).

Arrays are a really good data structure for storing similar, contiguous data. The ability to access any item in constant time makes them extremely useful for programs in which you'd like instant access to the information contained within your data structure. The array is also a lot more basic than some of the other data structures we'll end up talking about, making it both easy to learn and useful for reducing the complexity of your code. Now, some disadvantages of the array are that its size cannot be changed once it's initialized, that inserting and deleting from an array can take quite a while on larger data sets, and that if you have an array which is not full of elements, you're essentially wasting memory space until you fill each index location with a piece of data. Overall, the array is a pretty reliable data structure. It has some flaws as well as some advantages. A program that you write could use the array if need be, and it would work just fine. But sometimes you might want some extra functionality, and that's where more advanced data structures come into play.

One of those more advanced data structures is what's known as the ArrayList. The ArrayList can fundamentally be thought of as a growing array. We just finished talking about the array and how one of its major flaws is the fact that, once initialized, an array's size cannot be changed using conventional methods. In contrast, an ArrayList's size expands as the programmer needs. If you take a look at the ArrayList on your screen now, full of four elements, and you decide to add one to it, it will simply expand its size to fit five elements. As you can probably tell, this is extremely useful, which begs the question of why we don't just always use ArrayLists. I mean, in comparison, the ArrayList seems to provide all the functionality of an array and then some. Well, that's definitely a valid question, and one we will get to later on in this section. But before we can do that, we need to cover the basics of an ArrayList, including some of the properties and methods associated with it. Alright, let's hop in.

So, as we said before, an ArrayList is simply a resizable array, making the two extremely similar in structure. This is furthered by the fact that an ArrayList is actually backed by an array in memory, meaning that behind the scenes of your code, the ArrayList data structure uses an array as its scaffolding system.
For this series, we need not go further than that; for us, it just means that a lot of the functionality will be the same. Now, this doesn't mean everything is going to be the same, as you'll see later on, but it's still important to make note of before we get too deep into things.

The next thing I want to do before we talk about functionality is go over how we actually initialize an ArrayList. How to do this, again, is going to vary based on which language you're using, so shown on your screen now are two different ways to do so, in Java and C#. You may notice, again, that there is no Python on the screen, and this is because in the base version of Python, arrays and ArrayLists are not separate entities. They are frankensteined together into a single data structure called the list. Lists take some functionality from arrays and some from ArrayLists. It's a lot more complicated than that, but for this series, that's all you need to know as to why we're not including Python in this section. We discussed initializing Python lists in the previous section, so if you're interested, you can go back and look at that.

Now, going back to ArrayList initializations, another thing you may notice is that these statements look a little awkward in the way they're structured, and that's mainly because the ArrayList is actually its own separate class outside of the base version of these two languages. I'm not going to get into class hierarchy and object-oriented programming right now, because that's a whole other topic with big concepts and even bigger words. For right now, this just means that to create a new ArrayList, we have to invoke the ArrayList class when defining it, which, as you can see, is done at the beginning of both initializations. After that, you can see that we give it a name, which is then set equal to a new ArrayList with a set of parentheses. Now, in these parentheses you have a few options: you can either enter an integer, which will then be used to define a size for the ArrayList, or you can just leave it blank. Leaving the parentheses blank like that will automatically define a preset size of 10 for the ArrayList. Again, just as a reminder, this size can be dynamically increased as time goes on if we add enough elements; it's just meant as a base size. Now, you may be wondering if we can actually populate the ArrayList with elements when initializing it, as we could with arrays, but ArrayLists do not support this type of declaration.
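For reference, here's a hedged sketch of what those two options look like in Java (the variable names are made up):

```java
// Sketch of creating an ArrayList in Java; ArrayList lives in java.util,
// so it has to be imported before we can use it.
import java.util.ArrayList;

public class CreateList {
    public static void main(String[] args) {
        ArrayList<Integer> defaultList = new ArrayList<>();   // blank parentheses: default base size
        ArrayList<Integer> sizedList   = new ArrayList<>(4);  // base size of 4

        // Either way, the list starts empty and grows as elements are added;
        // unlike an array, we can't list its contents here at creation time.
        System.out.println(defaultList.size() + " " + sizedList.size()); // prints 0 0
    }
}
```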
Moving on, let's talk about functionality. The ArrayList can be thought of as pretty much the evolved form of the array: it's a bit beefier, has a little more functionality, and is overall more powerful. That's certainly not to say that it's better in every case, but for the most part, the ArrayList is going to be thought of as the older sibling of the two. This is attributed to the fact that it belongs to the prebuilt ArrayList class, which we talked about earlier. The fact that the ArrayList belongs to a class means it comes with pre-built functions that are already at our disposal from the moment we define and instantiate an ArrayList. More specifically, the ArrayList comes with methods we can use to access, change, add to, or delete from it easily. If you were using an array, you would have to program most, if not all, of these methods by hand, and so having them pre-built into the ArrayList class makes it especially useful. You'll see that this is the case with a lot of the data structures down the road, where having a data structure belong to its own class cuts out a lot of the programming time you'd otherwise spend making appropriate methods for the functionality you might want.

Now, the types of methods that you get will vary based on language. For example, in Java you'll have a variety of methods to use, including ones to add elements to the ArrayList, remove them from the ArrayList, clear the ArrayList entirely, return its size, etc., as well as tons of other more specific functions. In another language, though, such as C#, you'll have some of the same methods as the Java ArrayList, but you might also have some methods that the Java version does not, and vice versa: the C# version might not have some of the methods that the Java version does. The same applies for any other language you use which implements the ArrayList data structure. Because of this variability amongst languages, in this series we're simply going to cover six of the most common methods, ones that are both useful and can be found in 99% of ArrayList classes. These six methods are the add method, the remove method, the get and set methods, the clear method, and the toArray method. This may look like a lot, but remember, all of these are pre-programmed for you, so you don't have to make them by hand; all you have to do is call them on a pre-made ArrayList and you're set to go.

Speaking of pre-made ArrayLists, before we dive into each of these functions and how they work, let's first create an example ArrayList we can use to demonstrate them. Let's call it exampleAList and give it a size of four, so that it's not preset to 10. And we're set to go.

Now, the add method actually comes in two different types: one which takes in only an object to add to the end of the ArrayList, and one which takes in both an object to add to the ArrayList as well as an index value representing the index at which to insert the object. Let's start with the simpler of the two, the one which takes in only an object. This method is for more basic cases where you don't care where in the ArrayList the object you wish to add ends up; it will simply append the object you pass in as an argument to the end of the ArrayList. So let's take our example ArrayList and run the add method on an integer of two. Now, normally, ArrayLists only hold objects, not primitive types like the integer two that we're trying to pass in. However, the computer will automatically convert our primitive integer into an Integer object with a value of two so that we can still add it easily. This is known as autoboxing, and it's going to come up throughout the rest of the series whenever we modify data structures with primitive types, so I just thought I'd mention it now so that you're not confused later on. Okay, so when we run our code, since the ArrayList is empty and we ran the add method which doesn't care about location, it's going to add the integer two at the first index, index zero.
Now, if we run another add method, and this time pass in the integer five as an argument, since the zeroth index is already taken, it will be slotted into the first available open index, that being index one. So, as you can see, it's pretty simple: it takes in an object and will put it at the first open location available.

Moving on to the second type of add method, the one which takes in an object to add to the ArrayList as well as an index at which to place it. This one works very similarly to the previous one, only it makes sure that the object you're adding is inserted at the index provided. Again, let's say now we want to add the number one to our example ArrayList, but we want to make sure it's placed in numerical order, in this case at the zeroth index. What we would do is call the add method, providing the integer one as an argument in addition to the index zero at which we want to add it. Once the code has run, the ArrayList will automatically shift our integers two and five to the right by one, in order to make space for the integer one. This works for any index contained within the bounds of the ArrayList. So if we wanted to do it again, only this time insert the number three at the second index so that the list remains in numerical order, we'd call exampleAList.add, and then in the parentheses pass in the integer three and the index location two. After the code has run, you'll see that we have added the integer into our ArrayList. Now, it's important to note that because there are two integers being passed in as arguments, you must know which one your computer is treating as the object and which one it is treating as the index location. Mixing these up could cause the computer to try to insert your intended index location at a different location than the one you were aiming for, so just be careful and knowledgeable as to the order your method's arguments go in.

Now, the next method that comes pre-packaged in the ArrayList class is the remove method, and this one also comes in two different types. The first takes in an integer as an argument and, just as the name suggests, will remove the object, if there is one, at the index location provided and return it back to the user. The second takes in an object and will simply remove the first instance of that object within the ArrayList, if present, and will then return true or false depending on whether an object was removed. So if we wanted to remove the number five from our ArrayList, we would have two different options. We could call exampleAList.remove and, inside the parentheses, place the index of the value we want to remove, in this case three, and the program will remove the object at index three. The other option would be to run another remove method, only this time passing in an Integer object of five. It has to be an Integer object, because if we were to just use five, the computer would try to remove the fifth index of the ArrayList, which doesn't even exist. By creating an Integer object, we can ensure that when the code runs, the computer knows we want to remove the first instance of the number five in our ArrayList. Running this will return true and remove the integer from our list. Now, if there is no integer five in the ArrayList, the remove method will simply return false.
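Here's a hedged Java sketch of those add and remove overloads in action on the running example; note that in Java, the insert-at-index overload takes the index first, then the object.

```java
import java.util.ArrayList;

public class AddRemoveDemo {
    public static void main(String[] args) {
        ArrayList<Integer> exampleAList = new ArrayList<>(4);

        exampleAList.add(2);        // append: list is [2]
        exampleAList.add(5);        // append: list is [2, 5]
        exampleAList.add(0, 1);     // insert 1 at index 0: list is [1, 2, 5]
        exampleAList.add(2, 3);     // insert 3 at index 2: list is [1, 2, 3, 5]

        exampleAList.remove(3);                   // removes the element AT index 3
        exampleAList.add(5);                      // put the 5 back: [1, 2, 3, 5]
        exampleAList.remove(Integer.valueOf(5));  // removes the first OBJECT equal to 5
        exampleAList.add(5);                      // and back again: [1, 2, 3, 5]

        System.out.println(exampleAList);         // prints [1, 2, 3, 5]
    }
}
```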
Now, I quite like the number five, so I'm actually not going to permanently remove it from the ArrayList just yet.

Up next is the get method. The get method is pretty much the same as referencing an index for an array: it takes in an index location and will return back to you the value at that location. So exampleAList.get with an argument of zero would return one, exampleAList.get with an argument of two would return three, and so on.

The next method is the set method, which is how we replace elements within an ArrayList. Much like the name implies, it takes in an index and an object as arguments, and will set the element at the index you passed in to the object you also passed in. So if we wanted to set the number five in our ArrayList to be four instead, so that it matches nicely with the other integers, what we would do is call exampleAList.set and, within the parentheses, pass in the index location of the element we want to set, in this case three, and then also the object we want to place at that index, in this case the integer four. This method call will override the element at position three to be four instead of five. Now, you should be really careful when you're using this method, because you don't want to accidentally override an important element within the ArrayList.

Next up is the clear method, for when you simply just hate your ArrayList. This is perhaps the simplest of them all: it takes no arguments and will simply clear the ArrayList, deleting every element entirely. Calling exampleAList.clear on our ArrayList would delete all the objects within it. But I don't really want to do that right now, especially with one more method to go, so for the sake of this series, let's just keep the ArrayList filled with the values that it currently has.

The final method that we'll be covering in this section is a little bit different from the rest, and that's the toArray method, which is used to convert an ArrayList to an array. I thought I would include this one because it's really useful for combining the strengths and weaknesses of arrays and ArrayLists. The toArray method takes in no arguments and will simply convert the ArrayList into an array. Now, for this to work, of course, you need to set it equal to the creation of a new array, like shown on your screen now, but if it's done correctly, you'll end up with a brand new array which contains all of the contents that used to be in the old ArrayList. You may notice, though, that instead of an array of integers, it's actually an array of objects. This mostly has to do with that object-oriented programming stuff we talked about in the beginning, but for now it won't make too much of a difference: we can still treat it as a regular array, printing out indexes to the console, placing elements within it, typical array functionality. The only thing that changes is that it now contains Integer objects instead of primitive integer types.

So there they are, the six major methods that come with any given version of the ArrayList class. Having these at your disposal will account for much of the functionality you might use an ArrayList for, making them extremely valuable to know.
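And here's a hedged sketch of those remaining methods, continuing from the list built up above, [1, 2, 3, 5]:

```java
import java.util.ArrayList;

public class OtherMethodsDemo {
    public static void main(String[] args) {
        ArrayList<Integer> exampleAList = new ArrayList<>();
        exampleAList.add(1);
        exampleAList.add(2);
        exampleAList.add(3);
        exampleAList.add(5);

        System.out.println(exampleAList.get(0));   // prints 1
        System.out.println(exampleAList.get(2));   // prints 3

        exampleAList.set(3, 4);                    // index 3 now holds 4 instead of 5

        Object[] asArray = exampleAList.toArray(); // plain array of objects
        System.out.println(asArray.length);        // prints 4

        exampleAList.clear();                      // and now the list is empty
        System.out.println(exampleAList.size());   // prints 0
    }
}
```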
Let's now move on to the ArrayList as a data structure. Again, we're going to be looking at its four time complexity equations for accessing, searching, inserting and deleting. Now, if you remember back to the beginning of this segment, we mentioned that the ArrayList is backed by an array in memory. And this means that, just like the array, it too will have O(1) accessing power. Essentially, this means that when we use our get method, which takes in an index location, it will return to us the value at the index provided in instantaneous time. Now you might be wondering: how is this possible, since the data stored within an ArrayList is certainly not contiguous? Well, it's actually due to a really interesting reason. So interesting that before scripting the series, I actually had no idea it was the case. So because it is my series, I'm going to take a moment to talk about it. If we pull up our example ArrayList in memory, you can see that it looks a little bit different. Let's break down what's going on. Instead of storing the actual objects which are contained within itself, an ArrayList actually stores references, or pointers, to the locations of those objects in memory. So the zeroth index of the ArrayList is stored at the 87th location in memory, which is currently storing the integer one. Checking back to our example ArrayList, you'll remember that that is indeed what was stored at the zeroth index of the example ArrayList. And this goes for every element within the ArrayList: the first is stored at the 91st memory location, the second at the 100th, and so on. So as you can see, while the actual data is not stored contiguously, the references to that data are. So when we run our get command, all it has to do is return the value stored at whatever location the index points towards. It's a bit more complicated than that, especially the way that these references get stored, but that covers the gist of it. This is the reason our ArrayList can have instantaneous accessing power without having the data stored contiguously. Alright, tangent over, let's get back to the Big O notation time complexity equations for the ArrayList. So we know accessing is going to be O(1). And searching is going to be O(n). This is for the same reason that arrays were O(n): if we want to find out whether an object is contained within our ArrayList, and that object is at the last index of the ArrayList, we're going to have to search through each and every element within it, of size n, to make sure, because remember, we always go based on the worst case scenario. Now inserting into the ArrayList is going to be O(n), because worst case scenario, if we are inserting an element at the beginning of the ArrayList, we need to shift every element after the index we're inserting at to the right by one, just like we needed to do for the array. This requires a number of operations equal to the size of the ArrayList, making inserting O(n). Deleting is O(n) for the exact same reason. If we want to delete the first element within the ArrayList, we would then have to shift every element down one to fill the space. Additionally, if we want to delete an object contained at the last index, we have to search through the whole list to find it. Either way, it will be O(n). Alright, so there are our four time complexity equations: accessing is O(1), and searching, inserting and deleting are all O(n). If that sounds kind of familiar, that's because these are the same as the array's time complexity equations. See, I told you they were similar.
Now this does bring back up the question we posed at the beginning of the episode: why even use an array in the first place in comparison to the ArrayList? The array just does not seem as useful. Well, let's get into that by sizing these two data structures against each other, mano a mano. Let's compare. An array is a collection with a fixed size, meaning it cannot be changed, whereas an ArrayList has a dynamic size, which can be updated to fit the needs of the programmer. Arrays can store all types of data, meaning both primitives and advanced types, whereas ArrayLists can only store objects, not primitives. Now this problem is mostly solved through the autoboxing situation I talked about previously, but the fact still stands. Moving on, an array is built into most languages, meaning it doesn't have any methods pre-made for you to interact with or modify it, whereas an ArrayList is its own class, meaning it comes with useful methods to help you utilize it. Finally, an array is very primitive in nature and doesn't require a lot of memory to store or use, whereas an ArrayList is, again, a class, meaning it requires more space and time to use than an array will. Hopefully now you can see that while the ArrayList is more powerful, it still does have some drawbacks which make using an array sometimes more appropriate. In general, you want to use arrays for smaller tasks where you might not be interacting with or changing the data that often, and ArrayLists for more interactive programs where you'll be modifying the data quite a bit. So that's the ArrayList. As a review, it is a dynamically resizing array which comes with a slew of methods to help you work with it. As the ArrayList is a hefty topic in computer science, if you feel I didn't do a good enough job explaining some of the concepts in the video, the sources used to write the script for this video will be linked in the description below. But if not, let's move on to our next data structure. The next data structure we'll be talking about is the stack. Now at this point in the video, we'll be diverging from what are known as random access data structures, i.e. arrays and ArrayLists, and diving into sequential access data structures. What's the difference? Well, if you remember back to our discussion on arrays and ArrayLists, we were able to access any element within the data structure by calling upon its index location. This would then result in the computer instantaneously returning the value at that location. Each element was independent of the one before or after it, meaning obtaining a certain element did not rely on any of the others contained within the data structure. That basically describes the gist of a random access data structure: one where each element can be accessed directly and in constant time. A common non-computer-science example of random access would be a book. Getting the information contained on a certain page doesn't depend on all the other pages within that book, and getting an element contained within a random access data structure doesn't depend on all the other elements contained within that data structure. In contrast, elements in a sequential access data structure can only be accessed in a particular order. Each element within the data structure is dependent on the others and may only be obtainable through those other elements. Most of the time, this means that accessing a certain element won't be instantaneous.
A common non-computer-science example of sequential access would be a tape measure. To get to the measurement of 72 inches, you would first have to go through inches one through 71. There's no way to just instantly get to 72 inches without first breaking every law of thermodynamics and probably the space-time continuum. These are also sometimes called limited access data structures, because unlike random access, the data is limited by having to obtain it through a certain way. So there you have it: random access data structures, which allow instantaneous accessing power and independent elements, such as the array and ArrayList, and then sequential access data structures, which only allow accessing through a particular order, with dependent elements. For the next few segments, we'll be covering a few of the popular sequential access data structures, such as stacks, queues, linked lists, and so on. Beginning, of course, with the stack. We've already danced around the subject long enough, so let's finally talk about what a stack actually is. Now, the stack, by definition, is a sequential access data structure in which we add elements and remove elements according to the LIFO principle. The LIFO principle, which stands for last in, first out, means that whichever element we added to the stack last will be the first one we retrieve, hence, last in, first out. Think of this as a stack of books. Each time you add another book to the top, you do so by stacking it on top of the others. Then, if we want to get a book in the middle of that stack, we would first have to take off all the books on top of it; we can't just grab that book from the stack and expect the entire thing not to fall in on itself like some sort of literary Jenga. The same goes for the stack as a data structure: we add elements to the top of the stack and retrieve them by taking them off the top of the stack. There's only one way in and one way out for the data. We can't simply access or modify an element within the stack all willy-nilly like we were able to do with the array and the ArrayList. This might seem like a weakness, limiting the flow of data to one single point, but remember what I said during the segment on time complexity: many of these data structures have a built-in functionality which gives them an edge over the others, and the stack, with its LIFO property, is a prime example of that. Next up, let's talk about some of the methods commonly associated with the stack. Now the stack class will always come with two methods pre-built into its class, those being push and pop. These are the fundamental methods which we use to interact with and modify the stack, making them extremely important to comprehend. In addition to those two, I'm also going to cover two more methods, peek and contains, which are both useful and can be found in the stack class associated with most programming languages. Now, just like we had an example ArrayList, let's also create an example stack to help us show off these methods. And just like that, we're ready to roll. Now push is a method which pushes an object onto the top of the stack. All it takes in as an argument is an object to add to the stack, and it will return no value. When we do this, that object becomes the new front of the stack, and the stack's size is dynamically increased by one.
Let's start running some push commands on a variety of different strings, and you'll see that the stack slowly gets built: 'now', 'channel', 'this', 'to', 'subscribe', a series of completely random words seamlessly pushed onto the stack. As you can see, we're only adding information to one spot, the top; there's no insert method where we can just jam a string into the middle of the stack. Moving on, the pop method is used to remove an element from the top of the stack. It does not take in any argument and will return the element that is popped off the stack back to the user. Once a pop method has run, the element that was at the top of the stack is removed and returned back to the programmer, and the element which was second from the top becomes the new top of the stack. And so on our example stack, if we popped off each element, you'd see that each time, one of these completely random strings is taken from the stack and returned back to the user, until we are left with a sad, empty stack with no shameless promotion. Let's add the strings back, obviously just for the sake of the next two methods that we need to cover, and not for continuing the free advertising. So push and pop are how we add and remove the data within our stack, so they're fundamentally the backbone of the data structure. The next two I want to talk about, peek and contains, are more so used to interact with the data inside the stack without actually changing it. Not as useful for modifying data, but still extremely important. The first of these is the peek method. The peek method allows you to get the value at the top of the stack without actually removing it. We talked about before that the only way to access elements in a stack was through the top, and this method is simply a way to look at what the top value is without having to pop it off. It takes in no arguments and will just return back the contents of the top element. In our case, if we ran it on our example stack, 'subscribe' would be returned back to the user. Now if we popped off the top element and ran it again, 'to' would be returned; you get the idea. Again, let's push 'subscribe' back on at the top for educational purposes. Now the final method I'll be covering is the contains method. This one is used for searching throughout the stack. It takes in an object as an argument and will return a Boolean of whether or not that item is contained within the stack. Essentially, this method allows us to search through the stack without popping off every element until we find the one that we're looking for, as the contains method does not modify the stack in any way. So for example, calling contains on our example stack with an argument of 'subscribe' would return true, contains with an argument of 'this' would also return true, but contains with an argument of 'hello' would return false. So there they are: four common stack functions which are going to be vital if you ever want to construct a stack-based program.
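If you'd like to see those four methods as code, here's a rough sketch using Java's built-in java.util.Stack class; the strings are just stand-ins matching the ones from the visuals.

```java
import java.util.Stack;

public class StackDemo {
    public static void main(String[] args) {
        Stack<String> exampleStack = new Stack<>();

        exampleStack.push("now");        // push adds to the top of the stack
        exampleStack.push("channel");
        exampleStack.push("this");
        exampleStack.push("to");
        exampleStack.push("subscribe");

        System.out.println(exampleStack.peek());     // "subscribe" - look at the top without removing
        System.out.println(exampleStack.pop());      // "subscribe" - removed and returned
        System.out.println(exampleStack.peek());     // "to" is the new top

        System.out.println(exampleStack.contains("now"));   // true
        System.out.println(exampleStack.contains("hello")); // false
    }
}
```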
Moving on to time complexity: for accessing, the stack has a time complexity equation of O(n). This is because in order for us to reach a certain element within the stack, we first have to pop off every element that's above it. Think of it like this: if we had a stack of stones and needed to get to the bottom one, we'd first have to take off every stone on top of it. So worst case scenario, if the element we want is at the bottom of the stack, we first have to pop off every element above it. This would require a number of operations equal to the size of the stack, making the time complexity equation O(n). This is one of the major drawbacks to using stacks. With arrays and ArrayLists, we could easily access any element within the data structure instantaneously, and with a stack, that's just not possible because of the way that it's structured. Searching is going to be O(n) for the exact same reason: worst case scenario, if we're searching for an element that's at the bottom of the stack, we have to go through the whole thing just to find it. Now, inserting and deleting make up for this by having time complexity equations of O(1). This essentially boils down to the fact that using our push and pop methods really only requires one operation. Since the data only flows in and out of a single point, inserting or removing an object from that point can be done immediately. For the push method, we simply add it to the top of the stack, and for the pop method, we just take it off from the top. It's not rocket science. Actually, it's computer science, but that's beside the point. There's no need to rearrange data or move elements around like there was for the array and ArrayList, because we don't have to: the data that we're modifying is right there on top. So there you go, the time complexity equations for the stack. Now, you might be wondering if there are even uses for a last in, first out data structure, because it seems kind of out there, I mean, limiting your data to a single entry point. But you would actually be mistaken. Stacks are used everywhere, both in the actual writing of other code as well as in real world situations. In fact, one of the most fundamental processes of computer science uses stacks as a way of keeping track of active functions or subroutines. Of course, I'm talking about recursion. Now, I won't get into recursion too much here, but basically, it's the process of functions repeatedly calling themselves. When the function calls itself, that call is added to a stack of processes, and when that stack finally reaches what's known as a base case, the functions are all then popped off one by one. It goes much, much deeper than that, but we don't have time for a full blown lesson on recursion right now. If you want that, you can click the card in the top right corner of your screen, or the link in the description below, which will both take you to the part in our introduction to programming series where we cover it. Either way, stacks are the backbone for recursion, and recursion is the backbone for a plethora of computer science related functionality, such as traversing data structures, keeping track of active subroutines in code, and much, much more.
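As a small illustration of that idea, here's a minimal sketch of a recursive factorial function in Java; each call sits on the call stack until the base case is reached, and then the calls get popped off one by one.

```java
public class RecursionDemo {
    // Each call to factorial is pushed onto the call stack; once the base
    // case is hit, the calls are popped off one by one as they return.
    static int factorial(int n) {
        if (n <= 1) {                     // base case: start popping calls off the stack
            return 1;
        }
        return n * factorial(n - 1);      // recursive call gets pushed onto the stack
    }

    public static void main(String[] args) {
        System.out.println(factorial(5)); // 120
    }
}
```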
Some examples of stack-based functions outside of computer programming which you use every day include the undo/redo button in word processors and the back button in web browsers. Both use stacks similarly: they continually add tasks you've completed to a stack, either web pages that you've visited or words that you've typed. And then when you press undo, or the back button in a web browser, all the program has to do is pop off whatever the last task was from the stack, and bam, you're right back to where you were a second ago. It's like magic, but better in a way. As you can see, the stack has a lot of real world applications, both on the consumer side and the client side. You interact with and use stacks every day without even realizing it, and you'll use stacks frequently as you continue along your computer science journey, so by learning them, you're opening up a world of opportunities. That concludes our discussion on the stack. To review: it is a sequential access data structure in which we use the LIFO principle to add and remove elements, and again, LIFO stands for last in, first out. Up next we'll be talking about an equally useful data structure that functions very differently from the stack while also working very similarly. And that data structure is known as the queue. So now that we've talked about the stack, a sequential access data structure which follows the LIFO principle, we need to cover the opposite, which in computer science we obviously call a queue. Now by definition, a queue is a sequential access data structure which follows the FIFO principle, or first in, first out. The stack and queue are very much a dynamic duo when it comes to the world of comp sci, so you'll notice a lot of similarities between the two in the way that they're structured and how they work. In this segment, we'll cover a lot of the same topics as we did with the stack, only for the queue. Now, timestamps can be found in the description correlating to these topics, but if you're sticking around, let's dive deeper into what the queue actually is. Well, the queue, like the stack, is a sequential access data structure, meaning we can only access the information contained within it a certain way. Now, if you remember back to stacks, this certain way was through the LIFO methodology, or last in, first out, where the last element pushed onto the stack would always be the first one we popped off, similar to a stack of books that we add to and remove from. Now in contrast, the queue follows what's known as the FIFO methodology, or first in, first out, where the first element added to the queue will always be the first one to be removed. We can think of this as a line to your favorite amusement park ride. Mine as a kid was always the Ring of Fire, so let's just use that one as an example. The first one to get there, assuming we don't have any cutters, will always be the first one who gets to go on the ride. The later you show up, the longer you have to wait. This is the strategy used for queues when adding and removing objects: the first element to be added will also be the first one to be removed. Another big difference between stacks and queues is the location we add and remove elements from. You might remember that with the stack, we added and removed elements from one spot, the top. With the queue, however, this is different. We add elements to the back, also known as the tail, and we remove them from the front, also known as the head. This allows us to make sure that we 100% follow the FIFO methodology. So there's your background information on the queue: sequential access, FIFO methodology, add elements to the back and remove them from the front. Got it? Good. Now, let's dive headfirst into how we can actually use a queue by jumping into some common queue methods. So just like the stack, we're going to have two methods used to add and remove elements from the queue. For the stack, we added elements with push and removed them with pop.
In contrast, with a queue, we add elements using enqueue and remove them using dequeue. In addition to these two methods, we're also going to cover peek and contains, which, if you watched the previous segment, should look pretty familiar. Alright, let's start. Enqueue is the first method and the one we use to add elements to the tail of the queue. It takes in an object to put at the end of the queue and simply adds that object, while also increasing the size of the queue by one. Let's pull up an example queue, which, as you can see, currently has a size of zero. But say we call enqueue with a completely random string, let's say the string 'now'. That would be added to the tail of the queue, and the size would increase by one. Now, because there's only one element in the queue at this point, the string 'now' is acting as both the head and tail for us. We can of course fix that by enqueueing a few more completely random strings. If we add the string 'video', the size goes to two; we can add 'this' and it goes to three; 'like' makes it four; you get the idea. And now we have a fully functional queue. As you can see, 'like' is acting as the tail, being that it was the last string added, and 'now' is being treated as the head, which makes sense considering it was the first string to be added. Moving on, dequeue is the method we use to actually remove elements from the head of our queue. It takes in no arguments and will return the element that was removed from the queue back to the user. So if we ran a dequeue command on our example queue, you'd see that 'now' is both returned back to the user and also removed from the queue. Additionally, the size of our queue has been dynamically decreased by one. If we run it again, 'video' is returned and removed from the queue and the size goes down by one yet again; you get the idea. We could keep doing this for every element in the queue until it's empty, but the next methods we're going to talk about need some information to work with, so for now, let's refill the queue back to its original four elements. The next method that I'll discuss is peek. Now, we've actually covered peek just a few moments ago in our segment on stacks, but if you forgot or just didn't watch that particular part, peek returns the object that's at the forefront of our queue. It doesn't take in any arguments and simply returns the foremost object of the queue without actually removing it, the key word there being without. This method allows you to look at the head of the queue before you actually dequeue. There are a multitude of reasons that you might want to do this: maybe to make sure that the element you're about to dequeue is the correct one, or to check to see if an element you need is still there, etc. Whatever the case is, we can use the peek method to fulfill our needs. If we were to run it on our example queue, you'd see that the string 'now' is returned. But if we dequeue the first two elements and run it again, you'll see that the string 'this' is returned. Pretty simple, but extremely effective. Again, let's add 'video' and 'now' back into the queue for our next and final method. That method, of course, being the contains method. The name pretty much says it all: the contains method will take an object and will return a Boolean of whether or not the queue contains that object.
Running it on our example queue with an argument of 'queue' would return false, because as you can tell, there is no 'queue' string contained within our queue. However, if we ran it on a string such as 'video', it would return true, because as you can see, the string 'video' is indeed in the queue. And there they are, all together now: enqueue, dequeue, peek and contains, four methods which will help you utilize a queue to its maximum efficiency. Speaking of efficiency, that takes us perfectly into the time complexity equations for the queue. Now accessing an element within a queue is going to be O(n). Let's say you had a queue full of three elements. If you want the object at the tail, you first have to dequeue every element off the front until the one you're looking for is at the head of the queue. Then, and only then, can you actually get the value contained. Since this may require you to go through the entire queue of size n, accessing is going to be O(n). Remember, queues are sequential access data structures and not random access, meaning we can't just grab an object from the middle of the queue. That's just not how things work. Searching is going to be O(n) for the exact same reason: trying to find an element contained at the tail of a queue requires you to iterate across the entirety of that queue to check for it. So in that scenario, we have to check every element within that queue of size n, making the time complexity equation O(n). Inserting into and deleting from a queue are both going to be an instantaneous O(1). This is because, just like with the stack, we're only ever enqueueing at a single point, and we're only ever dequeuing at a single point. This means that no matter how large the size of the queue is, it will always take the same number of operations to either insert or remove an element. And there they are, in all their glory, the time complexity equations for the queue. You may notice that they're identical to the stack's, which, if you've been paying attention, is the truth for most of the properties of a queue. They're very much a yin and yang, one and the same type of deal, differing slightly in terms of the methodology you use to add and remove objects from them. You'll often see these two data structures talked about together frequently, just because of how similar their functionality is. The final thing I want to cover on the topic of queues is just some common uses for them within programming. What are these things actually used for? And the answer is, quite honestly, a lot. On your computer right now, queues are being used for what's known as job scheduling, the process through which the computer determines which tasks to complete for the user and when, like opening up a web page or a computer program. Queues are used many times in printers to keep track of print jobs when multiple people try to print, determining whose documents get printed first. Heck, if you're looking for real world examples, Google even uses queues in their new Pixel phones to enable what's known as zero shutter lag, in which they strategically use queues to eliminate the time between when you take a picture and what the phone actually captures. So yeah, in terms of functionality, the queue can be used in a variety of fields, so it's good that you now know what they are and how to use them.
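Before we wrap up, here's a minimal recap sketch of those four methods in Java; its Queue interface spells enqueue and dequeue as offer and poll, and the strings are just the stand-ins from the visuals.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class QueueDemo {
    public static void main(String[] args) {
        Queue<String> exampleQueue = new ArrayDeque<>();

        // Java spells enqueue as offer: add to the tail of the queue.
        exampleQueue.offer("now");
        exampleQueue.offer("video");
        exampleQueue.offer("this");
        exampleQueue.offer("like");

        System.out.println(exampleQueue.peek());            // "now" - the head, not removed
        System.out.println(exampleQueue.poll());             // "now" - dequeue (poll) removes from the head
        System.out.println(exampleQueue.poll());             // "video"
        System.out.println(exampleQueue.peek());             // "this" is now the head

        System.out.println(exampleQueue.contains("like"));   // true
        System.out.println(exampleQueue.contains("queue"));  // false
    }
}
```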
This also concludes our discussion on the queue. To review, the queue is a sequential access data structure which follows the FIFO principle, or first in, first out, where we add elements to the back and remove elements from the front. Up next, we'll continue on the sequential access data structures train and talk about one of my personal favorite data structures. Next, we'll be covering the linked list, and it's a good one, so strap into your seats. Let's just jump into things by talking about the elephant in the room: what exactly is a linked list? Well, to answer that, a linked list is a sequential access linear data structure in which every element is a separate object called a node. Each node has two parts, the data and the reference, also called the pointer, which points to the next node in the list. Wow, that is a pretty big sentence with a lot of ideas within it, so let's break it down part by part. The sequential access part of that statement means that the information contained within the linked list data structure can only be obtained through a certain methodology. We talked about this in our segment on the stack. If you remember, during that segment, we compared them to a tape measure, because similarly, you can only get measurements from a tape measure through a specific method. The specific method for a linked list will be covered a little bit later. For now, moving on, the linear part of the definition simply means that the information or data is organized in a linear fashion in which elements are linked one after the other. Now, when we state that every element is a separate object called a node, this means that unlike an array or an ArrayList, where every element is just, let's say, a number, each element in a linked list will actually be an object which can have multiple attributes or variables. Now, I won't dive too far into object oriented programming right now. If you want a good introduction to that, you can check out the link in the description below, which will take you to our introduction to object oriented programming lecture. For this series, we essentially mean that the objects, or nodes, stored within each element of our linked list will hold two separate pieces of information. These two pieces of information come together to form the node, and these nodes are what make up our linked list. More specifically, those two pieces of information are the actual data or information stored within that node, and the reference or pointer to the next node in the linked list. The data is where our strings or integers or Boolean values are kept, the actual contents of our data structure. And the other piece of the puzzle is the pointer. This reference or pointer points to where the next node in the linked list is stored in memory. This helps link all of the nodes in a linked list together to form one long chain of information, kind of like a computer science conga line. You should now be able to understand what a linked list is. Again, it's a sequential access linear data structure in which every element is a separate object called a node, in which each node contains both data and a reference to the next node in the linked list. This cumulatively creates a string of nodes which we call the linked list. Feel free to rewatch this explanation if you're still confused, because I know it can get pretty hectic pretty fast. The next thing I want to do in this section is just visualize how one of these linked lists is set up.
That way, you can actually see what I mean when I say nodes, pointers, and everything else. So every linked list starts with what's known as the head node of the list. This is just an arbitrary label which represents a node containing some form of data, and also a reference to the next node in the linked list. For now, let's just store the integer one in our head node. Now, since this is the only node so far in our linked list, the head node will simply point towards a null value, which is just to say it's not pointing anywhere. Essentially, it's pointing to nowhere and just being used as a placeholder until we actually give it something to point towards. Let's do that by adding another node to the linked list and storing the integer two inside of it. Instantly, you can see that now our head node points to this second node instead of a null value. You'll also notice that this new node, which I'll call the two node, points to a null reference value, just like the one node used to do, as it is now the last node in the linked list. This is a pattern you'll notice as we continue adding nodes: the last node in the linked list, also known as the tail node, will always point towards a null value. This is our way of telling the computer that we've reached the end of our linked list and that there are no more nodes to traverse towards. Anyways, let's now create our third node and inside store the integer three. This node now becomes the tail and points towards a null value, and the second node that we created, the one that used to point towards a null value, now points to this new node which contains the integer three. Essentially, we're just adding a node to the tail of a linked list and then setting the reference of the previous node to point towards that new node. And if you can understand that concept, you pretty much comprehend the gist of how these nodes interact. The two main takeaways are that every node has both information and a pointer, and the last node in a linked list points towards a null value. That's pretty much the setup for a linked list. Definitely more abstract and fluid than, say, an array or ArrayList, but hopefully not too complicated. Next up, we're going to be discussing how we add and remove these nodes from a linked list. Unfortunately for us, adding and removing elements from a linked list isn't going to be as simple as it was with the other data structures, such as the stack or the queue. This has to do with the simple fact that we are actually able to insert and remove elements easily within a linked list at any location. With a stack or a queue, we weren't actually able to do this, because the data could only flow in and out of a few specified points. This made it so we couldn't remove an element from the middle of the stack or jam an element into the center of a queue; the way that they're structured just didn't allow for this. With linked lists, however, we can easily remove elements from the middle or jam elements into the center, and so today, we'll be covering the different methods used to do so. More specifically, we'll be covering three different ways to both insert and remove nodes from a linked list: adding to and removing a node from the head, the middle and the tail of a linked list. These are all going to revolve around those pointers we talked about at the beginning of the episode.
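To make that node idea a little more concrete, here's a minimal sketch in Java using a hypothetical Node class; it builds the same one, two, three list we just visualized and walks it by following the pointers.

```java
public class LinkedListDemo {
    // Each node holds its data plus a pointer to the next node in the list.
    static class Node {
        int data;
        Node next;   // null means this node is currently the tail

        Node(int data) {
            this.data = data;
        }
    }

    public static void main(String[] args) {
        Node head = new Node(1);        // head node, points to null for now
        head.next = new Node(2);        // head now points at the new node
        head.next.next = new Node(3);   // the two node points at the new tail

        // Walk the chain of pointers until we hit null (the end of the list).
        for (Node current = head; current != null; current = current.next) {
            System.out.println(current.data);
        }
    }
}
```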
Whenever we change up a node in a linked list, we also have to change its pointers, and that can get pretty complicated pretty quickly. Luckily, that's why I'm here. I've set up a basic linked list on your screen now with three nodes that we can use to play around with. Each node has a value, of course, representing the data inside the node, and also a pointer which points to the next node. The green and red coloring on the nodes with the integers one and three simply indicates which node is the head node and which node is the tail node: green means it's the head node, and red means it's the tail node. Now, in an actual linked list, these pointers would be locations in memory, but for the purpose of this series, we'll be representing the pointers visually. Perfect. So let's cut the chitchat and just jump right into it. The first method we're going to be covering is adding and removing nodes from the head of a linked list. Lucky for us, this is pretty simple. To add a new node to the head of a linked list, literally all we have to do is make that new node's pointer point to whatever the current head node of the linked list is. By doing this, we simply take away the title of head node from the current head and bestow it upon this new node that we're adding. It looks a little bit something like this: let's take a node with the integer zero and add it to the head of the linked list. All we have to do is set its pointer to point towards the current head node. Now, you'll see that none of the other nodes have changed in the slightest. The only thing that has changed is that this new node with the integer zero is the head node, and it now points towards the old head node. Extremely simple, and the best part is that it works in reverse as well. Removing a node from the head of a linked list is just as simple. All we have to do is set the head node's pointer to some null value. Once we do, the head node will get cut off from the flow of information, essentially removed from the linked list. If we did this on our example linked list, you'd see that once we set the zero node's pointer to null, because it no longer points to the one node and no node points towards its location, it has been cut off from the rest of the nodes and exiled from the linked list. The old head node regains its position, and it's as if this integer zero node never even existed. Moving on, the next methods we're going to cover are inserting and deleting a node from the middle of the linked list. These two methods are definitely the most difficult of the bunch, and this is because we need to insert the node in such a way that the pointers get readjusted accordingly without losing any of the information. If we accidentally set the pointers wrong, or do things out of order, the data could get lost forever. But luckily, I'm here to teach you how to do it the right way. We'll start with adding a node to the middle of a linked list. Adding to the middle of a linked list is a two step process. We first make the pointer of the new node point to the node after the location we want to insert at. Then we set the node before the location we want to insert at to point towards the new node.
So if we wanted to insert a node with the double value 1.5 after the node containing the integer one, but before the node containing the integer two, what we would first do is set the new node's pointer to point to the node containing the integer two. Then we would make the node containing the integer one point towards our new node which contains the double 1.5. By adjusting the pointers of the nodes before and after the location we want to insert at, we can strategically jam this node into the correct place without moving or modifying the entire list. As you can see, now we have a linked list of length four where the link has not been broken and the flow of the list is still intact. Removing a node from the middle of a linked list is even simpler. All we have to do is make the pointer of the node previous to the one we're removing now point to the node after the one we're removing. Then, if we set the pointer of the node we want to remove equal to a null value, we again cut the node off from the linked list and it is removed. Simple as that. So, following these steps, if we now wanted to delete our 1.5 node, we'd make the one node again point towards the two node instead of the 1.5 node. Then, if we delete the pointer of the 1.5 node by setting it equal to null, it gets cut off from the flow of the linked list. No changes are externally made to the rest of the list, just one lonely node removed. The final type of insertion and deletion we'll be covering is inserting to and deleting from the end of a linked list. Doing this simply requires us to modify the tail node of the linked list, the one which currently points towards some null value. For adding a node to the tail, you simply make the current tail's pointer, which is currently set to null, point towards the node you want to add. So if we wanted to add a node with the integer four, we would just make the three node point towards this new node. Then, by setting the four node's pointer to point towards null, we've made this new four node the tail of the linked list, and the old tail node with the integer three now points to the new tail. Removing the tail of a linked list is just as simple. If we want to remove the tail, we just set the previous tail to point towards a null value instead of the current tail. This leaves the current tail node with no connection to the linked list, isolating it from the pack. Doing this on our list would look like making the three node point towards null. And now, because no node points to our tail node containing the integer four anymore, it gets cut off from the linked list and essentially removed, making the old tail node, the one containing the integer three, the current tail node once again. And so, after a bit of pointer readjustment, hopefully now you can understand the pseudocode behind inserting and removing elements from a linked list with ease.
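Here's a rough sketch of those head and middle operations in Java, using the same hypothetical Node class; the helper names addToHead, insertAfter and removeAfter are made up for illustration and just mirror the pointer steps we walked through.

```java
public class LinkedListEdits {
    static class Node {
        double data;
        Node next;
        Node(double data) { this.data = data; }
    }

    // Adding to the head: the new node points at the old head and becomes the head.
    static Node addToHead(Node head, double value) {
        Node newNode = new Node(value);
        newNode.next = head;
        return newNode;
    }

    // Inserting after a given node: the new node points at prev's old next,
    // then prev points at the new node.
    static void insertAfter(Node prev, double value) {
        Node newNode = new Node(value);
        newNode.next = prev.next;
        prev.next = newNode;
    }

    // Removing the node after prev: prev skips over it, and the removed
    // node's pointer is set to null so it's cut off from the list.
    static void removeAfter(Node prev) {
        Node removed = prev.next;
        if (removed != null) {
            prev.next = removed.next;
            removed.next = null;
        }
    }

    public static void main(String[] args) {
        Node head = addToHead(addToHead(addToHead(null, 3), 2), 1); // 1 -> 2 -> 3
        insertAfter(head, 1.5);                                     // 1 -> 1.5 -> 2 -> 3
        removeAfter(head);                                          // 1 -> 2 -> 3 again
        for (Node n = head; n != null; n = n.next) System.out.println(n.data);
    }
}
```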
Up next are the time complexity equations for a linked list. Now, accessing an element within a linked list is going to be O(n). This is because linked lists, again, are sequential access data structures. This should be all too familiar if you watched the sections on stacks and queues, but if not, here's a quick review: sequential access data structures can only be accessed through a particular way, meaning we can't get any element we want instantaneously. Resulting from this is the stipulation that if we want to get a certain element within a linked list, we need to start at the beginning of the list and cycle through every node contained within it before we can finally access the one that we want. We do this by using the pointers as a little map for the computer: first go to node one; node one's pointer gives you the location of node two in memory; and node two's pointer will take you to the memory location which contains the node with the information that you want. It's like a computer science treasure map, in a way. For a linked list of size n, this means you could have to traverse the entire linked list before finding your information, making its time complexity equation O(n). Searching is O(n) for the exact same reason: we check a node, and if it's not the one that we want, we use that node's pointer to take us to the next node and check that one. We do this for every node until we either find the node containing the value we want, or get to a node which points towards null, indicating that we've reached the end of the linked list and that the value we're searching for just isn't contained within that particular linked list. Inserting and deleting from a linked list is a little bit complicated. Since linked lists usually only store the head and sometimes the tail node's location in memory permanently, if we want to insert or delete an element at the beginning or end of a linked list, we can do so instantaneously using the methodology we talked about previously. However, if we want to insert a node within the linked list, things become a little bit more complicated. In order to insert a node within the linked list, we must first traverse to the location we want to insert it at. This means following that treasure map until we reach the insertion point, and then, and only then, can we actually follow the instructions to insert or delete a node. So depending on how and where you want to insert or delete a node, its time complexity equation will be either O(n) or O(1). A little confusing, yes, but necessary to mention for when it inevitably comes up. So in review: accessing, searching, and sometimes inserting and deleting are all going to be O(n), and other times inserting and deleting will be instantaneous. Cool. Now let's finish up by talking about some real world applications of the linked list. Now, something I haven't talked about before, but would be scolded by people in the comments for not mentioning, is that linked lists can actually be used in the backing of other data structures. What do I mean by this? Well, basically, we can use linked lists to make stacks, queues and some other data structures we haven't talked about. This is in the same vein as earlier, when I mentioned that the ArrayList uses the array as a backing support system in memory. Now, this goes a little bit above an introductory series, but it's one of, if not the most important uses of a linked list in computer science, so I thought I'd mention it here. Basically, because of the way that it's structured, having a stack or queue use the nodal methodology that comes with the linked list as the back end of its structure makes a lot of sense. If you're curious, I'll have an article linked below explaining this idea in better detail; I just thought it was a little bit above the complexity for an introductory course.
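Just as a rough illustration of that idea, here's a minimal sketch of what a stack backed by a linked list might look like in Java; the class and method names are made up for the example.

```java
public class LinkedStack {
    // A stack backed by a singly linked list: the head of the list is the top
    // of the stack, so both push and pop only ever touch one node.
    private static class Node {
        int data;
        Node next;
        Node(int data, Node next) { this.data = data; this.next = next; }
    }

    private Node top;   // null when the stack is empty

    void push(int value) {
        top = new Node(value, top);   // new node points at the old top, O(1)
    }

    int pop() {
        if (top == null) throw new IllegalStateException("stack is empty");
        int value = top.data;
        top = top.next;               // the node underneath becomes the new top, O(1)
        return value;
    }

    public static void main(String[] args) {
        LinkedStack stack = new LinkedStack();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop()); // 2
        System.out.println(stack.pop()); // 1
    }
}
```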
Now, that's not to say linked lists can't be useful in other ways. Take a queue on Spotify, for instance, where each song in the queue doesn't contain just an integer or a string, but an entire song with WAV data, a title, a length, etc. Then, when the track is done, it automatically points to the next song in the playlist. Another example could be a photo viewing software, where each node is an image, and the pointers simply point to the next photo in the list. Like I said, the main functionality of a linked list might be to back other data structures, but it also has tons of uses in other areas of expertise. In review, a linked list is a sequential access linear data structure in which every element is a separate object called a node, containing two parts: the data, and the reference or pointer which points to the next node that comes in the linked list. These pointers allow us to easily shift around each node, adding or removing it without having to move massive amounts of data like we would have to do in the case of an array or some other data structure. One thing I haven't mentioned yet about linked lists is that this pointer system does come with one big drawback. With a normal linked list, we can only ever go forward with our pointers, never backwards. From the computer's eyes, once we follow a pointer to a certain node's location in memory, there's no way to go back or undo to the previous one. Much like a shark, we can and only will ever go forward. This problem is fixed, however, through the use of a data structure known as the doubly linked list, which coincidentally is the subject of the next segment in this video. Small world. All jokes aside, next up we're going to be covering the doubly linked list, so strap into your seats. A doubly linked list is almost exactly the same as a linked list, that is to say, it's a sequential access data structure which stores data in the form of nodes, except with doubly linked lists, there's one small difference. With the doubly linked list, we're able to traverse both forwards, to the next node in the list, and backwards, to the previous node in our list, again using pointers. How? Well, let me explain. With regular linked lists, each element was a node composed of both a data section and then a pointer which would take you to the memory location of the next node in the list. Using this, we were able to traverse a linked list easily to search for information, access data within a linked list, or add and remove nodes from within the list with ease. A doubly linked list simply builds upon this by also having a pointer which points to the previous node's location in memory. It's an undo button of sorts, which allows you to fluidly go through the linked list in either direction, instead of just limiting yourself to going in one direction. This is great, since it allows us to jump around the linked list and have a lot more flexibility when it comes to modifying information. Now, because this can get pretty confusing very quickly, I want to do a visualization. Before that though, let's use some lingo to help consolidate the terminology that I'll be using. When I refer to a node's next, I'm referring to that particular node's pointer which points to the next object in the list, whether that be another node or a null value.
Similarly, when I refer to a node's previous, abbreviated to prev in the visuals, I'm talking about its pointer which points to the previous object in the linked list, again, either another node or a null value. Doing this just helps keep things a little bit more simple, because as you'll see, having both a previous and a next pointer makes things a little bit more complicated. Now, on to what these doubly linked lists actually look like. Just like with a regular linked list, every doubly linked list is going to start with a head node. Since it's the first node in the list, both its previous pointer and its next pointer will point towards a null value. This is, of course, because it can't point to information which isn't there. We can fix this, though, by adding another node. Once we do, you can see that a few things have changed: the head node's next pointer now points towards this new node instead of null, the new node's previous pointer now points to the head node, and the new node's next pointer now points towards a null value. As you can see, it's a little bit more complicated than adding a node to a regular linked list, in which we only had to worry about one set of pointers as opposed to two, but still manageable. Let's add one more node for demonstration purposes, and you'll see the same thing happens again. The second node now points to this new third node, and this third node gets initialized with pointers which point both backwards to the previous second node and also forwards to some null value. You can see that with any two nodes, the next pointer of the first and the previous pointer of the second come together to form sort of a cyclical connection which ties the two nodes together. Then, at the head and tail of the list, there's a connection which ties them both towards some null value. Most of this should be common sense, but it's still good to go over. Doubly linked lists are a little scary at first, but once you break them down, you realize that they're just an evolved form of linked lists. Up next, we're going to be talking about adding and removing nodes from a doubly linked list, again using the three different methods that we talked about in the previous segment: adding and removing from the head, the middle, and the tail. I've also set up a basic doubly linked list with three nodes containing three strings to help us out with this next segment. Finally, just like the last segment, the green on top of a node signifies that it's the head node, and the red signifies it's the tail node. Alright, with all that being said, let's get into it. Now, adding to the head of a doubly linked list is quite simple. The first step is to take our new node that we want to insert and set its previous to null. Second, we set the new node's next to point towards the current head node of the linked list. Then all we do is set the current head's previous to point towards this new node instead of a null value, and we're set to go. Doing this rearranges the pointers in such a way that the new node we want to add becomes the head of the doubly linked list. Pretty simple.
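As a minimal sketch of that structure in Java, here's a hypothetical DNode class with an addToHead helper; the node names match the example list from the visuals, and the helper name is made up for illustration.

```java
public class DoublyLinkedDemo {
    static class DNode {
        String data;
        DNode prev;   // pointer to the previous node, null at the head
        DNode next;   // pointer to the next node, null at the tail
        DNode(String data) { this.data = data; }
    }

    // Adding a new head: the new node's prev is null, its next is the old head,
    // and the old head's prev is pointed back at the new node.
    static DNode addToHead(DNode head, String value) {
        DNode newNode = new DNode(value);
        newNode.next = head;
        if (head != null) {
            head.prev = newNode;
        }
        return newNode;
    }

    public static void main(String[] args) {
        DNode head = addToHead(null, "Carl");
        head = addToHead(head, "Ben");
        head = addToHead(head, "Adam");   // Adam <-> Ben <-> Carl
        head = addToHead(head, "Abe");    // Abe becomes the new head

        for (DNode n = head; n != null; n = n.next) System.out.println(n.data);
    }
}
```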
So if we wanted a new node with the string Abe to be the head of our doubly linked list, we would just follow the steps. First, we set the Abe node's next to point towards the Adam node, then we set its previous to point towards the null value. And finally, by setting the Adam node's previous to point towards the Abe node, we have successfully added that node to the head of the list, making it the head node. Removing a node from the head is even simpler: we first set the head node's next to point towards a null value, and then, by setting the second node's previous to also point towards a null value, we can easily remove the head node from the list. This is how it would look in our example: we'd first set the Abe node's next to null, then we would set the Adam node's previous to also be null, and then we're done. Now, because the Abe node doesn't have anything to point towards, nor does it have anything pointing towards it, it will be deleted from the list. The Adam node will regain its position as the head node, and the program will carry on. Up next is inserting and deleting from the middle of a doubly linked list. This is where things get really tricky, so make sure you're paying attention. To insert a node into the middle of a doubly linked list, we follow a three step process. The first step is to take the node we want to add and set its previous to point towards the node before the position we want to insert at. Then we take that same node, the one we want to add, and set its next pointer to point towards the node after the position we want to insert at. Finally, we set the next on the node before the position we're inserting at, and the previous on the node after the position we're inserting at, to both point towards this new node. That is a lot of words, some of which might not even make sense to you, so let's open up the space on our screen real quick and do an example with our list. Let's take a new node with the string Chris and add it between the Adam node and the Ben node by simply following our steps. First, we want to set the new node's previous to point towards the node before the position we want to insert at. This means setting the Chris node's previous to point towards the Adam node. Simple enough. Second, we set the new node's next to point towards the node after the position we want to insert at. This entails setting the Chris node's next to point towards Ben. So far, so good. Then the last step is to set the next on the node before the one we're inserting and the previous on the node after the one we're inserting to both point towards this new node. So in this case, the Adam node's next and the Ben node's previous both get set to the new Chris node. This completes the addition into the list. Obviously, this is a little bit more complicated than inserting or deleting from the head, but as you can see, we've now inserted this new node into its rightful place. It may be hard to see, since the pointers are a little bit messy, but the flow of the list has remained constant without any breaks in the pointers, which is the most important part. Removing a node from the middle of a doubly linked list is also a three step process. First, we set the next of the node before the one we want to remove to point towards the node after the one we want to remove. Then we set the previous of the node after the one we want to remove to point towards the node before the one we want to remove. The final step is to set both pointers of the node we want to remove to point towards a null value. Again, a little complicated, so let's take it to our example list and try it out. Let's delete the Chris node, just for the sake of keeping things consistent.
Following the steps that we've laid out, we would first set the next of the node before the one we want to remove to point towards the node after the one we want to remove. So in this case, we set the Adam node's next to point towards the Ben node, essentially skipping over the Chris node. Then we set the previous of the node after the one we want to remove to point towards the node before the one we want to remove, so we set the Ben node's previous to point towards the Adam node, again skipping over the Chris node. Finally, we have to set the Chris node's pointers to point towards null values. Now that we've done this, because no nodes point towards it, and it doesn't point towards any nodes, the Chris node gets deleted from the doubly linked list, and the list is back to its original form with no breaks or errors. Adding a node to the tail of a doubly linked list is also a three step process. Step one entails setting the next pointer of the current tail to point towards the new node we want to become the tail. Step two is setting the previous of the new node that we're adding to point towards the current tail of the list. And step three is making the new node's next point towards a null value. Again, let's do an example where we add a new node containing the string Ethan to the tail of the doubly linked list. Following our steps, we first set the next pointer of the current tail, the Carl node, to point towards the Ethan node. Then we make the Ethan node's previous point towards Carl, and its next point towards a null value, and we're all set. Ethan has been successfully added as the new tail of the list in three quick and easy steps. Removing from the tail of the list is even easier and only requires two steps: we first set the tail node's previous to point towards null, and then we set the second to last node's next to also point towards null. On our example list, it looks like this. We first set the Ethan node's previous to point towards null, then we set the Carl node's next to also point towards null. This isolates any pointers going to or from the Ethan node and deletes it from the list, making the Carl node the tail once again. Adding and removing information from a doubly linked list might sound like a hassle, but remember, we only have to use this pseudocode to program these functions once in whatever language we're implementing them in, and then we're able to reuse them an infinite amount of times.
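And here's a rough sketch of the middle insertion and removal in Java, again with a made-up DNode class; the neighbour re-pointing mirrors the three step processes we just walked through.

```java
public class DoublyLinkedEdits {
    static class DNode {
        String data;
        DNode prev, next;
        DNode(String data) { this.data = data; }
    }

    // Inserting newNode between left and right (assuming left.next == right):
    // wire the new node's prev and next first, then re-point left and right at it.
    static void insertBetween(DNode left, DNode right, String value) {
        DNode newNode = new DNode(value);
        newNode.prev = left;
        newNode.next = right;
        left.next = newNode;
        right.prev = newNode;
    }

    // Removing a middle node: its neighbours skip over it in both directions,
    // then its own pointers are nulled out so it's fully cut off.
    static void remove(DNode node) {
        if (node.prev != null) node.prev.next = node.next;
        if (node.next != null) node.next.prev = node.prev;
        node.prev = null;
        node.next = null;
    }

    public static void main(String[] args) {
        DNode adam = new DNode("Adam");
        DNode ben  = new DNode("Ben");
        adam.next = ben;
        ben.prev  = adam;

        insertBetween(adam, ben, "Chris");  // Adam <-> Chris <-> Ben
        remove(adam.next);                  // remove Chris: Adam <-> Ben again

        for (DNode n = adam; n != null; n = n.next) System.out.println(n.data);
    }
}
```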
Some examples of this could be browser caches, which allow you to go back and forth between webpages; the undo/redo functionality in many programs; applications which allow you to use an "open recent" feature. The list goes on and on. Basically, in any case where you need to store a list of objects with multiple attributes, a doubly linked list is going to be a safe bet to get it done. Linked lists, and their evolved form, the doubly linked list, are a great way to store information because of the adaptability of the nodal structure. Since we're not working with raw information like primitive types, and we're instead storing all information inside of a shell, it makes it a lot easier to move the information around. This, combined with the pointer system, allows for non-contiguous storage but still fast operating speeds, making these two data structures a staple in computer science. Up next, we'll be covering dictionaries, and with that, a little mini lesson on hash tables. This is the last of what I referred to at the beginning of the lecture as the intermediate data structures, and it is honestly, personally, one of my favorites. So let's not wait any longer and just jump into things. Before we get too far, though, we should probably clarify that when we say dictionary, we're not referencing that thick book you probably have lying around your house and haven't used in years. Actually, dictionaries are one of the most abstract data structures which exist in computer science, and they can be used for a variety of purposes. Another thing I'll clarify really quickly is that dictionaries are also sometimes called both maps and associative arrays by the computer science community. This has more to do with language specifics and preferences than any functional differences, since all of them work in almost identical ways, but for this series, we're going to be referring to them as dictionaries, just to keep things simple and organized. Now, a dictionary in computer science is, by definition, an abstract data structure which stores data in the form of key-value pairs. This means we have two main parts to each dictionary element: the key and the value. Each value within a dictionary has a special key associated with it, and together they create a pair which is then stored in the dictionary data structure as an element. Think of a key-value pair like a social security number. Each social security number is a key, which is then paired with a value, that value being an individual person. These social security number key-value pairs then come together to form a dictionary of every human being living in the United States. This is very different from many of the data structures we've talked about previously, because we index dictionaries using these keys instead of a numerical index. For example, with an array, we would index each element within the data structure according to a numerical value, which started at zero and ran the length of the array. With a dictionary, however, we index each element by using its key instead of some arbitrary integer, and obtain information through that index instead. So what exactly are these key-value pairs going to look like? Well, they can be just about anything.
The keys in a key-value pair can be any primitive data type that you can think of: we can have a dictionary which has integers as the keys, one with strings as the keys, one with doubles as the keys; there's really a lot of flexibility. As for the values, we have even more flexibility with those. We can have the values in our dictionary correspond to pretty much anything. Strings? Sure. Booleans? Of course. Another dictionary with its own set of key-value pairs? You can even do that too. This allows us to have tons of combinations in the way that we store our data, making dictionaries extremely powerful. As powerful as they are, though, there are two extremely important restrictions that we have to cover when it comes to dictionaries, and they are as follows: every key can only appear once within the dictionary, and each key can only have one value. Let's start with the first one. Each key has to be unique and cannot be replicated, duplicated, cloned, copied, or anything else that would cause there to be two keys of the same value in a dictionary. Think of it like this: when you create a key-value pair, the computer creates a little locked box in memory to store the value, then the computer spends days and weeks creating a customized, handcrafted, one-of-a-kind key that corresponds directly to that locked box. This key cannot be replicated in the slightest, and that's the same mindset you should use when working with dictionaries: no two keys can be the same. If you were to try and make a dictionary with two identical keys, you would be thrown an error by the computer. The second stipulation is that each key can only have one value. Think back to our custom key example: it wouldn't make sense for this one-of-a-kind key to be able to open multiple boxes. The same is true for our keys in computer science; they can only correspond to one value. Now, one rule that we don't have to follow is that there can be duplicates of values within a dictionary, meaning we can have two separate keys both point towards equivalent values. As long as the keys are different, the computer does not care. Alright, now that we know what dictionaries are and how they work, let's jump into the time complexity equations for a dictionary, or at least try to. Let me explain. Now, for a dictionary, the time complexity equations are a little bit funky. Previously, we talked about linked lists and how they are sometimes used as the backing of other data structures in memory. For example, a stack might implement the linked list structure as its way of storing information. Well, the most common way to initialize a dictionary is to implement it with what's known as a hash table. Hash tables are a little more advanced than the content I wanted to cover in this series, but they are a huge part of dictionaries, especially when it comes to time complexity, so we're going to give a little mini lesson on them today. Doing this will help you better understand the time complexity equations for a dictionary, as well as give you some background on advanced data structures. Alright, let's begin with a little thought experiment. Imagine this scenario for a second: say we have a dictionary which contains a few key-value pairs, like the one shown on your screen.
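Since the on-screen table isn't visible in a transcript, here is a hypothetical Python stand-in for a dictionary like the one being described; the integer keys and the name values are invented purely for illustration.

# A few key-value pairs: each integer key is paired with exactly one value.
example_dictionary = {0: "Adam", 2: "Ben", 4: "Carl", 7: "Dana", 9: "Ethan"}

# Indexing happens by key, not by position in the structure.
print(example_dictionary[4])      # -> "Carl"

# Re-using a key doesn't create a second entry; it overwrites the old value,
# which is how Python enforces "every key appears only once".
example_dictionary[4] = "Carla"

# Duplicate values, on the other hand, are perfectly fine, as long as the keys differ.
example_dictionary[7] = "Carla"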
Now, let's just assume that to store all this information in the computer's memory, we use an array like the one shown on your screen now, where the position of a cell in the table corresponds to the key in the key-value pair stored in our dictionary. So the zeroth cell would contain the key with the integer zero, which leads to a value; the fourth cell would contain the key with the integer four, which leads to another value; and so on. Any cell which we don't have a key for is empty, or what computer scientists refer to as nil. Why does this not work? Why can't we just use an array like this to back our dictionary in memory? It looks like actions such as accessing and searching would run in constant time, since all the keys correspond to their table positions in memory; all we'd have to do to get a value is simply reference the key at the index we're looking for. Well, this actually isn't the case, because it's based upon the simple assumption that the size of the array is not too large. Sure, this might work great for this dictionary, where we have 10 or so elements. But let's say instead of 10 elements evenly spaced out, we want to have a dictionary which has 10 values that are sporadically placed from one to a billion. We want to keep the keys as being in the same index position as their values, so we can know exactly where they are and reference them easily. But by my count, that is 999,999,990 nil values just taking up space in memory, and that is not good whatsoever. There's got to be a better way to do things, right? Well, this is where our hash tables come in. Hash tables are fundamentally a way to store this information in such a way that we're able to cut down the amount of nil values, while also allowing for the information to be stored in such a way that it is easily accessible. Basically, by using a hash table, we'd be able to store these 10 keys in our dictionary, which range from one to a billion, in a table with only 10 elements, while also being able to keep constant time. How is this possible? Well, through the use of a trick programmers use known as hash functions. A hash function will take all the keys for a given dictionary and strategically map them to certain index locations in an array, so that they can eventually be retrieved easily. Essentially, by giving a hash function both a key and a table, it can determine what index location to store that key at for easy retrieval later. These hash functions can be pretty much anything which takes in a value and returns an index location. The goal of a good hashing function is to take in a key, whatever that may be, and reliably place it somewhere in the table so that it can be accessed later by the computer. So with that being said, let's just jump into an example, because I know a lot of you might be confused. Let's say we have our dictionary which contains keys in the form of integers from one to a billion, increasing by a factor of 10. So the keys are 1, 10, 100, 1,000, 10,000, 100,000, and so on. A good hash function for this data set would be to take the key divided by itself, and then multiply the result by the number of digits in the key minus one. So to find out where to store the value corresponding to the 1 million key, we would take a million divided by itself, which yields one, and then multiply that by the number of digits in that integer minus one, in this case seven minus one, or six.
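Just to double-check that arithmetic, here is the toy hash function written out in Python. To be clear, this is only the example function described above, not a general-purpose hash, and it assumes the keys are the powers-of-ten values from our dictionary (so never zero).

def toy_hash(key):
    # key divided by itself, times (number of digits in the key minus one)
    return (key // key) * (len(str(key)) - 1)

print(toy_hash(1_000_000))        # 7 digits - 1 -> index 6
print(toy_hash(10))               # 2 digits - 1 -> index 1
print(toy_hash(1_000_000_000))    # 10 digits - 1 -> index 9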
This means we'd store the 1 million key at index location six. If we do this for every key in the dictionary, we can see that each key in the key-value pair is stored at some index from zero to nine. We've consolidated the 10 keys from one to a billion into 10 index slots instead of a billion. A pretty good improvement, in my opinion. You might think this is a little overcomplicated for storing 10 elements, but remember, sometimes we might be working with thousands of keys at a time, ranging in the billions, or even strings as keys, which is a whole other story. Now for the best part about these hash functions: let's say that we now want to get the value which is paired with the 1 million key. All we need to do is put 1 million back into our hash function, and it will tell us where the value tied to that key is stored within the table, in this case at the sixth index. Doing this allows us to pretty much instantaneously find where any key-value pair is stored within the table. All of this information may be pretty confusing, and if you still aren't 100% sure, I would recommend watching this part over again. However, in layman's terms, and for this series, all I want you to be able to comprehend is that dictionaries are built upon these hash tables, and the keys in our key-value pairs are stored in these hash tables at indexes which are determined by hash functions. If you can understand that, you've got a pretty good base for understanding hash tables. Now, the final thing I want to talk to you guys about in regard to hash tables, and the reason dictionary time complexity equations are so funky, has to do with one small problem: what happens when we run two different dictionary keys into a hash function, and the computer tells us to store them at the same index location? For example, let's say we have two dictionary keys with the strings Steven and Shawn, and when we run both of them through our hash function, the computer instructs us to store both of them at the ninth index location. This shouldn't be possible. How are we supposed to put both of them at index location nine? This little problem is what's known as a hash collision, and it can be solved one of two ways: either open addressing or closed addressing. With open addressing, we just put the key in some other index location separate from the one returned to us by the hash function. This is usually done by looking for the next nil value in the table, i.e., the closest location which contains no key. So with our example, the key Steven would get hashed to index location nine, mostly because it's a better name, and the inferior key Shawn would get stored at index location 10, because it's the closest open location available. This does make it harder to interact with the data in our dictionary later on, and it can lead to problems if we eventually have a value which hashes to index location 10, which is why computer scientists developed closed addressing. Closed addressing uses linked lists to chain together keys that result in the same hash value. So the keys Steven and Shawn would get stored together in a linked list at index location nine. The main drawback to this is that whenever we want to interact with the value stored within a key-value pair for either Steven or Shawn, we end up having to look through the linked list for the piece of data that we want.
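Here is a rough sketch of what closed addressing, or chaining, might look like in code. A couple of assumptions to flag: the class and method names are my own, and it uses Python's built-in hash() reduced modulo the table size, rather than the toy digits-based function from the earlier example.

class ChainedHashTable:
    def __init__(self, size=10):
        self.table = [[] for _ in range(size)]   # one chain (a plain list) per index

    def _index(self, key):
        return hash(key) % len(self.table)       # which slot this key belongs in

    def put(self, key, value):
        chain = self.table[self._index(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:               # keys must stay unique, so overwrite
                chain[i] = (key, value)
                return
        chain.append((key, value))                # collision or not, the pair joins this chain

    def get(self, key):
        chain = self.table[self._index(key)]
        for existing_key, value in chain:         # worst case: walk the whole chain
            if existing_key == key:
                return value
        raise KeyError(key)

hosts = ChainedHashTable()
hosts.put("Steven", "host")
hosts.put("Shawn", "co-host")
print(hosts.get("Shawn"))                         # -> "co-host", even if the two keys share a slot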
If there are 200 keys hashed to one index, that's 200 keys we might have to search through to find the one that we want, and that's not very good. And that concludes our mini lesson on hash tables. Now we can get back into dictionaries, and finally talk about their time complexity equations. Now, remember back to the segment on time complexity really quickly. Back then, I mentioned that for these time complexity equations, we generally measure a data structure based on its worst-case scenario. But when it comes to dictionaries, if we are to assume the worst-case scenario, things get a little outrageous. We basically end up assuming that our hash function makes it so that every key-value pair ends up at the same index, meaning that every one of our keys gets stored in one long linked list, assuming that closed addressing is being used. Then, worst-case scenario, we have to assume that every operation functions how it would for accessing, searching for, inserting, or deleting from a linked list, which, if you remember, is O(n) for all four. This is of course preposterous and would probably never happen with even a bare-minimum decent hash function, which is why, in addition to the worst-case scenario time complexity equations, I'll also go over the average time complexity equations for these four operations. Now, lucky for us, they're all going to be O(1). This has to do with those hash functions we talked about before. To access, search for, insert, or delete a key-value pair from our dictionary, all we need to do is run that key through our hash function, and it will tell us what index in the hash table to go to in order to perform that operation. There's no time wasted looking through preset indexes, because we, the programmers, generate the indexes ourselves. And that's the power of the hash table in the flesh. Overall, dictionaries are a very useful data structure when it comes to computer science, for a few reasons. They differ from the rest in quite a few big ways: the fact that they don't use a numerical index to retrieve values, but rather a preset key; the notion that those keys in the key-value pairs can be a range of types, from strings to integers to chars and so on; and the implementation of the hash table, which allows for super quick utilization that is sure to come in handy in any program that you write. In conclusion, and overall, the dictionary is just a solid data structure that can be used to fill in the gaps in your data structures knowledge. And with that, we have now completed all of the intermediate data structures, if you will. Next, we'll be moving on to trees and tree-based data structures. This is the point in the series where we will be dropping our discussion on time complexity equations, because if you thought that the dictionary time complexity equations were complicated with hash tables and hash functions, trees and their many variants only further complicate things, to the point where I no longer feel comfortable including them in an introductory course. We might mention that certain tree-based data structures are good for accessing or searching, but as for discussions on all four time complexity equations, dictionaries are where we leave things. With that being said, let's move on to the final few data structures as we round out this introduction to data structures series. Next up on our list of data structures is the tree.
No, not those big green things that inhabit the outside world, which you can climb on and get fruit from; rather, the computer science data structure, which is way more fun in my opinion. Now, before getting into trees, we need to talk about data structures in general. Every data structure we've covered up until this point has stored data linearly. Arrays, stacks, linked lists: all of these had a definitive start and end to their data, and you could point them out easily. Even the dictionary could be laid out in such a way as to be represented linearly. Everything was so nice and neat and easy to visualize. Trees, on the other hand, store data hierarchically as opposed to linearly. What does that even mean? Well, besides being an extremely hard word to pronounce, it's an alternative way to store data that we're going to be using for the remainder of this series. It's a little hard to visualize, so I'm going to start with an example. The most common real-world example of hierarchical data would be a family tree. Each person would be an element of the family tree, and connections wouldn't be made in a simple linear fashion; rather, connections would be more abstract and can lead to multiple paths or branches, instead of just a single data point. Each generation is ordered on a hierarchy, which is where the name comes from. And as you can see, there's no definitive end to the family tree. Sure, there are the ones at the bottom, which you could argue are the ends of the family tree, but which one of them is the definitive end? There is none. Another example could be a file structure system on your computer. You'd have a base folder, such as your desktop, and then inside of that, you might have multiple other folders for different types of information. And then inside of those, you might have more folders representing more types of information, and then finally, you would have your documents. Again, it sets up a network of those files on different levels, just like how we had with the family tree. Trees and the tree's many variants are all dependent on storing data hierarchically, which of course finally begs the question of what exactly a tree is. Well, a tree is an abstract data structure which consists of a series of linked nodes connected together to form a hierarchical representation of data. The actual information or data is stored in these nodes, and the collection of nodes, along with the connections between them, is what's known as the tree. This is sort of like a linked list, only instead of each node pointing to just one location, it has the option of pointing towards multiple, and also branching off on its own or pointing to no other nodes. Each of the nodes in a tree can also be called a vertex, and the connections between vertices, which link two nodes together, are called edges. Now, the free-flowing structure of a tree lends itself to a lot of different configurations, and with that comes a plethora of terminology about certain nodes, vertices, and just the structure of a tree in general. So what we're now going to do is go over a lot of the terms associated with specific nodes based on where they are on the tree and how they're connected to other nodes, while also visualizing the general structure of a tree at the same time. Okay, let's start. Let's begin with the things we've already talked about.
A vertex is a certain node in a tree, and an edge is a connection between nodes. The first new thing is that every tree starts with what's known as a root node. This is always going to be the topmost node of a tree. Let's add a node with the integer 50 to serve as our root node. Next up, let's connect two vertices to our root node using two edges. These two nodes we just added are what's known as child nodes, since they are both connected to the node containing the integer 50. Child nodes can thus be defined as a certain node which has an edge connecting it to another node one level above itself. Vice versa, the root node, the vertex containing the integer 50, is now what's called a parent node to these two child nodes. Thus, we can define a parent node as any node which has one or more child nodes. Think back to our family tree: if we were using people instead of integers, it makes perfect sense that the nodes directly connected to each other have some sort of familial relationship. Let's continue on by adding two child nodes to the node containing the integer 30. When we do this, the 30 node becomes a parent node, and the two nodes we've just added become child nodes. We have now branched off from the 20 node completely, in that the two child nodes containing the integers 10 and 15, respectively, share no direct association with it. Speaking of the 20 node, since it does not have any children, it is what's known as a leaf node, or a node in a tree which doesn't have any child nodes. In this context, the 10 and 15 nodes would also be leaf nodes. Finally, let's add one more node as a child of the 10 node, and there's our tree in all its glory. As a quick review: the 50 node is the root node; the 30 and 20 nodes are children of that root node, and the root node is thus the parent of the 30 and 20 nodes; the 10 and 15 nodes are children of the 30 node, and the 30 node is thus a parent node to both the 10 and 15 nodes; the 5 node is a child of the 10 node, and the 10 node is a parent to the 5 node; and finally, the 5, 15, and 20 nodes are all leaf nodes because they do not have any children. As you can see, one node or vertex on our tree can have many different titles depending on where it is in the tree and what other nodes are connected to it. For example, the 30 node is both a parent node to the 10 and 15 nodes, but also a child of the 50 node. The 15 node is both a child of the 30 node, and also a leaf node, as it has no children. This terminology really comes in handy when we start talking about trees which contain thousands of vertices and edges, and the data becomes very complicated to order and maintain. Moving on, the next two pieces of terminology I want to go over with you guys are the height and depth of a tree. The height is a property of a particular tree in and of itself, and the depth is a property of each individual node contained within a tree. Let's start with the height. The height of a tree is the number of edges on the longest possible path down towards a leaf. So in our tree example, since the longest path in our tree which leads to a leaf is from the 50 node to the 5 node, and there are three edges in that path, the height of the tree would be three. The depth of a certain node is the number of edges required to get from that particular node to the root node. For example, let's take our 30 node.
Since there's only one edge connecting it on the way up to the root node, its depth would be one. For the 15 node, however, since there are two edges which separate it from the root node, the 15 node would have a depth of two. That's pretty much all you need to know about the terminology associated with trees in computer science. As a review, we have the height and depth, obviously; then we have vertices, edges, root nodes, parent nodes, and leaf nodes. Now, there's probably something that's been scratching at the edge of your mind for quite a while now, and that's: why the heck are these called trees? They look nothing like trees. Regular trees are not upside down like this data structure would lead you to believe. So who named trees, and why? Well, there's actually a simple answer to this. The tree is said to have been invented during the 1960s by two Russian inventors, and the last time that a computer scientist got up from their chair and went outside to actually see a tree is rumored to have happened back in 1954. Ever since then, it's been grinding away at screens, watching hours upon hours of their favorite YouTube channel, so please forgive them for their confusion when it came to naming conventions; they must have just forgotten trees aren't upside down. Now, regular trees, like the one we just created, are great for storing hierarchical data, but their power can really be heightened when you start messing around with how the actual data is stored within them. By imposing rules and restrictions on what type of data is stored within a tree, as well as where it is stored, we can effectively use the tree data structure to its full potential. I could talk about the different types of trees for a long time, so long that many of them are full segments in this lecture, but for now, I just want to cover one popular variant. This will be a good introduction to how trees can vary without simply diving into a huge deviation from what we already know. More specifically, the tree variant I want to talk to you guys about is the binary search tree. A binary search tree is a simple variation on the standard tree which has three restrictions put on it to help organize the data. The first is that a node can have at most two children. This just helps make it easier to search throughout the tree, as we don't have to spend time looking through, say, eight children apiece for a particular node; keeping it limited to two helps us do this. The second restriction, or property, is that for any given parent node, the child to the left has a value less than or equal to itself, and the child to the right has a value greater than or equal to itself. This might seem weird, but it comes with certain advantages and disadvantages over using normal trees, which we'll get to in a bit. The final restriction put on binary search trees is that no two nodes can contain the same value, and this is just to prevent weird things from happening when we begin searching through the tree. Now, how does imposing these restrictions on a tree actually help us? Well, the biggest advantage of binary search trees is that we're able to search through them in logarithmic time. Because there is a natural order to the way that the data is stored, it makes it extremely easy to search for a given value.
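As a preview of the search rule described next, here is a minimal binary search tree sketch in Python. The node layout, function name, and example values are my own, and insertion and deletion are left out to keep it short.

class BSTNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left      # subtree holding values less than or equal to this one
        self.right = right    # subtree holding values greater than or equal to this one

def bst_search(node, target):
    while node is not None:
        if target == node.value:
            return node                # found it
        # go left for smaller values, right for greater ones
        node = node.left if target < node.value else node.right
    return None                        # ran off the tree: the value isn't stored

# A small, valid binary search tree (not the terminology example from earlier,
# which wasn't built with these ordering rules in mind).
root = BSTNode(50,
               left=BSTNode(30, BSTNode(10), BSTNode(45)),
               right=BSTNode(70, BSTNode(60), BSTNode(90)))
print(bst_search(root, 45) is not None)   # True, reached in just a few comparisons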
Logarithmic time, if you remember back to the segment on time complexity, is the equation in which we get more bang for our buck the greater the number of elements or nodes we have in our data structure. It works like this: all we have to do when searching is tell the computer to go left if the value we're searching for is less than the current node, and right if it's greater than the current node. We can then wash, rinse, and repeat this strategy until we find our desired node. This makes binary search trees really popular for storing large quantities of data that need to be easily searchable. Of course, this also translates to inserting, deleting, and accessing elements within the data structure, but for the most part, searching efficiency is what really sets the binary search tree apart from the rest. Stepping away now from binary search trees and into trees in general, let's talk about common uses for them in computer science. The most common uses for trees in computer science include storing data with a naturally hierarchical structure. These are like the examples we touched upon at the beginning of this segment: data sets such as file structure systems, family trees, a company's corporate structure. All of these could be stored and implemented using the tree data structure very easily. These are all general examples, though. As I mentioned before, when we put restrictions on trees, like in the case of the binary search tree, we can expand their uses even further. A tree's base structure is incredibly useful, and it can be modified in so many ways, which only adds to its functionality. One of these ways is through what's known as a trie, and it's the next data structure on our agenda, so stay tuned for that. Next up, we'll be talking about another one of the special trees with restrictions. We just finished discussing the binary search tree, and with that, we mentioned how trees usually become even more useful once you start setting restrictions on how and where the data can be stored within them. Well, a trie is another one of these special trees, which has special restrictions put in place to help store its data in an effective manner. This data structure is often overlooked since it's only used in specific situations, but without the use of tries within those specific situations, some important features of your computing life would be a whole lot harder. So, we get it Steven, a trie is a special tree with restrictions. But what are those restrictions, and how can they help us? Well, let's start with the basics. A trie is a tree-like data structure whose nodes store letters of an alphabet in the form of characters. We can carefully construct this tree of characters in such a way which allows us to quickly retrieve words in the form of strings by traversing down a certain path of the trie. These are also sometimes called digital trees or prefix trees by the computer science community; we'll be calling them tries for today's lecture. Tries are used in the retrieval of data in the form of words, which is actually where the name trie comes from, as it's smack dab in the middle of the word retrieval. Essentially, in layman's terms, we use tries to retrieve words extremely fast by traversing down a tree of stored characters. This is hard to explain without actually showing you, so let's do an example of what a trie might actually look like.
Now, just like with a tree, every trie is going to start with a root node, only in our case, that root node will be empty, holding either some null value or a blank string. Also stored within this node will be a set of references, all stored within an array. These references are set to null at the beginning of the trie's existence, but can be slowly filled with references to other nodes. For example, let's say we were creating a trie to store words that start with the letter D. The root node would contain an array which contains a reference to a node containing the character D, signifying the start of our word. Now, imagine for a second that this D node also has an array containing references, only this time, it contains references that point towards all characters in the English alphabet which, combined with D, serve as the first two characters of a word in the English dictionary. So "da" serves as the first two characters in a lot of English words, such as dab, or dad, or damn if you really want to go that far, and so the array contained within the D node would hold a reference to an A node. Now, since there are no English words which start with "db" that I know of, we wouldn't have a reference to a B node, since we know we will never have to retrieve a word from our trie which starts with "db". We continue this process for all letters in the alphabet, only including references to characters which, together with D, serve as the first two characters of an English word. But that is a lot to visualize, so for now, let's just put up on the screen two nodes that this first D node would point towards: more specifically, the node we've already added, the A node, as well as the E node. Now, to continue building our trie, we would simply repeat this process for all of the nodes that this D node points towards. So we'd store pointers in the A node which point towards characters in the English alphabet that, combined with "da", serve as the first three characters of a word in the English dictionary. So you'd have a B, a D, a Y, and many, many more nodes as well, but we're just going to stick with those three for now. For E, you'd have pointers which point towards nodes containing characters like N and W. Obviously, there'd be more nodes than the ones shown on your screen right now, for all levels of nodes, but what we've created here is a very simple trie. As you can see, it's exactly like I said it was in the beginning: a tree-like data structure which stores characters that can be used to make words by traversing down paths. As we travel down different paths of our tree, we can make different words: dab, dad, and day down the one path, and den and dew down the other. By following the nodes downwards, we can easily build strings of words, which, you'll learn towards the end of the episode, can be extremely useful. We can even take this process further by having one path contain multiple strings. For example, if we isolate the den path, we could continue the process to go on to words like dense or Denver or dent; the list goes on and on. We wouldn't have to just stop at the word den, since den is also a prefix for numerous other words, and that's what makes storing data in character format so useful. One path down the trie can represent multiple words depending on where you stop along the process. This does provide a bit of a problem for computer scientists, though: how do we know when a word has ended? Let's take the Denver path as an example.
If we wanted to retrieve the word den from this trie, how would we know to stop at the N node and not continue along to Denver? Well, there's actually a pretty simple fix to this, and it's known as flagging. We simply mark the end of the word by having it also point towards a flag, to let the computer know that the end of a word has occurred. So in our Denver example, not only would the N node's array contain pointers to whichever characters follow it, it would also contain a pointer to a node with a flag to tell the computer to stop there. In this case, I've just chosen a period as the flag. This way, we can control tries in such a way where each word is marked by an ending point, and the different words that may branch off from that prefix can continue to use that word as a prefix in whichever retrieval program ends up getting used. Okay, so now it's time to talk about the general use cases of a trie. If you remember back to the beginning of this segment, I said that the use cases for a trie were limited, but extremely effective, and now we're going to talk about what that statement means. Now, have you ever used the autocomplete feature on your iPhone, or spellcheck on a Google Doc? Because if so, you have already experienced the overwhelming power of tries, as they are used for both of these extremely useful features. This mainly has to do with the fact that big programs like iOS or Google Docs are not just storing a trie containing a few words, or even all the words starting with a certain letter like we tried to replicate; they're storing the entire English dictionary. Storing the entire dictionary in a trie seems like a tall task, but it can and has been done. Each node would simply have up to 26 nodes connected to it, and those 26 would have as many nodes connected to them as needed, and so on. And so now I can hear you asking in the comments: how does this help us with autocomplete and spellcheck? Well, let's say you're typing out a word. For the sake of this video, let's say it's the word subscribe. You start with the S. At this point, the computer has already eliminated 95% of the words that you could be typing. We know it's not going to be a word which starts with N, or A, or O, or H, or L, etc., etc. Then you type the letter U, and bam, another 95% of possible options get deleted. With each character you type, you eliminate millions of possible words that you could be working towards. Using this fact, as well as popularity data, which could also be stored within the node, the computer can start making educated guesses using that information, as well as context from previous words, to make a suggestion as to what word you're trying to type out. The AI that works this feature is immensely complicated, but it is backed by the trie data structure, and it helps autocomplete work easily on your phone. As for spellcheck, well, when you spell a word wrong, a spell-checking algorithm can use the root of your word and compare it against trie data to make an educated guess as to what you were trying to spell. Slightly misspelled words will usually have a similar path from the root of the trie. If you accidentally type something like "chocalotta" instead of "chocolate", the computer can take the first few characters of the word you typed and see where you may have gone wrong.
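Here is a minimal sketch of a trie in Python, with the end-of-word flag idea from earlier and a simple prefix lookup of the kind autocomplete-style features lean on. The class names are mine, a dictionary of children stands in for the array of references, and a boolean marker plays the role of the period flag from the example.

class TrieNode:
    def __init__(self):
        self.children = {}             # character -> TrieNode
        self.is_end_of_word = False    # the "flag" marking that a complete word stops here

class Trie:
    def __init__(self):
        self.root = TrieNode()         # empty root, like the blank string described above

    def insert(self, word):
        node = self.root
        for character in word:
            node = node.children.setdefault(character, TrieNode())
        node.is_end_of_word = True

    def words_with_prefix(self, prefix):
        # walk down the prefix, then collect every flagged word underneath it
        node = self.root
        for character in prefix:
            if character not in node.children:
                return []
            node = node.children[character]
        results = []
        def collect(current, spelled_so_far):
            if current.is_end_of_word:
                results.append(spelled_so_far)
            for character, child in current.children.items():
                collect(child, spelled_so_far + character)
        collect(node, prefix)
        return results

t = Trie()
for word in ["dab", "dad", "day", "den", "dense", "denver", "dew"]:
    t.insert(word)
print(t.words_with_prefix("den"))      # -> ['den', 'dense', 'denver']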
Basically, by comparing the misspelled word to certain paths of the dictionary trie, it can pretty accurately detect which word you were meaning to type and correct you accordingly. So there you have it, the trie. Like I said, it is often underrated in the computer science world since it can only be used in certain situations, but as I hope this segment has shown you, those situations are still important nonetheless. We are nearing the end of our introduction to data structures series now; only two more left before we part ways. So stay tuned, because now we're going to talk about heaps. Now, back in our segment on trees, we talked about the binary search tree, a special type of tree which has a few properties: each node can have at most two children; the child to the left must have a value less than the parent node, and the child to the right must have a value greater than the parent node; and no two nodes can contain the same value. Well, a heap is sort of like this, but a little bit different. More specifically, by definition, a heap is a special tree where all parent nodes compare to their children nodes in some specific way, by being either more extreme or less extreme, i.e., greater than or less than. This specific way determines where the data is stored, and is usually dependent on the root node's value. There are two different methodologies generally used in computer science, and they are known as min heaps and max heaps. In a min heap, the value at the root node of the tree must be the minimum amongst all of its children, and this fact must be recursively the same for any and all parent nodes contained within the heap. So each parent node must have a value lower than all of its children nodes. As you can see from our example on the screen now, 10 is the root node and also the minimum value in the tree. In addition to this fact, if we pick any parent node on the tree and look at its children, and their children, and so on, the parent node will have the lowest value of them all. Take 14, for example: its value is less than 26, 31, 44, 35, and 33. This must be the case for every single subtree which is contained within the heap. Max heaps, on the other hand, are the exact opposite. In a max heap, the value at the root node of the tree must be the maximum amongst all of its children, and this fact must be recursively the same for any and all parent nodes contained within the heap. If you take a look at the example max heap we have on the screen now, again, you'll see that this is the case: 44 is the root node and also the largest value within the heap. If you take a look at, say, the subtree which is parented by the 35 node, you'll see that 35 is the maximum value amongst all nodes in that subtree, greater than both 19 and 27. When we store data like this in a heap, whether that be a min heap or a max heap, it makes the structure extremely easy to insert into or remove from. This lends itself to a lot of useful implementations within computer science. Now, to show you this, I'd like to build an example max heap, the type which has the greatest integer stored in the root node. For the sake of keeping things simple, let's pull up an array of seven elements with integers ranging from zero to 100, and simply convert it into a heap one element at a time. This can be done in a few easy steps. Step one is that we add the first integer in as the root node.
So 70 would get inserted into our tree as the root node. Then we add another node to the tree in the bottom-leftmost position available, so we would first insert a node as the child of the root node, to the left. For our heap, this means adding the integer 4 as a child of the 70 node. The final step is to recursively go up the heap and swap nodes if necessary. Now, when we say if necessary, we mean that if the node we just added is more extreme, either greater than or less than the node above it, depending on the type of heap that we're trying to create, we swap them to maintain order amongst the heap. Since we're building a max heap, and 70 is greater than 4, no swaps are necessary in this case. Now we just simply repeat steps two and three until we've built our heap. So next, we want to add the integer 90, and since the leftmost slot on the tree is already taken by our 4 node, the right slot ends up being the next available location on our tree, so we insert it there. And then, since 90 is greater than 70, we actually end up swapping the 90 and 70 nodes. Doing this helps keep the max heap property intact, where every parent node is greater than the nodes on the level below it. Let's keep going. Next, we add 45 to the heap. Now, since we've run out of space on the second level, we move on to the third level and add 45 as a child of the 4 node. Then we compare it to 4, which it is greater than, so we swap the nodes. Now we have to compare this node to the node above it again, because remember, we recursively go up the tree until we reach the root or don't need to swap. Now, 45 is not greater than 90, so it stays put where it's at. Next up, we add 23 as another child of the 45 node, only this time on the right side. We compare, and since it's not greater than 45, we keep it as is. Moving on, we next insert 76 into the tree as a child of the 70 node. Then we compare and swap the 76 and 70 nodes, as 76 is indeed greater than 70. We then compare 76 and 90, and since 90 is greater than 76, we keep the 76 node in place for now. The next node we add is the 100 node. We compare it to the 76 node and see that it's greater, so it gets swapped, and then we compare it again, this time to the 90 node, and since it's also greater than that integer, it gets swapped yet again to become the root node, signifying it is the greatest node in the heap. As you can see, it's a pretty simple concept: you add a node and then keep trying to swap up until the node you've added is in its rightful place. Deleting from a heap is also pretty simple, at least in our case. This is because the type of deletion I want to talk about is just the process of removing the root node from the heap, and you'll see why later on. To delete the root node from a heap, you follow a three-step process. Step one is actually removing the root node from our heap, so in our case, we delete the 100 node. What you do with it is up to the programmer; you can return it back to the user, store it somewhere else, print it out, etc., etc. Either way, step two is then replacing it with the node furthest to the right on the bottom level; in this case, it would be the 76 node. Finally, for step three, we do what's known as a heapify to fix up the heap. We start with the root node and compare it to its children to see if we need to swap any values. So for the 76 node, we compare it to its two child nodes, and since 76 is less than 90, we end up swapping those two.
Then we wash, rinse, and repeat for every subtree that we have. So on the right side, we swapped 90 with 76, but since 76 remains the greatest integer on that particular subtree, it stays in the same spot. On the left side, we didn't change anything, and 45 is still the greatest integer amongst the three nodes in that subtree, so no swapping is necessary on that branch of the heap. And so we've completed our heapify, and all heap properties are still intact. That is inserting and deleting nodes in a nutshell. Now let's talk about how we can use this to our advantage. Heaps are most commonly used in the implementation of heapsort. Heapsort is a sorting algorithm which takes in a list of elements, builds them into a min or max heap, and then removes the root node continuously to make a sorted list. Because heaps always start with the minimum or maximum value contained within them at the root node, we're able to simply remove the root node over and over again, re-heapifying the data structure after every pass, and after every element has been removed, we are left with a sorted list. On your screen, you'll see we have an unsorted list on the left. In the middle, we've inserted all of those integers, and in doing so created a max heap. And then finally, on the right, we have continuously removed those elements from the root node into a now sorted list. Heapsort is a really cool algorithm, and it will be part of our upcoming series on sorting algorithms, which is kicking off very soon, so if you're interested in that, make sure you subscribe to our channel so that you don't miss it; a link will be in the description below. Another extremely popular use of heaps is through the implementation of priority queues. Priority queues are an advanced data structure which your computer uses to designate tasks and assign computing power based on how urgent a certain matter is. Think of it like a line at the hospital. You wouldn't want your line to follow a regular queue methodology, which implements first in, first out, since then you could have patients with extremely urgent matters, like heart attacks, waiting behind people coming in for a routine checkup. In the same way, you wouldn't want your computer to update an application before it finishes rendering a video; otherwise your entire progress would be lost. Priority queues take care of all the task scheduling done by your computer, and the heap data structure is used as the backing system for it. And with that ends our discussion on heaps. To review, they are a special tree in which every parent node is more extreme, either greater than or less than, all of the nodes on the levels below it. Next up is, unfortunately, the last segment in the series, on yet another node-based data structure: the graph. The graph is arguably the most dynamic data structure in our introduction to data structures series, and an absolute banger to go out on, so let's just hop right into it. Before we get into the nitty-gritty details, I first want to do a short little exercise. Visualize for a second a few of your favorite places to eat on a map around your town. For me personally, it'd be places like Five Guys, Chick-fil-A, Panera, Wawa, and Domino's. Now imagine for a second that you are famished, absolutely starving, and so your plan is obviously to drive around to each of your favorite places to eat and order an ungodly amount of food from each location.
Each food location has a few possible paths going to and from it, and so we add those to the map as well. Now you can see that what we have looks like a network of delicious foods; we can start anywhere, and all we have to do is make sure to hit all five. You may not know it, but what we've essentially done here is set up a simple graph. Basically, graphs are composed of pieces of information, like the restaurants, and the paths that run between them. Of course, this is just speaking generally; by definition, graphs are a nonlinear data structure consisting of nodes and edges. There is a finite set of these nodes, or vertices, which are connected by edges. Nodes and edges should be familiar to you if you watched the segment on trees. The big difference between trees and graphs, however, is that with a tree, we had a specific starting point. Sure, there were multiple paths down the tree that branched off from the initial starting point, but you always had to begin at the root node. In contrast, with a graph there is no specified starting point; we can begin from any node and traverse to any node, just like how in our food example we are able to start at any restaurant. Graphs are a huge concept and escape the bounds of computer science, often being used in many places you wouldn't even think of. But before we get into anything crazy, like the difference between a directed and an undirected graph, or cyclic versus acyclic graphs, let's get down the basics. Now, every graph is composed of these nodes, or vertices, and the edges that connect them. Let's pull up a sample graph really quickly and talk about it. We represent graphs visually like this a lot, because it makes them way easier to comprehend, but notation-wise, a graph actually looks like this, which is much harder to comprehend. So let's break it down. First, we have the vertex set, which contains a comma-separated list of all vertices within the graph. That's the simple part: each comma-separated value simply represents a node within the graph. Then we have the edge set, which is a little bit more complicated. Each element of the edge set is an ordered pair which describes a relationship between nodes. For example, the first one describes a relationship between the 6 and 4 nodes, the fifth indicates a relationship between the 5 and 2 nodes, and so on. Using these two sets, we're able to visualize a graph pretty easily by laying down both the information and the connections that fall between them. One final thing I want to mention about the basics of graphs is about the relationships that occur between two nodes. If we have an edge which connects two different vertices, they are known as adjacent to one another. So, for example, the 5 node would be adjacent to the 4, 2, and 1 nodes. Okay, now that we have the basics down, I want to jump into the different attributes that a particular graph might have, starting with directed versus undirected. An undirected graph is one in which the direction you traverse the nodes isn't important. This is most prominently indicated by a lack of arrows pointing to specific nodes, such as was the case with our first example graph, or even the food example from the beginning of the episode. We can hop between nodes, or even go back and forth between them, without a problem.
A good way to visualize undirected graphs is as a network of friends on Facebook, where each edge indicates a friendship, if you will. Because when somebody accepts your friend request on Facebook, you are both added to each other's friends lists, the friendship goes both ways, and direction is unimportant. In contrast, a directed graph is one in which the direction you traverse the nodes is important. This is usually indicated by arrows representing which nodes a certain node is able to traverse to. The edges could point both ways, but they don't have to; it's very possible the edge only points one way. A good way to visualize directed graphs is by thinking of them as a network of followers on Instagram. Sure, I can follow famous celebrity Will Smith, but the odds that he follows me back are fairly low, and so in that case, the relationship only goes one way. Undirected and directed graphs both have their uses. As we discussed with the social media examples, both provide different functionality which will be useful to you in your computer science journey. Moving on to the next type of property: a graph can also be classified as either cyclic or acyclic. A cyclic graph is one which contains a path from at least one node back to itself. You can see by the example on your screen now that the 4 node leads to the 3 node, which leads to the 2 node, which leads to the 1 node, which finally leads back to the 4 node, forming a cycle. Essentially, we're able to follow at least one path that eventually leads back to our starting point. A small thing to note here is that all undirected graphs end up being cyclic; the bidirectional nature of edges within undirected graphs theoretically forms a cycle between any two connected nodes, so judging by that logic, all undirected graphs end up being cyclic. An acyclic graph is one which contains no path from any one node that leads back in on itself. This property can really only apply to directed graphs, since, as we mentioned previously, all undirected graphs end up being cyclic. Essentially, this means that for any given node, there is no path which will eventually lead back to itself. Undirected, directed, cyclic, and acyclic are all properties we can use to classify types of graphs based on their nodes and edges. But the last property I want to talk about actually applies to the edges of a graph instead, and it's the process of weighting. Weighting the edges of a graph means associating a numerical value with each edge, often called the cost of that edge. Each weight represents some property of the information you're trying to convey. For example, again going back to our food location scenario, since the information that we're trying to convey is a good route which takes us to each location, a good weight for our edges in that scenario could be the distance between nodes. This comes in handy a lot, especially with navigation, as in the case of our restaurant example, since we of course always want to find the path of least cost, or weight, between the different nodes. So there are the major properties of a graph that the different nodes and edges can have: directed or undirected, cyclic or acyclic, and weighted or unweighted. There are a couple more obscure ones out there, but those six are what we will be covering today. Combining these three pairs of properties leaves us with numerous types of graphs, which all have different strengths and weaknesses.
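Before we look at those popular cases, here is a small sketch of how a graph might be represented in code: a vertex set and an edge set turned into an adjacency map. The function name, the directed flag, and the made-up distances are all my own choices for illustration.

def build_graph(vertices, edges, directed=False):
    adjacency = {vertex: {} for vertex in vertices}
    for start, end, weight in edges:
        adjacency[start][end] = weight
        if not directed:
            adjacency[end][start] = weight    # undirected edges point both ways
    return adjacency

# The restaurant example: vertices are food spots, and the edge weights are
# hypothetical distances in miles between them.
spots = ["Five Guys", "Chick-fil-A", "Panera", "Wawa", "Domino's"]
roads = [
    ("Five Guys", "Chick-fil-A", 2.0),
    ("Chick-fil-A", "Panera", 1.5),
    ("Panera", "Wawa", 3.0),
    ("Wawa", "Domino's", 0.5),
    ("Domino's", "Five Guys", 4.0),
]
food_map = build_graph(spots, roads)          # undirected, weighted, and cyclic
print(food_map["Wawa"])                       # neighbours of Wawa and the cost to reach each one

Storing neighbours in a map like this makes it cheap to ask which nodes are adjacent to a given node and at what cost, which is exactly the question shortest-path algorithms keep asking.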
It would take a while to talk about each of these and their implementations, so for now, I'll just pick out three types of graphs which are used in the most popular cases. Now, probably the most famous implementation of the graph data structure is through the undirected cyclic graph with weighted edges. This one gets a lot of use, especially through its implementation in Dijkstra's shortest path algorithm. This algorithm, given a graph and a source vertex within that graph, compiles a list of the shortest possible paths from that source vertex to all other nodes. As you might be able to tell just from its description, this has a multitude of uses across the entire tech world: Google uses this algorithm for Google Maps, it's used in the process of IP routing, and it can even be implemented in telephone networks. Another type of graph which you probably use quite often is the unweighted graph, both undirected and directed, as these make up the follower systems of a majority of social media websites. We already talked about these in the cases of Facebook, which would use a cyclic representation, as well as Instagram, which would use an acyclic representation. However, this example encapsulates much more than just those two: Snapchat, Twitter, even TikTok; all these platforms can represent your follower and following base through a graph, and oftentimes do. Facebook even has a Graph API which you can use to interact with the models that they use to illustrate each user's web of friends. As you can see, graphs, in their many different forms, provide a lot of the functionality that you interact with in everyday life, contributing to almost any facet of the internet. And with that, our discussion on graphs concludes. As a review, a graph is a data structure which is represented by a set of nodes and edges. These come together to form a visualization of data, whether that data be food locations on a map or friends on social media. The different types of graphs provide a multitude of implementations in computer science. And with that, our introduction to data structures series concludes. After around three hours and 12 data structures, you should now have the basic knowledge to take with you as you continue along in your computer science journey. And if you've made it this long, you obviously liked what you saw, so stick around for an extra 30 seconds while I pitch to you why you should check out my personal channel. Now, I introduced myself at the beginning of the series, but again, my name is Steven. I'm an 18-year-old computer science student who runs a YouTube channel called NullPointerException with one of my good friends, Shawn. We've been producing content like this seriously for about four months now, and we've grown a good audience of around 800 subscribers, which we couldn't be happier about. If you'd like to join us, you can check out our channel, linked in the description below, and just try out a few of our other videos, and if you like what you see, you can subscribe. That ends my pitch, and with that, I have nothing else to say. I hope you have enjoyed this series as much as I've enjoyed making it. Thank you so much for watching.
Info
Channel: freeCodeCamp.org
Views: 552,355
Rating: 4.9616961 out of 5
Keywords: data structures, python, computer science, tutorial, course, for beginners, big o
Id: zg9ih6SVACc
Length: 179min 26sec (10766 seconds)
Published: Tue Sep 08 2020