Discrete Structures: Introduction to graph algorithms; minimum spanning tree

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] so [Music] [Music] mixer [Music] what you did [Music] [Music] [Music] [Laughter] [Music] my [Music] recording in progress [Music] good morning good morning got a good group of people here today [Music] oh well this weekend i caught the cold it's not as bad as like my kids had it but i definitely have a little bit of congestion a little sore throat it's first time i've been sick in 18 months so it kind of stinks [Music] i guess that was the good thing about staying at home [Music] i just didn't get sick at all now my kids got sick i think my youngest he's on like cold number four out of the past 19 months maybe even cold number five because you know when you're at school you just bring home bring home the germs [Music] so hopefully i'll be able to last all day today i mean like you know for the next hour and a half how the rest are you doing [Music] getting ready for the week doing good good [Music] well so we started last week talking about grass and what did we leave off with i don't even remember i mean i know we talked about things like different types of graphs we didn't really get into the graph algorithm so i think i would i'll just start you off with a little bit of an introduction i guess before we do that any questions last week we ended with trees and traversals oh yeah that's right we were talking about how to like evaluate expressions stages of a compiler you know that the stage yes indeed with the going starting with source code and going through the uh what was the first step like the lexer and then the parser that's called a state machine so your program kind of moves from state to state it goes into the parsing uh lexing state then the parsing state and the co-generation state optimization state and then finally comes out the other end we'll get into state machines here in about a week or so [Music] um yeah so let me give you this oh hi thank you my lovely wife just bought me some tea [Music] oh that's good that should help all right um so there's this company out there called stitch fix and if you haven't heard of them and i have i have no like um ties to them at all like i just ran across this because uh you know i hear about it like on you know they're one of those companies that advertises on things like blogs and podcasts but what they do is they try to hook you up with clothing that they think you'll like so you start out you sign up for an account and then you go through a brief survey where it asks you like your general what kind of person are you what are your general likes it might show you a few styles and you rent you rank them and then every month or so it takes your survey results and tries to find a set of clothing that it thinks you'll like so it some algorithms take over and these algorithms try to choose some clothing and then those clothing choices go to a human stylist who's actually going to look at the the choices the computer came up with and then give it either you know yay or an a and then those clothes get shipped out to you and then you can then uh decide whether you want to buy those clothes or you can send them back and if regardless of whether you buy them or not your your choice as to which ones you like and which ones you didn't like goes back into the recommendation engine so that next month it might make a better choice for you and so recommendation engines of course are used by all sorts of companies netflix amazon pretty much any you know large shopping type service out there is going to use some kind of recommendation engine in order to figure out what kinds of things you want to watch or want to buy but the nice thing about stitch fix is they have a whole section of their website devoted to engineering and algorithm blog postings by their own programmers and their own engineers so you can find out a little bit about how their company works and also about whether you i mean the reason they post this is maybe to attract some talent so people might read these and they go oh i want to work for a company like stitch fix and then you can go and apply there so if we go here to their algorithms tour up at the top gives a you know a brief overview of some of the algorithms that they use at stitch fix so we scroll down here and then i you know if you watch this later on you can pause the video and then read some of the text that's scrolling by but i'll kind of give you my summary of what i've read going through this here but basically their business model is you've got clothing over here you've got clients over here and then you've got these algorithms and stylus in the middle and their their job is to link you up you know with clothing and clients and so in this circle here is all the algorithms that take place in order to do a job of trying to kind of link you up with the clothing that you like so one of the one of the problems they have is they have clothing scattered around to warehouses around the country and um this is also you know a problem they have what i mean by problem i don't mean like you know this is a this is a a huge like business uh issue this is one of the the problems that their algorithm designers have to solve which is where exactly do you put this clothing if you have let's say one two three four five six warehouses scattered around the clothing which you know clothings do you put in which warehouses you know people over here on the west coast particularly people in the south don't need so much and this is just like off the top of my head here don't need so much winter clothes i don't need to clothes that are really warm whereas uh people up here in the northern part of the united states do need more of that people in the southern united states also have a different style than people on the east coast or the west coast right so you want to move clothes around so that the clothes that people are likely to buy say on the west coast are in the west coast warehouses and the clothes that people are likely to buy in the north or the east part of the united states are over in those warehouses this is a problem of minimizing things like shipping costs and minimizing storage costs and of course there's going to be exceptions you know someone over here in the west coast probably does want to buy a warm jacket at some point and so you're then going to have to move that article clothing from where one warehouse to another warehouse and that's also a problem to solve is when do you want to do that can you predict when that's going to happen so that you can ship those things before the customers order them so they have problems like you know these represent warehouse locations and then all these small dots here represent their customers you know how do you how do you group warehouses with customers so that you kind of minimize shipping costs and this this part of the blog posting or tour talks about their recommendation engine where you've got people over here you've got styles over here and then these numbers here might represent how good of a match that is between this person and and this piece of clothing i'm just going through this quickly you know if you want to go through this website on your own and read all the text and go through all the little animations they have it's quite quite interesting so again you can see pictures of graphs here so they got an article clothing um you know clothing she's gone through and she's pinned images of clothing that she likes and it goes through this recommendation engine and then it tries to match them up with actual items of clothing that they have in their inventory and tries to figure out which one is the best match for that you've got natural language processing so someone might type in one of the things you do with stitch fix is rather than just answering a whole bunch of kind of serious survey questions you can just type some words about the kinds of things you want to wear and one of the things the algorithms have to do is analyze the words that you typed in and try to figure out what did you mean by that and you know what do you mean by casual outdoor wedding clothes what's next here so this is talking about matching up clothes with stylists stylists ultimately making the the final choice and what gets recommended to you what gets shipped to you we've got logistics which is moving clothes around in their warehouses oh you've got this another problem here this is called the traveling salesman problem if you've got a person who has been tasked with picking up all the clothes for a particular customer that person has to make a little route through the warehouse to stop at various clothing racks and pick up the clothing what's the best route for that person to take what's the best route to take that's the shortest so they can get the most amount of work done in a day is it also possible that they could pick up along the way clothing for a different order so they can actually be mult putting together clothing for multiple clients at the same time and again what's the best route for them to take through that warehouse so that they can walk the the fewest number of steps and therefore process the most number of orders in a given shift okay so the customer receives their order they then try on the close they give feedback to stitch fix about what they liked and what they didn't like and then that influences decisions that are made next time so it goes through this cycle of um what are we doing here so request and feedback goes to recommendation engine and then that information goes into the warehouses warehouses get delivered to the customer customer returns some of the items but maybe not all of them that goes back into the warehouses that becomes more feedback and the cycle just continues over and over and over again state machines are a huge part of their process actually this diagram we had over here is a state machine because the client or the customer experience go through a series of stages and each one of these stages here is a state and then each one of these lines is a transition between the different states there's this kind of state machine which is showing all the idle articles of clothing going through the different stages like their stages of initial warehousing and then there are stages of being delivered to customers and stages going coming back to the warehouse and has to go back onto the racks and just making sure that none of the individ none of the states gets depleted of resources entirely so making sure that resource allocation goes um smoothly so like state state one here make sure that not enough resources are coming into it that it never gets completely depleted maybe state to make sure it doesn't get too full right so all sorts of graph problems here about resource allocation and optimization this animation is trying to show here's all the clothing arriving from manufacturers and then it has to be allocated to various warehouses around the country some of them get delivered to customers some of them get returned back and then clothing has to move around the different warehouses and so forth and that's the end of that algorithms tour so another problem they have is well let me just go find the the blog postings and one of the problems they have is trying to identify what color clothings are the articles of clothing i need to scroll down to find this actually i should just search for color huh so this is their problem of you've got this article clothing and when an article clothing comes in from one of their manufacturers the first thing they do is they take a picture of it so they can show it on their website but also so that their algorithms engines can start working on trying to figure out you know what is this art of clothing and what color is it what size is it but this particular blog post is about color and so if you ask a human what color is this of course they come up with different answers some ones might say blue someone might say teal turquoise dark teal understated tailed medium saturation and brightness right so even humans can't come up with a definitive answer about what color something is so i think what they did was talking about taking photos of the of the articles of clothing and then they turned to randall monroe of xkcd fame and he did a survey of the 949 most common colors along with names so they grouped colors into primary colors secondary colors tertiary colors along with some neutrals and then it came up with a hierarchy of colors so you've got your purple colors common colors design colors and xkcd colors so you got warm purple medium purple ugly purple and purpley these are all examples of purple colors we'll just keep going on that and so i think out here on the outer ring are the actual colors these are the color families here and these are the 954 colors in their survey okay so the first thing they have to do is take a color of a garment and basically reduce it down to one of these 954 949 possible colors but a more interesting problem they have is having a computer try to figure out what color a particular garment is so after they take a picture of a garment they then have an algorithm go actually a couple of algorithms go and try to figure out what color it is and of course you got a problem when a garment is multiple colors and so they got two different algorithms that looked at this picture one of them arrived at these three colors you know what i guess it's close-ish right this is kind of close to this color this is kind of close to this color i'm not sure where this one came from oh so this is a big problem for computers is clothing doesn't lie flat it's got you know wrinkles in it and it has shape and it has form and so there are shadows that the computer has to contend with now they could light the clothing so the shadows are completely eliminated eliminated but i think a human would not like to look at a piece of clothing that had no shadows on it at all because then you couldn't get an idea about like how full the garment is or what the shape of the garment is so at the computer also contend with these shadows and i think that's probably where it came up with this color right this is like one of these shadowy colors and it's really hard for the computer to tell is this a shadow or is this actually part of the design of the clothing another algorithm came up with these five colors you know this is kind of like here but this one here is also kind of the shadows here it found this cream color pretty well and then it found these two other colors in here so this is a really hard problem for them to solve a human would say well there's only three colors in that garment but you know the computer sees more or less so they had this idea to take a photograph and break it up into these smaller regions which they called super pixels where each one of these regions basically has the same color in it now each one of these regions is this is kind of like the map coloring problem we did a week ago a week or two ago where each one of these regions can be represented as a node or a vertex in a graph and then you've got its neighbors so you can turn this picture into a graph basically and so they take a picture that might have 60 000 pixels and turn it just into a few hundred super pixel regions and that makes it a lot easier to analyze the color of a garment so here's their graph of the garment but then you've got like other problems where you know even though this garment here is just one color at least a human would say it's one color um it's kind of hard for the computer to figure out exactly what color it is right because you got some regions that are well lit some ways that are uh brighter some regions that are darker you know so what color is that really it's hard to tell shoes are a problem you know you got to teach the computer not to look inside the shoe because people don't usually consider the inside of the shoe to be the color of the shoe it's just the outside of the shoe and then you know this is the problem here this is striped garment uh how do you contend with that and then this plaid garment here you know what color is that well it's like a whole bunch of colors problem is you can't break up this plaid shirt into these super pixels very well so they've tried some other things representing the colors in a kind of a three-dimensional space and then looking for clusters anyway so you know lots of really interesting problems that their programmers have to work on and a lot of them can be just reduced down to graph problems so that's uh that's stitch fix and like i said i have no ties to them other than it's interesting to see that uh both their algorithms group and their engineering group put these uh blogs up there and let's see you can get an idea uh what am i looking for here i'm looking for oh careers i think is that what i'm looking for no i am looking for the picture they have pictures of their actual like algorithms group here it is i think there yeah so what i do is i click on algorithms up here at the top and then you can get a tour of their algorithms group and it's just this incredibly diverse group of people that do the programming behind stitch fix and then on their engineering team engineering are the people that like run the physical infrastructure so they might be working on the the servers and they might be working on the actual programming of the website versus the algorithms people they're also scattered all over the place austin iowa sunnyvale austin dallas san francisco atlanta san francisco los angeles professor uh how can they how can they afford so many engineers i mean that's a lot of salary to pay um that's a good question presumably they're doing a lot of business i mean yeah they're a pretty big company they do a lot i mean the the the forward-facing side of it is they're a clothing recommendation company right but really what they are is a data science company clothing just happens to be what they sell so their job really is to do all that recommendation stuff and the other clothing that they sell really isn't that cheap um if you i think if you actually buy the clothing that they recommend you could probably spend 150 200 bucks on that box for one outfit yeah they're roughly about a four billion dollar company that's their revenue i think that's their stock or something i just looked up their net worth real quick and if they're a public company we should be able to find out their actual net worth not their stock stock valuation yeah it says they did about 1.7 billion in 2020. that's not bad professors saw all the data that they gather and the algorithms that can be used somewhere else right and they could sell it as well i imagine so there might be interesting things in their privacy policy i'm looking for here we go use of your information well from this privacy policy it doesn't look like they really like sell out your information to other places they do say that they share it with third-party providers but they're mostly like to do technical support marketing only to provide services but it doesn't look like they're really in the business of selling your information so i suppose that could be good i'm looking at their faq frequently asked questions i don't see anything there about selling information but for a company like this to do over a billion dollars of sales every year more than i thought anyway let's um let's move on to an actual graph a graph algorithm so this this problem here is called minimal spanning tree and it has all sorts of applications in things like resource management and resource optimization here's just one kind of contrived example so you've got these houses in this little tiny town and you know this town is so small that up until now there have just been paving stones in between the houses so people have made these paths that go between the houses and some of the paths are longer some of the paths are shorter but they decided to actually pave the roads uh or turn these uh pathways into actual roads and so they're trying to decide where to put the pavement between these houses so they use as little as possible they're not a very wealthy town so this path here would cost five units of concrete and this one here would only be three units of concrete and the idea is how can we connect up all the houses of this town so that every house is reachable from every house there's some way to get so all the houses are basically connected in to our road system using a little little pavement as possible so what do you think might be some strategies that we would use to figure out where to put the pavement well one strategy might be why don't we just pave all the shortest roads first so we'll do this one and then this one and then this one those are all the twos right there and then once we exhaust all the twos maybe we'll just do all the threes so we got this one and this one and this one and this one and this one and this one and here's also a three right there right the bridge doesn't count have we have we finished at that point c to b c to b oh another three right there okay so at this point are all the houses hooked in in other words is there is there a path at least one path going to each house and so a is covered that's good b is covered d is covered h is covered f is c is e i g and j so it looks like they're all covered and then if we add up all this these amounts here we've got two and three three actually i'll just do it by hand we got 2 plus 3 is 5 this is 8 11 13 15 18 22 25 27 uh 30 33 i think it was 33. so the question is uh is 33 the the the fewest amount uh the least amount of payment we could do well one thing i see here is that like house number house i over here it seems like if i wanted to get from i to j i could either go this way or i could go this way so there's kind of like two paths to get the j and that means that one of those paths is redundant i don't need one of them so which of these links should i get rid of should i get rid of this one or this one or this one and then over here like from d to f i could either go this way or i could go this way so one of those links is redundant should i get rid of this one or this one or this one or this one same thing here to get from c to b i could either go this way or i could go this way so which of these should i get rid of remember we wanted to hook up this town so that there is a way so every so every house is hooked into the graph but it doesn't necessarily mean that every house has a shortest path from it to every other house just just that it's hooked in somehow so there's a road that goes to the house but we don't want any redundant roads we don't any roads that for which there's like two paths to get to a given house so the problem we have here is that we're sort of resorting to trial and error first we went through it and we just like paved all the shortest roads and then we tried to figure out which ones did we do it not need and that's kind of like not a great way to solve a problem trial and error i mean it's it is the way to solve some problems there are some problems for which trial and error is the only way that we know to solve it like trying to crack passwords as trial and error but minimal spanning tree does have a way to do this such that you can do it fairly quickly so that's kind of like the holy grail of computing algorithms is is there a way to solve this problem without just doing trial and error so here's how you do it let me first erase all these so i'll just kind of like walk you through it informally and then i'll show you the formal description of this so first step is just since you are going to be eventually including all the houses and just pick one to start at doesn't matter which one you start out just arbitrarily pick one so if you're a computer you might want to pick out the one that has the the first number that's in the list you might think it started a but it really doesn't matter where you start let's let's do a first so what you do is you look at all of the paths that are leading out of that house so there's this one this one and this one and you just pick the shortest one there's one with five one with three and one were four pick one of three because it's shortest and then connect them up so now we have two houses that are in our road network we're going to repeat the process over again but with one little change we are now going to look at all of the paths leading out of all of the houses that we have paid so far so we're not just going to look out the very last one we did we're going to look at all of the houses that we have paved so far which means we're going to look at the path leading out of a and c so here are all the paths that we should consider these are all the paths coming out of a and c pick the lowest one and if there's a tie you can arbitrarily pick one so it looks like there is a path leading here to b and a path leading to f that are both of weight three you can arbitrarily pick one or the other i mean i'll pick this one that goes to b now repeat that over again look at all of the paths coming out of all of the houses that you've paid so far you could ignore any path that leads to a house that you already paid so this one here that has five leading from a to b we've already paid both a and b so we don't have to look at this path that means we'll look at this one this one this one this one this one this one and this one pick the lowest one and again if there's a tie just randomly pick one which one do we pick the one with two from b to f b to f good but then we we don't need from c to b right could be or if you make it just c to f or oh yeah never mind sorry yeah we could we could have picked this one or we could have picked this one so we we randomly picked c to be okay repeat again look at all the paths coming out from all the houses that have been paved so far we don't have to look at a path that leads between houses that have already been paved so we don't look at c to f don't look at a to b pick the lowest one and for the tie just arbitrarily pick one so it looks like we either do b to d or we could do f to h or we could do f to j let's do b to d okay same thing look at all the paths coming out of all the houses that have been paved pick the lowest one that's going to be d to h okay all the paths um up here so it looks like f to j is the lowest one and then repeat again so the lowest one will be this one right here uh let's see we got a three here and we've got a three here so it doesn't matter which one we pick and then there's not much left to do we have to pick this one that has two and there now all the houses have been hooked in and let's see what we got so if we started here at a that then went to c which went to b and then from b we branched out to f and then j and that went to either g or i and then i went to e and then over here in b we branched out and went to d and then h so we made a tree there are no cycles anywhere in this graph and let's see what the total weight is so a to c was three c to b also three b to d is three d to h was two b to f two f to j three j to g was two j to i was three and i to e was two so we end up with uh three plus three plus three is nine here's eleven thirteen 16 18 21 23 and that should be the lowest that you can go now there were a couple of places where we arbitrarily picked one path versus another because they're the same weight we would have ended up with a different tree but with the same minimal weight so this is called this was an event this particular algorithm was invented by a person named prim so this is a prim's minimal spanning tree algorithm and it works pretty fast there's no trial and error this algorithm is called a greedy algorithm a greedy algorithm is one in which the algorithm just looks at its available choices picks the best one and then sticks with it so a greedy algorithm doesn't do what's called backtracking it doesn't change its mind based upon new information once it makes a choice it sticks with that choice and then moves forward uh professor yeah so it could be like really uh nice algorithm for python in python i guess yeah this is fairly easy to code up in python i think next time on wednesday um we'll look at an algorithm that does a little backtracking so shortest path there's a little bit of backtracking because it might make one decision about the shortest way to get to the destination and then realize there was actually a shorter way to do it so it backs up a little bit and then tries a different path okay so this was prim's minimal spanning tree algorithm by looking at an actual um graph let's try to look at it divorce ourselves from the actual picture of the graph and try to look at just the numbers so if we represent the graph as a grid so this is the same graph it's got all the same weights in it but like for example this one here this 5 here represents the edge from going from b to a and you can see there's also a corresponding 5 for going from a to b since this is an undirected graph so each of the numbers is in here twice they're actually reflected around the diagonal and so i'm going to say that the letters here on the left hand side represent the vertices of the graph that i've already visited the ones that i've included in the graph and the ones here on the top are the ones i haven't yet visited so the ones on the left the ones i've paved already and the ones on the top are the ones that remain unpaved so here's the algorithm we got a set of visited vertices and we got a set of unvisited vertices and then you arbitrarily pick any one vertex to be the um the initial visited vertex now in our diagram over here we arbitrarily picked one to start at we picked sorry we picked it a we started at a so that's what we mean we arbitrarily pick one of them to be our first visited or paved vertex so let's uh let's go ahead and let's pick a so you look at all the unvisited neighbors of the visited so here are all the neighbors actually here are all the neighbors b d and e and you can see that they're the neighbors because they have edges connecting them so what i'm going to do is i'm going to highlight this row to focus our attention on the edges that are coming out of node a or vertex a i'm also going to cross off a going vertically here because we're never going to go back to a so we're never going to go from a to a right so we're going to cross this out because we'll never go back to it that's why in the first step here we initialize the set of unvisited vertices to be all but the starting vertex so we immediately eliminate that one okay so we consider each of the unvisited neighbors identify the neighbor with the lowest edge weight so we look at these three numbers and we pick the lowest one so that's this one right here so we have selected the edge that goes from a to d as being the lowest one that means we're now going to highlight the row d because this is now included in our visited set and we're also going to cross out column d because we'll never go back to it okay so every time we make a choice we're going to highlight one of the rows and we're going to cross out one of the columns now look at all the numbers that are highlighted this corresponds to all the edges coming out of all the paved houses so far so i've circled what six of them pick the lowest one and if there's a tie just pick one at random so we've got two threes we'll just pick one of them and we'll pick this one here okay so we're going to highlight b and cross it out and then look at all the highlighted numbers looks like number two wins so i'm going to highlight i and cross it out so we're going to add it to our visited set but remove it from our unvisited set so that's why we're highlighting it to add it to our visited set and then we're crossing it out vertically to remove it from our unvisited set and then look at all the highlighted numbers pick the lowest one and if there's a tie just arbitrarily break them so maybe we'll do that one there so we're gonna whoops i did not mean to erase i meant to highlight all right keep repeating looks like we got a 2 here so we'll select that one so we'll highlight f cross it out you know if you do this in python it's relatively easy to do with sets and so as you add something to a set you just do a union and in order to remove something from a set you just do the subtraction okay we got a few threes here's one here's one and here's one and we'll just pick one maybe i'll pick this one okay so i just highlighted c and crossed it out and look at the numbers that are highlighted this two is the smallest one so pick that i did it again across the uh erased it there we go okay keep repeating looks like there's a three here so in addition to using some sets to represent the visited and unvisited vertexes you also have to incorporate into your data structure the weights and you also incorporate like where each one of these uh vertices goes so you got to incorporate not only the edges in there somehow but also the weights of them and then come up with a way to be able to search through the graph fairly quickly so when you ultimately when you take a course in algorithm design you're it's going to go hand in hand with algorithms and data structures it's usually why those two things are included in the same course because if you have a data structure you're going to want algorithms to be able to work with that data and oftentimes you can't make an efficient algorithm unless you figure out how that data is going to be stored so the way you store the data can have a huge impact on how fast and how efficient the algorithm can work you know we can store everything in just arrays but then searching through an array is laborious because you have to start at the beginning and go to the end if you have to do that every single time your algorithm could take a long time particularly with a large amount of data but if you come up with a good data structure that's efficient to search efficient to to to work with the kind of algorithm you're designing then the algorithm can go much faster so let's finish this up let's see i think i just did i think i just a g so highlight g cross it out we've only got a few things left um oh here this one this one this one and this one so it looks like there's a two so i'm highlighting e and then crossing it out so in our algorithm it says once the unvisited set is empty and we've crossed everything out which means we've now completely emptied it out we are done let's see what we ended up with so we ended up with a going to d with a weight of 3. and then d went to b also with a weight of 3 b went to c and i then went to c with a weight of 3 and i with a weight of 2. c went to j with a weight of 2. j didn't go anywhere i don't see anything on its row here um so let's go back to i i goes to h with a weight of three h goes to f with a weight of two f goes to g with a weight of three and then g goes to e with a weight of two and the e doesn't go anywhere so so this is what our tree looks like notice we ended up with a different tree than on the previous page that's because we made some different decisions about how to break the ties but let's see if we ended up with the same total weight three plus three is six this is 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 so same weight as before but just a different tree any questions about that uh yes yeah uh professor k uh can this be applicable in pen testing like hopping from port to port while scanning or like scanning for um like any weakness weaknesses like exploits i'm i i i can't say no um i haven't heard about it being used for that but it is definitely used in computer networks absolutely and it's used most it so spanning tree gets used for for two things typically two big applications one is network optimization and this road thing is a is a network it's a network of roads and we're trying to optimize the amount of pavement used it's also used for segmenting computer networks into to make sure that there are no redundant links in fact let me i'm going to show you an example of that here in a moment oh the spanning tree protocol yeah that's why it's called that's right that's why it's called that um so so here's like a picture of you know this is like just like a picture of my house here right i have lots of computers and lots of stuff on the internet so i've got internet service provider with their link coming into my house it goes into a router the router's also a wi-fi access point so i've got you know laptops and tablets and phones on the router and then i've got switches around the house with tablets and phones and tvs and computers and printers hooked up and you know in a typical residential network this forms a tree right you know it comes in here and it's a tree structure and there's a good reason for that because if i were to let's say link up these two switches with an extra cable have you ever tried that ever tried taking your tree structured network and putting a redundant link in it have you tried that no what would happen yeah question is what will happen well what happens kind of depends upon how much you spent on the equipment if you've got your typical residential personal computer equipment these switches and the routers will do one of two things they'll either freak out and just start passing data around and around and around and round and round right because the switch here goes like okay i need to send data from this computer to let's say well maybe from this computer to this printer here right i see two paths to do it i can go this way or i can go this way you know and then once the switch receives the data it goes i don't know what i should do with it maybe i'll send it out both right and then this one sends it out this one here so so the data ends up just like going around in circles the other thing that might happen is the switch might detect that there's a redundant link and just like shut down just go i don't know what to do with this i'm just going to take myself out of service or it may see that one of these links here is generating a lot of traffic and it may shut they may shut down both of them right so these switches are either just going to completely go crazy or they're going to shut themselves off but if you spent more on these switches if you got enterprise grade switches ones for businesses where you might actually have redundant links on purpose then when you connect up these switches together they start talking to each other and they try to figure out okay you know who who are your neighbors and and who are your neighbors and who are your neighbors and how fast is this link is this a 10 megabit you know 100 megabit thousand make bit is it a gigabit network and they run the spanning tree algorithm and they try to figure out how can we connect all these devices on the network and minimize uh well in this in the case they want to maximize the bandwidth they want to maximize the speed so they do things like they measure you know times to send data back and forth they measure the speeds of the links and they come up with the best way to connect all these devices and they may decide to you know basically disconnect this cable right here not not physically disconnected they just disable the ports right the cable is connected to so now there's no longer any cycles in the tree and then if at some point like a rat or something or a backhoe comes and it cuts this link right here then they run the minimal spanning tree again and they bring this link back into operation so every time a device gets connected into the network the the routers and the switches run the minimal spanning tree algorithm and try to figure out what's the best way to connect all the devices up if they detect any cycles the minimal spanning tree algorithm will automatically get rid of those cycles then when a device is connected or reconnected to the network they run a minimal spanning tree all over again and again try to figure out the best way to connect up the devices so every time there's a change to the topology of the network the minimal spanning tree algorithm is run between all the devices and it figures out the best way to connect them up so this is an example of resource optimization but in this case they want to maximize the speed or maximize the bandwidth rather than minimizing something but it's the same type of problem i have a question um would there ever be a circumstance where it would be efficient to use both paths right if you had say two gigabit pads between things and you're sending sending data could you like switch it between two pads to get the data there quicker again some enterprise level switches will detect that you've got a redundant one and we'll do what's called trunking so they'll turn the that redundant path into basically two parallel paths that have twice the bandwidth so so like the the application that i that i'm curious for something like that would be like if like could you create a network inside say an office building using people's cell phones where where you've got you know 100 cell phones in the building and it's creating all these redundant links between the phones and using that to transfer data would that be um yeah that can be called a mesh network and they're not typically done with people's cell phones but there is kind of like a you can go out to the store and you can buy mesh networking devices where you've got your central router and then you've got these other devices you scatter around your house and they can be used to extend the range of your wi-fi network that's really the purpose of them so having one instead of having one big giant powerful wi-fi station at the center of your house you've got less powerful auxiliary stations scattered around and these these mesh networks are self-organizing [Music] so each one of the devices kind of like listens to see who who what other stations can i listen to what other stations can i make contact with and they internally form a map of the connections between themselves and they figure out what's the best way to pass data amongst the different access points so same idea not done with people's cell phones but done with little access points scattered around the house is that kind of like what you're getting at yeah absolutely and in fact i'm looking at a mesh network repeater right now so the the idea there is and this is a huge problem like especially in my neighborhood is if i just well shoot well if i if i pull down the um the list of like all the wi-fi networks that are nearby near me there's like 20 of them and so those are like all my neighbors all around my neighborhood with their routers and access points and everyone is trying to out shout everyone else imagine trying to go into a room where there's 20 people and they're all trying to be heard so there's there's two ways to do this one is everyone can talk really really loudly which is what basically everyone around me is trying to do with their wi-fi networks they're trying to you know everyone's trying to use their phones and their tablets and their tvs around their house and the signal can't reach because all the other neighbors are also shouting as loud as they can of course you we humans can't hear it but they're in the radio waves and it's just not really an efficient way to do it because the other way to manage a room full of 20 people is to tell everyone okay everyone speak really softly you only have to talk to the person next to you right you're only talking amongst your your your immediate neighbors you're not trying to talk to the person across the room so just speak quietly and you'll be fine and so that's the idea of this mesh network is instead of having one powerful access point in your house trying to reach every corner of the house you scatter all these little different access points each of them operating at a lower power and then they can more effectively share the bandwidth there's a limited number of channels they can broadcast on in the the g networks there's what 11 channels and in the end networks in the 5 gigahertz range there's a greater number of channels but still a limit on how many channels there are and so if everyone is being packed in and trying to share all the same channels there's just a lot of interference and so what happens is you go out to the store and you go i'm going to buy the most powerful router or most powerful access point i can but really a better strategy is to buy a less powerful one and scatter more of them around your house and that's the idea behind these mesh networks what i've had to do in my house is since i have my neighbors who are just blasting out their signals um if i uh the the the the access points i have are like these pros i'm looking over here because i'm going to see if i have an example of one but i can't see one they're like these prosumer devices so they're not just your cheap belkin or d-link ones where you just plug and play them they're they have to be configured but i've got three of them in the house i've got you know central one and i've got two of them scattered around in different parts of the house and they talk to each other so they as you walk around the house with your phone they pass the signal off to each other basically they're forming kind of like a mesh network but i lowered the power on them deliberately and i narrowed the number of channels they can occupy simultaneously my neighbors over here what they do and i'm pointing generally in that direction i don't know what direction they're in but they've set their the default setting for the router is maximum signal occupying as many channels as possible i mean ostensibly that increases the amount of bandwidth they can it can send and receive at a given time but what it means is that your neighbors can't transmit and receive on those same channels at the same time so when i first turned on my routers i couldn't get i mean even though i had like a strong signal right you just measure the signal strength i couldn't get any data through them because my neighbors were just like obliterating all the the channels so i ended up having to configure my my routers or my access points with less power and fewer channels so i can kind of like squeeze into the little gaps that my neighbors are left behind it means that my overall throughput is lower but at least i can send and receive data whereas all the rest of my neighbors are like contending with each other shouting each other as loud as they possibly can my routers are like whispering to each other but at least we can communicate so no you know i don't have like gigabit wireless networking in my house you know it's in the range of tens of megabits but it's enough for the computers to like be able to stream video and do anything where i need a high amount of bandwidth like the computers that i'm using for teaching the classes those are all wired in directly right i've got i've got a 16 port ethernet switch under my desk and i've filled up like 12 of the ports right because all the stuff i have directly wired in and sort of so that none of those signals are contending with the wireless stuff there's only like two things in my room here that are actually not wired in it's my uh the ipad here and like my phone that's it everything else is wired in sorry that was all a long description of what you were asking so it's absolutely true that for these enterprise grade switches if you connect up some switches some some links redundantly and you have to tell them that you've done that then it will trunk them together and treat them like a link that has twice the capacity it's also a good way to make redundant links right so that if if one of these gets cut you know like a mouse or something like that cuts one of them then the other one can take over so that's called trunking when you combine them together so enterprise networks use minimal spanning tree to eliminate redundant links and bring them back into service or i should say activate them when other links have been disabled for whatever reason they get cut accidentally or someone unplugs it or network administrator administratively disables a link just by shutting off one of the ports on the on the switch another application for mineral spanning tree and this is again kind of contrived one is let's say you've got a new housing development and you want to connect in a sewer system to those houses so it doesn't matter how the houses are connected they just have to be connected into the sewer system somehow of course you don't want any redundant links right you don't want sewers flowing back into someone's house you just want to use as little pipe as possible so this uh this picture here you know could instead of being paving paths it could be these are all the possible ways that we could route the pipes which ones should we actually do in order to use as little pipe as possible so really quickly we'll finish off with this how can we represent graphs we can't actually physically draw graphs inside the computer but how could we represent them inside the computer and we already saw one of them we saw the matrix style where you've got the two dimensional array basically and if you've got an undirected graph then all you have to do is just indicate in this interior of the matrix like how the various vertices are linked up so we might say that let's see a goes to b a goes to c b goes back to a and it also goes to d c goes to a and it also goes to e d goes to b and c and e goes to c i feel like i missed something there the reason i feel like that is because it should be like symmetric so i'm missing the one here c goes to oh yeah c goes to d right so there's a one there there we go and then if you have a directed graph then the numbers you put inside the the matrix are just the weights and you can also incorporate directional information by not having it be symmetric around the diagonal so v goes to x with a weight of three that takes care of v w goes to x also with a weight of three x goes to z with a weight of 6. and that takes care of x and then y goes to w with a weight of 5 and it goes to v with a weight of two and it goes to z with a weight of two and then finally z goes to y with a weight of one now you get the idea i i might have missed one there but that's the idea another way to store information about a graph is called an adjacency list and so basically it's an array of arrays or a list an array of lists or something like that or yeah that's it's it's a two-dimensional structure so we've got a b c d and e and maybe that's in an array and then each one of these is a pointer to another array or another list of some sort so it's just showing you here which which vertices are adjacent to each one and i i see that i've i've withdrawn this one here a goes to b and c there we go so it's called an adjacency list because a is adjacent to b and a is adjacent to c b is adjacent to a and adjacent to d so it's a list of the vertices that are adjacent to each other now for a directed graph and weighted graph you're going to have to incorporate the weights somehow into this let's see v goes to x with a weight of three so this could be like a tuple a pair w also goes to x with a weight of three x goes to z with a weight of six y goes to v with a weight of two and it goes to w with a weight of five and it goes to z with a weight of two and then finally z goes to y with a weight of one so you just have to incorporate the weight information into it and then finally we've got a set of edges and vertices start out by listening all the vertices and then listing the edges so we've got an edge that goes from a to b and i'll do these as tuples from a to c [Music] b to d c to d c to e uh almost done here i think that's it one two three four five i might have missed one but that's the idea and then for the directed weighted graph again you have to incorporate the weight information in with the edges so we've got v x three maybe and w x three x z 6 y v 2 right get the idea so you list out the set of all the vertices and you list out all the set of edges and there are probably other ways to represent a graph but these are kind of like the three main ones that you might see and the one that you picked just depends upon what kind of algorithm you're running the nice thing about the matrix or the two-dimensional array is it's very easy to like start at a particular vertex and see which ones are adjacent right you just scan through this row and you look for the places where there's numbers all right again you scan through there look this places where the numbers so two-dimensional arrays are very easy because most programming languages support two-dimensional arrays natively the problem with this is there's a lot of wasted space there's all these gaps in the data so you end up wasting some space so the adjacency list is a little bit nicer because it's very compact but it requires you to have like lists of lists and the idea of pointers and then sets also require you to have some kind of data structure that incorporates the notion of a set also being able to do things like sets of tuples so easy to do in a language like python probably a little bit harder to do in a language like c or java it still could be done just a little more wordy so again the data structure you use depends a lot on what algorithm you're running and what kind of runtime you're looking for are you looking for fast lookups you're looking for a compact representation are you looking for something that works natively in your language you know that's going to influence your choice well i didn't get a much chance to say hi to the people on youtube we've got oh just a couple people looking who were watching but we've got uh happy bull and someone named dydx and alexander that's great so i think that's it for ahead for today so we talked about a brief introduction to the kinds of algorithms and graphs that get used out in the real world at a website like or servers like stitch fix and also we looked at minimal spanning tree and we looked at ways to represent graphs in a computer next time on wednesday we'll look at the shortest path algorithm how do you what's what's the best way to get from point a to point b and you use this algorithm probably every day when you ask your phone how do i get from where i am now to the place i'm trying to go and it's just amazing that a website or a service like google maps or apple maps which has a giant graph of all the roads and all the streets basically in the whole world you can find the answer just a fraction of a second so again they're using a really fast algorithm they're not doing trial and error they're not doing brute force you're doing something a little more a little smarter a lot smarter than that but algorithm has been around for decades yes uh does it have to do something with the sorting algorithms uh well i mean it it's a search in the sense that you're searching for the destination vertex you're trying to go to okay so so it's not so it's nothing like binary search or bubble store i think uh the picture i have in my mind is like dropping a pebble into a puddle or a pond and watching those ripples go out so your pebble goes in where you want to start and then the algorithm ripples outward until it finally reaches the destination so in a large graph you can see it start to like branch out to all the neighbors and all the neighbors of that all neighbors of that but where the the smarts comes in is you obviously don't want to do paths that take you away from the destination you want to go towards the destination so you wouldn't consider ones that take you away and do a little bit of backtracking you might end up at a dead end and you have to back out that dead end and try a different path all right we'll see how that works on wednesday and you can probably find lots of animations for that on youtube plus some websites that do interactive demonstrations of it it's pretty neat all right that's it for today like i said on wednesday we'll visit shortest path and then we'll basically be done with graphs from a general point of view and we're ready to move on to more specialized types of graphs trees state machines million more machines and that'll occupy the rest of our our semester okay thank you it was great good i don't see any don't see any questions so have a good day thank you professor take care thank you yep take care and uh still got a couple youtube people so thanks for watching hope you got something out of it like i said next time we'll do shortest path algorithms and then we're on to finite state machines and then there'll be some there'll be some space left over at the end of the semester towards in december sometime and so we might get a chance to talk about some really cool stuff like uh there's quantum computing there's uh going more in depth into some of the stuff we talked about in in the the rest of the semester like hashing algorithms cryptography gosh there's all sorts of stuff we could do we'll get a few weeks of that in december so um i think i'll end it here have a good day and i'll see you later bye
Info
Channel: Barry Brown
Views: 266
Rating: undefined out of 5
Keywords:
Id: JuTA3pfOoP4
Channel Id: undefined
Length: 92min 35sec (5555 seconds)
Published: Mon Nov 08 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.