Hi, I'm Gayle Laakmann McDowell, author of Cracking the Coding Interview. Today we're going to cover one of my favorite topics,
which is big O, or algorithmic efficiency. I'm going to start off with a
story. Back in 2009, there was this company in South Africa, and they had really slow internet that they were really frustrated by. They actually had two offices about 50 miles apart, so they set up a little test to see whether it would be faster to transfer big chunks of data over the internet, with their really slow internet service provider, or via carrier pigeon. Literally, they took a bunch of USB sticks, or probably actually one giant one, strapped it to a pigeon, and flew it from one office to the next. They got a ton of press over this test; the media kind of loves this stuff, right? And of course the pigeon won, right? It wouldn't be a funny story otherwise. So they got a whole bunch of press over it, and people were like, oh my god, I can't believe a pigeon beat the internet. But the problem was, it was actually kind of a ridiculous test, because
here's the thing. Transferring data over the internet takes longer and longer the more data there is. With certain simplifications and assumptions, that's not really the case with pigeons. A pigeon might be really slow or really fast, but it basically always takes the same amount of time to carry some chunk of data from one office to the next. It just has to fly 50 miles. It's not going to take longer because the USB stick contains more data, so of course, for a sufficiently large
file the pigeon's going to win. In big O, the way we talk about this is: we describe the pigeon's transfer speed as O of 1. It's constant time with respect to the size of the input; it doesn't take longer with more input.
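As a toy sketch of that contrast (all of the numbers here are made-up assumptions for illustration, not figures from the story):

```python
# Toy model of the pigeon-versus-internet race. The constants below are
# invented for illustration; they are not from the actual experiment.

PIGEON_FLIGHT_HOURS = 1.0   # assumed: flying 50 miles takes a fixed time
INTERNET_MBPS = 0.5         # assumed: a really slow link, in megabits per second

def pigeon_hours(megabytes):
    # O(1): the flight time doesn't depend on how much data is on the USB stick
    return PIGEON_FLIGHT_HOURS

def internet_hours(megabytes):
    # O(n): transfer time grows linearly with the amount of data
    return megabytes * 8 / INTERNET_MBPS / 3600
```

Doubling the megabytes doubles internet_hours but leaves pigeon_hours unchanged, so past some file size the pigeon always wins.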
Again, that's with certain simplifications and assumptions about the pigeon's ability to carry a USB stick. But the internet transfer time is O of n: it scales linearly with respect to the amount of input. Twice as much data is going to take roughly twice as much time. So that's what big O is. Big O is essentially an equation that describes how the runtime scales with respect to some input variables. I'm going to talk about four specific rules, but first let me show you
what this might look like in code. So let's consider this simple function that just walks through an array and checks whether it contains a particular value. You would describe this as O of n, where n is the size of the array. What about this method that walks through an array and prints all pairs of values in the array? Note that it prints both orderings: if the array has the elements 5 and 2, it'll print both 2 comma 5 and 5 comma 2. You'd describe this as O of n squared, where n is the size of the array. You can see that because the array has n elements, it has n squared pairs, so the amount of time it takes to run that function is going to increase with respect to n squared. Okay, so those are the basics of big O. Let's take another real-world example. How would you describe the runtime to
mow a square plot of land? So you have this square plot of grass, and you need to mow all of it. What's the runtime? It's kind of an interesting question, and you could describe it one of two ways. One way is O of a, where a is the amount of area in that plot of land. Remember that big O is just an equation that expresses how the time increases, so you don't have to use n in your equation; you can use other variables, and often it makes sense to do that. So you'd describe this as O of a, where a is the amount of area. You could also describe it as O of s squared, where s is the length of one side, because it's a square plot of land, so the amount of area is s squared. It's important to realize that O of a and O of s squared are saying essentially the same thing, just like when you describe the area of a square: you could describe it as 25, or you could describe it as 5 squared. Just because it has a square in it doesn't make it bigger.
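As a toy sketch (modeling mowing as visiting each unit square of grass, which is my own framing, not from the original example):

```python
def mow_by_area(area):
    # O(a): one unit of work per unit square of grass
    mowed = 0
    for _ in range(area):
        mowed += 1  # mow one square unit
    return mowed

def mow_by_side(side):
    # O(s squared): walk an s-by-s grid; s * s equals the area a,
    # so this is exactly the same amount of work
    mowed = 0
    for row in range(side):
        for col in range(side):
            mowed += 1  # mow the square at (row, col)
    return mowed
```

mow_by_area(25) and mow_by_side(5) do identical work, which is the 25-versus-5-squared point above.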
So both are valid ways of describing the runtime; it just depends on which might be clearer for that problem. With that said, let me go on to four important rules to know with big O. The first one: if you have two different
steps in your algorithm, you add up those steps. So if you have a first step that takes O of a time and a second step that takes O of b time, you add up those runtimes and get O of a plus b. For example, say you have an algorithm that first walks through one array and then walks through another array. It takes the amount of time it takes to walk through each of them, so you add up their runtimes. The second rule is that
you drop constants. So let's look at two different ways you can print out the min and max elements in an array. One finds the min element and then finds the max element; the other finds the min and the max simultaneously.
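The on-screen code isn't captured in this transcript, but the two versions might look roughly like this sketch:

```python
def min_max_two_passes(arr):
    # Two separate loops: one pass for the min, then one pass for the max.
    smallest = arr[0]
    for x in arr:
        if x < smallest:
            smallest = x
    largest = arr[0]
    for x in arr:
        if x > largest:
            largest = x
    return smallest, largest

def min_max_one_pass(arr):
    # A single loop that tracks both the min and the max as it goes.
    smallest = largest = arr[0]
    for x in arr:
        if x < smallest:
            smallest = x
        if x > largest:
            largest = x
    return smallest, largest
```

Both do work proportional to n, the length of the array, which is why both are described as O of n.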
Now these are fundamentally doing the same thing; they're doing essentially the exact same operations, just in slightly different orders. Both get described as O of n, where n is the size of the array. It's really tempting to see two different loops and describe it as O of 2n, so it's important to remember that you drop constants in big O. You do not describe things as O of 2n or O of 3n, because you're just looking at how things scale roughly. Is it a linear relationship? Is it a quadratic relationship? You always drop constants. The third rule is that if you
have different inputs, you're usually going to use different variables to represent them. So let's take this example where we have two arrays, and we're walking through them to figure out the number of elements in common between them. Some people mistakenly call this O of n squared, but that's not correct. When you talk about runtime, if you
describe things as O of n or O of n squared or O of n log n, n must have a meaning; algorithms aren't inherently one or the other. So when you describe this as O of n squared, it really doesn't make sense, because what would n mean? n is not the size of the array, since there are two different arrays. What you want to describe instead is O of a times b. Again, fundamentally, big O is an equation that expresses how the runtime scales, so you describe this as O of a times b. The fourth rule is that you drop
non-dominant terms. So suppose you have this code here that walks through an array and prints out the biggest element, and then, for some reason, prints all pairs of values in the array. The first step takes O of n time; the second step takes O of n squared time. As a less simplified form, you could describe this as O of n plus n squared. But note what happens when you compare it to the runtime O of n squared and to the runtime O of n squared plus n squared. Both of those reduce down to O of n squared, and O of n plus n squared logically sits between them, so it must reduce to O of n squared as well. The n squared term dominates the O of n term, so you drop the non-dominant term; it's the n squared that's really going to drive how the runtime changes. If you want, you can look up a more academic explanation of exactly when you can and can't drop things, but this is the layman's explanation for it. We've now covered the very basic pieces
of big O. This is an incredibly important concept to master for your interviews, so don't be lazy here. Make sure you really, really understand it, and try a whole bunch of exercises to solidify your understanding. Good luck.
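The on-screen code referenced earlier isn't captured in this transcript, so here's a minimal Python sketch of those snippets, with the runtime of each. This is my reconstruction, not the original slides:

```python
def contains(arr, target):
    # O(n), where n is len(arr): walk the array looking for target.
    for x in arr:
        if x == target:
            return True
    return False

def print_pairs(arr):
    # O(n squared): prints every ordered pair, so both (2, 5) and (5, 2).
    for x in arr:
        for y in arr:
            print(x, y)

def walk_both(arr_a, arr_b):
    # Rule 1, add different steps: O(a + b),
    # where a is len(arr_a) and b is len(arr_b).
    for x in arr_a:
        print(x)
    for y in arr_b:
        print(y)

def count_common(arr_a, arr_b):
    # Rule 3, different inputs get different variables:
    # O(a * b), not O(n squared).
    count = 0
    for x in arr_a:
        for y in arr_b:
            if x == y:
                count += 1
    return count

def max_then_pairs(arr):
    # Rule 4, drop non-dominant terms:
    # O(n) + O(n squared) = O(n + n squared), which reduces to O(n squared).
    biggest = arr[0]
    for x in arr:          # O(n) pass to find the max
        if x > biggest:
            biggest = x
    print(biggest)
    print_pairs(arr)       # O(n squared) pass over all pairs
```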