__new__ vs __init__ in Python

Captions
Hello and welcome. I'm James Murphy. In this video, we're going to be talking about one of Python's most misunderstood features: the double underscore, or dunder, __new__ method of a class, and how it differs from the __init__ method.
Before we get into this programming tutorial, if you're someone interested in technical programming stuff, I know you've thought about making your own website before, so please consider this video's sponsor Hostinger for all your web hosting needs. Try Hostinger's Premium Shared Hosting Plan. It comes with a free domain name. You can set up a WordPress blog in no time with their built-in WordPress support. You get one hundred free email addresses at your new domain name, and all of this has a 30-day money back guarantee. Click the link in the description and use coupon code MCODING in all caps at checkout to get up to 91% off all yearly plans.
Both __new__ and __init__ are part of the process of constructing an object. Both of them get called when you try to create an instance of a class. In this case, we see that the __new__ method gets called with class A, and then the arguments and keyword arguments, whereas the __init__ method gets passed an actual instance of the class and then the same arguments and keyword arguments. Here is approximately what happens when you actually execute a line of code like this. First, the __new__ method is called with the class and the given arguments and keyword arguments. Then, if the returned object has the correct type, it will go ahead and call the __init__ method. So this really points out what the difference is between __new__ and __init__. __new__ is responsible for creating and returning the actual object, whereas __init__ is responsible for initializing it, setting default values and things like that. We can further confirm this by looking at the return types. __new__ is actually supposed to return something, an object, whereas __init__ doesn't return anything. It just initializes values.
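The construction process described above can be sketched roughly as follows. This is a minimal illustration, not the code shown on screen: the class name A comes from the transcript, while the construct helper is a hypothetical stand-in for what Python approximately does when you write A(1, 2, three=3).

```python
class A:
    def __new__(cls, *args, **kwargs):
        # __new__ receives the class itself, plus the call arguments.
        print(f"__new__ called with {cls.__name__}, {args}, {kwargs}")
        # object.__new__ only needs the class; it creates the bare object.
        return super().__new__(cls)

    def __init__(self, *args, **kwargs):
        # __init__ receives the instance that __new__ returned.
        print(f"__init__ called with an instance of {type(self).__name__}, {args}, {kwargs}")
        self.args = args
        self.kwargs = kwargs


def construct(cls, *args, **kwargs):
    """Hypothetical helper: approximately what cls(*args, **kwargs) does."""
    obj = cls.__new__(cls, *args, **kwargs)
    if isinstance(obj, cls):
        # __init__ only runs when __new__ returned an instance of cls.
        type(obj).__init__(obj, *args, **kwargs)
    return obj


a = A(1, 2, three=3)             # the normal way
b = construct(A, 1, 2, three=3)  # the approximate spelled-out equivalent
```

Both calls print the __new__ line first and the __init__ line second, and neither call relies on __init__ returning anything.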
This explains why you don't return anything from __init__. It's __new__'s job to actually make the new object. It behaves like a class method (technically it is a special-cased static method that receives the class as its first argument), so it takes in the class that it's supposed to return an instance of, and then it needs to return an instance of that class. Now notice here that, because of this isinstance check, the __init__ method will only be called if __new__ does actually return an object of the given type. If I return the wrong kind of object from __new__, then I'll just be left with an uninitialized object and __init__ will never be called. We can see this happen by commenting out the return line so it returns None. Now we see that __new__ was called, but __init__ was never called, and the value of x is None. That can be really annoying, so always remember to return something from __new__. That's its job.
I'd say that most Python programmers learn about __init__ fairly early. Pretty much as soon as they learn about classes, they learn about __init__ and how to make their own classes that have their own attributes. But learning about __new__ is something that usually doesn't come up until much, much later, sometimes years even. So, when would you ever need to actually change how the object is created rather than just how it's initialized? __new__ was added to Python primarily to allow programmers to subclass built-in immutable types.
Suppose that I want to make an uppercase tuple type. It's going to be the exact same thing as a tuple, except whenever you create one, it always uppercases its arguments. You might be tempted to try to accomplish this using the __init__ function. You take in an iterable as its argument. You're going to loop through those things and change the ith thing in the tuple to the uppercase version of that argument. But when we run the code, we see that that doesn't work. By the time that __init__ is called, it's too late. The object already exists and it cannot be changed.
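The failing __init__ attempt just described might look something like this. It is a sketch under the assumption that the class is called UpperTuple and takes a single iterable of strings; the exact code in the video may differ.

```python
class UpperTuple(tuple):
    def __init__(self, iterable):
        # By the time __init__ runs, tuple.__new__ has already built the
        # immutable tuple, so item assignment is not allowed.
        for i, item in enumerate(iterable):
            self[i] = item.upper()  # raises TypeError


error_message = None
try:
    UpperTuple(["hi", "there"])
except TypeError as exc:
    error_message = str(exc)
    print(f"TypeError: {error_message}")
```

The tuple is created successfully by the inherited __new__; it is only the attempt to modify it afterwards that fails.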
The only possible way to get around this is to intercept and modify the arguments before the object is created. This can be accomplished using __new__. Here we take in the same iterable that we would have taken into __init__ and then we modify the arguments. We create a generator here that's going to be a new iterable that uppercases all the strings that it sees, and then we go ahead and pass that new iterable to the tuple class. Then we end up with an actual uppercase tuple. Again, this is only possible because we're modifying the arguments before this immutable thing is actually created. Now, when we run the example, it works. We have lowercase "hi" and "there", and when it's printed out, we have uppercase "HI" and "THERE".
You might be wondering, in a case like this, why wouldn't I just create a class that contains a tuple instead of inheriting from tuple? And that's a very valid point. The reason, at least for a case like this, would be performance. You could certainly make a proxy object that uppercases the arguments and then forwards all of its method calls to the underlying tuple. However, doing that is going to significantly impact the performance. Tuple is a built-in Python object, primarily written in C. Because it's written in C, it's much faster than any Python code that you could write. So if you wrote a wrapper around it that was proxying all the calls, that wrapper would be written in Python, and it would take a significant performance hit. You would also be in a similar situation for any type written as a C extension.
Here's another interesting way that __new__ can be used: creational design patterns. Here is an implementation of what's called the singleton design pattern. The purpose of a singleton is that there's only supposed to be one of them. You might think about it like a global configuration object that everything is supposed to share, no matter how many times you try to create one.
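Returning to the uppercase tuple from this passage, the __new__-based fix can be sketched as below. The class name UpperTuple and the variable names are assumptions; the technique (wrap the incoming iterable in a generator and hand it to tuple's own __new__) is the one the transcript describes.

```python
class UpperTuple(tuple):
    def __new__(cls, iterable):
        # Modify the arguments *before* the immutable tuple exists.
        upper_iterable = (item.upper() for item in iterable)
        # tuple.__new__ builds the actual object from the new iterable.
        return super().__new__(cls, upper_iterable)


words = UpperTuple(["hi", "there"])
print(words)
```

Because the result is a real tuple subclass, indexing, iteration, and comparison all go through the built-in C implementation rather than through Python-level forwarding.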
You're always supposed to get back the same instance. Since you can only ever have one instance of a singleton, that prevents everyone from getting out of sync. They just all have a reference to the same object. So if you have something like a global config, you could change an option in one place and then it would immediately be available to all of the other places that had instances of that object. Personally, I don't think you should ever use a singleton. It has a lot of the same problems that just keeping everything in a global variable would have. And for that reason, I can't recommend it. But it does showcase an interesting use case. When you try to create the singleton, if there's no instance that's already set, then it actually creates one. However, if the instance has already been set, then we just return the existing instance. Notice an important detail here. __new__ is supposed to return an instance of the class. However, I never said it had to return a *new* instance. It's perfectly fine to return something that already exists. We can use this example code to check that there is actually only one instance of the singleton. So here, we create the first one. Then we create another one, and then we check that they are literally the same object, and indeed we see that x is y is True. So it was literally the same object that was returned from the two calls to the constructor.
Here's another example that's like the singleton example on steroids. Imagine you have an object that is incredibly expensive to initialize. Maybe initialization requires something like going out to a database to read the results or reading it from a file. If that's the case, you definitely don't want to create a new instance if you already have one in memory. The loaded variable here keeps track of which clients have been loaded from the database. Whenever we're asked to create a new instance, we first check if there's already an existing one that has the same client ID.
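The singleton described above can be sketched in a few lines. The names Singleton and _instance are assumptions made for illustration; the idea of returning an already existing object from __new__ is taken from the transcript.

```python
class Singleton:
    _instance = None  # the one shared instance, created lazily

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            # First call: actually create the object.
            cls._instance = super().__new__(cls)
        # Every later call: hand back the existing object.
        return cls._instance


x = Singleton()
y = Singleton()
print(x is y)
```

One caveat worth keeping in mind with this sketch: if the class also defined __init__, it would run again on every call, because the returned object is always an instance of the class.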
If there's already an existing one, we just return that. If not, then we go ahead and create the new client, mark them as loaded, and then do whatever is needed, like going out to the database, to initialize them. Think about this _init_from_file function as actually going out to the database and doing all the hard work. I didn't actually implement any of that stuff, but this is similar to how some big frameworks work. When we actually use the client class, the first time that we construct a client with ID 0, it goes out and reads the client from the file or database or whatever. But the second time it just returns the existing one. But if we try to construct a client with a different client ID, then it does go out to the database again and read that one. Doing it this way minimizes the number of times that we have to make those really expensive calls, like going out to the database. One could argue that a lot of these examples are probably better suited for a factory pattern rather than overriding __new__, but this video is about learning how you can use __new__.
OK, here's another really fun one. Imagine that we want an encrypted file class that's able to read encrypted files. Basically, you just pass it the key to the file and then tell it what kind of encryption is used and where the file is located. Of course, we don't want to have one mega class that just has every different possible kind of encryption and methods for it inside. Instead, we'll have different classes for different kinds of encryption. For a plain text file, we just open the file and return its contents. But for a ROT13 file, we would read the file and then decode it as ROT13 and return that. If we had a file that was encrypted with a one-time pad that we're supposed to XOR with, then we read the file as bytes, XOR the bytes with the key to the file, and then decode and return that. Now all of the logic for how to read each kind of encrypted file is delegated to the class that's in charge of that.
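The cached client example from the first half of this passage might be sketched like this. The names Client, _loaded, and _init_from_file follow the transcript; the body of _init_from_file is a placeholder, since the video does not implement the database access either.

```python
class Client:
    _loaded = {}  # client_id -> Client instance already in memory

    def __new__(cls, client_id):
        client = cls._loaded.get(client_id)
        if client is not None:
            print(f"returning existing client {client_id} from cache")
            return client
        # Not loaded yet: create it, mark it as loaded, then initialize it.
        client = super().__new__(cls)
        cls._loaded[client_id] = client
        client._init_from_file(client_id)
        return client

    def _init_from_file(self, client_id):
        # Placeholder for the expensive work (file or database access).
        print(f"reading client {client_id} from file, database, etc.")
        self.id = client_id


first = Client(0)   # does the expensive load
second = Client(0)  # served from the cache
other = Client(1)   # different ID, so it is loaded again
```

Initialization is done inside __new__ here rather than in __init__, so the expensive step runs only once per client ID.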
But I don't want to do something like hard-code the plain text file class here, because the kind of file might be an argument of the function that's not known ahead of time. I want to just pass the string that has all the information to EncryptedFile and get back the right kind of class. Here's how we accomplish that. First off, we'll use this registry dictionary to map from prefixes like "rot13" to the classes that actually implement the functionality for those. We populate the registry by using this __init_subclass__ hook. This function is called whenever a class subclasses EncryptedFile. Rot13Text subclasses EncryptedFile and has a prefix of "rot13". When this line defining the class is executed, it calls the __init_subclass__ function of the EncryptedFile class. It'll pass in the subclass, the Rot13Text subclass, along with the prefix that it had. Then we just store the prefix and class in the registry. Now, in the __new__ method of EncryptedFile, we just parse out the prefix from the path and then look that prefix up in the registry, which tells us which subclass we should use. We then create a new instance of that subclass and return it. So if the path starts with "file:", we return a plain text object. But if it starts with "rot13", then we return a Rot13Text object.
Here we have three text files that, when you print them out, you see this: the first one "hello world", the second one a jumbled-up mess, the third one another jumbled-up mess. But once I add in the correct prefixes, "rot13" and one-time pad ("otp") with the correct key, then I see something else. After decrypting, we see that the first two said "hello world" and the last one said "subscribe to mcoding".
The last very common use case for overriding __new__ is metaclasses. Metaclasses very often want to or need to override the __new__ method. I don't want to go into the details about metaclasses too much. I'm going to have a whole new video about them next week. Be sure to stay tuned.
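The EncryptedFile registry described in this part of the transcript could be sketched as follows. EncryptedFile, Rot13Text, and the prefixes "file", "rot13", and "otp" come from the transcript; the names PlainText, OneTimePadXorText, _registry, and file_path, the "prefix:path" string format, and the temporary demo files are all assumptions made for illustration.

```python
import codecs
import os
import tempfile


class EncryptedFile:
    _registry = {}  # prefix -> subclass that knows how to read that format

    def __init_subclass__(cls, prefix, **kwargs):
        # Called automatically each time a subclass is defined.
        super().__init_subclass__(**kwargs)
        cls._registry[prefix] = cls

    def __new__(cls, path, key=None):
        # Assumed format: "<prefix>:<file path>", e.g. "rot13:/tmp/a.txt".
        prefix, _, file_path = path.partition(":")
        subclass = cls._registry[prefix]
        obj = object.__new__(subclass)
        obj.file_path = file_path
        obj.key = key
        return obj

    def read(self):
        raise NotImplementedError


class PlainText(EncryptedFile, prefix="file"):
    def read(self):
        with open(self.file_path, encoding="utf-8") as f:
            return f.read()


class Rot13Text(EncryptedFile, prefix="rot13"):
    def read(self):
        with open(self.file_path, encoding="utf-8") as f:
            return codecs.decode(f.read(), "rot_13")


class OneTimePadXorText(EncryptedFile, prefix="otp"):
    def read(self):
        with open(self.file_path, "rb") as f:
            data = f.read()
        # XOR each byte with the matching key byte, then decode as text.
        return bytes(b ^ k for b, k in zip(data, self.key)).decode("utf-8")


# Demo with temporary files (made-up contents and key, for illustration only).
message = "hello world"
key = b"secretkey42"  # same length as the message
with tempfile.TemporaryDirectory() as tmp:
    rot13_path = os.path.join(tmp, "rot13.txt")
    with open(rot13_path, "w", encoding="utf-8") as f:
        f.write(codecs.encode(message, "rot_13"))
    otp_path = os.path.join(tmp, "otp.bin")
    with open(otp_path, "wb") as f:
        f.write(bytes(b ^ k for b, k in zip(message.encode("utf-8"), key)))

    rot13_file = EncryptedFile("rot13:" + rot13_path)
    otp_file = EncryptedFile("otp:" + otp_path, key=key)
    rot13_text = rot13_file.read()
    otp_text = otp_file.read()

print(type(rot13_file).__name__, rot13_text)
print(type(otp_file).__name__, otp_text)
```

The caller only ever names EncryptedFile; which subclass comes back is decided at run time by the prefix in the string, which is the point of routing construction through __new__.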
Info
Channel: mCoding
Views: 46,927
Rating: 4.9720864 out of 5
Keywords: python
Id: -zsV0_QrfTw
Length: 10min 50sec (650 seconds)
Published: Sat Sep 25 2021