Do we still need dataclasses? // PYDANTIC tutorial

Video Statistics and Information

Reddit Comments

Am I a slow typer or is this guy a fast typer because he makes me feel really slow lol

👍 3 👤 u/TheCannings 📅 Jun 19 2021

While Pydantic is a useful library, it has a heavy-handed casting approach that can sometimes yield surprising results. This behavior is documented, and I would suggest exploring the casting/conversion rules before adopting the library for your app/project.

Here's an example of datetime conversion/casting that is perhaps not particularly obvious.

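import sys
from datetime import datetime
from pydantic import BaseModel


class Record(BaseModel):
    dt: datetime


def demo() -> None:
    # None of these values is a datetime, yet each one is coerced into one
    # instead of being rejected (see the output listed below).
    xs = [float('infinity'), '-inf', '0', 0, 0.1, 100, -7, '-777', sys.maxsize]
    for x in xs:
        record = Record(dt=x)
        print(record)


if __name__ == "__main__":
    demo()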

Yields

dt=datetime.datetime(9999, 12, 31, 23, 59, 59, 999999)
dt=datetime.datetime(1, 1, 1, 0, 0)
dt=datetime.datetime(1970, 1, 1, 0, 0, tzinfo=datetime.timezone.utc)
dt=datetime.datetime(1970, 1, 1, 0, 0, tzinfo=datetime.timezone.utc)
dt=datetime.datetime(1970, 1, 1, 0, 0, 0, 100000, tzinfo=datetime.timezone.utc)
dt=datetime.datetime(1970, 1, 1, 0, 1, 40, tzinfo=datetime.timezone.utc)
dt=datetime.datetime(1969, 12, 31, 23, 59, 53, tzinfo=datetime.timezone.utc)
dt=datetime.datetime(1969, 12, 31, 23, 47, 3, tzinfo=datetime.timezone.utc)
dt=datetime.datetime(2262, 4, 11, 23, 47, 16, 854774, tzinfo=datetime.timezone.utc)

Another friction point I've run into with the library is how it defines "required optional" fields.

https://pydantic-docs.helpmanual.io/usage/models/#required-optional-fields
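
For context, a minimal sketch of what a "required optional" field looks like: a field that accepts None but must still be passed explicitly. The model and field names here (Profile, nickname, middle_name) are made up for illustration and do not come from the linked page. The sketch uses Field(...) to mark the field as required.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Profile(BaseModel):
    # Hypothetical model, for illustration only.
    # Not required: may be left out entirely and then defaults to None.
    nickname: Optional[str] = None
    # "Required optional": must be passed explicitly, but None is accepted.
    middle_name: Optional[str] = Field(...)


print(Profile(middle_name=None))  # accepted, because the key is present

try:
    Profile()  # rejected, because middle_name was not supplied at all
except ValueError as error:  # pydantic's ValidationError subclasses ValueError
    print(error)
```

The surprise is that "optional" in the type hint only describes the allowed values, not whether the argument may be omitted.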

👍 3 👤 u/ElevenPhonons 📅 Jun 18 2021
Captions
Pydantic is a Python package that's an alternative to dataclasses, and it adds a couple of really cool features. In this video I'm going to show you how to use it and when you should choose Pydantic over the built-in dataclasses. Let's dive in.

If you're new here and you want to become a better software developer and gain a deeper understanding of programming in general, start now by subscribing and hitting the bell so you don't miss anything.

So why should you use Pydantic? It offers a lot of the functionality that dataclasses also offer, like defining a class with data attributes, dunder methods like equality, repr, and string, and more. Also, if you want to use Pydantic you're going to have to install it, whereas dataclasses is a built-in package. But there's a couple of really cool features that Pydantic adds, and that's extensive support for data validation, conversion, and sanitizing. And that's a pretty good reason to choose Pydantic over dataclasses.

Let's look at an example. Here's a JSON file. It contains a number of books. Each book has a title, author, publisher, ISBN number, price, and some other information, and we want to create a Python program where we can basically read and process this data in some way. So here's my simple Python script that has now only a main function that opens the JSON file and loads it, and then we can start printing out anything that's inside the data set. For example, I could print the first book in the list. So if I print that out, then this is what we're going to get. So we have this object, that's the first book. If I want to print the title, I can basically access this like a dictionary; that's basically how the json package works. So then we get this.

In principle it's fine to work with data in this way, but you're kind of limiting yourself. You don't know anything about the structure of the data, you can't validate the data easily or sanitize it or do things like that, and that's where a package like Pydantic comes into play. So I'd like to show you how to use Pydantic to work with data a bit more easily and add things like validation and sanitation.

So in order to do that, let's first import Pydantic, and then what we can do, similar to how you would do it with dataclasses, is create a class that inherits from a Pydantic model, and that becomes the basic structure of your data. So let's create a class Book, and we're going to use the BaseModel class for that. So we have our Book class, and very similar to how you would do it with dataclasses, we're going to define the fields that are part of the book. So we have a title, it's a string. We have an author, a publisher. Each book has a price, and that's a float. We have ISBN numbers, and these are actually optional, so not every book has both an ISBN-10 and an ISBN-13, so we're going to use the Optional type. And finally we have the subtitle, which is also optional. There we go.

So now we've defined this base model in Pydantic of Book, and what we can do then is create a list of books from the data that we loaded from the JSON file. So we create books, a variable that's a list of books, and we also need to import the List type. There we go. So this is a list of books, and we're going to use a list comprehension to convert the data that we read from the file into this list of books. So I'm going through the data, getting the items from the data, and then constructing a book from that item. What I'm doing here is unpacking the item into keyword arguments so that it sets the right value to the right attribute inside the Book class.

So now we have our books, and similar to dataclasses, Pydantic adds a couple of methods to easily print data and things like that. So I could print the first book, I'll just remove this, and then this is what you get. So it prints out the book; you also see it nicely formatted with the names of the attributes. Another thing you can do, because it's now a class, is actually access data inside the book using the attributes. So if you want to print the title of the book, I can write here the title, and you see I'm getting typing information now. So this is really helpful when you're looking through the data and you want to do something with it in your Python code, as opposed to when you were using raw JSON, where you didn't have any of that information. So printing out the title is now also really straightforward.

Where Pydantic is really helpful is when you want to add validation to your data. So for example, you want to make sure that the data that you're getting from this JSON file adheres to what you want the data to be like. One thing we could do is add validation for the ISBN-10 value. There's a particular rule about ISBN numbers, which is that the sum, the weighted sum, of these numbers should be divisible by 11. So we can add a validator that checks that these ISBN numbers are actually valid numbers, and the way you can do that in Pydantic is using a decorator. Let me show you how that works.

So this is the validator decorator that we're going to use, and we're going to use that on the ISBN-10 field, which is the field that we're going to check. And this is a class method, so I'm also adding a classmethod decorator here. It's not strictly needed, but you're going to run into problems with style if you remove it. And that function is going to get the value that we're going to need to validate, and by default we're just going to return that value, and then we can do some checks before we return it.

So one thing we need to do is check that the length of the ISBN number is indeed 10 digits. And sometimes some ISBN numbers contain dashes or maybe some empty space, etc., so I want to clean it up a little bit. So let's create a list of characters that doesn't contain any of that superfluous information, and we're going to use a list comprehension for that. And sometimes these digits can also be an X, lowercase or uppercase, to indicate that it's 10. So now we have our list of characters, and we're going to verify that the length of the character list is 10.
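
The model definition and loading step described earlier in the transcript might look roughly like the sketch below. The exact field names (isbn_10, isbn_13) and the sample record are assumptions, since the video's JSON file is not reproduced here. The optional fields are given an explicit default of None so that they can be left out.

```python
from typing import List, Optional

from pydantic import BaseModel


class Book(BaseModel):
    """Structure of one book record; the field names are assumptions."""

    title: str
    author: str
    publisher: str
    price: float
    isbn_10: Optional[str] = None
    isbn_13: Optional[str] = None
    subtitle: Optional[str] = None


# Placeholder standing in for the list that json.load() would return.
data = [
    {
        "title": "Example Book",
        "author": "A. Writer",
        "publisher": "Some Press",
        "price": "19.99",  # a string on purpose: it is converted to a float
        "isbn_10": "0306406152",
    }
]

# Unpack each dictionary into keyword arguments, building one Book per item.
books: List[Book] = [Book(**item) for item in data]

print(books[0])        # formatted with the attribute names
print(books[0].title)  # attribute access instead of dictionary lookup
```

Compared with indexing into the raw dictionaries, the attributes are typed, so an editor can offer completion and a malformed record fails when the model is constructed rather than later.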
And if it's not 10, we need to raise an error. I'm going to add a custom error type for this, because that's the neat way to do it. Let's call that an ISBN-10 format error, and that's an Exception subclass. And this is going to have an initializer. We want to give it the current value, so we can check what the wrong value is, as well as a message. There we go. So now let's raise that ISBN-10 format error here.

In order to check that the weighted sum of the ISBN number is indeed divisible by 11, we're going to need a convenience function to help us a little bit with that, to convert these characters to integers. I'm just going to include that inside this ISBN-10 valid function, because this is the only place where we're going to use it. So this gets a string and gives us back an integer. If it's an X, we're going to return 10; otherwise we simply convert the character to an integer. And now we can compute this weighted sum, and we're going to enumerate over the characters, and then what we're going to need is the weight, which is 10 minus the index, times the converted value, and that's going to give us our weighted sum. And if that weighted sum is not divisible by 11, then we're going to raise an error as well, and we can use the ISBN-10 format error again for this. So I'm going to copy this over. For completeness, let's also add a docstring here. There we go.

So now let's run this example again, and you'll see that it's going to run this validator. I don't see anything changing, because all the ISBN values in the data are actually valid. But let's change one value to something else. Let's say I change this to 1, and now it's no longer divisible by 11. So if I run the example, we're going to get a validation error that the digit sum should be divisible by 11.
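
The checksum logic walked through above can be sketched in plain Python, independent of Pydantic. The function name isbn10_is_valid is hypothetical. Unlike the video's validator, which raises a custom error, this sketch simply returns a boolean; the same checks could be placed inside a validator method.

```python
def isbn10_is_valid(value: str) -> bool:
    """Return True if value passes the ISBN-10 weighted-sum check."""
    # Keep only digits and the letter X, dropping dashes and spaces.
    chars = [c for c in value if c in "0123456789Xx"]
    if len(chars) != 10:
        return False

    def char_to_int(char: str) -> int:
        # X (either case) stands for the value 10.
        return 10 if char in "Xx" else int(char)

    # The weight is 10 minus the index, so it runs from 10 down to 1.
    weighted_sum = sum((10 - i) * char_to_int(c) for i, c in enumerate(chars))
    return weighted_sum % 11 == 0


print(isbn10_is_valid("0-306-40615-2"))  # True: weighted sum is 132 = 12 * 11
print(isbn10_is_valid("0-306-40615-1"))  # False: weighted sum is 131
```

Like the approach described in the transcript, this accepts an X in any position; a stricter check would allow it only as the final character.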
So this is a really useful aspect of Pydantic: when you're loading data, you can add these kinds of validation functions to make sure that the data is clean and that the data is defined in the way that you want to have it.

Another thing you can do is add validation on the whole of the model. For example, in this case we have a book that has either an ISBN-10 or an ISBN-13. These are optional types, so they can be missing, but we'd like to make sure that every book has at least one of the two. So a validator like this won't work, because this validates individual fields, but you can validate a whole model as well. So let's add a validator that checks that the book has either an ISBN-10 or an ISBN-13, or both; both is also acceptable.

So then what we're going to do is create a root validator. In the root validator you can specify whether it should validate before it converts the values into a model or after. I'm just going to use before, so we have access to the raw data. And this is also a class method, and this is going to check that ISBN-10 or ISBN-13 is there. So what we need to do is check that one of these two is in the values dictionary. Let's call it values, that makes a lot more sense. So if both of these things are not inside the values, it means we're missing something, and then we need to raise an error. Let's also create a custom error for that particular validation issue. Let's add an initializer. We want to have the title of the book where the problem occurs and the message. There we go. And then in the validator we're going to raise this error, and the result of the root validator should be the list of values.

So now we added an extra validator; let's see how that works. So if I do this, oh, I still get this ISBN-10 formatting error, so I'm going to change this back to zero, so then it should resolve that. Yeah, so it's working again. And now let's go to that same set of data, and let's remove the ISBN numbers from, let's say, the Design Patterns book. So here I'm removing these, there. And now if I run the example, I should get a root validation error. There, you see: document should have either ISBN-10 or ISBN-13. Let me put back the ISBN numbers. There we go.

There's a few other things you can do as well with Pydantic. For example, Pydantic has a Config class that you can add to a base model to change some settings. For example, what you could do is create an immutable object. So the way to do that is to have a class called Config, and we're going to set the allow_mutation value to False, and now books are immutable objects. So for example, if I try to change the book, it's going to give me an error. There, you see: Book is immutable. There are other options you can set as well as part of this Config object. For example, you can also automatically convert all string values to lowercase, like so, and then if I print, let's say, the first book, then this is what we're going to get. So you see all the titles, author, etc. are all now in lowercase, and that can be useful sometimes if you need to do some data processing.

Another thing you can do is convert these models back to Python dictionaries, simply using the dict method. So if I print this, then I'm going to get a dictionary containing the values of this particular book. And you can even do things like excluding particular values. Let's say you want to exclude the price, then this is how you do it, and now there is no price information. There's also an include option, so if you add include, then you're going to exclude everything else. So now we're only getting the price.

Other things you can do easily in Pydantic is create a copy of an object. So let's say I have this book, and then I can just create a copy using the copy function there, and now I've created a copy of that particular book. If your model has more complicated things like lists inside of the model and things like that, you can also create a deep copy using the deep flag. Yeah, so there is a deep, and if you set that to True, it's going to create a deep copy and then make a copy of everything that's inside the base model as well.

There's a couple of things that Pydantic can do that I didn't really talk about in this video. For example, it can generate a JSON schema automatically from the base models you defined. It also has a BaseSettings class that allows you to easily read configuration data from, for example, environment variables. This is particularly useful if you have a database and you store the credentials in an environment variable and you want to easily access that in your application.

So overall, I like Pydantic. I think it's a good solution for importing and validating data. I'm not saying you should never use dataclasses anymore. I think dataclasses are still a really good alternative, especially if you don't need validation of your data. It's always nice to work with built-in packages if you can, because that way, if somebody else needs to run your code, they don't have to install other third-party things. I did a video about dataclasses a couple of weeks ago; you can watch that here.

So I hope you enjoyed this example. As usual, the code I worked on is available in the Git repository. The description is in the link below. The link is in the description below, that makes a lot more sense. Thanks for watching. Take care, see you next time.
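
The model-wide check described in the transcript (every book needs an ISBN-10, an ISBN-13, or both) could be sketched as follows. The field names, the method name, and the error text are assumptions, and a plain ValueError stands in for the custom error class built in the video. The sketch uses the root_validator decorator with pre=True, matching the "before" choice mentioned in the transcript.

```python
from typing import Optional

from pydantic import BaseModel, root_validator


class Book(BaseModel):
    """Cut-down model; only the fields needed for the check are shown."""

    title: str
    isbn_10: Optional[str] = None
    isbn_13: Optional[str] = None

    @root_validator(pre=True)
    @classmethod
    def check_isbn10_or_13(cls, values):
        """Make sure at least one of the two ISBN fields was supplied."""
        # pre=True: values holds the raw input, before field conversion.
        if "isbn_10" not in values and "isbn_13" not in values:
            raise ValueError("Document should have either an ISBN-10 or ISBN-13")
        return values


print(Book(title="Has one ISBN", isbn_13="978-0-306-40615-7"))

try:
    Book(title="No ISBN at all")
except ValueError as error:  # pydantic's ValidationError subclasses ValueError
    print(error)
```

A field-level validator cannot express this rule because it only sees one field at a time, which is the reason the transcript gives for using a root validator.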
Info
Channel: ArjanCodes
Views: 38,113
Rating: 4.979239 out of 5
Keywords: pydantic tutorial, pydantic basemodel, pydantic validator, pydantic vs dataclass, dataclass, python, python tutorial, python programming, python for beginners, python coding, pydantic, dataclasses, pydantic vs dataclasses, python data cleaning, python data validation, data validation in python, data sanitization tool, python dataclasses, dataclasses vs pydantic, python programming tutorial, python for beginners vscode, python interview questions, python programming course
Id: Vj-iU-8_xLs
Length: 16min 32sec (992 seconds)
Published: Fri Jun 18 2021