Pickling Data With Python!

Video Statistics and Information

Captions
What's up, everyone, welcome back to another video. In this one we're going to talk about pickling and unpickling data. Pickling is taking your Python object, like a dictionary or a list, and saving it to a serialized file format, so that when you're ready at a later time you can unpickle it and quickly get your Python object back. It's very useful when you're working with data that doesn't arrive in an ideal format. Typically you'll get your data, unzip it, clean it up, parse it, and so on to get it into its final form, like a Python dictionary, and then you'll save it to a pickle file. Later you can read it, get your Python object back, and do your analysis. That's what we're going to talk about in this video, so let's get started.

Before we begin, I'd like to point out this warning that's on the pickle web page: the pickle module is not secure against erroneous or maliciously constructed data. What this means is someone could pickle something like a virus, and if you were to unpickle that file, you're vulnerable to an attack. So the guidance is to never unpickle data received from an untrusted or unauthenticated source. Basically, only unpickle data that you've generated yourself or that you're sure was generated safely; never unpickle data if you don't know its source.

With that out of the way, let's jump over to the Jupyter notebook and see how this module works. The first thing we're going to do is import pickle, and because we need some random data I'm also going to import numpy as np. Then let's create a dictionary; we'll call it data_dict and give it some random data. We'll call the first column volts and just use np.random.random with ten points, and then we'll call the other column current and again use random data with ten points. Oops, I spelt pickle wrong. And then, just to see what
this looks like, let's run it, and you can see that it's just a dictionary with two columns, each a numpy array of random data. Cool.

What we're going to do now is pickle our data_dict, and the way we do that is with the with statement. Typically when you open a file you would say my_file equals open, specify the file name and how you're going to write to it, then say my_file.write to write some data to it, and then my_file.close. That's the traditional way of working with files, but the more Pythonic way is to use the with statement. The advantage of the with statement is that it handles the opening and closing automatically, so it's just a cleaner, more Pythonic way of reading and writing files.

The way we do that is we say with, then open, and then we give it a file name; I'm just going to call it data_pick, and the file extension we'll use is .pkl. Then we specify how we're going to write to it: we'll say wb, where the w means write and the b means binary. The next thing we do is name the file object, so we'll say as pickle_file. Cool.

Now we're going to write the data to our new file. The way we do that is we say pickle.dump, and what we specify first is the data itself, so data_dict, and then the file object, so pickle_file. Now when we run this it creates that data_pick.pkl file with our data_dict information in it. If I go to my directory you can see the file it created, data_pick.pkl.

So we've saved data to it; now let's read data from it. It's a pretty similar setup. We're going to use the with statement again, so we'll say with open and specify the file name, data_pick.pkl, and then we say how we're going to use this file: now we say rb, r for read and b for binary. Then we just give the file object a name, and we can call it the same thing, pickle_file, and
now, in order to unpickle it, we specify the name we want to give the object; I'm just going to call it new_data. It's going to be equal to pickle.load, so we're loading from our file, and we pass it the file object name. That's it, that's all we need to do, and now we have an object with our data in it. If I run this you can see we get our dictionary back: just a two-column dictionary with numpy arrays of random data.

So that's it, it's pretty simple. Why would you use this? Again, if for example you're reading some data from some source, like online, and you need to clean up and unpack the data, all that effort can be CPU intensive and time intensive, so you don't want to do it twice. Once you've cleaned up your data it's a good idea to pickle it so that you can use it later on down the road.

So that's pickling. In the next one we'll talk more about file structures, and then we'll probably move into pandas. If you guys liked the video, give it a like; if you really liked the video, hit the subscribe button, and stay tuned for the next one. See you guys! [Music]
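
The save-and-load steps described in the captions can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the exact notebook code from the video: it swaps NumPy's np.random.random for the standard library's random module so it runs without extra installs, and it writes into a temporary directory rather than the notebook's working directory. The names data_dict, pickle_file, and new_data follow the captions; the file name data_pick.pkl is the closest reading of what the captions say.

```python
import os
import pickle
import random
import tempfile

# Two "columns" of ten random points each, mirroring the video's dictionary.
# Assumption: plain lists from the stdlib random module stand in for NumPy arrays.
data_dict = {
    "volts": [random.random() for _ in range(10)],
    "current": [random.random() for _ in range(10)],
}

# Assumption: a temporary directory keeps this sketch from leaving files behind.
pickle_path = os.path.join(tempfile.mkdtemp(), "data_pick.pkl")

# "wb" = write, binary. The with statement closes the file automatically.
with open(pickle_path, "wb") as pickle_file:
    pickle.dump(data_dict, pickle_file)  # object first, then the file object

# "rb" = read, binary. Only unpickle files you created or fully trust.
with open(pickle_path, "rb") as pickle_file:
    new_data = pickle.load(pickle_file)

# The loaded object should be equal to the one that was saved.
print(new_data == data_dict)
```

With NumPy arrays in place of the lists, the same dump and load calls apply, but comparing the two dictionaries would need an array-aware check such as np.array_equal on each column.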
Info
Channel: Mark Jay
Views: 35,305
Rating: 4.933589 out of 5
Keywords: pickle, pickling, python, binary, serialize, data storage, pandas, python3, jupyter, cPickle, parsing, read data, DIY, tutorial, programming, unpickle, coding, learn to code
Id: Pl4Hp8qwwes
Length: 6min 59sec (419 seconds)
Published: Sun Nov 12 2017