Migrate CSV data to MS Access Using Python with Pandas and SqlAlchemy

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
i'm your host sean mckenzie thank you for joining me once again on my channel on data engineering in this episode we go back to our python playlist and we're going to look at how to migrate data from csv to microsoft access which is a migration that's not totally obvious in how to do it and so i thought what i would do is go through a demonstration of that today which uses pi odbc and sql alchemy and some other goodies so without further ado let's get to our migration from csv to access okay so i've got this csv file in excel here and just open it in excel and now you can see it's got a company id you know work code some hours date worked and stuff like that and this is pretty typical of what we might see if i open this in my notepad plus plus you can see what the what the file looks like underneath i can open the file and scroll through it you can see let's see how many rows so it's got 445 for 444 four rows plus one row as a header at the top there so it's pretty typical of what you might see when you have a csv file and i'll go ahead and create an access table here in our usual database that we've been using and what i'll do first i'm going to throw an index number on there because the the pandas to csv usually puts an index column on it this will just make it easier um this is if you're you know directly you know dumping data into access from a csv uh there are other ways if you need to do just certain columns and things like that you can do that as well but today we're just taking that full table plus the index and we're just going to dump it into access and so i'll create these fields here i've got company employee date worked billable hours i'm making sure that my you know my column names are matching the the header names exactly on my csv file and and that's really going to help to make sure everything goes through so work code with a space between work and code even though that's not good database design that's the name of the field here and so what you could do is you could run all of your data into a temporary table in access or in another database if you're using another database and then you know transform that data and move it into data you know if you have properly designed tables without spaces in them and things like that so um i'll save this table and i'll call it uh billable hours and uh oh i'll skip that and i'll just add our own index here so i'll just make the index field the index or pardon me the primary key and that's going to make sure that each row is unique and we can move on and there we go there's our table opened and we can move on so i'm going to go into python i did all of this in python 32-bit and since the drivers for my access and everything we're all 32-bit and so i'm going to start off here and i'm going to save it first of all and then i'm going to import pandas as pd and then i'm going to use url lube and that's that's a library that's going to help us put together sort of a specially crafted url that is just for our data access and we'll also get um create engine from sql alchemy and that's going to give us sort of what we need to connect to the access database and you know read and write with a nice sort of translation layer in between there and that's something that is is very very handy so i'll give some feedback to the user you know print opening csv so we can see what's happening and uh and then i'll create a csv path and i'll put the r in there so that our backslashes uh work properly and uh and then i'll just get our data frame so our data frame is going to be pandas.readcsv and we'll just read our csv path and that should sort of basically pull everything from that csv into our data frame and it's going to take that first column or the first row pardon me it's going to put those into into the column names and i'll print the head and tail of our data frame here and uh and we'll see what we get here just to make sure everything's working properly so i'll hit f5 and uh sort of takes a minute to spin up and uh we'll see how this goes here okay so there's our uh our head and tail of our of our data frame it looks like it got the the column names all correctly and you can see on the very left column there's the index field which we're going to account for in our destination table so i'll print opening access to get some more feedback and now we're going to move over to our access connection which is kind of different from what you're normally going to see from from from me with my access stuff because normally i just you know do the odbc thing and i just you know run a bunch of update statements or whatever but in this case we want to use sql alchemy and we want to do it all in one statement which is one which is a nice way to do it so that you don't have to you know fiddle around with parameters and things like that so i'll paste in this connection string that i've had you know for most of our lessons that you've seen and and then we're going to go ahead and we're going to create our url and uh that's going to um going to help us with our our our engine so we'll go ahead and we're going to do this one it's a access pilot dbc and this is going to help us to create our connection to to access it's a little bit different from how you normally see me connecting where i'm just you know doing a bunch of statements but um this is going to have a nice translation layer in it and so it will use the url lib and we're going to stick our connection string with the quote plus in there and that's gonna be very very handy for our connection so once we've got our url we can go ahead and create our engine oh that should be access plus pi odbc not access underscore pi odbc and so we're going to use access plus pi odbc and uh and then we can go ahead and create our engine and uh so we'll say you know access engine is equal to create engine and then we'll put our url in there for our connection and that's going to create the engine that's going to sort of sort of take care of a lot of stuff for us so we can just sort of pass a command in to basically send all the data over then i'll give some feedback i'll say writing to to access and uh and then we can sort of move on from there um the next thing that we'll do is we'll go data frame to sql and uh and then we'll specify the table that we're going to put it in that was that billable underscore hours table that i created at the beginning of the video and then we'll pass in the access engine as an argument and we'll say if exists equals append and that's going to append to a table if it's there if it's not there it'll create a new table uh and and so we've created a table with all the fields in it so in this case it's going to append it to the table so we could run this file plus another file plus another file if they all have the same structure then they'll all be appended together into one big table so from there i can go ahead and hit f5 and see what happens here it does our head and tail right of the data frame and then at the bottom here you can see it's opened access you know it's writing to the access table and then it says write complete which is exactly what we want to see and and then we can go ahead and take a look at the data i'll pull up access here i'm just going to open it up and open up that database and i'll pull that one up up here onto screen so you can see it so there we go there's the billable hours table that we created there's all the data inside and so if i can you know i can go and look there's our index which is our primary key in this case and all the other fields that came from the csv and you know i can scroll through this table um if i want to you know scroll down i can see all the different you know companies and and work hours uh and i can go all the way to the end to make sure there is indeed uh 444 records in there and it looks like we're in good shape there so that is how you can do csv to access using python hope you enjoyed today's discussion on how to do csv to access in python if you like what you saw today please give the video a thumbs up make sure to subscribe to the channel if you haven't subscribed yet click the bell when you see the bell and put any questions or comments you might have in the comment section below have a great day have a safe day and i'll catch you next time
Info
Channel: Sean MacKenzie Data Engineering
Views: 119
Rating: 5 out of 5
Keywords: Migrate CSV data to MS Access Using Python with Pandas and SqlAlchemy, csv to access, csv to accdb, migrate csv to access, python csv to access, csv to access in python, sqlalchemy accdb, pyodbc sqlalchemy accdb, sqlalchemy urllib, comma separated values, python csv, pipeline csv to access, sean mackenzie data analytics and data engineering
Id: _fvCKUs4k30
Channel Id: undefined
Length: 10min 49sec (649 seconds)
Published: Fri Jul 30 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.