Use Python to Query a SQL Server Database & Export Results to a CSV File- A Real World Project

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
we're going to look at how we can use Python to integrate that and query SQL Server databases and then export the results into CSV format so we have a sort of use case here and that would be where we need to potentially just export results daily to alert us as to people whose passwords are expiring so we've got a script here that we're going to recreate but essentially we're going to go through a few stages importing in dependencies or packages and then we will former credentials or connection details then we will query the relevant data which could be whatever you want Place us into a pandas data frame so a structure here much like a table and that's going to then allow us to convert that using pandas to a CSV we'll name it with the current date and time and then we'll we'll collect that CSV file so if I was to run this now we can see that the the script will run and then when we look into our folder structure here within our Explorer we get the result here which is a CSV with the result the output of that query if I double click there we'll see it loaded in here within um vs code so that's what we're going to go through today we're going to create this I'll show you what what we require why we require it um and all of this sort of details in a project like this usually the most complex part when working with sort of drivers or apis is actually getting the credentials right after that it's it's some it's some pretty simple code here so so we'll go through this now primarily what we're going to use here is the pi odbc module and it can easily be installed within your command prompt using pip install Pi odbc of course if you've not used command prompt before and this is really your first time venturing into python potentially or a python project you can just use here you can find it in your search bar or your start bar just type CMD and what you will get is the command prompt and this is where we can again go ahead and install so in this case it would be pip install and then the name of your package module so it'll be Pi or DBC and then you could click enter and go through it that way however um yeah we can we can see that clearly here pip install Pi odbc a lot of you will likely have used command prompt so so we don't necessarily need to dive in there and but as we can see it's a DB database API module and like I said traditionally these sorts of projects the actual code in the script isn't too complex but usually playing around with apis and finding out what works is quite tricky which is why I want to go over this and sort of full format so as we can see here brief overview it's an open source python module and it allows us to access odbc databases so this could obviously be extend beyond SQL server and I'm choosing to use SQL Server because that's what I primarily use but yeah it implements DB API specification but it's got lots of pythonic convenience so obviously native to the way that we like to like to use python and if you're using that you should go ahead and use Homebrew but like I've shown usually we would use the the command prompt or terminal within um within Mac and the standard sort of pip installation so that's all that we really need to know about piodbc that's going to be the interface that we actually use through python to connect into our SQL database so that will be integral to this project that we're we're sort of building here will be pandas of course as we see here in the short blurb well first of all you can install it with Pip install pandas again in the command prompt for terminal and Powerful data structures for data analysis time series and statistics essentially what you need to know about pandas if you've not encountered it is it's sort of the gold standard in Python for data analysis and Transformations and so on so a bit of a blurred python package fast flexible expressive data structures making working with relational um data easy and intuitive high-level building block real world data analysis A broader goal of becoming the most powerful and flexible open source data analysis or manipulation tool so a few things that pandas does well if you've not used it before handling missing data inserting deleting appending data grouping data together common functionality with SQL getting to that level of aggregation merging joining reshaping and so on so we will just uh if you've not used this forward you can use pip install pandas to get that um and we'll we'll get moving with the last of the detail before we start building out this project of the the dependencies of things that we require to build out this project would be if we're using SQL Server certainly would be our server name and then the database name that we intend to query data from so what we're going to do here is we here you can actually Source your server name I've just got a local edition here where I'm using my my local version through through SQL Server right now so this is your server name within this box however what I will show you is if you connect up and get running there is another way that you can go ahead and get your server name in SQL Server so we start up a new query you can just say select and we can say double at and then server name here just release it say as server name execute the query map will display our server as well you can right click copy this and take it into our um our project in Python at the correct time it's also worth noting I'll get rid of that the database that I will be using certainly and you may have this seem able to to join along is Adventure Works 2019 you can easily download this if you like from from Google it's a slightly more complex if you go down the the route of restoring a backup file do have another video I'll link that you can use to also go ahead and import this data if you like so within my databases as you can see I'm going to use adventureworks 2019 I'll expand out the structure and the table specifically that we're going to use is person.password so if I just query this quickly I'll say select all columns star from person dot password there escape the query and we get some some pretty basic data here we get the password hash the business entity ID password salt Road guid a modified data of course you could use this in a more complex environment uh traditionally passwords you know if you're working within let's say the security or I.T side of a business may have to be reset every 90 days or so on this could help um be filled filtered into a system where you give people warnings or you can at least scope out when certain users passwords are are expiring so it's quite a quite a useful use case there are obviously more things if you had more data such as an expiry date you could filter this you could use your where and so on so it's very what I would say real world use case it would be commonly used and this is a way that we could it's certainly query databases and all the better to do it within python in a slightly more automated fashion actually going to go ahead now piece everything together look at we've got all the dependencies sort of aligned and we can actually go ahead and just build the project and look at see this sort of um magic here in a relatively simple script we can start to automate tasks one of the beauty of python and you may go into a lot of organizations or even if you're working personally and there's a large requirement for python automation I've seen it a lot and I use Python to automate tasks a lot it's a fantastic open source language and tool to do this so the first thing that we'll do we'll just write a comment here to say that we're just um we'll see a dependencies to install this would be our pick packages and modules and I've already done this and I've went through it in the steps before this but you may need to so if you if you do need to follow along that would be pip install pandas you need to type into your command prompt CMD in the start bar or terminal on Mac and we can see hit install and that will be p y o d b c of course referencing an odbc connection to our database so that's sorted what we can do now is because we've installed it we can import and as and I'll give an alias just for good practice if we were going to use this and it's handy to have a shortened version commonly you'll see pandas imported as PD I'll import py odbc as and I'll just call this odbc we can be flexible on how we use the Alias and we'll use date time later on so from date time we'll also import date time sort of sub module here and that's going to be relevant later on because we're actually going to use uh date time to Output a file that represents the current date and time so they're the packages you don't need to pip install date time it's it's contained within python already it would just be pandas and pyodbc if you don't have them already installed if you do want to see the packages you have installed you can again go into um your command prompt and type pip freeze and that will that will let you know our pip list you can see the packages that you have installed or you could just type pip install the package name and then it would let you know if you already have that installed so now we're going to list the credentials um so we'll say list credentials and this is going to be we'll be using odbc.connect so we need to have some you know a certain amount of uh credentials logons whatever that may be drivers server names to list so that we can actually Connect into SQL Server so I'll just say list credentials and um we're going to pretty just this note we looked at it later but if you do want to go back and uh have a t SQL code to query a server name uh that will just be using select server name there so we can we can use that if you're if you're actually within a query window in SQL server and you want to use T SQL to find out your server name and we are going to use something in this in the connection properties below uh the The Trusted connection property so just let you know what that means as well trust the connection ignores the user ID and password keys so uid that's how we would sort of State them within the credentials and password keys and the reason I don't need to use this because I'm just using a local version so I don't need to log on with a specified SQL username or password a lot of the times you would so you would need to actually specify in these credentials below but if you're using just a sort of Windows integrated you don't need to but there are other sort of logins such as um you know multi-factor authentication you may have specific SQL logons within your your server so it's important to note that so here we go I'm just going to call this connection string the the credential properties and I'll say these are equal to and we'll say odbc because that's what we imported piodbc as so we're using that you'll see the white line disappears with an atom uh to show that we're actually we're started to use it you may use other text editors that's fine such as atom pie chart maybe even Jupiter notebooks I find that a stronger IDE is good for processes when we're connecting to apis so that's why I've chosen vs code so you'd say odbc dot connect and what we need to do now open up some parenthesis I am simply just spacing them out just for readability purely so the first thing that we're going to need is we will essentially have have four lines here so we will have the um the driver first so uh driver just standard here and we need to say driver is equal to and in curly braces we will put the name of our driver there and I will also just finish this off with a semicolon so my driver is going to be a SQL Server native client let me just see Server native client 11.0 and and also it's worth noting that sort of stating out the the native client driver I think Native clients sort of being phased out after I think it's SQL Server uh before 2016 even so that that driver hasn't changed because I believe I'm running I upgraded to SQL Server uh 2022 and this is still functioning fine so that's the first line driver and it's SQL Server native client 11.0 there's a good chance that your driver is the same if not you can Google how to actually find out the the driver by going through your your file explorer through that way so I would just copy and paste this line a few times because this is the basic syntax that we'll need actually I won't do that maybe a bit more confusing to do it that way so the next line that we will require will be the server and I showed you how to how to get your server that was querying the SQL script above select at server name or it will be there when you load up SQL Server management studio also so say my server is equal to and for me it's desktop because I'm using my my local version desktop and it will be e six e obviously this is going to be different um however you're using it but just as a guide this is for me so it should be desktop and then Dash d6e7jof and I also need to remember to have my semicolon here my next line will be the database that I choose so that will be equal to adventureworks 2019 as discussed before easy for you to get a sample I have linked that um prior to this within the video so database is equal to adventureworks 2019 and then I will see trusted connection is equal to yes so I'll just set it out as it's um as we have to specify these credential namings obviously usually I wouldn't use an underline I would use Pascal case capitalizing the start of each each other word that's how I like to list my name and conventions but I don't have much Choice here so there we go and I will just move this closing princess up one step there so we've got connection string is equal to odbc dot connect the driver we've got the server there server name we've got the database Adventure where it's 2019 with a semicolon trusted colon is equal to yes there so that looks that looks quite good um yeah I think we can we can move on from there now so the next thing that we require is we're actually going to write our query and we're actually going to write the query um with the pandas read function so it will look something like this I will say query that's what I'm going to name this variable and it will be PD Dot read underscore SQL this one here the beauty of intellisense here read SQL query again I need to open up parenthesis I'm going to again nope these these lines and what I will do is I will state three quotation marks just so that I can better uh space out my query here and I will say select and it will be The Columns I'm going to take just in this example will be business entity ID and it will be password hash and roll the uid of course meaning globally unique identifier for that row so every item within our table every sort of row will have its unique identifier and I will say from and I'll get rid of this blurb here I will say from person.password that's the table that we're choosing to query so select business entity ID password hash row guid we've got separated by the triple quotation marks to give us more space um in presenting this over multiple lines and lastly what I will do is just reference where we're taking this from the credentials obviously the server is a database to driver so I need to specify that variable there which will be connection string okay and lastly well almost lastly penultimately we will need to specify the data frame so I'll just give it the the variable DF in capitals is equal to PD Dot each frame we can see here oh that's a wrong one PD Dot I need to frame this first option within here the data frame that we we want to create is from the query results there so above so the great thing about this if you're if you're sort of quite new to python or you're intermediate we don't have to focus too much on you know we don't have to create functions or classes it's quite a short script and a lot of it's credentials but it's very sequential very easy to follow we've got a connection properties which get fed in uh to create this query from from the connection string we specify the query we create a data frame from that query which you can think of a data frame almost like a like a table in the real world and then we've got pd.date frame query and then we just want to set a file path and you could specify what we would call a fixed file path where you give the exact path in your computer sort of with CE users desktop files python files whatever that may be but we're just going to set a relative file path so it'll output it I'm in the same working directory that's why of good practice I've called this main.py because if we had sensitive credential information we could actually set up another file like a DOT EnV and import with the dot EMV package or we could set up a file called credentials.py and import it in but we're just doing everything off this one page just for Simplicity so we'll set a file path and it can just be a relative file path this is where if you've not worked a lot with sort of string formatting in Python it may may look a bit um a bit complex here but I'll explain it as we go along so it say did frame Dot two underscore CSV and within here we will specify so we're going to take the date time from our date time that we imported but now get rid of this blurb again because it gets a bit annoying great time dot now and we will see we'll have this in string format it's a greattime.now.strf time and we can specify the way that we want to Output that date time so we will start off with two quotation marks and we will use our percentage it's a year of The Bash centage and month Dash percentage and B and then underscore and we will go ahead and format our time now so if you bear with me we'll get this typed out um so we will go ahead yes and don't worry this looks a little bit intimidating because I'll explain it shortly so that should be the correct formatting here capital Y go into lowercase MD uppercase after the underscore i m s and p and then what we will do is we need to append on the rest of the text so that's going to give us a current date and time as the file name and we just need to append on this text here so I'll give a dash and then I'll say SQL user password beta dot CSV because it's going to be a CSV file that we we download it all into and then we will see index because we don't want to have an index and our pandas data frame we'll just put false there and so now what we have here it looks all right we've got the date time dot now string formatting we specify how we want to format um this date time value which we've done and we're appending on the text sqluserpassword data.csv index equals false so to review if you want to sort of check everything here we've got pip install pandas Pi or odbc we're importing in our dependencies we've used them all they're no longer underlined Twice first of all we list their credentials we can get our server name from the SQL Server management Studio sort of log on screen or within T SQL by using select at server name and we ignore we use trusted connection equals yes to ignore any dependency we would have any user ID and password because we're just doing this locally likely you would have to add to this credentials a uid and PWD which would be your SQL Server username your logon and your password so there we call this connection string really we're just giving the connect properties for pi odbc that we've aliased as odbc driver server database and trusted connection so fine classic sample database the name and the driver which will likely be the same for yourself but you may need to check that in your files the query we State and we're using pd.read SQL query and then we State a query the connection string and then the data frame and then we set a file path daytime now string here we go and then we append on the rest of the file name and an index equals false so what happens if we run this we'll run it within our terminal and we've got our query goes through that open up our Explorer you see at this time now at 8 13 pm and we've got the SQL user passwords that's all perfect when we double click here the same working directory so wherever you have this stored within your file explorer as well you'll be able to find this output and we get business entity ID password and then we get all of our data in CSV format and CSV obviously important because that's very commonly used um in the in the sort of working world when we're sort of doing more sort of one-off pieces like this so really this is a definitely a real world project so important to uh to learn these sorts of things yes there are ways that you could make this a bit more sophisticated with usernames with passwords you could output this with SQL but often tasks like this are done using python it's great to understand python Automation and working through apis and drivers um and yeah this is this is something that uh that can be very useful so so hopefully you enjoyed it and if you did please feel free to like comment subscribe share whatever you like it it helps the channel and and helps me understand what viewers tend to enjoy thank you
Info
Channel: Matador Software
Views: 8,065
Rating: undefined out of 5
Keywords: python, sql, python project, automation, python automation, automate your work, task automation, python task automation, sql server, sql server management studio, data, data analysis, software development, python programming, python development
Id: -68x3KtBGKI
Channel Id: undefined
Length: 25min 41sec (1541 seconds)
Published: Sun Feb 12 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.