How to Set up VS Code for Data Science

Captions
Welcome everyone. Today I have another exciting new video for you guys. First of all, my name is Dave, I'm a data scientist, and in this video I will show you how to set up VS Code for data science. This is something that has completely changed the game for me. I'm coming from working with Jupyter notebooks and JupyterLab, which are awesome tools by the way, but VS Code has totally changed the game for me in terms of my workflow and my productivity. So in this video I'll show you how I've set up my VS Code and how I use it for my data science projects.

Here's what we'll cover in this video. I'll first give a brief introduction to VS Code, then I'll go over some themes and settings that I personally use, then some must-have extensions if you want to write Python code. Then I'll show you how to run Python code using VS Code, and lastly I will show you how you can also run Jupyter notebooks within VS Code.

All right, so what is VS Code? For those of you that don't know, it's a free integrated development environment, or IDE for short, made by Microsoft, and it's available for Windows, Linux, and macOS. So it doesn't matter what system you're using: you can install VS Code for free and get started.

Now why do I like VS Code? First of all, because it has so many features. It supports many languages and it has really good features for debugging, syntax highlighting, intelligent code completion (this is a big one), snippets, code refactoring, and embedded Git, which is really convenient for version control. What also makes it really awesome is the extensibility and the customizability. There are many, many settings, and there's a huge marketplace with extensions to add new languages, themes, debuggers, etc. You really have to know a few settings and extensions to make it work for writing Python code and doing data science projects. And then productivity: since I've made the switch to VS Code, coming from JupyterLab and Jupyter notebooks, it has just made me so much more
efficient as a data scientist. And then this is probably the biggest advantage of using VS Code versus just Jupyter notebooks or JupyterLab: you can manage the entire data science project life cycle. This also has to do with productivity, as I just mentioned. When you first write your code in a Jupyter notebook, then once it's ready for production you have to transform that notebook into a Python file, and that just takes a lot of time. You basically have to write your code twice. VS Code makes this so much easier, and I will show you how to do that. So first of all, if you don't have VS Code, you can just Google it, go to the website, and download and install it for free.

All right, let's now hop into VS Code and I'll show you how I've set up mine to work on data science projects. I've opened up a blank VS Code window. We'll start off by going to the top right corner, File, and opening a folder. I've made a demo folder for this video, which I will open. It has a typical data science project layout: it has some data, we have some Python code, and we have some notebooks. Let me close this out for now. The next thing that I will do once I've imported the folder is save it as a workspace. So I go to File, Save Workspace As, and I'll just save it here as demo, which is fine. What I've just done is save this window with the imported folder as a VS Code workspace file, and within this workspace you can save settings that are particular to this project, so that can be really convenient. Another nice thing about working like this is that once I want to start working on a project again, I can just open up the workspace file and it will open up VS Code within this workspace with all my folders attached.

So now we've set up the project. How do you get started? As I mentioned, when you've just downloaded VS Code it's kind of blank, it's kind of empty, so we need to add some extensions and we have to tweak some settings in order
to make it work for us. So let's hop into the extensions. You can do this by clicking on this icon in the left bar over here, and this will first show you a list of all the extensions that you currently have installed. As you can see, these are all the extensions that I'm running, and there is also a search bar where you can search for extensions.

First and foremost, we'll start with the Python Extension Pack. This is basically a must-have pack that installs six different packages that I think are just very convenient; some of them are necessary to run Python code within VS Code. In the overview over here you have a description of what's in the extension. I already have this extension pack installed, so for me it says Disable and Uninstall, but if you don't have it installed there will be an Install button over here. So you can just install it. It's free, it will take a couple of seconds, and it will install the following extensions to VS Code. I'll quickly go over what these are.

First of all, Python is essential to run Python code. This is an extension produced by Microsoft that basically enables you to run Python code, and here you can see you can manage your environments and which version of Python you are using. So this is a must-have. What else is in it? Then we have IntelliCode, which is very awesome. As I just explained, this is not so good within JupyterLab and Jupyter notebooks, but it basically makes your life as a data scientist easier. It's an AI-assisted Python development tool that autocompletes, suggests, etc. You can see an example here of what it does: whenever you start typing, IntelliCode will prompt some suggestions, and using the arrow keys you can toggle through them and then just hit Enter and it will autocomplete the code. So this will do two things: it will make you write code faster, and it will also make sure that you make fewer errors because you
don't have to type all the methods and the attributes by hand; you can just autocomplete them. That in turn will also enable you to work faster.

All right, so what else is in it? There's an environment manager. Django comes with it; if you work with Django that's really convenient. This is the extension to fix Python indents automatically, also very convenient. And then you have autoDocstring, which is also a cool tool. Basically, if you type three quotes it will insert a docstring template, which is very important to document your functions and your code for yourself, but also when you're working with other people.

All right, so that's the Python Extension Pack. Let me take a look at my notes. Yes, so the next one that we're looking into is Pylance. Let me just look it up: Pylance. Same thing here, I already have it installed; you can install it over here. Basically, Pylance is a feature-rich language server for Python in VS Code, and it works together with IntelliSense. As it says here, Pylance has the ability to supercharge your Python IntelliSense. Basically this makes the autocompletion, suggestions, etc. even better, so another really nice extension to have.

And then lastly Jupyter. This is the wrong tab. This is a really cool extension that basically gives you the best of both worlds. I will show you later how this works, but basically this allows us to use all the benefits from Jupyter and Jupyter notebooks within VS Code. So Jupyter over here, install it as well. Those are the essentials, and there's a lot more. CodeSnap is also a pretty fun tool that you can use to make pretty code exports: you can highlight a section of code and then export it to a JPEG, for example. I sometimes use this for presentations. Pretty cool. Path Intellisense you can look into; it basically makes working with paths easier, so when you type double quotes you can look through your paths and easily look up data
files, for example. But yeah, that's just about it for the extensions. Oh no wait, one more thing: you can also look for themes and icon themes within the marketplace. So if you want to tweak some settings and adjust the look and feel of VS Code, you can do that here. For example, I know that Atom has a theme, so you can look that up and check if it fits your vibe. How you do that: once you've installed the theme, you can go to the settings in the lower left corner here by clicking on the gear icon, and then you can hit Color Theme and this will show you all the themes. I can just scroll through those and you can see how it will change, and when you install a theme from the marketplace it will show up here. I use Dark+ (default dark); this is the one I like. So this is how you switch up your themes.

Another cool thing is you can change your icon theme. If I click here, File Icon Theme, you can disable them, and I think this is the default, or maybe this one. These are from VS Code itself, so you have Minimal and Seti, and you can see on the left over here what it does. I use the Material Icon Theme, and what I really like about this theme is that it creates different folder icons depending on the name of the folder. Here you can see data has a little database icon, docs has a doc icon in a different color, models has a different color and a different icon, and source is in green with the code icon next to it. Basically this allows me to really easily look up folders that I want to check, not just by name but also by color, and I think it just adds a nice little touch.

Now there's one more thing I want to show you about the settings. If I go to settings over here, first of all there are a ton of settings that you can tweak, from font sizes to whatever. You can look through all of these. I have almost everything
just on default. I've also never really looked into it too much because I like the way it comes out of the box. But there's one important thing to note here, and that is that you have a user level and a workspace level. Basically, on the user level everything that you tweak will be saved within VS Code, and then once you open up a new file or a new project, by default it will look at the user level. You can compare this to your settings in any program: you change them, and when you open up a new instance of the program your settings are still there. But a really convenient thing is that you also have workspace settings. Workspace settings override your user settings, and they are tied to the workspace. As I showed at the beginning of this video, I imported this folder and then I saved this project to this workspace. Now this workspace contains settings. For example, if I want to increase the font size over here: if I click here and here you won't see any difference, but here I will make adjustments on the user level, and here I will make adjustments to just the workspace level. That is a distinction that you have to be aware of.

All right, there's one setting that I want to show you. This is really cool and we'll check it out in a bit: Jupyter's Send Selection To Interactive Window. I will do that on the user level and search for "jupyter send". So, Jupyter: Send Selection To Interactive Window, we want to check this box over here. What this does, as it describes here: when pressing Shift+Enter, send selected code in a Python file to the Jupyter interactive window as opposed to the Python terminal. I will show you later in this video what this will do, but this is awesome.

All right, now that we've set up VS Code with the settings and the extensions, I'll show you why it's so awesome and give you a few examples. So let me first start by
opening up a notebook. By the way, these are just some notebooks that I downloaded from this GitHub repository over here; it's just some basic pandas exercises. Let me just open it up, and here you can see that we can open up a Jupyter notebook and also run it just like we're used to. So this is really awesome, and this works because we have installed the Jupyter extension. Now we can basically work like in any other notebook. In the top right corner over here we can select our environment; I'm using Anaconda and my base environment. So we can just work on Jupyter notebooks here. Let me open this up in the Finder: it's just a notebook file, and if I open it in a regular Jupyter session you can see this is basically the same as you would normally work, but now we're doing it in VS Code. So that is how you work with Jupyter notebooks within VS Code.

Now I'll show you how to work with Python files, and something that has completely changed the game for me in terms of my productivity. Let me close this and I'll open up a quick Python file here. Very basic file: we import pandas, we load a data frame, and we print the data. The main difference between working with Jupyter notebooks and working with Python files is that in Jupyter notebooks you run code cell by cell, which is very convenient. You can use breakpoints, and for data science projects you can very easily manage a project step by step. You first load the data, you check it, you do some exploration, you tweak some things. This is really convenient because in a data science project you constantly go back and forth: you load the data, do some tweaks, visualize something, create a function, and iterate until you have your desired output. That's what makes a Jupyter notebook very convenient for that. And if you compare this to
just running a Python file, for example the one that we have over here, where we can do Run Python File: it will run everything in one go. It will import the library, it will load the data, and then it will print the head of the data in this case. But the thing is, say for example you're working with quite a big data set, you do some transformations that maybe take a couple of seconds each, and then you notice that all the way at the end your visualization doesn't really work; you messed it up. If you were working in a Python file, you would have to adjust your code and then rerun that entire Python file and do all those transformations again, so load the data, etc. This can take a couple of seconds, sometimes even minutes, every time that you run the file. A Jupyter notebook counters this by running the code in blocks. But now by using VS Code we can get the best of both worlds, and I'll show you how to do this.

Remember the setting that we tweaked within the settings, Jupyter send. It says here: when pressing Shift+Enter, send the selected code in a Python file to a Jupyter interactive window as opposed to the Python terminal. What this basically means is that within the Python file (as you can see, this is not a notebook, this is a Python file) we can select a line of code and then hit Shift+Enter, and VS Code will fire up an interactive Jupyter session. This is basically the same thing that's running in the background whenever you're running a Jupyter notebook, and it will keep the lines of code that you've run in memory. You can see, for example, here the variables: there's nothing over here, but then once we load the data (I can either select the line or just have my cursor on one of the lines and hit Shift+Enter), you can see it will run this line of code and then store it in the variables. What this basically means is that we can write our Python code but run it in an interactive way, just like we would do in a Jupyter
notebook.

All right, I'll show you why this is so awesome and how you can use this in a regular data science project. I have a little script here that basically reads some data, then there is a function to transform some of the data, and then to store it. This is the typical first step within a data science project. I will set my cursor on the first line and hit Shift+Enter to fire up an interactive window. That's completed, so then I go to the second line and import the data. What I can then do is select data and run it. This is what's so convenient for working on a Python script: I can just use my arrow keys and select certain variables or whole lines to run the code. So if I have my cursor on this line, or I select the whole line, I can run it, and then I can move my cursor forward, select just data, run it, and it will show me the output. This makes it so fast to work through code. You also don't have to, as in a notebook, insert a line below, type data there, and then check the output. You can just run this very interactively, and it's a very natural way of writing code. You also have all the tools in the toolbar here that you have in a Jupyter notebook, so for example I can clear everything.

Let me show you another cool thing. I have a data transformation function here that takes the data (let me just show what data is), takes the item price, and turns it into a float. As you can see, it has a dollar sign here, and that makes it an object within pandas, so we can't do any calculations with it. This would be a typical data transformation. I've defined this function, and then what I can do here is call this function, pass in the data variable, and then we have our data transformed. But what you would typically do is write this function line by line, and in between
you test a few things. So what I can now do, instead of just running this whole function, is go to this particular section, highlight it, and check whether the output is correct. I can start off with the item price, and as you can see this is just the item price from the data frame, and this is an object. Then I can use the string replace method, so I'll select this over here and I can check: all right, this works, the dollar sign is now gone, but it's still an object. Then I can say, okay, but now we want this as a float, so we add astype float to the end, run it, and boom: now you can see the dollar sign is removed and pandas recognizes it as a float. So now I've basically validated my function. I'll select it and run it, and the function is now defined; I can see this is a function. Then I run the line over here, and now my data transformed is a new data frame which has the item price as a float.

So yeah, this has completely changed the game for me in terms of my productivity. Once you get used to this way of working it just becomes so fast. You can hop through your code using the arrow keys and make selections by using Command or Control and Shift and highlighting certain parts, or by using Alt or Option, for example. Then basically you have the best of both worlds: you have the interactivity of going back and forth from Jupyter notebooks, but you also have the added functionality of VS Code and its extensions when writing in a Python file, with all the added features that we discussed, like the autocompletion and the suggestions, to speed up your coding, make fewer errors, etc. You combine those and it will really improve your workflow; it has at least done so for me. And as I already mentioned, the best thing is that the code that you're writing is also ready for production, because it's in a Python file and not in a Jupyter notebook, so you don't have to do that transformation. So
that's what I wanted to show you in today's video. This is how I set up and use VS Code for my data science projects. As I said, it has completely changed the game for me. Coming from JupyterLab and Jupyter notebooks and switching to the workflow that I just explained, I'm just way more productive and can write code way faster. It's also way more fun, I think. So yeah, I hope this video helped you out. If it did, I would really appreciate it if you like this video and subscribe to the channel. I'll be sharing more videos related to data science; basically, whenever I encounter something in my work that I think can help other people, I will try to create a video about it. So if that's something you're interested in, definitely subscribe. Thanks for watching, see you next time.
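
Editor's sketch (not part of the captions): the item price cleanup described in the video can be approximated as below. The actual script is not shown in the transcript, so the function name `transform_data`, the sample rows, and the column name `item_price` are assumptions made for illustration; only the steps themselves (strip the dollar sign with the string replace method, then convert with astype float) come from the narration.

```python
import pandas as pd

# Hypothetical stand-in for the video's data set: prices stored as
# dollar-prefixed strings, so pandas cannot do arithmetic on them yet.
data = pd.DataFrame(
    {
        "item_name": ["Chips", "Izze", "Chicken Bowl"],
        "item_price": ["$2.39", "$3.39", "$16.98"],
    }
)


def transform_data(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with item_price converted from '$x.xx' strings to floats."""
    out = df.copy()
    out["item_price"] = (
        out["item_price"]
        .str.replace("$", "", regex=False)  # step 1: drop the dollar sign
        .astype(float)                      # step 2: convert the strings to floats
    )
    return out


data_transformed = transform_data(data)
print(data_transformed.dtypes)
```

With the Send Selection To Interactive Window setting enabled, each chained step can be highlighted and run with Shift+Enter on its own before the whole function is defined, which is the validation loop the video walks through.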
Info
Channel: Dave Ebbelaar
Views: 23,305
Keywords: vs code data science, data science, python, visual studio code, machine learning, productivity, workflow, vs code, visual studio, vscode python, data science with python, vscode python setup, python data science, vs code shortcuts, vscode tutorial, python visual studio code setup, python mac vs code, vscode for data science, data analysis with python, vs code data science setup, vs code data science extensions, python data science vs code, vscode, vscode data science
Id: zulGMYg0v6U
Length: 22min 53sec (1373 seconds)
Published: Thu Jun 30 2022