Understanding Virtual Environments for Data Science / Data Analysis - P.4

Video Statistics and Information

Captions
What up, data nerds! I'm Luke, and welcome to my channel, where I make data visualization easy. In this video we're going to be talking about a really important topic when it comes to Python, and that is virtual environments. This is a topic that may be very confusing when you're new to Python, and it's really important that you understand some of the basics, such as packages and package managers, and how they interact with the virtual environment. So I'm hoping with this video to show how this all relates together and to better understand virtual environments.

As a quick overview of this video: the first part is going to focus on packages and package management. We're going to cover what they are, give some quick examples of how they are utilized, and then from there we're going to go into the computer and actually use the Anaconda distribution to manage packages. For the second part we're going to move into virtual environments, explaining what virtual environments are, what problem they're trying to solve, and showing an example of how you could run into this problem. Then finally we're going to jump into the computer again and use these virtual environments with the Anaconda distribution. So with that, let's jump right in.

To understand packages and package managers, let's go over a simple example first. Let's say this is your computer's operating system and we open a terminal or command prompt. From here we can install Python on our machine, and if you're interested in installing Python via the Anaconda distribution, I have a link above for how to do that. Once we install Python, it's installed locally on our computer. Python is great: it has a lot of built-in functions and properties within it, so we can do simple print statements and we can do simple math. But whenever we want to go further than that, we need to install packages. So let's say we have a Python script that we want to create, and for this, focus on the fact that we want to take
the exponent, that is, we want to raise e to the power of 1, and we also want to arrange 0 to 5 in a list format. If we wanted to do this with plain Python it would take up a lot of code, or we can import these packages into our script and run it with even less code. So from here we would see that to take the exponent we need math. It's actually a module, not a package, but we would need the math module, which is actually internal to Python's standard library itself. However, numpy is its own library and it's external to Python, so we'd actually have to install numpy. This is where package managers come into play. There are two frequently used package managers: it's either going to be pip or conda. I prefer conda, as it's easier, as we'll see, so we're going to be using conda for this. So we install numpy using conda, and from there it will install it in a folder on our computer and it will now be accessible. So whenever we go to run this script that requires math and numpy, we will now get the results that we expect to see.

Here we are within VS Code, and let's show this by actually putting the code in and running it. To start with, we have our file, file1.py, up at the top, and then we also have the terminal open at the bottom. If you're interested in how to install VS Code, I have a link up at the top as well. For this, let's just do some simple functions first, so we can do things like one plus two, and this is just to show that these are things that are built into Python itself. So if we save this file (I press Command+S) and then I go and run this file using python file1.py, it will run. It tells us 3, and then it also tells us hello world. But like I said previously, we want to do some more complicated features, so let's start with the math function first. We're going to import math, and then from here we're going to execute it. So we can go and do print, we'll do math.exp, and we'll do it of 1. Okay, so now we just save it by pressing
Command+S, come down to the terminal, and I'll run this file again. Okay, and so whenever I run it, we can see that we get the value that we would expect to see. This is just a simple example of importing it in, and math, like I said, is a module within Python.

So let's move into the next part. Let's say we're going to use numpy to arrange our values. We can go in and say import numpy, and then from there we want to print numpy.arange, and we want to arrange it 0 to 5. We're going to run this example without installing numpy via the package manager, because I want to show what would happen if you tried to do this. So if I come down here and clear things up, and we type python and the file to run this again, whenever I go to run it, it's going to give me this error. It's going to say, hey, this module is not found, there's no module named numpy. So this shows that math is actually installed, but numpy we need to install. I'm going to clear again, and now I'm going to go in and install numpy. I just want numpy; I don't want a specific version. So I press Enter and it's going to go through and install numpy. It tells us that the following new packages will be installed, and I'm fine with all of these being installed, so I'll type yes. From here I'll clear the screen and execute the code again, and as we can see, we got the results that we expected. Numpy installed properly, our math function worked properly, and numpy worked properly as well.

So there you have it, that's a quick recap of packages and package managers. And just to clarify, with package managers you can have two different types: usually you'll see either pip, or conda using the Anaconda distribution. On this channel we focus on data science and data analytics, so we'll be using conda, but if you want to use something that's a little bit more flexible, you may consider using pip as your package manager.

To understand virtual environments, it's best
to understand the problem that virtual environments are trying to solve. So take this operating system, and let's say we're operating with no virtual environment whatsoever. Take our previous example and let's say we still have it on our computer. Let's say we did this back in 2017 or 2018, a couple of years ago, and we were using Python 3.6.12, and at the time of making this script it worked fine. It worked perfectly, as shown on my computer in the previous example. Okay, now it's a few years later, and we pop open our terminal and we hear that Python has been updated, and we also need to write a new script that we want to do new stuff with. So, popping open our terminal, we decide we're going to update Python to the newest version while also updating numpy as well, and this is going to update our versions within our local folders. Then we go in and create this new file, and we run it, and it runs perfectly fine. The problem we may run into is that now, if we go in and run the first file, it may not run properly, because the versions of Python and numpy have changed.

Just to be clear, this isn't trying to say you should never update your versions, that you should just maintain the same versions because "if I update my versions, it's going to break my code." No, it's that there are certain cases where, whenever you update your versions or install new packages, you're going to run into problems with your code, and it may cause conflicts and cause your code to break. So that's why we're using these virtual environments to solve this problem.

Jumping into VS Code, let's look at how you can run into this problem if you're not using virtual environments. Our script, this file1.py, I updated slightly to include a little bit more information, but right now if we run python file1.py it's working properly and we have no issues. So now let's say we're starting a new project and we recently found out that Python
got upgraded, and so we want to update it. We can update all of the packages, and it asks if we want to update them. Yes, we do. And we also want to install Python and upgrade it to the newest version, Python 3.8, and we say yes. Okay, so now Python is upgraded to that new version, and we're going to go ahead and clear this. Let's go ahead and create a new file: come over here and click the documents icon, click on our desktop (that's where we're working right now), and we'll create this new file, file2.py, and press Enter. Okay, file2.py is open, so let's go ahead and insert some code into here. We know we're going to be using numpy, so we'll go ahead and import numpy. We also just want to mess with an array, and numpy is good for creating an array, and then we can also print it. So I'll go ahead and save it, and then let's go ahead and close this. Let's run our second file, so python file2.py, and we can see that it runs perfectly fine. It outputs the 0 and 1 as expected. Now, whenever we go in and execute the first Python file, this file right here, we get this error, this ValueError, and it raises this exception. That's because of how we built it: it causes that error with the upgraded versions of Python and numpy.

Now that we understand the problem that virtual environments are trying to solve, let's go into how we can implement a solution using conda to manage virtual environments. To be clear, conda is not the only solution for virtual environments. Another popular option is venv, and similarly you'll see people use pip and venv, or they'll use conda. I like conda because you can use it as your package manager and also use it as your virtual environment manager, so it's sort of a bundled solution, great for data analysts or data scientists like myself. So let's jump in. Let's walk over the basics that we're going to go through to use virtual environments for executing our Python
scripts. So let's say we're starting fresh again. You're on your computer's operating system and you have your terminal or your command prompt, and from here the first thing that you would want to do is actually create your virtual environment. So here I'm creating my virtual environment using conda, naming it venv1, and it's using Python version 3.6. From there I want to actually activate that virtual environment that I called venv1. So now it is activated, and you can see it's sort of replicated here, and it has its own Python, 3.6.12, inside of the virtual environment itself. Okay, so now we want to execute the script itself, similar to our first file1.py, and it requires math, which is inherent to Python, but then also numpy. So we use the package manager inside of conda to install numpy. One key thing to note is how to know that you're in that virtual environment: in your command prompt you're going to see (venv1) in parentheses, and that shows that you're in that virtual environment. So we know we're installing numpy into that location right there. Now, whenever we go to run this file, it's going to run with no errors, because it has everything that we need within that virtual environment.

Then, just running through quickly for the second virtual environment: we would create a virtual environment, it would install Python, from there we'd have the Python file we need, and we'd install whatever version of numpy we may need. They both would still work, because they're both in their own separate virtual environments, and as you can see, they have different versions of Python and different versions of numpy.

So, jumping into VS Code, let's show how we would create a virtual environment and run our files within this virtual environment. The first thing we want to do is actually create this virtual environment, so we would type conda create, then the name we want to give the virtual environment, so that would
be venv1, and then also we want to use Python 3.6 for this. From here I would press Enter, and it's saying, hey, do you want to install the new packages within this virtual environment? Yes, we do, so we type yes and press Enter. From here we're going to have to actually activate this environment. We can type conda activate to activate it here, but then our Python interpreter itself in VS Code is still going to be using this environment right here. So I'm going to show you a quick way to solve this. Instead of flipping through all those different things, the simplest way is to just close out of VS Code and then relaunch it with the new interpreter. So I'm going to quit out of VS Code for good measure, and then from here I'm going to open VS Code, and it opens the files back up, and I'm going to exit out of these warning messages. Then from here I'm going to select the virtual environment that I want. I select it down at the bottom, where it shows which Python, down here, and I can come in and it will list the different virtual environments. I restarted VS Code because this wasn't going to appear unless I restarted it. So now I click here and our virtual environment venv1 is right here, and then if I open the terminal by choosing New Terminal, it will do the conda activate venv1 for me, so I didn't have to type it, and I now see via the parentheses that that virtual environment is activated.

So I'm going to go ahead and clear, and from here let's execute file 1, because that's the one we needed Python 3.6 for. We'll do python file1.py, and it works properly. It output e, the array, and then it also says that the code is working. Then, just for an example, we're going to go through and create the virtual environment for file 2 as well. So I've gone ahead and created the second virtual environment and activated it, and from here we can execute that
second file, and it works perfectly fine. But if we went to execute the previous file within this virtual environment, we can see that we're going to get the ValueError and it's going to be broken. When we're done working in this virtual environment, we can go ahead and deactivate it by typing conda deactivate, and this will just return us to our base Anaconda environment. Our base Anaconda environment isn't something we necessarily want to operate in all the time; it's like operating without a virtual environment. I would very much encourage you to use virtual environments for any type of project that you're working on with Python.

So that was a lot to cover for packages, package managers, and virtual environments. As a quick recap: we talked about packages, what their importance is and why they are used within Python. Then we moved into package managers, and remember, there are two major types in use: we use conda, but pip is also available to install packages. We talked about how to actually use conda to install packages. From there we moved into the second part of the video, where we talked about virtual environments, and the two main virtual environment managers that we talked about were venv and conda. We talked about the problem that virtual environments are trying to solve, and from there moved into using conda to manage virtual environments on our computer.

As a quick shout-out, this video is part of a series where we go through the basics to get up and running with Python on your computer. In the next video we're going to be focusing on how to run Jupyter notebooks within VS Code and some of the different features available, and then we're going to be applying a lot of the tactics from this video and also previous videos to using Jupyter notebooks in VS Code. So if this video series seems interesting, consider subscribing. It'd also be awesome if you smashed that like button if you found that this virtual
environment video was useful, and comment down below on things I may have missed or that you'd like to learn more about with virtual environments. Hope to see you again!
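
The captions do not include the code that is typed on screen, so here is a minimal sketch of what a script along the lines of file1.py might look like, assuming it only needs math.exp and numpy.arange as described. The helper names describe_environment and arange_or_message are hypothetical and are not from the video. The conda commands in the comments are the usual forms of the steps described, assuming an environment named venv1.

```python
# Typical conda steps described in the video, run in a terminal (not in Python):
#   conda create --name venv1 python=3.6
#   conda activate venv1
#   conda install numpy
#   conda deactivate

import math
import sys


def describe_environment():
    """Return the interpreter version and the environment prefix in use."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return version, sys.prefix


def arange_or_message(start, stop):
    """Try numpy.arange; report a missing package instead of crashing."""
    try:
        import numpy  # external package, must be installed into the environment
    except ModuleNotFoundError as error:
        return f"numpy is not installed in this environment: {error}"
    return numpy.arange(start, stop)


version, prefix = describe_environment()
print("Python", version, "running from", prefix)
print(math.exp(1))              # math ships with Python itself
print(arange_or_message(0, 5))  # numpy has to be installed separately
```

Running this once inside an activated environment and once outside it shows which interpreter and which installed packages each one picks up, which is the point of keeping one environment per project.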
Info
Channel: Luke Barousse
Views: 43,225
Keywords: anaconda tutorial, anaconda python, how to use anaconda, anaconda environment setup, anaconda virtual environment, what is anaconda python, conda enviornment windows, conda environment mac, python environment setup, virtual environment python, virtual environment python visual studio code, python data science, python data analysis, python for data science, python for data anlaysis, data analysis with python, data science with python, data science python, data analysis python
Id: qI0uJsLweoM
Length: 20min 8sec (1208 seconds)
Published: Tue Oct 20 2020