Jupyter Notebooks in VS Code with Python Extension - Tutorial Introducing Kernels, Markdown, & Cells

Captions
When you're working on a real-world problem involving data, you're often constructing a narrative around the analysis that you're doing. You'll work on one side of the problem to bring yourself closer to the analysis that you want to do later, and perhaps you're exploring many different directions or lines of thought as you go. So you'd like to be able to combine prose and documentation with code. Well, that's what Jupyter notebooks allow you to do in a really fantastic way.

A Jupyter notebook is a different kind of file than a Python file. It's its own special kind of file that allows us to combine text, documentation, and code, and to work in what feels like a mix between a stored program and the terminal's read-evaluate-print loop that we've seen when programming interactively. A Jupyter notebook lives kind of in the middle, and it's awesome for data analysis. Data scientists use Jupyter notebooks all the time in the real world.

We're going to look at how Jupyter notebooks work specifically in VS Code. You might be wondering why it's spelled funny: the "py" is an homage to the fact that it was made for the Python ecosystem. In recent years other programming languages have been made accessible in the Jupyter notebook ecosystem, but Python is certainly the de facto standard, so we're going to focus on that.

In VS Code there are extensions that you've probably installed, and if you haven't installed the Python extension, you should definitely go ahead and do that. It's a great extension when you're working with Python programs. Additionally, if you read the documentation for this extension, it tells you that it gives you a feature-rich experience for working with Jupyter notebooks as well. So by default VS Code can't work with a Jupyter notebook file in a meaningful way, but you add the Python extension, and it in turn will automatically add the Jupyter extension for you, and then you'll have support for this.

All right, so I'm going to go
ahead and set up a new file that's a Jupyter notebook file. These files end in a special extension, .ipynb, so maybe I'll name this one demo notebook, with the .ipynb extension. You'll notice the first time a Jupyter notebook starts up, it's going to have a different editor. You may be prompted in the lower corner of your screen to install additional things in Python that Jupyter notebooks need. I'd encourage you to go ahead and accept those packages that might need to be installed. But hopefully it started up, so let's talk a little bit about what's going on on this screen.

First, you'll notice there are some controls up here, and we'll talk about what these mean in due time. Next, you'll notice that this isn't quite like the editor that you're used to. What this is, is a cell, and we'll see that Jupyter notebooks are made up of many cells. We can have two kinds of cells, one for text and one for code. We'll come back to this.

Something you should see over here (the exact text on your machine and in your VS Code may be a little bit different) is this green connection. This indicates that VS Code was able to successfully start a Jupyter kernel. That's a funny term, so let me go ahead and tell you what we mean here. The kernel of a Jupyter notebook is what we think of as the running state of the program. Just like when we have a terminal open and we're running Python in the terminal, as we declare variables and add functions and things like that, we're changing the state of the running program, and you can think of the kernel in very much the same way. So behind the scenes, when we see this green connection, what that means is there's a program running in the background, and when we evaluate these cells, which we'll see how to do in just a moment, we're modifying the state of that program, or the memory of that program.
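One way to picture the kernel is as a single dictionary of names that outlives any one cell. The sketch below is only an analogy in plain Python, not how Jupyter is actually implemented: `kernel_memory` and `evaluate_cell` are hypothetical names, and the built-in `exec` stands in for the real kernel.

```python
# Analogy only: a dict stands in for the kernel's memory, and exec() stands in
# for evaluating a cell. `kernel_memory` and `evaluate_cell` are made-up names.
kernel_memory = {}


def evaluate_cell(source, memory):
    """Run one cell's source code against the shared memory."""
    exec(source, memory)


# Evaluating a first cell changes the state of the running program...
evaluate_cell("count = 1", kernel_memory)

# ...so a later cell can build on what the earlier cell left behind.
evaluate_cell("count = count + 1", kernel_memory)

print(kernel_memory["count"])
```

Starting over with an empty dictionary would be the analogue of starting a fresh kernel: `count` would no longer exist until the first cell is evaluated again.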
Right, if you see a version of Python that's not 3.9.1, one of the things that you can do is click on that, and you'll be given some options of the other versions of Python on your machine to use. You are encouraged to use the latest version of Python. These buttons will make more sense once we've talked a little bit about these cells and what it means to evaluate a cell.

So let's go ahead and set up our first cell and do something that we haven't been able to do in a Python program before. We're going to add some documentation that is displayed as rich text when we're done editing it. I'm going to use a format called Markdown. The hashtag (#) in Markdown means a heading, so this is going to be a heading, and the heading is "My First Notebook". This cell will render as rich formatted text when it is evaluated. In order to make a cell Markdown, click the M-down-arrow button on the cell header. You'll notice this little button, and if you hover over it you should see a tooltip that says "Change to Markdown". If you click that, not much has happened, but you'll notice that the formatting went away, so it looks like this is all just text. This will render as rich text when your cursor leaves the cell. So I click out of that cell and, boom, we've got formatted text. Notice this header became bold and much larger. This "rich formatted text" became italicized, and the reason for that is because we surrounded it in underscores. We can have subheadings and other formats.

So this is what we call Markdown. Markdown is a plain-text formatting syntax for writing documentation that's very popular, not just in Jupyter notebooks but all over tools that you might use. You can use Markdown styles in Slack and Discord. On GitHub, where you store your code, if you're leaving comments, those are in Markdown. It's really all over the place. Just to give you a quick overview of some of the other things that you can do, you can have
bulleted lists. You can have bolded text with double asterisks surrounding it. You can have links to web pages by surrounding the text in square brackets and then putting the link in parentheses, so https, and if we wanted a link to, say, google.com, we could do that there. When you move away from the cell, you'll notice that, sure enough, that bold text is bolded. We've got circle dots as bullets instead of the asterisks. And if you click a web page link, it'll ask if you want to open that web page, and we can open Google.

If you search for "markdown cheat sheet" in Google or your favorite search engine, the first link is probably good enough. There's going to be a ton of these, and I don't think any one is better than another. But just to give you a sense of some of the other things that you can do, if you look at a Markdown cheat sheet, you can see the other kinds of formatted text you can specify using just plain-text syntax.

Great, so we've seen a text cell. Now let's take a look at a code cell. You'll notice that there's this plus button that says "Insert cell below", and if you were to hover over this text, we could insert another cell below that one. So you can insert cells in any order you'd like. The order will ultimately be important, as we'll see later on, but we tend to think of these notebooks as wanting to establish a linear train of thought, or a linear narrative, through working some analysis. If you accidentally add some cells that you want to get rid of, there is a trash can icon that will allow you to remove cells. So it's no problem to remove cells and to add them back.

So let's try adding some code here. Maybe I set up a variable such as a name that is Chris, and I print, using an f-string, hello and name. I pressed a keyboard shortcut just out of habit that evaluated this cell.
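As a rough reconstruction of that first code cell, the snippet below assumes the variable is called `name` and that the greeting reads "Hello, Chris"; the captions don't show the exact punctuation, so treat the string as a guess. The f-string is stored in a hypothetical `greeting` variable here only to make the result easy to inspect; the captions suggest the f-string is passed straight to `print`.

```python
# Assumed contents of the first code cell; the exact greeting text is a guess.
name = "Chris"

# An f-string substitutes the value of `name` wherever {name} appears.
greeting = f"Hello, {name}"
print(greeting)
```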
But before we talk about the keyboard shortcuts, how do you evaluate a cell? There's this green play button, and if you click that play button, you'll notice that the number in the margin here will increase by one, and we see that the output of that cell is shown below it. So just like when you're working in a REPL and you enter some code and run it, what we've done is set up a variable named name, whose value is Chris, in our program, in the global space.

One of the cool things in this toolbar is this button right here. If you leave the tooltip there, it tells you that you can show what the active variables are in your Jupyter kernel. If you click that button, you'll notice that, sure enough, there's our name variable, and the value is the string Chris. So we can see what variables are currently declared and in existence in our Jupyter kernel.

The next thing I want to mention is keyboard shortcuts, and then we'll talk about some of the mechanics of working with a Jupyter notebook. So let's add another Markdown cell. I'm going to change this cell to Markdown. I should mention that to change a Markdown cell back to code, you just click the curly braces, and this is how you can jump back and forth between the two kinds of cells. So this is a Markdown cell, and let's add a subheading for keyboard shortcuts to know. You can always click the play button to run a cell, but once you become more comfortable working in a Jupyter notebook, using keyboard shortcuts you can move much faster, and it's a much more enjoyable experience.

So the first keyboard shortcut to know, and I'm going to use backticks here (the backtick is the character just above your Tab key on a United States keyboard), is Control plus Enter, which evaluates the current cell. So if I press Ctrl+Enter, you'll notice that this cell was rendered as Markdown. You'll also notice,
though, that the cell we're working in, which is highlighted with this blue margin bar, stayed where it is. If I press Enter again, I can go back and edit that cell. So not only does this evaluate the current cell, it keeps the focus in the current cell. Control+Enter, boom.

Now notice that when you go back to a previous cell and press play, something else happens. This blue bar went from being in the cell that we were working on and moved down. So there's another way of evaluating a cell, which is to say: evaluate the current cell and then move the focus to the next cell. Let's go ahead and add that keyboard shortcut here as well: Shift plus Enter evaluates the current cell and moves focus to the next cell. So if I press Shift+Enter here, and I would encourage you to do the same, boom, look at that, now I've got a new cell just below the one I was working on.

When you use these depends on the context. A lot of times when you're working on code cells, you'll want to tweak something, try running it, and convince yourself that it did what you expected it to do, and you'll need to do this multiple times over. So when I'm working on some code, I often find that the Ctrl+Enter shortcut is handier than Shift+Enter, because I want to keep focus on what I'm doing until I know that it's actually doing what I expect, and then I'll use Shift+Enter to move to the next cell.

I want to give another example here. Let's say print, f-string, wow, name, you are doing great. If I press Ctrl+Enter, notice that I evaluate this cell, and you can see that the variable name, because it had been assigned in the global state of our kernel (there's that program running behind the scenes where our variables live), gets printed out as part of the string that's evaluated, and boom, there it is. So we're all good.

So as you can start to see, we can combine text and code to build a narrative, and
as we're doing data analysis, this is a very common thing. Scientists use this to talk through their line of thinking before pulling in some data, analyzing it, and then talking about, well, we found this and that, let's go explore it a little bit further in a different direction, and so on.

There are a few things I want to mention about the logistics of what it means to run a Jupyter notebook in VS Code. As I mentioned, it's very different from working with just a plain Python file. First, you'll notice that I've changed this file and I haven't saved it. I've been working in this program, I've been editing it, and as soon as I save it, I can come back to it later. It's on my file system.

When you close a Jupyter notebook, the kernel is going to close with it. Let's demonstrate that. I close this notebook, and I open it back up, and I go to view the variables. When I open it back up, this isn't running the program yet. None of this has been evaluated. This feels kind of confusing, because doesn't it look like the evaluation of this is there, and the evaluation of this is there, and this was the eighth time we evaluated a cell?

I should mention that the reason the numbers count up each time you evaluate a code cell is because there are some advanced ways of using Jupyter notebooks where you can refer back to results in previous cells, which I honestly haven't found very useful or easy to follow along with, so I'm ignoring that here. But the count is useful for knowing that what you just evaluated actually did run. You can see it go up by one each time.

But the point is, we open this back up, and this is one of the things that's kind of confusing and takes a moment to get used to when working in notebooks: none of this code has run in this new kernel that we just started. So when you close a Jupyter notebook file and you reopen it, you're starting
up a new kernel, and when you view the variables, it says no variables are defined. And you're thinking, well, here's this cell. But we haven't evaluated the cell in the current kernel.

Where this gets confusing is when you come back to a problem after you've been working on it and you want to jump back to where you left off. If I try evaluating this cell, notice that it says name is not defined. And you're thinking, what gives? I defined it earlier. This is the thing that I think took me the longest to appreciate in the mental jump between working in a regular Python file and a Jupyter notebook: you have control over the order in which all of the cells in your program evaluate.

So when you reopen a file, typically what you'll want to do is put your cursor in the first cell, and you'll notice another button in this heading that says run the current cell and everything below it. If I press that button, boom, now this is working, and if I look at the variables associated with my kernel, there's name. Everything's fine. So when you open a file and you want to get back to work, one of the first things I would encourage you to do, because it will avoid some confusion and some unexpected errors, is to go ahead and rerun the whole file, and then scroll through and make sure that you don't see any unexpected errors.

Let me talk about one other button here. You can stop your Jupyter kernel without closing your file, or restart it. This matters sometimes when you change significant parts of your program, like deciding to use a bunch of different variable names. For example, if I change this variable name to user, and say hello user, and evaluate the cell by pressing Ctrl+Enter: hello Chris. And let's say I change this user name to khaki: hello khaki. So this
user variable has been associated in my kernel with the name khaki, but I didn't update this down here. Now if I go and rerun this, notice it still says Chris. This is where things can also be kind of confusing. Notice we now have a variable named name and one named user. This is because earlier in this kernel we evaluated this cell where the name variable was set up, and that kernel is still running.

If I restart this kernel with this green back arrow and look at my variables, you'll see that we've started again with a clean slate. We haven't evaluated any of our cells yet. So let me do what I said was good practice and press this play button with the down arrow. Hello khaki runs, but oh no, now we have an error, because the name variable was never actually established. So you do need to be careful about renaming your variables, and if you're doing it in one place, do it everywhere else. When you do that, you're encouraged to restart your kernel and rerun the entire thing.

We'll see that there are some other features of Jupyter notebooks when we start using graphing and charting libraries. We'll be able to have those charts show up as results.

The last note I want to make is about something that's common in a Jupyter notebook and feels very similar to the REPL. Let's add one more Markdown cell here: the last expression of a cell is printed by default. What do I mean by this? Well, as you know, in a stored program, typically if you want a value to show up as output in your terminal, you have to print it. So, for example, print hello. If I evaluate the cell, boom, there's hello. Well, in a Jupyter notebook cell there's this convention that says, hey, if you have an expression (here I just have a string literal expression) and you evaluate that cell, then whatever the string representation of the last expression in that cell is, that will be printed out for you automatically.
This doesn't work with multiple expressions. I could have a second expression here, and notice that it's only the last one in a cell that will automatically be printed out on your behalf.

So why might you use this? Well, you might set up a variable such as some results, a float that is 3.14 times 10. If you just set up a variable like this and you evaluate it, well, this is a statement, an assignment statement, so it isn't an expression. There is an expression that's a part of it, this arithmetic expression, but the line as a whole is a statement. So if we use the variable name on its own, we can see that it will evaluate just the same as if we had printed it. This is also a common convention you'll see when you're working with Jupyter notebooks: you can have an expression as the very last line of a cell, and when you evaluate that cell, that expression will be what is displayed below the cell.

Oftentimes, though, you want multiple things to be printed, so having multiple print statements on multiple lines is totally fine. There's nothing that forces you to use this convention. If you would prefer to just print whatever expression you wanted explicitly, that's actually probably a good practice when you're just getting started. But know that once you become comfortable with notebooks, people who are professionals would probably just allow the last expression to be evaluated, and that's a common convention.

All right, so this is a quick overview of the fundamentals of working with a Jupyter notebook in VS Code. We're going to get some more exposure to how we would actually use this in real-world applications, with some charting, in the videos ahead. Great work.
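The last-expression convention can be imitated in plain Python with the standard `ast` module. This is a simplified sketch for intuition, not the code a real Jupyter kernel runs: `run_cell` and `memory` are hypothetical names, and `some_results` is an assumed spelling of the variable mentioned in the captions.

```python
import ast


def run_cell(source, namespace):
    """Run a cell's source; return the value of its last line if that line
    is a bare expression, otherwise return None (nothing to display)."""
    tree = ast.parse(source)
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Split off the trailing expression so its value can be captured.
        last_expr = tree.body.pop()
        exec(compile(tree, "<cell>", "exec"), namespace)
        return eval(compile(ast.Expression(last_expr.value), "<cell>", "eval"), namespace)
    # The cell ends in a statement (such as an assignment): nothing to show.
    exec(compile(tree, "<cell>", "exec"), namespace)
    return None


memory = {}

# Two expressions in one cell: only the last one is handed back.
shown = run_cell('"first"\n"second"', memory)

# An assignment is a statement, so there is no value to display.
nothing = run_cell("some_results = 3.14 * 10", memory)

# A bare variable name as the last line is an expression, so it is shown.
value = run_cell("some_results", memory)

print(shown, nothing, value)
```

Explicit `print` calls sidestep this rule entirely, which is why several prints in one cell all appear in the output.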
Info
Channel: Kris Jordan
Views: 32,488
Id: IqLY2MW6VqE
Length: 23min 25sec (1405 seconds)
Published: Thu Mar 18 2021