Jupyter Notebooks in Visual Studio Code

Captions
>> On today's Visual Studio Toolbox, Jeffrey will show us why Visual Studio Code is the place to be for data science and Python.
Hi. Welcome to Visual Studio Toolbox. I'm your host, Robert Green, and joining me today is Jeffrey Mu. Hey, Jeffrey.
>> Hey, Robert.
>> Thanks for coming on the show.
>> No problem.
>> Jeffrey is a Program Manager in the Python tools group at Microsoft. So this is Part 2 of our Python series. We had Tyrique on in the previous episode to give us an overview of Python, and today you're going to focus on data science.
>> Yeah, for sure.
>> So what does that mean?
>> So why data science? Well, data science is one of the biggest workloads in Python right now. We're estimating that around 30 to 35 percent of all Python developers do data science of some sort. So everyone is trying to do it.
>> Is Python as a language optimized for data science, or is it more of a general-purpose thing?
>> Well, the reason why Python is so popular for data science is that it's such an easy language to pick up and learn. A lot of these data scientists don't have an engineering background, so it's really easy for them to plug and play. Minimal learning curve. It's almost the de facto language that people use in the industry. Because data science is so popular, we want to have a first-class data science experience inside VS Code. Most data scientists use Jupyter Notebooks, which is a tool that people use to explore [inaudible] code, and because of this, we want Visual Studio Code to offer a first-class experience for data science and also for developing with Jupyter Notebooks. So our team has been cooking up a lot of really cool new features over the past month. This was just released this month as well. So I'm super excited to be here to show you what we have to offer, all the cool things we can do with data science inside Visual Studio Code.
>> Cool.
>> Yeah.
Before we get started, if you're interested in general Python as well, or something like web applications or web development in VS Code, you can check out my colleague Tyrique's video, like you mentioned earlier. He also has a video on Channel 9 as well.
>> So aka.ms/vst/pythonVSCode, which is some getting started, and there was a whole series on Python development.
>> Yeah. So definitely go check that out.
>> Forgot to mention that on air in the last episode, so we'll make up for that now. All right, data science.
>> So it's easy. If you've never heard of VS Code before, it's completely open source. It's free. It's actually really easy to transition your Jupyter Notebooks from other editors or IDEs into VS Code, because we now offer a fully functional Jupyter UI inside VS Code and also fully functional Jupyter hotkeys. So it's really easy: just bring your Jupyter Notebook in or create a new one. There's no learning curve, you just plug and play, and it's what you expect.
So how you get started is, first, if you don't have Visual Studio Code, you want to install that from code.visualstudio.com. Once you have Visual Studio Code installed, it's actually a bare-bones IDE, so it doesn't come with the data science features. You'll need to install the Python extension on top, which will get you those data science features. To do this, you can go into Visual Studio Code. On the left-hand side, you'll see the Extensions tab. You can click into it and search for the keyword "Python". It'll be the first one that shows up, and you click "Install". As you can see, I already have it installed on my machine for the sake of this demo.
So once you have the Python extension and VS Code installed, the last thing you'll need is a distribution of Python on your machine. If you don't have Python on your machine, you'll need to install that, and there are two main ways to do it.
You can download it from the official Python website, or you can get the Anaconda distribution of Python, which I personally recommend. Anaconda, I guess as a TL;DR, is basically a package manager, and it's really good for data science because it is really good at managing your environments. Anaconda is something that most of the data science community uses, so I would personally recommend that. I also have Anaconda installed on this machine, so we'll go through that as well.
So once you have those two installed, the first thing you'll want to do to get started is to create a new Jupyter Notebook. If you already have a Jupyter Notebook, you can open it up normally as well. But for the sake of this video, we're just going to go through the getting started experience. So the first thing we want to do is access the command palette. Let me just close this. You can access that through Ctrl+Shift+P, or Command+Shift+P if you're on a Mac, and it'll bring this up. Why the command palette is so good is that it has all the actions of VS Code listed in it. So if you ever don't remember how to get to some menu or some action, you just bring this up and search for what you need. In our case, we're going to create a new Jupyter Notebook. So I'll search "Create new", and you'll see one of the first things that pops up is Create New Blank Jupyter Notebook. If we click on this, it'll open up what we call our notebook editor, which is the new feature we have for editing Jupyter Notebooks.
You might be asking what the difference is between a Jupyter Notebook and a regular Python file. The main thing is that in a Jupyter Notebook file, you can think of your code as segmented into different sections, or what we call cells, whereas in a Python file everything is all together. So what's really good with Jupyter Notebooks, you can see, is that these cells can be run individually.
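For readers following along, the idea of a notebook as a list of cells can be made concrete with a small sketch. A notebook file (.ipynb) is stored as JSON, with each cell as one entry in a "cells" list. The cell contents below are invented for illustration, and the helper name `summarize_cells` is hypothetical; only Python's standard `json` module is used.

```python
import json

# Hypothetical notebook with two code cells; the contents are invented.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 1,      # this cell has been run once
            "metadata": {},
            "outputs": [],
            "source": ["x = 21"],
        },
        {
            "cell_type": "code",
            "execution_count": None,   # not run yet, so no counter
            "metadata": {},
            "outputs": [],
            "source": ["print(x * 2)"],
        },
    ],
}


def summarize_cells(nb):
    """Return (cell type, execution count, source text) for each cell."""
    return [
        (cell["cell_type"], cell["execution_count"], "".join(cell["source"]))
        for cell in nb["cells"]
    ]


# Round-trip through JSON text, which is what gets written to the .ipynb file.
reloaded = json.loads(json.dumps(notebook, indent=1))
for row in summarize_cells(reloaded):
    print(row)
```

Because each cell is a separate entry, an editor can run, move, or delete one cell without touching the others.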
So you can run individual pieces of your code in that file, or run them multiple times, and you don't have to run the entire file to see a new output. That's where a Jupyter Notebook is really useful, because you can just experiment and test and change one or two things, just to see if that makes your data better.
>> Okay.
>> Now that we're in our notebook editor, we can start going through the UI of the notebook. If we go into the notebook itself, we'll see that when you make a new notebook, there's one individual cell already. You can see there are buttons in the cell and there are also buttons up top. The buttons in the cell are what we call cell actions, things that affect the cell itself. The ones up top on the toolbar are what we call notebook or global actions, so they affect the entire notebook.
We'll go through the cell actions first. If we look on the left, you have your move cell up and move cell down buttons. I only have one cell here right now, but if I had another cell below, I could just click on this button and it would move the cell below there. To the right of that, we have our execution counters. What those do is tell you the order in which the cell was run relative to the other cells, so you'd have a number like, let's say, one, two, or three. Right now I haven't run the cell yet, so there's no number beside it. Below that we have our Run button. So let's write some code, print("Hi"). You can just click that Run button and it'll execute that cell. So we'll see the word "Hi".
>> Okay.
>> Then below that, we have our buttons for run cells above and run cells below. What run cells above does is run all the cells that are above the current cell, but not including the current cell.
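The difference between "run above" (excludes the current cell) and "run cell and below" (includes it) can be sketched with list slicing. This is only an illustration of the semantics, not how the extension or a Jupyter kernel is implemented; the function names and the three example cells are hypothetical, and `exec` in a shared dictionary stands in for the kernel's state.

```python
def run_cells(cells, namespace):
    """Execute each cell's source in order in one shared namespace."""
    for source in cells:
        exec(source, namespace)


def run_above(cells, current, namespace):
    """Run every cell before index `current`, excluding the current cell."""
    run_cells(cells[:current], namespace)


def run_cell_and_below(cells, current, namespace):
    """Run the cell at index `current` and every cell after it."""
    run_cells(cells[current:], namespace)


cells = ["x = 1", "y = x + 1", "z = y * 10"]

kernel_state = {}
run_above(cells, 2, kernel_state)
print("z" in kernel_state)   # the current cell (index 2) has not run yet

run_cell_and_below(cells, 2, kernel_state)
print(kernel_state["z"])
```

The shared namespace is the point: later cells see whatever earlier cells defined, which is why running the cells above first matters.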
So where this is really useful is, let's say you have a lot of cells, maybe 20 or 30 cells that you want to get through before you run the current cell. Instead of having to click the Run button on all those 30 cells, you just click this one button and it will run all those cells for you. Similarly, run cell and below will run the current cell and all the cells below, so it minimizes the number of clicks you have to make.
The next one is change to Markdown. The thing with this is that Jupyter Notebooks support not just Python as a language, but also the Markdown syntax. Markdown is basically a way to write pretty-printed text, so you can have bold or lists and stuff. This is just for comments in your notebook or for making a notebook more readable.
>> Okay.
>> Then the last one is delete cell. You can click on that and it'll just delete the cell.
So now let's go over the overall global actions in a notebook. At the bottom of your notebook you'll always have this insert cell below button, so it's really easy to add a new cell below whenever you want. There's also an insert cell above button, although it's up at the very top. It only shows on hover, so if you click on that, you'll see it adds a cell above.
>> All right.
>> So now let's go through the top-level toolbar. The first two are the kernel actions. First is restarting your Python kernel. If anything happens to your kernel, or it gets into some weird state, or you ever need to restart it, you can just click on that and it'll restart your kernel. The next one is interrupt kernel. This is more for when, let's say, you're running some really long code cell and it gets stuck in an infinite loop, or it's just taking a really long time and you want to stop it. If you click on this button, it will stop the execution. The next one is another insert cell button.
This will just insert a new cell below whatever cell you're focused on. So for example, here I can click this and insert a new cell below. The next one is the run all cells button. So like run above and run below, if you just want to run all the cells in your notebook to see the output, you click this button instead of having to run all your cells individually. The next one is clear all output. You can see that for this print statement I have an output of "Hi". I can just click on this and it'll remove all output in my notebook. The next one is what we call the Variable Explorer. It'll show all the active variables in your notebook. Currently, I don't have any variables in this notebook, so nothing shows up. But later on in this video, I'll show a notebook with variables in it and go into this feature in more depth.
>> Okay.
>> Next is save notebook. Then finally, there's Convert and Save as a Python script. This is what I was mentioning earlier, where we have a feature that lets you convert your Jupyter Notebook into a Python file. Again, I'll go into this feature in more depth later in this video.
>> Okay.
>> So let's move on. That was when I created a notebook from scratch. But let's say you already have a notebook that you want to bring into VS Code. For example, I have this "data science is cool" notebook, which it is. Let's open that up, and you can see that it also opens in the notebook editor. This is a notebook I already have, and you can see that the outputs are already saved. So if you already have a notebook with output saved from another notebook application, that output is kept in this one as well. Some things that we can go through: there's full support for IntelliSense and IntelliCode.
So as you can see here, let's say I remove this statement. If I now want to write plt.show, you can see IntelliCode shows up with a suggestion for show, and there's also IntelliSense for all the APIs that support it. So if I do this, it also shows the function signature.
>> So plt comes from matplotlib.pyplot. What is that?
>> So matplotlib is one of the most popular plotting libraries for data science. It's really good for graphing and seeing the data.
I guess the first feature I want to go through in more depth is called the plot viewer. You can see here that this cell generated a plot as an output. There's a lot of data, it's hard to zoom in, and the plot's small, so it's hard to see the data really clearly. What we can do is click on this top-left button here, and it brings up the plot viewer in a new window. The plot viewer brings up a bigger image of that plot, and then you can do really cool things such as zooming in on the plot. So let's say you want to look closer at where it crosses the X-axis, you can see that here. It also lets you do other cool things, such as saving the plot in different formats. So if you want to save it as a PDF or as a PNG image and share it with other people, you can do that as well.
So now let's go back to our notebook. Another cool feature that I mentioned before was the Variable Explorer. Let's say I open that and I just run this code cell. You can see that p shows up, because p is a variable I just made. So the Variable Explorer shows the current state of all your variables.
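As a rough picture of what a variable-explorer view summarizes, the sketch below walks a dictionary of variables and reports each one's name, type, and element count. It is a plain-Python stand-in, not the extension's implementation: the function name `explore_variables` is hypothetical, and a 100-element list plays the role of the ndarray `p` from the demo so that the example needs only the standard library.

```python
import math


def explore_variables(namespace):
    """List (name, type, count) for each user variable in a namespace."""
    rows = []
    for name, value in namespace.items():
        if name.startswith("_"):
            continue  # skip private and dunder names
        count = len(value) if hasattr(value, "__len__") else None
        rows.append((name, type(value).__name__, count))
    return rows


# Stand-in for the notebook's state: a 100-element list plays the role of
# the ndarray `p` from the demo.
state = {
    "p": [math.sin(i / 10) for i in range(100)],
    "n": 100,
}

for row in explore_variables(state):
    print(row)
```

Values without a length, such as the integer `n`, get a count of `None` here; that choice is an assumption of this sketch.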
Where this is really useful is, let's say you have many cells that you run, and maybe sometimes you run them out of order because you want to fix some code in one of them. When you run it again, many times you'll be like, I don't remember what my variable was, or I don't remember the state, and you'd have to print that out. The Variable Explorer is really useful because you can just open it up and see at a glance exactly what your variables are.
>> Okay.
>> So the Variable Explorer will have the name of the variable; in this case, we have p. It'll have the type of the variable, so we have the ndarray type. It'll have a count, which is basically the length of the variable, so it says 100 here. But the really interesting part is the value. This is really useful for datasets, and it also ties into a new feature called the Data Viewer. If you look on the right, there's a button over here, for array or ndarray types and DataFrame types, to show the variable in the Data Viewer. If you click on this, it'll open up a new window, and it gives you an Excel-like interface with all your data. This is really useful because it lets you sanity check or look at your dataset without having to write code to do it. It's just there to support you.
Even more useful is, let's say you want to do a sanity check and make sure there's nothing negative, or that no numbers are less than one, or see how many values there are like that. All you need to do is click this "Filter Rows" button, and you'll see these text boxes pop up. Let's say I want to make sure that none of my values are negative. All I have to do is type less than zero, and I can see there's nothing matching that.
>> Cool.
>> Or if I say everything less than one, it'll give me all the values less than one. So that was the Data Viewer. The next thing I want to show you is Jupyter hotkeys.
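Before the hotkeys, the two Filter Rows checks from the Data Viewer demo ("less than zero" and "less than one") can be approximated in plain Python. This is a sketch of the idea only: the `filter_rows` helper, the expression syntax it accepts, and the sample values are all assumptions made for illustration, not the Data Viewer's actual filter grammar.

```python
import operator

# Comparison symbols this sketch understands (an assumed, minimal grammar).
OPERATORS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}


def filter_rows(values, expression):
    """Return (row index, value) pairs matching a filter such as '< 0'."""
    expression = expression.strip()
    # Check two-character symbols first so '<=' is not read as '<'.
    for symbol in ("<=", ">=", "==", "<", ">"):
        if expression.startswith(symbol):
            threshold = float(expression[len(symbol):])
            compare = OPERATORS[symbol]
            return [(i, v) for i, v in enumerate(values) if compare(v, threshold)]
    raise ValueError(f"unsupported filter: {expression!r}")


values = [0.25, 0.5, 1.0, 1.5, 2.0, 3.0]

print(filter_rows(values, "< 0"))   # sanity check: any negative values?
print(filter_rows(values, "< 1"))   # which rows are below one?
```

An empty result for `"< 0"` is the "nothing matching" outcome from the demo: no negative values slipped into the data.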
So like I mentioned, Jupyter hotkeys can basically make your workflow more productive. Instead of having to click for all these actions or find them in the menus, you can just use a bunch of hotkeys, and we have full support for many Jupyter hotkeys. So for example, there's Ctrl+Enter or Shift+Enter. If I press it, it'll run this cell. Also, if I want to go into command mode, I can push Escape. You'll see this turn blue, and then I can navigate between my different cells, and if I want, I can push DD to delete. There we go. For the full list of Jupyter hotkeys, you can check at the end of this video, or in the description I'll have a link to the documentation, and there are a lot more hotkeys supported as well.
The next thing I want to go through is remote Jupyter servers. Right now I'm running on my laptop right here. It's pretty fast, but obviously not as fast as some server in the cloud. A lot of machine learning or data science tasks are really compute-intensive, so I don't want to sit here for maybe a few days or even a week waiting for them to run on my machine.
>> Right.
>> So we have the ability to connect to a remote Jupyter server and leverage its compute power. To do this, we just go back to the command palette, so Ctrl+Shift+P, or Command+Shift+P if you're on a Mac. We'll search for the command Specify Jupyter Server URI. We'll see that show up, and all you need to do is click it, and you'll see that, by default, it's running the local Jupyter server. So right now, it's just running on my local machine. But if you have a remote machine in the cloud and you want to leverage that compute power, like a GPU or a really powerful CPU, you just click on this button, and here you can enter the URL for whatever server it is. So it's really useful. This entire interface also supports Remote SSH with Visual Studio Code as well.
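On the server URI itself: a Jupyter server address is an ordinary URL, often with an access token in the query string. The sketch below pulls such a URL apart with the standard library so the pieces are visible before pasting it into the prompt. The host name and token are made up, and the `describe_server` helper is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def describe_server(uri):
    """Break a Jupyter server URI into scheme, host, port, and token presence."""
    parts = urlsplit(uri)
    token = parse_qs(parts.query).get("token", [None])[0]
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "has_token": token is not None,
    }


# Hypothetical remote machine; replace with the URL your own server prints.
info = describe_server("http://gpu-box.example.com:8888/?token=abc123")
print(info)
```

Treat the token like a password: anyone who has the full URL can run code on that server.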
So you can also connect to a remote server that way.
The last thing I want to show you, like I mentioned previously, is Convert and Save to a Python script. In this scenario, let's say I'm pretty happy with what I've done so far in my notebook. It's generating the right plots, the code seems right, and now I want to convert it into a production service, so it's like an API others can use, or I want to convert it to a Python script so that people can just run it from the command line and don't need a Jupyter Notebook to do it. Well, all we have to do is click this one button, and it'll automatically convert all my notebook code into a Python format. Before, I would have had to manually copy and paste all the code in the cells into a new file. Let's say you have a huge notebook with hundreds of cells; that could take an hour or something. But this makes it almost instant.
>> You can go the other direction? If you have this Python, can you go back to the Jupyter Notebook?
>> So there's also a feature that we have that lets you go the other way.
>> Okay.
>> So it's really a two-way thing, where you can just go from one to the other. You can work in whatever you're comfortable in. We want to encourage that flexibility as well.
So once you convert and save as a Python file, it'll open up in what we call our Python interactive window. You might be asking, what's the difference between the notebook editor, the Python interactive window, or even, let's say, a traditional, regular Python file? The main difference is that the Jupyter Notebook editor is mostly for Jupyter Notebook files, so those with the extension .ipynb. It gives you that traditional Jupyter Notebook interface where you have cells, your input and output are inline, and it's just like the general Jupyter UI. In our Python interactive window here, we'll see that it's a hybrid of your traditional Python file and the Jupyter Notebook editor.
So you'll see it's like a Python file, but you also have an overlay for the Jupyter Notebook cells, because it knows it came from a Jupyter Notebook. You'll see these Run Cell, Run Below, and Debug Cell links. So you have the best of both worlds in this case. Let's just save this file real quick first. I already have one named test, so I'll name it test1.py.
So with this Python interactive window, you have the benefits of both. It's like a hybrid of our regular Python file editor and our notebook editor. You can see this is our traditional Python file, test1.py, but we also have our overlays for the Jupyter Notebook. We can see that there's Run Cell, Run Above, and Debug Cell, and where this is really cool is that you can still run individual code cells as if it were a Jupyter Notebook, but you can also run the entire file as if it were a regular Python file. So you have the best of both worlds in that scenario.
What's even cooler is that because it's a Python file now, we also have the ability to debug cells. Instead of having to debug the code for the entire file, you can debug just an individual cell. So if I want to debug the cell, for example, I can just click on "Debug Cell", and you can see it starts stepping through the cell line by line. If I want to go through each line of code, I can just click "Step Over", and you'll see it keeps going through each line of code, and you can see the Variables and the Call Stack up here as well.
>> Cool.
>> Yeah. So the last feature I want to go through in the Python interactive window is the input. We have a fully functional IPython REPL window at the bottom right. Here you can type in code and run it in line with your existing Python file. This window also has full IntelliSense and IntelliCode capability. So it has the context of what you ran previously, and you can run code that updates your current state.
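The cell overlay in a converted script is driven by comment markers: the Python extension treats a line starting with `# %%` as the start of a cell. The sketch below splits a script on those markers and runs the cells one at a time in a shared namespace, which is roughly the behaviour Run Cell gives you. The `split_cells` helper and the sample script are hypothetical, and this is an illustration rather than the extension's own parser.

```python
def split_cells(script):
    """Split script text into cell bodies, one per '# %%' marker line."""
    cells = []
    current = None
    for line in script.splitlines():
        if line.startswith("# %%"):
            current = []          # a marker opens a new cell
            cells.append(current)
        elif current is not None:
            current.append(line)  # lines before the first marker are ignored
    return ["\n".join(body).strip() for body in cells]


script = """# %%
import math

# %%
p = [math.sin(i / 10) for i in range(100)]

# %%
print(len(p))
"""

cells = split_cells(script)

# Run only the first two cells, as if clicking Run Cell on each in turn.
state = {}
for source in cells[:2]:
    exec(source, state)
print(len(state["p"]))
```

Because the markers are ordinary comments, the same file still runs top to bottom as a normal Python script from the command line.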
So for example, here I've run the cell, so it has context, like numpy, pandas, and matplotlib. Here I can, for example, create a new variable, let's name it x, and I'll make an array of zeros. Just as an example, I can say zeros, and then let's make it size 10, and you can see that runs as well.
>> Cool.
>> Yeah. As you can see, my entire data science workflow, from getting started and creating a new notebook, to bringing my own notebook into VS Code, and then doing the experimentation and the debugging, was all done inside this one tool, Visual Studio Code. So that's where I think our tool will excel: you don't have to switch between different tools, everything can just be done with this one really amazing tool.
>> It's fantastic. Cool stuff.
>> Yeah. This just came out this month, so this is all brand new. I encourage everyone to go try it out for themselves. All you have to do is download Visual Studio Code and the Python extension, and then just bring in your own notebook, or create a new notebook, and explore for yourself.
>> Awesome.
>> Yeah.
>> All right. Thanks for showing us that.
>> Thank you so much for having me here.
>> So for anybody doing Python, data science, Jupyter Notebooks, this is absolutely the IDE, the environment of choice at this point.
>> For sure.
>> Cool.
>> And we're actually putting a lot of our focus on this tool, so we're going to have a lot of new features coming up in the coming months. So hopefully, I'll be back soon to demo even more.
>> Absolutely.
>> Yeah.
>> All right. Cool. I hope you enjoyed that, and we will see you next time on Visual Studio Toolbox.
Info
Channel: Microsoft Visual Studio
Views: 88,427
Id: FSdIoJdSnig
Length: 20min 49sec (1249 seconds)
Published: Thu Dec 05 2019