>> On today's Visual Studio toolbox, Jeffrey will show us why
Visual Studio Code is the place to be for data
science and Python. Hi. Welcome to Visual Studio toolbox. I'm your host Robert
Green and joining me today is Jeffrey Mu. Hey, Jeffrey. >> Hey Robert. >> Thanks for coming on the show. >> No problem. >> Jeffrey is a Program Manager in the Python tools group at Microsoft. So this is Part 2
of our Python series. We had Tyrique on in
the previous episode to give us an overview
of Python, and today you're going to
focus on data science. >> Yeah, for sure. >> So what does that mean? >> So why data science? Well, data science is
one of the biggest workloads in Python right now. We're estimating that around
30-35 percent of all Python developers do
data science of some sort. So everyone is trying to do it. >> Is Python as a language
optimized for data science, or is it more
of a general-purpose thing? >> Well, the reason why Python is
so popular for data science is because it's such an easy
language to pick up and learn. A lot of these data scientists don't have an
engineering background, so it's really easy for them to plug and play. Minimal learning curve. It's almost the de facto language that
people use in the industry. Because data science is so popular, we want to have a
first-class data science experience inside VS Code. Most data scientists use Jupyter
Notebooks, which is a tool that people use to explore [inaudible]
code, and because of this, we want Visual Studio Code to offer a
first-class experience for data science and also for developing with
Jupyter Notebooks. So our team has been
cooking up a lot of really cool new features
over the past month. They were just released
this month as well. So I'm super excited to be here to
show you what we have to offer, all the cool things we can do with data science inside Visual
Studio Code as well. >> Cool. >> Yeah. Before we get started, if you're interested in
general Python as well, or something like web application
or web development in VS Code, you can check out my colleague Tyrique's video, like
you mentioned earlier. He also has a video
on Channel 9 as well. >> So aka.ms/vst/pythonVSCode, which is a getting-started video, and there's a whole series
on Python development. >> Yeah. So definitely
go check that out. >> I forgot to mention that on
air in the last episode, so we'll make up for that now. All right, data science. >> So it's easy. If you've never
heard of VS Code before, it's completely open
source. It's free. It's actually really easy to
transition your Jupyter Notebooks from other editors or IDEs into VS Code because we now offer a fully
functional Jupyter UI inside VS Code and also fully functional
Jupyter hotkeys as well. So it's really easy: just bring your Jupyter Notebook
in or create a new one. There's no learning
curve, you just plug and play, and it's what you expect. So how you can get started is, first, if you don't have Visual Studio Code, you want to install that
from code.visualstudio.com. Once you actually have
Visual Studio Code installed, Visual Studio Code is actually
a bare-bones IDE, so it doesn't actually come with
the data science features. You'll need to install
the Python extension on top, which will actually get you
those data science features. So to do this, you can go
into Visual Studio code. On the left-hand side, you'll
see the extensions tab. You can click into it and you can
search for the keyword "Python". It'll be the first one that shows up, and you click
"Install". As you can see, I already have it
installed on my machine for the sake of this demo. So once you have the Python
extension and VS Code installed, the last thing you'll need is a distribution of
Python on your machine. If you don't have
Python on your machine, you'll need to install that, and
there are two main ways to do it. You can download it from the official Python website, or you can get the Anaconda
distribution of Python, which I personally recommend. Anaconda, as a TL;DR, is basically a
package manager, and it's really good for data science because
it is really good at managing your environments for data
science. Anaconda is something that most of the
data science community uses, so I would personally recommend that. I also have Anaconda installed
on this machine as well. So we'll go through that as well. So once you have those two installed, the first thing you'll want to do to get started is to create
a new Jupyter Notebook if you don't already
have one. If you already have a Jupyter
Notebook, you can just open it up normally
as well. But for the sake of this video, we're just going to go through
the getting started experience. So the first thing we want to do is, we want to access
the command palette. So let me just close this. You can access that
through Ctrl+Shift+P, or Cmd+Shift+P if you're on a
Mac, and it'll bring this up. The reason the command palette
is so good is that it has all the actions of
VS Code listed in it. So if you ever don't remember how to get to some
menu or some action, you just bring this up
and search for what you need. In our case, we're going to
create a new Jupyter Notebook. So I'll search "Create
new", and you'll see one of the first things that pops up is Create New Blank Jupyter Notebook. If we click on this, it'll
open up what we call our notebook editor, which
is the new feature we have for editing Jupyter Notebooks. You might be asking what's
the difference between a Jupyter Notebook and
a regular Python file. So the main thing is that
with a Jupyter Notebook file, you can think about
it as your code being segmented into different
sections, or what we call cells, whereas in a Python file
everything is all together. What's really good with
Jupyter Notebooks is that these cells can
be run individually. So you can run individual
pieces of your code in that file, even multiple times, and you don't have to run the
entire file to see a new output. That's where Jupyter Notebook
is really useful, because you can just experiment and test and
change one or two things,
just to see if that
makes your data better. >> Okay. >> Now that we're in
our notebook editor, we can start going through
the UI of the notebook. If we go into the
notebook itself, we'll see that when you make a new notebook, there's one individual cell already. We can now start going through it. You can see there are buttons in the cell, and there are
also buttons up top. The buttons in the cell are
what we call cell actions, which are things that
affect the cell itself. The ones that are up
top on the toolbar are what we call notebook or
global actions, so they affect the entire notebook. We'll go through the
cell actions first. So we can look on the left. You have your move cell up
and move cell down buttons. I only have one
cell here right now, but if I had another cell below, I could just click on this button and
it would move the cell below it. To the right of that, we
have our execution counter. What that does is
tell you the order in which the cell was run in comparison
to the other cells, so you'd have a number like,
let's say, one, two, or three. Right now I haven't
run the cell yet, so there's no number beside it. Below that we have our Run button. So let's write some
code, print("Hi"). You can just click that
Run button and it'll execute that cell. So
we'll see the word 'Hi'. >> Okay. >> Then below that, we have our buttons for run
cells above and run cell and below. What run cells above does is run all your cells
that are above the current cell, but not including the current cell. Where this is really useful is, let's say you have a lot of cells, maybe 20 or 30 cells that you need to get through before you
run the current cell. Instead of having to click the
Run button on all those 30 cells, you just click this one button, and it will run all those cells for you. Similarly, with run cell and below, it will run the current cell
and all the cells below, so it minimizes the number of
clicks you have to make. The next one is switch
to Markdown. The idea here is that Jupyter Notebooks support
not just Python as a language
but also the Markdown
syntax. Markdown is basically a way to write pretty-printed text,
so you can have bold text or lists and so on. This is just for comments in your notebook or making a
notebook more readable. >> Okay. >> Then the last one is delete cell. So you can click on that.
It'll just delete the cell. So now let's go over the
global actions in a notebook. So at the bottom of your
notebook you'll always have this insert cell below button. So it's really easy to
always just add a new cell below whenever you
want. There's also an insert cell above button, and it sits
up at the very top. It only shows on hover, and
if you click on that, you'll see it adds a cell above. >> All right. >> So now let's go through
the top-level toolbar. The first two are
the kernel actions. First is restarting
your Python kernel. If anything happens to your
kernel, or it gets into some weird state, or
you ever need to restart it, you can just click on that and
it'll restart your kernel. The next one is interrupt kernel. This is more like, let's say you're running some
really long code cell and it gets stuck in an infinite
loop or it's just taking a really long time and
you just want to stop it, if you'd just click on this button
it will stop the execution. The next one is another
insert cell button. This will just insert a new cell below whatever
cell you have focused. So for example, here I can click this and insert
a new cell below. The next one is the
run all cells button. Like run above and run below, if you just want to run all the cells in your
notebook to see the output, you just click this button
instead of having to run all your cells individually. The next one is clear all output. So you see that this print statement
has an output of "Hi". I can just click on this and it'll remove all output in my notebook. The next one is what we
call the Variable Explorer. It'll show all the active
variables in your notebook. Currently, I don't have any variables in this notebook, so
nothing shows up. But later on in this video, I'll show a
notebook that has some and go into this
feature more in depth. >> Okay. >> Next is save notebook. Then finally, there's Convert
and Save as a Python script. This is what I was
mentioning earlier, where we have a feature
where you can actually convert your Jupyter Notebook
into a Python file. Again, I'll go more in depth on this feature later in this video. >> Okay. >> So let's move on. So this is when I created
a notebook from scratch. But let's say you already
have a notebook open, or a notebook that you want
to bring into VS Code. So for example, I have this "data science is cool"
notebook, which it is. Let's open that up, and you can see that it also opens up in the notebook editor. This is a notebook I already have. You can see that the
outputs are already saved. So if you already have a notebook with outputs saved from
another notebook application, they will also show up in this one. Some things that we
can go through: there's full support for
IntelliSense and IntelliCode. As you can see here, let's say I remove this statement. If I now want to write plt.show, you can see IntelliCode shows
up with a suggestion for show, and there's also IntelliSense
for all the APIs that support it. So if I want to do this, it also shows the function
signature. >> So plt comes from
matplotlib.pyplot. What is that? >> So matplotlib is one of the most popular plotting
libraries for data science. It's really good for
graphing and seeing the data. So I guess the first feature
I want to go through more in depth is called the plot viewer. You can see here that this cell
generated a plot as an output. You can see that
there's a lot of data, and it's hard to zoom in, or the plot is small, so it's hard to see the
data really clearly. What we can actually do is, if you click on this
top left button here, it brings up the plot
viewer in a new window. This plot viewer
will bring up a bigger image of that plot, and then you can do
really cool things, such as zooming in on the plot. So let's say you want
to look closer at where it crosses the X-axis,
you can see it here. It also lets you do other
cool things, such as
saving the plot
in different formats. So if you want to save it as
a PDF or as a PNG image, and you want to share
it with other people, you can do that as well. Now let's go back to our notebook. Another cool feature that
I mentioned before was the Variable Explorer. So with the Variable Explorer, let's say I open that and
just run this code cell. You can see that P shows up, because
P is a variable I just made. The Variable Explorer will show the current
state of all your variables. Where this is really useful is, let's say you have many
cells that you run, and maybe sometimes you
run them out of order because you want to
fix some code in one of them. When you run it again, many times you'll
think, "I don't remember what
my variable was, or I don't remember the state." You'd have to print that out.
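As a rough illustration of that manual approach, the sketch below prints the state of a set of variables by hand. The helper name `describe_variables` and the sample variables are made up for this example and are not part of VS Code or Jupyter; a plain list stands in for the ndarray from the demo so the code runs without extra packages.

```python
# Hypothetical helper: summarize variables the way one might by hand
# when the notebook state is unclear. Not a VS Code or Jupyter API.
def describe_variables(namespace):
    rows = []
    for name, value in sorted(namespace.items()):
        if name.startswith("_") or callable(value):
            continue  # skip private names, functions, and classes
        # len() works for lists, tuples, strings, and NumPy arrays alike
        count = len(value) if hasattr(value, "__len__") else 1
        rows.append((name, type(value).__name__, count, repr(value)[:40]))
    return rows


# Stand-in state: pretend these were set by cells run in some unknown order.
state = {"p": [i / 10 for i in range(100)], "label": "demo", "_tmp": 0}

for name, type_name, count, preview in describe_variables(state):
    print(f"{name:<8}{type_name:<8}{count:<6}{preview}")
```

Each row carries a name, a type, a count, and a truncated value, which is the same kind of at-a-glance summary the Variable Explorer provides without any extra code.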
But the Variable Explorer is really useful because
you can just open it up, and you'll see at a glance
exactly what your variables are. >> Okay. >> So the Variable Explorer
has the name of the variable, so in this case we have P. It
has the type of the variable, so we have the ndarray type. It has a count, which is basically the length of the variable, so it says 100 here.
But the most
interesting part is the value. This is really
useful for datasets, and it also ties into a new
feature called the Data Viewer. If you look on the right,
there's a button over here for array or ndarray types
and DataFrame types that
shows the variable
in the Data Viewer. If you click on this, it'll
open up a new window, and it gives you an Excel-like
interface with all your data. This is really useful
because it lets you basically sanity check or look at your
dataset without having to write code to do it.
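For comparison, here is a minimal sketch of the kind of checks one would otherwise write by hand. The data is synthetic and the variable names are illustrative, since the notebook's real data is not shown in the episode; the demo variable is a NumPy ndarray, while this sketch uses a plain list so it runs without extra packages.

```python
import math

# Synthetic stand-in for the 100-element array from the demo (assumption:
# the actual values in the notebook are not known).
p = [abs(math.sin(i / 10)) * 2 for i in range(100)]

# Look at the data: size, range, and the first few values.
print("count:", len(p))
print("min:", min(p), "max:", max(p))
print("head:", [round(v, 3) for v in p[:5]])

# Sanity checks written as code: which values are negative, which are below one?
negatives = [v for v in p if v < 0]
below_one = [v for v in p if v < 1]
print("negative values:", len(negatives))
print("values below one:", len(below_one))
```

Every question about the data needs another line of code here, which is the overhead an interactive grid view removes.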
It's just there to support you. Even more useful is, let's say you want to
do a sanity check, and you want to make sure
there's nothing negative, or no numbers are less than one, or see how many values there are like that. All you need to do is click
this "Filter Rows" button. You'll see these text boxes pop up, and you can type filters into them. Let's say I want to make sure that
none of my values are negative. All I have to do is
type less than zero, and I can see there's
nothing matching that. >> Cool. >> Or if I want to see
everything less than one, it'll give me all the
values less than one. So that was the Data Viewer. The next thing I want to
show you is Jupyter hotkeys. Like I mentioned,
Jupyter hotkeys basically make your workflow
more productive. Instead of having to
click for all these actions or find them in the menus, you can just use hotkeys, and we have full support
for many Jupyter hotkeys. So for example, there's Ctrl+Enter
or Shift+Enter. If I press it, it'll run this cell. Also, I can push Escape
to go into command mode. You'll see this turn blue, and then I can navigate
between my different cells, and if I want, I can push DD to delete. There we go. For the
full list of Jupyter hotkeys, you can check at
the end of this video, or in the description
I'll have a link to the documentation, and there are a lot
more hotkeys supported as well. The next thing I want to go
through is remote Jupyter Server. So right now I'm running
on my laptop right here. It's pretty fast, but
obviously not as fast as some server in the cloud. A lot of machine learning
or data science tasks are really compute-intensive. So I don't want to sit here for maybe a few days or even a week waiting for one to
run on my machine. >> Right. >> So we have the ability
to actually connect to a remote Jupyter server and
leverage the compute power of that. To do this, we just go back to
the command palette, so Ctrl+Shift+P, or Cmd+Shift+P
if you are on a Mac. We'll search for the command
Specify Jupyter Server URI. We'll see that show up, and all you need to
do is just click it, and you'll see, by default, it's running the
local Jupyter Server. So right now, it's just
running on my local machine. But if you have a remote machine
in the Cloud and you want to leverage that compute power like
a GPU or a really powerful CPU, you just click on this button, and here you can enter the URL
for whatever server it is. So it's really useful. This entire
interface also supports Remote SSH with Visual
Studio Code, so you can also connect to a
remote server through that way. So the last thing I
want to show you is, like I mentioned previously, is the Convert and Save
to a Python script. So in this scenario,
let's say I'm pretty happy with what I've done
so far in my notebook, these are generating the right
plots, the code seems right, and now I want to convert it
into a production service, like an API others can use, or I want to convert
it into a Python script so that people can just run it from the
command line and don't have to have a Jupyter
Notebook to do it. Well, all we have to do is
just click this one button, and it'll automatically convert all my notebook code
into a Python format. Before, I would have had to manually copy and paste all the code in the cells
into a new file. Let's say you have a huge
notebook with hundreds of cells; that could take an hour or something. But this makes it almost instant. >> Can you go the other direction? If you have this Python file, can you
go back to the Jupyter Notebook? >> So there's also a feature that we have that lets
you go the other way. >> Okay. >> So it really works both ways, where you can just go
from one to the other. You can work in whatever
you're comfortable in. We want to encourage that
flexibility as well. So once you actually convert
and save as a Python file, it'll open up in what we call
our Python interactive window. You might be asking,
what's the difference between notebook editor, Python interactive window, or even let's say our traditional
just regular Python file? The main difference is that the Jupyter Notebook editor is
mostly for Jupyter Notebook files, so files with the .ipynb extension. It gives you that traditional
Jupyter Notebook interface where you have cells, your input and output are inline, and it's just like the general Jupyter UI. In our Python
interactive window here, we'll see that it's a hybrid of your traditional Python file and
the Jupyter Notebook editor. You'll see it's
like a Python file, but you also have an overlay
for Jupyter Notebook cells, because it knows it came
from a Jupyter Notebook. So you'll see these run cell, run below, and debug cell actions. So you have the best of
both worlds in this case. Let's just save this
file real quick first. I already have one named test, so I'll name this one tests1.py. So with this Python
interactive window, you have the benefits of both. It's like a hybrid of both
regular Python file editor and also our notebook editor. So you can see this is in
our traditional Python file with the test.py. But we also have our overlays
for our Jupyter Notebook. So we can see that there's Run Cell, Run Above, and Debug cell, and where this is really
cool is you can still run individual code cells like
they were in a Jupyter Notebook, but you can also run the entire file like a regular Python file. So you have the best of both
worlds in that scenario. What's even cooler is that because
it's a Python file now, we also have the
ability to debug cells. So instead of having to
debug the
code for the entire file,
you can just debug
the individual cell. So if I want to debug
the cell, for example, I can just click on "Debug Cell", and I can see it starts stepping
through the cell line by line. If I want to go through
each line of code, I can just click "Step Over" and you'll see it keeps on
going through each line of code, and you can see the Variables, and the Call Stack up here as well. >> Cool. >> Yeah. So the last feature
I want to go through in the Python interactive
window is the input. We have a fully functional IPython REPL window at the bottom right. Here you can actually type
in code and run it inline with your existing
Python file as well. This window also has full IntelliSense
and IntelliCode capability. It has the context of
what you ran previously, and you can run new code
against your current state. So for example, here
I've run the cell, so it has context like
numpy, pandas, and matplotlib. Here I can, for example,
create a new variable, let's name it X, and I'll
make an array of zeros. Just as an example, I can say zeros, and then let's just make it size 10, and you can see that runs as well. >> Cool. >> Yeah. As you can see, my entire data science workflow, from getting started by
creating a new notebook, or even just bringing my own
notebook into VS Code, to doing the experimentation and doing the debugging, all
of that was done inside this one tool, Visual Studio Code. So that's where I think
our tool will excel, where you don't have to
switch between different tools;
everything can just be done with
this one really amazing tool. >> It's fantastic. Cool stuff. >> Yeah. This
just came out this month. So this is all brand new. I encourage everyone to go
try it out for themselves. All you have to do is
download Visual Studio Code and the Python extension, and then just bring
in your own notebook or create a new notebook
and explore for yourself. >> Awesome. >> Yeah. >> All right. Thanks
for showing us that. >> Thank you so much
for having me here. >> So anybody doing Python, data science, Jupyter Notebook, this is absolutely the idea of the environment of
choice at this point. >> For sure. >> Cool. >> Then we're actually putting a
lot of our focus on this tool. So we're going to have a lot of new features coming up in coming months. So hopefully, I'll be back
soon to demo even more. >> Absolutely. >> Yeah. >> All right. Cool. I
hope you enjoyed that, and we will see you next time
on Visual Studio Toolbox.
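
The notebook-to-script conversion shown in the episode can be approximated in a few lines. This is a minimal sketch under stated assumptions, not the Python extension's actual implementation: it assumes the standard .ipynb JSON layout (a top-level `cells` list whose entries have `cell_type` and `source`) and emits the `# %%` cell markers that VS Code recognizes in Python files. The function name `notebook_to_script` and the sample notebook are hypothetical.

```python
import json


def notebook_to_script(notebook):
    """Turn a parsed .ipynb dict into Python source with '# %%' cell markers."""
    chunks = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # The notebook format stores source as one string or a list of lines
        text = source if isinstance(source, str) else "".join(source)
        if cell.get("cell_type") == "code":
            chunks.append("# %%\n" + text.rstrip("\n"))
        elif cell.get("cell_type") == "markdown":
            # Keep Markdown cells as comments so the script stays runnable
            commented = "\n".join("# " + line for line in text.splitlines())
            chunks.append("# %% [markdown]\n" + commented)
    return "\n\n".join(chunks) + "\n"


# Tiny hand-written notebook standing in for a real .ipynb file
sample = json.loads("""
{
  "cells": [
    {"cell_type": "markdown", "source": ["# Demo notebook"]},
    {"cell_type": "code", "source": ["x = [0] * 10\\n", "print(len(x))"]}
  ],
  "nbformat": 4,
  "nbformat_minor": 2
}
""")

print(notebook_to_script(sample))
```

Because the result is an ordinary Python file, it can be run from the command line as a whole, while the `# %%` markers still let an editor treat each section as a separately runnable cell.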