If you already work with data in Excel, and 
want to add more power to your data analysis   and evaluation using Python, then this is the 
course for you. Frank is a data scientist,   and he will teach you how to use Python to work 
with data. Hi, everyone, my name is Frank Andrade, and this is my Python course for Excel
users. I created this course to help Excel users move from Excel to Python. But why Python?
Well, in Python, we can do most of the things we would do in Excel, such as working with data,
making charts, and building pivot tables. But that's not all. We can use all the power of Python
to automate tasks, work with large data, and do lots of other things, thanks to the thousands
of free libraries Python has. On top of that, Python can help you become a better data analyst
or get into new fields like data science. I divided this Python course for Excel users into
three modules. In module one, I'll teach you all the Python core concepts you need to know for data
analysis. Then in module two, we'll learn pandas. Pandas is a Python data analysis library that will
help us do most of the things we can do in Excel. In module three, we'll put into practice what we
learned in this course by creating a pivot table and visualizations such as line plots, bar plots,
and pie charts. Remember that in the description, you will find the code files, as well as a free
PDF Python cheat sheet I created for this course. There, you will find the concepts, methods, and
functions we will see in this course. By the way, I'm Frank, and I will be your instructor
in this course. So let's get started.

To download Anaconda, we go to anaconda.com and
click on Get Started. Then we choose the last option, Download Anaconda Installers. And then we
have here the different Anaconda installers. So there are Windows, Mac, and Linux. In my case,
I'm going to choose Mac, and I'm gonna choose the 64-bit graphical installer. So now I'm downloading
Anaconda. Once it's downloaded, I'm gonna click on it, and a message will pop up. You
just have to click on Allow, as I'm going to do right now. So just click on Allow and then
click on Continue until the installation starts. So I just click Continue, then Agree, and then
Continue. And it's going to start installing Anaconda. In case you're on Windows and you're
installing Python or Anaconda for the first time, make sure to check the first box you see now
on screen. I'm going to speed up the video now. Okay, the installation is almost done. And
now it's telling me that Anaconda works with PyCharm. Now I'm just going to click on Continue
to finish the installation. So I click Continue, and then we'll see just a summary of what was
installed. Now I'm going to close this window and open Anaconda. So I'm going to
locate the icon, it's a green icon, this one that you see here, and I'm going to open Anaconda. I'm
going to wait a couple of seconds, and let's see what was installed. So here we have JupyterLab
and Jupyter Notebook, which are widely used in data science. So I'm going to launch
Jupyter Notebook. Here it's opening Jupyter Notebook. Let's give it a second. And now we open
a new notebook with Python 3. So Python 3 was installed too, and that's it. In the following
videos, we'll learn how to use Jupyter Notebook.

In this video, I will introduce you to the
Jupyter Notebook interface. Jupyter Notebook is an open-source web application that allows
us to create and share documents that contain live code, equations, visualizations,
and text. This is a perfect text editor for doing data cleaning and transformation,
data visualization, and data analysis. This is why Jupyter Notebook is widely used in data science
and also machine learning. As you might remember, we installed Jupyter Notebook and Python with the
Anaconda Navigator, and this means that we already have installed some popular libraries used in
Python for data analysis. By the way, one of the alternatives to Jupyter Notebook is JupyterLab. Both are
similar, but we're going to use Jupyter Notebook in this course because of its simplicity. So
let's open Jupyter Notebook. To do that, we have to click here on the Launch button. So I
click here. Now we wait a couple of seconds, and now we have here the interface of Jupyter Notebook. So
I'm gonna maximize this. By default, Jupyter Notebook opens the root directory of your
computer. It's a good idea to create a folder where all your Python scripts will be located. In
my case, this folder is called Anaconda scripts. So I click here, and now I can navigate through
the folders. The folder I'm going to use for this example is this one that says my course. Here,
we're going to create our first Python script. To do that, we click here on the New button. So
click here, and we have to click on the first option that says Python 3. There are other
options like Text File, Folder, or Terminal, but we're not going to use these options
in this course. So click on Python 3. And now we have a Python script powered
by Jupyter Notebook. Here on the right, you can see that it says Python 3, and also
there is the Python logo. On the left, you can see the Jupyter Notebook logo, and also
the name of this Jupyter Notebook file. We can change the name of the file by clicking here on
Untitled. So I click here, and I can change it to, let's say, example. So I write example, and I click
on Rename, and now we've renamed this Jupyter Notebook file. Alright, now let's navigate through this menu bar
that we have here in this Jupyter Notebook file. So the first option is File. In here, we can
create a new notebook with Python 3. So if we click here, we're going to open a new Jupyter
Notebook file from scratch, as we did before. Then we have Open, and in this case, we can
open a Jupyter Notebook we created before. We can also make a copy of a Jupyter Notebook and then
change the name. We can save a Jupyter Notebook file and rename the file as we did before: we only
click here and rename the file. Then we can save all the progress we make in Jupyter Notebook.
For example, after writing many lines of code, you can save all the progress you make by pressing
Ctrl+S, or Command+S on Mac, and you're going to create a checkpoint. Later you can revert to
a previous checkpoint by using this option here. So here you will see many checkpoints, and you
can revert to a previous checkpoint. By the way, by default Jupyter Notebook autosaves
every thirty seconds or maybe one minute, so there is no need to press Ctrl+S every time. So
keep that in mind. Then we have other options that I don't use so much, like print this Jupyter
Notebook or export the Jupyter Notebook file to HTML or PDF and so on. Okay, now
let's see the second option that says Edit. Here we can edit all the cells we have
in this Jupyter Notebook. By the way, what you see here on the screen is a cell. So we
can edit it with this Edit option. For example, we can cut cells, copy cells, paste cells above,
and delete cells. On the right, you can see the shortcuts that we're going to see in the next
video in detail. And, well, you can check all the edit options that you can perform on Jupyter
Notebook here. Then in the View option, we can toggle the header, the toolbar, and also line
numbers. So here, if I click on Toggle Header, the header is going to disappear. And if I click
on Toggle Toolbar, this toolbar disappears too. Also here, with Toggle Line Numbers, we can show
line numbers. So if I write anything, we can see that it says 1, 2, 3, and so on.
I'm not going to use this for this course; I'm going to leave it with the default
options. So here I'm going to revert to the original options: without line numbers, and
I want to show the header and also the toolbar, but you can personalize it as you want. Next, in
the Insert options, we can insert cells above or below. We only click here. And, well, we're going
to see the shortcuts later in the next video. Then we have the Cell options. We can run cells or
run all the cells in this Jupyter Notebook file. And then we have the Kernel option. A kernel
is a computational engine that executes the code contained in a notebook document. When we open
Jupyter Notebook, a kernel is automatically launched. We can interrupt this kernel by
clicking here. So by interrupting, we can stop the execution of our code. We can also restart
everything and do more things here. Sometimes, for example, I interrupt the kernel when a line
of code or a cell takes too much time to execute. And, well, you can do the same here with Restart or
Interrupt. Then we have the Navigate option, which doesn't actually have anything here; Widgets,
which I don't use so much; and Help, which I think will send you to the documentation
of Jupyter Notebook. You can read it if you want. All right. Then we have the toolbar,
and here you will find some shortcuts for the menu bar options that we've seen before. For example,
here, you can save and make a checkpoint. So I click here, and as you can see, it says
checkpoint created and the time that it was created. Then, with this plus button, you can
insert a cell below. So I click here, and as you can see, we can insert a cell below. You
can also use shortcuts, but that I'm going to show you in the next video. Then we can cut
selected cells with this button, we can copy a cell with this button, and we can also paste
cells below. Also, we can move a cell above or below. For example, I'm going to write anything
here in this cell, and I can move it above with this button, or below, as you can see here. Then
we can run this code. For example, I can write the number one and then run the code. As you can
see here, the code ran and it shows the number one. And, well, those are some of the frequently
used buttons in the toolbar. That's everything you need to know about this Jupyter Notebook
file. Okay, now, before finishing this video, I'm going to show
you some other options that you can find here in the Jupyter Notebook interface. In here, you can see
that there are some other options. Right now we are in the Files tab, and we can change to
the Running tab here. Here you can see all the currently running Jupyter Notebook processes.
For example, we can see here the Jupyter Notebook file we created and opened. You can
recognize that a Jupyter Notebook file is open, or that it is running, because the icon will be
green. So if we go back to the Files tab, we can see that this Jupyter Notebook file, which
by the way has the .ipynb extension, has a green icon. This indicates
that the file is running and, well, that it was opened. So here we can see that it is open, and we can shut
down this file. This is different from closing this file. For example, here I have the file.
And if I close this file, we can see that the file is still running. Here we see Running
is in green, and in the Running tab, it still shows up. So if we want to shut down
this file, we click here. And it says that there are no notebooks running. And we can see here
that the notebook has a gray icon. Alright, then we have the Clusters tab, and this tab I
don't use so much. Actually, it doesn't show anything here. And then we have the Nbextensions
tab. Here, you can install any extension to personalize Jupyter Notebook even more, and we're
going to see some cool Jupyter Notebook extensions in the next videos. By the way, this Nbextensions
tab doesn't show up in some versions of Jupyter Notebook, but we can easily install
it, and we'll also see how to install this Nbextensions tab in the next videos. Finally, we
have this box that shows our directory. So here this folder indicates the root directory. If
I click here, we are now in the root. And if I click on the folders, Anaconda scripts and then my
course, I go to the folder where I was before. And that's it. These are all the things you need to
know about the Jupyter Notebook interface.

Okay, in this video, we're gonna see some cell types and
cell modes in Jupyter Notebook. So first, we're going to open the Jupyter Notebook file that we
created in the previous video, which is this one, example.ipynb. So we click on it, and
here we have the Jupyter Notebook file opened. In here, by default, we have this first cell in
command mode. We can tell that this is command mode because this blue color indicates
that the cell is in command mode. When we are in command mode, we can do things outside
the scope of any individual cell. So basically all the tools we see here in the toolbar, we can
apply in command mode. Also in command mode, we can apply some shortcuts that I'm going to show
you later. For example, if we want to see the shortcut window, we press the letter H in command
mode, and we can see the keyboard shortcuts here. So here you can see all the
shortcuts that you can apply in command mode. Now I'm going to close this one. And also you can
apply different shortcuts. For example, if you press B in command mode, you will
see that there is a new cell, because B is the shortcut that inserts a new cell below. Now,
if we press Enter, you're going to see that the color is going to change to green. So here we have
a green color, and this green color indicates that we are in edit mode. Edit mode is for
all the actions you will usually perform in the context of the cell, for example, introducing
text or writing code. So here I can write, say, 123. If I write 123 and then click
on this Run button, I'm going to run this cell. And as you can see here, I ran this first
cell. Also, after running the cell, you can see that we are again in command mode.
So to go to edit mode, we press Enter again, and now we can edit the numbers we introduced. For
example, I can write 456 and then run again. And here you can see that the output shows
123456. By the way, if you try to use the shortcuts in edit mode, they won't work. Here, I press Enter, and
now I'm in edit mode. If I press the letter H, you can see that nothing happens: we don't have
the shortcut window. And if I press the letter B, you can see that we don't insert any cell below.
This happens because those shortcuts work only in command mode. So to escape edit mode, we have
to press the Escape key. So I press Escape, and now I'm again in command mode. If I press H, we
have here the keyboard shortcuts. And if I press B, you can see that we inserted a new cell. And
that's it for command and edit modes. Now we'll see the cell types in
Jupyter Notebook. There are three main cell types, and we can see all of them in this dropdown here.
Right now the type of this cell is Code. So here it says Code. But if we press here, you can see
other cell types like Markdown and Raw NBConvert. So we're gonna see first a code cell,
and it already has the check, so this one is a code cell. So now I press here, and now, well, it's
a code cell. If I press Enter, I'm in edit mode, and here I can introduce any code I want. So here
I can write any number, 99. If I press Control+Enter, we can see that here, this is the input,
and here we got the output of this code. We're going to see how the code cell works throughout
this course. But now it's time to see how the Markdown cell works in Jupyter Notebook. So here,
I'm going to this cell. Now I'm going to change the cell type. So I press here in the dropdown,
and now I select Markdown. In the Markdown cell, we can introduce any type of text we want. For
example, we can introduce titles. So if I delete this and press the hash sign, we can get a title. So
one hash means title. So here I press space, and now I write title. Now I press Ctrl+Enter
or this Run button to run the cell. And here, we got the title. By the way, you shouldn't get
this number one, because I modified the default behavior of Jupyter Notebook. So mine numbers
the titles and subtitles, but in your case, you will see only the word title. And if you want, you
can also introduce subtitles here. So for example, I'm going to insert a new cell with this
plus button. And now I'm going to move this cell up with this button here. So I press this,
and now I'm going to change the cell type from code cell to Markdown cell. So I go to the
dropdown and select Markdown. By the way, you can also change the cell type with shortcuts.
So if you're in command mode, you can press the Y key to change to a code cell. So I press
the Y key. And as you can see here, it says In, and this In with square brackets indicates that
this is a code cell. So here I can press Enter and introduce any code. Here I introduce numbers and
press the Run button. And here you can see that we have an input and an output. So this is a
code cell. But now we can press the M key to make this cell a Markdown cell. So now
we press M, and here we are in command mode. So now we get this Markdown cell here. You
don't see that In word with the square brackets anymore. So now I'm going to edit mode, so I
just click here, or, well, you can press Enter to go to edit mode. In order to introduce a subtitle,
I'm gonna write a double hash sign. So I press the hash sign twice. Now I press space, and now I'm
going to write a subtitle. So I write subtitle, and I press Ctrl+Enter or the Run button to
run the cell. And we've got here a subtitle. We can also introduce text. I'm going
to insert a new cell with the plus button. And you can also do it with the B shortcut.
I'm going to do it with the B shortcut right now. I press B, and here I got this new cell.
And we can move this with this button here. And now we have this cell in the position we want
it. So here, I can introduce text by converting the cell to Markdown. So here, I choose Markdown.
Now you press Enter to go to edit mode. And here I can introduce any text. For example, I can write
hello. I press Control+Enter, and now we can see that we have here this text. And finally, the last
type of cell is the Raw NBConvert. This type of cell is not evaluated by the notebook
kernel. So if we convert this code cell to a raw cell, this cell won't be evaluated by the notebook
kernel. So let's try. Here, I press Raw NBConvert. Now we can see that this looks like a
plain cell. And, well, this type of cell is not used that often. Actually, we're going to use only
the code cell and the Markdown cell in this course. And that's it. In this video, you learned the
cell types and cell modes in Jupyter Notebook.

Okay, in this video, we're going to see
some common shortcuts used in Jupyter Notebook. We're going to start with the F
shortcut. By the way, to use this shortcut, you have to make sure you're in
command mode, and to verify you're in command mode, make sure that the cell has this blue
color. Okay, now while in command mode, you can press the letter F, and you're going to
see this Find and Replace. So this first shortcut allows us to find a word in a cell and then
replace it with another word. For example, I can write here the word hello. And
here, it found the word hello inside this hello world sentence. And now I can replace
this word with the word, say, hi, for example. So here, I write hi. In red, we can see the match,
and in green, we can see the word that we're going to insert. So here, let's click on Replace All.
And now you can see that it doesn't say hello world anymore. Now it says hi world. So now
I press Ctrl+Enter, which is another shortcut, to run the cell. So you can press here on Run
or only press Ctrl+Enter to run this cell. So I press Control+Enter, and now we ran this cell. And
another way to run cells is to press Shift+Enter. But in this case, we're going to run and insert
a new cell below. So now let's see. I press Shift+Enter, and note here, it ran the cell because
now it says In with a three inside square brackets. And here, we can see that we have a new cell. Okay,
now another pair of shortcuts that is often used is the Y and M shortcuts. So now this cell is a code
cell. If we want to make this a Markdown cell, we only have to press the letter M. So we
press M, and this is going to be converted to a Markdown cell. And if we press the letter Y,
this is going to be converted to a code cell. Also, you can change the heading here: you can
make the heading bigger or smaller. So here, I'm going to locate the cell, and now to make
this one smaller, we can press the numbers. So if we press the number two, we can see that
this one gets smaller. And if I press number three, the title gets smaller and smaller, and
so on. So as you can see, the more hash signs, the smaller the text. So here I'm going to delete
these hash signs. One hash sign represents the biggest font size, which is the title. So
now we press Ctrl+Enter, and now we have this in heading one. But if I press number five and
then press Control+Enter, we can see that now this cell has heading five and it's smaller. So now I'm
going to revert to heading one. So you press one and then Ctrl+Enter. Okay, now we can navigate
through the cells by pressing the up or down keys on our keyboard. As you can see here,
we can navigate through all the cells, or we can also click with the mouse on
the cells we want. Okay, now we can insert a new cell above by pressing the A key. So if I press A,
we get here a new cell above, and if I press B, we get a new cell below. Now if I press
X, we're going to cut the cell. So I press X, and you can see that the cell was cut. Now if we
press V, we paste that cell below. So I press V, and now we got the cell. And if I press Shift+V,
we get the cell pasted above. So I press Shift and V, and we get this new cell above this cell I have
here. Okay, now I can delete cells by pressing D twice. So I press D two times, and as you
can see here, that title disappeared. So now I try again, and we don't have the title
anymore. But now if we press the letter Z, we can undo those changes. So let's undo what we
did before. I press Z, and we get here the title back. Okay, another useful shortcut is Ctrl+S,
which allows us to save the changes we made in this Jupyter Notebook file. So I press Ctrl+S, and you
can see here that it says checkpoint created. So I'm going to press Ctrl+S again, and here it says
checkpoint created, and here it also says the time. And that's it. These are some of the most common
shortcuts used in Jupyter Notebook. But you can see other shortcuts by pressing the letter H.
So press H, and here you can see more keyboard shortcuts. Or you can also go here to Help and
then go to Keyboard Shortcuts, and you get the same window. So here you can see a list of
shortcuts for command mode and also for edit mode. You can see the description of a shortcut,
and also how to do it in your operating system.

One of the typical ways to get started
with a programming language like Python is printing a simple message. You can write any
message you want, but it's traditional among coders to start with a Hello World. So let's try
it. Let's print our first message using the print function. The print function prints a message to
the screen. So I'm going to write here, print. And then I'm going to open parentheses.
Every time we use a function in Python, we have to open parentheses, well, in this case
for the print function. And as you can see here, the functions get a green color in Jupyter
Notebook. So that's how you can identify them. So inside these parentheses, I'm going to write the
message. So in this case, it's going to be Hello, World. So this is our first message.
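As a minimal sketch of the step just described, the first cell would hold a single call to the built-in print function. The exact punctuation of the message typed in the video is an assumption here.

```python
# print() is a built-in function: the message goes inside the parentheses,
# wrapped in quotes so that Python treats it as text.
print("Hello, World!")
```

Running the cell shows the message right below it.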
Now, to execute this first line of code, we have to press Ctrl and Enter, or Command and
Enter if you're on Mac. So I'm going to press this. And as you can see here, we have our first
Hello World. Another way to run this first cell is pressing here on the Run button. It's going to
have the same effect. So I press it and it ran. As you can see here, it says In, which
represents a code cell, and this is a Markdown cell, as we've seen before. One of the advantages
that Jupyter Notebook has is that it allows us to print the last object in a code cell without
specifying the print function. So for example, here, I can print this Hello World without
writing the print function. So I'm going to copy this Hello World message that is inside quotes.
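A rough sketch of the difference, assuming you might also try this outside a notebook: a notebook displays the value of the last expression in a cell using its repr, so a bare string appears with quotes around it. A plain Python script shows nothing for a bare expression, so the sketch prints the repr explicitly to imitate what the notebook would display.

```python
# print() shows the text itself, without the quotes.
print("Hello World")

# In a notebook, a bare expression on the last line of a cell is displayed
# using its repr(), so the string appears with quotes around it.
# Outside a notebook nothing is shown, so repr() is printed explicitly here.
print(repr("Hello World"))
```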
And I'm gonna run this code. So just Ctrl+Enter. And as you can see here, we have this message
printed. So this is one of the advantages that Jupyter Notebook has. If you do this in another Python
IDE, it won't work. So here you can try it yourself. You can write any message you want. Apart from
the first Hello World, you can try with your name. So we write print and then parentheses, and we
open quotes, because we need to define a string. I'm going to tell you about strings a little bit
later, but just so you know right now. And here, for example, I can write my name. So my
name is Frank, and I can print my name. Then I can also print numbers. So I print my age, 26.
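The two prints just mentioned could look like the sketch below. The values are the ones said aloud in the video; the layout of the cell is an assumption.

```python
# Text needs quotes because it is a string.
print("Frank")

# Numbers can be printed directly, without quotes.
print(26)
```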
And it's gonna work too. Besides writing code, you can also add comments. Comments are a useful
way to describe what we're doing in our code. So here, we can use comments. We just have to write
the hash sign, which is this one. So you write the hash sign, and then you write the comment. In this
case, I'm gonna write my name, and I'm going to say, printing my name, so we know what our code is
doing. Here in the first message we wrote, we can also add a comment. So we write the hash sign, and
then we can say, printing my first message. As you can see here, the comments also have a different
color. So far, we have three colors: this color for the comments, green for
functions, and red for strings. This is just a useful functionality most text
editors have that allows us to easily read code.

Okay, now let's see some data types in
Python. Every value in Python is an object, and objects have different data types. Let's
see the most common data types in Python. Some of the most common data types in
Python are integers and floats. Both are numbers, but integers are numbers that can be written
without a fractional component, for example, the numbers 1, 2, 3, 4, 5, and so
on. All of them are integers. We can check this data type by using
the type function. This is the second function we're going to see. So we
write type, then parentheses, and we run this code. As
you can see here in the output, it says int, which represents integer, so this is an
integer. Okay, the second type of data I want to show you is float. Floats are numbers that contain
a decimal point, so basically 2.3, let's say 1.25, 5.4, and so on. So here we have another
type of data. Let's check if these are actually floats. So we use type and then
parentheses, and we run this code, and we see that we have float. Just like in Excel, you
can perform math operations in Python using these numbers. Some operations you can use are
addition, for example, you can say one plus two and then execute this code, and you get three. You
can use subtraction, so four minus one, and you run this code and you get three.
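The checks and operations described so far can be sketched as follows. The variable names `total` and `difference` are hypothetical and only there to hold the results. In a notebook a bare `type(1)` displays `int`, while `print` shows the longer class form.

```python
# type() reports the data type of a value.
print(type(1))     # integer: no fractional part
print(type(2.3))   # float: has a decimal point

# Basic arithmetic works much like a formula in a spreadsheet cell.
total = 1 + 2        # addition
difference = 4 - 1   # subtraction
print(total, difference)
```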
You can also do multiplication, division, exponents, and more in Python. But now let's see the
third data type that we will see often in Python, and it's the Boolean. Booleans are True or False
values. We can check this using, again, the type function. We write type, and
within parentheses, we write, for example, True. We run this code and we see that we got
bool, which represents the Boolean data type. We can also write type and, in this case, False,
and run this code, and we get bool again, so this is a Boolean. We're going to use Booleans
often when we use conditionals. Okay, now the fourth data type I want to show you, and it's very
common, is the string. A string represents a series of characters, and in Python, anything inside
quotes, either single quotes or double quotes, is a string. So let's see them. Actually, we already
saw one kind of string here when we printed this Hello, World, so you're actually familiar with
this, but we're going to see it again. To create a string, we have to open either single or
double quotes. In this case, I'm going to use double quotes, so you see it now. And now I'm
going to write any message. So I'm going to write, for example, again, Hello World. And again, to
verify the type, we can use the type function, parentheses, run this code, and we get str,
which represents a string. One cool thing a string has is methods. We can apply different
functions to strings, as we would do in Microsoft Excel, for example. However, in Python, we use
methods. A method is a function that belongs to an object. To call a method, we use the dot
sign after the object. Let's see some string methods to change the case of text. So here,
I'm gonna write again Hello World, but now I'm going to use some string methods. So I write Hello
World. In this case, I'm going to use the upper method to make this uppercase. I was going to
use the print function, but actually, we don't need to use the print function because, as I told
you before, in Jupyter Notebook we don't need to use print, because it automatically prints
the last line of code. So since this is the only line of code in this cell, it's going
to print it automatically. So we just run this cell, and we have HELLO WORLD in uppercase. So
as you might expect, we can also change the case of the text to other cases. In this case, it can be
lowercase or title case. So I'm gonna just copy and paste this twice. And here, I'm going
to write, instead of upper, lower, and then title, so you can see how it's going
to change the case. So here, I'm going to run, and let's see what happens. As you can
see here, it only printed the last one, because, as I told you before, it only prints the last
one. If we want to print the three of them, we have two options. We can maybe cut and
paste each one into its own cell, or what we can do is print each of them. So here, for example, I can do
print here, and I can do the same for them. So instead of using more cells,
we can print all of them. And here, we can print this one too. Actually, we
don't need it, because it's going to print the last line. But just for the
sake of this video, I'm going to print the three of them. So here, I'm gonna run this code. And as
you can see here, the first has uppercase, the second has lowercase, and the third has
title case. So that's how you do it in Python. Another string method that you can find in Python is
the count method. So I'm going to delete this,   and actually this one too. And we're going to see 
this now. So first, I copy this. And now I paste   it here. And here, I'm going to use the count. So 
the count method, so I write count. And then here   I open single quotes, and I write the letter 
that we want to count. So here, for example,   I'm going to write that l letter. And what 
this string method is going to do is going   to count how many times these l letter is 
included in this string. So as we can see,   there are two L's, so it should set two times. So 
I run these code, and actually is three because   there are two in kilo and one in world. So I was 
wrong. And here, another string method that you   can use is the replace method. So we can replace 
one letter for another. So here, let me copy this,   and I'm going to paste it here. And instead 
of writing count, I can write replaced.   So here, the first letter that we're going to 
see here is the letter that we want to replace.   So in this case, I'm going to change the L with 
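As a rough sketch, the string methods in this part of the walkthrough (upper, lower, title, count, and the replace call being built at this point) look like this in a cell. The exact capitalization of the example string is an assumption, and the variable names are only for illustration.

```python
# Assumed example string; the capitalization used in the video is a guess.
text = "Hello World"

upper_text = text.upper()   # every letter in uppercase
lower_text = text.lower()   # every letter in lowercase
title_text = text.title()   # first letter of each word in uppercase

# print() each result; without it, a notebook shows only the last expression
print(upper_text)
print(lower_text)
print(title_text)

# count(sub): how many times the substring occurs in the string
l_count = text.count("l")
print(l_count)

# replace(old, new): the first argument is what to replace, the second is the replacement
replaced = text.replace("o", "u")
print(replaced)
```

None of these methods changes the original string; each one returns a new string (or, for count, a number).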
O. And the second letter is the letter that you   want to put in that string. So I'm going to use 
the U. So I'm going to change every time that   an O appears here in the string, we're going to 
replace it for you vowel. So let's try it. So   I run this code. And now it says, Well, hello 
world, but with you. And these are some of the   most common string methods in Python. Okay, now 
it's time to learn something that you're gonna see   often in Python, which are variables, variables 
help us store data values. In Python, we often   work with data. So variables are useful to manage 
this data properly. A variable contains a value,   which is the information associated with a 
variable to assign a value to a variable,   we use that equal sign. So let's create 
a message that says, I'm learning Python,   and stored in a variable called message underscore 
one. So here, I write message underscore one. And   we set it to that is string. I'm learning Python. 
So are you open double quotes in here I Right, I'm   learning Python. So this is string. We've 
seen this before. And this is the variable,   and we assign this value to the variable 
using the equal sign. Now I'm going to run   this. And as you can see, nothing happens. 
But actually, we just assigned that string   to the variable message underscore one. Now, if we 
want to obtain the message, I'm learning Python,   we only have to type the variable name, and then 
execute that code. So I'm gonna copy and paste it   here. And then we run this code. And as you can 
see, by running this cell, we obtain the content   inside the variable message underscore one, we 
can create as many variables as we want, just make   sure to sign different names to new variables. So 
let's create a new message that says, It's fine   and stored in a variable called message underscore 
two. So first, I write message. So Ms search   and underscore two, and then we set this equal to   open double quotes, and right, and it's fun. 
This is my second variable, and I'm gonna run   this cell. So as we can see, the string was 
assigned to the second variable. And if I   copy and paste this variable here and run this 
code, we can see that the message it's there. By   the way, if you're using single quotes, instead 
of double quotes, or some using in this video,   probably you have the following issue. 
So here, I'm going to copy this one   and paste it here so you can see what I'm talking 
about. So let's say you're let's say you're using   single quotes, instead of double quotes. So you 
get this, this is a problem that you will have   when using single quotes. Because in the 
English language, we use this apostrophes often.   So a simple way to deal with this is using 
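A minimal sketch of the two variables and of the quoting issue, assuming the message texts are exactly as spoken. The backslash-escaped variant at the end is an addition that the video does not show.

```python
# Double quotes let the apostrophe sit inside the string without any trouble
message_1 = "I'm learning Python"
message_2 = "and it's fun"

print(message_1)
print(message_2)

# With single quotes, the apostrophe would end the string early:
# message_1 = 'I'm learning Python'   # SyntaxError if uncommented

# Not shown in the video: a backslash escape also works inside single quotes
escaped = 'I\'m learning Python'
print(escaped == message_1)
```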
double quotes. So as you can see here, if I use   double quotes, everything is okay. Everything 
remains as a string. But with single quotes,   it doesn't happen. So only the i gets this 
string by dress, it doesn't get a string value,   or the string datatype. So just make sure you 
use double quotes every time you have these   apostrophes, and that's it. Okay, now, let's put 
these two messages together. So message one with   message two, I want to put them together. 
So this is called a string concatenation.   If we want to put message one, in message two,   together, we can use the plus operator. And we can 
just do this. So I'm going to copy message one,   or the variable message one. And now I'm going 
to copy the variable message underscore two.   And I use the plus in the middle to concatenate 
this first message with this second message.   So run it, let's see what happens. So here, we 
can see that the two messages were concatenated.   But here, there isn't a space between these two 
messages. So this is the first message and this   is the second and there isn't any blank space 
in the middle. So what we can do here is to   just add a blank space. So I'm going to copy 
this one and paste it here and show you how to   do it. So here I add a new plus operator in the 
middle, we open A string. So with single quotes,   or double quotes, in this case, I'm going to use 
single quotes here, integrate this blank space,   I'm going to press a space. And here we have 
our blank space here. And then we run this code.   And now let's see. And here as we can see, 
there's a space between Python and that. And   we have this blank space. And we want we can 
assign this new message to a new variable. So   I'm gonna assign this to a variable called 
message. And I write message here. And I   include here below the code in here, I can print 
this so as you can see, if I run this, we can   see that the message is there. Okay, now let me 
show you an alternative way to join two strings.   So this is called the F string, and it works 
like this. You write F and you open A string, so   we write a single quotes here. So one and two in 
here. As you can see the whole, the whole thing is   red. So it's like everything is a string in here 
inside, we can write the message. So let's see,   let's say we write a simple Hello World. So hello, 
world. And we run this. And as you can see here,   this is a string, it just has this F, in front of 
that string. In here, one of the advantages that   this f string has is that it can have variables 
inside the string. So here, for example, we can   write a variable opening these curly braces. So 
these collaborations can have variables inside it.   So here, I can write message, underscore 
one, and we can print it. So if we print,   we have this string, I'm learning Python a now 
if we want to concatenate this first message with   our second message, we just have to include 
curly braces, again, I put it here. And now   I write message two. And between message one 
in message two, I just have to press pace.   And we have this. So I'm learning Python, and 
it's fun. So here, we just press is pace. And this   pace also appears here. So for example, if we add 
some random text, let's say ABC, we get this ABC,   between Python in between. So this is how f 
string works, do you just have to write the F,   then open single quotes, and inside you can write 
any message. And to include any variable, just   you have to open these curly braces, write the 
variable name, and that's how to join strings.   Okay, now it's time to see a data type 
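A short sketch of the f-string approach, assuming the same two messages. The names joined and with_text are illustrative.

```python
message_1 = "I'm learning Python"
message_2 = "and it's fun"

# Variables go inside curly braces; text outside the braces is kept as typed
joined = f'{message_1} {message_2}'
print(joined)

# Any literal text between the braces shows up in the result
with_text = f'{message_1} ABC {message_2}'
print(with_text)
```

Compared with the plus operator, the space and any extra text are written directly inside the string, which tends to be easier to read when several variables are involved.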
that is used. Often in data analysis,   I'm talking about this. In Python lists are used 
to store multiple items in a single variable   list are order and mutable containers. In Python, 
we call mutable, two objects that can change their   values, that is, elements within LA's can change 
their values to great Alice, we have to introduce   that element inside the square brackets separated 
by commas. So let's create our first list.   First, we have to set the name of the list. 
In this case, I'm going to name it countries.   And now to create the list, we have to open 
square brackets SAS said before. So here,   we open square brackets. And here we have 
to write the elements. So I'm gonna include   in these countries list just strings, and they're 
going to be names of countries. So the first one,   I'm going to write the United States. So 
this is the first element in my list. And to   write the second, we have to use the comment. So 
here, comma, and now the second. So let's write   India, tomorrow. So now China, and finally 
Brazil. So these are the four countries,   as you can see here, these are lists. So we have 
the square brackets that represent the list.   And we have four strings. And this is how we 
define or how you create a list. So now I'm going   to run this one. And to see the content, I'm going 
to paste the name of this list in now I run here,   I include only strings. But keep in mind that 
lists can have elements of different types.   So for example, one string and the other 
and integer, and then a float, and so on.   And also lists can have duplicated elements. 
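A minimal sketch of creating the list. The mixed list and its values are hypothetical and only illustrate that element types can differ.

```python
# Four strings inside square brackets, separated by commas
countries = ["United States", "India", "China", "Brazil"]
print(countries)

# Hypothetical example: a string, an integer, and a float in the same list
mixed = ["United States", 1, 3.5]
print(mixed)
```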
So for example, I can have here, United States,   written twice. So here, for example, I can write 
United States, twice m, that's okay, because this   can have duplicate elements, but I don't want 
it that way. So I'm gonna delete it and leave it   as it is. Okay. Now, if we want to get an element 
inside this list, we have to use something called   indexing. By indexing, we can obtain an element by 
its position. So each item in a list has an index,   which is the position in the list. Python uses 
zero based indexing, that is the first element   so United States has an index zero, the second So 
India has an index one, and so on. To access an   element by its index, we need to use the square 
brackets again. So let's see some examples.   Let's start by getting the first element. 
So United States. So what we have to do is   to write the name of the list, in this case 
countries, and then open square brackets,   in inside square brackets, we have to write the 
position of this element. So it starts with zero,   so we write zero to get that first element. 
And then we run this code. And as you can see,   we get the first element. So if we write 
here, countries square brackets, one,   we get India. And if we write countries square 
brackets to we get China, and we do this,   with the number three, we get Brazil. So to 
verify this, I'm gonna print each of them.   So let's see what happens. So here print. And 
finally print this one. In now I'm going to run   and we shall get each element of the list from the 
United States to Brazil. So let's try out. So here   we have each of them, United States, the first 
one, then India, then China, and then Brazil. So   it's correct. So this is the most common way to 
use indexing, but there is also negative index,   this helps us get elements is starting 
on the last position of the list. So   instead of using indexes from zero and above, 
we'll use indexes from minus one and below.   So let's get the last element of the list. 
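Both kinds of indexing can be sketched on the four-element countries list. The variable names on the left are illustrative.

```python
countries = ["United States", "India", "China", "Brazil"]

# Zero-based indexing: positions count up from 0
first = countries[0]
second = countries[1]
third = countries[2]
fourth = countries[3]
print(first, second, third, fourth)

# Negative indexing: positions count back from -1 (the last element)
last = countries[-1]
first_again = countries[-4]
print(last, first_again)
```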
But now using a negative index, so we want   to get the last element which is Brazil. And 
we did it before with countries square brackets   three. But now we're going to do it with negative 
indexing. So here, I'm going to write countries   and copy and paste it here. And now I open 
square brackets. And instead of writing three,   we're going to write minus one. And these minus 
one represents the first element is Talend. From   the last position to Brazil will be minus one, 
China is minus two, India minus three United   States minus four. And that's how it works. So I'm 
going to run this one countries, square brackets,   minus one, and we will get Brazil and we got it. 
So let's do this one more time. And in this case,   I want to get United States, which is minus 123, 
and four, so it's countries minus four. So we run   this and we got United States, but now using a 
negative index. Okay, now let's see something   called as slicing is slicing means accessing parts 
of the list, as lies is a subset of list elements   is slice notation takes the form of a list. 
So the list name, and then a square brackets   and this tart, then this colon and stop this is 
Todd represents the index of the first element in   his top represents that element to stop at without 
including it in the slides. So let's see some   examples. So I'm going to use this country's list 
again, and use I'm going to copy this one, and   I'm going to paste it here. So this is the name 
of my list. And now I open square brackets. And   we're going to get, let's say, we're going to 
talk at position number zero, and then column   and let's get from zero to position number two, 
so we have to write three, because it stops   at three without including these elements in the 
position number three. So let's run this one. And   as you can see, here, we have index zero, index 
one and index two. So it didn't include index   number three, you know, let's say we want just 
the first element, so we write from zero to one.   So it's only zero and one no, because it doesn't 
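The slice notation list[start:stop] can be sketched as follows on the same four-element list. The last two lines show the open-ended forms, where start or stop is left out. Variable names are illustrative.

```python
countries = ["United States", "India", "China", "Brazil"]

# stop is excluded: indexes 0, 1, and 2
first_three = countries[0:3]
print(first_three)

# only index 0, because the slice stops before index 1
first_only = countries[0:1]
print(first_only)

# from index 1 up to (not including) index 4
from_second = countries[1:4]
print(from_second)

# omitting stop runs to the end; omitting start begins at index 0
to_the_end = countries[1:]
first_two = countries[:2]
print(to_the_end)
print(first_two)
```

Note that a slice always returns a list, even when it contains a single element, whereas indexing returns the element itself.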
include one, and it's topped at one. So here I   run, and we got only United States. So now let's 
try something different. Let's say we want to get   elements from index one to the last one. So let's 
say let me see here. We want to get from India to   Brazil. So it's one two and three. So we have to 
write four because it stops at four and we got   three. So let's write here 124 English 
we'll get Yeah, India, China, and Brazil. So   this is one way to do it. But another 
way to do it is just delete this and   leave it as it is and then run the code. And 
as we can see, we got the same result. So   every time you want to get from one position 
to the last one, you can omit the top element,   and just leave it without that element. 
So just as we did here, and the same   goes for the start. So let's say we want to get 
from the first position, so index zero to two. So   we don't include that start element, and we write 
only colon, and two. So we're on this, and we get   United States. And then we get India, because this 
is the first and this is the second. So every time   we want to get from the first element, or into the 
last element, we can omit that target and its top   elements, as we did in these two examples. Okay, 
now let's see how we can add elements to a list.   There are different methods that help us add 
a new element two lists. So let's have a look.   The first one is called append. And we're going 
to use that counters list as an example. So I'm   going to write countries just so you can remember. 
And here it's countries. And as you can see,   it has four elements. And let's say we want 
to add any country to this country's list. So   what we can do is just right here, or paste 
here, countries, you know, add, append, or that   append in here, as you can see is this method. So 
inside parentheses, we can write the new country,   we want to add to this list. So let's 
say we want to add the country Canada.   So write Canada. And now we'll run this 
code. As you can see, nothing is printed,   but it will print the counters list again, we 
see here a new element. So as you can see here,   that append method adds a new element at the end 
of the list. So this is by default at the end. But   what happens if you want to add an element in a 
different position. So here, you can use another   method, which is called that insert method. So 
let me show you here, I'm gonna copy countries,   you know, I'm going to use the Insert method. So I 
write that insert, then parentheses, and this one   accepts two arguments. The first one is the index. 
So the position of the element do want to insert.   So let's say we want these are the first position. 
And the second argument that it takes is the   new element do want to add. So in this case, 
let's say we want to add that Elements pane,   so these, another country, and it's going 
to be in the first position, so index zero.   So let's try I run this one. And again, nothing 
happens, apparently, nothing happens. And here,   if I run this country's list, again, we can see 
that there is a new element, and this element is   pain. It's located in the first position. Unlike 
Canada, that was placed in the last position.   This is one of the difference between the append 
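A minimal sketch of append and insert, starting from the four-element countries list.

```python
countries = ["United States", "India", "China", "Brazil"]

# append adds the new element at the end; the call itself returns None
countries.append("Canada")
print(countries)

# insert(index, element) places the element at the given position
countries.insert(0, "Spain")
print(countries)
```

Both methods modify the list in place, which is why running the cell shows no output until the list is displayed again.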
method and the insert method. So with insert,   we can specify the position, we want to 
insert this new element, but with append,   the element is added at the last 
position. Another thing you can do is to   join two lists, using the plus operator would use 
the task operator to concatenate strings before   but you can also join two lists. So let me show 
you here. I'm going to create a new list just to   show you how it works. So my new list is going to 
be called countries underscore two. So I'm gonna   include different countries. So in this case, it's 
going to be the UK, then Germany am. That's right,   Austria. So we have three countries in this new 
list. And now I'm going to run this one. And   if we want to concatenate these first 
list countries, with the second list,   countries to We can use the plus 
operator. So here, I write plus.   And then I run this one. And as you can see, 
I got five elements from the first list.   And three elements from the second list in 
another cool thing you can do in Python is   putting these two lists inside another list, 
which is called nested list. So let's try out.   So here, I'm gonna create a new list, it's 
gonna be called nested underscore list.   In here, I'm going to open square brackets 
to create a new list. And as elements, I'm   going to write countries, which is my first list, 
and then comma, and then countries underscore two.   And this is my second list. So as you can 
see here, these elements inside this list,   the first is a list in the second is the list. 
So we have lists inside another list, which is   called a nested list. So I run this one, and then 
I paste nested underscore list, and we run and we   get here. The first is as first element and the 
second list as second element, you won't see these   nested lists so often, but you will encounter this 
a couple of times, so it's good for you to know.   So now we're going to say the opposite of 
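Joining and nesting lists can be sketched like this. For brevity the sketch starts from the original four-element countries list rather than the modified one on screen, and the name combined is illustrative.

```python
# Assumption: the original four countries, not the list as modified in the video
countries = ["United States", "India", "China", "Brazil"]
countries_2 = ["UK", "Germany", "Austria"]

# The plus operator builds a new list with the elements of both
combined = countries + countries_2
print(combined)

# A nested list keeps each list as a single element
nested_list = [countries, countries_2]
print(nested_list)
print(len(nested_list))
```

The difference shows up in the length: the combined list has one entry per country, while the nested list has only two entries, each of which is itself a list.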
adding an element to a list, which is removing   an element. So here, I guess, pasted the country 
slate we had before. And what we're going to do is   to remove some of the elements of this list. So 
there are different methods that help us remove   an element from the list. One of them is the 
remove method. So to remove an element using this,   we have to first write the name of the list, and 
then use that that sign and then write remove,   and write parentheses in inside here, we have to 
write the element we want to get rid of. So first,   it's United States. So write United States. 
And let's run this one. And as you can see,   apparently, nothing happens. But if 
we paste countries, here, we have   all the elements, but United States is not there. 
So as you can see, the first matching value   was removed. But you can also remove an element 
by its index. So this is accomplished without   pop methods. So I'm going to copy all of this. 
And now I'm going to paste it here. So instead of   writing that, remove, I'm gonna write that pop in 
here, I'm not gonna use the name of the element,   but its index. So I write the index. In this case, 
let's remove the last one. So it's going to be   index minus one. And what pop is going to do 
is to remove that element with index minus one,   and then returns this element. So this element is 
Canada, I didn't run this code here, so you can   ignore it. So I'm going to come in this one. And 
our reference is going to be this this list. And   to verify we use write countries, and then run, 
and here, as you can see, there isn't Canada   anymore. And that's how you remove an element 
using the pop method. But there's still another   way to remove an item using an a specific index. 
And it's that Dell. So I'm going to show you here,   del, it's the function del function. And here, we 
have to write the countries list. And then again,   open square brackets in here, write that index. 
So I write here, the index. And unlike the pop   method, we're not going to get the name of the 
element we're getting rid of, but just deleting   the element. So I run this one. And here, we 
didn't get anything. And I'm gonna print this.   So countries and that element at index zero was 
removed. So it's pain because that's the first   element so we delete it or we remove the first 
element. So we only got India, China and Brazil.   And there you have it three different ways 
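The three ways of removing an element can be sketched as follows, assuming the six-element list that results from the earlier append and insert steps.

```python
# Assumed starting point: the list after adding Canada and Spain
countries = ["Spain", "United States", "India", "China", "Brazil", "Canada"]

# remove(value) deletes the first matching value
countries.remove("United States")
print(countries)

# pop(index) deletes the element at that index and returns it
popped = countries.pop(-1)
print(popped)
print(countries)

# del deletes the element at that index without returning anything
del countries[0]
print(countries)
```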
to remove an element from a list. Okay,   now let's see how to sort a list. We can easily 
solve a list using the stock method. Let's create   a new list called numbers. And then sorted from 
the smallest to the largest number. So here first   I write numbers, and then open square brackets. So 
I'm going to write some random numbers. So force   four, then three, then 10, then seven, one, and 
then two. So this is my list. So I run this code.   And now to sort it from the smallest to the 
largest number, we write numbers, then sort,   then open parentheses. And by default, this 
is going to be sorted from the smallest to   the largest number. So I run numbers again, in 
here, it starts with one, and it ends with 10.   And as you can see, it's from the smallest to the 
largest number. So that's the default behavior   of the SOC method. But we can control how this 
works. So we can add that reverse argument to the   SOAR method to control the order. So if we want it 
to be descendant, we set reverse to true. So here,   again, I'm going to create again, the numbers 
list, and then write numbers. That sort   in inside parenthesis, I write the reverse 
argument in, I'm going to set it to true here.   And then I'm gonna print numbers. So here, I get 
an error, because here it I wrote number and its   numbers. So here, I'm going to add the s, and here 
s two, so run again. And here we have, from the   end here, we see that the list is sorted from 
the largest number to the smallest number.   So as you can see, the default behavior of this 
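A minimal sketch of both sort orders, using the numbers from the walkthrough. The name ascending is illustrative and simply keeps a snapshot of the first result.

```python
numbers = [4, 3, 10, 7, 1, 2]

# sort() reorders the list in place, smallest to largest by default
numbers.sort()
ascending = numbers[:]   # snapshot of the ascending order
print(ascending)

# reverse=True sorts from largest to smallest
numbers.sort(reverse=True)
print(numbers)
```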
sort method is reverse equal to false. So you can   control it here, by writing reverse equal to true 
as we did here, okay, now let's see how we can   update values. And always, to update a value on 
a list, we use indexing to okay, that element we   want to update, and then we set it to a new value 
using that equal sign. So let's say we want to   update the first element of this numbers 
list. So now it's four, but we want it to be,   let's say, 1000. So we write here numbers. 
And we use indexing. So we write numbers,   the first element has index zero, so we write 
numbers of square brackets than zero, then we   set it equal to the new value we want to include. 
So in this case, I'm going to write 1,000th.   And now I'm going to print the numbers, 
please, to see the results. So run this   one. And as you can see, here, the number of 
leads we got is from the last change we made,   so the one that's taught with 10. So it's not 
this one, but this one because it's the last   one we ran. So instead of 10, we replace this one 
with 1000, because this is the first element with   index zero. So with ID, numbers, square bracket 
zero, and we update that first element with 1000.   Okay, finally, we can make copies of the list we 
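Updating by index can be sketched in a couple of lines, assuming the list is still in the descending order left by the last sort.

```python
# Assumed state: the list as left by sort(reverse=True)
numbers = [10, 7, 4, 3, 2, 1]

# Assigning to an index replaces the element at that position
numbers[0] = 1000
print(numbers)
```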
created. So there are different options to create   a copy of a list. One of them is that slicing 
technique. So as you might remember, to do   slicing, we have first to write the name of 
that list, which in this case, is countries.   And then we open square brackets, then 
we're supposed to write the start and   stop. So in this case, we're not going 
to write start in a stop but only column.   So if we don't write start in, we don't write 
stop, it means we want the whole list. So   let's try this out. I'm going to run this 
one. And as you can see, here, we got the   whole list. So the counter sleaze doesn't have 
the original values, because of the changes we   made when we added and remove elements. So I'm 
going to pace the original counters list with   four original values that are the United States, 
India, China and Brazil. And here let's see the   changes in how we test it out. In as you can see, 
we got the whole list. So from the first element   United States to the last element Brazil, because 
we're slicing the whole list. So if we write here,   new underscore list, and we set this 
equal to countries with this slicing,   what is going to happen This new list is going 
to have the same values as the country list.   So I write here new list. And as you can see 
here, it has the same values. So recreated copy of   that counters list. So this is one way how 
you can create a copy. And the second way is   more straightforward, or is it more explicit, 
so is using the copy method. So we write,   again, countries the name of the list, and 
then we use the copy method. So write, copy,   and then parentheses. So with this, we create 
a copy of this list. So let's run this code.   And as you can see here, it returns the 
list. But if we assign these to a new list,   we're going to create a copy. So here, I'm going 
to write new underscore list underscore two.   So here, we assign this copy to this new list. So 
I'm going to copy this new list and paste here.   And as you can see, here, we have the values 
of this list, which are the same as the   original countries list that is here. And 
that's it. That's how you make a copy of a list.   So now let's see how dictionaries work in Python. 
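Both ways of copying a list can be sketched as follows. The identity checks at the end are an addition that confirms each copy is a separate list object.

```python
countries = ["United States", "India", "China", "Brazil"]

# A slice with no start and no stop covers the whole list
new_list = countries[:]
print(new_list)

# The copy method does the same thing more explicitly
new_list_2 = countries.copy()
print(new_list_2)

# Same values, but different list objects
print(new_list == countries, new_list is countries)
print(new_list_2 == countries, new_list_2 is countries)
```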
In Python, a dictionary is an unordered collection   of items used to store data values, and a 
dictionary contains a key and a value. So   this is what you will often see in a dictionary. 
So here, for example, the name of my dictionary is   my underscore dict. And to create this dictionary, 
we have to use these curly braces. So we open   curly braces in inside, we write our first item in 
the first item consists of a key here on the left,   and then our value here, and it's separated with 
the colon. So here we have the key, then column,   and then the volume. And then we have here the 
second item. So the second key and the second   value. So now let's create a dictionary that 
has some basic information about me. So I'm   going to name this dictionary, my underscore data. 
And now to create this dictionary, I'm gonna open   curly braces. And the first key is going to be 
name. So I write name, and it has a value that   is my name. So I'm going to write Frank. So 
I open single quotes, and then write Frank.   And then I'm going to add a new item. So I write 
coma. And then the second key is going to be age.   And the second value is going to be my age. So in 
this case, I'm going to write my age, which is 26.   So as you can see here, the first is a strength, 
the first value is a string, and the second is   integer. So we can mix different datatypes. 
So now I press Ctrl, enter to run this   code, and we created this dictionary. So now 
I write my underscore data in here you have   the dictionary we created. So here we can get the 
keys of this dictionary, we only have to write my   underscore data that keys so this is the keys 
method. So we run this, and we get this dict   underscore keys. And the values are name and age, 
which are the keys of this dictionary we created.   So name the first key and age the second key. 
Now we can get also the values. So my name and my   age. So we just have to use the values method. So 
I'm going to paste this one here. And instead of   writing that keys, I'm going to write that values. 
And now run this and we get my name and then   my age. So next, I'm going to get the items. So 
as I said before, an item is this. So this is the   first item. And this is the second item. So we can 
say that the item is a pair of key and volume. So   we can get this by using the items method. So 
instead of writing dot values, I'm going to   write here that items and then run this one. So 
here we got the first item. So the first pair,   key and value, which is my name am well 
that key name and then my name Frank.   And then the second items so the key name, 
age and the age which is 26. Now we can add   a new pair of key value in this dictionary we 
created. So let's say we want to add my height.   So I write my data in. Let's say we want to add 
the key name height. So I write height. So we use   square brackets here. And then we set this to the 
value. So let's say it's 1.7. So I write my data,   and then square brackets, then hide inside it, 
and then equal to 1.7. So if I run this, in,   then I run the dictionary, we can see that there 
is a new item, and it's the height. So height,   column, and then 1.7. This is how you add 
a new item to the dictionary, a now we can   update this height. So let's say I'm not 
1.7, but I'm 1.8 meters. So what we can do   is to use that update method to update this 
value. So I write my underscore data. In here,   I can use the update method. So I write update, 
and then inside parentheses, we have to open   curly braces to update this new item. So 
I'm gonna write the key, which is height.   And then I'm going to set the new height, which 
is 1.8. So let's try this out. I run this, and   then let's see the values. So let's see if it was 
updated. So I ran this, and we got the height 1.8.   So it's perfect. So now let's see how we can make 
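Adding and updating an item can be sketched as follows, assuming the my_data dictionary with the name and age keys.

```python
my_data = {'name': 'Frank', 'age': 26}

# Assigning to a key that does not exist yet adds a new item
my_data['height'] = 1.7
print(my_data)

# update takes a dictionary and overwrites the values of matching keys
my_data.update({'height': 1.8})
print(my_data)
```

Assigning with square brackets to an existing key would change the value as well; update is convenient when several keys need to change at once.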
a copy of a dictionary, the same way we did before   for the lists. So to make a copy, we just have to 
write the name of the dictionary, in this case,   it's my underscore data. And then just as we did 
for the list, we can use that copy method. So   we write that copy with parentheses, and then we 
create an a copy. So here you can see the copy.   And now I can assign these to a new dictionary. 
So I'm going to write new underscore dict.   And now, I'm going to copy this one, I'm going 
to run and then I write new underscore dict.   And run this. And as you can see, it has the 
value of the my underscore data dictionary.   And something I didn't tell you when I make a 
copy of the list is that if you change the data   inside that my underscore data dictionary, so the 
old dictionary, that effect is not going to be   seen in the new dictionary. So for example, 
if we write one, that nine, and here,   I update this in the old dictionary, so here 
you can see height 1.9. And if we run this new   underscore dict, we can see that after running, 
this height, remains with the same value 1.8.   And he doesn't change to 1.9. This doesn't happen 
if you make one of these copies most people do.   So let me show you what I'm talking about. So 
most people just make a copy doing new data,   underscore to equal to my data. So this is the 
old dictionary, and this is my new dictionary.   So what happens if I run this, and then I, 
I'm going to show you the values of this new   dictionary. So this is 1.9. And if I update this 
to, let's say, one point, 95. So update here,   update here, here is one point 95. And if I 
run this new underscore dict underscore two,   we can see that the value was updated to 
and this shouldn't happen. So if you want   to create a new dictionary that works 
independently from the old dictionary,   you should use that copy method. And this is 
the same if you're making a copy of a list.   Finally, let's see how to remove elements from a 
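The difference between a real copy and a plain assignment can be sketched like this. The name new_dict_2 is assumed from the walkthrough, and the final identity check is an addition.

```python
my_data = {'name': 'Frank', 'age': 26, 'height': 1.8}

# copy() creates a separate dictionary with the same items
new_dict = my_data.copy()
my_data.update({'height': 1.9})
print(new_dict['height'])     # the copy keeps the old value

# Plain assignment only gives the same dictionary a second name
new_dict_2 = my_data
my_data.update({'height': 1.95})
print(new_dict_2['height'])   # follows the change in my_data

print(new_dict_2 is my_data)
```

One caveat worth knowing: copy() makes a shallow copy, so values that are themselves lists or dictionaries would still be shared between the original and the copy.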
dictionary. So just like we did with the lists,   we can remove an item in a dictionary. So 
there are different options. First, we have   the pop method. So right, my underscore data, I'm 
using the old dictionary we've been using so far.   So my underscore data, and I'm gonna 
write that pop. So this is the pop method.   So here I can write that key. So in this case, 
I'm going to write the key. Let me see here, my   underscore data the key name, so I write And then 
parentheses them name. So as you might remember,   the pop method returns this value of that key. 
Before, we did this with a list and it returned the list element. In this case, it returns the
value of the key. So this is the key name, and it returns its value. If we print the my_data
dictionary, we see that this key-value pair is no longer here, so we successfully removed this
item.

Another way to remove an item from a dictionary is the del keyword. We write del, then the
name of the dictionary, my_data, and then we specify the name of the key again. So we open
square brackets and quotes. Let's say we want to remove the age key with its value, so I write
age and run this. If we print the dictionary again, we see that the age key was removed, and
also its value.

And finally, you can remove all the items in a dictionary with the clear method. We write
my_data and use .clear with parentheses. If we print the dictionary now, you can see that it
is an empty dictionary, because we removed all of its elements.

Now let's see one of the most common statements used in Python: the if statement. The if
statement is a conditional statement used to decide whether a certain statement or block of
statements will be executed or not. Here, you can see the syntax of the if statement. As you
can see, it starts with the if keyword, followed by the condition. If the condition is true,
this code here is going to be executed. If the condition is not true, so it's false,
the condition here in the elif is going to be tested.
So here in this elif block, this new condition will be tested, and if it is true, the code
below it will be executed. But if it's not true, we reach the else block. This is the last
block, it has no condition, and its code is executed automatically.

One little detail that most beginners forget to write is the colon. It's easy to forget, but
you have to include it. Another thing some people miss is the indentation. There is an
indentation that you have to include after the colon. Every time you write a colon and press
Enter, most text editors give you this indentation automatically. But if for some reason you
don't get that indentation, and you get something like this, you can indent the line with the
Tab key on your keyboard. Just press Tab and it's going to indent the line. So make sure you
write the colon and include an indentation for each piece of code that will be executed: here,
here, and here.

Now let's have a look at some examples to see much better how the if statement works. First,
I'm going to create a new variable. As you might remember, to create a variable you write its
name, in this case age, and then you set it to a value. In this case it's going to be a
number, so I set age to 18. Now I'm going to write the if statement. I write if age is greater
than or equal to 18, then a colon, and then the code that is going to be executed if this is
true. I write print and then a message: if the age is equal to or greater than 18, the message
is You're an adult. As you can see, I was using single quotes and the message has an
apostrophe, so I'm going to use double quotes instead, and everything is fine now. So here we
have print and the message You're an adult. If the condition isn't true, I write else, then a
colon, and print with a new message, which is
You're a kid. So let's see this again.
If the age is equal to or greater than 18, we print You're an adult. But if it's less than 18,
we print You're a kid. If we run this code, we should get the first message, because 18 is
equal to 18. So let's run it and, as you can see, we get the message You're an adult.
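
A minimal sketch of the if/else example just described. Storing the text in a variable named message before printing is an assumption made here so the result is easy to inspect; the video prints the text directly.

```python
age = 18

if age >= 18:
    # Double quotes let the apostrophe in "You're" sit inside the string.
    message = "You're an adult"
else:
    message = "You're a kid"

print(message)  # You're an adult
```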
Now we can play with this and change the age value. Here, I'm going to set it to 15. I run it
and, as you can see, 15 is less than 18, so the condition is false and the code in the else
block is executed. So we got You're a kid. We can try this one more time.
This time I'm going to write another age, 30. Again, 30 is greater than 18, so the first block
is executed and we get You're an adult. Now let's add a new block, and I'm going to use elif.
I write elif, then age, then greater than or equal to, let's say, 13, then a colon. I press
Enter and we get the indentation, and then we print another message. If the age is equal to or
greater than 13, we write the message You're a teenager. So if it's between 13 and 17, that
is, less than 18, it's going to be You're a teenager. But if it's less than 13, it's going to
be You're a kid. Let's try this out. First I write 10, and we get You're a kid, because it's
less than 13. Then we change this to 14, and we get You're a teenager, because 14 is greater
than 13. And finally we write 20, and we get
You're an adult, because 20 is greater than 18.
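
A short sketch of the three-branch version. The comparison age >= 13 is an assumption taken from the spoken description ("equal to or greater than 13", "between 13 and 17"), and the message variable is added here only to make the result easy to check.

```python
age = 14

if age >= 18:
    message = "You're an adult"
elif age >= 13:  # only tested when the first condition is false
    message = "You're a teenager"
else:            # runs when none of the conditions above is true
    message = "You're a kid"

print(message)  # You're a teenager
```

Changing age to 10 or 20 and running the cell again reaches the else and if branches, respectively.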
And that's it. That's how the if statement works.

Now it's time to see one of the most common loops in Python: the for loop. Python for loops
are used to loop through an iterable object and perform the same action for each entry. One
example of an iterable object is a list, so we can loop through a list and perform the same
action on each of its elements. Here you can see the syntax of the for loop. First comes the
for keyword, then a variable, then the in keyword, and then the iterable. As I told you
before, the most common iterable is the list, so you have for variable in list. I'm going to
write list here so you can see it better. Then we have to write the colon, and after the colon
comes an indentation. So here we have the indented code that will be executed on each
iteration of the for loop.

To see this much better, I'm going to use the countries list we created before and loop
through it. I write for, and then we have to set a variable that is only going to be used
temporarily. This variable is going to be called country. It didn't exist before, we create it
just for the loop. So for country in, and then the name of the iterable, which in this case is
a list, countries. So for country in countries, then a colon, then Enter, and we get the
indentation. Then we say print(country). For this variable in this iterable, which is a list,
print each element: this is what we're saying in this for loop. We run this and, as you can
see, each element of the countries list is printed.
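
The loop described above, as a minimal sketch. The list contents are taken from the output mentioned in the video.

```python
countries = ["United States", "India", "China", "Brazil"]

# On each iteration, country takes the next element of the list.
for country in countries:
    print(country)
```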
So we're looping through the countries list and printing each element. The first is the United
States, then India, then China, and Brazil. And this is how the for loop works.

Now, let me show you a new function that you can use along with a for loop. It's called
enumerate. I'm going to write enumerate here and put the countries list inside this new
function. What enumerate does is number each element of the countries list as we loop through
it. I'm going to add a new variable here, i, then a comma, and then country. So enumerate
returns two elements: the first one is the number of the iteration, and the second one is the
element itself. Apart from the country, I also have to print the i variable that I just
created, which again is only temporary. So I write print(i) and then print(country). Here
we're going to print the iteration number and the
element. I run it with Ctrl+Enter, and here we get it.
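
A minimal sketch of the enumerate loop just described, using the same countries list.

```python
countries = ["United States", "India", "China", "Brazil"]

# enumerate yields (number, element) pairs, numbering from 0.
for i, country in enumerate(countries):
    print(i)
    print(country)
```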
First is the United States, in the first iteration, where i is zero. Then we get India in the
second iteration, where i is one, and so on. As you can see, i starts at zero. So this is how
enumerate works: it starts with the number zero, and it returns the number of the iteration
and the element.

And finally, let's loop through the elements of a dictionary. Let's use the dictionary we
created before, my_data. Well, this one is empty, so I'm going to use the original dictionary.
Here I have the original dictionary, and I'm just going to print it. So this is the
dictionary, and now we're going to loop through it. First, we write for, and then key and
value, because an item, as you might remember, is made of a key and a value. So we say for
key, value in, and then the name of the dictionary, my_data. In order to get the items of this
dictionary, we have to use the items method, so we write .items and parentheses, then a colon,
and we press Enter. Here we can print the key, and we can also print the value. We run this
code and, as you can see, we get the first key and its value, name and Frank, and then the
second key and its value, age and 26. So this is how you loop through the items inside a
dictionary.

Okay, now let's see how functions work in Python. A function is a block of code that only runs
when it is called. You can pass data, known as parameters, into a function. Here is the syntax
of a function. As you can see, we first write the keyword def to create the function, and then
the name of the function. Inside parentheses, we define the parameters of the function we're
creating. Then we write a colon, and below it goes the code. A function usually returns
something, so we use the return keyword and then return something, like a variable for
example. Now let's create a basic function. First we write
def, and then we write the name of the function.
This function is going to do something really simple: it's going to sum the values we pass
into it. So it's going to be named sum_values. As parameters we set a, comma, b, then a colon,
and we press Enter. What this function is going to do is add a and b, and we're going to set
the result equal to x. So we write x = a + b. As I told you before, you should return
something when the function finishes, so we write return, and here we return the x variable.
And that's it. That's how you create a function.
I run this code and, as you can see, apparently nothing happens, but the function was created.
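
The function just defined can be sketched as follows. The call at the end uses the arguments 1 and 3 that the video passes in next; the variable name result is an assumption added for illustration.

```python
def sum_values(a, b):
    # Add the two values passed in and hand the total back to the caller.
    x = a + b
    return x

# Defining the function prints nothing; it only runs when it is called.
result = sum_values(1, 3)
print(result)  # 4
```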
To use this function, we have to call it. To call it, we write the name of the function and
then pass in some values, which are called arguments when you call the function. I'm going to
write the arguments 1 and 3. Once you call the function, Python goes to the function here and
sets a equal to 1 and b equal to 3. So you have 1 plus 3, which is 4. So x is going to be
equal to 4, and then the function returns the value of x. This is supposed to return 4, so we
run this, and we get 4. So this function is working properly.

Okay, now let's see some built-in functions that Python has. Python has lots of built-in
functions that can help us perform a specific task, so let's have a look at some of them.
Let's start with the len function. We only have to write the word len and then open
parentheses. As you can see here, Jupyter Notebook gives functions a green color. Now let's
calculate the length of the countries list. I have the countries list here, so I copy its name
and paste it inside the parentheses. What the len function does is calculate the length of an
object such as a list. I run this one and, as you can see, the length is four. And this is how
the len function works. Now let's
see a different function. In this case, I'm going
to create a new list that contains only numbers.
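
The len call above, together with the max, min, type, and range functions covered in the rest of this part, can be sketched in one place. The numbers in the list are an assumption, since the spoken values are not clearly captured; only the maximum (99) and the minimum (1) are stated outright.

```python
countries = ["United States", "India", "China", "Brazil"]
numbers = [10, 63, 81, 1, 99]  # assumed values; only max 99 and min 1 are certain

print(len(countries))   # 4
print(max(numbers))     # 99
print(min(numbers))     # 1
print(type(countries))  # <class 'list'>

# range(start, stop, step) produces its numbers only when we iterate over it.
print(range(1, 10, 2))  # range(1, 10, 2)
for i in range(1, 10, 2):
    print(i)            # 1, 3, 5, 7, 9 (the stop value 10 is never reached)
```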
I'm going to write some random numbers here: 10, 63, 81, then 1, and 99. So this is my new
list. I created this list with only numbers to try the max and min functions. The max function
is this one: we write max and then parentheses, and it returns the item with the highest value
in an iterable. My iterable is this list, so we're going to get the highest value among its
elements. We run this one and, as you can see, the maximum value is 99. We can also use the
min function, which has the opposite effect. In this case, we get the minimum value of the
list. So we run it and we get 1.

Okay, another common function in Python is the type function, which gives us the type of an
object. We only have to write type, and what this function does is return the type of the
object. In this case, let's copy and paste the countries object. If we run this, we can see
that this object is a list. And that's correct, because we created the list with square
brackets. So that's what the type function does.

And finally, the last function we're going to see is the range function. This one returns a
sequence of numbers that starts at one number and stops at another. Let's see how it works. It
takes three arguments. First the start number, and here I'm going to write 1. Then the number
where the sequence stops, and I'm going to write, let's say, 10. And the last argument is the
increment, that is, by how much the sequence grows. In this case, I'm going to say that the
sequence grows by two, so I write 2. Now I run this and, as you can see, nothing happens. We
only get the same text back. But if we make a loop, so I write for i in range and then
print(i), which is a for loop like the ones we saw before, and run it, you can see that we're
iterating over this range and getting the elements inside it. The first element is 1. The
second is incremented by two, so 1 plus 2 is 3, then 3 plus 2 is 5, then 7, and then 9. Next
we would get 11, but the stop value is 10 and the sequence always stops before it, so we only
get up to 9. And that's how the range function works in Python. And that's it. Now you know
the most common built-in functions in Python.

Okay, in this video, we're going to see what modules are in Python. In Python, modules are
files that contain Python code. A module can have classes, functions, variables, and even
runnable code. To get access to a module, we have to use the import keyword, this one. To see
a module in action, we're going to look at the os module. This one comes with Python, so you
don't need to install it. To get access to the os module, we write import os. And that's it.
We only write this, and now let's see some features of this module. The first one we're going
to see gets the current directory. To use it, we write os, then .getcwd, and then parentheses.
cwd stands for current working directory. So we're going to get the directory where our
Jupyter Notebook file is located, that is, the file I'm working with right now.
So let's run it and see what happens. As you can see, here I have the path where the Jupyter
Notebook is located. This is the complete path, and you can see it by using getcwd.
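
A minimal sketch of getcwd, which also previews the listdir and makedirs functions covered next. Creating the folder inside a temporary directory, rather than next to the notebook as in the video, is an assumption made here so that running the sketch leaves nothing behind.

```python
import os
import tempfile

print(os.getcwd())   # full path of the current working directory
print(os.listdir())  # names of the files and folders inside it

# Assumption: use a throwaway directory instead of the notebook's own folder.
with tempfile.TemporaryDirectory() as base:
    before = os.listdir(base)                      # empty list
    os.makedirs(os.path.join(base, "New Folder"))  # create the new folder
    after = os.listdir(base)                       # now contains "New Folder"
    print(before, after)
```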
Now let's see another function. In this case, we're going to list all the elements in the
folder where this Jupyter Notebook file is located. To do that, we use listdir, which means
list directory, and I'm going to run it. As you can see, I have this Jupyter Notebook file
that is named Untitled. You can ignore the other elements. They are not files, just some
hidden elements in my folder, and they don't matter. So right now, the only file I have in
this folder is the Untitled file. This is what listdir does: it lists all the elements in the
folder where this Jupyter Notebook file is located.

Now let's see the last function, which helps us create a new folder. It's called makedirs. We
write os.makedirs, then parentheses, and inside the parentheses we write the name of the
folder we want to create. In this case, I'm going to name it New Folder. Simple as that. If we
run this, it looks like nothing happens. But if we now use listdir to list all the elements in
my folder, we can see that there is a new folder. If we compare the result we got before with
this new result, there is one new element, New Folder, which is the folder we created using
makedirs. And that's it. Those are some basic things you can do with the os module. In the
following videos, we're going to install different libraries, packages, and modules, so we can
do even more things in Python.

In this first introduction to pandas, we're going to learn what pandas is, compare pandas with
Excel, and then learn what pandas data frames are. First, pandas is probably the best tool for
real-world data analysis in Python. It allows us to clean data, wrangle data, make
visualizations, and more. You can think of pandas as a supercharged Microsoft Excel, because
most of the tasks you can do in Excel you can also do in pandas, and vice versa. That said,
there are many areas where pandas outperforms Excel. So before you dedicate time to learning
pandas and Python, let me show you why you should learn them, especially if you already know
Excel. Let's see what benefits pandas and Python have over Excel.

First, limitation by size. Excel can handle around 1 million rows per worksheet, while Python
can handle many millions of rows, limited mainly by your computer's memory. Another benefit is
complex data transformation. In Excel, memory-intensive computations can crash a workbook,
while in Python, when you work with pandas, you can handle complex computations without any
major problem. Also, Python is good for automation. Excel was not designed to automate tasks.
You can create a macro or use VBA to simplify some tasks, but that's the limit. Python can go
beyond that with its hundreds of free libraries. And finally, Python has cross-platform
capabilities. This means that Python code remains the same regardless of the operating system
or the language set on your
computer. Okay, before I start writing code, let
me explain to you the core concepts of pandas.
We're going to start with the concept of arrays. Arrays in Python are a data structure similar
to lists. You can find one-dimensional arrays and two-dimensional arrays, also known as 2D
arrays. The two main data structures in pandas are Series and data frames. The first, a
Series, is a one-dimensional array, while the second, a data frame, is a two-dimensional
array. In pandas, we mainly work with data frames. But in case the definition of a data frame
in terms of arrays wasn't so clear, let me
show you another definition, this one using Excel.
A pandas data frame is the equivalent of an Excel spreadsheet. Pandas data frames, just like
Excel spreadsheets, have two dimensions, or axes. One axis is the row and the other is the
column. A column is also known as a Series. So the one-dimensional array we saw before, the
Series, is a column. It's just another name for the columns of a pandas data frame. On top of
the data frame, you will see the names of the columns, and on the left side there is the
index. By default, the index in pandas starts at zero. The intersection of a row and a column
is called a data value, or simply data. We can store different types of data, such as
integers, strings, Booleans, and so on.

Right now, you see on the screen a data frame that shows the US states ranked by population.
I'm going to show you the code to create a data frame like this later, but for now let's
analyze it. The column names are also known as features. Our features here are states,
population, and postal. Each row is known as an observation. We can say that there are three
features and four observations, because there are three columns and four rows. Keep in mind
that a single column should hold a single type of data. In our example, the states and postal
columns only contain strings, while the population column only contains integers. We might get
errors when trying to insert different data types into a column, so avoid mixing different
types of data.

Now let's see the terminology translation between Excel and pandas. As I mentioned before, in
Excel we work with worksheets, and in pandas we work with data frames. The columns in Excel
are known as Series in pandas, although we often say columns too. In pandas we also work with
the index, which is those numbers on the left. And in pandas we also say rows. Rows hold
observations, but calling them rows is fine. Finally, in pandas we often work with NaN, which
stands for not a number. This is the equivalent of an empty cell that you might find in Excel.
So that's it for now. In the next video, we're going to learn how to create a pandas data
frame from scratch.

Welcome back. In this video, we're going to learn different ways to create a pandas data
frame. As you might remember, a data frame looks like this. It has columns and rows, and the
columns are Series. Series are 1D arrays, and arrays are the first way to create a data frame.
We have 1D arrays and 2D arrays: 1D arrays are basically columns, while 2D arrays are data
frames. To use arrays, we usually use a library named NumPy, and NumPy is what is under the
hood of pandas. To use NumPy, we first have to import it. We're going to do that a bit later
when we write code. But just to give you an idea of what a NumPy array looks like, here I
wrote a basic array. We use np.array to create the array behind the data frame that you see on
the right. And that is one way to do it. You can also use lists, as I'm showing you right now.
With this second option, when you create a data frame with
lists, you don't need to use NumPy arrays,
because the nested lists play the role of the array.
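
A minimal sketch of both options, assuming pandas and NumPy are installed. The numbers and the row1/col1 labels follow the values used later in the video when this code is typed out, and the variable names data_lists and df_lists are hypothetical.

```python
import numpy as np
import pandas as pd

# Option 1: a 2D NumPy array, one inner list per row.
data = np.array([[1, 4], [2, 5], [3, 6]])
df = pd.DataFrame(data, index=["row1", "row2", "row3"], columns=["col1", "col2"])

# Option 2: plain nested lists, no NumPy needed.
data_lists = [[1, 4], [2, 5], [3, 6]]
df_lists = pd.DataFrame(data_lists, index=["row1", "row2", "row3"], columns=["col1", "col2"])

print(df)
print(df_lists)
```

Both calls build the same three-row, two-column table; without the index and columns arguments, pandas numbers the rows and columns from zero.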
We're going to write the code to create a data frame with arrays, but first let's see the
second option. The second option is dictionaries: you can create a data frame from a
dictionary. As you might remember, a dictionary has keys and values. We can use the key as the
column name and the value as the data. The value can be a list, so the data will be the many
elements inside that list. A key-value pair is known as an item in a dictionary, and in this
case it's going to be a Series, because what we have here is one column. So this is the second
way to create a data frame, with dictionaries, and we're going to see it in code a little bit
later.

Now let's see the third way, which is with CSV files. CSV files are files that can be opened
in spreadsheet programs like Excel. This is the easiest way to create a data frame, because we
only need to read the CSV file and the data frame is created. And that's it. So now let's go
to Jupyter Notebook and create a data frame by writing some code.

Okay, now we are in Jupyter Notebook. Here we're going to write the code to create a data
frame, and we're going to use the three ways I showed you before. The first thing we're going
to do is import the libraries we'll use. That's the first line of code, and I already wrote
it. First we import pandas, and then we import NumPy. So import pandas as pd. pd is the
conventional short name for pandas, and np is the one for NumPy. To run this code, just press
Ctrl+Enter, and with that we have imported pandas and NumPy.

Let's see the first way to create a data frame, which is with arrays. To create an array, we
use NumPy. We write np, which is the short name for NumPy, and then we use array. So we write
.array, open parentheses, and inside we write the array we want to create. I'm going to write
random numbers just for the sake of this example. I open double square brackets, and then
let's write, let's say, 1 and 4, then 2 and 5, and the last one is going to be 3 and 6. Each
pair is a list, and each list represents a row. So this is going to be the first row in our
data frame, this is going to be the second row, and this is going to be the third row. We can
give this array a name, and I'm going to name it data. So data is equal to this NumPy array. I
execute this code, and now we have our data. So we created the array using NumPy.

Now let's create a data frame with pandas. To do that, we have to write pandas, or in this
case pd, because that is how I named it in my first line of code. So I write pd, and then
.DataFrame, and then we open parentheses. Here we have to fill in some arguments. The first
one, which you will almost always include, is the data, because a data frame isn't much use
without data. So first copy our array and paste it here. That's the first argument. You can
create the data frame as it is. Let me show you: I press Ctrl+Enter and, as you can see, here
is my data frame. But it's full of numbers: the column names are numbers, and the row names
are numbers too. To make it more understandable, we can rename the column names and the row
names. Actually, the row names are called the index. First, we can name the index. You only
need to add the index argument, as I'm writing right now, and then specify the names you want
to set. This second argument has the form of a list. The first element is going to be the
first index label. Here it's zero, so in case you don't want it to be zero, you can set
another name. In my case, I'm going to set it as row1, then a comma, then the second index as
row2, and the third as row3. We can also modify the column names with the columns argument. We
write columns, and then we open square brackets, because it's a list that we're going to add
here. In this case, we only have to set two elements. I want to name the first col1 and the
second col2. And actually, I'm going to give this data frame a name, so I assign it to a
variable called df. df is the common way to name a data frame, and it stands for data frame. I
run this code now and, as you can see, it ran. To show the data frame, I can write df here.
And now we have the data frame. As you can see, the first row, 1 and 4, is my first list, and
the second list is the second row. The first column is a Series, as we discussed before. We
also have the column names that we modified, and the row names.

Now let's quickly see how to create a data frame with arrays, but in this case without NumPy.
I'm going to copy this line of code and paste it here, under option two. This is the base of
the array, in the shape of lists. I'm just going to delete this part, because I don't want the
NumPy array anymore, just the double square brackets. So I run this. Creating the data frame
works the same way as before, so I just copy that code and paste it here. I run it, then I
write df and execute this code. As you can see, we have the same result. I'm just showing you
this second way so you don't have to worry about learning NumPy right now.

Okay, now let's create a data frame from a dictionary. We're going to use lists in this
example, and we're going to create a data frame using more meaningful data. To create the
dictionary, I'm going to use two lists. The first is going to be a list named states, and the
second is going to be population, which will contain the population of each state. So the
first list is states, and I'm going to write it here. I open square brackets, because this is
a list, and now I write some states in the US. The first is California. The second is going to
be Texas. The third is going to be Florida, and the
last one, New York. So I quickly write it here.
Now I'm going to create a population list. In this case, I'm going to paste the data, which is
the population of each state. And now I'm going to create a dictionary from these two lists. I
write the name of the dictionary, which is going to be dict_states. This is a dictionary, so I
should use square brackets, sorry, curly braces. Now I set the name of the key. The first key
is States, then a colon, and then the value, which is the states list. That's the first item.
The second key is Population, which I'm also going to write with a capital letter, and its
value is the population list that we have here. With this, we create our dictionary. I'm going
to run these two cells, and now we have the lists and the dictionary.

Now we can easily create a data frame using pd.DataFrame, which we used before for the first
option, when we created a data frame with an array. To do it, just write pd, then .DataFrame,
and inside the parentheses write the name of the dictionary. So I copy dict_states, and I
assign the result to a new variable. I'm going to name it df_population, so a data frame about
population. Now I run this,
and here I get an error because I didn't write
DataFrame correctly. The D and the F have to be capital letters.
So run again, and now everything is okay. So   now to show the data frame, I use paste this 
one here, and now Iran. So here we have this   data frame. And as you can see here, my first 
key is states is the name of my first column   in the data inside the state's list is here. 
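A minimal sketch of the dictionary approach described above. The population figures are rounded placeholders rather than official data, and the capitalized key names are an assumption about what was typed on screen.

```python
import pandas as pd

# Two lists of equal length: state names and (placeholder) populations.
states = ["California", "Texas", "Florida", "New York"]
population = [39_000_000, 29_000_000, 21_000_000, 19_000_000]  # illustrative values only

# Each dictionary key becomes a column name; each list becomes that column's data.
dict_states = {"States": states, "Population": population}

# Note the capital D and F: pd.Dataframe does not exist and raises AttributeError.
df_population = pd.DataFrame(dict_states)

print(df_population)
```

Both lists need the same length; otherwise pandas cannot line the values up into rows.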
So here is my first column, or my first series, and the same goes for population with its data. So
here we created a data frame using a dictionary.
Okay, finally, let's create a data frame from a CSV file. To create a data frame from a CSV
file, we have to use the read_csv function. First, we write pd as usual,
which stands for pandas, and then we write read_csv, open parentheses,
and then the name of the CSV file. Here, I'm going to paste the name: it's
named students performance, with the .csv extension. To download this data, you can check the
notes of this video. And actually, we can have a look at this
data before importing it into pandas. Here I have it in Google Sheets. As you can see,
we have the scores of some exams, math, reading,
and writing, and we have some other data. So
we can import all of this data, all of our 1000 rows, into pandas. All of this is going
to be here. Here, we only have to define the
name of this data frame. I'm going to name it df_exams. So
now I run this, and to show the first five
rows of this data frame, we can use a method
named head that we're going to see later. But just to
give you an idea, we can write .head(),
and we get the first five rows. So as you can see,
here we have the first five rows of this Excel
or, actually, CSV file. And you can see here, for
example, that the first row says female, group B,
and math score 72. So let's check if that data
is the same here. We have female, group B,
and math score 72. So we have all this data
here in this data frame. If we want to see
all of the rows here, we can forget
about .head(), and now we have all the rows.
Well, here, we cannot see part of the rows. I'm
going to show you how to see that part later
in this course. But now, as you can see, if we
run df_exams, we can see something like
a summary of this dataset, or rather data frame
in this case. By the way, in pandas, or actually when we work
in Python in general, we usually call this type of
CSV file a dataset. And when we
read our dataset using pandas, the
result is a data frame, which is what we have here.
So the CSV file is a dataset, and
when we read it with pandas, it is a data frame.
And that's it. These are the three
ways to create a pandas data frame.
Okay, now it's time to see how to display
a data frame in pandas. So here I have the CSV file we used before to create a data frame.
And a little detail I forgot to mention before
is that, since we pass only the file name, this CSV file should be located in the
same directory where your Jupyter Notebook file
is located. What I mean by Jupyter Notebook
file is what we're seeing right now:
the file we're working in right now.
So what you have to do is
download this CSV file and place it in the same
folder where your Jupyter Notebook
is located. This is
how you're going to read this CSV file using the
read_csv function. So just make sure
both the CSV file and the Jupyter Notebook
are in the same folder.
Okay, now I'm going to run these first two
lines of code that we've seen before.
The first imports pandas and the second reads the
CSV file. I run this, and now the CSV
file is stored in df_exams. This is my data frame. So now,
let's see how we can view this data frame. The
easiest way is just to copy
the name of this variable and paste it here. Now
I execute this, and we have the data frame.
Actually, this is a summary of the data frame,
because not all the rows are shown here. If we
scroll down a little bit, we can see that
there are 1000 rows and eight columns.
So here we can see the rows and the
columns. But as you can see here in the middle,
we cannot see some of the rows: it goes up to row 4, and
then it continues with 995. Usually, when we
work with pandas, we don't need to see the data
row by row. That's not how we do it
with pandas. But if, for some reason, you need to
see all the data in pandas, as you would
in Excel or in Google Sheets, I'm going to
show you a way to do it a bit later. But first,
I'm going to show you the different ways we
usually display a data frame in pandas.
So the first way to do it is using the head
method. To use the head method, we only
have to write the name of the data frame, in this
case df_exams, and then write .head
with parentheses. We run this, and this is
how we get the first five rows of a data frame.
As you can see, we have from row zero
to row four, so these are the first
five rows. That is the head method. In the same
way, we can get the last five rows of this data
frame by using the tail method. Here, we only
have to write, again, the name of the data frame,
in this case the same df_exams,
and then write .tails with parentheses.
I run this and, actually, I think it's tail. Yeah,
it's tail, in singular. And now
we get the last five rows, from 995 to 999.
So these are the last five rows.
Now, in case you want to get more rows,
not only the first five or the last five,
you can add an argument to either the head
or the tail method. I'm going to use
the head method as an example. I copy
this, and I paste it here. Let's say
we want to get the first 10 rows. We write
10 inside the parentheses, and now we run this.
I scroll down here, and we can see that the
first 10 rows are here. We can do the same
with tail. So here I write tail, and as we
can see, the last 10 rows are displayed here.
So you can specify the number of rows that
you want to display, and that's how you do it.
So now, I'm going to show you how to display all
the rows of this data frame, as you would in
Excel or in Google Sheets. To do so, we first
have to know how many rows this data frame
has. An easy way to get the number of rows
is using the shape attribute. To get the shape
attribute, we first write the name of the data
frame, in this case df_exams.
Then, to get access to
this attribute, we use the dot and then the name
of the attribute, in this case shape. Now we run
this, and we get (1000, 8). The first is the number of
rows, and the second is the number of columns. So
we have 1000 rows. Now, to display all the rows,
we have to use the set_option function.
So we write pd.set_option.
Inside the parentheses, our first argument is
going to be the following: display.max_rows, written in quotes.
Here, we have to specify
one more argument, and this is going to be
the number of rows we want to display.
Here it's 1000, because we have 1000 rows,
and we run this. As you can see,
nothing happened, because we only modified
the default behavior of pandas. If we want
to see the data frame, we just write its name
and execute the cell. I'm going to scroll
down, and as you can see, all
the rows of this data frame are here. I'm going to
scroll all the way down, and as you can see,
it says 999. So all the rows are displayed here.
And that's it for this video. In the next video,
I'm going to show you the different attributes,
methods and functions a data frame has in pandas.
Welcome back. In this video, we're going to see
some basic attributes, methods and functions that
we can use in pandas. But first, let's learn what
each of them is. First, attributes are values
associated with an object, and they are
referenced by name using dot notation.
So to get an attribute, we have to use the
dot. For example, below you can see that
we have a data frame named df, and to get its
columns, we have to write df.columns. So
columns is an attribute, and that's how
we get this attribute of the data frame.
Next, we have functions. A function is a group of
related statements that performs a specific task.
We've seen functions before. In Python, we've
seen some built-in functions like max,
which gets the maximum value of a list, min,
which gets the minimum value, and len, which gets
the length of the list. Those are some Python
built-in functions that we can use in pandas
too. And finally, methods are functions that are
defined inside a class body. We haven't talked
about classes, because they're not the main
topic of this course. So just keep in mind that
methods are functions inside a class. When the creators
of pandas built pandas, they used many classes,
and the functions inside those classes are
known as methods. For example, below
you can see the head method. We've also seen
the tail method and some other methods so far.
As a rule of thumb, when we use methods, we have
to write the parentheses. But when we want to get
access to attributes, we only write the dot
and the name of the attribute. So a method
is used with the dot and parentheses, and an attribute
with only the dot and the name of the attribute.
So enough talk, now let's write some code in
Jupyter Notebook. Here, we're going to use
the same CSV file we used in the previous video.
We import pandas as we did before, then we
read the CSV file with the read_csv
function, and we show the data frame simply by
writing its name. We've seen
this before, I'm just reminding you. Now we'll
see some basic attributes, methods and functions
that we can use in pandas. First, let's check
some attributes of this data frame.
I'm going to copy the name of the data frame.
The first attribute is
going to be shape. We've seen this before,
I believe. To get the
attribute, we write the dot,
and then we write the name of the attribute, so
shape. So df_exams.shape, and we get the
values of the attribute. The first is the number of
rows, and the second is the number of columns.
That's good. The next
attribute is going to be the index attribute. As
you might expect, we only have to write the
name of the data frame, then the dot and then index.
And this is how we get the index of this
data frame. As you can see, this has
the form of a range. A range, as you might
know, takes up to three arguments, although only the stop
is strictly required. The first is the start,
and in this case it starts at zero.
The second is the stop, so the index
stops at 1000. This is right, because
my data frame starts with zero and finishes
with 999. Well, the stop is 1000 because the range stops one before
1000. And here it increases by one, so 0, 1, 2,
3, and so on. So the step is one. So this is my
index attribute. Now let's continue, and
let's get access to the columns attribute.
To do so, we write the name of the data frame,
and then the name of the attribute. In
this case, columns has to be written with
an s, so in plural. We run this and we get the
names of the columns. As you can see here, we
have eight columns: gender, race/ethnicity,
and so on. We can even use this attribute
to modify the names of the columns,
but we'll see that later. Now let's see how
we can obtain the data type of each column.
To do so, we have to use the dtypes attribute.
We write the name of the data frame again,
and then dtypes. This is going to give us the
type of each column. The gender column is object, and
actually the columns from gender to test preparation
course are objects, while math score,
reading score and writing score are integers, so
numbers. By default, a column that says object usually holds
some kind of string. I'm going to bring this up so
you can see it better. Here is the data frame
again. As we've seen before, the columns from gender
to test preparation course have the type object, and,
as we can see here, all of them are strings. So we
can say that objects are the same as strings here.
Also, anything that is a score
here represents some kind of number.
That's why we get integers here, so int64. So
these are the most common attributes of a pandas
data frame. Now, let's review some methods.
First, let's see the first five rows.
As you might know, that's done with the head method. So
we only write the name of that attribute, sorry,
the name of the data frame, and then we write the
head method, so head and parentheses. We run
this and we obtain the first five rows. We can
also obtain a summary of the data frame by
using the info method. Here we write the name
of the data frame, then .info with parentheses, and execute this.
Here, we have some information about the
data frame. We have, again, the data
types, and also how many values are non-null. As
you can see, all the data that we have in
this data frame is non-null, so there isn't any
empty data in this data frame. Okay. Now,
if we want to get some basic statistics of a data
frame, we have to use the describe method. We
write the name of the data frame and then .describe with
parentheses. We run this code,
and we have some basic statistics. First,
the count. This indicates how many non-null values
each column has, and each of them has 1000.
Then we have the mean, which is basically the
sum of the numeric data in each column
divided by 1000, because there are 1000
rows. Then the standard deviation and the minimum
value. For example, in math score, the minimum
value was zero. Then 25% and the following rows represent
the percentiles: Q1 is 25%,
Q2 is 50% and Q3 is 75%. Then we have the maximum
value of each score, of each exam, and we see that
the maximum score is 100 in each of
them. So the describe method is a useful method
whenever we want to get some basic statistics of
the data frame, especially of the numerical data
that we have in our data frame. Okay, now let's
see some functions that we can use. In pandas,
we can use some of the built-in functions that Python
has. For example, if we want to get
the length of a data frame, we only have to write
len, and then, inside the parentheses, the name of the
data frame. We run this, and we obtain that
the length of this data frame is 1000. Actually,
the length of a data frame indicates only the
number of rows. So here I made a mistake, it's rows.
And this is how we obtain the number of rows
of a data frame. We can also use other built-in
functions that Python has, like the max function.
So we write max, then the name of the data frame, and
we run it. In this case, we don't get
anything meaningful, because we get a string, one of the column names.
But if we add the index here and apply
max, as you might remember, with
this attribute we get the
index. So if we use the max function,
we get the maximum, or the
highest, index, so I run this and it's 999.
We can also get the lowest index of a data
frame. We only have to copy this and, instead
of the max function, write min. In
this case, we get the minimum index, and it's zero.
Now we can obtain the data type of the data
frame. Well, the data frame has the DataFrame type,
but we can verify that using the type
function. So we write type, then, sorry,
write only the name of the data frame. And we run.
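The attributes, methods and built-in functions covered in this video can be sketched together as follows. The four-row frame is a made-up stand-in for df_exams, so the numbers differ from the ones shown in the video.

```python
import pandas as pd

# Small made-up frame standing in for df_exams.
df_exams = pd.DataFrame({
    "gender": ["female", "male", "female", "male"],
    "math score": [72, 69, 90, 47],
})

# Attributes: dot plus the name, no parentheses.
shape = df_exams.shape          # (rows, columns)
index = df_exams.index          # a RangeIndex with start, stop and step
columns = df_exams.columns
dtypes = df_exams.dtypes        # one dtype per column

# Methods: dot plus the name, with parentheses.
stats = df_exams.describe()     # count, mean, std, min, quartiles, max
df_exams.info()                 # prints dtypes and non-null counts

# Built-in functions: the data frame (or one of its parts) goes inside the parentheses.
n_rows = len(df_exams)               # number of rows only
highest_index = max(df_exams.index)
lowest_index = min(df_exams.index)
largest_label = max(df_exams)        # iterates over column names, so it returns a string
kind = type(df_exams)
```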
So here you can see that the type of this object is
DataFrame. And finally, we can use a common
function, which is the round function. We
write round, and this takes two arguments:
first, the object that we want to round,
which in this case is our data frame, and
second, the number of decimal
places that we want to have. In this case,
I want two decimal places. So we run this.
We're not going to see that number of decimal
places in this particular example, because
the numerical data we have here is integers,
not floats, so this doesn't have any
effect. But if you have a data frame with float
numbers, you can round those numbers using the
round function. And that's it. These are the most
basic attributes, methods and functions that we
will see often in pandas. Alright, now it's time
to learn how to select a column from a data frame.
Here I have the same CSV file we've
been using in the previous videos.
Let's import pandas, and let's read this CSV
file. I have this in the same data frame, and
I'm just showing the first five rows. Now, to
select one of the columns of this data frame, we
have two options. Let's see the first option.
The first option is using square brackets.
This is the preferred way to select a column in
pandas. Let's see how to select the gender
column, so the first one here. The first thing
we have to do is write the name of the data
frame, in this case df_exams, and then
open square brackets.
Now we have to write the name of the column.
So we open quotes, and here I'm going to copy
the name of this column and paste it.
So we have the name of the data frame,
and then the name of the column we
want to select. Now we press Ctrl+Enter
to run this code. As we can see,
we have the first column of this data frame.
As you might expect,
this is a 1D array, and as
we discussed in previous videos,
1D arrays in pandas are Series. We can verify
whether this is true with the
type function. I'm going to copy this
column selection, and now we're going to
use the type function. We write type,
then open parentheses, and inside the
parentheses we write the object we want to
evaluate, which in this case is this selection. Now we run
this, and as you can see, we get a Series.
Series, just like pandas data
frames, have attributes and methods,
so we can access those attributes and methods.
Actually, the attributes and methods of
a Series and a data frame are very similar.
For example, if we want to get the index
attribute of this Series, we only
have to write the name of the Series,
and then the dot and the name of the attribute,
so index. We run this, and we get the
index in the form of a range that starts at
zero and stops at 1000. Another method
that Series share with data frames is the head
method. So we can also get the first five rows
by writing .head with parentheses. As
you can see here, we get the first five rows
of this Series. Alright, that's it for the first
syntax. This is my favorite syntax, and actually,
most people use it because it's the most practical.
Now it's time to see the second syntax to select
a column from a data frame. This syntax
involves writing the dot, which is here.
Let's say we want to get the same gender
column. We write the name of the data frame,
followed by the dot and the name of the column,
so gender. In this case, we don't need to open
quotes, and we don't need the square brackets. We
run this code, and we get the same Series.
It's here. Now you might be thinking
that this is more practical than the first syntax,
but this syntax has some pitfalls. Let
me show you. What if you want to get
a column whose name has two words? For example,
what if you want to get, let me show you here,
this column that is named math score? So now
let's try to get access to this column. I'm going
to copy this column name. And now scroll down.
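The two single-column syntaxes can be sketched side by side. The three-row frame below is a made-up stand-in for df_exams, keeping only the column names mentioned in the video.

```python
import pandas as pd

df_exams = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "math score": [72, 69, 90],
})

# Square brackets with the column name as a string: works for any name.
gender = df_exams["gender"]
math_score = df_exams["math score"]

# Dot notation: only works when the name is a valid Python identifier.
gender_dot = df_exams.gender
# df_exams.math score  -> SyntaxError, because of the space in the name

print(type(gender))      # a pandas Series
print(gender.index)      # Series have an index attribute too
print(gender.head())     # and a head method
```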
And now let's try. I'm going to write first
the name of the data frame, and now the dot. To get
access to this column, or to select it, we have
to write the column name. So this is the column
name. But as you can see, if I run this, we get an
error, because Python doesn't work like that. In
Python, when a name has two words, we usually join them with an
underscore. That's how Python understands
that this is a single name. But if it's written like this, with a space,
Python will not understand what you're trying
to do. However, if you use the first syntax,
the square brackets, you don't have this problem.
Let me show you. I'm going to
copy this and
paste it here. Instead of having
only the dot notation, I'm going to open
square brackets
and then add the quotes. As you can see here,
the column name has a string type, and Python knows
that this is a string. Now, if you delete the
dot and execute this, you get the column
without any error. So this is one of the advantages
that the square brackets have over the dot,
and that's it. In this video, we learned how
to select one column from our data frame, and
in the next one, we're going to learn how to
select two or more columns from a data frame.
Okay, in this video, we're going to learn
how to select two or more columns from a
data frame. As usual, we're going to start
by importing pandas and reading the CSV file
we've been using so far. We execute these two
lines of code, and we get the data frame here.
What we're going to do in this video is
select two arbitrary columns from this data frame.
First, let's pick some columns. I want
to select the gender column and also the math
score column. To select these two columns, we
have to use square brackets again. In
this case, we have to use two pairs of square
brackets to select two or more columns.
To do this, we first write the name
of the data frame, so df_exams.
Now we open square brackets, and we write
them twice, so we have two pairs of
square brackets. Inside, we have to write
the names of the columns we want to select.
We said that we wanted the gender column, so we
write gender. The second column that we chose
was math score, so I open quotes
and write math score. So here I have these
two columns. By the way, the order in which
we write these columns is the same order in which
they appear in the resulting data frame. I mean, we
can define the order of the columns inside these
square brackets. Here, we're saying that
first comes the gender column, and second
the math score column. So now, let's
run this. As you can see, we obtained
first the gender column and second the math score
column. Here we can see that it's a data frame,
and it goes up to row 999, so there are 1000 rows. Now, we can verify that
this is actually a data frame by using the type
function. Let's check if this selection is
a data frame. I'm going to copy this
here, and let's check the data type of this
selection. I paste it, and now we use the
type function. We open the parentheses, and now
we execute this code. As you can see,
we get that this is a data frame. One
little detail I want to tell you is that when
we use two pairs
of square brackets, we're always going to get
a data frame. But when we use only a single pair of
square brackets, as we did in the previous video,
we get a Series. So one pair of square brackets
is for a Series, and two pairs of square brackets
are for a data frame. Okay, now, to
continue with the video, I'm going to
continue with the video, I'm gonna   select two or more columns using these two pairs 
of square brackets. So now let's choose the   columns that we're gonna get. So in this 
case, I'm going to get that gender column   and all the scores that we have here. So the 
math score, reading score and writing score.   So to do so first, I'm going to copy this first 
selection with it, to have it as a reference.   And now I'm going to paste it here. So here, 
so far, we have two columns. So let's add the   two remaining columns. So here, an easy way to, 
to write these columns, it's just by copying this   in the data frame in here, we can paste it. So 
instead of writing those names, we can just paste   it here. Now I delete and I put it inside quotes. 
So here inside quotes, and here we have it.   So here, as I said, before, we can change the 
order of the columns, we use have to, for example,   here, I cat, this, and let's say we want to 
have the writing score in the beginning. So   here, I paste writing score. And now what 
we're gonna get is first the gender column,   then raw writing score column, and then the math 
score and reading score columns. So now, let's   run this code. And as you can see, here, we have 
this data frame in the order that we defined here.   Okay, now, you might be thinking, if there is 
a way to select two or more columns using that,   that sign, so let's check if that's possible. 
Here. For example, let's say we want to get   the gender in the math score column using the dot 
notation. So here, I have it. And as you can see,   here, this doesn't look right, because it 
you have two strings separated by a comma,   but you don't have a list, you have square 
brackets, this is probably gonna fail. So   let's check, I'm going to run this code. 
And as you can see, we get invalid
syntax, so it's a syntax error. As you can
see, we cannot select two or more columns with
the dot. This is one of the disadvantages
that the dot has compared with the square brackets,
and this is why most people prefer to use square
brackets instead of dot notation. And that's
it for this video. In this video, we learned how
to select two or more columns from a data frame.
Okay, in this video, we'll see different ways
to add a new column to a data frame. Here's
the same students performance data frame. As
you can see, we have three columns with scores:
math score, reading score and writing score.
Let's say we want to add a new score. In this
case, let's add a language score. To add a
new column in a spreadsheet like Google Sheets
or Microsoft Excel, we simply insert a
new column, and that's it. But in pandas,
we have different
ways to insert a new column.
Let's see how to do it. First,
let's add a new column with a scalar value.
A scalar value is simply a single value. In
this case, the column is going to have one
single value, so all the rows are going to have the
same value. To do so, we're going to
select this imaginary column, imaginary because this column
doesn't exist so far. What we're going to do
is select it as we would any
other column. First, we write the name of
the data frame, in this case df_exams.
Then we open square brackets and open quotes,
as we would for any column. Here, instead of
selecting math score, for example, we have to
write the name of the column we want to create.
In this case, let's write language score. So
this is the new column we want to create. Now
we have to assign a value to this new column, so we
give it a scalar value.
In this case, I'm going to assign a value of 70.
Now, if we run this code, we're going to see
that nothing happens, but if we now show the
data frame, we're going to see that we have
a new column, and this column is named language
score. The value that this column has is
the same in all its rows, 70. So
we have 70 in row zero, and if we scroll down,
we're going to see that it's 70 in all the rows,
even in row 999. But it's a bit weird that in an
exam all the students would have the same
score. What you would usually do is add
different values to this column. To do
this, we can use arrays, and to create arrays,
we use NumPy. So in this second way to
add a new column, we're going to use arrays. In
this case, we first have to see how many rows this
data frame has. In this case, it has 1000 rows.
This is important because the number of
rows has to match the length of the array
we're going to create. So let's create this array.
First, let's import NumPy. We write import
numpy as np, and we run this code. Now
that NumPy is imported, we have to create
an array of 1000 elements. To do so, we're
going to use a function called arange. It's
written like this: a, range. This gives
us a range of numbers that starts at the
first argument, for which I'm going to write
zero, and stops before the last argument, which in this case
is going to be 1000. So these are the
limits of my range. I execute this.
And as you can see, it starts at zero
and goes up to 999. To verify the length of this
array, we have to use the len function. As
you can see, the length is 1000, so
the array has 1000 elements. Now I'm going to
assign this to a new variable, and I'm going to
name this variable language score, so language_score.
We execute this, and here
I was planning to check the length of this array, so
I quickly do it as we did before, with len,
and we confirm the length of the array. So
now we have to add a new column to a data frame   with this array. And to do that, we have only 
to use the same way we did before. So first, we   write the name of the data frame. And then we make 
the selection. So this selection is going to be   with the new column Well, in this case is not 
new, because we already created it. But let's   imagine it's a new column. So it's language score. 
And now we have to set the array to this column.   So we write language score here, and we set it 
to this new column. So now to see the results,   we only show this data frame. And as we can see, 
here, we have a new column. And this new column   starts with zero, and it ends with 999. 
So it doesn't have a single value anymore,   but now has a range of values, you know, there 
is a little detail we have to take care of.   So it's course are supposed to be between zero 
to 100. And we have here from zero to 199. And   also here we have a sequence of numbers, so it's 
from zero, and then one and it increases by one.   And usually in scores, you will see that students 
have random scores. So we have to create here,   an array with random numbers. And to do that, we 
have to use NumPy again, but here we have to use   a different method. In this case, the method is 
named random dot Rand i n t. So let's write it   here. np dot random that ran. And then i NT. So 
the first argument is the lowest value of these   random numbers. And by the way, these are random 
integer numbers, because it's course are usually   integer numbers. And in this case, I'm going 
to say this, this to one. And the second score   is the highest number or value in these random 
numbers. And I'm going to set it to 100.   And the third argument is the size. In this 
case, we want an array of 1000 elements,   so we set the size to 1000. Now we execute 
this, we run this, and I'm not going to look at this
array again. I'm just going to check that it has the length we want by using the len function,
and here we have 1000 elements. So now let's create a new variable and store this array in it. So
here, this is going to be int_language_score. And this is going to be our
new variable. So here one little detail you should know is that the first argument is inclusive, and
the second one is exclusive. So this means that if we, let's say, get the minimum value of
this new array, we're going to get that the minimum value is one, because this first argument is
inclusive, which means that it can be included in this new array. However, if we now print
the maximum value of this array, we're going to see that 100 is not there, because it's
exclusive, which means that this second argument is never included in this array. Okay,
finally, let's insert these random integer numbers in the new column that we created. So
we just have to do it the same way we did before.
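The random-integer step can be sketched as follows. The data frame is a made-up stand-in for df_exams; the randint arguments (1, 100, size 1000) follow the video:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_exams with 1000 rows.
df_exams = pd.DataFrame({"math score": np.zeros(1000, dtype=int)})

# low (1) is inclusive, high (100) is exclusive, so values fall in 1..99.
int_language_score = np.random.randint(1, 100, size=1000)

print(len(int_language_score))   # number of elements
print(int_language_score.min())  # never below 1
print(int_language_score.max())  # never above 99

# Overwrite (or create) the column with the random scores.
df_exams["language score"] = int_language_score
```

If 100 should be a possible score, the upper bound would have to be 101, because the upper bound itself is never drawn.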
So here, I copy, and now I paste it. So here, instead of assigning this language_score,
I'm going to use this int_language_score. So here, I'm going
to run this code. And as you can see here, we have the same column, and
now it holds random integer numbers from row zero to
row 999. So now, this new data looks more like real scores, because these are
random numbers, and they are between 1 and 99. And that's it. Now, one more little detail
I want to share with you is how to create random float numbers, because before we created
random integer numbers, but if for some reason you want to create random float numbers,
there is a way to do it with NumPy. So we only write np, then .random,
then .uniform, and the arguments are the same: the minimum value, then the maximum value,
then the size, which is 1000. Then you run this, and well, it's similar to the one we got before.
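A short sketch of the float variant, assuming the same bounds as in the integer example (the transcript only says the arguments are "the same"):

```python
import numpy as np

# Draw 1000 floats between 1 and 100 from a uniform distribution.
float_language_score = np.random.uniform(1, 100, size=1000)

print(float_language_score[:5])    # a few sample values
print(float_language_score.dtype)  # a floating-point dtype
```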
But now we have float numbers. And that's it. In this video, we learned different ways
to add a new column to a data frame. Alright, now it's time to see some operations we
can perform on data frames. So here we have the same data frame, df_exams. And here we
can apply some common operations to the numerical columns like math score, reading score, and
writing score. So let's see how to do this in pandas. So first, we're going to see how to
make operations in columns. So our first task is to calculate the total sum of a column. So let's
first pick our math score, and let's calculate the sum of this column. So to do that, we first have
to select a column. And as you might remember, to select a column, first we have to write the
name of the data frame, in this case df_exams, then we open square brackets
and write either single or double quotes, then we have to write the name of the column.
In this case, it's this one, math score. This is the column we want to select. And now
instead of just selecting, we're going to perform operations. So in this case, I want to calculate
the total sum of this column, and we have to use the sum method. So we write .sum with
parentheses. And this is how you calculate the total sum of this column. So to verify this,
we run this code, and here we get about 66,000. And this is the total sum of this math column.
Great. Now we can make some other calculations that you
would do in Excel. For example, we can calculate the number of rows using the count method.
So here, we can easily do that. I'm just going to copy this one. And now instead of writing the
sum method, we write count. So here, count, and
now let's see. So we see 1000 rows. And yeah, this is correct, because this data has 1000
rows. So now we can calculate the mean of this math score column. We have to copy this one, now
paste it. And instead of writing count, we have to write mean. And here we got the average value
of this math score column. So to get the average, we have to sum all the rows in this math score
column, and then divide by the total number of rows, in this case 1000. And this is how you
get this mean value. Then we can do other
operations using methods. So here, for
example, we can get the standard deviation by writing std. So we execute this, and the standard
deviation of this math score column is about 15. We can also get the maximum and minimum value.
Let's do it quickly here. So first, the max, and then the min value. You can actually do it
with Python built-in functions, but we can also do it with methods. So here I run it, and as you can
see here, the minimum value of the math score is zero, and the maximum is 100. Okay, now I'm
going to show you a quick way to make the same calculations using the describe method. I think
we saw the describe method in previous videos, but in case you don't remember it, I'm going to
write here the name of... actually, we only need the name of the data frame; we don't need the
name of a specific column, we only need the name of the data frame. And now we can use the describe
method. So write .describe with parentheses. Now we get a summary table with some
important statistical values. And here we have the count, the mean, the standard deviation,
the minimum and maximum value. And as you can see here, we get all of this with one method.
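The column operations above can be sketched on a small made-up frame. The four rows below are placeholders, not the course's 1000-row exam file:

```python
import pandas as pd

# Hypothetical four-row stand-in for df_exams.
df_exams = pd.DataFrame({
    "math score": [72, 69, 90, 47],
    "reading score": [72, 90, 95, 57],
    "writing score": [74, 88, 93, 44],
})

math = df_exams["math score"]  # select one column
print(math.sum())    # total of the column
print(math.count())  # number of non-missing rows
print(math.mean())   # sum divided by count
print(math.std())    # standard deviation
print(math.min(), math.max())

# describe() gathers count, mean, std, min, quartiles and max in one table.
summary = df_exams.describe()
print(summary)
```

Calling describe on the whole data frame summarizes every numerical column at once, so no column selection is needed.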
Okay, so far, so good. Now, instead of making operations in columns, we're going to learn how to
make operations in rows. So now let's calculate, let's say, the sum of the math score, reading
score and writing score. To do so we have to make some selections. And in this case, we have to
make some independent selections. So to show you, I'm going to copy the names of these three
columns. I copied them, and now I paste them here. Now we have our math score, reading
score and writing score. So now let me delete that sign. And now we have to make some
independent selections. So first, we write the name of the data frame, so df_exams. Now to make
the selection, we open square brackets and quotes. So now, let me do this quickly for the others.
Now here, I open square brackets. And now let me do it here too. And now it's ready.
So here we made some independent selections. In order to calculate the sum in a
row, we have to use the plus sign. So here, the plus operator, we have to write it here
and here. So basically, here, we're making a sum in each row. So to verify this, we run this
code. And as you can see here, we got the sum of the score columns. So here, let's quickly verify
the sum of the first row. It's 72 plus 72
plus 74. So 72 with 72 is 144, and with 74 is
218. So here we have it. It's correct. So now, let's do something else. So now instead of just
summing these three rows, or actually these three columns, what we're going to do is to calculate
the average, to get like an average score. So here, let me copy this. And here, we're going to calculate
the average by summing this and then dividing it by three. So this is how we calculate the average score.
And now let's assign this result to a new column. To do so we only write the equals sign. As
you might remember from previous lessons, we add a new column by writing
the name of this column. So we do that by writing the name of the data frame, and then
making a selection: we open square brackets, then open quotes, and here we write
the name of the column that we want to create. So this is the same as we did in previous lessons
where we added a new column. So in this case, I'm going to name this new column
average. And I'm going to execute this. Now let's verify that this new column was created.
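A sketch of the row-wise sum and the new average column, again on made-up rows (the first row matches the 72, 72, 74 example from the video). Rounding to two decimals, which the video applies next, is included at the end:

```python
import pandas as pd

# Hypothetical stand-in for df_exams.
df_exams = pd.DataFrame({
    "math score": [72, 69, 90],
    "reading score": [72, 90, 95],
    "writing score": [74, 88, 93],
})

# Adding the selected columns sums them row by row.
total = (df_exams["math score"]
         + df_exams["reading score"]
         + df_exams["writing score"])
print(total)

# The parentheses make sure the sum is computed before dividing by 3.
df_exams["average"] = total / 3

# Keep only two decimals in the displayed data frame.
df_exams = df_exams.round(2)
print(df_exams)
```

Without the intermediate variable, the division needs explicit parentheses around the three added columns; otherwise only the last column is divided by three.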
I'm going to show this data frame here below, and here is our data frame. So now, in the last
column, you can see that there is a column named average, and it has the average value of the math
score, reading score and writing score. Now here, we can control the number of decimals. We can
just use the round function and write the number of decimals we want to get. So in this case, I
want only two decimals. So I run this. And as you can see here, our data frame looks much better,
because we only have two decimals. And that's it. In this video, we learned different ways to make
operations in columns and rows on data frames. Alright, now let's have a look at the value_counts
method. So far, we have seen how to count the number of rows in a data frame. So for example, if
we want to count the number of rows in the gender column, we either use the len function, so we
write len, then the name of the
data frame, and we only have to add the
name of the column. So as you might remember, this gives us the number of rows. And we can also
use the count method. So here we write count, and
we get the number of rows. But what if we want to
count the gender elements by category, so female
or male? What if we want to know how many female
and how many male elements are in this gender column? So this is when the value_counts method comes
in handy. So we can use this method to count each category of the column. So to use this method,
we only have to write the name of the data frame, followed by the column that we want to count. So
in this case, it's the gender column. And then we have to use the value_counts method,
as you can see here. So now we execute this. And as you can see here, we have not only the
total rows in this gender column, but now it's divided by category. So we have that there are 518
females and 482 males. So this is how the data is spread in the gender column. Now we can do
more with the value_counts method. So we can get the percentage that each category represents in
the whole column. So here, I'm going to copy this. And now to calculate the percentages, also known
as relative frequencies, we have to add an argument named normalize. So we write normalize
equal to True. And then we execute this. As we can see here, female represents 51.8% of
the total observations in the gender column, while male represents 48.2% of the total
observations. So as you can see here, the value_counts method is useful when you want
to have a look at the data by category. Okay, now let's see another example. And in this case,
let's pick a different column. So here, I'm going to choose this parental level of education
column. I copy this. And now let's
count the elements by category. So here,
I'm going to write the name of the data frame, df_exams. Now I open square
brackets and quotes, and here I paste this column. Now to count the elements by category in
this column, we use the value_counts method. So we run this code. And here you can see
how the data is divided in this column. So most people have some college level of education, while
just a few people have a master's degree. And now if we want to get the percentage that
each category represents, we again use the normalize argument. So we write normalize equal to True.
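The value_counts calls described in this part of the video can be sketched on a tiny made-up gender column (five rows instead of the course's 1000):

```python
import pandas as pd

# Hypothetical stand-in for df_exams with only a gender column.
df_exams = pd.DataFrame(
    {"gender": ["female", "female", "male", "female", "male"]}
)

# Absolute counts per category.
counts = df_exams["gender"].value_counts()
print(counts)

# Relative frequencies (shares of the total), rounded to two decimals.
shares = df_exams["gender"].value_counts(normalize=True).round(2)
print(shares)
```

With normalize=True the result holds fractions between 0 and 1, so 0.6 here means 60% of the rows.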
And now we get the percentages. So we can see the percentages. If we want to
round these to two decimals, we use the round method. So we write .round, parentheses,
and inside, two decimals. And as you can see here, it's rounded to two decimals. And that's it. Now
you know how to use the value_counts method. Okay, in this video, we're going
to see how to sort a data frame using the sort_values method. First,
let's import and read the CSV file that we've been working with in this tutorial. And now
let's show the data frame. So here we have the data frame. As you might remember, it has
these three numerical columns. And now I'm going to sort it using one of these columns. So let's
use the sort_values method. And first, I'm going to write the name of the data frame,
which is df_exams, and then write .sort_values. Now I open parentheses.
And now I can use this help here. And as you can see, the only mandatory argument is by. So we can
use this one, by, and in this one we have to specify the name of the column we want to sort by.
So in this case, I want to sort by the math score. So I'm choosing this numerical column
to start with. So I'm going to write math score; actually, I'm going to copy this one and paste
it here. So by math score. And sorting this data frame is as simple as that. Now, we can run this
code. As you can see here, the data frame was sorted ascending by default. So it starts with
zero, and it ends with 100 in the math score. So this is how the sort_values method
behaves by default. And here one little detail: you don't need to specify the by keyword; we can
omit it. And we run this, and as you can see here, it still works. So here we can modify that default
behavior of the sort_values method. We only have to add a new argument, and it's the
ascending argument. So let me show you here. I'm going to copy this one first, and show you here.
So in this case, we're going to sort descending by the same column, so we only write a comma,
and then we specify the ascending argument. We write ascending equal to, and here, I want to
show you something in this little help here: ascending is set to True by default.
This means that it's ascending by default, but we can change this default behavior by setting
ascending equal to False. And that's what we're going to do here: ascending equal to False, so it
means descending. And now I'm gonna run this one, and as you can see here, it's sorted descending by the
math score column. So here, it starts with 100, and it ends with zero. But that's not all; we can
do much more with the sort_values method. So first, I'm going to show you here how to
sort by two different columns. So here, let's copy and paste this one. So in this case, we're
going to sort descending by multiple columns. So instead of writing only math score, we're going
to add here one more column, and it's going to be the reading score column. So here, I copy this one.
I'm gonna copy and paste it here. But first, we have to add the square brackets,
because as you might remember, when we write two or more columns, we need the square
brackets. Now I write a comma, and I paste this reading score. Now I add quotes. And that's
it. That's everything you have to do to sort by multiple columns. Now I'm going to run this
one. As you can see here, it was sorted descending, first by the math score column, and then by
the reading score column. So the priorities are set here in the list that we include here.
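The three sort_values calls walked through so far can be sketched like this, using four made-up rows in place of the exam file:

```python
import pandas as pd

# Hypothetical stand-in for df_exams.
df_exams = pd.DataFrame({
    "math score": [72, 69, 90, 90],
    "reading score": [72, 90, 95, 80],
})

# Ascending is the default; "by" can be passed by name or by position.
asc = df_exams.sort_values(by="math score")

# ascending=False sorts from the highest value to the lowest.
desc = df_exams.sort_values("math score", ascending=False)

# A list of columns sets the priority: math score first, then reading score.
multi = df_exams.sort_values(["math score", "reading score"], ascending=False)

print(asc, desc, multi, sep="\n\n")
```

Each call returns a new, sorted data frame; df_exams itself keeps its original row order.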
So first is the math score column, first priority, and the second priority is the reading score
column. And that's what you can see here. Now I'm going to show you a little detail here.
Let me copy df_exams. And if I print this one, you can see that the changes we
made weren't saved. So here, the math score column has the original values. This happens
because the sort_values method, like many other pandas methods, only creates a copy of
the data frame. So here we obtained a copy. This one is a copy, but it doesn't update the values
of the data frame unless we add a new argument, which is the inplace argument. So I'm going to
show you here. But first, I'm going to delete this df_exams. And now I'm going
to copy this one, and show you how to update the values of this data frame. So here, I'm going
to copy; those are the same values. But now I'm going to add a new argument, which is the
inplace argument. So here we write inplace equal to, and now I'm going to show
you the default value. So here, the default value of inplace is False.
This means don't update the data frame, but only create a copy. But if we set it to
True, it means update this data frame. So here, I'm going to set it to True to update the data
frame. So here I write True. And now I run this, and apparently nothing happens. But if now we
print the df_exams data frame, we're going to see that we have the data frame
sorted. In case you don't want to add the inplace argument, and you want to update the values
of the data frame, you have another option that we used before, which is overwriting the values
of this data frame. So for example, you can just delete the inplace argument and write df_exams
equal to this. So this is overwriting the values. But in this case, we're not going to do
that; we're going to add the inplace argument, as you can see here. Finally, we're gonna see how
to sort, but now not with numerical data but with text. So as you can see here, before, we sorted
this data frame by the math score column, and this one has numerical data. But in this case,
we're going to sort it by race ethnicity, which has text, so we're gonna sort
this one. So first, we're supposed to get group A, and then group B, C, D, and so on.
So let's do this here. I'm going to scroll down. And first, we have to write the name of the
data frame, followed by the sort_values method. And now specify the name of the
column. So here, I'm going to copy race ethnicity; here, let me copy it here, and it's done. Now I
have the name of the column. I'm going to set ascending to True. And now the new argument
we have to add to sort this is key. So add key, then equal to, and in this case, we're
gonna use a lambda function. I'm not sure if you're familiar with lambda functions, but they
work similarly to the regular functions we've seen before in the Python Crash Course. But in this
case, it's going to behave a little bit differently. So let me show you here. First, you have to use
the lambda keyword, so we write only lambda. And now we should write the argument that the function
receives. In this case, I'm going to write col, which stands for column. And then we have to write
a colon and specify the operation we have to perform on this variable. So in this
case, I want to write col and then access the str attribute. So I
write .str, and then use the lower method. So what we're saying here is get the string values
of the column and then transform them to lowercase. So here, we get the text data in
lowercase. And with these three arguments, we're saying: sort the values inside the race
ethnicity column, sort them ascending, and
sort the text data of this column in lowercase.
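The inplace behavior and the key argument can be sketched together. The rows are made up, and the column label "race/ethnicity" is an assumption about how the column is spelled in the file (the transcript only says "race ethnicity"). The key argument requires pandas 1.1 or newer:

```python
import pandas as pd

# Hypothetical stand-in for df_exams; the column label is an assumption.
df_exams = pd.DataFrame({
    "race/ethnicity": ["group B", "group C", "group A", "group D"],
    "math score": [72, 69, 90, 47],
})

# inplace=True modifies df_exams itself and returns None.
df_exams.sort_values("math score", ascending=False, inplace=True)
print(df_exams)

# Equivalent without inplace: overwrite the variable with the sorted copy.
# df_exams = df_exams.sort_values("math score", ascending=False)

# key receives the whole column (a Series) and returns the values to
# compare; here the text is lowercased only for the comparison.
by_group = df_exams.sort_values(
    "race/ethnicity",
    ascending=True,
    key=lambda col: col.str.lower(),
)
print(by_group)
```

The lowercase conversion affects only the ordering. The values shown in the sorted data frame keep their original capitalization.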
So here, we have this A, B, C, D, E in uppercase, but they are going to be compared in lowercase and sorted
by this text data. So now let's run this one. And let's see the results. So as you can see
here, we have this race ethnicity column, and it's ordered ascending. So here, we got
A and B and C and D, and so on. And that's it. These are the different ways to sort a data
frame using the sort_values method. Welcome back. In this video, we're going to
see different ways to make pivot tables. If you're an Excel user, you have probably
made many pivot tables in the past. In pandas, we can also make pivot tables. And
in this case, we use two different methods, the pivot method and the pivot_table
method. In this video, we're going to see the difference between the two of them. So
first, let's see what the pivot method is. So the pivot method reshapes data based on
column values, and it doesn't support data aggregation. So this means that this is not the
regular pivot table you'll see in Excel, because you can only reshape data with the pivot method, and
you cannot do anything else. To better explain what the pivot method does, I'm going to show you
an example. So here we have a little data frame. And this one has six rows and four columns. As
you can see here, there are many duplicate values. For example, in the column foo, the value one is
repeated at least twice, and the same goes for the value two. Also in the column bar, you
can see that A, B and C are duplicated. So when we have this type of data frame, we can
reshape it to have a different view and to make a better analysis. In this case, we can use the
pivot method. As I'm going to show you right now, you only have to write the name of the
data frame, followed by the pivot method, and then specify three arguments. So the first
one is the index. In this case, I'm going to reshape this data frame to set the column
foo as the index. This means that the column foo will be in the position where right now we have
the numbers from zero to five, on the left. Next, you have to define the columns. So these
are the new columns that we're going to see in our new data frame, the one that we're going to
reshape. So in this case, I'm selecting the data inside the bar column as new columns. This means
that A, B and C will be the new columns in our new data frame. And finally, we have to choose the
values we wish to show in this new data frame. So in this case, I'm choosing the baz column.
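The call described so far can be sketched as below. The data frame on screen is not reproduced in the transcript, so this six-row, four-column frame is a reconstruction from the spoken description (foo with "one" and "two", bar with A, B, C, baz with 1 to 6); the fourth column, called zoo here, is an assumption:

```python
import pandas as pd

# Reconstructed example frame; the "zoo" column is a hypothetical filler.
df = pd.DataFrame({
    "foo": ["one", "one", "one", "two", "two", "two"],
    "bar": ["A", "B", "C", "A", "B", "C"],
    "baz": [1, 2, 3, 4, 5, 6],
    "zoo": ["x", "y", "z", "q", "w", "t"],
})

# foo becomes the row index, the values of bar become the columns,
# and each cell holds the baz value of the matching (foo, bar) pair.
reshaped = df.pivot(index="foo", columns="bar", values="baz")
print(reshaped)
```

pivot only rearranges existing values. If a (foo, bar) pair appeared more than once, pivot would raise an error instead of aggregating.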
So all the values inside it will be shown in our new data frame. So this is the column
that I'm selecting. And now I'm going to show you the result of this pivot method. So here
it is. And as you can see here, we have foo in the index, as I told you before, and A,
B, and C, which are data from the bar column, are now
the columns in this new data frame. Also, the
data inside the baz column is the only data that is displayed in this reshaped data frame. And
now let's see why it is arranged this way. So why one is here, two is here, three is here, and so on. So
here, the value is defined by the index, or row, and
the column. So the value between index one and column
A is 1. And why does that happen? Because if we go to
our previous data frame, the original data
frame that is here, we can find that here is one,
A, and the value that corresponds to that pair
is the number 1. So let's pick another one. For example, 5: here, we have two and B.
And if we go here to our original data frame, we have two and B, and the value that corresponds to
that pair is 5. So that's why this value is here. And that's how this new data
frame was reshaped. Okay. And finally, we have the pivot_table method.
And this one creates a spreadsheet-style pivot table. So this is similar to the pivot table
that we will find in Microsoft Excel, for example. And this one supports data aggregation. To explain
more about the pivot_table method, as well as the pivot method, we're going to see
some examples in the next video. And this time, we're going to write some code so you can
understand much better what we're doing. Alright, now it's time to see how the pivot method
works in action in pandas. So first, as usual, we import pandas as pd. So here, I import this
library, and then we're going to use a different data set to work with this pivot method. So to read
this data set, we use the pd.read_csv method. And inside parentheses, we write the
name of this data set. So in this case it's GDP.csv,
which you can find in the notes of this
video. So this is the new data set. And now let's have a look. I'm going to run this one. And as you
can see here, we have data about GDP per capita in this column. And basically, this is
how the GDP grew over the years for each country. So here, I'm gonna tell you which are the
columns we're going to use for this example. So first, we're going to use the country column
that contains data about different countries, then we're going to use the year column that,
well, contains different years, and the GDP per capita, which is in this column. So
basically, what we want to do in this exercise is to obtain a different view of our original
data set. So this data set that we're reading here with pandas has this view, but we want to
get a different view to make a better analysis. So the goal of this exercise is to see the
evolution of the GDP per capita over the years for each country. And then we're going to put
the country names in the columns. So the only data we're going to show in our new data frame is
going to be the GDP per capita that is here. So I want to show you this now with code, and
let's write it here. But first, let's assign this data frame to a variable. So here, I'm going
to write df_gdp. So this is the name of my data frame. And now I'm going to show
it, and it's here. So now I'm going to copy this data frame. And to use the pivot method,
I'm going to paste this one, and now write .pivot. Now, we open parentheses. And now,
as you might remember from the previous video, we have to introduce three different arguments.
And if you don't remember the three different arguments we have to introduce here, you can just
press the Shift and Tab keys on your keyboard, and you will get this. And here, you can see
the three arguments I'm talking about. So first, we have to write the index argument.
So write index. And, as I told you before, I want the year column to be the index of my new
reshaped data frame. So I'm going to set this year as the index of my new data frame. So I write
here, year. Next, we write a comma, and press Shift
and Tab to show this. So the second argument is
columns. So we write columns, then equals, and open quotes. So here, as I told you before, I want
the countries listed here in the country column; I want each country to be an independent column.
So for example, here, let's say we have the United States. So I want the United States to
be column number one, then column number two, China, then Australia, then Spain, and so on. So
each country should have one independent column. So that's what we want. And to get that we have
to set the country column here in the columns argument. So here, country, and that's it. Now,
again, Shift plus Tab to show this window here. And now the third argument is values. So here,
I'm going to write values equal to, and open quotes. And here, the only data I want to show
in my new data frame is going to be the GDP per capita, which is the one that is here. And
now I'm going to copy this one and paste it here. So remember our goal. Our goal is to
see the evolution of the GDP per capita over the years for all the countries listed here
in this column. So here, we're going to execute this code and let's see the result. So here Ctrl
Enter, and as you can see here, I have the new view of this data frame, and it looks much better.
It's more readable, because we can see the GDP evolution over the years for each country.
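A sketch of the same reshaping on a tiny made-up frame instead of the course's CSV file. The column labels ("country", "year", "gdp_per_capita") are assumptions about the file's headers, and the numbers are placeholders, not real GDP figures:

```python
import pandas as pd

# Hypothetical long-format data: one row per country and year.
df_gdp = pd.DataFrame({
    "country": ["Spain", "Spain", "China", "China"],
    "year": [2000, 2010, 2000, 2010],
    "gdp_per_capita": [100.0, 200.0, 300.0, 400.0],  # placeholder values
})

# Years become the row index, each country becomes its own column,
# and the cells hold the GDP per capita for that country and year.
gdp_by_year = df_gdp.pivot(index="year", columns="country",
                           values="gdp_per_capita")
print(gdp_by_year)
```

With the real file, the data frame would come from pd.read_csv with the file name given in the video, followed by the same pivot call with the file's actual column names.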
So now let's verify that everything is correct. So here we have index equal to year, and here we have
the year as index. So everything is fine. Then the columns should be country. And now we have each
country in the columns. So it's correct. Next, the values are the GDP per capita. And yeah,
we have here that the intersection between a row and a column is the value that corresponds to
the GDP per capita of that country in that year, so everything is working fine. And there you have
it. This is how the pivot method works in pandas. Okay, now let's see how the pivot_table
method works in pandas. So in this case, we're going to work with a different data
set. And to read it, we're going to use the method pd.read_excel, because in
this case, the data set is not a CSV file, but an Excel file. So we use read_excel
for an Excel file. So in this case, the name of this data set is supermarket_sales.xlsx.
And this is what we're going to see after we run this. And
here, you can see that we have different columns about what a specific person bought in a
supermarket. And here, well, we have the branch, the city, the gender and different data. So here,
to make a pivot table, we're going to first name this data frame, and I'm going to name it df_sales.
Now, I'm going to show it here. And okay, now it's here. Okay, the goal of this
task is to see how much females and males spent
in this supermarket. So to do
that, we're going to use the pivot_table method in pandas. So first, I'm going to copy this
data frame. And now I'm going to paste it here. And now we're going to make a pivot table and
add an aggregate function. Because remember that the pivot_table method allows us
to add an aggregate function, and the pivot method doesn't support that. So we're gonna
use pivot_table this time. And now we're gonna introduce some important
arguments. So the first one is the index. And in this case, if we want to see how much
males and females spent in this supermarket, the index is going to be the gender. So here, I'm
going to copy gender here. And it's going to be here, index equal to gender. So this is the first
necessary argument. And the second one is going to be the aggregate function. So we have to write
aggfunc, and then equal to, and then write the aggregate function we want
to perform. So in this case, it's going to be a sum. So we write sum, and now everything
is ready. So what we're supposed to get here is the information about the sales here in
this data frame, but now divided by gender. So we have the female category, and then the
male category. So let's verify this. I'm going to run this one. And as you can see here,
we have this summary table, or pivot table, and now it's divided by gender. So we can see how
much females spent here in the total column, and also how much males spent, also in the
total column. And here in the quantity column, we can see how
many products females and males bought in this supermarket. And one detail you might have
noticed is that only the columns that contain numerical data are displayed here. So for
example, here, branch and city, which contain only text, aren't here in this pivot table,
because here in the aggregate function argument we indicated that we want to sum, and when we sum
values, we cannot sum text, but only numerical data. So only the columns that have numerical
data are displayed in this new pivot table. Okay. That's our first pivot table. And we can do
even more. For example, we can select a pair of   columns that we're interested in. So let's say we 
only care about the quantity and the total column.   So we want only those columns. So we can get that, 
I'm going to copy this one. And to show you how to   get only those two columns, I'm going to add 
a new argument. So here, I'm going to write,   in this case, the name of the argument is 
values. So I read values equal to, in this case,   I'm going to select the quantity and the total 
columns. So I open square brackets, because I'm   going to select two or more columns. In inside, I 
write the name of the columns. So first quantity,   right here, and then total. So here, too, so we're 
going to get the same pivot table, but in this   case, only the quantity and the total columns 
are going to be shown in this table. So I'm   going to execute this one in here, I get an error, 
because I didn't include this comment. So I'm   going to add it here. And now everything should be 
fine. And yeah, we got the same pivot table, but   only the quantity and total columns are displayed 
here. And here, we can clearly see that female   spent more than male in this supermarket. 
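The call described above can be sketched on a tiny made-up data frame. The column names (Gender, Branch, Quantity, Total) and all the numbers are assumptions for illustration; they are not taken from the course's supermarket file.

```python
import pandas as pd

# Hypothetical stand-in for the supermarket data; names and numbers are invented.
df_sales = pd.DataFrame({
    "Gender": ["Female", "Male", "Female", "Male"],
    "Branch": ["A", "B", "A", "C"],
    "Quantity": [7, 5, 8, 6],
    "Total": [548.97, 80.22, 340.53, 489.05],
})

# One row per gender; only the Quantity and Total columns are summed.
summary = df_sales.pivot_table(
    index="Gender",
    values=["Quantity", "Total"],
    aggfunc="sum",
)
print(summary)
```

Listing the columns in values also keeps text columns such as Branch out of the result explicitly, instead of relying on how a particular pandas version handles them.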
But we can get even more detail here. So far, we know that female customers spent 167,000 in this
supermarket. But with pivot tables, we can even know on which product lines this money was spent. So
let me show you. Here we can see how the money is spent in this Product line column, so we only have to
add a new argument to this pivot_table method. First, we copy this, and now I'm going to paste it here.
We're going to make a pivot table that shows how much male and female customers spent in each category,
or, well, product line. So we add a new argument, and this one is going to be the columns argument. I
write columns, then I add the comma and open quotes, and here I write the name of this column, that is
Product line. So I scroll up, I copy this column name, and then we're gonna see in which category they
spent the money: health and beauty, or sports, and so on. Now I scroll down, and here I paste it. And
before I run this code, here we only want to display the Total, because we only want to see where the
money goes, not the quantity. So only Total. I delete the square brackets too, and with Total we're
gonna see where the money goes, divided by gender. So here I run, because it's ready. And now, as you
can see here, we can see how much female and male customers spent in each product line. So we can
quickly see, for example, that female customers spent more money on fashion accessories than male
customers, and that kind of makes sense. And also in sports, female customers spent more money than
male customers. So we can easily see all of that by using the pivot_table method in pandas. And this is
similar to the pivot table you will find in Excel. And that's it. That's
how you make a pivot table in pandas.

Alright, before showing you how to make visualizations with pandas, first we have to check the dataset,
and we also have to make a pivot table, so we can easily make the plots with pandas later. So first, we
have to import pandas to read this CSV file. I have this import pandas as pd, so we just run this code.
And now let's read this new dataset. As you might remember, to read a CSV file, we have to use the
read_csv function. So we write pd.read_csv, and then we write the name of the CSV file. In this case,
the name is population, and I'm going to use this population_total.csv, so I press Tab to autocomplete
the name. So now we have the name. And now I'm going to assign this to a new variable. The variable is
going to be df_population_raw, so this is raw data. And now we're gonna have a first look at this
dataset. So I paste this, and now I'm going to run these two. And now we have this data frame. Here, as
you can see, we have the population of many countries throughout the years. So for example, we have
China here, United States, and India, so we have their population. And here I wrote the name raw
because this dataset was extracted using some web scraping techniques, and then it wasn't modified. So
now we have to make some changes to reshape this data frame, so we make it easy for us to make
visualizations with pandas later. What we have to do here is to make a pivot table to reshape this data
frame, and that's what we're gonna do here below. So we're gonna make a pivot table, and we're going to
use the pivot method. As you might remember, the pivot method returns a reshaped data frame organized
by given index and column values, but it's a pivot without aggregation. So this is what we want. We
only want to reshape this data frame.
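To make the "pivot without aggregation" point concrete, here is a small sketch that contrasts the two methods. All rows, numbers and column names are invented for illustration. pivot_table adds up repeated (Gender, Product line) pairs, which is what the product-line table earlier relied on, while pivot only rearranges rows that are already unique.

```python
import pandas as pd

# Hypothetical sales rows; Female/Sports appears twice on purpose.
sales = pd.DataFrame({
    "Gender": ["Female", "Female", "Male", "Male", "Female"],
    "Product line": ["Sports", "Fashion", "Sports", "Fashion", "Sports"],
    "Total": [100.0, 50.0, 70.0, 20.0, 30.0],
})

# pivot_table aggregates: the two Female/Sports rows are summed into one cell.
by_line = sales.pivot_table(
    index="Gender", columns="Product line", values="Total", aggfunc="sum"
)

# Hypothetical long-format population rows (made-up numbers).
population = pd.DataFrame({
    "country": ["China", "China", "India", "India"],
    "year": [2010, 2020, 2010, 2020],
    "population": [1340, 1440, 1230, 1380],
})

# pivot only reshapes: each (year, country) pair must appear exactly once.
wide = population.pivot(index="year", columns="country", values="population")

print(by_line)
print(wide)
```

If the population rows contained the same (year, country) pair twice, pivot would raise an error instead of combining them, which is why it suits data that only needs reshaping.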
So we're going to start by dropping null values. We do that by writing the name of the data frame, so
I'm gonna just copy the name and paste it here. And now, to drop null values, we have to use the dropna
method. So I write .dropna(), and then we have to run this. And as you can see here, we have the
result, which is a copy of this data frame. But if we want to save the changes that we make to that
data frame, we have two options.
The first option is to use the inplace argument. So I write inplace and set this to True. If we do this
and we run, all the changes that we make to the data frame are going to be saved. And the second option
is to overwrite the content inside this data frame. So we do something like this: we write
df_population_raw is equal to the same data frame, but with .dropna(), so we're overwriting the content
inside this data frame. I'm gonna choose the first option just to reduce some code. So I write inplace
equal to True, and now I run, and this new data frame shouldn't have any null values.

Okay, now it's time to make this pivot table. First, I'm going to show you what I'm going to do, so we
have a better idea before writing the code. Here we have the original data frame, and what we're going
to do is to reshape this data frame. I want the year to be in the index. So the year column, I want it
to be here in the index instead of 0, 1, and so on. And then I want the values inside the country
column to be here in the columns. So for example, I want China here in one column, then United States
in another column, and then India in another column. And I want the population data to be the only data
here. To do that, we have to use the pivot method, and that's what we're going to do here below. So
first, we have to write the name of the data frame, which is this one, and then write .pivot, then we
open parentheses. Let's see the arguments that this pivot method accepts. I press Shift and Tab to get
this help box, let's call it a cheat sheet. And now we have the arguments that this pivot method
accepts: first is the index, then the columns, and then the values. As I told you before, I want the
index to be the year column. So we have to write index equal to, open quotes, and I write year, then
comma, and let's check another argument. The next argument is columns. I want the columns to be the
country, so the data inside the country column. So here I write columns, then I open quotes, and here I
write country. And now the last one, I think, is values. And yeah, it's values. I want the values to be
the population data. So let me see if that's correct. And yeah, it's here.
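The steps described so far (reading the CSV, dropping null values, and reshaping with the three pivot arguments) can be sketched end to end. This is a minimal sketch under assumptions: the inline text stands in for population_total.csv, the rows are invented, and the lowercase column names country, year and population are assumed from the spoken argument values. The name reshaped is a placeholder.

```python
import io
import pandas as pd

# Stand-in for population_total.csv (hypothetical rows; one population value is missing).
csv_text = """country,year,population
China,2010,1340
China,2020,1440
India,2010,1230
India,2020,
"""
# The course passes the file name to read_csv; a text buffer is used here instead.
df_population_raw = pd.read_csv(io.StringIO(csv_text))

# Option 1: drop rows that contain null values and keep the change in place.
df_population_raw.dropna(inplace=True)
# Option 2 (equivalent): df_population_raw = df_population_raw.dropna()

# Reshape without aggregating: years become the index, countries become the columns.
reshaped = df_population_raw.pivot(index="year", columns="country", values="population")
print(reshaped)
```

Because the India 2020 row was dropped, that cell shows up as NaN in the reshaped frame; pivot does not fill gaps in the data.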
So population, and I'm going to press Enter here, so it looks much better. And our population, it's
here. So I have the three arguments: the index, the columns and the values. Now I'm going to reshape my
original data frame. So here I press Ctrl+Enter. Now, as you can see here, we have the countries in the
columns. So here we have many countries, from the first country, Afghanistan, to Andorra, Argentina,
Uruguay, and many other countries. We also have the year here, from 1955 to 2020. So we can see here
the evolution of the population throughout the years for all the countries in this dataset. But as you
can see, there are many countries. So what we can do here is to select just some countries, so we can
simplify our visualizations later in pandas. So here I'm going to select some columns. But first, I'm
going to name this new data frame, I'm going to give it a name.
So I'm going to name it df_pivot. This is my new data frame. Now I'm going to rearrange this, and now
it looks much better. So now I'm going to run this. And now let's select some countries. I copy this
pivot data frame, and now we open double square brackets to select two or more columns. And here, let's
write some countries: first United States, then, let's say, India, then China, and two more countries,
Indonesia and, last but not least, Brazil. So here we have the five countries. I run this, and we have
these five countries and their population from 1955 to 2020. So great, now we've simplified this data
frame. And now I'm going to overwrite the content inside that data frame, df_pivot. I'm going to write
here df_pivot equal to df_pivot with the selection, so I'm overwriting the content. I press Ctrl+Enter,
and our new df_pivot is here. And now I'm going to show it to you. This is our new df_pivot data frame.
And that's it. Now our data is ready, so we can use it to make three visualizations with pandas.
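The column selection with double square brackets can be sketched as follows. The wide frame here is a hypothetical miniature with invented values, not the course data.

```python
import pandas as pd

# Hypothetical wide frame: years in the index, one column per country (values invented).
df_pivot = pd.DataFrame(
    {"Brazil": [1, 2], "China": [3, 4], "France": [5, 6], "India": [7, 8]},
    index=pd.Index([2010, 2020], name="year"),
)

# Double square brackets: the inner list names the columns to keep, in that order.
# Assigning back to df_pivot overwrites the original content.
df_pivot = df_pivot[["India", "China", "Brazil"]]
print(df_pivot)
```

A single pair of brackets with one name would return a Series; the inner list is what keeps the result a data frame.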
And that's what we're gonna do in the next video.

Okay, now it's time to make some visualizations with pandas. Here I have the data frame that we
created. This is the pivot table we created in the previous video. And as you can see here, we have
five countries in the columns, and we have the year in the index, from 1955 to 2020. So what we're
going to do now is to make our first visualization. I scroll down here, and the first one is going to
be a line plot. To make this visualization, I'm gonna copy the name of the data frame and paste it
here. Now, to make plots with pandas, we have to use the plot method. So we write .plot, and now I open
parentheses. The main argument we need to introduce is the kind argument. So I write kind equal to, and
here I have to write the kind of plot we want to make. In this case, it's a line plot, so we write
line. And this is actually the only argument we have to introduce here. Now we can run this code, so I
press Ctrl+Enter. As you can see here, I have the line plot. In this line plot, we can quickly see the
evolution of the population throughout the years. For example, China and India, which are the green and
orange lines, had fast-growing populations, while the United States, Indonesia and Brazil have lower
populations, and their population didn't change so much in the past 50 years. Here we can add more
arguments to this plot method to customize this line plot.
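A minimal sketch of the basic line plot, assuming a small stand-in for df_pivot with invented numbers. The Agg backend line is an assumption that lets the sketch run outside a notebook; inside Jupyter it can be left out.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import pandas as pd

# Hypothetical miniature of df_pivot (values invented).
df_pivot = pd.DataFrame(
    {"China": [1000, 1200, 1400], "India": [700, 1000, 1300]},
    index=pd.Index([1980, 2000, 2020], name="year"),
)

# One line per column, with the index (year) on the x axis.
ax = df_pivot.plot(kind="line")
```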
So here we can introduce another argument, which is xlabel. This x label is what you can see here. When
we created this line plot, by default it was assigned this year label, but we can change it. So for
example, let's say we want to write Year, but now with a capital letter, so we write it here. And now
let's say we want to add a new label here on the y axis. Here we only have to write ylabel, then equal
to, open quotes, and here we have to write the name we want. In this case, I'm going to write only
Population. And finally, we can also add a title. We can add any title we want. I'm going to write the
name of the argument first, title, then equal to, and then the title is going to be, let's say,
Population from 1955 to 2020. So this is the title. Let's run this. And as you can see here, we got the
title, Population 1955 to 2020, and the x label and y label were modified too. Finally, we can add one
more argument. In this case, the argument is the size of the figure. To change the size of this figure,
we can add the argument named figsize, and this is a tuple, so we have to open parentheses. To set the
size, we have to add two values: the first is the width, along the x axis, and the second is the
height, along the y axis. In this case, I'm going to set it to 8 and then 4, which means that the x
axis is going to be long, while the y axis is going to be short. So here I'm going to run this code,
and let's check it out. So here the figure has a different size. And that's how you can customize this
line plot.
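The customizations above can be sketched in one call. The data is a hypothetical miniature of df_pivot with invented numbers, and figsize=(8, 4) is an assumed reading of the values used in the video (wide and short). The xlabel and ylabel arguments of plot need a reasonably recent pandas (1.1 or later).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import pandas as pd

# Hypothetical miniature of df_pivot (values invented).
df_pivot = pd.DataFrame(
    {"China": [600, 1150, 1440], "India": [400, 870, 1380]},
    index=pd.Index([1955, 1990, 2020], name="year"),
)

ax = df_pivot.plot(
    kind="line",
    xlabel="Year",                          # replaces the default label taken from the index name
    ylabel="Population",
    title="Population from 1955 to 2020",
    figsize=(8, 4),                         # (width, height) in inches; assumed values
)
```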
Okay, now let's make a bar plot with pandas. The first thing we have to do is to select only one year,
since this bar plot shows only one year, and we can plot the population of different countries. So
let's select one year of the data frame we had before. I'm going to copy the name of the data frame, so
you can check it out again. This is the data frame, and we're going to select one year. To do that, we
have to use the index attribute and then the isin method. So first, I'm going to show you the index
attribute, in case you don't remember.
So here the index, sorry, again, the index attribute allows us to see all the index values in this data
frame. So we have here from 1955 to 2020. That's what the index attribute does. And now, if we use the
isin method, we can filter the index. So here, let's say we want to select only 2020. I copy 2020 and
write it here, and first I'm going to make the selection. So it's here. And now I'm going to show you
the result. Here I press Ctrl+Enter, and the result is this little data frame that only contains the
population in the year 2020. This is important because the bar plot is supposed to show only the
population in this year. So here we have it. And now what we're going to do is to name this data frame.
So here we write equal to, and then let's give it a name. I'm going to name it df_pivot_2020. Here I
press Ctrl+Enter, and now I'm going to show this new data frame. And here, one little detail I have to
tell you is that when we make a bar plot, we have to put the text data in the index. So here, the names
of the countries should be in the index. To do that, we have to transpose the data frame. Transposing
allows us to switch rows and columns. We can easily do that by writing the name of the data frame and
then .T. So if we now run this code, we can see that the year 2020 is in the columns and not in the
index anymore, and the country names are in the index. This is the format we need to have before making
the bar plot. So now I'm going to overwrite the content in this data frame. I write df_pivot_2020 equal
to this same data frame, but with .T. So here I run this, and now it's time to make the bar plot. I
copy the name of the data frame, and now I use the plot method. So I write .plot again, open
parentheses, and the first argument is kind. I open quotes, and we write bar. So now it's ready, and we
can run it. As you can see, we have a basic bar plot. And it has some default values, like the name of
this x label.
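The selection, transpose and basic bar plot steps can be sketched like this. The frame is a hypothetical miniature of df_pivot with invented values, and the Agg backend line is only there so the sketch runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import pandas as pd

# Hypothetical miniature of df_pivot (values invented).
df_pivot = pd.DataFrame(
    {"China": [600, 1150, 1440], "India": [400, 870, 1380]},
    index=pd.Index([1955, 1990, 2020], name="year"),
)

# isin builds a True/False mask over the index; only the 2020 row is kept.
df_pivot_2020 = df_pivot[df_pivot.index.isin([2020])]

# .T swaps rows and columns, so the country names move into the index.
df_pivot_2020 = df_pivot_2020.T

# One bar per country.
ax = df_pivot_2020.plot(kind="bar")
```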
And also the default color is blue. We can customize this bar plot a bit more. For example, I want a
different color. So I write the color argument, then open quotes, and let's say I want it to be orange,
so I write orange. And also, we can change the x and y labels. Actually, I can copy this from here to
save some time. So xlabel and ylabel are here, and let's paste them here. And finally, I can also add
the title, which was here, so I copy and paste it. But in this case, the title is a bit different,
because it's not from 1955 to 2020, but only 2020. So here I have only 2020. And now let's run this to
see the results. You can see here we have the title, the x and y labels, and the bar plot is in orange.
So that's how you customize the bar plot.

Alright, so far, so good. Now let's go one step further by making bar plots grouped by n variables.
Here we have to select a group of years to make these grouped bar plots. So I'm going to copy the code
we used before to select only the year 2020, but in this case I'm not going to select only one year,
but a group of years. So let me show you here. I'm going to show you the pivot table again, so you can
easily understand. Instead of choosing only 2020, I'm going to choose some other years here. So I'm
going to delete this, and I'm going to write them here: let's say 1980, 1990, then 2000, then 2010, and
finally 2020. So we have a group of years here, and we're selecting them using the index attribute and
the isin method. Here I'm going to give it a different name. In this case, since it's a sample, I'm
going to write df_pivot_sample. First I'm going to show you this one, so you can see what it looks
like. So now we have five countries and five years. Now I'm going to assign this to my data frame,
df_pivot_sample. I run this, and now we have this new data frame. So it's time to make this grouped bar
plot.
So here we write the name of the data frame, and then the plot method. So I write .plot, and now let's
add the first argument, which is kind, equal to bar. Now we run this. And as you can see here, we have
the bar plots grouped by year. So here's 1980, then 1990, and so on. And you can also add the same
arguments we added before. For example, I can add the x and y labels, so I can do it here. I'm gonna do
it fast. So here I run. And as you can see here, we modified the x and y labels. And that's it.
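A sketch of the grouped bar plot, again on a hypothetical miniature of df_pivot with invented values. Several years are kept in the index, so each year becomes a group with one bar per country.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import pandas as pd

# Hypothetical miniature of df_pivot (values invented).
years = [1970, 1980, 1990, 2000, 2010, 2020]
df_pivot = pd.DataFrame(
    {"China": [820, 980, 1150, 1260, 1340, 1440],
     "India": [555, 700, 870, 1050, 1230, 1380]},
    index=pd.Index(years, name="year"),
)

# Keep a group of years instead of a single one.
df_pivot_sample = df_pivot[df_pivot.index.isin([1980, 1990, 2000, 2010, 2020])]

# No transpose here: each index value (year) is a group, each column (country) a bar.
ax = df_pivot_sample.plot(kind="bar", xlabel="Year", ylabel="Population")
```

The same optional arguments used for the single-year plot, such as title, can be passed here as well.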
That's how you make bar plots with pandas.

Okay, in this video, we're going to learn one of the most common charts that we can make in pandas and
actually in any other visualization tool, and these are pie charts. Before we make this pie chart,
first let's have a look at the data frame we're going to use. In this case, to make a pie chart, we're
going to use the same data frame we used for making the bar plot, because it follows the same logic. So
here I'm going to copy the data frame we created for the bar plot, which is this one, df_pivot_2020.
This is what we created before by using the index attribute and the isin method. So I'm going to copy
this, and now I'm going to show it to you, so you can remember what's inside this data frame. And it's
here. As you can see, we have the column 2020, and the countries are in the index. So everything is
fine. That's the format we need for making the pie chart. But there is one little thing we have to
modify, and this is the column name, because now it's 2020, and this is a number. Actually, I think
it's an integer. It's not a good practice to have numbers as column names, so what we have to do is to
make this a string. To do that, we use the rename method. So we write .rename, open parentheses, and
now we use the columns argument. We write columns, then open curly braces, and now we write the name of
the column we want to change, which is 2020, and we're going to turn this integer value into a string,
so we open quotes and write 2020. Apparently they are the same, but the green one is an integer, and
the red one is a string. Now, to save these changes, I'm going to write inplace equal to True, and I'm
going to run this. So now we can make the plot. Here I'm going to write the name of the data frame, and
now I'm going to use the plot method. So here I write .plot. The first argument is kind, and here I
write pie. So the kind is pie, and now I run this, and here I forgot to include the y argument, so I'm
going to write it here. The y argument is supposed to have the data.
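The rename and pie chart steps can be sketched as follows, assuming the data column is the one renamed to the string "2020". The frame is a hypothetical miniature of df_pivot_2020 with invented values.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs outside a notebook
import pandas as pd

# Hypothetical miniature of df_pivot_2020: countries in the index, one integer-named column.
df_pivot_2020 = pd.DataFrame(
    {2020: [1440, 1380, 330]},
    index=["China", "India", "United States"],
)

# Map the integer column name 2020 to the string "2020" and keep the change.
df_pivot_2020.rename(columns={2020: "2020"}, inplace=True)

# y names the column that holds the data; each index entry becomes one slice.
ax = df_pivot_2020.plot(kind="pie", y="2020")
```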
So in this case, I'm going to show you the data frame here again. The data is here in 2020, so we
should write 2020 here. So I'm going to delete this. Here in the y argument, you write the column that
has the data. That's what we did. So now I run this, and now we finally have our pie chart. So here is
the pie chart. That's how you make a pie chart. If you want, you can even add another argument, like
the title, for example. Here I can say that this is the population in 2020, but in this case in
percentages. So I write this, and now we have this title. So that's how you make a pie chart in pandas.

Alright, so far we made a pivot table and many plots using pandas, and in this video we're gonna learn
how to export the pivot table and also the plots we made with pandas. Let's start by exporting the
plots we made with pandas. To do that, first we have to import matplotlib. So we write import
matplotlib.pyplot, and then we write as plt. So this plt represents matplotlib.pyplot. Now we run, and
we import matplotlib. And now we can use this plt to save the plot. So we write plt.savefig.
And now we open parentheses, and here we have to write the name of the file we want to export. Here I'm
going to write my_test.png. So this is the extension, and this is the name of the file. Now, before
exporting this file, I'm going to show you something here. You probably know that when we make a plot
with pandas, we get this text here that says AxesSubplot and all of this. We can get rid of this text
by using the show function. So we write plt.show with parentheses. And if we run this, we're going to
export this figure, and we're also going to get rid of this text. So let's try to run this. And as you
can see here, all that text disappeared, and we also exported the figure to a PNG file. Now this file
should be located in the same folder where you have this Jupyter Notebook file. Okay, I'm going to open
that file, but first I'm going to export the pivot table. So here I copy this df_pivot, and I paste it
here. In order to export it, we have to use the to_excel method. So I write .to_excel, and now I open
parentheses. Here we write the name of the file where we're going to export this pivot table. In this
case, I'm going to name it pivot_table.xlsx. So this is the extension of Excel, and this is the name of
this file. Now I run this, and now the pivot table should be exported. Alright, now I'm going to open
the Excel file and the PNG file we created. So it's here, and here we have the plot we exported, and
also the pivot table. As you can see here, the plot looks exactly the same as the one we created here
with pandas. And the pivot table is the same. So I'm going to show you how the pivot table looks. Here
is the pivot table, and here is the pivot table we exported. I opened it in Google Sheets, and it looks
exactly the same. And that's it. In this video, you learned how to export data frames as well as plots