If you already work with data in Excel, and
want to add more power to your data analysis and evaluation using Python, then this is the
course for you. Frank is a data scientist, and he will teach you how to use Python to work
with data. Hi, everyone, my name is Frank Andrade, and this is my Python course for Excel
users. I created this course to help Excel users move from Excel to Python. But why Python?
Well, in Python, we can do most of the things we would do in Excel, such as working with data,
making charts, and pivot tables. But that's not all. We can use all the power of Python
to automate tasks, work with large data, and do lots of other things, thanks to the thousands
of free libraries Python has. On top of that, Python can help you become a better data analyst
or get into new fields like data science. I divided this Python course for Excel users into
three modules. In module one, I'll teach you all the Python core concepts you need to know for data
analysis. Then in module two, we'll learn pandas. Pandas is a Python data analysis library that will
help us do most of the things we can do in Excel. In module three, we'll put into practice what we
learned in this course by creating a pivot table and visualizations such as line plots, bar plots,
and pie charts. Remember that in the description, you will find the code files, as well as a free
PDF Python cheat sheet I created for this course. There, you will find the concepts, methods and
functions we will see in this course. By the way, I'm Frank, and I will be your instructor
in this course. So let's get started.

To download Anaconda, we go to anaconda.com and
click on Get Started. Then we choose the last option, Download Anaconda Installers. And then we
have here the different Anaconda installers. So there are Windows, Mac, and Linux. So in my case,
I'm going to choose Mac, and I'm gonna choose the 64-bit graphical installer. So now I'm downloading
Anaconda. And once it's downloaded, I'm gonna click on it, and a message will pop up. You
just have to click on Allow, as I'm going to do right now. So just click on Allow and then
click on Continue until the installation starts. So I just click Continue and then Agree and then
Continue. And it's going to start installing Anaconda. In case you're on Windows and you're
installing Python or Anaconda for the first time, make sure to check the first box you see now
on screen. So I'm going to speed up the video now. Okay, the installation is almost done. And
now it's telling me that Anaconda works with PyCharm. And now I'm just going to click on Continue
to finish the installation. So I click Continue. And then we'll see just a summary of what was
installed. And now I'm going to close this window and I'm going to open Anaconda. So I'm going to
locate the icon. It's a green icon, this one that you see here. And I'm going to open Anaconda. I'm
going to wait a couple of seconds. And let's see what was installed. So here we have JupyterLab
and Jupyter Notebook, which are widely used in data science. So I'm going to launch
Jupyter Notebook. So here it's opening Jupyter Notebook. Let's give it a second. And now we open
a new notebook with Python 3. So Python 3 was installed too, and that's it. In the following
videos, we'll learn how to use Jupyter Notebook.

In this video, I will introduce you to the
Jupyter Notebook interface. Jupyter Notebook is an open-source web application that allows
us to create and share documents that contain live code, equations, visualizations,
and text. This is a perfect editor for doing data cleaning and transformation,
data visualization, and data analysis. This is why Jupyter Notebook is widely used in data science
and also machine learning. As you might remember, we installed Jupyter Notebook and Python with
Anaconda Navigator, and this means that we already have installed some popular libraries used in
Python for data analysis. By the way, one of the alternatives to Jupyter Notebook is JupyterLab. Both are
similar, but we're going to use Jupyter Notebook in this course because of its simplicity. So
let's open Jupyter Notebook. And to do that, we have to click here on the Launch button. So I
click here. Now we wait a couple of seconds. Now we have here the interface of Jupyter Notebook. So
I'm gonna maximize this. By default, Jupyter Notebook opens the root directory of your
computer. It's a good idea to create a folder where all your Python scripts will be located. In
my case, this folder is called Anaconda Scripts. So I click here. And now I can navigate through
the folders. And the folder I'm going to use for this example is this one that says My Course. Here,
we're going to create our first Python script. To do that, we click here on the New button. So
click here, and we have to click on the first option that says Python 3. There are other
options like text file, folder, or terminal, but we're not going to use these options
in this course. So click on Python 3. And now we have a Python script powered
by Jupyter Notebook. So here on the right, you can see that it says Python 3, and also
there is the Python logo. And on the left, you can see here the Jupyter Notebook logo, and also
the name of this Jupyter Notebook file. We can change the name of the file by clicking here on
Untitled. So I click here, and I can change it to, let's say, Example. So I write Example, and I click
on Rename, and now we've renamed this Jupyter Notebook file. Alright, now let's navigate through this menu bar
that we have here in this Jupyter Notebook file. So the first option is File. Here, we can
create a new notebook with Python 3. So if we click here, we're going to open a new Jupyter
Notebook file from scratch, as we did before. Then we have Open, and in this case, we can
open a Jupyter Notebook we created before. We can also make a copy of a Jupyter Notebook and then
change the name. We can save a Jupyter Notebook file and rename the file as we did before; we only
click here and rename the file. Then we can save all the progress we make in Jupyter Notebook.
For example, after writing many lines of code, you can save all the progress you make by pressing
Ctrl+S, or Command+S on Mac, and you're going to create a checkpoint. And later you can revert to
a previous checkpoint by using this option here. So here you will see many checkpoints and you
can revert to a previous checkpoint. By the way, by default Jupyter Notebook saves automatically
every couple of minutes. So there is no need to press Ctrl+S every time. So
keep that in mind. Then we have other options that I don't use so much, like print this Jupyter
Notebook or export the Jupyter Notebook file to HTML or PDF and so on. Okay, now
let's see the second option that says Edit. And here we can edit all the cells we have here
in this Jupyter Notebook. By the way, what you see here on the screen is a cell. So we
can edit with this Edit option. For example, we can cut cells, copy cells, paste cells above,
and delete cells. On the right, you can see the shortcuts that we're going to see in the next
video in detail. And, well, you can check all the edit options that you can perform in Jupyter
Notebook here. Then in the View option, we can toggle the header, the toolbar, and also line
numbers. So here, if I click on Toggle Header, the header is going to disappear. And if I click
on Toggle Toolbar, this toolbar disappears too. Also, here with Toggle Line Numbers we can show
line numbers. So if I write anything, we can see that it says 1, 2, 3, and so on. And
I'm not going to use this for this course. I'm going to leave it with the default
options. So here I'm going to revert to the original options. So without line numbers, and
I want to show the header and also the toolbar, but you can personalize it as you want. Next, in
the Insert options, we can insert cells above or below; we only click here. And, well, we're going
to see the shortcuts later in the next video. Then we have the Cell options. We can run cells or
run all the cells in this Jupyter Notebook file. And then we have the Kernel option. A kernel
is a computational engine that executes the code contained in a notebook document. When we open
Jupyter Notebook, a kernel is automatically launched. And we can interrupt this kernel by
clicking here. So by interrupting, we can stop the execution of our code. We can also restart
everything and do more things here. Sometimes, for example, I interrupt the kernel when a line
of code or a cell takes too much time to execute. And, well, you can do the same here with Restart or
Interrupt. Then we have the Navigate option, which doesn't actually have anything here; Widgets,
which I don't use so much; and Help, which, I think, will send you to the documentation
of Jupyter Notebook. And you can read it if you want. All right. Then here we have the toolbar,
and here you will find some shortcuts for the menu bar options that we've seen before. For example,
here, you can save and make a checkpoint. So I click here. And as you can see, here it says
Checkpoint created, and the time that it was
created. Then, with this plus button, you can insert a cell below. So I click here, and as you
can see, we can insert a cell below. And also you can use shortcuts, but that I'm going to show you
in the next video. Then we can cut selected cells with this button, and we can copy a cell with this
button. And also we can paste cells below. Also, we can move a cell above or below. For example,
I'm going to write anything here in this cell, and I can move it above with this button, or below,
as you can see here. Then we can run this code. For example, I can write the number one, and
then run the code. And as you can see here, the code ran and it shows the number one. And, well,
those are some of the frequently used buttons in the toolbar. And that's everything you need to
know about this Jupyter Notebook file. Okay, now, before finishing this video, I'm going to show
you some other options that you can find here in the Jupyter Notebook interface. Here, you can see
that there are some other options. So right now we are in the Files tab. And we can change to
the Running tab here. And here you can see all the currently running Jupyter Notebook processes.
For example, we can see here the Jupyter Notebook file we created and that we opened. So you can
recognize that a Jupyter Notebook file is open, or that it is running, because here the icon will be
in green. So here, if we go back to the Files tab, we can see that this Jupyter Notebook file, which
by the way has the .ipynb extension, is in green, so the icon is in green. So this indicates
that the file is running and, well, it was opened. So here we can see that it is open, and we can shut
down this file. And this is different from closing this file. For example, here I have the file.
And if I close this file, here, we can see that the file is still running. Here we see Running
is in green, and in the Running tab, it still shows up. So if we want to shut down
this file, we click here. And it says that there are no notebooks running. And we can see here
that the notebook has a gray icon. Alright, then we have the Clusters tab, and this tab I
don't use so much. And actually, it doesn't show anything here. And then we have the Nbextensions
tab. Here, you can install any extension to personalize Jupyter Notebook even more, and we're
going to see some cool Jupyter Notebook extensions in the next videos. And by the way, this Nbextensions
tab doesn't show up in some versions of Jupyter Notebook, but we can easily install
it, and we'll also see how to install this Nbextensions tab in the next videos. Finally, we
have this box that shows our directory. So here this folder indicates the root directory. So if
I click here, we are now in the root. And if I click on the folders, Anaconda Scripts and then My
Course, I go to the folder where I was before. And that's it. These are all the things you need to
know about the Jupyter Notebook interface.

Okay, in this video, we're gonna see some cell types and
cell modes in Jupyter Notebook. So first, we're going to open the Jupyter Notebook file that we
created in the previous video, which is this one, Example.ipynb. So we click on it. And
here we have the Jupyter Notebook file opened. Here, by default, we have this first cell in
command mode. And we can say that this is command mode because here this blue color indicates
that the cell is in command mode. And when we are in command mode, we can do things outside
the scope of any individual cell. So basically all the tools we see here in the toolbar, we can
apply in command mode. Also in command mode, we can apply some shortcuts that I'm going to show
you later. For example, if we want to see the shortcut window, we press the letter H in command
mode, and we can see the keyboard shortcuts here. So here you can see all the
shortcuts that you can apply in command mode. Now I'm going to close this one. And also you can
apply different shortcuts. For example, if you press B in command mode, you will
see that there is a new cell, because B is the shortcut that inserts a new cell below. Now,
if we press Enter, you're going to see that the color is going to change to green. So here we have
green color. And this green color indicates that we are in edit mode. Edit mode is for
all the actions you will usually perform in the context of the cell, for example, introducing
text or writing code. So here I can write, say, 123. So if I write 123, and then I click
on this Run button, I'm going to run this cell. And as you can see here, I ran this first
cell. And also, after running the cell, you can see that we are again in command mode.
So to go to edit mode, we press Enter again, and now we can edit the numbers we introduced. So for
example, I can write 456, and then run again. And here you can see that the output shows
123456. By the way, if you try to use the shortcuts in edit mode, they won't work. Here, I press Enter. And
now I'm in edit mode. And if I press the letter H, you can see that nothing happens; we don't have
the shortcut window. And if I press the letter B, you can see that we don't insert any cell below.
This happens because those shortcuts work only in command mode. So to escape this edit mode, we have
to press the Escape key. So I press Escape. And now I'm again in command mode. So if I press H, we
have here the keyboard shortcuts. And if I press B, you can see that we inserted a new cell. And
that's it for the command and edit modes. Now we'll see the cell types in Jupyter Notebook. In
Jupyter Notebook, there are three main cell types. And we can see all of them in this drop-down here.
Right now the type of this cell is Code. So here it says Code. But if we press here, you can see
other cell types like Markdown and Raw NBConvert. So we're gonna see first a code cell,
and it already has the check. So this one is a code cell. So now I press here, and now, well, it's
a code cell. If I press Enter, I'm in edit mode. And here I can introduce any code I want. So here
I can write any number, 99. If I press Ctrl+Enter, we can see that here, this is the input,
and here we got the output of this code. We're going to see how the code cell works throughout
this course. But now it's time to see how the markdown cell works in Jupyter Notebook. So here,
I'm going to select the cell. Now I'm going to change the cell type. So I press here in the drop-down.
And now I select Markdown. In the markdown cell, we can introduce any type of text we want. For
example, we can introduce titles. So if I delete this and press the hash sign, we can get a title. So
one hash means title. So here I press space,
and now I write Title. Now I press Ctrl+Enter
or this Run button to run the cell. And here, we got the title. By the way, you shouldn't get
this number one, because I modified the default behavior of Jupyter Notebook. So mine enumerates
the titles and subtitles, but in your case, you will see only the word Title. And if you want, you
can also introduce subtitles here. So for example, I'm going to insert a new cell with this button,
this plus button. And now I'm going to move this cell up with this button here. So I press this,
and now I'm going to change the cell type from code cell to markdown cell. So I go to the
drop-down and select Markdown. And by the way, you can also change the cell type with shortcuts.
So if you're in command mode, you can press the Y key to change to a code cell. So I press
the Y key. And as you can see here, it says In, and this In with square brackets indicates that
this is a code cell. So here I can press Enter and introduce any code. Here I introduce numbers and
press the Run button. And here you can see that we have an input and an output. So this is a
code cell. But now we can press the M key to make this cell a markdown cell. So now
we press M, and here we are in command mode. So now we get this markdown cell here. You
don't see the word In with the square brackets anymore. So now I'm going to edit mode, so I
just press here or, well, you can press Enter to go to edit mode. In order to introduce a subtitle,
I'm gonna write a double hash sign. So I press the hash sign twice. Now I press space, and now I'm
going to write a subtitle. So I write Subtitle, and I press Ctrl+Enter, or the Run button, to
run the cell. And we've got here a subtitle. And we can also introduce text. I'm going
to introduce a new cell with the plus button. And you can also do it with the B shortcut.
I'm going to do it with the B shortcut right now. I press B. And here I got this new cell.
And we can move this with this button here. And now we have this cell in the position we want
it. So here, I can introduce text by converting the cell to markdown. So here, I choose Markdown.
Now you press Enter to go to edit mode. And here I can introduce any text. For example, I can write
hello, I press Ctrl+Enter. And now we can see that we have here this text. And finally, the last
type of cell is Raw NBConvert. This type of cell is not evaluated by the notebook
kernel. So if we convert this code cell to a raw cell, this cell won't be evaluated by the notebook
kernel. So let's try. Here, I press Raw NBConvert. Now we can see that this looks like a
plain cell. And, well, this type of cell is not used that often. Actually, we're going to use only
the code cell and the markdown cell in this course. And that's it. In this video, you learned the
cell types and cell modes in Jupyter Notebook.

Okay, in this video, we're going to see
some common shortcuts used in Jupyter Notebook. And we're going to start with the F
shortcut. And by the way, to use this shortcut, you have to make sure you're in command
mode, and to verify you're in command mode, make sure that the cell has this blue
color. Okay, now that you're in command mode, you can press the letter F, and you're going to
see this Find and Replace window. So this first shortcut allows us to find a word in a cell and then
replace it with another word. For example, I can write here the word hello. And
here, it found the word hello inside this hello world sentence. And now I can replace
this word with another word, say, hi, for example.
So here, I write hi. In red, we can see the match,
and in green, we can see the word that we're going to insert. So here, let's click on Replace All.
And now you can see that it doesn't say hello world anymore. Now it says hi world. So now
I press Ctrl+Enter, which is another shortcut, to run the cell. So you can press here on Run
or only press Ctrl+Enter to run this cell. So I press Ctrl+Enter. And now we ran this cell.
Another way to run cells is to press Shift+Enter.
But in this case, we're going to run the cell and insert
a new cell below. So now let's see. I press Shift+Enter, and note here that it ran the cell, because
now it has In and the number three inside square brackets. And here, we can see that we have a new cell. Okay,
now another pair of shortcuts that is often used is Y and M. So now this cell is a code
cell. And if we want to make this a markdown cell, we only have to press the letter M. So we
press M, and this is going to be converted to a markdown cell. And if we press the letter Y,
this is going to be converted to a code cell. And also you can change the heading here. You can
make the heading bigger or smaller. So here, I'm going to select the cell, and now to make
this one smaller, we can press the numbers. So if we press the number two, we can see that
this one gets smaller. And if I press number three, the title gets smaller; four, smaller; and
so on. So as you can see, the more hash signs, the smaller the text. So here I'm going to delete
these hash signs. One hash sign represents the biggest font size, which is the title. So
now we press Ctrl+Enter, and now we have this in heading one. But if I press number five, and
then press Ctrl+Enter, we can see that now this cell has heading five and it's smaller. So now I'm
going to revert to heading one. So I press one, and then Ctrl+Enter. Okay, now we can navigate
through the cells by pressing the up or down keys on our keyboard. And as you can see here,
we can navigate through all the cells, or we can also click with the mouse on
the cells we want. Okay, now we can insert a new cell above by pressing the A key. So if I press A,
we get here a new cell above, and if I press B, we get a new cell below. Now if I press
X, we're going to cut the cell. So I press X, and you can see that the cell was cut. And now if we
press V, we paste that cell below. So I press V, and now we got the cell. And if I press Shift+V,
we get the cell pasted above. So I press Shift and V, and we get this new cell above this cell I have
here. Okay, now I can delete cells by pressing D twice. So I press D two times. And as you
can see here, the title disappeared. So now I try again, and we don't have the title
anymore. But now if we press the letter Z, we can undo those changes. So let's undo what we
did before. I press Z, and we get here the title back. Okay, another useful shortcut is Ctrl+S,
which allows us to save the changes we made in this Jupyter Notebook file. So I press Ctrl+S, and you
can see here that it says Checkpoint created. So I'm going to press Ctrl+S again, and here it says
Checkpoint created, and here it also says the time. And that's it. These are some of the most common
shortcuts used in Jupyter Notebook. But you can see other shortcuts by pressing the letter H.
So press H. And here you can see more keyboard shortcuts. Or you can also go here to Help and
then go to Keyboard Shortcuts, and you get the same window. So here you can see a list of
shortcuts for command mode. And also for the edit
mode, you can see the description of a shortcut,
and also how to do it in your operating system.

One of the typical ways to get started
with a programming language like Python is printing a simple message. You can write any
message you want, but it's traditional among coders to start with a Hello World. So let's try
it. Let's print our first message using the print function. The print function prints a message to
the screen. So I'm going to write here, print. And then I'm going to open parentheses.
Every time we use a function in Python, we have to open parentheses, as in this case
for the print function. And as you can see here, functions get a green color in Jupyter
Notebook. So that's how you can identify them. So inside these parentheses, I'm going to write the
message. So in this case, it's going to be Hello, world. So this is our first message.
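
As a minimal sketch of the step just described, the cell would look like the following. It assumes the message is typed inside double quotes, as strings are written later in the course; nothing else is added.

```python
# print is a built-in function: its name is followed by parentheses,
# and the message to show goes inside them, wrapped in quotes.
print("Hello, world")
```

Running the cell shows the text Hello, world directly below it.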
Now, to execute this first line of code, we have to press Ctrl+Enter, or Command+Enter
if you're on Mac. So I'm going to press
this. And as you can see here, we have our first
Hello World. Another way to run this first cell is pressing here on the Run button. It's going to
have the same effect. So I press it, and it ran.
So as you can see here, it says In, which
represents a code cell. And this is a markdown cell, as we've seen before. One of the advantages
that Jupyter Notebook has is that it allows us to display the last object in a code cell without
specifying the print function. So for example, here, I can print this Hello World without
writing this print function. So I'm going to copy this Hello World message that's inside quotes.
And I'm gonna run this code. So just Ctrl+Enter. And as you can see here, we have this message
printed. So this is one of the advantages that Jupyter Notebook has. If you do this in another Python
IDE, it won't work. So here you can try it yourself. You can write any message you want. Apart from
the first Hello World, you can try with your name. So we write print and then parentheses, and we
open quotes, because we need to define a string. I'm going to tell you about strings a little bit
later, but just so you know right now. And here, for example, I can write my name. So my
name is Frank, and I can print my name. Then I can also print numbers. So I print my age, 26.
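
The printing steps above can be sketched in one cell. The name and age are the ones mentioned in the video; the note about the last expression being displayed without print applies to Jupyter Notebook cells and is an assumption about where this code is run, since a plain Python script would show nothing for that line.

```python
# Text must be wrapped in quotes so Python treats it as a string.
print("Frank")

# Numbers can be printed without quotes.
print(26)

# In a Jupyter Notebook cell, the value of the last expression is displayed
# automatically, so this line is shown even without print.
# In a plain Python script, this line would display nothing.
"Hello, world"
```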
And it's gonna work too. Besides writing code, you can also add comments. Comments are a useful
way to describe what we're doing in our code. So here, we can use comments. We just have to write
the hash sign, which is this one. So you write the hash sign, and then you write the comment. In this
case, I'm gonna write my name. And I'm going to say, printing my name, so we know what our code is
doing. Here, in the first message we wrote, we can also add a comment. So we write the hash sign. And
then we can say, printing my first message. As you can see here, the comments also have a different
color. So far, we have three colors: this color for the comments, green for
functions, and red for strings. This is just a useful functionality most text
editors have, which allows us to easily read code.

Okay, now let's see some data types in
Python. Every value in Python is an object, and objects have different data types. Let's
see the most common data types in Python. So two of the most common data types in
Python are integers and floats. Both are numbers. But integers are numbers that can be written
without a fractional component, like, for example, the numbers 1, 2, 3, 4, 5, and so
on. So all of them are integers. And we can check this data type by using
the type function. So this is the second function we're going to see. So we
write type, and then parentheses, and we run this code. And as
you can see here, in the output, it says int, which represents integer, so this is an
integer. Okay, the second type of data I want to show you is float. Floats are numbers that contain
floating decimal points. So basically 2.3, let's say 1.25, 5.4, and so on. So here, we have another
type of data. And let's check if these are actually floats. So we use type, and then
parentheses, and we run this code. And we see that we have float. And just like in Excel, you
can perform math operations in Python using these numbers. Some operations you can use are
addition, for example, you can say one plus two, and then execute this code, and you get three. You
can use subtraction, so four minus one, and you run this code and you get three.
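
A small sketch of the number types and operations described above. The variable names are hypothetical and exist only so the results can be inspected in one cell; in the video the values are typed directly into separate cells.

```python
# type() reports the data type of a value.
integer_type = type(1)    # int: a number without a fractional component
float_type = type(2.3)    # float: a number with a decimal point

# Basic arithmetic works much like a formula in Excel.
addition = 1 + 2          # 3
subtraction = 4 - 1       # 3

print(integer_type, float_type)
print(addition, subtraction)
```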
You can also do multiplication, division, exponents, and more in Python. But now let's see the
third data type that we will see often in Python, and it's the Boolean. Booleans are True or False
values. And we can check this using, again, the type function. We write type, and
within parentheses, we write, for example, True. And we run this code and we see that we got
bool, which represents the Boolean data type. We can also write type, and in this case, False,
and run this code, and we get bool again, so this is Boolean. And we're going to use Booleans
often when we use conditionals. Okay, now the fourth data type I want to show you, and it's very
common, is the string. A string represents a series of characters. In Python, anything inside
quotes, either single quotes or double quotes, is a string. So let's see them. Actually, we already
saw one kind of string here when we printed this Hello, world. So you're actually familiar with
this, but we're going to see it again. So to create a string, we have to open either single or
double quotes. So in this case, I'm going to use double quotes. So you see it now. And now I'm
going to write any message. So I'm going to write, for example, again, Hello World. And again, to
verify the type, we can use the type function, parentheses, run this code, and we get str,
which represents a string. And one cool thing a string has is methods. We can apply different
functions to strings, as we would do in Microsoft Excel, for example. However, in Python, we use
methods. A method is a function that belongs to an object. To call a method, we use the dot
sign after the object. Let's see some string methods to change the case of text. So here,
I'm gonna write again, Hello World. But now I'm going to use some string methods. So I write Hello
World. In this case, I'm going to use the upper method to make this uppercase. So I'm going to
use the print function. But actually, we don't need to use the print function because, as I told
you before, in Jupyter Notebook we don't need to use print, because it automatically displays
the last line of code. So since this is the only line of code in this cell, it's going
to display it automatically. So we just run this cell. And we have Hello World in uppercase. So
as you might expect, we can also change the case of the text in other ways. In this case, it can be
lowercase or title case. So I'm gonna just copy and paste this twice. Here, instead of
upper, I'm going to use lower,
and then title. So you can see how it's going
to change the case. So here, I'm going to run, and let's see what happens. So as you can
see here, it only displayed the last one because, as I told you before, it only displays the last
one. And if we want to print the three of them, we have two options. So we can maybe cut and
paste each one into its own cell. Or what we can do is print each of them. So here, for example, I can
write print here, and I can do the same for the others. So instead of using more cells,
we can print all of them. And here, we can print this one too. Actually, we
don't need it, because it's going to display the last line. But just for the
sake of this video, I'm going to print the three of them. So here, I'm gonna run this code. And as
you can see here, the first is in uppercase, the second is in lowercase, and the third is in
title case. So that's how you do it in Python. Another string method that you can find in Python is
the count method. So I'm going to delete this, and actually this one too. And we're going to see
this now. So first, I copy this. And now I paste it here. And here, I'm going to use
the count method. So I write count. And then here I open single quotes, and I write the letter
that we want to count. So here, for example, I'm going to write the letter l. And what
this string method is going to do is count how many times this letter l is
included in this string. So as we can see, there are two L's, so it should say two. So
I run this code, and actually it's three, because there are two in hello and one in world. So I was
wrong. Another string method that you
can use is the replace method. So we can replace
one letter with another. So here, let me copy this, and I'm going to paste it here. And instead
of writing count, I can write replace. So here, the first letter that we're going to
write is the letter that we want to replace. So in this case, I'm going to replace the letter
o. And the second letter is the letter that you want to put in the string. So I'm going to use
the u. So every time that an o appears here in the string, we're going to
replace it with the u vowel. So let's try it. So I run this code. And now it says, well, hello
world, but with u. And these are some of the most common string methods in Python.

Okay, now
it's time to learn something that you're gonna see often in Python, which are variables, variables
help us store data values. In Python, we often work with data. So variables are useful to manage
this data properly. A variable contains a value, which is the information associated with a
variable to assign a value to a variable, we use that equal sign. So let's create
a message that says, I'm learning Python, and stored in a variable called message underscore
one. So here, I write message underscore one. And we set it to that is string. I'm learning Python.
So I open double quotes, and inside I write I'm learning Python. So this is a string. We've
seen this before. And this is the variable, and we assign this value to the variable
using the equal sign. Now I'm going to run this. And as you can see, nothing happens.
But actually, we just assigned that string to the variable message_1. Now, if we
want to obtain the message I'm learning Python, we only have to type the variable name and then
execute that code. So I'm gonna copy and paste it here, and then we run this code. As you can
see, by running this cell, we obtain the content inside the variable message_1. We can create
as many variables as we want; just make sure to assign different names to new variables. So
let's create a new message that says and it's fun, and store it in a variable called message_2.
So first, I write message_2, and then we set this equal to, open double quotes, and it's fun.
This is my second variable, and I'm gonna run this cell. As we can see, the string was
assigned to the second variable. And if I copy and paste this variable here and run this
code, we can see that the message is there. By the way, if you're using single quotes instead
of the double quotes I'm using in this video, you probably have the following issue.
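The two variables created so far, and the quoting issue just mentioned, might look like the sketch below. The names message_1 and message_2 are our rendering of the spoken "message underscore one" and "message underscore two", and the backslash-escaped line is an extra option that is not shown in the video.

```python
# Assign strings to variables with the equal sign.
# The names are assumed renderings of the spoken variable names.
message_1 = "I'm learning Python"
message_2 = "and it's fun"

# print() shows the value stored in each variable.
print(message_1)
print(message_2)

# With single quotes, the apostrophe in I'm would end the string early:
#   message_1 = 'I'm learning Python'   # SyntaxError
# Double quotes avoid this; escaping the apostrophe is another option.
escaped = 'I\'m learning Python'
print(escaped == message_1)  # True
```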
So here, I'm going to copy this one and paste it here so you can see what I'm talking
about. Let's say you're using single quotes instead of double quotes. So you
get this. This is a problem that you will have when using single quotes, because in the
English language we use apostrophes often. A simple way to deal with this is using
double quotes. As you can see here, if I use double quotes, everything is okay. Everything
remains a string. But with single quotes, that doesn't happen. Only the I is treated as a
string, but the rest doesn't get a string value, or the string data type. So just make sure you
use double quotes every time you have these apostrophes, and that's it. Okay, now, let's put
these two messages together. So message one with message two, I want to put them together.
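A minimal sketch of joining the two messages, assuming the variable names message_1 and message_2 (our rendering of the spoken names). It shows the plus operator with and without a separating space, plus the f-string alternative; both approaches are walked through step by step in what follows.

```python
message_1 = "I'm learning Python"
message_2 = "and it's fun"

# The plus operator joins strings exactly as they are, so no space is added.
no_space = message_1 + message_2
print(no_space)   # I'm learning Pythonand it's fun

# Adding a one-space string in the middle separates the two messages.
message = message_1 + ' ' + message_2
print(message)    # I'm learning Python and it's fun

# An f-string places variables inside curly braces within the string itself.
f_message = f'{message_1} {message_2}'
print(f_message == message)  # True
```

Any text typed between the curly-brace groups of the f-string, including the space, appears in the result as written.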
So this is called string concatenation. If we want to put message one and message two together,
we can use the plus operator. And we can just do this. So I'm going to copy the variable
message_1, and now I'm going to copy the variable message_2, and I use the plus in the middle
to concatenate this first message with this second message. So run it, let's see what happens.
Here, we can see that the two messages were concatenated, but there isn't a space between these
two messages. So this is the first message and this is the second, and there isn't any blank
space in the middle. What we can do here is just add a blank space. So I'm going to copy
this one and paste it here and show you how to do it. Here I add a new plus operator in the
middle, and we open a string, with single quotes or double quotes. In this case, I'm going to
use single quotes. To create this blank space, I'm going to press space, and here we have
our blank space. Then we run this code, and now let's see. As we can see, there's a space
between Python and the word and, so we have this blank space. And if we want, we can
assign this new message to a new variable. So I'm gonna assign this to a variable called
message. I write message here, and below the code I can print this. As you can see, if I run
this, we can see that the message is there. Okay, now let me show you an alternative way to
join two strings. This is called the f-string, and it works like this. You write f and you open
a string, so we write single quotes here, one and two. As you can see, the whole thing is red,
so everything in here is a string. Inside, we can write the message. Let's say we write a
simple Hello World. So Hello World, and we run this. As you can see here, this is a string; it
just has this f in front of the string. One of the advantages of the f-string is that it can
have variables inside the string. So here, for example, we can write a variable by opening
these curly braces. These curly braces can have variables inside them. So here, I can write
message_1, and we can print it. If we print, we have this string, I'm learning Python. And now,
if we want to concatenate this first message with our second message, we just have to include
curly braces again. I put them here, and now I write message_2. And between message_1 and
message_2, I just have to press space. And we have this: I'm learning Python and it's fun. So
here, we just pressed space, and this space also appears here. For example, if we add some
random text, let's say ABC, we get this ABC between Python and the word and. So this is how the
f-string works. You just have to write the f, then open single quotes, and inside you can write
any message. To include any variable, you just have to open these curly braces and write the
variable name, and that's how to join strings. Okay, now it's time to see a data type that is
used often in data analysis. I'm talking about lists. In Python, lists are used to store
multiple items in a single variable. Lists are ordered and mutable containers. In Python, we
call mutable the objects that can change their values; that is, elements within lists can
change their values. To create a list, we have to introduce the elements inside square brackets
separated by commas. So let's create our first list. First, we have to set the name of the list.
In this case, I'm going to name it countries. And now to create the list, we have to open
square brackets, as I said before. So here, we open square brackets. And here we have
to write the elements. I'm gonna include in this countries list just strings, and they're
going to be names of countries. So the first one, I'm going to write United States. This
is the first element in my list. And to write the second, we have to use a comma. So
here, comma, and now the second. Let's write India, then a comma. So now China, and finally
Brazil. So these are the four countries. As you can see here, this is a list. We have
the square brackets that represent the list, and we have four strings. And this is how you
create a list. So now I'm going to run this one. And to see the content, I'm going
to paste the name of this list, and now I run. Here, I included only strings, but keep in mind
that lists can have elements of different types. So for example, one can be a string, another
an integer, and then a float, and so on. And also, lists can have duplicate elements.
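As a rough sketch, the countries list and the two properties just mentioned could look like this. The lists mixed and with_duplicate, and their values, are made up for illustration; len() and count() are used only to make the result visible.

```python
# A list is written with square brackets and comma-separated elements.
countries = ['United States', 'India', 'China', 'Brazil']
print(countries)
print(len(countries))  # 4

# Elements may have different types (hypothetical values for illustration).
mixed = ['United States', 1, 3.5]
print([type(item).__name__ for item in mixed])  # ['str', 'int', 'float']

# Duplicates are allowed.
with_duplicate = ['United States', 'United States', 'India']
print(with_duplicate.count('United States'))  # 2
```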
So for example, I can have United States written twice. So here, I can write
United States twice, and that's okay, because lists can have duplicate elements. But I don't
want it that way, so I'm gonna delete it and leave it as it was. Okay. Now, if we want to get
an element inside this list, we have to use something called indexing. With indexing, we can
obtain an element by its position. Each item in a list has an index, which is its position in
the list. Python uses zero-based indexing; that is, the first element, United States, has index
zero, the second, India, has index one, and so on. To access an element by its index, we need
to use square brackets again. So let's see some examples. Let's start by getting the first
element, United States. What we have to do is write the name of the list, in this case
countries, and then open square brackets, and inside the square brackets we have to write the
position of this element. It starts with zero, so we write zero to get that first element.
And then we run this code. And as you can see, we get the first element. If we write
countries, square brackets, one, we get India. If we write countries, square
brackets, two, we get China. And if we do this with the number three, we get Brazil. To
verify this, I'm gonna print each of them. So let's see what happens. So here print, and
finally print this one. And now I'm going to run, and we should get each element of the list,
from United States to Brazil. So let's try it out. Here we have each of them: United States
first, then India, then China, and then Brazil. So it's correct. This is the most common way to
use indexing, but there is also negative indexing. This helps us get elements starting
from the last position of the list. So instead of using indexes from zero and above,
we'll use indexes from minus one and below. So let's get the last element of the list,
but now using a negative index. We want to get the last element, which is Brazil. We did it
before with countries, square brackets, three, but now we're going to do it with negative
indexing. So here, I'm going to write countries, copy and paste it here, and now I open
square brackets. Instead of writing three, we're going to write minus one. This minus
one represents the first element starting from the last position. So Brazil will be minus one,
China is minus two, India minus three, and United States minus four. And that's how it works.
So I'm going to run this one, countries, square brackets, minus one, and we should get Brazil.
And we got it.
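The indexing examples above can be sketched as follows, assuming the four-element countries list from earlier in the walkthrough.

```python
countries = ['United States', 'India', 'China', 'Brazil']

# Zero-based indexing: the first element has index 0.
print(countries[0])   # United States
print(countries[1])   # India
print(countries[3])   # Brazil

# Negative indexing counts from the end: -1 is the last element.
print(countries[-1])  # Brazil
print(countries[-4])  # United States

# In a four-element list, index 3 and index -1 are the same position.
print(countries[3] == countries[-1])  # True
```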
So let's do this one more time. In this case, I want to get United States, which is minus one,
two, three, and four, so it's countries, minus four. We run this and we got United States, but
now using a negative index. Okay, now let's see something called slicing. Slicing means
accessing parts of the list. A slice is a subset of list elements. Slice notation takes the
form of the list name, then square brackets with the start, then a colon, and then the stop.
The start represents the index of the first element, and the stop represents the element to
stop at, without including it in the slice. So let's see some examples. I'm going to use this
countries list again. I'm going to copy this one and paste it here. So this is the name
of my list. And now I open square brackets. Let's say we're going to start at position number
zero, then colon, and let's get from zero to position number two. So we have to write three,
because it stops at three without including the element in position number three. So let's run
this one. As you can see, here we have index zero, index one, and index two. It didn't include
index number three. Now, let's say we want just the first element, so we write from zero to
one. So is it zero and one? No, because it doesn't include one; it stops at one. So here I
run, and we got only United States. Now let's try something different. Let's say we want to get
elements from index one to the last one. So let me see here. We want to get from India to
Brazil, so it's one, two, and three. So we have to write four, because it stops at four, and we
get up to three. So let's write here one, colon, four, and we should get, yeah, India, China,
and Brazil. So this is one way to do it. But another way to do it is to just delete the four,
leave the stop empty, and then run the code. And as we can see, we got the same result. So
every time you want to get from one position to the last one, you can omit the stop element
and just leave it without that element.
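A short sketch of the slices described so far, again assuming the four-element countries list. The last line shows the mirrored case where the start is omitted.

```python
countries = ['United States', 'India', 'China', 'Brazil']

# list[start:stop] includes start and excludes stop.
print(countries[0:3])  # ['United States', 'India', 'China']
print(countries[0:1])  # ['United States']
print(countries[1:4])  # ['India', 'China', 'Brazil']

# Omitting stop runs to the end of the list.
print(countries[1:])   # ['India', 'China', 'Brazil']

# Omitting start begins at index 0.
print(countries[:2])   # ['United States', 'India']
```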
The same goes for the start. Let's say we want to get
from the first position, so index zero, to two. So we don't include the start element, and we
write only colon and two. We run this, and we get United States and then India, because this
is the first and this is the second. So every time we want to start from the first element or
go to the last element, we can omit the start and the stop elements, as we did in these two
examples. Okay, now let's see how we can add elements to a list. There are different methods
that help us add a new element to a list. So let's have a look. The first one is called append,
and we're going to use the countries list as an example. So I'm going to write countries just
so you can remember. And here it's countries. As you can see, it has four elements. Let's say
we want to add a country to this countries list. What we can do is just write or paste
here countries, and now add .append. This append, as you can see, is a method. So
inside parentheses, we can write the new country we want to add to this list. Let's
say we want to add the country Canada, so I write Canada. And now we run this
code. As you can see, nothing is printed, but if we print the countries list again, we
see a new element. So as you can see here, the append method adds a new element at the end
of the list. So this is by default at the end. But what happens if you want to add an element
in a different position? Here, you can use another method, which is called the insert method.
So let me show you. Here I'm gonna copy countries, and now I'm going to use the insert method.
So I write .insert, then parentheses, and this one accepts two arguments. The first one is the
index, so the position of the element you want to insert. So let's say we want it in the first
position.
And the second argument that it takes is the new element you want to add. In this case,
let's say we want to add the element Spain, so this is another country, and it's going
to be in the first position, so index zero. So let's try. I run this one, and again,
apparently nothing happens. And here, if I run this countries list again, we can see
that there is a new element, and this element is Spain. It's located in the first position,
unlike Canada, which was placed in the last position. This is one of the differences between
the append method and the insert method. With insert, we can specify the position where we want
to insert the new element, but with append, the element is added at the last
position. Another thing you can do is join two lists using the plus operator. We used
the plus operator to concatenate strings before, but you can also join two lists. So let me
show you here. I'm going to create a new list just to show you how it works. My new list is
going to be called countries_2, and I'm gonna include different countries. In this case, it's
going to be the UK, then Germany, and let's write Austria. So we have three countries in this
new list. And now I'm going to run this one. If we want to concatenate the first
list, countries, with the second list, countries_2, we can use the plus
operator. So here, I write plus, and then I run this one. As you can see,
I got the elements from the first list and the three elements from the second list.
Another cool thing you can do in Python is put these two lists inside another list,
which is called a nested list. So let's try it out. Here, I'm gonna create a new list. It's
gonna be called nested_list. I'm going to open square brackets
to create a new list, and as elements, I'm going to write countries, which is my first list,
then comma, and then countries_2, which is my second list. So as you can
see here, for the elements inside this list, the first is a list and the second is a list.
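The steps for adding elements, joining lists, and nesting them can be sketched as below. The variable joined is a name introduced here for illustration; the other names follow the walkthrough.

```python
countries = ['United States', 'India', 'China', 'Brazil']

# append adds the new element at the end of the list.
countries.append('Canada')

# insert takes an index and the new element.
countries.insert(0, 'Spain')
print(countries)
# ['Spain', 'United States', 'India', 'China', 'Brazil', 'Canada']

# The plus operator joins two lists into a new list.
countries_2 = ['UK', 'Germany', 'Austria']
joined = countries + countries_2
print(joined[-3:])  # ['UK', 'Germany', 'Austria']

# A nested list holds other lists as its elements.
nested_list = [countries, countries_2]
print(len(nested_list))   # 2
print(nested_list[1][0])  # UK
```

Both append and insert change the list in place and return nothing, which is why running them alone prints no output in the notebook.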
So we have lists inside another list, which is called a nested list. I run this one, and then
I paste nested_list, and we run, and we get here the first list as the first element and the
second list as the second element. You won't see these nested lists so often, but you will
encounter them a couple of times, so it's good for you to know. So now we're going to see the
opposite of adding an element to a list, which is removing an element. So here, I just pasted
the countries list we had before. And what we're going to do is remove some of the elements of
this list. There are different methods that help us remove an element from a list. One of them
is the remove method. To remove an element using this, we first have to write the name of the
list, then use the dot sign, then write remove, and write parentheses. Inside, we have to
write the element we want to get rid of. So first, it's United States. So I write United
States. And let's run this one. As you can see, apparently nothing happens. But if
we paste countries here, we have all the elements, but United States is not there.
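A hedged sketch of the remove call just shown, together with the two index-based alternatives (pop and del) that the walkthrough turns to next. The starting list is the assumed state of countries after the earlier append and insert steps, and last is a name introduced here.

```python
# Assumed state of the list after the earlier append and insert steps.
countries = ['Spain', 'United States', 'India', 'China', 'Brazil', 'Canada']

# remove deletes the first matching value and returns nothing.
countries.remove('United States')
print(countries)  # ['Spain', 'India', 'China', 'Brazil', 'Canada']

# pop removes the element at an index and returns it.
last = countries.pop(-1)
print(last)       # Canada

# del removes the element at an index without returning it.
del countries[0]
print(countries)  # ['India', 'China', 'Brazil']
```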
So as you can see, the first matching value was removed. But you can also remove an element
by its index. This is accomplished with the pop method. So I'm going to copy all of this,
and now I'm going to paste it here. Instead of writing .remove, I'm gonna write .pop. Here,
I'm not gonna use the name of the element, but its index. So I write the index. In this case,
let's remove the last one, so it's going to be index minus one. What pop is going to do
is remove the element with index minus one and then return this element. So this element is
Canada. I didn't run this code here, so you can ignore it. So I'm going to comment this one
out, and our reference is going to be this list. To verify, we just write countries and then
run, and here, as you can see, there isn't Canada anymore. And that's how you remove an element
using the pop method. But there's still another way to remove an item using a specific index,
and it's del. So I'm going to show you here: del, the del statement. Here, we
have to write the countries list, and then again open square brackets, and here write the index.
So I write here the index. Unlike with the pop method, we're not going to get the name of the
element we're getting rid of; we're just deleting the element. So I run this one, and here we
didn't get anything. And I'm gonna print this. So countries, and the element at index zero was
removed. It's Spain, because that's the first element, so we removed the first
element. So we only got India, China and Brazil. And there you have it: three different ways
to remove an element from a list. Okay, now let's see how to sort a list. We can easily
sort a list using the sort method. Let's create a new list called numbers and then sort it from
the smallest to the largest number. So here, first I write numbers, and then open square
brackets. I'm going to write some random numbers: four, then three, then 10, then seven, one,
and then two. So this is my list. So I run this code. And now to sort it from the smallest to
the largest number, we write numbers, then .sort, then open parentheses. By default, this
is going to be sorted from the smallest to the largest number. So I run numbers again, and
here it starts with one and ends with 10. As you can see, it's from the smallest to the
largest number. So that's the default behavior of the sort method. But we can control how this
works. We can add the reverse argument to the sort method to control the order. If we want it
to be descending, we set reverse to True. So here, I'm going to create the numbers
list again and then write numbers.sort, and inside the parentheses I write the reverse
argument, and I'm going to set it to True. And then I'm gonna print numbers. So here, I get
an error, because I wrote number and it's numbers. So I'm going to add the s here, and here the
s too, and run again. And here we see that the list is sorted from
the largest number to the smallest number. So as you can see, the default behavior of the
sort method is reverse equal to False. You can control it by writing reverse equal to True,
as we did here. Okay, now let's see how we can update values in a list. To update a value in
a list, we use indexing to locate the element we want to update, and then we set it to a new
value using the equal sign. So let's say we want to update the first element of this numbers
list. Now it's four, but we want it to be, let's say, 1000. So we write here numbers.
And we use indexing. The first element has index zero, so we write
numbers, square brackets, then zero, then we set it equal to the new value we want to include.
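The sorting calls above, and the index-based update just described, can be sketched as follows. The update is applied to the list as it stands after the descending sort, which is an assumption about the notebook state at this point.

```python
numbers = [4, 3, 10, 7, 1, 2]

# sort() orders the list in place, smallest to largest by default.
numbers.sort()
print(numbers)  # [1, 2, 3, 4, 7, 10]

# reverse=True sorts from largest to smallest.
numbers.sort(reverse=True)
print(numbers)  # [10, 7, 4, 3, 2, 1]

# Indexing on the left of the equal sign updates an element.
numbers[0] = 1000
print(numbers)  # [1000, 7, 4, 3, 2, 1]
```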
So in this case, I'm going to write 1000. And now I'm going to print the numbers
list to see the results. So run this one. As you can see, the numbers
list we got is from the last change we made, so the one that starts with 10. So it's not
this one but this one, because it's the last one we ran. So instead of 10, we replaced this one
with 1000, because this is the first element, with index zero. So we did numbers, square
brackets, zero, and we updated that first element with 1000. Okay, finally, we can make copies
of the lists we created. There are different options to create a copy of a list. One of them is
the slicing technique. As you might remember, to do slicing, we first have to write the name of
the list, which in this case is countries. Then we open square brackets, and
we're supposed to write the start and stop. In this case, we're not going
to write start and stop, but only the colon. If we don't write start and we don't write
stop, it means we want the whole list. So let's try this out. I'm going to run this
one. And as you can see, here we got the whole list. The countries list doesn't have
the original values because of the changes we made when we added and removed elements. So I'm
going to paste the original countries list with its four original values, which are United
States, India, China and Brazil. And here let's test it out. As you can see,
we got the whole list, from the first element, United States, to the last element, Brazil,
because we're slicing the whole list. So if we write here new_list and we set this
equal to countries with this slicing, what is going to happen? This new list is going
to have the same values as the countries list. So I write here new_list. And as you can see
here, it has the same values. So we created a copy of the countries list. So this is one way
to create a copy. The second way is more straightforward, or more explicit:
using the copy method. So we write, again, countries, the name of the list, and
then we use the copy method. So I write .copy and then parentheses. With this, we create
a copy of this list. So let's run this code. As you can see here, it returns the
list. But if we assign this to a new list, we keep the copy. So here, I'm going
to write new_list_2, and we assign this copy to this new list. So
I'm going to copy this new list and paste it here. And as you can see, here we have the values
of this list, which are the same as the original countries list that is here. And
that's it. That's how you make a copy of a list. So now let's see how dictionaries work in Python.
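As a quick recap before the dictionary material, the two list-copy approaches above can be sketched like this. The `is` checks are an addition that is not in the video; they only confirm that each copy is a separate list object.

```python
countries = ['United States', 'India', 'China', 'Brazil']

# Slicing with no start and no stop returns the whole list as a new list.
new_list = countries[:]

# The copy method does the same thing more explicitly.
new_list_2 = countries.copy()

print(new_list == countries)    # True: same values
print(new_list is countries)    # False: a separate list object
print(new_list_2 is countries)  # False
```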
In Python, a dictionary is a collection of items used to store data values (it was unordered
before Python 3.7 and keeps insertion order since then), and each item in a
dictionary consists of a key and a value. So this is what you will often see in a dictionary.
Here, for example, the name of my dictionary is my_dict. To create this dictionary,
we have to use curly braces. So we open curly braces, and inside we write our first item.
The first item consists of a key here on the left and then a value here, separated with
a colon. So here we have the key, then colon, and then the value. And then we have here the
second item, so the second key and the second value. Now let's create a dictionary that
has some basic information about me. I'm going to name this dictionary my_data.
To create this dictionary, I'm gonna open curly braces. The first key is going to be
name. So I write name, and it has a value that is my name, so I'm going to write Frank. So
I open single quotes and then write Frank. Then I'm going to add a new item, so I write a
comma. The second key is going to be age, and the second value is going to be my age,
which is 26. As you can see here, the first value is a string and the second is an
integer, so we can mix different data types.
Now I press Ctrl+Enter to run this code, and we created this dictionary. So now
I write my_data, and here you have the dictionary we created. We can get the
keys of this dictionary; we only have to write my_data.keys, so this is the keys
method. We run this, and we get this dict_keys, and the values are name and age,
which are the keys of this dictionary we created. So name is the first key and age the second key.
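The dictionary built above might look like the sketch below. The values and items calls are included for comparison with keys and are covered next in the walkthrough; wrapping them in list() is a choice made here to keep the printed output short.

```python
# Keys on the left, values on the right, separated by a colon.
my_data = {'name': 'Frank', 'age': 26}
print(my_data)

# keys() returns a dict_keys view of the keys.
print(my_data.keys())          # dict_keys(['name', 'age'])

# values() and items() work the same way for values and key-value pairs.
print(list(my_data.values()))  # ['Frank', 26]
print(list(my_data.items()))   # [('name', 'Frank'), ('age', 26)]
```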
Now we can also get the values, so my name and my age. We just have to use the values method.
So I'm going to paste this one here, and instead of writing .keys, I'm going to write .values.
Now run this, and we get my name and then my age. Next, I'm going to get the items.
As I said before, an item is this. So this is the first item, and this is the second item. We
can say that an item is a pair of key and value. We can get these by using the items method. So
instead of writing .values, I'm going to write here .items and then run this one.
Here we got the first item, so the first key-value pair, which is the key name
and then my name, Frank. And then the second item, so the key
age and the age, which is 26. Now we can add a new key-value pair to this dictionary we
created. Let's say we want to add my height. So I write my_data, and let's say we want to add
the key height. So I write height, and we use square brackets here. Then we set this to the
value. Let's say it's 1.7. So I write my_data, then square brackets with height inside,
and then equal to 1.7. If I run this and then run the dictionary, we can see that there
is a new item, and it's the height: height, colon, and then 1.7. This is how you add
a new item to the dictionary. Now we can update this height. Let's say I'm not
1.7 but 1.8 meters. What we can do is use the update method to update this
value. So I write my_data, and here I can use the update method. So I write .update,
and then inside parentheses we have to open curly braces to pass the updated item.
I'm gonna write the key, which is height, and then I'm going to set the new height, which
is 1.8. So let's try this out. I run this, and then let's see the values to check if it was
updated. So I ran this, and we got the height 1.8. So it's perfect. Now let's see how we can
make a copy of a dictionary, the same way we did before for lists. To make a copy, we just have
to write the name of the dictionary, in this case my_data, and then, just as we did
for the list, we can use the copy method. So we write .copy with parentheses, and then we
create a copy. So here you can see the copy. And now I can assign this to a new dictionary.
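The add, update, and copy steps above can be sketched as follows. The last lines also preview the plain-assignment pitfall discussed next; new_dict_2 is our rendering of the spoken name for that second variable.

```python
my_data = {'name': 'Frank', 'age': 26}

# Assigning to a new key adds an item.
my_data['height'] = 1.7

# update() takes a dictionary of the items to change.
my_data.update({'height': 1.8})
print(my_data)   # {'name': 'Frank', 'age': 26, 'height': 1.8}

# copy() creates an independent dictionary.
new_dict = my_data.copy()

# Plain assignment only gives the same dictionary a second name.
new_dict_2 = my_data

my_data.update({'height': 1.9})
print(new_dict['height'])    # 1.8, the copy is unaffected
print(new_dict_2['height'])  # 1.9, the alias follows the original
```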
So I'm going to write new_dict. And now I'm going to copy this one, I'm going
to run, and then I write new_dict and run this. As you can see, it has the
values of the my_data dictionary. And something I didn't tell you when I made a
copy of the list is that if you change the data inside the my_data dictionary, so the
old dictionary, that change is not going to be seen in the new dictionary. So for example,
if we write 1.9, and here I update this in the old dictionary, here
you can see height 1.9. And if we run this new_dict, we can see that, after running,
this height remains with the same value, 1.8, and it doesn't change to 1.9. This independence
doesn't hold if you make the kind of copy most people make. So let me show you what I'm talking
about. Most people just make a copy by doing new_dict_2 equal to my_data. So this is the
old dictionary, and this is my new dictionary. So what happens if I run this? I'm going to show
you the values of this new dictionary. So this is 1.9. And if I update this
to, let's say, 1.95, so update here, here it's 1.95. And if I
run this new_dict_2, we can see that the value was updated too,
and this shouldn't happen. So if you want to create a new dictionary that works
independently from the old dictionary, you should use the copy method. And this is
the same if you're making a copy of a list. Finally, let's see how to remove elements from a
dictionary. Just like we did with lists, we can remove an item from a dictionary.
There are different options. First, we have the pop method. So I write my_data; I'm
using the old dictionary we've been using so far. So my_data, and I'm gonna
write .pop. This is the pop method. Here I can write the key. In this case,
let me see here, my_data, the key name. So I write, inside
parentheses, name. As you might remember, the pop method returns the value of that key.
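A minimal sketch of the pop call on a dictionary, followed by the del statement and the clear method that the walkthrough covers next. The starting values are the assumed state of my_data at this point, and removed is a name introduced here.

```python
# Assumed state of the dictionary after the earlier updates.
my_data = {'name': 'Frank', 'age': 26, 'height': 1.95}

# pop removes a key and returns its value.
removed = my_data.pop('name')
print(removed)   # Frank
print(my_data)   # {'age': 26, 'height': 1.95}

# del removes a key without returning anything.
del my_data['age']
print(my_data)   # {'height': 1.95}

# clear empties the dictionary.
my_data.clear()
print(my_data)   # {}
```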
Before, we did this with the list, and it returned the list element; in this case, it returns
the value of the key. So this is the key, name, and it returns the value. If we print this
my_data dictionary, we see that this key-value pair isn't here. So we successfully removed
this item. Another way to remove an item from a dictionary is using the del
statement. So we write del, and then we write the name of the dictionary, so my_data,
and then we have to specify again the name of the key. So we open square brackets and open
quotes. Here, let's say we want to remove the age key with its value. So I write age, and
we run this. Then if we print this dictionary again, we see that the age
key was removed, and also its value. And finally, you can remove all the items in a dictionary
with the clear method. So we write my_data and use .clear with parentheses. And now if
we print this dictionary, you can see that this is an empty dictionary, because we removed
all the elements from this dictionary. Now let's see one of the most common statements
used in Python. This is the if statement. The if statement is a conditional statement used
to decide whether a certain statement or block of statements will be executed or not. Here, you
can see the syntax of the if statement. As you can see, it starts with the if keyword, followed
by the condition. If the condition is true, this code here is going to be executed. If
the condition is not true, so it's false, the condition here in the elif is going to be tested.
So here in this LF block, this new condition will be tested. And if this is true, this code
below will be executed. But if it's not true, then the else block will be tested. And here,
this is the last block, and automatically this code will be executed. So here one little detail
that most beginners forget to write is the column. So it's sometimes easy to forget, it's there,
but you have to include it in one order things some people miss is this indentation. So here,
there is an indentation, you have to include after the column. So every time you write here column,
you press enter in you automatically. In most test editors, you're gonna get this indentation. But
if for some reason you don't get that indentation, and you get something like this, you can indent
this line by using the tab key in your keyboard. So just press tab, and it's going to indent
this line. So make sure you're right that column and do include an indentation for
each code that will be executed. So here, here and here. So now let's have a look
at some examples to see better how the if statement works. So first, I'm going to
create a new variable. As you might remember, to create a variable, you have to write the name of
the variable, in this case I'm going to name it age, and then you have to assign it a value. In
this case, this is going to be a number, so I'm going to set age to the number 18. And now
I'm going to write the if statement. So I write if, age is greater than or equal to 18,
then a colon, and then the code that is going to be executed. So if this is true, I'm going
to write print and then a message. So if the age is greater than or equal
to 18, I'm going to print the message You're an adult. And as you can see here, I'm using single
quotes and I wrote an apostrophe, so the string breaks. I'm going to use double quotes, and
everything is fine now. So here, print, then the message, You're an adult. If this isn't true, I
write else, then a colon, and print, and here a new message, which is You're a kid. So let's see this again.
If the age is greater than or equal to 18, then we print You're an adult. But if it's
less than 18, we print You're a kid. So if we run this code, we should get the first message,
because 18 is equal to 18. So let's run it and, as you can see, we get the message You're an adult.
So now we can play with this and change the age value. Here, I'm going to set it to 15. So
I run it and, as you can see, 15 is less than 18, so the condition is false and the else
block is the one that gets executed. So we got You're a kid. We can try this one more time.
In this case, I'm going to write another age, 30. And again, 30 is greater than 18, so the if block
is executed and we get You're an adult. So now let's add a new block, and I'm going to use elif.
So I write elif, and then age, and then greater than or equal to, let's say, 13. And then a colon,
press enter, and we get the indentation. And then we print another message. So if the age is
greater than or equal to 13, we print the message You're a teenager. So if it's between 13 and 17,
or well, less than 18, it's going to be You're a teenager. But if it's less than 13, it's going
to be You're a kid. So let's try this out. First I write 10, and we get You're a kid, because it's
less than 13. Then we change this to 14, and we get You're a teenager, because 14 is greater
than 13. And finally we write 20, and we get You're an adult, because 20 is greater than 18.
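To recap the three-branch version in one place, here is a minimal sketch. In the lesson the if block is typed directly into a notebook cell and the age variable is edited by hand; wrapping it in a helper function (the name describe_age is made up for this example) is only a convenience so the three ages can be tried in one run.

```python
def describe_age(age):
    """Return the message that the lesson prints for a given age."""
    if age >= 18:
        return "You're an adult"      # double quotes, because the text has an apostrophe
    elif age >= 13:
        return "You're a teenager"    # only tested when the first condition is false
    else:
        return "You're a kid"         # no condition: runs when everything above is false


for age in (10, 14, 20):
    print(age, describe_age(age))
```

The order of the conditions matters: Python stops at the first one that is true, so the check for 18 has to come before the check for 13.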
And that's it. That's how the if statement works. Now it's time to see one of the most common loops
in Python: the for loop. In Python, for loops are used to loop through an iterable object and
perform the same action for each entry. One example of an iterable object is a list. So we can
loop through each element of a list and perform the same action on each element of that list. Here
you can see the syntax of the for loop. As you can see, here is the for keyword, and then we have
to use a variable, then we have to write the in keyword, and then the iterable. In this case, as
I told you before, the most common is the list. So you have for variable in list. I'm going to write
list here so you can see it better, and then we have to write the colon. And then after the
colon goes an indentation. So here we have the indentation and the code that will be
executed on each iteration of the for loop. To see this better, I'm
going to use the countries list we created before, so this is the countries list, and I'm going to
loop through it. So I write for, and then we have to set a variable that is going to be
temporary. This variable is going to be called country. It didn't exist before;
the loop creates it for us. So for country in, and then we have to write the name of the iterable,
which in this case is a list, so countries. So for country in countries, and then a colon,
and then enter, and we get the indentation. Then we say print country. So for this
variable in this iterable, which is a list, print each element. This is what we're saying in
this for loop. So we run this and, as you can see, each element of the countries list is printed.
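As a minimal sketch, the loop just described could look like this. The list contents are taken from the output read out in the lesson; the list itself was created in an earlier video, so its exact spelling here is an assumption.

```python
# The countries list from earlier in the course (spelling assumed).
countries = ["United States", "India", "China", "Brazil"]

# country is a temporary variable that takes each element in turn.
for country in countries:
    print(country)
```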
So we're looping through the countries list and printing each element. The first is the
United States, then India, then China and Brazil. And this is how the for loop works. Now,
let me show you a function that you can use along with a for loop. It's called
enumerate. So I'm going to write enumerate here, and I'm going to put the countries list
inside this function. What the enumerate function does is number each element of the
countries list as we loop through it. So I'm going to add a new variable here, and it's
going to be i, then a comma, and then country. So enumerate will return two elements: the first
one is going to be the number of the iteration, and the second one is going to be the element itself. So
here I have to print, apart from the country, the i variable that I just created, which
is also just temporary. So I write print i, and then print country. So here we're going
to print the number of the iteration and the element. So I press Ctrl+Enter, and here we get it.
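A small sketch of the enumerate version, again assuming the countries list from earlier in the course:

```python
countries = ["United States", "India", "China", "Brazil"]

# enumerate yields (number, element) pairs; the numbering starts at 0.
for i, country in enumerate(countries):
    print(i)
    print(country)
```

If you would rather count from one, enumerate also accepts a start value, as in enumerate(countries, start=1).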
So first is the United States, in the first iteration, with i, which is zero.
Then we get India in the second iteration, which has one, and so on. As you
can see here, i starts at zero. So this is how enumerate works: it starts
with the number zero, and it returns the number of the iteration and the element. And finally,
let's loop through the elements in a dictionary. So let's use the dictionary we created before, which
was my_data. Well, this one is empty now, so I'm going to use the original dictionary.
Here I have the original dictionary, so I'm just going to print
it. So this is the dictionary, and now we're going to loop through it. Let me
show you. First, we have to write for, and then we write key and value, because one
item, as you might remember, is made of a key and a value. So we say for key,
comma, value, in, and then the name of the dictionary, so I write my_data. In order
to get the items of this dictionary, we have to use the items method, so we write .items,
and then parentheses, then we write a colon, and we press enter. So here we can print the
key, and we can also print the value. So key and value, and then we run this code and, as
you can see here, we get the first key and we get its value: we get name and we get
Frank, and then the second key and value, age and 26. So this is how you loop through elements or
items inside a dictionary. Okay, now let's see how functions work in Python. A function is a
block of code which only runs when it is called. You can pass data, known as parameters, into a
function. So here is the syntax of a function. As you can see here, we first have to write
the keyword def to create the function, and then we have to write the name of the function.
Inside parentheses, we define the parameters of the function that we're creating. Then we
write a colon and, below, the code. A function will usually return something (without a return,
Python gives back None), so we use the return keyword and then return something, like a variable,
for example. So now let's create a basic function. First we write def, and then we write the name
of the function. This function is going to do something really simple: it's going to sum the values
we pass into it. So it's going to be named sum_values. As parameters we set a, comma, b, then a
colon, and press enter. What this function is going to do is add the a and b values,
and we're going to set this equal to x. So we write x equals a plus b. As I told you before,
you should return something when the function finishes. So we write return, and here we're
going to return the x variable, so I write x. And that's it. That's how you create a function.
I run this code and, as you can see, apparently nothing happens, but the function was created.
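Here is a minimal sketch of the function just defined, followed by the call with the arguments 1 and 3 that the lesson walks through next. The variable name result is an addition for this example.

```python
def sum_values(a, b):
    # a and b are the parameters; x holds their sum
    x = a + b
    return x


# Calling the function: 1 is passed as a, 3 is passed as b.
result = sum_values(1, 3)
print(result)  # 4
```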
So to use this function, we have to call it. To call the function, we have to write the name of
the function, and then we pass in some values. These are called arguments when you call
the function, so I'm going to write the argument one and the argument three. Once you call the
function, Python goes to the function here and sets a equal to one and b
equal to three. So you have one plus three, and this is four. So x is going to be equal to
four, and then the function is going to return the value of x, which is four. So this is supposed
to return the value four. We run this, and we get four. So the function is
working properly. Okay, now let's see some built-in functions that Python has. Python has lots
of built-in functions that can help us perform specific tasks. Let's have a look at some
of them. Let's start with the len function. We only have to write the word len, and then
we open parentheses. As you can see here, Jupyter Notebook gives a green color to
functions. Now let's calculate the length of the countries list. So I have the countries
list here, and now I'm going to copy it and paste it inside the parentheses. What the len function
does is calculate the length of an iterable object with a known size, such as a list; in this
case, the countries list is that object. And now I'm going to run this to calculate its length. So
I run this one and, as you can see here, the length is four. And this is how the len function works.
Now let's see a different function. In this case, I'm going to create a new list that contains only numbers.
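The built-in functions covered in this part of the lesson (len here, then max, min, type and range) can be sketched together as follows. The countries list is assumed from earlier in the course, and the values in the numbers list are illustrative; any list of numbers whose largest value is 99 and smallest is 1 behaves the same way.

```python
countries = ["United States", "India", "China", "Brazil"]
numbers = [10, 63, 81, 1, 99]   # illustrative values

print(len(countries))    # 4
print(max(numbers))      # 99
print(min(numbers))      # 1
print(type(countries))   # <class 'list'>

# range(start, stop, step): the stop value itself is never included.
for i in range(1, 10, 2):
    print(i)             # 1, 3, 5, 7, 9
```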
So I'm going to write random numbers here: 10, 63, 81, then 1 there, and 99. So this is my new
list. I created this list with only numbers to try the max and min functions. The max function
is this one: we write max and then parentheses, and it returns the item with the highest
value in an iterable. My iterable is this list, so we're going to get the highest value of the
elements inside it. So we run this one and, as you can see here, the maximum value
is 99. We can also use the min function, and it's going to have the opposite effect. In
this case, we're going to get the minimum value of the list. So we run it and we get one. Okay,
another common function used in Python is the type function, and this function gives us the type
of an object. We only have to write type, and what this function does is return the type of the
object. So in this case, let's copy and paste the countries object. If we run this, we can see
that this object is a list. And that's correct, because here we created a list with square
brackets. So that's what the type function does. And finally, the last function we're going
to see is the range function. This one returns a sequence of numbers that starts
at one number and stops before another number. So let's see how it works here. This one
has three arguments. First the start number, and for this one I'm going to write one. Then the
number where the sequence stops; in this case, I'm going to write, let's say, 10. And then
the last argument is the increment, so by how much the sequence is going to grow. In
this case, I'm going to say that the sequence is going to grow by two, so I write two. Now I run
it and, as you can see, nothing much happens; we only get the same text back. But if we make a
loop here, so I write for i in range, and then print i. This is a for loop, we saw this before. And
here we run it. As you can see here, we're iterating over this range, and we're getting the
elements inside it. The first element is one, the second is incremented by two, so one
plus two is three, then three plus two is five, then seven, and then nine. Next we would
get 11, but the stop value here is 10, and the stop value itself is never included. So we only get
up to number nine. And that's how the range function works in Python. And that's it. Now, you know,
the most common built-in functions in Python. Okay, in this video, we're going to see what
modules are in Python. In Python, modules are files that contain Python code. A module can
have classes, functions and variables, and even runnable code. To get access to a module,
we have to use the import keyword, this one. To see a module in action, we're going to look at
the os module. This one comes with Python, so you don't need to install it. To get access
to the os module, we have to write import os. And that's it, we only write this. Now
let's see some of the functionality of this module. The first one that we're going to see is the
get current working directory function. To get access to it, we write os, then .getcwd,
and then parentheses. This cwd stands for current working directory. So we're going to get
the directory where our Jupyter Notebook file is located, so this file I'm working with right now.
So let's run it and see what happens. As you can see here, I have the path where the Jupyter
Notebook is located. This is the complete path, and you can see it by using the getcwd function.
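As a sketch of the os functions covered in this part of the lesson (getcwd here, then listdir and makedirs), the block below runs them in a throwaway temporary directory. Using tempfile is an assumption made for this example so that nothing is left behind; in the lesson the folder is created right next to the notebook.

```python
import os
import tempfile

print(os.getcwd())   # full path of the current working directory

# Work inside a temporary directory so the example cleans up after itself.
with tempfile.TemporaryDirectory() as folder:
    print(os.listdir(folder))                        # [] because it is still empty
    os.makedirs(os.path.join(folder, "New Folder"))  # create the new folder
    print(os.listdir(folder))                        # ['New Folder']
```

Called without an argument, os.listdir() lists the current working directory, which is what the lesson does.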
So now let's see another function. In this case, we're going to list all the elements in the
folder where this Jupyter Notebook file is located. To do that, we're going
to use the listdir function. This means list directory, and I'm going to run it. As you
can see here, I have this Jupyter Notebook file that is named Untitled. As you can see here,
the name of my file is Untitled. These other elements you can ignore; they are not files,
they are just some hidden elements in my folder, and they don't matter. So right now, the only
file I have in this folder is this Untitled file. So this is what listdir does: it lists
all the elements in the folder where this Jupyter Notebook file is located. And now let's see the
last function, which helps us create a new folder. This function is called makedirs. We have
to write os, then .makedirs, and then parentheses, and inside the parentheses we have to
write the name of the folder we want to create. In this case, I'm going to name it New
Folder. Simple as that. Now if we run this, we're going to see that nothing seems to happen. But now
if we use the listdir function to list all the elements in my folder, we can see that there is
a new folder. If we compare the result we got before with this new result, we can see
that there is one new element. This element is New Folder, which is the folder
we created using the makedirs function. And that's it. Those are some basic things you can do with
the os module. In the following videos, we're going to install different libraries, packages and
modules, so we can do even more things in Python. In this first introduction to pandas,
we're going to learn what pandas is, we're going to compare pandas with Excel, and then
we're going to learn what pandas data frames are. So first, pandas is probably the best tool
to do real-world data analysis in Python. It allows us to clean data, wrangle data, make
visualizations, and more. You can think of pandas as a supercharged Microsoft Excel, because most
of the tasks you can do in Excel you can also do in pandas, and vice versa. That said, there
are many areas where pandas outperforms Excel. So before you learn pandas, let me show you why you
should learn it, especially if you already know Excel. There are some benefits that
pandas, or Python, has over Excel. So before dedicating time to learning pandas and
also Python, let's see what these benefits are. First, limitation by size. An Excel worksheet can
hold around 1 million rows (1,048,576, to be exact), while pandas can handle many millions of
rows, limited mainly by your computer's memory. Another benefit
that Python and pandas have over Excel is complex data transformation. In Excel,
memory-intensive computations can crash a workbook, while in Python, when you work with pandas,
you can handle complex computations without any major problem. Also, Python is good for automation. While
Excel was not designed to automate tasks, you can create a macro or use VBA to
simplify some tasks. But that's the limit. However, Python can go beyond that with its
hundreds of free libraries available. And finally, Python has cross-platform capabilities. This
means that Python code remains the same regardless of the operating system or language set on your
computer. Okay, before I start writing code, let me explain to you the core concepts of pandas.
So we're going to start with the concept of arrays. Arrays in Python are a data structure
like lists. You can find one-dimensional arrays, or two-dimensional arrays, also known as
2D arrays. And the two main data structures in pandas are Series and data frames. The first
is a one-dimensional array, while the second, a data frame, is a two-dimensional array.
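To make the one-dimensional versus two-dimensional distinction concrete, here is a tiny sketch. The values and the column names a and b are made up for this example; importing pandas as pd is the usual convention.

```python
import pandas as pd

# Made-up values, only to show the two shapes.
s = pd.Series([10, 20, 30])
df = pd.DataFrame({"a": [10, 20, 30], "b": [1, 2, 3]})

print(s.ndim)          # 1, a Series has one dimension
print(df.ndim)         # 2, a data frame has rows and columns
print(type(df["a"]))   # a single column of a data frame is a Series
```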
In pandas, we mainly work with data frames. But if you didn't quite follow the
definition of a data frame with arrays, let me show you another definition, this one using Excel.
A pandas data frame is the equivalent of an Excel spreadsheet. Pandas data frames, just like
Excel spreadsheets, have two dimensions or axes. So there are two axes: one is the row and
the other is the column. A column is also known as a Series. So what we saw before, this
one-dimensional array, the Series, is a column; it's another name for the columns in a
pandas data frame. On top of the data frame, you will see the names of the columns, and on the
left side, there is the index. By default, the index in pandas starts at zero. The intersection of
a row with a column is called a data value, or simply data. We can store different types of data,
such as integers, strings, Booleans, and so on. Right now, you see on the screen a data frame
that shows the US states ranked by population. I'm going to show you the code to create a data
frame like this later, but now let's analyze this data frame. The column names are also known
as features. Our features here are states, population, and postal, while each row is
known as an observation. We can say that there are three features and four observations, because
there are three columns and four rows. Keep in mind that a single column should have
the same type of data. In our example, the states and postal columns only contain strings, while
the population column only contains integers. Pandas will let you mix types in a column, but we
might get errors or unexpected results later when we compute on a column that holds
different data types. So avoid mixing different types of data. So now let's see the
terminology translation between Excel and pandas. As I mentioned before, in Excel we work with
worksheets; in pandas, we work with data frames. The columns in Excel are known as Series
in pandas, but we also often say the word columns. In pandas we also work
with the index. The index is those numbers that are on the left. In pandas, we also say
rows; the rows are called observations too, but rows is fine. And finally, in pandas we
often work with NaN, which stands for not a number. This is the equivalent of an empty
cell that you might find in Excel. So that's it for now. In the next video, we're going to learn
how to create a pandas data frame from scratch. Welcome back. In this video, we're going to learn
different ways to create a pandas data frame. So as you might remember, a data frame looks like
this. It has columns and rows, and the columns are Series. Series are 1D arrays, and arrays are how
we create a data frame. So this is the first way to create a data frame: with arrays. These
are arrays. We have 1D arrays and 2D arrays; 1D arrays are basically columns, while 2D arrays
are data frames. Usually, to use arrays, we use a library named NumPy, and NumPy is what
is under the hood of pandas. To use NumPy, we first have to import NumPy. We're going to do
that a bit later when we write code. But just to give you an idea of what a NumPy array looks like,
here I wrote a basic array. We have to use np.array to create the data frame that you see
on the right. And well, this is one way to do it. You can also use lists, as I'm showing you right
now, and as you can see here. With this second option, when you create a data frame with
lists, you don't need to use NumPy arrays, because you're using a list of lists as a kind of array.
We're going to write the code to create a data frame with arrays later. But let's see
the second way to create a data frame. The second way is dictionaries: you
can create a data frame with dictionaries. As you might remember, a dictionary has
a key and a value. So we can use the key as the column name and the value as the data.
The value can be a list, so the data will be many elements inside a list. A pair of key
and value is known as an item in a dictionary, and in this case it is going to be a Series,
because what we have here is one column. So this is the second way to create a data frame:
with dictionaries. We're going to see that with code a little bit later. But now let's see
the third way, which is with CSV files. CSV files are files that can be opened in spreadsheets
like Excel. This is the easiest way to create a data frame, because we only need to read the
CSV file and then the data frame is created. And that's it. So now let's go to Jupyter Notebook
to create a data frame by writing some code. Okay, now we are in Jupyter Notebook. Here, we're
going to write the code to create a data frame, and we're going to use the three ways I showed
you before. The first thing we're going to do is import the libraries we're going to use to
create a data frame. That's the first line of code, and I already wrote it, so it's here.
First we import pandas, and then we import NumPy. So import pandas as pd; pd is the conventional
name for pandas, and np is the conventional name for NumPy. To run this code, just press
Ctrl+Enter, wait a moment, and pandas and NumPy are imported. So let's see the first way to create
a data frame. The first is with arrays. To create an array, we have to use NumPy. This is the first
option. So we write np, which is the short name for NumPy, and then we use the array
function. So we write .array, open parentheses, and inside we write the array we want to create.
I'm going to write random numbers just for the sake of this
example. So I open double square brackets, and then let's write, let's say, one and four,
and then, let's say, two and five, and the last one is going to be three and six. So each pair
is, let's call it, a list. Actually, they are lists, and each list represents a row. So this is
going to be the first row, this is going to be the second row in our data frame.
And this is going to be the third row. Here, we can give this array a name, and I'm going to name it
data. So data is equal to this NumPy array. I'm going to execute this code, and now we have
this data. So we created the array using NumPy. Now let's create a data frame with pandas. To
create a data frame with pandas, we have to write pandas. In this case, I can write pd, because I
named it like this here in my first line of code. So I write pd. And then to create a
data frame, we use the DataFrame constructor. So we write .DataFrame, with a capital D and a
capital F, and then we open parentheses. Here we have to fill in some arguments. The first one, and
that's something that you will almost always include in DataFrame, is the data,
because without it you only get an empty data frame. So first, we include the data. So
copy our array here, and then paste it here. That's the first argument. You can create the
data frame as it is. I'm going to show you here: press Ctrl+Enter and, as you can see,
here's my data frame. But as you can see, it's full of numbers: the column names are
numbers and the row names are also numbers. So to make it more understandable, we can
rename them: the column names and the row names, or index. Actually, the row names
are called the index. So first, we can name this index as rows, for example. You only need to add
the index argument, as I'm writing right now, and then you have to specify the names you
want to set. You have to open a list, so this second argument has the form of
a list. The first element is going to be the first index. Here it's zero, so in case
you don't want it to be zero, you can set another name here. In my case, I'm going to set
it as row one, then a comma, to set the second index as row two, and the third as row three.
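Putting the pieces together, here is a sketch of the array-based data frame with the index argument just described and the columns argument that the lesson adds next. The exact spellings of the labels (row1, col1 and so on) are assumptions, since they are only spoken in the lesson. The second half builds the same table from a plain list of lists, the option that needs no NumPy.

```python
import numpy as np
import pandas as pd

data = np.array([[1, 4], [2, 5], [3, 6]])   # each inner list is one row

# "row1".."row3" and "col1", "col2" are assumed spellings of the spoken labels.
df = pd.DataFrame(data,
                  index=["row1", "row2", "row3"],
                  columns=["col1", "col2"])
print(df)

# The same table from a plain list of lists, no NumPy needed.
df_from_lists = pd.DataFrame([[1, 4], [2, 5], [3, 6]],
                             index=["row1", "row2", "row3"],
                             columns=["col1", "col2"])
print(df_from_lists)
```

If index and columns are left out, pandas numbers both from zero, which is the all-numbers table shown first in the lesson.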
So now we can also modify the column names. We have to use the
columns argument, so here we write columns, and then we open square brackets, because it's a
list that we're going to add here. In this case, we have to modify only two elements. The
first I want to name col one, and the second col two. So I'm going to write
this one. And actually, I'm going to name this data frame, so I'm going to assign it to a variable,
and this is going to be df. df is the common way to name a data frame; df
stands for data frame. I'm going to run this code now and, as you can see here, it ran. Now,
to show the data frame, I can write df here, and now we have the data frame. As
you can see here, the first row, one and four, is my first list, and the second is the
second row. And the first column, well, that's a Series, as we've discussed before. We also have
the column names that we modified and the row names. So now let's quickly see how to create a data
frame with arrays, but in this case without NumPy. I'm going to copy this line of code, and
I'm going to paste it here, option two. I'm pasting this because it's the base
of these arrays with a list shape. And I'm just going to delete this, since I don't want the NumPy
array anymore, just the double square brackets. So I run this. Now, creating the data frame works
the same way we did before. So just copy this and paste it here. Run this, and now I can write
df and execute this code. As you can see, we have the same result. I'm just showing you
the second way so you don't have to worry about learning NumPy right now. Okay, now let's create
a data frame from a dictionary. We're going to use lists in this example, and we're going to
create a data frame using more meaningful data. In this case, to create a dictionary, I'm going
to use two lists. The first is going to be a list named states, and the second is going to be
population, and it will contain the population of each state. So the first list is states, and I'm
going to write it here. I open square brackets, because this is a list, and now I write some
states in the US. The first is California. The second is going to be Texas, let me write it
here. The third is going to be Florida, and the last one, New York. So I quickly write it here.
And now I'm going to create a population list. So in
this case, I'm going to paste this data, so I paste the population of each state. Now
I'm going to create a dictionary from these two lists. I'm going to write the name of the
dictionary; the name is going to be dict_states. This is a dictionary, so
I should use square brackets, sorry, curly braces. And now I'm going to set the name of the key.
So the first key is states, then a colon, and now the element, or the value.
So this is states, the first value. The second key and value is population;
I'm just going to write the key with a capital letter, and the value is the population list that we
have here. With this, we create our dictionary. So I'm going to run these two, and now we have the lists
and the dictionary. Now we can easily create a data frame using the DataFrame constructor that we
used before in the first option, when we created a data frame with an array. To do it, just write
pd, then .DataFrame, and now we have to write inside the parentheses the name of the dictionary.
So I'm going to copy dict_states. And I'm going to assign this to a new variable, which
I'm going to name df_population, so a data frame about population. Now I run this,
and here I get an error because I didn't write DataFrame correctly. Here it takes capital letters.
So I run it again, and now everything is okay. Now, to show the data frame, I paste this
one here and run it. So here we have the data frame. As you can see here, my first
key, states, is the name of my first column, and the data inside the states list is here.
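The dictionary route can be sketched like this. The population figures are placeholders, since the real numbers are pasted in off screen, and the capitalisation of the two keys is an assumption.

```python
import pandas as pd

states = ["California", "Texas", "Florida", "New York"]
# Placeholder figures for illustration only, not the values pasted in the lesson.
population = [100, 80, 60, 40]

# Each key becomes a column name, each list becomes that column's data.
dict_states = {"States": states, "Population": population}

df_population = pd.DataFrame(dict_states)   # note the capital D and F
print(df_population)
```

Writing pd.dataframe in lower case raises an AttributeError, which is the kind of error the lesson runs into.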
So here is my first column, or my first Series, and the same goes for population with its data. So
here we created a data frame using a dictionary. Okay, finally, let's create a data frame from
a CSV file. To create a data frame from a CSV file, we have to use the read_csv
function. First, we write pd as usual, which stands for pandas, and then we use the function,
so we write .read_csv, open parentheses, and then we have to write the name of the CSV file
here. I'm going to paste the name. It's named StudentsPerformance.csv, and to download
this data, you can check the notes of this video. And actually, we can have a look at this
data before importing it into pandas. Here I have it in Google Sheets. As you can see here,
we have the scores of some exams, math, reading, and writing, and we have some other data. So
we can import all of this data, all of our 1,000 rows, into pandas. So all of this is going
to be here. Here, we only have to define the name of the data frame, and I'm
going to name it df_exams. So now I run it, and to show the first five
rows of this data frame, we can use a method named head that we're going to see later. But just to
give you an idea, we can write .head with parentheses, and we get the first five rows. As you can see,
here we have the first five rows of this Excel, or actually CSV, file. You can see here, for
example, that the first row says female, group B, and math score 72. So let's check if that data
is the same here. We have female, group B, and math score 72. So we have all this data
here in this data frame. If we want to see all of the rows here, we can leave
out .head, and now we have all the rows. Well, here, we cannot see part of the rows. I'm
going to show you how to see that part later in this course. But now, as you can see, if we
run df_exams, we can see a kind of summary of this dataset, or well, data frame in
this case. By the way, in pandas, or actually when we work in Python, we usually call this type of
CSV file a dataset. And when we read our dataset using pandas, the
result is a data frame, which is what we have here. So the CSV file is a dataset, and
when we read it with pandas it is a data frame. And that's it. These are the three
ways to create a pandas data frame. Okay, now it's time to see how to display
a data frame in pandas. So here I have the CSV file we used before to create a data frame.
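Reading a CSV into a data frame, as done in the previous part, can be sketched as follows. So that the example runs anywhere, it reads a tiny stand-in CSV from a string instead of the course file. Only the first row (female, group B, math score 72) comes from the lesson; the column names and the other rows are assumptions.

```python
import io

import pandas as pd

# A tiny stand-in for the course's CSV file. Column names and every row
# except the first are made up for this example.
csv_text = """gender,group,math score
female,group B,72
male,group A,65
female,group C,90
male,group D,47
female,group B,71
male,group C,88
"""

# With the real file sitting next to the notebook you would pass its name
# instead, e.g. pd.read_csv("StudentsPerformance.csv").
df_exams = pd.read_csv(io.StringIO(csv_text))

print(df_exams.head())   # first five rows
print(df_exams.shape)    # (6, 3) for this stand-in: rows, then columns
```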
And a little detail I forgot to mention before is that this CSV file should be located in the
same directory where your Jupyter Notebook script is located. What I mean by Jupyter Notebook
script is what we're seeing right now, the file that we're working in right now. So what you
have to do is download this CSV file and place it in the same folder where your Jupyter Notebook
script is located, and this is how you're going to read this CSV file using the
read_csv method. So just make sure both the CSV file and the Jupyter Notebook
script are in the same folder. Okay, now, I'm going to run these first two
lines of code that we've seen before. The first imports pandas and the second reads this
CSV file. So I run this, and now this CSV file is stored in df_exams.
This is my data frame. So now, let's see how we can display this data frame. The
easiest way to see this data frame is to copy the name of this variable and paste it here. Now
I execute this, and we have the data frame. Actually, this is a summary of the data frame
because not all the rows are shown here. If we scroll down a little bit, we can see that
there are 1000 rows and eight columns. So here we can see these rows and the
columns. But as you can see here in the middle, we cannot see the rows. It goes up to row four and
then it continues with row 995. Usually when we work with pandas, we don't need to see the data
row by row. That's not how we do it with pandas. But if for some reason you need to
see all the data in pandas, as you would in Excel or in Google Sheets, I'm going to
show you a way to do it a bit later. But first, I'm going to show you the different ways we
usually display a data frame in pandas. The first way to do it is using the head
method. To use the head method, we only have to write the name of the data frame, in this
case df_exams, then write .head followed by parentheses. Then we run this, and this is
how we get the first five rows of a data frame. So as you can see here, we have row zero
to row four, and this is how we get the first five rows. So this is the head method. In the same
way, we can get the last five rows of this data frame by using the tail method. So here, we only
have to write again the name of the data frame, in this case the same df_exams,
and then write .tails, then parentheses, run this, and actually, I think it's tail. Yeah,
it's tail, in singular. And now we get the last five rows, so it's from 995 to 999.
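A small sketch of head and tail as just described. The data frame here is a made-up 1000-row stand-in (an assumption), built only so that the index runs from 0 to 999 like in the video.

```python
import pandas as pd

# Made-up stand-in data (assumption): 1000 rows, so the index runs 0 to 999
df_exams = pd.DataFrame({"math score": [i % 101 for i in range(1000)]})

first_rows = df_exams.head()  # first five rows
last_rows = df_exams.tail()   # last five rows (note: tail, singular)

print(first_rows.index.tolist())  # [0, 1, 2, 3, 4]
print(last_rows.index.tolist())   # [995, 996, 997, 998, 999]
```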
So these are the last five rows. And now in case you want to get more rows,
not only the first five or the last five rows, you can add an argument to either the head
or the tail method. I'm going to use the head method as an example. So here, I copy
this, and I'm going to paste it here. Let's say now we want to get the first 10 rows. So we write
10 here inside the parentheses. And now we run this, and I scroll down here, and we can see that the
first 10 rows are here. And we can do the same with tail. So here I write tail. And as we
can see, the last 10 rows are displayed here. So you can specify the number of rows that
you want to display. And that's how you do it. So now, I'm going to show you how to display all
the rows of this data frame, as you would in Excel or in Google Sheets. To do so, first we
have to know how many rows this data frame has. An easy way to get the number of rows
is using the shape attribute. To get the shape attribute, first we write the name of the data
frame, in this case df_exams. And then to get access to
this attribute, we use the dot and then the name of the attribute, in this case shape. So now we run
this, and we get (1000, 8). The first is the number of rows, and the second is the number of columns. So
we have 1000 rows. So now to display all the rows, we have to use the set_option method.
So we'll write pd.set_option. And inside the parentheses, our first argument is
going to be the following, in quotes: display.max_rows. And here, we have to specify
one more argument, which is going to be the maximum number of rows we want to display.
So here it's 1000 because we have 1000 rows, and we run this. And as you can see here,
nothing happened because we only modified the default behavior of pandas. So if we want
to see the data frame, we just write its name and execute it. I'm going to scroll
down, and as you can see here, there are all the rows of this data frame. So I'm going to
scroll all the way down here. And as you can see, it says 999. So all the rows are displayed here.
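The steps from this video (asking for more rows, reading the shape, and raising the display limit) can be sketched as follows. The data frame is a made-up two-column stand-in (an assumption), so its shape is (1000, 2) rather than the (1000, 8) of the course file.

```python
import pandas as pd

# Made-up stand-in data (assumption): 1000 rows, two score columns
df_exams = pd.DataFrame({
    "math score": [i % 101 for i in range(1000)],
    "reading score": [(i * 7) % 101 for i in range(1000)],
})

first_ten = df_exams.head(10)  # pass the number of rows you want
last_ten = df_exams.tail(10)

n_rows, n_cols = df_exams.shape  # shape is (rows, columns)
print(n_rows, n_cols)            # 1000 2

# Change how many rows pandas is willing to display; this prints nothing
pd.set_option("display.max_rows", n_rows)
print(pd.get_option("display.max_rows"))  # 1000
```

The setting stays in effect for the rest of the session. pd.reset_option("display.max_rows") restores the default.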
And that's it for this video. In the next video, I'm going to show you the different attributes,
methods and functions a data frame has in pandas. Welcome back. In this video, we're going to see
some basic attributes, methods and functions that we can use in pandas. But first, let's learn what
each of them is. So first, attributes are values associated with an object, and they are
referenced by name using a dot expression. So to get to an attribute, we have to use the
dot sign. For example, below you can see that we have a data frame named df, and to get its
columns, we have to write df.columns. So columns is an attribute. And that's how
we get this attribute of this data frame. Next we have functions. A function is a group of
related statements that performs a specific task. We've seen functions before. In Python, we've
seen some built-in functions like max, which gets the maximum value of a list, or min,
which gets the minimum value, or len, which gets the length of the list. So those are some Python
built-in functions that we can use in pandas too. And finally, methods are functions which are
defined inside a class body. We haven't talked about classes, because it's not the main
topic of this course. So just keep in mind that methods are functions inside a class. When the
creators of pandas built pandas, they used many classes. And the functions inside those classes are
known as methods. So for example, below, you can see the head method. And we've also seen
the tail method and some other methods so far. As a rule of thumb, when we use methods, we have
to write the parentheses. But when we want to get access to attributes, we only write the dot
and the name of the attribute. So a method is written with the dot and parentheses, and an
attribute is written with only the dot and the name of the attribute. So enough talk, now let's
write some code in Jupyter Notebooks. So here, we're going to use the same CSV file we used in the previous video.
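To make the rule of thumb concrete, here is a minimal side-by-side sketch of an attribute, a method and a built-in function. The three-row data frame is made up for illustration (an assumption).

```python
import pandas as pd

# Made-up stand-in data (assumption)
df = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "math score": [72, 47, 90],
})

column_names = df.columns  # attribute: dot plus name, no parentheses
first_rows = df.head(2)    # method: dot plus name plus parentheses
n_rows = len(df)           # built-in function: the object goes inside the parentheses

print(list(column_names))  # ['gender', 'math score']
print(n_rows)              # 3
```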
And we import pandas, as we did before, then we read this CSV file with the read_csv
method. And now we show the data frame simply by writing the name of the data frame. We've seen
this before, I'm just reminding you. Now we'll see some basic attributes, methods and functions
that we can use in pandas. So first, let's check some attributes of this data frame. First,
I'm going to copy the name of the data frame. The first attribute is
going to be the shape. We've seen this before, I believe. To get to the
attribute, we write the dot, and then we write the name of the attribute, so
shape. So df_exams.shape, and we get the values of the attribute. The first is the number of
rows, and the second is the number of columns. So that's good. The next
attribute is going to be the index attribute. And as you might expect, we only have to write the
name of the data frame, then the dot and index. And this is how we get the index of this
data frame. As you can see, this has the form of a range. A range, as you might
know, can take three arguments. The first is the start,
and in this case, it starts at zero. The second is the stop, and here the range
stops at 1000. This is true, because my data frame starts with zero and finishes
with 999. Well, it says 1000 because a range stops one before the stop value. And here it
increases by one, so 0, 1, 2, 3, and so on. So the step is one. So this is my
index attribute. So now let's continue. And
now let's get access to the columns attribute. To do so, we write the name of the data frame.
And then we write the name of the attribute. In this case, columns has to be written with an
S, so in plural. We run this and we get the names of the columns. As you can see here, we
have eight columns, the gender, race, ethnicity, and so on. And we can use this attribute
even to modify the names of the columns, but we'll see that later. And now let's see how
we can obtain the data type of each column. To do so, we have to use the dtypes attribute. So
we write the name of the data frame again, and then dtypes. And this is going to give us the
type of each column. So gender is object, and actually the columns from gender to test preparation
course are objects, while math score, reading score and writing score are integers, so
numbers. In pandas, a column that says object usually holds strings. I'm going to bring up the
data frame so you can see this much better. So here is the data frame again. And as we've seen
before, the columns from gender to test preparation course have the type object, and as
we can see here, all of them are strings. So we
can say that objects are the same as strings here. And also anything that is a score
here represents some kind of number. So that's why we get integers here, so int64. So
these are the most common attributes in a pandas data frame. Now, let's review some methods.
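The four attributes covered above can be sketched on a tiny made-up data frame (an assumption; the course file has 1000 rows and eight columns). The exact dtype shown for text columns can depend on the pandas version, so the sketch does not state it as an expected output.

```python
import pandas as pd

# Made-up stand-in data (assumption)
df_exams = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "math score": [72, 47, 90],
})

print(df_exams.shape)    # (3, 2): rows first, then columns
print(df_exams.index)    # RangeIndex(start=0, stop=3, step=1)
print(df_exams.columns)  # the column names
print(df_exams.dtypes)   # one data type per column; text columns typically show as object

index = df_exams.index   # stop is 3, but the last label is 2 (stop is excluded)
```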
So first, let's see the first five rows. And as you might know, that's with the head method. So
we only write the name of that attribute, sorry, the name of the data frame. And then we write the
head method, so head and parentheses. We run this and we obtain the first five rows. We can
also obtain some summary information about the data frame by using the info method. So here we
write the name of the data frame, then .info and parentheses, and execute this.
data frame. And here, we have, again, the data type here, and also how many rows are non null. So
as you can see here, all the data that we have in this data frame are non null. So there isn't any
empty data here in this data frame. Okay. Now, if we want to get some basic statistics of a data
frame, we have to use that describe method. So we write the name of the data frame in right describe
parentheses, execute this, so we run this code, and we have some basic statistics. So first,
the count. So this indicates how many rows each column has. So each of them have 1000 rows,
then we have the mean. So it's basically they assume each of the data here, that numeric data
and then divided by 1000, because there are 1000 rows, then the standard deviation, the minimum
value, for example, in math score, the minimum value was zero, then 25% represents
the percentiles. So this is q1 25%, q2 is 50%. In q3 is 75%. Then we have the maximum
value on each score on each exam. And we see that the maximum score is one candidate, and each of
them to the describe method is a useful method whenever we want to get some basic statistics of
the data frame, especially of the numerical data that we have in our data frame. Okay, now let's
see some functions that we can use. In pandas, we can use some built in functions that Python
has in pandas, for example, if we want to get the length of a data frame, we only have to write
land, and then inside parenthesis the name of the data frame. So we run this, and we obtain that
the length of this data frame is 1000. Actually, the length of a data frame indicates only the
number of rows. So here I made a mistake is rows. And this is how we obtained the number of rows
of data frame. So also, we can use other built in functions that Python has like the max function,
so write Max them the name of the data frame, we run. In this case, we didn't get anything,
anything meaningful because we get like a string. But if we write here, the index and we write
Max, as you might remember, if we use this, this attribute, we're going to get the list
of index. So if we use the max function, we're going to get the maximum or the
highest index here, so run and is 999. So we can also get the lowest index of a data
frame. We only have to copy this and instead of writing the max function, we write min. So in
this case, we get the minimum index n is zero. So now we can obtain the data type of the data
frame. Well, the data frame has data frame type, but we can verify that using the type
function. So we write type, then, sorry, write only the name of the data frame. And we run.
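The methods and built-in functions from this part of the video can be sketched together on a small made-up data frame (an assumption; with the course file the counts would be 1000 and the highest index 999).

```python
import pandas as pd

# Made-up stand-in data (assumption)
df_exams = pd.DataFrame({
    "gender": ["female", "male", "female", "male"],
    "math score": [72, 47, 90, 76],
    "reading score": [72, 57, 95, 78],
})

df_exams.info()               # prints dtypes and non-null counts per column
stats = df_exams.describe()   # count, mean, std, min, quartiles, max (numeric columns)
print(stats)

n_rows = len(df_exams)               # number of rows only
highest_index = max(df_exams.index)  # largest index label
lowest_index = min(df_exams.index)   # smallest index label
kind = type(df_exams)                # the DataFrame class

print(n_rows, highest_index, lowest_index)  # 4 3 0
```

Calling max directly on the data frame, without .index, compares the column names instead, which is why the video gets back a string.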
So here you can see, the type of this object is a data frame. And finally, we can use another
common function, the round function. So we write only round. And this has two arguments.
The first is the object that we want to round, and in this case it's our data frame. And the
second argument is the number of decimal places that we want to have. In this case,
I want two decimal places. So we run this. And we're not going to get this number of decimal
places in this particular example, because the numerical data we have here is integers.
They are not floats, so this doesn't have any effect. But if you have a data frame with float
numbers, you can round those numbers using the round function. And that's it. These are the most
basic attributes, methods and functions that we will see often in pandas. Alright, now it's time
to learn how to select a column from a data frame. So here I have the same CSV file we've
been using in the previous videos. And well, let's import pandas, and let's read this CSV
file. I have this in the same data frame, and I'm just showing the first five rows. So now to
select one of the columns of this data frame, we have two options. So let's see the first option.
The first option is using square brackets. This is the preferred way to select a column in
pandas. Let's see how to select the gender column, the first one here. The first thing
we have to do is write the name of the data frame, in this case df_exams, and then
open square brackets. So I open square brackets. And now we have to write the name of the column.
So we open quotes, and here I'm going to copy the
name of this column and paste it
here. So we have here the name of the data frame, and then the name of the column we
want to select. Now we press Ctrl+Enter to run this code. And as we can see,
we have the first column of this data frame. And as you might expect,
this is an array, a 1D array. And as we discussed in previous videos,
1D arrays are series. We can verify whether this is true with the
type function. So I'm going to copy this column
selection. And now what we're going to
do is use the type function. So we write type, then open parentheses, and inside the
parentheses, we write the object we want to evaluate, in this case this selection. And now we run
this. And as you can see here, we get a series. And series, just like pandas data
frames, have attributes and methods, so we can access those attributes and methods.
And actually, the attributes and methods of a series and a data frame are very similar.
So for example, if we want to get the index attribute of this series, we only
have to write the name of the series, and then write the dot and the name of the attribute,
so index. We run this, and we get this index in the form of a range that starts at
zero and stops at 1000. Another method that data frames and series share is the head
method. So we can also get the first five rows by writing .head and parentheses. As
you can see here, we get the first five rows of this series. Alright, that's it for the first
syntax. This is my favorite syntax. And actually, most people use it because it's the most
practical. And now it's time to see the second syntax to select a column from a data frame. This
syntax involves writing the dot sign, which is here. So let's say we want to get the same gender
column. We write the name of the data frame, followed by the dot and the name of the column,
so gender. In this case, we don't need to open quotes, and we don't need the square brackets. So
we run this code, and we get the same series. So it's here. And probably now you might be thinking
that this is more practical than the first syntax. But this syntax has some pitfalls. So now, let
me show you. What if you want to get a column whose name has two words? For example,
what if you want to get, let me show you here, this column whose name is math score? So now
let's try to get access to this column. I'm going to copy this column name. And now scroll down.
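The two selection syntaxes, and the pitfall with a two-word column name, can be sketched as follows. The data frame is made-up stand-in data (an assumption). Because df_exams.math score is not valid Python, the sketch checks it with the built-in compile function instead of writing it directly, which would stop the whole cell from running.

```python
import pandas as pd

# Made-up stand-in data (assumption)
df_exams = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "math score": [72, 47, 90],
})

gender = df_exams["gender"]      # square brackets: works for any column name
same_gender = df_exams.gender    # dot notation: only for names that are valid identifiers
math_score = df_exams["math score"]

print(type(gender))              # a pandas Series
print(gender.index)              # RangeIndex(start=0, stop=3, step=1)

# Check whether the dot syntax is even valid Python for a name with a space
try:
    compile("df_exams.math score", "<demo>", "eval")
    dot_notation_ok = True
except SyntaxError:
    dot_notation_ok = False

print(dot_notation_ok)           # False
```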
And now let's try. So I'm going to write first the name of the data frame, and now the dot. So
to get access to this, or to select this column, we have to write the column name. So this is
the column name. But as you can see, if I run this, we get an error, because Python doesn't
work like that. In Python, when a name has two words, we usually join them with an underscore.
That's how Python understands that this is a variable. But if it's like this, Python will not
understand what you're trying to do. However, if you use the first syntax, the square brackets,
you don't have this problem. So let me show you here. I'm going to copy this and
paste it here. And instead of having only the dot notation, I'm going to open
the square brackets, and then add the quotes. As you can see here,
the column name has a string type, so Python knows that this is a string. And now, if you delete
this dot sign and execute this, you get this column without any error. So this is one of the
advantages that the square brackets have over the dot sign. And that's it. In this video, we
learned how to select one column from a data frame. And in the next one, we're going to learn how to
select two or more columns from a data frame. Okay, in this video, we're going to learn
how to select two or more columns from a data frame. As usual, we're going to start
by importing pandas and reading the CSV file we've been using so far. So we execute these two
lines of code, and we get the data frame here. What we're going to do in this video is
select two columns from this data frame. So first, let's pick some columns. I want
to select the gender column and also the math score column. To select these two columns, we
have to use square brackets again. In this case, we have to use two pairs of square
brackets to select two or more columns. To do this, we have to write first the name
of the data frame, so it's df_exams. And now we open square brackets
twice, so we have two pairs of square brackets. Inside, we have to write
the names of the columns we want to select. We said that we wanted the gender column, so we
write gender. And the second column that we chose was math score. So I open quotes,
and now I write math score. So here, I have these two columns. And by the way, the order in
which we write these columns is the order in which they're going to appear in the resulting data
frame. I mean, we can define the order of the columns inside these square brackets. So here,
we're saying that first is the gender column, and second should be the math score column. So
now, let's run this. And as you can see here, we obtained first the gender column and second the
math score column. Here we can see that it's a data frame, and there are 1000 rows. Now, we can
verify that this is actually a data frame by using the type function. So let's check if this
selection is a data frame. I'm going to copy this, and let's check the data type of this
selection. So here I paste it. Now we use the type function, we open the parentheses, and now
we execute this code. And as you can see here, we get that this is a data frame. One
little detail I want to tell you is that when we use these two pairs
of square brackets, we're always going to get a data frame. But when we use only a single pair of
square brackets, as we did in the previous video, we get a series. So one pair of square brackets
is for a series, and two pairs of square brackets are for a data frame. Okay, now to
continue with the video, I'm going to select more than two columns using these two pairs
of square brackets. So now let's choose the columns that we're going to get. In this
case, I'm going to get the gender column and all the scores that we have here, so the
math score, reading score and writing score. To do so, first I'm going to copy this first
selection, to have it as a reference, and now I'm going to paste it here. So
far, we have two columns. So let's add the two remaining columns. An easy way to
write these columns is just by copying the names from the data frame, and here we can paste them.
So instead of typing those names, we can just paste them here. Now I delete this and I put it inside quotes.
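A minimal sketch of the double-bracket selection described above, using a made-up stand-in data frame (an assumption). The inner pair of brackets is an ordinary Python list of column names, and its order decides the order of the columns in the result.

```python
import pandas as pd

# Made-up stand-in data (assumption)
df_exams = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "math score": [72, 47, 90],
    "reading score": [72, 57, 95],
    "writing score": [74, 44, 93],
})

two_columns = df_exams[["gender", "math score"]]
four_columns = df_exams[["gender", "math score", "reading score", "writing score"]]

print(type(two_columns))         # a DataFrame: two pairs of brackets
print(type(df_exams["gender"]))  # a Series: one pair of brackets
print(list(two_columns.columns)) # ['gender', 'math score']
```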
So here inside quotes, and here we have it. As I said before, we can change the
order of the columns. For example, here I cut this, and let's say we want to
have the writing score near the beginning. So here, I paste writing score. And now what
we're going to get is first the gender column, then the writing score column, and then the math
score and reading score columns. So now, let's run this code. And as you can see here, we have
this data frame in the order that we defined. Okay, now, you might be wondering whether there is
a way to select two or more columns using the dot sign, so let's check if that's possible
here. For example, let's say we want to get the gender and the math score columns using the dot
notation. So here, I have it. And as you can see, this doesn't look right, because
you have two strings separated by a comma, but you don't have a list, since you don't have square
brackets. This is probably going to fail. So let's check, I'm going to run this code.
And as you can see here, we get invalid syntax, so it's a syntax error. So as you can
see, we cannot select two or more columns with the dot sign. And this is one of the disadvantages
that the dot sign has compared to the square brackets. This is why most people prefer to use the
square brackets instead of the dot notation. And that's it for this video. In this video, we
learned how to select two or more columns from a data frame. Okay, in this video, we'll see
different ways to add a new column to a data frame. So here's the same students performance data
frame. And as you can see, we have three columns with scores: math score, reading score, and
writing score. So let's say we want to add a new score. In this case, let's add a language score.
To add a new column in a spreadsheet, like Google Sheets or Microsoft Excel, we simply insert a
new column. And that's it. But in pandas, we have to use different
ways to insert a new column. So let's see how to do it here. First,
let's add a new column with a scalar value. A scalar value is simply a single value. And
in this case, the column is going to have one single value, so all the rows are going to have the
same value. To do so, we're going to have to select this imaginary column, because this column
doesn't exist so far. So what we're going to do is select a column, as we would with any
other column. First, we write the name of the data frame, in this case df_exams.
And then we open square brackets and open quotes, as we would for any column. So here, instead of
selecting math score, for example, we have to
write the name of the column we want to create. In this case, let's write language score. So
this is the new column we want to create. And now we have to assign a value to this new column,
we have to give it a scalar value. In this case, I'm going to assign a value of 70.
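A minimal sketch of this scalar assignment, on a made-up stand-in data frame (an assumption). Selecting a column name that does not exist yet and assigning to it creates the column, and the single value is repeated in every row.

```python
import pandas as pd

# Made-up stand-in data (assumption)
df_exams = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "math score": [72, 47, 90],
})

# "language score" does not exist yet, so this assignment creates it
df_exams["language score"] = 70

print(df_exams)  # every row has 70 in the new column
```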
So now if we run this code, we're going to see that nothing happens. But if we now show the
data frame, we're going to see that we have a new column, and this column is named language
score. And the value that this column has is the same everywhere. It's 70 in all its rows, so
we have 70 in row zero, and if we scroll down, we're going to see that it's 70 in all the rows,
even in row 999. But it's a bit weird that in an exam, you would have all the students with the same
score. So what you will usually do is add some different values to this column. To do
this, we have to use arrays, and to create arrays, we have to use NumPy. So here, in the second
way to add a new column, we're going to use arrays. In this case, we first have to see how many
rows this data frame has. So in this case, it has 1000 rows. And this is important because the number of
rows has to match the length of the array we're going to create. So let's create this array.
First let's import NumPy. We write import numpy as np. We run this code, and now
NumPy is imported. Now we have to create an array of 1000 elements. To do so, we're
going to use a function called arange. It's written like this, a-r-a-n-g-e. And this gives
us a range of numbers that starts with the first argument, which I'm going to set to
zero, and stops before the last argument, which in this case is going to be 1000. So these are the
limits of my range. I execute this. And as you can see here, it starts at zero
and goes up to 999, one before 1000. To verify the length of this
array, we have to use the len function. And
as you can see here, the length is 1000. So the array has 1000 elements. Now I'm going to
assign this to a new variable. And I'm going to name this variable language_score.
So we execute this, and here I was planning to see the length of this array. So
I quickly do it here, as we did before. So with len, we get the length of the array. So
now we have to add a new column to the data frame with this array. To do that, we only have
to use the same approach we used before. First, we write the name of the data frame. And then we
make the selection. This selection is going to be the new column. Well, in this case it's not
new, because we already created it. But let's imagine it's a new column. So it's language score.
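The array-based approach can be sketched as follows. The data frame is a made-up 1000-row stand-in (an assumption); the point is that the array length has to match the number of rows before the assignment.

```python
import numpy as np
import pandas as pd

# Made-up stand-in data (assumption): 1000 rows
df_exams = pd.DataFrame({"math score": [i % 101 for i in range(1000)]})

# np.arange(start, stop): start is included, stop is excluded
language_score = np.arange(0, 1000)
print(len(language_score))  # 1000, the same as the number of rows

# Assign the array to the column, exactly like the scalar case
df_exams["language score"] = language_score
print(df_exams["language score"].iloc[0], df_exams["language score"].iloc[-1])  # 0 999
```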
And now we have to assign the array to this column. So we write language_score here, and we assign
it to this new column. Now to see the results, we only show this data frame. And as we can see
here, we have a new column. And this new column starts with zero and ends with 999.
So it doesn't have a single value anymore, it now has a range of values. But there
is a little detail we have to take care of. Scores are supposed to be between zero
and 100, and here we have values from zero to 999. And also here we have a sequence of numbers,
so it starts from zero, then one, and it increases by one. Usually in scores, you will see that
students have random scores. So we have to create an array with random numbers. To do that, we
have to use NumPy again, but here we have to use a different function. In this case, the function
is named random.randint. So let's write it here: np.random.randint.
The first argument is the lowest value of these random numbers. And by the way, these are random
integer numbers, because scores are usually integer numbers. In this case, I'm going
to set this to one. The second argument is the upper limit of these random
numbers, and I'm going to set it to 100. And the third argument is the size. In this
case, we want an array of 1000 elements, so we set the size to 1000. Now we
run this, and I'm not going to display this array again. I'm just going to check that it has
the length we want by using the len function, and here we have 1000 elements. So now let's
store this in a new variable. This is going to be int_language_score,
and this is going to be our new variable. One little detail you should
know is that the first argument is inclusive and the last one is exclusive. This means that if
we get the minimum value of this new array, we can get a minimum
value of one, because this first argument is inclusive, which means that it can be included
in this new array. However, if we print the maximum value of this array, we're going to
see that 100 is never there, because it's exclusive, which means that this second argument
is not included in this array. Okay, finally, let's insert these random integer
numbers in the new column that we created. We just have to use the same approach we used before.
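A minimal sketch of the random integer scores, on a made-up 1000-row stand-in data frame (an assumption). No expected values are shown because the numbers change on every run; only the bounds are fixed.

```python
import numpy as np
import pandas as pd

# Made-up stand-in data (assumption): 1000 rows
df_exams = pd.DataFrame({"math score": [i % 101 for i in range(1000)]})

# np.random.randint(low, high, size): low is inclusive, high is exclusive,
# so the values fall between 1 and 99
int_language_score = np.random.randint(1, 100, size=1000)

print(len(int_language_score))                         # 1000
print(int_language_score.min(), int_language_score.max())

df_exams["language score"] = int_language_score
```

If scores of exactly 100 should be possible, the upper limit would need to be 101.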
So here, I copy, and now I paste it. Here, instead of assigning language_score,
I'm going to use int_language_score. So I'm going
to run this code. And as you can see here, we have the same column, and
now its data is random integer numbers, from row zero to
row 999. So now, this new data looks more like real scores, because these are
random numbers, and these are between 1 and 99. And that's it. Now, one more little detail
I want to share with you is how to create random float numbers. Before, we created
random integer numbers, but if for some reason you want to create random float numbers,
there is a way to do it with NumPy. We only write np, then .random,
then .uniform, and the arguments are the same: the minimum value, then the maximum value,
then the size, which is 1000. Then you run this, and well, it's similar to the one we got before.
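The float version can be sketched the same way. The variable name float_language_score is hypothetical, chosen here only to mirror the integer example.

```python
import numpy as np

# np.random.uniform(low, high, size): random floats drawn from low up to high
# float_language_score is a hypothetical name, not one used in the video
float_language_score = np.random.uniform(1, 100, size=1000)

print(len(float_language_score))    # 1000
print(float_language_score.dtype)   # float64
print(float_language_score[:3])     # three random floats, different on every run
```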
But now we have float numbers. And that's it. In this video, we learned different ways
to add a new column to a data frame. Alright, now it's time to see some operations we
can perform on data frames. Here we have the same data frame, df_exams. And here we
can apply some common operations to the numerical columns like math score, reading score, and
writing score. So let's see how to do this in pandas. First, we're going to see how to
perform operations on columns. Our first task is to calculate the total sum of a column. Let's
pick first our math score, and let's calculate the sum of this column. To do that, we first have
to select a column. And as you might remember, to select a column, first we have to write the
name of the data frame, in this case df_exams, then we open square brackets
and write either single or double quotes, then we have to write the name of the column.
In this case, it's this one, math score. This is the column we want to select. And now
instead of only selecting, we're going to perform
operations. In this case, I want to calculate
the total sum of this column, so we have to use the sum method. We write .sum with
parentheses. And this is how you calculate the total sum of this column. To verify this,
we run this code, and here we get 66,000. And this is the total sum of this math column.
Great. Now we can perform some other operations we would do in Excel. For example, we can
calculate the number of rows using the count method. We can easily do that. I'm just going
to copy this one, and now instead of writing the sum method, we write count. So here, count, and
now let's see. We see 1000 rows. And yeah, this is correct, because this data has 1000
rows. Now we can calculate the mean of this math score column. We have to copy this one,
paste it, and instead of writing count, we have to write mean. And here we get the average value
of this math score column. To get the average, we have to sum all the rows in this math score
column, and then divide by the total number of rows, in this case 1000. And this is how you
get this mean value. Then we can perform other operations using methods. Here, for
example, we can get the standard deviation by writing std. We execute this, and the standard
deviation of this math score column is about 15. We can also get the maximum and minimum value.
Let's do it quickly here. So first, the max, and then the main value, you can actually do it
with Python built in function. But we can also do it with methods. So here I ran in as you can
see, here, the minimum volume of the math score is zero, and the maximum is 100. Okay, now I'm
going to show you a quickly way to make the same calculations. Using that is quite method. I think
we saw that it's quite method in previous videos, but in case you don't remember it, I'm going to
write here, the name of actually, we only need the name of the data frame with, we don't need the
name of a specific column, we only need the name of the data frame. And now we can use the describe
method. So write that describe with parenthesis. Now we got like a summary table with some
important statistical values. And here we have the account that mean the standard deviation,
the minimum and maximum value. And as you can see here, we get all of this with one method.
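The column methods and the describe shortcut mentioned above could look roughly like this. The small data frame is a made-up stand-in for df_exams, so the numbers will not match the 1000-row file.

```python
import pandas as pd

# Made-up stand-in for df_exams (the real file has 1000 rows)
df_exams = pd.DataFrame({
    "math score": [72, 69, 90, 47],
    "reading score": [72, 90, 95, 57],
    "writing score": [74, 88, 93, 44],
})

math = df_exams["math score"]
print(math.count())  # number of rows: 4
print(math.mean())   # average: 69.5
print(math.std())    # standard deviation
print(math.max())    # 90
print(math.min())    # 47

# describe() gathers the same statistics for every numerical column at once
summary = df_exams.describe()
print(summary)
```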
Okay, so far, so good. Now, instead of making operations on columns, we're going to learn how to
make operations on rows. So now let's calculate, let's say, the sum of the math score, reading
score and writing score. To do so we have to make some selections. And in this case, we have to
make some independent selections. So to show you, I'm going to copy the name of these three
columns. I copied it. And now I paste it here. Now we have our math score, reading
score and writing score. So now let me delete that sign. And now we have to make some
independent selections. So first, we write the name of the data frame. So df_exams. Now to make
the selection, we open square brackets and quotes. So now, let me do this quickly in the others.
Now here, so I open square brackets. And now let me do it here too. And now it's ready.
So here we made some independent selections. In order to calculate the sum in a
row, we have to use the plus sign. So here, the plus operator, we have to write it here
and here. So basically, here, we're making a sum in each row. So to verify this, we run this
code. And as you can see here, we got the sum of the score columns. So here, let's quickly verify
the sum of the first row. And it's 72 plus 72, plus 74. So 72 with 72 is 144. And with 74 is
218. So here we have it. It's correct. So now, let's do something else. So now instead of just
summing these three rows, or actually these three columns, what we're going to do is to calculate
the average to get like an average score. So here, let me copy this, and here we're going to calculate
the average by summing this and then dividing this by three. So this is how we calculate the score.
And now let's assign this result to a new column. To do so we only write equal, and then, as
you might remember from previous lessons, we have to add a new column by writing
the name of this column. So we do that by writing the name of the data frame, and then
making like a selection, so we open square brackets, then open quotes, and here we write
the name of the column that we want to create. So this is the same as we did in previous lessons
where we added a new column. So in this case, I'm going to name this new column
average. And I'm going to execute this, and now verify that this new column was created.
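The row-wise sum and the new average column described above can be sketched like this. The data frame is a made-up stand-in whose first row uses the 72, 72, 74 values from the walkthrough.

```python
import pandas as pd

# Made-up stand-in for df_exams; the first row matches the values read out above
df_exams = pd.DataFrame({
    "math score": [72, 69, 90],
    "reading score": [72, 90, 95],
    "writing score": [74, 88, 93],
})

# Three independent column selections joined with the plus operator: one sum per row
total = df_exams["math score"] + df_exams["reading score"] + df_exams["writing score"]
print(total.iloc[0])  # 218

# Divide the row sum by three and assign the result to a new column
df_exams["average"] = total / 3
print(df_exams)
```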
I'm going to show this data frame here below, and here is our data frame. So now, in the last
column, you can see that there is a column named average. It has the average value of this math
score, reading score and writing score. Now here, we can control the number of decimals. We can
just use the round function and write the number of decimals we want to get. So in this case, I
want only two decimals. So I run this. And as you can see, here, our data frame looks much better,
because we only have two decimals. And that's it. In this video, we learned different ways to make
operations in columns and rows on data frames. Alright, now let's have a look at the value_counts
method. So far, we have seen how to count the number of rows in a data frame. So for example, if
we want to count the number of rows in the gender column, we either use the len function, so we
write len, then the name of the data frame. And we only have to write the
name of the column. So as you might remember, this gives us the number of rows. And we can also
use the count method. So here we write count. And we get the number of rows. But what if we want to
count the gender elements by category, so female or male? What if we want to know how many female
and how many male elements are in this gender column? So this is when the value_counts method comes
in handy. So we can use this method to count each category of the column. So to use this method,
we only have to write the name of the data frame, followed by the column that we want to count. So
in this case, it's the gender column. And then we have to use the value_counts method,
as you can see here. So now we execute this. And as you can see here, we have not only the
total rows in this gender column, but now it's divided by category. So we have that there are 518
females and 482 males. So this is how the data is spread in the gender column. Now we can do
more with the value_counts method. So we can get the percentage that each category represents in
the whole column. So here, I'm going to copy this, and now to calculate the percentages, also known
as relative frequency, we have to add an argument named normalize. So we write normalize
equal to True. And then we execute this. As we can see here, female represents 51% of
the total observations in the gender column, while male only represents 48% of the total
observations. So as you can see here, the value_counts method is useful when you want
to have a look at the data by category. Okay, now let's see another example. And in this case,
let's pick a different column. So here, I'm going to choose this parental level of education
column. I copy this. And now let's calculate, let's count the elements by category. So here,
I'm going to write the name of the data frame, df_exams. Now I open square
brackets and quotes, and here I paste this column. Now to count the elements by category in
this column, we use the value_counts method. So we run this code. And here you can see
how the data is divided in this column. So most people have some college level of education, while
just a few people have a master's degree. And now if we want to get the percentages that represent
each category, we again use the normalize argument. So we write normalize equal to True
and now we're going to get the percentages. So we can see the percentages. If we want to
round these to two decimals, we use the round method. So we write .round, then in parentheses
the two decimals. And as you can see here, it's rounded to two decimals. And that's it. Now
you know how to use the value_counts method. Okay, in this video, we're going
to see how to sort a data frame using the sort_values method. First,
let's import and read the CSV file that we've been working with in this tutorial. And now
let's show the data frame. So here we have the data frame. As you might remember, it has
these three numerical columns. And now I'm going to sort it using one of these columns. So let's
use the sort_values method. And first, I'm going to write the name of the data frame,
which is df_exams, and then write .sort_values. Now I open parentheses.
And now I can use this help here. And as you can see, the only mandatory argument is by. So we can
use this one, by, and for this one, we have to specify the name of the column we want to sort by.
So in this case, I want to sort by that math score. So I'm choosing this numerical column
to start with. So I'm going to write math score, actually, I'm going to copy this one, and paste
it here. So my math score. And sorting this data frame is as simple as that. Now, we can run this
code. As you can see here, the data frame was sorted ascending by default. So it starts with
zero, and it ends with 100 in the math score. So this is how the sort_values method
behaves by default. And here one little detail: you don't need to specify the by keyword. We can
omit it. And we run this, and as you can see here, it still works. So here we can modify the default
behavior of the sort_values method. We only have to add a new argument, and it's the
ascending argument. So let me show you here. I'm going to copy this one first, and show you here.
So in this case, we're going to sort descending by the same column, so we only write a comma,
and then we specify the ascending argument. We write ascending equal to, and here, I want to
show you something in this little help here. Here, ascending is set to True by default.
This means that it is ascending by default, but we can change this default behavior by setting
ascending equal to False. And that's what we're going to do here, ascending equal to False, so it
means descending. And now I'm gonna run this one, and as you can see here, it's sorted descending by the
math score column. So here, it starts with 100. And it ends with zero. But that's not all, we can
do much more with the sort_values method. So first, I'm going to show you here how to
sort by two different columns. So here, let's copy and paste this one. So in this case, we're
going to sort descending by multiple columns. So instead of writing only math score, we're going
to add here one more column. It's going to be the reading score column. So here, I copy this one.
I'm gonna copy and paste it here. But first, we have to add the square brackets,
because as you might remember, when we write two or more columns, we need the square
brackets. Now I write a comma, and I paste this reading score. Now I add quotes. And that's
it. That's everything you have to do to sort by multiple columns. Now I'm going to run this
one. As you can see here, it was sorted descending, first by the math score column, and then by
the reading score column. So the priorities are set here in the list that we include here.
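The three sort_values calls covered so far might be written like this. The data frame is a made-up stand-in for df_exams, and the column names are assumed to be spelled as spoken.

```python
import pandas as pd

# Made-up stand-in for df_exams
df_exams = pd.DataFrame({
    "math score": [72, 100, 0, 100],
    "reading score": [72, 92, 17, 100],
})

# Ascending is the default; "by" is the only mandatory argument
asc = df_exams.sort_values(by="math score")

# The "by" keyword can be omitted; ascending=False means descending
desc = df_exams.sort_values("math score", ascending=False)

# A list of columns sets the priorities: math score first, then reading score
multi = df_exams.sort_values(["math score", "reading score"], ascending=False)

print(asc["math score"].tolist())       # [0, 72, 100, 100]
print(desc["math score"].tolist())      # [100, 100, 72, 0]
print(multi["reading score"].tolist())  # [100, 92, 72, 17]
```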
So first is the math score column first priority, and the second priority is the reading score
column. And that's what you can see here. Now I'm going to show you a little detail here.
Let me copy df_exams. And if I print this one, you can see that the changes we
made weren't updated. So this here, the math score column has the original values. This happens
because the sort_values method, like many other pandas methods, creates a copy of
the data frame. So here we obtained a copy. This one is a copy, but it doesn't update the values
of the data frame unless we add a new argument, which is the inplace argument. So I'm going to
show you here. But first, I'm going to delete this df_exams. And now I'm going
to copy this one, and show you how to update the values of this data frame. So here, I'm going
to copy, those are the same values. But now I'm going to add a new argument, which is the
inplace argument. So here we write inplace equal to, and now I'm going to show
you the default value. So here, the default value of inplace is False.
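A small sketch of the copy-versus-update behavior just described, on a made-up three-row stand-in for df_exams.

```python
import pandas as pd

# Made-up stand-in for df_exams
df_exams = pd.DataFrame({"math score": [72, 100, 0]})

# Default (inplace=False): a sorted copy is returned, the original keeps its order
sorted_copy = df_exams.sort_values("math score", ascending=False)
before = df_exams["math score"].tolist()
print(before)  # [72, 100, 0]

# inplace=True: the data frame itself is updated and the call returns None
result = df_exams.sort_values("math score", ascending=False, inplace=True)
after = df_exams["math score"].tolist()
print(result)  # None
print(after)   # [100, 72, 0]
```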
This means don't update the data frame, but only create a copy. But if we set it to
True, it means update this data frame. So here, I'm going to set it to True to update the data
frame. So here I write True. And now I run this, and apparently nothing happens. But if now we
print the df_exams data frame, we're going to see that we have the data frame
sorted. In case you don't want to add the inplace argument, and you want to update the values
of that data frame, you have another option that we used before, which is overwriting the values
of this data frame. So for example, you can just delete the inplace argument and write df_exams
equal to this. So this is overwriting the values. But in this case, we're not going to do
that, we're going to add the inplace argument, as you can see here. Finally, we're gonna see how
to sort, but now not with numerical data but with text. So as you can see here, before we sorted
this data frame by the math score column, and this one has numerical data. But in this case,
we're going to sort it by the race ethnicity column, which has text, so we're gonna sort
this one. So first, we were supposed to get group A, and then group B, C, D, and so on.
So let's do this here. I'm going to scroll down. And first, we have to write the name of the
data frame, followed by the sort_values method. And now specify the name of the
column. So here, I'm going to copy race ethnicity, here, let me copy here, and it's done. Now I
have the name of the column. I'm going to set ascending to True, and now the new argument
we have to add to sort this is the key. So add key, then equal to, and in this case, we're
gonna use the lambda function. I'm not sure if you're familiar with the lambda function. But it
works similar to a regular function we've seen before in the Python Crash Course. But in this
case, it's going to behave a little bit differently. So let me show you here. First, you have to use
the lambda keyword, so we write only lambda. And now we should write the argument that it is supposed to
receive. In this case, I'm going to write col, which stands for column. And then we have to write
a colon and specify the operation we have to make over this variable. So in this
case, I want to write column, or col, and then access the string attribute. So I
write .str, and then use the lower method. So what we're saying here is get the string values
of the column and then transform them to lowercase. So here, we get the textual data in
lowercase. And with these three arguments, we're saying, sort the values inside the race
ethnicity column, sort them ascending, and sort the textual data of this column in lowercase.
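The key argument with a lambda could be sketched as follows. The column name "race/ethnicity" and the mixed-case group labels are assumptions chosen to make the effect of lowercasing visible.

```python
import pandas as pd

# Assumed column name and made-up labels with mixed case
df = pd.DataFrame({"race/ethnicity": ["group C", "group a", "Group B", "group D"]})

# Plain sort: uppercase letters come before lowercase ones
plain = df.sort_values("race/ethnicity")
print(plain["race/ethnicity"].tolist())
# ['Group B', 'group C', 'group D', 'group a']

# key receives the whole column (col) and returns the values to compare instead
by_lower = df.sort_values(
    "race/ethnicity",
    ascending=True,
    key=lambda col: col.str.lower(),
)
print(by_lower["race/ethnicity"].tolist())
# ['group a', 'Group B', 'group C', 'group D']
```

The original values are kept in the result; the lowercased version is only used for comparing rows.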
So here, we have this a, b, c, d, e, in uppercase, but we're going to get it in lowercase and sorted
by this text data. So now let's run this one. And let's see the results. So as you can see
here, we have this race ethnicity column, and it's ordered ascending. So here, we got the
A and B and C and D, and so on. And that's it. These are the different ways to sort a data
frame using the sort_values method. Welcome back. In this video, we're going to
see different ways to make pivot tables. If you're an Excel user, probably you
made many pivot tables in the past. In pandas, we can also make pivot tables. And
in this case, we use two different methods, the pivot method and the pivot_table
method. In this video, we're going to see the difference between the two of them. So
first, let's see what the pivot method is. So the pivot method reshapes data based on
column values and it doesn't support data aggregation. So this means that this is not the
regular pivot table you'll see in Excel. Because you can only reshape data with the pivot method, and
you cannot do anything else. To better explain what the pivot method does, I'm going to show you
an example. So here we have a little data frame. And this one has six rows and four columns. As
you can see here, there are many duplicate values. For example, in the column foo, the value one is
repeated at least twice. And the same goes for the value two. Also in the column bar, you
can see that A, B and C are duplicated. So when we have this type of data frame, we can
reshape it to have a different view, and to make a better analysis. In this case, we can use the
pivot method, as I'm going to show you right now. You only have to write the name of the
data frame, followed by the pivot method, and then specify three arguments. So the first
one is the index. In this case, I'm going to reshape this data frame to set the column
foo as the index. This means that the column foo will be in the position where right now
the numbers from zero to five are, on the left. Next, you have to define the columns. So these
are the new columns that we're going to see in our new data frame, the one that we're going to
reshape, so in this case, I'm selecting the data inside the bar column as new columns. This means
that A, B and C will be the new columns in our new data frame. And finally, we have to choose the
values we wish to show in this new data frame. So in this case, I'm choosing the baz column.
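A sketch of the three arguments on a six-row, four-column frame like the one on screen. The fourth column (called zoo here) and its values are assumptions, since the transcript never names it.

```python
import pandas as pd

# Six rows, four columns; "zoo" and its values are assumed for illustration
df = pd.DataFrame({
    "foo": ["one", "one", "one", "two", "two", "two"],
    "bar": ["A", "B", "C", "A", "B", "C"],
    "baz": [1, 2, 3, 4, 5, 6],
    "zoo": ["x", "y", "z", "q", "w", "t"],
})

# index -> row labels, columns -> new column headers, values -> cell contents
reshaped = df.pivot(index="foo", columns="bar", values="baz")
print(reshaped)

print(reshaped.loc["one", "A"])  # 1
print(reshaped.loc["two", "B"])  # 5
```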
So all the values inside there will be shown in our new data frame. So this is the column
that I'm selecting. And now I'm going to show you the result of this pivot method. So here
it is. And as you can see here, we have foo in the index, as I told you before, and A,
B, and C, which are data from the bar column, are now our columns in this new data frame. Also, all the
data inside this baz column is the only data that is displayed in this reshaped data frame. And
now let's see why it's sorted this way. So why one is here, two is here, three is here, and so on. So
here, the value is defined by the index, or row, and the column. So the value between index one and column
A is one, and why does that happen? Because if we go to our previous data frame, the original data
frame, that is here, we can find that here is one, A, and the value that corresponds to that pair
is the number one. So let's pick another one. For example, five. Here, we have two and B.
And if we go here to our original data frame, we have that for two and B, the value that corresponds
to that pair is five. So that's why this value is here. And that's how this new data
frame was reshaped. Okay. And finally, we have the pivot_table method.
And this one creates a spreadsheet-style pivot table. So this is similar to the pivot table
that we will find in Microsoft Excel, for example. And this one supports data aggregation. To explain
more about the pivot_table method, as well as the pivot method, we're going to see
some examples in the next video. And this time, we're going to write some code so you can
understand much better what we're doing. Alright, now it's time to see how the pivot method
works in action in pandas. So first, as usual, we import pandas as pd. So here, I import this
library, and then we're going to use a different data set to work with this pivot method. So to read
this data set, we use the pd.read_csv method. And inside parentheses, we write the
name of this data set. So in this case it's GDP.csv, which you can find in the notes of this
video. So this is the new data set. And now let's have a look, I'm going to run this one. And as you
can see here, we have data about GDP per capita that is in this column. And basically, this is
how the GDP grew over the years for each country. So here, I'm gonna tell you which are the
columns we're going to use for this example. So first, we're going to use the country column
that contains data about different countries, then we're going to use the year column that,
well, contains different years. And the GDP per capita, which is in this column. So
basically, what we want to do in this exercise is to obtain a different view of our original
data set. So this data set that we're reading here with pandas has this view, but we want to
get a different view to have a better analysis. So the goal of this exercise is to see the
evolution of the GDP per capita over the years for each country. And then we're going to put
the country names in the columns. So the only data we're going to show in our new data frame is
going to be the GDP per capita that is here. So I want to show you this now with code, and
let's write it here. But first, let's assign a variable to this data frame. So here, I'm going
to write df_gdp. So this is the name of my data frame. And now I'm going to show
it, and it's here. So now I'm going to copy this data frame. And to use the pivot method,
I'm going to paste this one. And now write .pivot. Now, we open parentheses. And now
as you might remember, from the previous video, we have to introduce three different arguments.
And if you don't remember the three different arguments we have to introduce here, you can just
press the Shift and Tab keys on your keyboard, and you will get this. And here, you can see
there are three arguments I'm talking about. So first, we have to write that index argument.
So write index. And, as I told you before, I want the year column to be the index of my new
reshaped data frame. So I'm going to set this year as the index of my new data frame. So write
here, year. Next, we write a comma, and press Shift and Tab to show this. So the second argument is
the columns. So we write columns, then equal and open quotes. So here, as I told you before, I want
the countries here listed in the country column, I want each country to be an independent column.
So for example, here, let's say we have the United States. So I want the United States to
be column number one, then column number two, China, then Australia, then Spain, and so on. So
each country should have one independent column. So that's what we want. And to get that we have
to set the country column here to the columns argument. So here, country, and that's it. Now,
again, Shift plus Tab to show this window here. And now the third argument is values. So here,
I'm going to write values equal to open quotes. And here, the only data I want to show here
in my new data frame is going to be the GDP per capita, which is the one that is here. And
now I'm going to copy this one and paste it here. So remember our goal. Our goal is to
see the evolution of the GDP per capita over the years for all the countries listed here
in this column. So here, we're going to execute this code and let's see the result. So here Ctrl,
Enter, and as you can see here, I have the new view of this data frame, and it looks much better.
It's more readable, because we can see the GDP evolution over the years for each country.
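The reshaping described above might look like this on a tiny made-up table. The column names (country, year, gdp_per_capita) and all numbers are placeholders, since the transcript does not spell out the real headers or values in the GDP file.

```python
import pandas as pd

# Placeholder data: column names and numbers are made up, not taken from the real file
df_gdp = pd.DataFrame({
    "country": ["United States", "United States", "China", "China"],
    "year": [2000, 2010, 2000, 2010],
    "gdp_per_capita": [10, 20, 30, 40],
})

# One row per year, one column per country, GDP per capita in the cells
df_gdp_pivot = df_gdp.pivot(index="year", columns="country", values="gdp_per_capita")
print(df_gdp_pivot)

print(df_gdp_pivot.loc[2010, "China"])  # 40
```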
So now let's verify if everything is correct. So here we have the index year. And here we have
the year as index. So everything is fine, then the columns should be country. And now we have each
country in the columns. So it's correct. Next, the values are the GDP per capita. And yeah,
we have here that the intersection between a row and a column is the value that corresponds to
the GDP per capita of that country in that year, so everything is working fine. And there you have
it. This is how the pivot method works in pandas. Okay, now let's see how the pivot_table
method works in pandas. So in this case, we're going to work with a different data
set. And to read it, we're going to use the method pd.read_excel, because in
this case, the data set is not a CSV file, but an Excel file. So we use read_excel
for an Excel file. So in this case, the name of this dataset is supermarket_sales.xlsx.
And this is what we're going to see after you run this. And
here, you can see that we have different columns about what a specific person bought in a
supermarket. And here, well, we have the branch, the city, the gender and different data. So here
to make a pivot table, we're going to first name this data frame, and I'm going to name it
df_sales. Now, I'm going to show it here. And okay, now it's here. Okay, the goal of this
task is to see how much females and males spent in this supermarket. So to do
that, we're going to use the pivot_table method in pandas. So first, I'm going to copy this
data frame. And now I'm going to paste it here. And now we're going to make a pivot table and
add an aggregate function. Because remember that the pivot_table method allows us
to add an aggregate function, and the pivot method doesn't support that. So we're gonna
use the pivot_table method this time. And now we're gonna introduce some important
arguments. So the first one is the index. And in this case, if we want to see how much
males and females spent in this supermarket, the index is going to be the gender. So here, I'm
going to copy gender here. And it's going to be here, index equal to gender. So this is the first
necessary argument. And the second one is going to be the aggregate function. So we have to write
aggfunc, that is a, double g, f, u, n, c, and then equal to, and then write the aggregate function we want
to perform. So in this case, is going to be a sum. So we write, sum, and now everything
is ready. So what we're supposed to get here is the information about the sales here in
this data frame, but now divided by gender. So we have the female category, and then the
male category. So let's verify this. I'm going to run this one. And as you can see, here,
we have this summary table or pivot table, and now it's divided by gender. So we can see how
much females spent here in the total column, and also how much males spent, also in the
total column. And here in the quantity column, we can see how many products they bought, how
many products females and males bought in this supermarket. And one detail you might have
noticed is that only the columns that contain numerical data are displayed here. So for
example, here, branch and city, which contain only text, aren't here in this pivot table,
because here in the aggregate function argument, we indicated that we want to sum, and when we sum
values, we cannot sum text, but only numerical data, so only the columns that have numerical
data are displayed in this new pivot table. Okay. That's our first pivot table. And we can do
even more. For example, we can select a pair of columns that we're interested in. So let's say we
only care about the quantity and the total columns. So we want only those columns. So to get that,
I'm going to copy this one. And to show you how to get only those two columns, I'm going to add
a new argument. So here, I'm going to write, in this case, the name of the argument is
values. So I write values equal to, and in this case, I'm going to select the quantity and the total
columns. So I open square brackets, because I'm going to select two or more columns. And inside, I
write the name of the columns. So first quantity, right here, and then total. So here, too. So we're
going to get the same pivot table, but in this case, only the quantity and the total columns
are going to be shown in this table. So I'm going to execute this one, and here I get an error,
because I didn't include this comma. So I'm going to add it here. And now everything should be
fine. And yeah, we got the same pivot table, but only the quantity and total columns are displayed
here. And here, we can clearly see that females spent more than males in this supermarket.
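A sketch of the pivot_table call with index, values and aggfunc. The four rows are made up, and the capitalized column names (Gender, Branch, Quantity, Total) are assumptions about how the Excel file spells its headers. Passing values explicitly also keeps the text column out of the aggregation.

```python
import pandas as pd

# Made-up rows; column spellings are assumed, not confirmed by the transcript
df_sales = pd.DataFrame({
    "Gender": ["Female", "Male", "Female", "Male"],
    "Branch": ["A", "B", "A", "C"],
    "Quantity": [7, 5, 3, 8],
    "Total": [100.0, 50.0, 40.0, 80.0],
})

# Group by gender, keep only Quantity and Total, and sum within each group
by_gender = df_sales.pivot_table(
    index="Gender",
    values=["Quantity", "Total"],
    aggfunc="sum",
)
print(by_gender)

print(by_gender.loc["Female", "Total"])   # 140.0
print(by_gender.loc["Male", "Quantity"])  # 13
```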
But we can get even more detail here. So far, we know that females spent 167,000 in
this supermarket. But with pivot tables, we can even know in which product lines
this money is spent. So let me show you here, we can see how the money is spent in this
product line column. So we only have to add a new argument to this pivot_table method. So
I'm going to show you here. First, we copy this. And now I'm going to paste it here. And
we're going to make a pivot table that shows how much males and females spent in each category or,
well, product line. So we add a new argument, and this one is going to be the columns argument. So I
write columns, then open quotes, I add the comma, and here I write the name of this column, that is
product line. So I scroll up, I copy this column, and then we're gonna see in which category they
spent the money. So health and beauty or sports, and so on. So now I scroll down, and here I paste
it. And before I run this code, here, we only want to display the total, because we only want
to see where the money goes, not the quantity. So only total, so I delete the square brackets too, and
with total, we're gonna see where the money goes, divided by gender. So here, I run, because it's
ready. And now, as you can see here, we can see how much females and males spent in each product
line. So we can quickly see, for example, that females spent more money on fashion accessories
than males, and that kind of makes sense. And also in sports, women spent, or females spent, more
money than males. So we can easily see all of that by using the pivot_table method in
pandas. And this is similar to the pivot table you will find in Excel. And that's it. That's
how you make a pivot table in pandas. Alright, before showing you how to make visualizations
with pandas, first, we have to check the data set. And also we have to make a pivot table. So
we can easily make the plots with pandas later. So first, we have to import pandas to read this
CSV file. And well, I have this import pandas as PD. So we just run this code. And now let's
read this new data set. So as you might remember, to read a CSV file, we have to use the
read_csv method. So we write pd.read_csv. And then we write the name of the
CSV file. So in this case, the name is population. And I'm going to use this population_total.csv,
so I press Tab to get the name. So we have now the name. And now I'm going
to assign this to a new variable. So the variable is going to be df_population_raw.
So this is raw data, and now we're gonna have a first look at this dataset. So
I paste this. And now I'm going to run this too. And now we have this data frame. So here, as
you can see, we have the population of many countries throughout the years. So for example,
we have China here, United States, and India. So we have their population, and here I wrote the
name raw, because this dataset was extracted using some web scraping techniques. And then it wasn't
modified. So now we have to make some changes to reshape this data frame. So we make it easy
for us to make visualizations with pandas later. So what we have to do here is to make a pivot
table to reshape this data frame. And that's what we're gonna do here below. So we're gonna
make a pivot table, and we're going to use that pivot method. So as you might remember, the
pivot method returns a reshaped data frame organized by given index column values. But it's
a pivot without aggregation. So this is what we want. So we only want to reshape this data frame.
So we're going to start by dropping null values. So we do that by writing the name of the data
frame. And now I'm gonna just copy the name, and I paste it here. And now to
drop null values, we have to use the dropna method, so I write dropna, and then
we have to run this. And as you can see here, we have the result. It's a copy of this data
frame. But if we want to save the changes that we make to that data frame, we have two options.
The first option is to use the inplace argument. So I write inplace, and then set this
to True. So if we do this, and we run, all the changes that we make to the data frame are going
to be saved. And the second option is to do something like this to overwrite the content
inside this data frame. So we do something like this, we write df_population_raw
is equal to the same data frame, but with .dropna, so we're overwriting the content inside this
data frame. So I'm gonna choose the first option just to reduce some code. So I write inplace
equal to True, and now I run, and this new data frame shouldn't have any null values. Okay, now
it's time to make this pivot table. So first, I'm going to show you what I'm going to do. So
we have a better idea before writing the code. So here we have the original data frame. And what
we're going to do is to reshape this data frame. So I want the year to be in that index. So
the year column, I want it to be here in the index instead of 01, and so on. And then
I want that Country column or the country, the values inside the country column, I want it
to be here in the columns. So for example, I want China here in one column, then United States in
another column, and then India in another column. In I want the population data inside the data, I
want this to be the only data here. So to do that, we have to use the pivot method. And that's what
we're going to do here below. So let's do it here. So first, we have to write the name of the data
frame, which is this one, and then write that pivot, then we open parentheses in here. Let's see
the arguments that this pivot method accepts. So I press shift and tap. To get this helpful. Let's
call cheat sheet. And now we have the arguments that this pivot method accepts. So first is the
index, then the column and then the values. So as I told you before, the index, I want it to be the
year column. So we have to write index equal to open quotes and I write in year, then comma, and
let's check another argument. So the next argument is the columns. So I want the columns to be the
country. So the data inside the country columns. So here I write columns. Then I open quotes in
here, I read country. So country. And now the last one, I think, is values. And yeah, its values. So
I want the values to be the population data. So let me see if that's correct. And yeah, it's here.
So population, and I'm going to press Enter here, so it looks much better. And population, it's
here. So I have the three arguments: the index, the columns, and the values. Now, I'm going
to reshape my original data frame. So here, I press Ctrl+Enter. Now, as you can see here, we
have the countries in the columns. So here we have many countries, from the first country,
Afghanistan, to Andorra, Argentina, Uruguay, and many other countries. We also have
the year, so it's here, the year from 1955 to 2020. So we can see here the evolution
of the population throughout the years for all the countries in this dataset. But as you can
see, there are many countries. So what we can do here is to select just some countries, so
we can simplify our visualizations later in pandas. So here, I'm going to select some
columns. But first, I'm going to give this new data frame a name.
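The reshape described above can be sketched on a tiny example. The numbers are placeholders, and the column names year, country, and population are assumed from the way the transcript refers to them.

```python
import pandas as pd

# Placeholder long-format data; column names and numbers are assumptions.
df_population_raw = pd.DataFrame({
    "year": [2015, 2015, 2020, 2020],
    "country": ["China", "India", "China", "India"],
    "population": [10, 9, 11, 10],
})

# Reshape without aggregation: one row per year, one column per country.
df_pivot = df_population_raw.pivot(index="year", columns="country", values="population")
print(df_pivot)
```

Because pivot does not aggregate, it raises a ValueError when the same year and country pair appears more than once; pivot_table is the aggregating alternative in that situation.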
So I'm going to name it df_pivot. So this is my new data frame. Now I'm going to
rearrange this, and now it looks much better. So now I'm going to run this. And now let's
select some countries. So I copy this pivot data frame. And now we open square brackets, double
square brackets to select two or more columns. And here, let's write some countries: the
first is United States, then, let's say, India, then China, and two more countries, Indonesia
and, last but not least, Brazil. So here, we have the five countries. So I run here, and we
have these five countries and their population from 1955 to 2020. So great. Now we've simplified
this data frame. And now I'm going to overwrite the content inside the data frame df_pivot. I'm
going to write here df_pivot equal to df_pivot with the selection, so I'm overwriting
the content. So I press Ctrl+Enter, and our new df_pivot is here. So we have it here.
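A small sketch of the column selection just described, using a placeholder table with made-up values and fewer countries than the real dataset:

```python
import pandas as pd

# Placeholder wide table: years in the index, countries in the columns.
df_pivot = pd.DataFrame(
    {"Brazil": [1, 2], "China": [10, 11], "France": [3, 3], "India": [9, 10]},
    index=pd.Index([2015, 2020], name="year"),
)

# A list inside the indexing brackets (hence the double brackets) selects several columns.
# Assigning back to the same name overwrites the wider table with the selection.
df_pivot = df_pivot[["India", "China", "Brazil"]]
```

The columns come back in the order given in the list, which is also the order the plots will use later.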
And now I'm going to show it to you. And this is our new df_pivot data frame. And
that's it. Now our data is ready, so we can use it to make visualizations with pandas.
And that's what we're going to do in the next video. Okay, now it's time to make some visualizations
with pandas. Here, I have the data frame that we created. This is the pivot table we created
in the previous video. And as you can see here, we have five countries in the columns. And here
we have the year in the index, from 1955 to 2020. So what we're going to do now is to make our
first visualization, so I scroll down here, and the first one is going to be a line plot.
So here, first, to make this visualization, I'm going to copy the name of the data frame, and I
paste it here. Now, to make plots with pandas, we have to use the plot method. So we
write .plot, and now I open parentheses. And one argument we need to introduce
is the kind argument, so I write kind, now equal to, and here I have to write the kind of
plot we want to make. In this case, it's a line plot, so we write line. And this is actually the
key argument we introduce here (line is also the default kind). And now we can run this code, so
I press Ctrl+Enter. As you can see here, I have the line plot. In this
line plot, we can quickly see the evolution of
the population throughout the years. For example, China and India, which are the green and orange
lines, had fast-growing populations, while the United States, Indonesia, and
Brazil have lower populations, I mean, their population didn't change as much in the
past 50 years. Here, we can add more arguments to this plot method to customize this line plot.
So here, we can introduce another argument, which is xlabel. And this xlabel is what
you can see here. When we created this line plot, by default, it was assigned this year
label, but we can change it. So for example, let's say we want to write Year, but
now with a capital letter, so we write Year here. And now let's say we want to add a new label
here on the y axis. So here, we only have to write ylabel, then equal to, open quotes, and
here, we have to write the name we want. So in this case, I'm going to write only Population.
And finally, we can also add a title. We can add any title we want. In this case, I'm going to
write, well, the name of the argument first, title, then equal to, and then the title
is going to be, let's say, Population from 1955 to 2020. So this is the title. So let's run
this. And as you can see here, we got the title, Population from 1955 to 2020, and the x label
and y label were modified too. Finally, we can add one more argument; in this case, the argument
is the size of the figure. To change the size of this figure, we can add the argument named
figsize, and this is a tuple, so we have to open parentheses. And now, to set the size, we have
to add two values. The first is the width, the size along the x axis, and the second is the
height, the size along the y axis. So in this
case, I'm going to set it to eight and then four, which means that the x axis is going to be
long, while the y axis is going to be short. So here, I'm going to run this code, and let's check
it out. So here the figure has a different size. And that's how you can customize this line plot.
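Putting the arguments from this walkthrough together, a minimal sketch could look like this. The table is a placeholder with made-up numbers, and the title is adapted to the toy years.

```python
import pandas as pd

# Placeholder pivot table: years in the index, one column per country.
df_pivot = pd.DataFrame(
    {"China": [10, 11, 12], "India": [8, 10, 12], "Brazil": [2, 2, 3]},
    index=pd.Index([2010, 2015, 2020], name="year"),
)

# Each column becomes one line; the index is used for the x axis.
ax = df_pivot.plot(
    kind="line",
    xlabel="Year",
    ylabel="Population",
    title="Population from 2010 to 2020",
    figsize=(8, 4),  # (width, height) in inches
)
```

The call returns a matplotlib Axes object, so further tweaks can be made on ax if the keyword arguments are not enough.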
Okay, now let's make a bar plot with pandas. The first thing we have to do is to select
only one year, because this bar plot takes only one year, and we can plot the
population of different countries. So let's select one year of the data frame we had before. I'm
going to copy the name of the data frame, so you can check it out again. So this is the data frame,
and we're going to select one year. To do that, we have to use the index attribute and then the
isin method. So first, I'm going to show you the index attribute, in case you don't remember.
So here, the index attribute allows us to see all the index values in this
data frame. So we have here from 1955 to 2020. That's what the index attribute does. And
now, if we use the isin method, we can filter the index. So here, let's say we want
to select only 2020. So I write 2020 here, and
first, I'm going to make the selection. So it's here. And now I'm going to show you
the result. So here, I press Ctrl+Enter, and the result is this little data frame that
only contains the population in the year 2020. This is important because the bar plot is
supposed to show only the population in this year. So here we have it. And now what we're going
to do is to name this data frame. So here we write equal to, and then let's give it a name. I'm
going to name it df_pivot_2020. So here, I press Ctrl+Enter. And now I'm
going to show this new data frame again here. And one little detail I have to tell
you is that when we make a bar plot, we have to put the text data in the index. So here, the
names of the countries should be in the index. To do that, we have to transpose the data frame.
This transpose allows us to switch rows and columns. So here, we can easily do that by
writing the name of the data frame, and then .T. So if we now run this code,
we can see here that we have this. Now the year 2020 is in the columns and not in the index
anymore, and the country names are in the index. So this is the format we need to have before
making the bar plot. So now I'm going to overwrite the content of this data frame. So I write
df_pivot_2020 equal to this same data frame, but with .T. So here, I run this,
and now it's time to make the bar plot. So here, I copy the name of the data frame, and I
use the plot method. So I write .plot, again open parentheses, and the first argument is
kind. So I open quotes, and we write bar. So now it's ready, and we can run it. As you
can see, we have a basic bar plot. And it has some default values, like the name of this x label.
And also the default color is blue. We can customize this bar plot a bit more. For example,
I want a different color. So I write the color argument and then open quotes, and let's
say I want it to be orange, so I write orange. Also, we can change the x and y labels. Actually, I
can copy this from here, so I can save some time. So xlabel and ylabel are here, and let's
paste them here. So x and y label. And finally, I can also add the title, which was here. So I
copy and paste it. But in this case, the title is a bit different, because it's not from
1955 to 2020, but only 2020. So here, I leave only 2020. And now let's run this to see the
results. You can see here we have the title, the x and y labels, and the bar plot is in orange. So
that's how you customize the bar plot. Alright, so far, so good. Now let's go one step further
by making bar plots grouped by n variables. Here, we have to select a group of years to make
these bar plots grouped by n variables. So I'm going to copy the code we used before to select
only the year 2020. I'm going to copy this, and in this case, I'm not going to select only one
year, but a group of years. So let me show you here. I'm going to
show you the pivot table again, so you can easily understand. Instead of choosing only
2020, I'm going to choose some other years here, so I'm going to delete this, and I'm going to
write them here. So let's say 1980, 1990, then 2000, then 2010, and finally 2020. So we have
a group of years here, and we're selecting them using the index and the isin method. So here,
I'm going to give it a different name. In this case, since it's a sample, I'm going to
write df_pivot_sample. But first, I'm going to show you
this one, so you can see what it looks like. So now we have five countries, and now five years. So
now I'm going to assign this to my data frame, df_pivot_sample. I
run this, and now we have this new data frame. So it's time to make this grouped bar plot.
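The two selections prepared so far (a single year, then a group of years) and the transpose step can be sketched as follows. The table is a placeholder with made-up values and fewer years than the real data.

```python
import pandas as pd

# Placeholder pivot table: years in the index, one column per country.
df_pivot = pd.DataFrame(
    {"China": [10, 11, 12], "India": [8, 10, 12]},
    index=pd.Index([2010, 2015, 2020], name="year"),
)

# index.isin(...) builds a boolean mask, which then filters the rows.
df_pivot_2020 = df_pivot[df_pivot.index.isin([2020])]
df_pivot_sample = df_pivot[df_pivot.index.isin([2010, 2020])]

# .T swaps rows and columns so the country names end up in the index.
df_pivot_2020 = df_pivot_2020.T
```

Only the single-year table is transposed here; the multi-year sample keeps the years in the index so that the bars can be grouped by year.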
So here, we write the name of the data frame, and then the plot method. So we write .plot, and
now let's add the first argument, which is kind, equal to bar. Now we run this. And
as you can see here, we have the bar plots grouped by year. So here's
1980, then 1990, and so on. And you can also add the same arguments we added here. So for example,
I can add the x and y labels, so I can do it here. I'm going to do it fast. So here, I run.
And as you can see here, we modified the x and y labels. And that's it.
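Both bar plots from this part, the single-year one and the grouped one, can be sketched on placeholder tables. The values are made up, and the x label Country on the first plot is my own choice rather than the label copied in the video.

```python
import pandas as pd

# Placeholder single-year table: countries in the index, one column for the year.
df_pivot_2020 = pd.DataFrame({2020: [12, 12, 3]}, index=["China", "India", "Brazil"])

# Placeholder multi-year sample: years in the index, countries in the columns.
df_pivot_sample = pd.DataFrame(
    {"China": [10, 12], "India": [8, 12]},
    index=pd.Index([2010, 2020], name="year"),
)

# One bar per country, with a custom color, labels, and title.
ax_single = df_pivot_2020.plot(
    kind="bar",
    color="orange",
    xlabel="Country",
    ylabel="Population",
    title="Population in 2020",
)

# One group of bars per year, with one bar per country inside each group.
ax_grouped = df_pivot_sample.plot(kind="bar", xlabel="Year", ylabel="Population")
```

In the grouped plot, pandas draws one bar per column for every index value, which is why the years stay in the index of the sample.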
That's how you make bar plots with pandas. Okay, in this video, we're going to learn
one of the most common charts that we can make in pandas and actually in any other visualization
tool, and these are pie charts. So before we make this pie chart, first, let's take a look
at the data frame we're going to use. In this case, to make a pie chart, we're
going to use the same data frame we used for making the bar plot, because it follows the same
logic. So here, I'm going to copy the data frame we created for the bar plot, which is this one,
df_pivot_2020. This is what we created before by using the index
attribute and the isin method. So here, I'm going to copy this. And now I'm going to show
it to you here, so you can remember what's inside this data frame. And it's here. So here, as you
can see, we have the column 2020, and the countries are in the index. So everything is fine. That's
the format we need for making the pie chart. But there is one little thing
we have to modify, and this is the column name, because now it's 2020, and this is a number;
I think it's actually an integer. It's not good practice to have numbers as column names. So
what we have to do is to make this a string. To do that, we use the rename method. So we
write .rename, open parentheses, and now we use the columns argument, so we write columns, then
open curly braces. And now we write the name of the column we want to change, which
is 2020, and we're going to turn this integer value into a string. So we open quotes and
write 2020. They look the same, but the green one is an integer, and the red one is
a string. So now, to save these changes, I'm going to write inplace equal to True. And
I'm going to run this. So now we can make the plot. Here, I'm going to write the name of the data
frame, and now I'm going to use the plot method. So here I write .plot. The first argument
is kind, and here, I write pie. So the kind is pie, and now I run this. And here, I forgot to
include the y argument, so I'm going to write it here. The y argument is supposed to have the data.
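A minimal sketch of the rename step and the pie chart call described here, on a placeholder single-year table with made-up values:

```python
import pandas as pd

# Placeholder single-year table: countries in the index, integer column name 2020.
df_pivot_2020 = pd.DataFrame({2020: [12, 12, 3]}, index=["China", "India", "Brazil"])

# Map the integer column name to a string; inplace=True changes the existing object.
df_pivot_2020.rename(columns={2020: "2020"}, inplace=True)

# For a data frame, kind="pie" needs y to say which column holds the values.
ax = df_pivot_2020.plot(kind="pie", y="2020", title="Population in 2020 (%)")
```

Leaving out y on a data frame makes pandas raise an error asking for either a y column or subplots=True, which matches the failed first attempt in the video.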
So in this case, I'm going to show you the data frame here again. The data is here in 2020, so
we should write 2020 here. So I'm going to delete this. And here in the y argument, you write the
column that has the data. That's what we did. So now I run this, and now we finally have
our pie chart. So here is the pie chart. That's how you make a pie chart. If you want, you
can even add another argument, like the title, for example. Here I can say that this is the
population in 2020, but in this case in percentages. So I write this, and now we have this title.
So that's how you make a pie chart in pandas. Alright, so far, we've made a pivot table and
many plots using pandas, and in this video, we're going to learn how to export the pivot table and
also the plots we made with pandas. Let's start by exporting the plots we made with pandas, and to
do that, first we have to import matplotlib. So we write import matplotlib.pyplot, and
then we write as plt. So this plt represents matplotlib.pyplot. So now we run, and we
import matplotlib. And now we can use this plt to save the plot. So we write plt.savefig,
and now we open parentheses, and here we have to write the name of the file we want to export.
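The import and the savefig call described so far can be sketched like this. The table holds made-up values, and the temporary-folder path is a hypothetical choice so that the sketch runs anywhere; in a notebook, a bare file name saves next to the notebook file.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import pandas as pd

# Placeholder pivot table with made-up values.
df_pivot = pd.DataFrame(
    {"China": [10, 12], "India": [8, 12]},
    index=pd.Index([2010, 2020], name="year"),
)
df_pivot.plot(kind="line")

# Hypothetical output location used only for this sketch.
out_path = os.path.join(tempfile.gettempdir(), "my_test.png")
plt.savefig(out_path)  # saves the current figure; the format follows the .png extension
```

In a notebook, savefig is best called in the same cell as the plot call, since it works on the current figure.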
And here, I'm going to write my_test.png. So this is the extension, and this is
the name of the file. And now, before exporting this file, I'm going to show you something
here. You probably know that when we make a plot with pandas, we get these words
here that say AxesSubplot and all of this. We can get rid of these words by using the
show method. So we write plt.show with parentheses. And if we run this, we're going to
export this figure, and we're also going to get rid of these words. So let's try to run this. And
as you can see here, all those words disappeared, and we also exported the figure to a PNG file.
Now this file should be located in the same folder where you have this Jupyter Notebook file. Okay,
I'm going to open that file. But first, I'm going to export the pivot table. So here, I copy
df_pivot, and I paste it here. In order to export it, we have to use the to_excel
method. So I write .to_excel, and now I open parentheses, and here we write the
name of the file where we're going to export this pivot table. In this case, I'm going
to name it pivot_table.xlsx. So this is the extension of Excel, and this is
the name of this file. So now I run this, and now the pivot table should be exported. Alright,
now I'm going to open the Excel file and the PNG file we created. So it's here, and here we have
the plot we exported, and also the pivot table. As you can see here, the plot looks exactly
the same as the one we created here with pandas. And the pivot table is the same. So I'm going
to show you how the pivot table looks. Here is the pivot table, and here is the pivot table
we exported. I opened it in Google Sheets, and it looks exactly the same. And that's it. In this
video, you learned how to export data frames as well as plots