We live in the age of AI. Most AI tools are now based on large language
models. Today, we'll explore how to perform data analysis
using the Pandas AI tool locally. Let's take a look at what we'll cover in this
video. First, we'll briefly explain what Pandas AI
is. Next, we'll load our dataset. After that, we'll get a large language model with Ollama. Lastly, we'll analyze data with this model. Keep in mind that, to work with the large language
model, we will not use any API key, such as an OpenAI API key. This means that the analysis we'll do is completely
local and free. Let's dive in! What we're going to do now is look at Pandas
AI. Pandas AI is a Python library that lets you talk to your data in natural language. It allows you to explore, clean, and analyze
your data using generative AI. Overall, Pandas AI makes it much easier
to carry out your data analysis projects. Okay, we've seen what Pandas AI is. Let's go ahead and start the setup for our
project. Now, we're going to create a virtual environment. To do this, we're going to use conda. Let's open the VS Code terminal and write conda create, then -n to give the environment a name,
and let's name it genai, so the full command is conda create -n genai. I already created this. After that, we'll activate this environment. Let's do this: conda activate genai. Okay, our environment is ready to use. What we're going to do now is install the
tools we'll use. To do this, let's create a requirements.txt
file. We're going to click on the new file and then
name it requirements.txt. After that, let's write the tools we'll use
here. To read our dataset, we'll use pandas; to chat with it, pandasai; and
to import Ollama, we'll use langchain and langchain-community. For now, these tools are enough to analyze
data. Now, we're going to install these: pip install -r requirements.txt. Yeah, the installation has started. There you go. Our tools are ready to use. To write our code, we're going to create
a notebook file. Let me click on the new file and then give
it a name. Let's say, pandasai.ipynb. After that, let's select the kernel. To do this, let me click on Python Environments and then select genai. It's time to read our data. To load the data, we're going to use Pandas. First, let's import this library: import pandas as pd. The dataset we're going to use is country
populations. You can find a link to this dataset in the
description below. Let's have a look at this dataset. This dataset includes the
population of several countries. What we're going to do now is read this dataset
with the read_csv function. Let's say, data = pd.read_csv("population.csv"). Okay, our dataset is loaded. Let's take a look at the first five rows of the
data. To do this, we're going to use the head method: data.head(). There you go. You can see countries and populations. Nice, our dataset is ready. What we're going to do now is initialize the
model. To do this, we're going to use Ollama. To leverage Ollama, you need to install it. To install this tool, go to the Ollama website
and then click on the download button. After installing this tool, you can use it
in your terminal. Let me show you the version of Ollama. Let's write, ollama --version. There you go. To download a model, you can use the pull command. To see the models you can use, you can
click on Models. There you go. Here are many models you can use. For this tutorial, we're going to use
the mistral model. Let's pull this model. To do this, go to our terminal and then write, ollama pull mistral. There you go. Our model is ready to use. What we're going to do now is import Ollama
from langchain-community. Let's write, from langchain_community.llms import Ollama. Now, let's initialize our model with the model we just pulled. Let's say, llm = Ollama(model="mistral"). Awesome, our model is ready. To talk to our dataset, we're going to use
the SmartDataframe class. This class allows you to interact with a single
dataframe. First, let's import this class: from pandasai import SmartDataframe. After that, let's convert our data into a SmartDataframe
using this class: df = SmartDataframe(data, config={"llm": llm}). Nice, our dataframe is ready to chat
with. Let's go ahead and start talking to our data. The first question we're going to ask is to
find the top 5 countries by population. Let's write, df.chat('Which are the top 5 countries by population?'). There you go. You can see the top 5 countries by population
here. It is important to keep in mind that we didn't
use any API key. We just used an open-source model through
Ollama. This means that we performed this analysis
locally and for free. Great, we've seen the top 5 countries by population. Let's go ahead and find the total population
of the top 5 countries. Let's write, df.chat("What is the total population of the top 5 countries by population?"). Let me run this cell. There you go. That was easy, right? You can explore your dataset with prompts
like this. Yeah, that's it. In this video, we've covered how to perform
data analysis with PandasAI and Ollama. Using these AI tools, you can do your data
analysis projects locally and for free. If you have any questions, let me know. The link to this notebook is in the description. Hope you enjoyed it. Thanks for watching. Don't forget to subscribe, like the video,
and leave a comment. See you in the next video. Bye for now.
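For reference, here's roughly what the notebook cells from this video look like when you put them in one place. Treat it as a sketch, not the exact notebook: the names CSV_PATH, MODEL_NAME, PROMPTS, and run_analysis are made up for this recap, and it assumes a PandasAI release that still exposes SmartDataframe, a langchain-community release that still exposes Ollama under langchain_community.llms, and a local Ollama install where ollama pull mistral has already been run.

```python
"""Recap sketch of the walkthrough: PandasAI with a local Ollama model."""

# Hypothetical names for this recap; only the values come from the video.
CSV_PATH = "population.csv"  # the country populations file read with read_csv
MODEL_NAME = "mistral"       # must match the model fetched with `ollama pull mistral`

# The two questions asked in the video.
PROMPTS = [
    "Which are the top 5 countries by population?",
    "What is the total population of the top 5 countries by population?",
]


def run_analysis(csv_path=CSV_PATH, model_name=MODEL_NAME, prompts=PROMPTS):
    """Load the CSV, wrap it in a SmartDataframe, and return one answer per prompt."""
    # Imported inside the function so the file can be imported
    # even when the third-party packages are not installed yet.
    import pandas as pd
    from langchain_community.llms import Ollama
    from pandasai import SmartDataframe

    data = pd.read_csv(csv_path)

    # Talks to the locally running Ollama server, so no API key is involved.
    llm = Ollama(model=model_name)

    # SmartDataframe lets you chat with a single dataframe.
    df = SmartDataframe(data, config={"llm": llm})

    return [df.chat(prompt) for prompt in prompts]
```

In a notebook cell you would call run_analysis() and look at the returned answers one by one. Since the answers come from a language model, they can vary between runs, so it's worth checking them against data.head() or a plain pandas query.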