Sentiment Analysis of Financial News in Python - 3 Ways using Dictionary, FinBERT and LLMs

Captions
In this video, I'll show you three methods for extracting sentiment from financial news using Python. The three methods go from the simplest to the most recent: I'll first start with a dictionary method, which is super easy, super simple, and intuitive, but maybe not as modern. Then I'll move on to FinBERT, which is a BERT-based model, and finally we'll see how you can use large language models. This is Vincent Codes Finance, a channel about coding for finance research; if that's something that interests you, please subscribe. Before we start analyzing sentiment, we'll need to get some financial text. For that, I'll use financial news that I'll get from Yahoo Finance. In order to retrieve it automatically, I'll use a document loader from LangChain called NewsURLLoader. To use that loader, you'll need to have the newspaper3k package installed, and newspaper3k also requires lxml_html_clean but doesn't install it by default, so you'll need to install both for this to work. Obviously, you'll also need LangChain and the related LangChain packages; all the other packages we'll install and discuss as we go along. We'll also need pandas: I'll store everything in a pandas DataFrame as we go so that we can keep track of our progress. As I said, we'll be using news from Yahoo Finance. This is one of the news pieces I've included; it talks about the oil industry in Canada. First, we'll import pandas, then the NewsURLLoader, then we'll define a list of URLs containing all the news URLs I want in my dataset. Here I only have five for our sample, but obviously you could do it with a lot more. Then I'll create a NewsURLLoader and use that loader to load the data. Then, if we look at the data, it is a list of
documents. The Document is a built-in type in LangChain. I'll extract that data into a DataFrame, keeping only the title and the text of each piece. Now we have a DataFrame with news titles and the text content of each news piece. The first method I'm going to present is the dictionary-based method. For the dictionary-based approach, there are multiple ways to proceed; I'll only show one today, which uses the dictionary of Loughran and McDonald, the most widely used in finance research. To get the dictionary, you'll have to download it from their website; I do have the link. As always, all the code in this tutorial is available on GitHub; you can look at the video description for the GitHub link and a link to the written description and instructions for this tutorial. The Loughran-McDonald Master Dictionary can be downloaded from their site; I'm using the one in CSV format. Be aware of the license: commercial use falls under a different license, but here we're only doing research, so that's fine. Once we have this dictionary, we'll almost be ready to apply it to our text, but we'll have to do some pre-processing first. For this method, the pre-processing is fairly simple: we take our text, change all the words to lowercase, remove punctuation, remove stop words (frequent words), and remove numbers. Typically, we'd also do some stemming or lemmatizing, but in the case of the Loughran-McDonald dictionary there's no need, because it includes all the variations of each word. First, I'll get my list of stop words, common words that I want to exclude from the text because most of the time they're not believed to carry any meaning. For that, I have all these stop words defined in a text file.
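The cleaning pipeline just described (lowercase, drop stop words, keep only purely alphabetic tokens) can be sketched as a short function. The stop-word set below is a tiny stand-in for the Loughran-McDonald list loaded from the text file:

```python
# Minimal sketch of the cleaning step described above.
# STOP_WORDS is a tiny stand-in; the video loads the full
# Loughran-McDonald stop-word list from a text file instead.
STOP_WORDS = {"the", "a", "of", "and", "in", "to"}

def clean_text(text: str) -> str:
    """Lowercase, drop stop words, keep purely alphabetic tokens."""
    words = [w.lower() for w in text.split()]
    words = [w for w in words if w not in STOP_WORDS]
    words = [w for w in words if w.isalpha()]  # drops numbers and mixes like "q3"
    return " ".join(words)

print(clean_text("The firm reported a loss of 12 million in Q3"))
```

Applied to a DataFrame column, this would be `df["clean_text"] = df["text"].apply(clean_text)`.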
These are the words that Loughran and McDonald use. There are multiple other stop-word lists; for example, the NLTK library in Python includes stop-word lists for many languages. But here we're just using the same stop words as Loughran and McDonald; they are in a text file. I'll start by reading this text file; these are the first 10 words we'll be excluding. Then we'll pre-process all our text. The pre-processing is very minimal in this case: we take our string, split it, convert all the words to lowercase, then keep only the words that are not stop words, filtering the stop words out. Finally, we keep only words in which all the characters are letters, excluding numbers and words that are a mix of numbers and letters. We output a string by joining all of these words separated by spaces. We apply that to our dataset, and now we have a new column of clean text, ready to be processed with the dictionary. The next step is to actually load the dictionary. This is what the CSV file looks like. We'll make a list of positive words: we look at our dictionary, keep only the rows for which the Positive tag is not zero, take that column of words, convert it to lowercase, and make it a list. We do the same thing for negative words. Here we have our first 10 positive words, and similarly the first 10 negative words. Now, when it comes to computing a sentiment score, it really depends on what you want to include. There are different ways to do it: you can count the number of positive words and the number of negative words and take the difference, which gives you a level; you can also compute a ratio, using either all the words as the denominator or only the words that carry sentiment as the denominator.
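Building the positive and negative word lists from the dictionary CSV can be sketched like this. The inline CSV is a tiny stand-in for the real Master Dictionary file, and the column names (Word, Positive, Negative) are assumed to match it:

```python
import io
import pandas as pd

# Stand-in for the Loughran-McDonald Master Dictionary CSV; the real file
# has many more rows and columns. A non-zero Positive/Negative tag marks
# a sentiment word (assumed column names: Word, Positive, Negative).
csv_data = io.StringIO(
    "Word,Positive,Negative\n"
    "ABLE,2009,0\n"
    "ABANDON,0,2009\n"
    "ABACUS,0,0\n"
)
lm = pd.read_csv(csv_data)

# Keep rows with a non-zero tag, lowercase the words, make a list.
positive_words = lm.loc[lm["Positive"] != 0, "Word"].str.lower().tolist()
negative_words = lm.loc[lm["Negative"] != 0, "Word"].str.lower().tolist()

print(positive_words, negative_words)
```

With the real file, you would replace the `StringIO` with the path to the downloaded CSV.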
So, we'll do all of them. First, we compute the number of words in our clean text; that will be the denominator for one of our ratios. To do that, we take each clean-text string, split it on spaces, which gives us a list of words, and compute the length of that list. That's our total number of words. Then we compute the number of positive words: for each clean text, we take all the words in the text and keep only those that appear in the list of positive words; that gives us a list, and we compute its length. We do the same thing for negative words. Once we do that, this is what we have: in this example, we have 261 words, one positive word, and seven negative words. Once we have that, we can compute different measures of sentiment, and we'll do it multiple ways. First, in levels: we just look at the difference between the number of positive and negative words. Then a sentiment score, which is the same difference but divided by the total number of words. And finally, a second score, which is divided by the number of words that have some sentiment attached to them. This one can be problematic if there are no sentiment words in your text, so be careful if you use it, but these are different ways to build these metrics. Finally, if we really want a categorical classification, with labels such as positive, negative, or neutral, we can use a cutoff value on one of these ratios; here we'll use a cutoff value of 0.3 on our second score.
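The measures above, plus the cutoff classification, can be sketched as two small functions (the word sets here are made up for illustration):

```python
def sentiment_measures(clean_text: str, positive: set, negative: set):
    """Level and the two ratio measures described above."""
    words = clean_text.split()
    n_pos = sum(w in positive for w in words)
    n_neg = sum(w in negative for w in words)
    level = n_pos - n_neg
    score = level / len(words) if words else 0.0  # all words as denominator
    n_sent = n_pos + n_neg                        # sentiment words only
    score2 = level / n_sent if n_sent else 0.0    # guard: no sentiment words
    return level, score, score2

def classify(score2: float, cutoff: float = 0.3) -> str:
    """Categorical label from the second score, using a +/- cutoff."""
    if score2 > cutoff:
        return "positive"
    if score2 < -cutoff:
        return "negative"
    return "neutral"

# Toy example with made-up word lists.
pos = {"gain", "strong"}
neg = {"loss", "weak", "decline"}
level, score, score2 = sentiment_measures(
    "strong demand but loss and decline in weak markets", pos, neg)
print(level, score, score2, classify(score2))
```

The `if n_sent else 0.0` guard handles texts with no sentiment words at all, the case flagged as problematic above.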
If the score is above 0.3, we say this is a positive-sentiment text; if it's below -0.3, we classify it as negative; otherwise, we classify it as neutral. This is what we do here, and now, for this first text for example, we get the level sentiment and these scores, so it would be classified as negative. Actually, they would all be classified as negative, except for one article, which is positive: it talks about restaurants along the eclipse path that saw record business because people wanted to see the eclipse and eat during that day. This is not the only dictionary-based method: many variations of these methods exist, there are also data-driven methods where the dictionary is built using data, and there are different transformations that can be applied to words directly. This is just one simple way, but it is very commonly used in finance research. The second approach I want to use for sentiment analysis is the FinBERT model, a derivative of the BERT model that is fine-tuned for sentiment analysis on financial text. You can see all the details for this FinBERT model on Hugging Face, and we'll use the Hugging Face Transformers library to download and use it. This model is based on BERT; the BERT model was introduced in the paper linked in the description. If you go on Hugging Face, you'll see there's a big community and a wide array of models based on BERT; there are even fine-tuned versions of FinBERT, but today we'll use the basic FinBERT. The way FinBERT and most BERT-based models work for sentiment analysis is that they first take the text input, convert it to tokens, and then the model creates vector embeddings, a vector representation (just an array of numbers) of that text. Then, in the final layer of the model, this vector is going to be
used as input to three classification heads that generate a probability for each class. Here we have a negative, a positive, and a neutral class, and the answer to our classification process is the class with the highest probability. There are different approaches to pre-processing when using BERT-based models; some mask certain words, say removing dates, company names, things like that. In this case, we'll do absolutely nothing: I'll just feed the raw text to the model. For this example, we'll use the Transformers library from Hugging Face, and we'll also need PyTorch and SciPy, so we'll be using Transformers, SciPy, and PyTorch to run our models. From the Transformers library, we import AutoTokenizer and AutoModelForSequenceClassification, which allow us to load the FinBERT tokenizer and the FinBERT model for classification. Loading these models may take some time on your computer the first time; for me, they are cached locally, so it's really fast. Then I'll define a FinBERT sentiment function that takes a text as input and produces a few different values. What I want to return is a tuple that contains three floats, the probabilities assigned to the positive, negative, and neutral labels, and a string, the label with the maximum probability assigned to it. The way the function works is that, first, we hide everything behind a torch.no_grad() block;
this is a way to tell PyTorch that we won't be doing any gradient descent here, because we're only doing inference, not training, and that makes things go faster. Then we take our text and pass it to the tokenizer, adding some padding and making sure we truncate if required. We cap the maximum length at 512, so that we're only looking at the beginning of the article; different models have different maximum lengths, but here we'll just keep it at 512. Once we've got our inputs to the model, our text has been tokenized and we can run it through the model to get the actual output for each class. These will be our outputs, and then we get the logits, which are basically the scores associated with each of these classes. We combine them into a dictionary where, for each possible label (positive, negative, and neutral), we put the associated value; for that, we use SciPy to convert these logits with a softmax function into probabilities. Once we've computed our scores, we return them as a tuple: the scores for positive, negative, and neutral, and the label associated with the maximum score. Once we've defined that function, we can just apply it to our text, and that returns a tuple. To make sure we can assign it to new columns in our DataFrame, we take the result of applying the FinBERT sentiment function (the tuple) and apply the pandas Series constructor to it; that takes care of turning these four values into a format that can be assigned to four new columns.
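The logits-to-probabilities step can be reproduced without running the model itself. The video uses scipy.special.softmax on FinBERT's logits; here is the same computation sketched in plain Python, with made-up logit values:

```python
import math

def softmax(logits):
    """Plain-Python stand-in for scipy.special.softmax."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the (positive, negative, neutral) classes.
labels = ["positive", "negative", "neutral"]
logits = [0.2, -1.1, 1.4]
probs = dict(zip(labels, softmax(logits)))
best = max(probs, key=probs.get)  # label with the highest probability

print(probs, best)
```

The returned tuple in the video is then simply `(probs["positive"], probs["negative"], probs["neutral"], best)`.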
Once we've got these four new columns, I also compute a score, which is the difference between the probability assigned to the positive class and the one assigned to the negative class. Now, if we look at our titles, texts, and what we get with FinBERT: for this first piece of news, we see that it is actually the neutral class that has the highest probability assigned to it, so our FinBERT score will be 0.3, but it will get classified as neutral. The next three are classified as negative, and finally, this news piece is classified as positive. FinBERT is only one model; it is the one most commonly seen in the literature for doing sentiment analysis on financial text, but there are fine-tuned models that have been trained on different datasets. In our case, the FinBERT model has been trained on financial news, so that's pretty good, but if we were looking at other types of financial documents, some fine-tuning might be a good option; that's the topic for another video. Finally, the last approach I want to talk about today is the use of large language models. Obviously, everyone has seen GPT-3.5, GPT-4, Llama 2; all of these models seem to be pretty good at analyzing and understanding text, so here we'll see how we can use them to extract sentiment. For that, I'll use LangChain, and I actually won't use any OpenAI models; I want to keep everything open source, so we'll use Ollama-based models. I've talked about Ollama in my previous videos. If you don't have it installed yet, you can just go to the Ollama site and click download; if you're on Mac, brew install ollama. After you've installed Ollama, you have to go and pull a model; you'll need one or more models depending on what you want to try. In this video, I'll be using Llama 2 and Mixtral. To install Llama 2 once you've got Ollama installed, you'll do ollama pull llama2, and the same thing for Mixtral. We'll need a few more libraries for this part to work: we'll need LangChain,
which we already used to read the data, and if you want, you can install langchain-openai and openai; it would be exactly the same process, unless you want to use some fancier feature from OpenAI, but in my case, I'll only use the Ollama models. First, we'll build our query and make sure that we can actually get the result in the format we want. For that, we'll use the Pydantic support that LangChain has: we'll import BaseModel and Field from Pydantic, we'll use the PydanticOutputParser, and we'll also use a prompt template to create our prompt. If you're not familiar with Pydantic, it is a library that lets you define data structures, a bit like dataclasses but with many more features, including data validation, so it is perfect for our use case here, and LangChain has built-in support for Pydantic. Then we'll import our chat model; here I'm using ChatOllama. As you'll see, chat models (large language models) are not always good at following all the instructions, and sometimes we have to resubmit our request, at least in my experience with the Llama 2 models, but with Mixtral as well. For that, I'll use the tenacity library, which has a retry decorator we can apply to our function; it will retry the function until there are no errors, up to a certain limit. The first step is to define what we want in terms of output, and this is where the Pydantic model comes in. We define a new class called SentimentClassification, derived from the Pydantic BaseModel, with a few fields. The first one is sentiment: a string that is the sentiment of the text, with three allowed values, positive, negative, and neutral; this is our label. Then, just for kicks, we'll see whether we can get a numerical score from our large language model: this will be the score of the sentiment, a float somewhere between -1 and 1. Then I'll also ask the model to justify the decision.
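The output schema can be sketched with a stdlib dataclass standing in for the Pydantic model (the video uses Pydantic's BaseModel and Field; the field names here mirror its description, with validation done by hand):

```python
from dataclasses import dataclass

ALLOWED = {"positive", "negative", "neutral"}

@dataclass
class SentimentClassification:
    """Stdlib stand-in for the Pydantic model described above."""
    sentiment: str      # one of the three allowed labels
    score: float        # between -1 and 1
    justification: str  # the model's reasoning for the label
    main_entity: str    # the main entity discussed in the text

    def __post_init__(self):
        # Pydantic would validate these constraints automatically;
        # here we check them by hand.
        if self.sentiment not in ALLOWED:
            raise ValueError(f"sentiment must be one of {ALLOWED}")
        if not -1.0 <= self.score <= 1.0:
            raise ValueError("score must be between -1 and 1")
```

With Pydantic, the same constraints would be expressed through Field definitions, and the PydanticOutputParser would use the class to generate format instructions for the LLM.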
So we'll have a justification of the sentiment, and because these large language models can be used for other things, we'll also see whether we can extract the main entity being discussed in the text at the same time. Next, we define our function to compute the LLM sentiment. It takes a chat client as input, which might be for Llama 2 or for something else, but for any of these models, it takes a string as input, the text we want to analyze, and again returns a tuple: string, float, string, string. These correspond to what we have in our classification: the sentiment, the score, the justification, and the main entity. First, the function defines a PydanticOutputParser based on the SentimentClassification class we just defined. Then we create a prompt using a prompt template. Our template says: describe the sentiment of a piece of financial news; we inject the formatting instructions, and we inject the news. The template has only one input variable, the news text that we'll receive, and we also have partial variables: we fill in the format instructions using the parser. Because the PydanticOutputParser knows what format it's expecting, it can actually generate the format instructions that will be injected into our prompt and sent to the large language model. The main workflow in LangChain is to build chains that chain operations, so we build a chain that first takes the prompt, uses an LLM that we're receiving as an input, and then applies the parser to it. Once we've got this chain and our text, we call our run-chain function, which is defined above; we'll get to it in a minute. If everything is good, we just return the sentiment, score, justification, and main entity. If there was an error, it will most likely be a RetryError, which means we've retried a few times but still can't get an answer.
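The prompt construction can be sketched with plain string formatting; LangChain's PromptTemplate with partial variables does essentially this, with the parser supplying the format instructions (the instruction text below is a made-up placeholder, not the parser's real output):

```python
# Stand-in for the parser-generated format instructions.
format_instructions = (
    "Reply with JSON containing the keys: "
    "sentiment, score, justification, main_entity."
)

TEMPLATE = (
    "Describe the sentiment of this piece of financial news.\n"
    "{format_instructions}\n"
    "News: {news}\n"
)

def build_prompt(news: str) -> str:
    """Fill the template: format_instructions is 'partial', news is the input variable."""
    return TEMPLATE.format(format_instructions=format_instructions, news=news)

prompt = build_prompt("Oil producers cut costs as prices fall.")
print(prompt)
```

In LangChain, this template would then be chained as prompt, then LLM, then parser.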
In that case, we just print the error and return an error label. So, what does this run-chain method look like? It's just one line: chain.invoke, passing the text of our news; then the result, which is going to be a Pydantic object, we convert to a dictionary so that we can extract the information from it. Why do we need a specific function for that? Because I wanted to apply the retry decorator to it; that's using tenacity. What this decorator does is: if there's an exception, an error while running that chain, it's going to retry five times before stopping, but after five times, if it still doesn't have a valid answer, it raises a RetryError, which we catch here. This is how we'll compute our sentiment. I'll first try with Llama 2: I create a ChatOllama model, pass it the model name llama2, and give it a temperature of 0.1; you can play around with that. This is where you would replace ChatOllama with ChatOpenAI if you wanted to use OpenAI. Then I do the same thing as we did with FinBERT: I create four new columns in my DataFrame by applying the LLM sentiment function we just defined to our text; the result of that is a tuple, so I apply the pandas Series constructor to it so that everything gets assigned cleanly. OK, so as we can see, we had three errors, so we have three documents that were not classified. The first one that got classified is positive, with a score of 0.7; the other one is negative, with a score of 0.7 as well, so the two don't really seem to line up. But this article, when I read it, doesn't really have a specific sentiment to me: it says that the oil firms have been focusing on reducing costs, so there's a mix of both in this article.
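The retry behaviour provided by tenacity's decorator can be sketched with a small stdlib decorator of our own: five attempts, then a RetryError, mirroring what is described above (tenacity defines its own RetryError; this one is a stand-in):

```python
import functools

class RetryError(Exception):
    """Raised when all attempts fail (tenacity has its own RetryError)."""

def retry(max_attempts=5):
    """Minimal stand-in for tenacity's @retry decorator."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            last = None
            for _ in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception as exc:
                    last = exc  # remember the failure and try again
            raise RetryError from last
        return wrapper
    return decorator

# Demo: a flaky "chain" that fails twice, then succeeds.
calls = {"n": 0}

@retry(max_attempts=5)
def run_chain(text):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("bad output format")  # e.g. the LLM ignored the instructions
    return {"sentiment": "neutral"}

print(run_chain("some news"))
```

This is why the run-chain step lives in its own function: the decorator wraps the whole chain invocation, so a malformed LLM answer simply triggers another attempt.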
Whether it gets classified as positive or negative, I'm not quite sure. This one is on Boeing; it's really about the confidence crisis that Boeing is going through, so this was really bad news, and it should be classified as negative. It is a bit unfortunate that these documents could not be classified. I've tried this a few times; sometimes they work, sometimes they don't. The model will actually return something every time; it's just that Llama 2 doesn't seem that good at formatting the output according to the instructions. This is a downside of Llama, whereas OpenAI has fixed this issue in their API with the functions API, where you can really instruct, in the call to their large language model, what the format of the output should be. LangChain supports that when you're using OpenAI, but it doesn't work yet with Ollama, so this is a downside of using it. Now, if I wanted to change the model: here I'll use Dolphin Mixtral, a version of Mixtral that is fine-tuned in part to be uncensored; this is the one I had installed. If I wanted to use that model, it's exactly the same function; I don't have anything to change. All I have to do is define a new model here and specify the name of the model I want to use. This Mixtral model is supposed to be one of the best currently available on Ollama.
Before we start analyzing sentiment, we'll need to get some financial text. For that, I'll use financial news today that I'll get from Yahoo Finance. In order to retrieve that automatically, I'll get that using a document loader from LangChain, which is called NewsURLLoader. In order to be able to use that loader, you'll need to have installed the newspaper3k package, and that newspaper3k package also requires lxml_html_clean, but it doesn't install it by default, so you'll need to install both for this to work. And obviously, you'll also need to use LangChain and all the other kind of related LangChain packages that we'll need. All the other packages we'll install them as we go. I'll discuss them as we go along. But obviously, we'll also need pandas, so I'll store everything as we go into a pandas dataframe so that we can kind of keep track of our progress. As I said, we'll be using news from Yahoo Finance. This is one of the news pieces that I've included, talks about the oil industry in Canada. First, we'll need to import pandas, then we'll import the NewsURLLoader, then we'll define a list of URLs here that will contain all the news pieces' URLs that I want in my dataset. Here, I only have five for our sample, but obviously, you could do it with a lot more. And then what I'll do is I'll create a NewsURLLoader, and then I'll use that loader to load the data. Then, if we look at the data, it is a list of documents. Documents there, it is a built-in type for LangChain. And what I'll do is I'll just extract that data into a dataframe. I'll store it into a dataframe. I'll only keep the title and the text of that dataset. And now, what we see is that we have basically a dataframe with news titles and then the text content of each news piece. The first method I'm going to present is the dictionary-based method. For the dictionary-based approach, there are multiple ways to do that. 
I'll only show one today, which uses the dictionary of Loughran and McDonald, which is the most widely used in finance research. In order to get the dictionary, you'll have to download it from their website. I do have the link. As always, all the code in this tutorial is available on GitHub. You can look at the video description to get the GitHub link and also a link to the written description and instructions for this tutorial. The Loughran and McDonald Master Dictionary can be downloaded from here. I'm using the one in CSV format here. Be aware of the license if you want to use it for commercial purposes. It's a different license, but here, we're only doing it for research, so that's fine. And then when we have this dictionary, we'll almost be ready to apply it to our text, but we'll have to do some pre-processing. For this method, the pre-processing is going to be fairly simple. For example, first, we'll take our text, change all the words to lowercase, remove punctuation, remove stop words so frequent words, remove numbers, and typically, we'd also be doing either some stemming or lemmatizing, but in the case of the Loughran and McDonald dictionary, there's no need for that because they include all the variations for different words. First, I'll get my list of stop words, so common words that I want to exclude from the text because they are not believed most of the time to have any meaning. For that, I'll just have all these stop words defined in a text file. These are the words that Loughran and McDonald use. There are multiple other lists for stop words. For example, if you use the NLTK library in Python, it does include a list of stop words for many languages. But here, we're just using the same stop words as Loughran and McDonald. They are in a text file here. I'll first start here by reading this text file. These are the first 10 words that we'll be excluding. Then we'll preprocess all our text. So, pre-processing is very minimal in this case. 
We'll take our string, we'll split it, we'll convert all the words to lowercase, then we'll only keep the words that are not stop words. We'll filter those stop words out. And finally, we'll only keep words for which all the characters are letters, so we'll exclude all numbers and words that are a mix of numbers and letters. We'll output a string. We'll join all of these words separated by spaces. We'll apply that to our dataset. And now, we have a new column of clean text, and this clean text is ready to be processed with the dictionary. The next step will be to actually load the dictionary. This is what the CSV file kind of looks like. What we'll do is we'll make a list of positive words. What we'll do is we'll look at our dictionary, we'll keep only the words for which the positive tag is not zero, and we will keep that column of words, convert it to a lower string, and make it a list. And we'll do the same thing for negative words. Here, we look, we have our first 10 positive words, and we'll have similarly the first 10 negative words. Now, when it comes to computing a sentiment score, it really depends on what you want to include or not. So, there are different ways to do it. You can just count the number of words that are positive and count the number of words that are negative, do the difference. That would give you a level. You can also compute that as a ratio, using all the words as the denominator, or only the words for which there is sentiment as the denominator. So, we'll do all of them. Here, what we'll do is we'll first compute the number of words in our clean text. That will be one of our denominators for one of our ratios. In order to do that, what we do is we take each of these strings, which is the clean text, we split it based on spaces. That will give us a list of words, and we'll compute the length of that. That's going to be our total number of words. Then, what we'll do is we'll compute the number of positive words. 
For that, what we'll do is we'll again take our clean text. For each of our clean texts, what we'll do is we will take all the words that are in our text and then only keep them if they are in the list of positive words. That will give us a list, and then we'll compute the length of that list. And we'll do the same thing for negative words. Once we do that, this is what we'll have. We'll have, say, for in this example, we have 261 words. We've got one positive word and seven negative words. Once we have that, we can compute different measures for sentiment. So, we'll do it multiple ways here. First, we'll just do it in levels. We'll just look at the difference between the number of positive and negative words. Then, we'll do a score, a sentiment score, which is the same thing but divided by the total number of words. And finally, we'll do a second score, which is divided by the number of words that have some sentiment attached to them. This could be problematic sometimes if there are no sentiment words in your text. So, be careful if you do that. But these are kind of different ways to do these metrics. And finally, what we can do after that is, if we really want a categorical classification, so just have different classes of labels attached to it. In this case, we'll want to either classify it as positive, negative, or neutral. Then you can use a cutoff value of one of these ratios. Here, we'll use a cutoff value of 0.3 on our second score. We'll see if the score is positive and above 0.3, we will say that this is a positive sentiment text. If it's below minus 0.3, we'll classify it as negative, and otherwise, we'll classify it as neutral. So, this is what we do here. And now, what we get is, well, we get that for this first text, for example, the level sentiment is, these are the scores, so it would be classified as negative. Actually, they would all be classified as negative, except for this article, which is positive. 
It talks about the restaurants that were along the eclipse path, saw like record business because people wanted to see the eclipse and eat also during that day. This is not the only dictionary-based method. There are many variations that exist to these methods. There's also data-driven methods where the dictionary is built using data, and there are different transformations that exist also that can be done on words directly. This is just one simple way, but it is very commonly used in finance research. The second approach that I want to use for sentiment analysis is the FinBERT model, which is a derivative of the BERT model that is fine-tuned for sentiment analysis on financial text. You can see all the details for this FinBERT model on Hugging Face, and we'll use the Hugging Face Transformer library to download and use that BERT model. This model is based on BERT. The BERT model was introduced in this paper. The link is in the description. If you go on Hugging Face, you'll see there's a big community and a wide array of models that are based on BERT. There are even fine-tuning versions of FinBERT, but in this case, today, we'll use the basic FinBERT. The way FinBERT and most BERT-based models work for doing sentiment analysis is that they first take the text input, convert it to tokens, and then the model will create vector embeddings. A vector representation, just an array of numbers, so a vector representation of that text. And then in the final layer of the model, these weights, this vector, is going to be used as input for three different models that will generate probability associated with each class. Here, we'll have a negative, positive, and neutral class, and we'll classify the one as being the answer to our classification process as the class that has the highest probability. There are different approaches to pre-processing when using BERT-based models. Some will include masking some words, so I say removing dates, company names, things like that. 
In this case, we'll do absolutely nothing: I'll just feed the raw text to the model. For this example, we'll use the Transformers library from Hugging Face, and we'll also need PyTorch and SciPy to run our models. From the Transformers library, we'll import AutoTokenizer and AutoModelForSequenceClassification, which let us load the FinBERT tokenizer and the FinBERT classification model. Loading these models may take some time the first time; after that, they are cached locally, so it's really fast. Then, I'll define a FinBERT sentiment function that takes a text as input and returns a tuple containing three floats, the probabilities assigned to the positive, negative, and neutral labels, plus a string, the label with the maximum probability. The way the function works is, first, we wrap everything in a torch.no_grad block. This tells PyTorch that we won't be doing any gradient computation, because we're only doing inference, not training; that makes things go faster. Then, we pass our text to the tokenizer, adding some padding and truncating if required. We cap the maximum length at 512 tokens, so we're only looking at the beginning of the article. Different models will have different maximum lengths, but here we'll keep it at 512. Once our text has been tokenized, we can run it through the model to get the output for each class. From these outputs we'll get the logits, which are basically the raw scores associated with each class.
We'll combine them into a dictionary mapping each possible label, so positive, negative, and neutral, to its score. For that, we'll use SciPy to convert the logits with a softmax function into probabilities. Once we've computed our scores, we'll return them as a tuple: the scores for positive, negative, and neutral, plus the label associated with the maximum score. Once we've defined that function, we can apply it to our text, and it will return a tuple. To assign the result to new columns in our dataframe, we take the result of applying our FinBERT sentiment function, the tuple, and apply the pandas Series constructor to it. That takes care of turning these four values into a format that can be assigned to four new columns. Once we've got these four new columns, I'll also compute a score, which is the probability assigned to the positive class minus the probability assigned to the negative one. Now, if we look at our title text and what we get with FinBERT, this is where we land. For the first piece of news, it is actually the neutral class that has the highest probability, so our FinBERT score will be 0.3, but it gets classified as neutral. The next three are classified as negative, and finally this news here is classified as positive. FinBERT is only one model, though it is the BERT-based model most commonly seen in the finance literature. There are other models for sentiment analysis on financial text, including versions fine-tuned on different datasets. In our case, the FinBERT model was actually trained on financial news, so that's a good fit.
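Putting the FinBERT steps above together, the function might look like the sketch below. This is my reconstruction of what the video describes, not its exact code: the checkpoint name `ProsusAI/finbert` is an assumption (it is the standard FinBERT checkpoint on Hugging Face), and the label names come from the model's own `id2label` config.

```python
import torch
from scipy.special import softmax
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "ProsusAI/finbert"  # assumed checkpoint; the FinBERT model on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

def finbert_sentiment(text: str) -> tuple[float, float, float, str]:
    """Return (positive, negative, neutral) probabilities and the top label."""
    with torch.no_grad():  # inference only, no gradients needed
        inputs = tokenizer(text, return_tensors="pt",
                           padding=True, truncation=True, max_length=512)
        logits = model(**inputs).logits
    # softmax turns the raw logits into probabilities for each class
    probs = softmax(logits.numpy().squeeze())
    scores = {label: float(p)
              for label, p in zip(model.config.id2label.values(), probs)}
    return (scores["positive"], scores["negative"], scores["neutral"],
            max(scores, key=scores.get))
```

The dataframe step then follows the pattern described above: `df[["pos", "neg", "neu", "label"]] = df["text"].apply(finbert_sentiment).apply(pd.Series)`, after which the score is simply the `pos` column minus the `neg` column.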
But if we were looking at other types of financial documents, some fine-tuning might be a good option. That's a topic for another video. Finally, the last approach I want to talk about today is the use of large language models. Obviously, everyone has seen GPT-3.5, GPT-4, and LLaMA 2; all of these models seem to be pretty good at analyzing and understanding text. Here, we'll see how to use them to extract sentiment. For that, I'll use LangChain, and I actually won't use any OpenAI models; I want to keep everything open source, so we'll use Ollama-based models. I've talked about Ollama in my previous videos. If you don't have it installed yet, go to ollama.com and click download, or if you're on Mac, brew install ollama. After installing Ollama, you'll need to pull one or more models, depending on what you want to try. In this video, I'll be using LLaMA 2 and Mixtral. Once Ollama is installed, you install LLaMA 2 with ollama pull llama2, and the same goes for Mixtral. We'll need a few more libraries for this part to work. We'll need LangChain, which we already used to read the data. If you want to use OpenAI instead, you can install langchain-openai and openai; the process is exactly the same unless you want some of OpenAI's fancier features, but in my case I'll only use the Ollama models. First, we'll build our query and make sure we can actually get the result in the format we want. For that, we'll use LangChain's Pydantic support: we'll import BaseModel and Field from Pydantic, use a PydanticOutputParser, and also use a PromptTemplate to create our prompt. If you're not familiar with Pydantic, it is a library that lets you define data structures, a bit like dataclasses but with many more features, including data validation.
So, it is perfect for our use case here, and LangChain has built-in support for Pydantic. Then, we'll import our chat model; here, I'm using ChatOllama. As you'll see, large language models are not good at following all the instructions, and sometimes we have to resubmit our request, at least in my experience with the LLaMA 2 models, but with Mixtral as well. For that, I'll use the Tenacity library, which has a retry decorator we can apply to our function so that it retries until there are no errors, up to a certain limit. The first step is to define what we want as output; this is where the Pydantic model comes in. We'll define a new class called SentimentClassification, derived from the Pydantic BaseModel, with a few fields. The first one is sentiment: a string that is the sentiment of the text, with three allowed values, positive, negative, and neutral. This is our label. Then, just for kicks, we'll see whether we can get a numerical score from our large language model: the score of the sentiment, a float between -1 and 1. I'll also ask the model to justify the decision, so a justification of the sentiment. And because these large language models can be used for other things, we'll see whether we can, at the same time, extract the main entity being discussed in the text. Next, we'll define our function to compute the LLM sentiment. It takes a chat client as input, which might be for LLaMA 2 or for something else. For any of these models, it takes a string as input, the text we want to analyze, and again returns a tuple: string, float, string, string. These correspond to what we have in our classification: the sentiment, the score, the justification, and the main entity.
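A Pydantic model along the lines described above might look like this. The field names and descriptions are my approximation of the video's model; the descriptions matter, because the output parser turns them into the format instructions sent to the LLM.

```python
from typing import Literal
from pydantic import BaseModel, Field

class SentimentClassification(BaseModel):
    # Literal restricts the label to the three allowed values.
    sentiment: Literal["positive", "negative", "neutral"] = Field(
        description="The sentiment of the text")
    # ge/le enforce the -1 to 1 range on the numerical score.
    score: float = Field(
        ge=-1, le=1, description="Sentiment score, a float between -1 and 1")
    justification: str = Field(
        description="Justification for the sentiment")
    main_entity: str = Field(
        description="The main entity discussed in the text")
```

Because Pydantic validates on construction, a response with a label outside the three allowed values raises a ValidationError, which is exactly what lets the retry logic below detect a badly formatted answer.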
First, the function defines a PydanticOutputParser based on the SentimentClassification class we just defined. Then, we create a prompt using a PromptTemplate. Our template is "Describe the sentiment of this text of financial news," and we inject both the formatting instructions and the news text into it. The template has only one input variable, the news, which holds the text we receive. It also has partial variables: we fill in the format instructions using the parser. Because the PydanticOutputParser knows what format it expects, it can generate the format instructions that get injected into our prompt and sent to the large language model. The main workflow in LangChain is to build chains of operations. We'll build a chain that first takes the prompt, feeds it to the LLM we receive as input, and then applies the parser to the result. Once we've got this chain and our text, we call our run_chain function, which we'll define in a minute. If everything is good, we just return the sentiment, score, justification, and main entity. If there was an error, most likely a RetryError, meaning we've retried a few times but still can't get an answer, we just print the error and return an "error" label. What does this run_chain function look like? Well, it's just one line: chain.invoke, passing the text as our news. The result, which is a Pydantic object, gets converted to a dictionary so that we can extract the information from it. Why do we need a specific function for that? Because I wanted to apply the Tenacity retry decorator to it. What this decorator does is that if there's an exception while running the chain, it retries up to five times before stopping.
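Assembled from the description above, the function might look like the sketch below. This is a reconstruction under assumptions, not the video's exact code: LangChain import paths vary by version, the prompt wording is approximate, and the output model is redefined inline so the sketch is self-contained.

```python
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate
from pydantic import BaseModel, Field
from tenacity import retry, stop_after_attempt, RetryError

class SentimentClassification(BaseModel):
    # Same output model as defined earlier (field names are my approximation).
    sentiment: str = Field(description="One of: positive, negative, neutral")
    score: float = Field(description="Sentiment score between -1 and 1")
    justification: str = Field(description="Justification for the sentiment")
    main_entity: str = Field(description="Main entity discussed in the text")

@retry(stop=stop_after_attempt(5))
def run_chain(chain, text: str) -> dict:
    # Tenacity retries this up to 5 times on any exception (e.g. a parse
    # failure), then raises RetryError.
    return chain.invoke({"news": text}).dict()  # .model_dump() on pydantic v2

def llm_sentiment(llm, text: str):
    parser = PydanticOutputParser(pydantic_object=SentimentClassification)
    prompt = PromptTemplate(
        template=("Describe the sentiment of this text of financial news.\n"
                  "{format_instructions}\n{news}"),
        input_variables=["news"],
        partial_variables={"format_instructions":
                           parser.get_format_instructions()},
    )
    chain = prompt | llm | parser  # prompt -> model -> parser
    try:
        r = run_chain(chain, text)
        return r["sentiment"], r["score"], r["justification"], r["main_entity"]
    except RetryError as e:
        print(e)
        return "error", None, "error", "error"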
But after five times, if it doesn't have a valid answer, it raises a RetryError, which we catch here. This is how we'll compute our sentiment. I'll first try with LLaMA 2: I'll create a ChatOllama model, pass it the model name llama2, and give it a temperature of 0.1; you can play around with that. This is where you would replace ChatOllama with ChatOpenAI if you wanted to use OpenAI. Then I'll do the same thing as we did with FinBERT: I'll create four new columns in my dataframe, apply our LLM sentiment function to the text, and then, since the result is a tuple, apply the pandas Series constructor to it so that everything gets assigned cleanly. Okay, so as we can see, we had three errors: three documents were not classified. The first one that did get classified is positive, with a score of 0.7; the other is negative, with a score of -0.7. The two classifications don't really seem to line up with the articles, though. This first article, when I read it, doesn't really have a specific sentiment to me: it says the oil industry is growing but also focused on reducing costs, so there's a mix of both in this case. Whether it gets classified as positive or negative, I'm not quite sure. This one is on Boeing; it's really about the confidence crisis Boeing is going through, so this was really bad news and should be classified as negative. It's a bit unfortunate that the others could not be classified. I've tried this a few times; sometimes they work, sometimes they don't. The model actually returns something every time; it's just that LLaMA 2 isn't that good at formatting the output according to the instructions.
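The tuple-to-columns pattern described here (the same one used earlier with FinBERT) can be sketched with a stand-in function. The dataframe and the stub below are hypothetical, standing in for the real ChatOllama call, to show only the pandas mechanics:

```python
import pandas as pd

df = pd.DataFrame({"text": ["Record profits lift shares",
                            "Crisis deepens at the firm"]})

def fake_llm_sentiment(text: str) -> tuple[str, float, str, str]:
    # Hypothetical stand-in for llm_sentiment(llm, text); same 4-tuple shape.
    label = "positive" if "profits" in text.lower() else "negative"
    return (label, 1.0 if label == "positive" else -1.0, "stub", "n/a")

# apply() yields a Series of tuples; a second apply(pd.Series) expands each
# tuple into columns, which can then be assigned to four new columns at once.
df[["sentiment", "score", "justification", "main_entity"]] = (
    df["text"].apply(fake_llm_sentiment).apply(pd.Series)
)
```

Swapping the stub for `lambda t: llm_sentiment(llm, t)` gives the dataframe step used in the video for both LLaMA 2 and Mixtral.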
This is a downside of Ollama, whereas OpenAI has addressed this issue in their API with the functions API, where you can specify, in the call to the large language model, what the format of the output should be. LangChain supports that when you're using OpenAI, but it doesn't work yet with Ollama. Now, if I want to change the model: here, I'll use Dolphin Mixtral, a version of Mixtral that is fine-tuned, in part, to be uncensored; it's the one I had installed. To use that model, it's exactly the same function, with nothing to change. All I have to do is define a new model and specify the name of the model I want to use. This Mixtral model is supposed to be the best one currently available on Ollama. Okay, so this one did a bit better: only one news item could not be classified. The first one it actually classified as neutral, which is how I would have classified it. This one is negative; this one is positive. In this case, the numerical scores actually lined up with the labels it provided. And if we look at the main entity, it seems to be doing a so-so job. Yes, this one is about Boeing; but this one, is it about the eclipse or about the restaurants? Not sure. But it did produce a result. One thing that's important to note, though, is the time it took. You can of course use Groq or OpenAI's online APIs to run these models, but these models will be costly to run compared to FinBERT, which runs on my machine very quickly. Now, let's look at the justifications. We'll look at the first text and see what LLaMA 2 and Mixtral provided as justification. This was the beginning of the article; it was about an industry conference for the oil industry.
LLaMA 2 says the article focuses on the oil and gas industry in Canada, so it seems to have at least a reasonable justification: the article is forward-looking, it says the industry is still growing, so the overall tone is positive. That's its justification for classifying the article as positive. Mixtral says the text discusses the focus on spending discipline and predictability, so the sentiment is neutral: it doesn't express strong positive or negative emotions. So, this is what we get. Obviously, with large language models like this, the prompt matters a lot, and there's a lot of variation you can try to get the result you want. You really have to be as precise as possible about what you want in the output. Maybe it would have been better to ask for fewer things; the score request might be throwing it off. I'm not sure what's causing the formatting issues, but you need some trial and error to get exactly what you want. Now, if we compare the overall results: for the first article, which I judge neutral, one model classified it as negative, one as positive, and the other two as neutral. This one, which is clearly positive, was classified by every model as positive, except LLaMA 2, which after five tries still couldn't give an answer. This one, which is clearly negative, was classified as negative by all the approaches. So, is it worth going for fancy large language models like Mixtral, LLaMA 2, or OpenAI's, compared to much simpler approaches such as FinBERT or even dictionary methods? You'll have to look at your specific use case. Sometimes the approaches that are orders of magnitude faster and cheaper to run can be really good in terms of output. It really depends on what you want to achieve. That's it, I hope you enjoyed it.
Let me know what you try in the comments, and consider subscribing if you want to get notified of my future videos.
Info
Channel: Vincent Codes Finance
Views: 715
Id: FRDKeNEeNAQ
Length: 30min 59sec (1859 seconds)
Published: Fri Apr 12 2024