Getting Started with LangChain and Llama 2 in 15 Minutes | Beginner's Guide to LangChain

Video Statistics and Information

Captions
hey everyone, my name is Venelin, and in this video we're going to have a look at how to get started with LangChain using Llama 2. If you want to follow along, there is a completely free tutorial available on MLExpert.io where you can find the complete code along with text explanations and, of course, a link to the complete Google Colab notebook that we are going to use. Why should you use LangChain? The most important use case is to combine large language models such as ChatGPT or Llama 2 with some external source of data. Common examples include PDFs, CSV files, external databases, and external tools such as Google web searches, API requests, calculators, and even code interpreters. So this is the main use case for LangChain, and here is a summary of it. The library is available in Python, but there is also a version available in TypeScript, and I would say that the TypeScript support is very well maintained and pretty much at feature parity with the Python library. So what are the building blocks of the LangChain library? The most important parts are the chains, and we have foundational blocks, which include things such as the model wrappers: for example, if you want to use ChatGPT, or maybe Llama 2 or Falcon models, you can use those right within the LangChain library. You have prompt templates, which allow you to add parameters to your prompts, so you don't have to just pass raw strings to the large language models that you're going to use. You also have indexes and vector stores; those are useful when you want to embed your external data such as PDF files, text files, etc. And then you have memory components, which allow you to remember the conversation between the user and the chatbot, so you can keep session data which you can pass into
the large language model. So the chains are essentially the main building block within the LangChain library, and they allow you to combine the foundational building blocks. For example, you might have a retrieval chain, which uses the indexes and the vector stores in order to get some data from a PDF file and then passes this data, or text, to a large language model, all done right within the chain. Of course, there are other ways to use chains, and chains can actually be combined with other chains; I'm going to show you an example of that in a minute. Chains are very powerful: you can create your own and combine them in very interesting ways. The final building block is the agents. Agents are essentially more powerful chains that can take actions based on tools that you provide them. Some of the more commonly used tools are, for example, online search, which essentially gives your large language model the superpower to search online for information; a code interpreter, which allows your large language model to run code against, for example, a Python interpreter; and external APIs, which you can call to get some data. So you can imagine that you can build very powerful applications using those tools. How do you install LangChain? Well, it is as simple as pip install langchain, and I'm going to show you how you can work with the library. I have a Google Colab notebook with a T4 GPU, and the link to this Colab will be included in the text tutorial that will be available on MLExpert.io. Here are the libraries we're going to install: the most important part is the LangChain library; we are also installing the Transformers library along with some helper libraries in order to get the Llama 2 model to run on this Google Colab notebook, and the
final one is the unstructured library, which will allow us to read an external markdown document later on. So first, here is how you can use Llama 2; this is a version that uses the GPTQ quantization library in order to run within a LangChain pipeline. Essentially, you're initializing the model, in this case the Llama 2 13B GPTQ model from TheBloke, and you're creating this pipeline, which is again from the Transformers library; I'm passing in the model and the tokenizer along with the generation config, and you can see the config right here. Finally, the most important part is to wrap the pipeline in a LangChain HuggingFacePipeline instance; this makes it compatible with LangChain, and this is essentially the large language model wrapper that we are going to use. In order to run this, you just have to pass a string to the LLM, just as you would with any model, and here is the result; you can read that on your own. The next important part is the prompt templates. As you can see, you can pass in some variables right here; for example, I have a template that is specific to Llama 2, and I'm passing in the system message right here and then the text. So this is the parameterized version of this prompt, and here I'm specifying that the text variable is going to be an input variable for this prompt template. As you can see, I'm passing in the text "Explain what are deep neural networks in two to three sentences", and when I format the prompt right here, you see that this goes ahead and replaces the text placeholder with the text that we are passing in. So this is how the prompt template works, and of course you can pass this to your large language model and it should work just fine. So next, here is how you can create some chains. The simplest way to create a chain is to create an LLMChain; this will essentially wrap your large language
model, and you can pass in a prompt. Again, you are calling just chain.run and then passing in the text; this will fill in the prompt right here and run the chain with it, and you are essentially getting the same output as before. So the LLMChain is very simple. Here is how you can combine multiple chains: I want to create a chain that gets a summary of some text and then gives three examples of practical applications. So this will take the input, run it through the first chain, and then use this examples chain right here. In order to combine the chains, I'm going to use this SimpleSequentialChain, to which I'm passing the first chain, which we've already created, and then the examples chain, and I'm again calling multi_chain.run with the text. Note that I'm using verbose=True, so this is what we are going to get as output: since we are using the verbose output, you see that we are entering this chain, this is the first response right here, and this is the second one. So when the response is finished and I strip the result, you'll see that I'm getting the three examples of practical applications that we asked for from the second chain, but the input was first given to the first one. So this is how you can combine multiple chains one after another. One very popular application of LangChain is to actually create chatbots, so in this example I'm creating a chat between a teacher and students. In our case, I'm going to pass in this system template, and note that I'm not using the actual Llama 2 prompt format right here, so you might want to format this more properly for your use case. The template says that the chatbot is an experienced high school teacher, and the teacher always gives examples and analogies. So this is the system template, and it is created using SystemMessagePromptTemplate.from_template. As you can see, I'm going to build these
messages: then a human message "Hello teacher", an AI message "Welcome everyone", so essentially we have the system message, then a human question, then a response from the AI, which of course you can manipulate as you can see, and then this is the actual template that we're going to use. In our case, I'm going to pass in a subject that is going to be artificial intelligence, so this will replace the subject right here, and then this will be the text, or the question, that we're going to ask. For the text of this template we're going to pass in this one, and this is the formatted final example; as you can see, all the variables have been replaced in our template. In order to run this through our large language model, I'm going to call a special method called predict_messages, which understands the messages right here; as you can see, those are not just strings, they are actually message objects. In order to get the result, I'm going to call result.content, and as you can see, this is the response that is given to us. Next, I'm going to get the original Bitcoin paper; this is taken from the bitcoin.md markdown that is available in a Satoshi paper repository, and this is, I believe, the original paper converted into markdown, which is of course "Bitcoin: A Peer-to-Peer Electronic Cash System" by Satoshi Nakamoto. This markdown is loaded using the UnstructuredMarkdownLoader, and you can see here that I'm loading the documents. The next part is to take the text and split it into multiple chunks, since this paper is quite long and the model can't take the complete markdown in one go, so we are going to split it into chunks of 1024 characters, and this is how it will all work. After this is complete, we are getting 29 chunks of text, and as you can see, the embeddings that we're going to use are the GTE Large embeddings. I'm not going to go into details on how and why
I'm choosing those embeddings, but you can go to the text tutorial, where I'll include some more details about them. In this tutorial we're using only open source models, so both the Llama 2 model and the embeddings are open source; nothing from OpenAI or other closed source models within these examples. I'm just embedding the query from the text, and for the first text, for example, we're going to get this vector size. To take all of the texts and embed them using the embeddings, I'm going to use the Chroma vector store; in our case this will be a vector store that persists the document embeddings in a local directory, which we're going to call "db". You can then query this with a similarity search and look for some text, and you can specify how many results you want to get; in our case I'm going to specify two, and this is the first response. This is actually going to be the text that we're going to pass into the large language model. I'm adding a template right here, which we've already seen; I'm using a prompt template with placeholders for the context and the question. The context is actually this text right here, and the question that I'm going to ask is "How does proof-of-work solve the majority decision making problem? Explain it like I'm five." In order to run this, I'm going to use a special type of chain, a RetrievalQA chain, to which I'm passing the model; I'm going to use the "stuff" chain type, add the database as a retriever, and get only the first two results. I also want to return the source documents, and then for the chain type arguments I'm going to pass in just the prompt, which is essentially this template right here. So this is the response: you can see that we are getting the query, the result, and then some source documents, and if you look through the result, you
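To illustrate what the similarity search inside a vector store like Chroma is doing conceptually, here is a tiny plain-Python sketch of embedding-based retrieval; the three-dimensional "embeddings" here are made up for illustration and have nothing to do with real GTE Large vectors:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy document chunks with made-up 3-dimensional "embeddings";
# a real store holds high-dimensional vectors from an embedding model.
store = [
    ("Proof-of-work is one CPU, one vote.",         [0.9, 0.1, 0.0]),
    ("Transactions are hashed into a Merkle tree.", [0.1, 0.9, 0.2]),
    ("Nodes accept the longest chain as valid.",    [0.7, 0.3, 0.1]),
]

def similarity_search(query_vector, k=2):
    """Return the k chunks whose embeddings are closest to the query."""
    scored = [(cosine_similarity(query_vector, vec), text)
              for text, vec in store]
    scored.sort(reverse=True)
    return [text for _, text in scored[:k]]

# A query about proof-of-work would embed near the first chunk
results = similarity_search([1.0, 0.0, 0.0], k=2)
print(results)
```

The RetrievalQA chain then "stuffs" the top-k retrieved chunks into the context slot of the prompt template before calling the model.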
will see that the model is actually using the correct response format and it is explaining like I'm five; I urge you to go through this explanation, it is actually very good from Llama 2. Then I'm asking another question, very similar to this one, and you can see that it takes about seven to eight seconds to answer the first one and then about 16 seconds on this GPU to respond to the second one, and this is the response to the second question. Finally, we're going to have a look at the agents. In our case, I'm going to create a Python agent, which is created by calling the function create_python_agent from the agent toolkits. Here I'm passing in the large language model, again Llama 2 in our example, and then for the tools I'm going to pass in a PythonREPLTool; this will be the interpreter that the agent is going to use, and I'm going to show you how it is used with verbose=True, so we can look through the way this agent is actually running. Here is the question that I'm passing in: calculate the square root of a number and divide it by 2. So we are using the PythonREPLTool, and we're given a warning that it can execute arbitrary code, so you should be careful with that, of course. Just for you, I actually took this code and executed it, and you can see that the actual result is 2, so the agent's final answer is actually wrong; but still, the thinking and the method it used to get there were quite good. So this is it for this video. Thanks for watching, guys; please like, share and subscribe. Also leave a comment down below if you want to see other videos related to LangChain or Llama 2, and please join the Discord channel that I'm going to link down in the description. I'll see you in the next one, bye!
Info
Channel: Venelin Valkov
Views: 21,946
Keywords: Machine Learning, Artificial Intelligence, Data Science, Deep Learning
Id: 7VAGe32YptI
Length: 15min 47sec (947 seconds)
Published: Sat Sep 23 2023