Using Langchain with Ollama and Python

Captions
Can you talk to your own documents to get answers out of them? That's the goal of many tools out there, and the concept behind it is called embeddings: feed your notes, your collected PDFs, or anything else you have to your LLM and ask it questions to generate new insights. Ollama is already the easiest way to work with LLMs on your laptop, but can we support embeddings? In this video I'll show you how a Python developer can use embeddings on Ollama using LangChain.

Let's say you have a document that you want to ask a question about. I'm going to use this Wikipedia article about the horrible fires on Maui just a few days ago; the URL is on the screen. So how do you get started? Since we're using LangChain in this video, start with pip install langchain. Now create a Python file and start building. We first need to import Ollama, with from langchain.llms import Ollama, and then initiate our connection to a model on Ollama: ollama = Ollama(base_url="http://localhost:11434", model="llama2"). We can specify any model here that is either on the Ollama Hub or that you've created locally. To test that it works, print ollama("Why is the sky blue?"). If we run that, we will see why the sky is blue. Perfect.

Now we can work on loading the document. When I did this the first time I needed the Python module bs4, so run pip install bs4. Then add the import for the loader, from langchain.document_loaders import WebBaseLoader, load the document with loader = WebBaseLoader(url), and then data = loader.load().

When we ask a question, chances are the whole document isn't needed to answer it, just the parts that talk about the concept I'm interested in. But how do we limit what gets sent to the model? Databases are great at searching for a subset of content and spitting out the results, so let's use a simple vector database called Chroma. A vector database doesn't really want the full document, though; it wants the document chunked up into smaller pieces, so we need to split it up first. LangChain has something for this called a text splitter, so let's add it: from langchain.text_splitter import RecursiveCharacterTextSplitter. Now we can say how we want the text split up: text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0). Every chunk is going to be 500 characters, and there's no overlap between the chunks. Then all_splits = text_splitter.split_documents(data), feeding it our data.

Now we can add those chunks to the database. But vector stores don't store regular words; they store vectors, and the embeddings function is what converts the words to vectors. As of this recording we don't have an embeddings function that LangChain can use, so we'll use the GPT4All embeddings. First install the Python modules with pip install gpt4all chromadb. Then in the file add from langchain.embeddings import GPT4AllEmbeddings, and import the database with from langchain.vectorstores import Chroma. Now instantiate the data store: vectorstore = Chroma.from_documents(documents=all_splits, embedding=GPT4AllEmbeddings()).

In LangChain the central concept is the chain, which connects a bunch of tasks together to complete a larger task. We can use the RetrievalQA chain by adding from langchain.chains import RetrievalQA to our file, and then qachain = RetrievalQA.from_chain_type(ollama, retriever=vectorstore.as_retriever()). Now we can ask our question of our document.
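Pulling those steps together, here is a minimal sketch of the whole script. It assumes the LangChain 0.0.x import paths from when this video was published (newer releases move these modules into langchain_community), and it uses the URL of the Wikipedia article on the 2023 Hawaii wildfires as a stand-in for the one shown on screen:

    # Sketch of the pipeline from the video. Assumes langchain, bs4,
    # gpt4all, and chromadb are installed and Ollama is running locally.
    from langchain.llms import Ollama
    from langchain.document_loaders import WebBaseLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings import GPT4AllEmbeddings
    from langchain.vectorstores import Chroma
    from langchain.chains import RetrievalQA

    # Connect to a model served by Ollama on its default port.
    ollama = Ollama(base_url="http://localhost:11434", model="llama2")

    # Smoke test: confirm the model responds at all.
    print(ollama("Why is the sky blue?"))

    # Load the article; this URL is an assumption, standing in for the
    # one shown on screen in the video.
    loader = WebBaseLoader("https://en.wikipedia.org/wiki/2023_Hawaii_wildfires")
    data = loader.load()

    # Split the document into 500-character chunks with no overlap.
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
    all_splits = text_splitter.split_documents(data)

    # Embed each chunk with GPT4All embeddings and store the vectors in Chroma.
    vectorstore = Chroma.from_documents(documents=all_splits,
                                        embedding=GPT4AllEmbeddings())

    # Chain the Ollama model and the vector-store retriever together.
    qachain = RetrievalQA.from_chain_type(ollama,
                                          retriever=vectorstore.as_retriever())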
The question is, "When was Hawaii's request for a major disaster declaration approved?" So call qachain with a query of the question, and that's it; pretty quickly we get an answer (a short sketch of the query call follows at the end of this transcript). Sometimes the result of the query isn't spot on, and I think one of the biggest factors in this is how we split up that source information. I think it's better to have a bit of overlap, so if we set our overlap to 50 we often get better answers, but you may need to play with that.

And that is how you can use Ollama with LangChain and Python to ask a document a question. You could use a different loader to point to a directory instead of one document, and you would probably want to keep the data store up between questions so you don't have to keep importing. I'll share examples of how to do that later, and if you want to see how to do this with JavaScript, keep an eye out for that video. Thanks so much for watching. Goodbye.
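Continuing the sketch above, the query call and the overlap tweak suggested in the transcript would look roughly like this:

    # Ask the document a question through the RetrievalQA chain.
    question = "When was Hawaii's request for a major disaster declaration approved?"
    result = qachain({"query": question})
    print(result["result"])

    # If answers miss the mark, re-split with some overlap between chunks
    # (50 characters is the value suggested above) and rebuild the store.
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    all_splits = text_splitter.split_documents(data)
    vectorstore = Chroma.from_documents(documents=all_splits,
                                        embedding=GPT4AllEmbeddings())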
Info
Channel: Matt Williams
Views: 10,182
Keywords: large language models, machine learning, langchain ai, langchain in python, artificial intelligence, langchain tutorial, prompt engineering, llama llm, llama 2, tutorial, large language models tutorial
Id: CPgp8MhmGVY
Length: 5min 16sec (316 seconds)
Published: Fri Aug 11 2023