Build overpowered AI apps with the OP stack (OpenAI + Pinecone)

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
some fine tuning open ai's fine tuning process um and about a month and a half ago we very quickly and drastically pivoted towards embeddings and generative search and all the things that I'm about to show you because I just think it's a much more interesting exciting area fine-tuning has its own benefits as well but I'm just completely obsessed with the embeddings and generative search space so that's what we're doubling down on now what is mahi asked AI in a nutshell it's your own chat GPT and it uses your own help docs your own files your own website web pages content any information really that you can think of and throw at it you add in the content you create your own chat gbt and then you can launch it anywhere from your website slack whatever you can think of we really want to make it as easy as possible for any person or organization to harness some of this new technology and apply it to their own business to their own situation so first up let's just talk about how it all works and I'm just going to share my screen Amanda I can't see oh there it is my dad [Music] foreign [Music] I think we lost Alex but I'm sure he'll be joining us again shortly uh in the meantime we have a question for James Briggs in the chat James where do you get your shirts and someone that wouldn't have wouldn't be a good demo without that happening of course of course we were just filling the time with some silliness so James you can answer that in the Q a and Alex why don't you take it back sorry it's a upstage the synonyms James right so we're going to explain how myosk AI works can you guys see that okay Amanda is that come through that is yeah cool okay so this is our website um we're going to show kind of the end result first and then work backwards from there now this is actually an Ask AI that's been embedded within our own site and we're just going to ask it because why answer a question when AI can do it for you right that's kind of the topic of today uh how does my ask AI work and what it's going to do is it's going to look at all the help docs that we've trained this on and it's going to look at those and it's going to try and come back with the best answer possible even if this answer isn't specifically defined and certainly this paragraph you see here isn't actually written just like that so my ask AI works by breaking down your content into chunks mapping and displace blah blah blah we will look into that in a little bit more detail and actually think about what's happening there a better explanation and maybe a shorter one is if we think about okay how do we explain this like I'm five years old this is a really nice little definition I don't think I could write it better myself ask AI is like a really smart robot that can answer your questions you'll read lots of books and information and then when you ask it something it tries to find the best answer from all the things it's read kind of like a teacher but you're actually speaking to a robot couldn't have put it better myself so what's actually happening here so when you're adding content to our tool we break that content up into small chunks we then embed each of these chunks with open ai's embeddings API which basically turns text into numbers in a very simplistic sense these then get stored in a database a vector database which is pinecode which is our host today now when you ask a question like how does ask AI work we kind of follow a very similar process we start by taking that question we embed it we turn that text into numbers and then we query pine cone we query our database and pine cone is very smart and will find chunks of text that are very similar to our question it'll return a couple of these chunks we'll pass it through a custom chat GPT prompt and we'll say answer this question with some of these chunks information and then it'll write this nice bespoke answer for us now what's really interesting is as a user of our service you don't see any of that stuff that I just explained and that's important because it's technical it's complex a lot of people are going up there and building their own which is amazing and it's you know it's not the most uh technically complex thing to go and do yourself but obviously this is not going to be the domain of a lot of businesses a lot of people so we kind of handle all of that side for you you drop in your docs and you can just ask your questions you've also spent a lot of time engineering the prompts in our entire system around ensuring that we only answer questions that are contained within your context within the information that you've uploaded so we don't get this hallucination which is this topic in AI right now which is chat qbt is kind of getting a little famous for which is you know it lies it confidently lies and makes stuff up it sort of bends and warps the truth a little bit and it can be really hard to know when it is and isn't telling the truth sometimes so we're trying to solve that by allowing you to put in your own information there may be private and chat GPT doesn't even know about because it's not on the public web or it's really recent and new and obviously chat GPT is a cut off in terms of all of the kind of world information that it's that it's consumed so that's a really kind of important difference between us and chat GPT now let's actually jump in and start building so it's going to move this part here so we're going to build the James Briggs rki the brig spot and we're going to start by adding in a transcript from one of James's amazing YouTube videos about semantic search so this is semantic search we're actually going to put in some of these nice references that'll come in handy later and we're going to go ahead and build that and this is going to look at that it's going to break it up into chunks it's going to embed it all happens fairly quickly and then actually it's ready to use straight away which is pretty cool it's a pretty topical question would be how is Pinecone being used in this process it's actually created those questions automatically from your content just to get going a little bit and then hopefully James would agree this is a fairly accurate answer but Pinecone is a vector database and it's being used to index the data and a bunch of other details and obviously we could explain this like I'm 5 again if we really wanted to break this down or translate it or give a longer explanation and there's a bunch of kind of follow-on questions that we could ask there well what happens if we ask it something that it might not know about so what are spots dents and beddings and we'll talk about what this is in a second but this isn't something that's contained within that transcript and importantly here it's not going to try and make up an answer and in this case make the James briggsball Look Silly by saying something that is incorrect right so what we're going to do to fix that is we're going to add in a little more content and we're going to add in a transcript from another one of James's YouTube videos which actually covers sparse dense embeddings I won't bother with the information this time we'll put that in there and again it's going to look at that big chunk of text it's going to break it up it's going to go through the embedding process and then we're going to be able to ask that same question and it's going to think about it and it's going to come back and it's going to tell us what the answer is so you can see there then the adding of that new transcript has actually made a difference to what it can and can't answer it so it shows really clearly about the kind of demanded information that this thing can have gone on to questions on and then a really nice thing which a lot of this Tech supports out the box is we support 95 languages out of the box both in terms of the content you upload and the questions you ask so if there's any Spanish speakers in the audience and this is basically us asking the same question and you can kind of get the gist there that this is a similar answer it will be in Spanish this time so that's a kind of simple example now we're going to rumble over to a slightly more complex example and this is a pine cone developer assistant that I made earlier uploaded a bunch of their web pages that it can crawl and get the text from PDFs docs text files csvs you can really throw a whole bunch of different information at this then and what we're going to do now is we're going to think about how can we customize this a lot more because the use case is going to be Pinecone as a business want to put this somewhere that their developers can access to answer the questions and kind of act as a as a lead into some of the developer documentation so we'll give it a bit of a description so it has some context um which is about developer questions so people know what to ask it we're actually going to edit the suggested questions because we want to have a nice broad range and not necessarily ones that are automatically generated so three kind of topics that we'll have in here some nice kind of starter questions for developers to get going so those will set in we're going to actually change the answer style so it comes in bullets so things are like nicely broken down and make a lot of sense and you don't get like big chunks of text we're going to change the play sort of text for the search input just so again we can kind of reinforce what this thing can and can't do when it doesn't know an answer we're going to change that and we're going to push out to the pine cone email support just so people can do that at this stage we'll just quickly save some of that and just test some of that stuff out who's James Briggs unfortunately not mentioned in the documents that I've uploaded so there it's going okay why don't you email our support then we're going to look at the references and this is a really interesting thing that we've pulled in so when we ask a question like what is hybrid search when we're searching through all of the documentation it's going to change one thing so you can see this first um show the references we actually show these chunks of text and the actual reference links to the documentation that we pulled back but in Pine cone's case they say well you know we don't want to show the chunks of text we'd rather just sorry for jumping around with Roger shown the links to the most relevant developer Docs and then they can toggle that kind of compact references as we call them and it puts in the most relevant documentation you can see there the links for the semantic and for the hybrid search are the relevant support docs for this question and for this answer that's a really nice kind of Link in to some of their developer docs there then we're going to go even further and we're actually going to customize how this thing looks because pinecon's a brand they're a company they want this thing to look how they're going to look we provide each ask AI with a kind of public or private page that they can use and we give you the URL that you can just share with anyone you want and then you can customize a bunch of things like the logo adding this nice CTA with a link back to the dev docs the colors the fonts a bunch of other things you can start to see this looks a little bit like some of the pine cone components on their website as well another nice real feature is also the analytics so picture yourself as a pine cone as a business they want to see what's the uptake of this new tool that we're using to get some basic analytics but also really importantly they're going to download the questions and the answers that are coming from the developers they can start to see where the documentation is strong and where it may need a bit more information and where there is actually just some stuff that isn't included whatsoever the next thing we're going to look at now is a final kind of two things is where can we place this that is outside of a kind of mild AI site so we're actually going to jump over to card which is a website builder just for this example you could use webflow you could use Squarespace you could obviously use your own website you could use a bunch of different things and what we're going to do is we're going to get our embed code from down here and we're going to paste that into this custom HTML component which is kind of familiar on most website builders we're just going to publish this site and then imagine this was actually pine cones developer site it could place their own widget anywhere they want on the site and they should interact with exactly the same way as we looked at it before another option going even further is to use the API that we put available which gives developers kind of full customization over the inputs and the outputs but we also just released a Zap app recently which means that you can now harness your ask Ai and connect it in the thousands of different apps so here we have a nice example of it being integrated into slack where we're actually launching a slack app very soon so you wouldn't even have to use the API to do this but it's a really nice example of where you can bring ask AI your ask AI into the place where your team are doing the work team slack you know whatever else you're using you can push that in there and you can see then here that if we ask a question it's going to use this the API within the zapier connection and it's going to come back with an answer to the kind of questions that we were looking at earlier so in this case what is ask Ai and then it'll come back with a nice answer there all within slack or within your kind of workspace so that covers off everything that I wanted to share uh excited to hear any questions and interested to hear what you guys think thanks very much thank you so much Alex and questions we do have so let's jump right on into the Q a if there are additional questions you would like to ask Alex again please add them to the Q a portion and not the chat uh and we're going to get to as many as possible so number one from James Taylor what techniques and specific metrics are you using to monitor the back end and quantify the effectiveness of components like the embeddings and the subsequent prompt chain responses okay good question nice and specific to start um so I guess we can break this into two parts we can look at the different parameters that you can set when you're embedding and you can think about how do you test uh for the quality of changing those different parameters talk about that in a second and then you can think about the actual completion The Prompt the how the answer is being constructed from the information that we get from Pine Cone and obviously there's a bunch of settings that you can put within that I think it's fair to say that we and probably a lot of other businesses are in a very early stage of kind of testing at scale all these different parameters like everyone is just experimenting right now on what is the best solution and I hope James would agree with this statement that there really isn't a right answer for like how big should your chunks be when you embed them or how exactly should you write your prompt when it comes to doing like um extractive q a or something like this so there's a lot of trial and error you've got to think about you know what is the common use case in in for your particular platform uh you know this probably isn't going to be amazing for um I don't know like research papers and trying to summarize those right that's not what we're trying to do as a business we're trying to give you an answer that is buried within that research paper not tell you everything about that research paper in like five bullets that's just a different kind of product and the way that you embed and write your prompt will be very very different to that we have done a lot of experimentation for embeddings on like the chunk sizes and the way that we run that process and the same for how we write the prompt which is probably where we spend the majority of our time kind of experimenting and iterating and changing a little bit of time being spent on changing some of the core parameters of the open AI chat GPT apis in terms of things like temperature and and top pay and frequency penalties and these kind of things to make sure you're not getting repetitive answers and this kind of stuff I'd love to say we had a really elaborate system that was like test this prompt versus this prompt and you can I know you kind of know which is better but it feels like whack-a-mole sometimes right because you you develop what feels like a really good prompt and it worked for 9 out of 10 use cases really well and then you come up with a 10th use case from a user and it doesn't work for them or you've broken language support or like something else is happening because the way that it follows prompts is is not incredible sometimes obviously that's got a lot better with gpt4 so yeah just a lot of experimentation James and no kind of no silver bullets unfortunately hope that helps yeah so you just spoke a little bit about chunking but Chris in particular wanted to know how do you turn so how many sentences is in each chunk do you use sentence overlaps so if you could speak to that a bit Yeah so I guess all or most AI models in a kind of simplistic sense deal simplistic sense deal and tokens in in terms of the limits um tokens and characters have a kind of rough mapping in the open AI world it's about four characters to a token depending on the language you're using and the character set so Relic uh Mandarin whatever that varies quite a bit so it's quite difficult to set your limits based on Words characters is a bit safer but tokens is probably the safest in terms of saying we only want our chunks to be this big and it's best to Define that in terms of tokens we started out in characters and I think it's a good simplistic case but it can break occasionally for like Hebrew right that that recently like to throw up some some through a spanner in the works a couple of weeks ago when we were like starting things out um so that's one way to think about how you break things up there in terms of the content length or how you actually break it up there's a bunch of really great libraries like Lang chain uh which I know yass is using with like I think Tech splitting is the name of the kind of sub package you can use there that actually you can just tell out how big you want the chunks to be and how much overlap and this kind of stuff we actually built a lot of that ourselves just so we can have like maps from Maximum configurability over how we do some of that stuff but yeah Lang chain if you're just starting out is a great place to like think about how big your chunk's going to be how you're going to split them up and and where you're going to pass them onto next so that's that's a good start starting point yeah super helpful we actually just did a event with Harrison chase the uh founder of Lang chain so you can check out and learn a little bit more about that um via the event recording which you can find on our YouTube or social um so this mitt wants to know something specific to Pinecone usage so James if you have any additional color feel free to jump in but how do you separate customer data within pine cone database do you create a separate index in Pinecone for each customer um I think well I think we we see different people do this in different ways um so I mean you can use namespace that you can use different indexes it kind of depends on or you can even use both it it depends on which approach you want to go for but I actually I actually think Alex probably has them a more hand done like Insight system to what I did yeah happy to answer that one so James mentioned namespaces and indexes probably the two biggest ways you can separate data indexes you know probably more of a I don't know what the technical word is but you know the walls between your information there is much higher than namespace where that is basically a an identifier against all your information within the same index all that data sitting together in the same database and not kind of partitioned in a kind of slightly more secure way namespaces are still a very secure way of separating information and as long as you're handling namespaces and how you connect to pine cone in a secure way there's no way that users can see what's in other namespaces the reason that we take that approach I'd be interested here if you ask that there's anything differently I imagine he does because otherwise this Pinecone bill would be super expensive would be you'd have to pay for each index for each customer and that's going to cost you a minimum of like I know 60 70 whatever it is so setting up an index for each customer on a small scale for people with only a couple of docs would be like incredibly cost inefficient if you had an Enterprise client and you had loads of different Enterprise clients then you'd probably want to give them each their own Pinecone index so their data is like super secure and partitioned and obviously if they're spending thousands of dollars a month with you then then it doesn't really matter they're paying for their own index so yeah we use namespaces it's cost effective it's secure enough for all for our purposes and there's no way that people can see other people's data anyway so um it's a pretty neat solution great thank you when it comes to integrating an organization's knowledge base example Guru or Confluence what are some ways to deal with different permissions and prevent the Q a bot from answering questions with data a user should not uh should not supposed to have access to shouldn't have access to so right now the way that we would handle that would be that you would build an Ask Ai and upload content that users using that ask AI would be kind of eligible to see we don't have a concept of kind of different permissions within a single ask AI so that certain users can see a certain tranche of the content and other users can see another challenge of the content it's an interesting use case I can't say we come across it too regular requests from users because I think they would just create like an internal ask AI with certain pieces of information then a public ask AI with another kind of information and maybe there's some overlap as well so that's kind of how we would handle that at the moment great how do you use uh Auto generate the questions after uploading the content yeah so I guess maybe this we start to delve into a little bit of secret sauce with some elements of it but um you know if you put a piece of information into chat GPT and ask it to do something with that information like write questions uh then it's going to be pretty proficient at doing that um you just got to figure out a way of getting the center of the entire document into that so that it can write questions which are not about just the first like three paragraphs basically so you can probably read between the lines there and figure out a little bit what we're doing there but yeah we're using chat GPT and some of that kind of superpower to generate questions off kind of large part of that document if the system is capturing or deleting new data that requires to keep syncing with Pinecone so my question is how do you create update or delete embeddings at scale with pine cones so this this part your experience James also feel free to to happen so I'll be in quickly I mean uh adding content is like we spoke about you know that can be done at scale you can upload through the UI we're also about to launch and upload API as well and we offer a couple of bulk functions for some of our customers and clients as well behind the scenes for like thousands of docs and stuff um and that seems to work pretty well we don't see any issues with that deleting same as adding like you can just delete at scale that's obviously a much faster operation of dating uh you can update the information within an individual um vector or like you know chunk of chunk of text and and uh and numbers uh the the most efficient way probably to do that at scale would be to just delete it and then to add the new information on top of that as opposed to trying to find like individual vectors and updating them one by one or maybe there's maybe there's a more efficient way of doing that but at least the efficient way that we see at the moment is we'll just delete the information associated with the old web page and update it with new information the web page and that's all like kind of totally seamless to the end users they don't need to know how that works but like they can very easily update content and remove it yeah perfect we'll ask one more question and I know there are still a few remaining uh if you want to follow up with me directly I'll get your questions to Alex so you can reach me at Amanda pinefone.io I will add that to the chat but we'll end on this note so how are summary questions handled what if there is not a specific summary section uh present in the document but I still want to a summary of the whole document yeah so this is probably a bit off topic for like our tool which is about kind of finding answers from within a document but it was a really interesting piece on opening hours site which you can find if you do the right Googling which talks about this kind of chain summarization technique where you take as big a chunks of of a book as you can and you Summarize each of those and then you put the summaries together and then you keep kind of summarizing down as a tree and they ended up summarizing Alice in Wonderland into like a single paragraph like from the entire book so it's a kind of change method that you have to do that maybe there's something on Langton that can do that but yeah dig into open ai's docs and they talk about how you how you can summarize a kind of gigantic piece of information into something really short great well thank you for your expertise your knowledge and sharing with us uh Round of Applause again I always say this is the thing I miss the most about in-person events when we do these Round of Applause thank you yeah yeah yeah me um so thank you so much again if you have additional questions for Alex or we weren't able to answer your questions uh you can reach out to me personally I will add my email to the chat thank you Alex now we're going to turn it over to our friend Yasser Yasser how's it going great I'm happy to be here we are happy to have you I'm gonna turn things over to you why don't you start by uh telling us a little bit about yourself sure so my name is Yasser I'm a fourth year computer science student based in Toronto so I should be graduating this may um I did like a few Tech internships but I've been always been doing such projects um and then chatbase is my latest project that turned into a real company the way I started working on like AI projects and vector embeddings and stuff is I found um Lang chain super early on and then I started like playing with it playing with just writing Python scripts um and then like I saw the potential as soon as I started playing with it I saw um I talked to a few like companies and like few friends who I think would need something like this and I saw a need for basically creating um your own charged GPT or a chart GPT that at least knows your data and can also like be a normal charge GPT that you can ask a question other than your data but the main reason is that you can inject your own data into chat gbt and that ask it questions um so I'm using the same technology as Alex is using so same thing embeddings um so I take a chunk of text sorry I take a big piece of text I chunk it embed it um put it in Pinecone and what what embeddings does is that it helps you do semantic search so semantic search means that instead of doing keyword search with which is like the old way of searching you now do semantic search and the difference is you don't have to use the same word or or even like you don't have to do similar words you can just if the meaning of the whole chunk is similar to your question then semantically both of them are going to be close to each other and then when you ask a question I'm gonna go to Pine Cone and find like the chunks that can have the answer to that question by doing similarity search and fortunately Pinecone makes something like that super super easy um so I'll share my screen I'll show you what chatbase is and how it works can you guys see this yep so this is this is the end product um so basically let's Let me refresh so what people come like use chatbase for is to get a chat bot they can embed on their website like this um that can answer questions about their website their content if you have a company it can answer questions about your company if you have like an open source project they can answer questions about your documentation if you're just an individual who has a Blog um you can train it on all the content you have and then um chat with the content so this is the chatbot that I created using chatbase for chatbase um and then you have your like suggested questions so let's ask like what chatbase is actually let's go here because I can't see here there's another demo here let's ask what chat base is so it answers chatbase is an AI chatbot Builder which it it it it came up with this answer by by looking at the document that I uploaded that explained what chatbase is um and then it came back with the relevant chunks from that document that might have the answer to the question um and then it gave the the chunks plus the question to charge GPT and it says like using this context to answer this question and that's what it does um and then you can ask follow-up questions you can ask um how can I upload data yeah so it tells you how you can upload data and all of this is coming from from a document and the cool thing is it's this is a chat interface so you can say like um explained you can say explain in simpler terms and it knows that this is asking about this message and now it's it's explaining in simple terms it's not using like dot PDF or Dot txt um so I'm going to show you like how how to create a chatbot so you come here you create an account and then you say build a chat bot and then you choose your data source so you can choose a website um you can choose text which is you can just paste any any text here or you can choose a file or multiple files if you choose a website here um you can add your website and then what chatbase does it is it's going to crawl the whole website um and then using all the content on on your website it's going to create the chatbot so if I say this is like another project that I had sap um and then I can go here and say I want to exclude the terms uh this page included the terms of service and I want to exclude the privacy policy um so this is going to crawl the whole website and it's going to exclude these pages and also if you wanna if you all if you only want to crawl like let's say the blog section of this website you can just say this and it says here this will crawl all the links starting with um sa pal slash blog so if you have blog slash one or block slash two it's gonna crawl all of these things too but for now I'm just going to crawl the whole website excluding like these links and then now I have a chat bot that knows everything about essay about which is another product um I can ask it like what is essay about foreign assistants help you complete sentences or paragraphs um I can also create it from a file so I have a file here yeah I have this so this is a document explaining chat base it has 8 9 000 characters almost and then I can create a chatbot using this and then this chat bot is is how this is exactly like how I created the chatbot here so I can ask you the same question I can ask it what is chat base and it answers me and the cool thing is that you can go to the settings to chatbot settings and then you can edit a bunch of stuff to customize your chatbot the coolest one and the one I think people use the most is this which is the base prompt or for people who are more technical it's the system message um so this is super helpful especially for companies because this allows you to give your chat bot a name a personality um you can give it instructions on how to behave you can just say I want you to answer all the questions in French even if you're asked in English you can just give it instructions to behave in any way you want so it's it's super cool to see like some of my clients they um they showed me how they're editing this based from so some of them will say your name is um like company name your name is X your personality is let's say you're making a chatbot for Disneyland so your name is Disney AI your personality is fun um you make jokes you're silly and um you use a lot of emojis so basically by editing this you're creating a uh a Persona or a character for your company um that doesn't have to be a company but you're creating a character that has a name a personality and it also knows everything about your uh your data and can answer any question about it um and then here you can choose the model you can choose 3.5 or gpt4 you can set the visibility and then the domains you want to embed your chatbot on and then here you can edit the chat interface so you can remove this and let's say this is this is for chat based right so I'm going to say hi um job base AI and then ask me so here you can edit the initial messages you can also add suggested minute messages which are the ones that show up here you can say um what is the pricing how does it work and you can add as many as you want you can also edit the theme so if you have a like a dark website a dark theme you can edit this theme to be like this and you can also edit the uh you can also edit the color of this message so you can you can come here you can like make it whatever you want let's make it this green so now it's this green and you can also edit this the color of this same way um and then a cool thing is um you can if you go here you go to the dashboard and then you see all the conversations that happened with your chat bot this is empty because I didn't have conversations with it but if I go yeah I actually want to talk about this too okay I'm going to talk about this first so if you go to feature chat Bots these are some of the chat Bots that um I I added and it's like public information that people can uh can just like test uh chat base with different books or different authors so here I have program which is the founder of Y combinator um I added his website to chatbase and what chatbase did it crawled the whole website and then it created a chat bot um named program Ai and then you can ask it questions like this how to find co-founders and these answers are generated using the um the content on programs website and all these essays and here it shows you all the sources that chatbase used to come up with this answer by using um Vector embeddings so these sources are stored in point corn basically um and yeah what I was trying to show you guys is the dashboard so if I come here I own this chatbot so I can see what people [Music] um what the conversations they had with it and if you if you own a chat bot on chat based you you're gonna have your own dashboard and then you see the conversations so this is one of the conversations that people have with um with ball Graham I'll judge this um another thing I wanted to mention is the API so if you come here you can chatbase offers an API to do basically everything you can create a chat bot updated chatbot deleted chat bot and then chat with it so basically you can build your own chat base um because I expose all of these um all of these apis to you so anyone can make their own website and it can have the same function functionality as chatbase but they can use chatbase on the back end um and here like I explained how to use the how to use the API it's so similar to uh to the open AI API if people are familiar with it it's the same it's you use your secret key and then messages are formatted the same way as the open AI messages um yeah I think this is it for a chatbase do you guys have any questions do they ever jump right into the Q a but thank you so much for uh demoing chat base for us now Chris has a question same question he asked to Alex earlier about chunking uh how do you chunk how many sentences is in each chunk do you use sentence overlaps yeah that's a good question and this is something that I don't think there is an objective answer to I think it's trial and error as Alex said um you can use libraries like playing chain to make this easier and if you use like the default um like settings or leg chain I think it gives you a good enough result um for like the general use case but if you have like your own use case and you know your data you know let's say you have paragraphs and each paragraph is small enough that you can chunk it and embed it and each paragraph is speaking about like a specific topic they don't overlap then it would make more sense to chunk each paragraph on its own instead of just adding everything all together for chat base because this is intended to uh to be like customer facing and I don't know like what data they're gonna put I'm using and yeah I'm using sentence overlap I think I'm doing 2 000 characters but what I try to do is that I go to the end of the sentence or the end of the paragraph and the beginning of the paragraph um just to make sure I don't lose any meaning if I cut the sentence too short so I go like 3000 characters and then I go to the end of the sentence or paragraph and to the beginning too just to make sure I don't lose any meaning and I also do a little bit of overlap for the same reason because I don't want to lose any meaning when I'm doing shortcut makes sense how is chatbase able to crawl a lot of questions coming in so I'm losing my questions how is chaplace able to crawl the whole website does it read the whole HTML doc a technical explanation would be awesome thank you sounds great yeah so what it does is that when you when you come here and you give it a URL um basically on the back end chatbase open that HTML page and then it scans the whole page for other links within that website so if you have a link to let's say Instagram it's going to ignore that but if you have a link to chatbase.com blog it's going to go to that and then do the same thing so it's something called recursive crawling of a website and you just have to make sure that you don't don't crawl the same page twice because you can have a page pointed to another page and then the other page pointing the to the first page so just keep track of uh the pages Pages you already crawled and then you can you can search for this it's uh just search for um JavaScript or python crawler recursive crawler and you're gonna find it great similarity search always returns results how are you able to tell when you should not answer a question because you don't have data about it or about this yeah this is a good question so right now the best solution or what chatbase is doing is I'm embedding each question and then I'm going to Pinecone getting the results and then giving all of that to chat GPT but then ice in the base prompt I I specifically say to chat GPT not to use any other data from let's say the question is about like what's the speed of light right if you ask what's the speed of light to chair to the program bot so it's going to come up with sources but those sources are not going to have the answer to the question right but chat GPT is smart enough to just ignore all the sources and then um tell you like I don't know the answer to the question because in the base prompt here you say if you don't know the answer just say you don't know the answer thank you so we have some uh chat based specific questions about document limits so what's the document limit that I can upload uh into a chat bot in chat base So currently if you would go to the pricing page it says here um for the hobby plan you get 10 chat Bots and then 2 million characters for each other for yeah virtual boat I'm trying to increase this limit um especially for like the standard and unlimited plans um because this limit is currently like a technical limit but I think I can I can I can increase it if you want to know like what your documents how many characters it has so if you go here build chat bot file and then attach my one or two files or on the basic plan here so it's going to tell you like 52 000 characters um the problem is when you're trying to upload a website it doesn't tell you and I'm trying to find a solution for that also uh if anybody has a solution feel free to add it to the chat I like to I like to give our community the opportunity to contribute uh we have a question I want to create an application which should answer based on the data feed by the user can I just give a directory path having multiple kinds of file like PDF doc text CSV Excel Etc as an input to to the chat back if yes how will you parse all these different kinds of files and generate a single knowledge base yeah that's a good question and then this is something I'm currently doing with chatbase so chatbase there is an option to upload multiple documents it can be PDF text docs whatever and what I'm doing for chatbase is that open AI suggests that you ignore all formatting so if you have like like new lines or titles in the in the document the best way to do this in my experience is to ignore all the formatting just keep the punctuation and the spaces between like words so you just like put everything all together and then you just have to trust that semantic search works and the chat GPT is smart enough to find the answer to your question based on the given sources how is chatbase using chat gbt3 for generating answers you mentioned it but I missed it I know the embeddings I know the embeddings they're stored in Pinecone and I thought you are using completion endpoint yeah I am using completion endpoint so the flow is once someone asks a question ask a question I embed that question go to Pinecone get the relevant messages from the document and then I make a request to chat GPT saying given this these sources and this question what what is the answer basically so yeah I use the completion endpoint we have a few questions about uh chat GPT who hallucinations how do you deal with making sure the chat bot does not hallucinate so this the solution for this if you want to keep using chat GPT instead of chat gpt4 is a lot of experimentation with prompt engineering as I said on chatbase um you have the option to to change the base prompt so you can you can change it experiment and then ask it random questions and see how it behaves in my experience 3.5 is always going to be not as good as four 3.5 doesn't listen to the system prompt as good as for if you want the best experience and if you want to have like a product facing in production chat bot you should change this to gpt4 which is going to be more expensive but it's much better at staying in character so if you give it a name personality it's much better to stay in character and it's also much better to um just provide answers from the given sources and not hallucinate I think gpt4 is is much much better at not hallucinating especially if you give it good context uh we have somebody who says thanks yeah sir how did you arrive at the pricing what are the considerations made for minimizing costs with index Creations kind of a business oriented inquiry foreign yeah um one of one of the biggest problems I'm facing right now is that I'm having I have a free tier on chatbase so a lot of people um just use chat based ones they upload one document and then I put it on Pinecone and then I have to have like an index running for all of the free tier customers so what I'm planning to do and I think this would make the pricing uh like the my margins better or maybe I can uh lower the prices too is that I'm gonna make a uh I'm gonna say like if you don't use a chat bot for maybe like 30 days or a month or like 60 days then uh your chatbot will be automatically deleted and I think I think if you're trying to build something similar or like if you're doing consulting for a um like bigger Enterprise customers um the way to lower the to lower the pricing especially on Pinecone is to delete whatever you're not using yeah so I think we have time for one more question um how do you decide if it's better to fine tune or to use embeddings so in my experience embeddings work much better if if you want your chatbot to know your data I think fine-tuning is good if you want your data to be outputted like in a certain format but um in my experience embeddings are good enough and are the way to go if you want to inject data to check GPT and make it answer questions about this data which chatbase is doing okay well thank you so much um really appreciate your time showing us all things chat based very cool product um anything else you want to say in your final in our final closing as we hit time no I just want to thank you for having me this was super fun of course and again audience members if you still have questions you can reach out to me and I will serve as your inquiry liaison diazor and Alex you can reach me at Amanda pinecone.io and yes this video will be available uh and I'll send that out shortly so again another silent Round of Applause for our incredible speakers and we hope to see you at our next event thank you so much everyone thank you
Info
Channel: Pinecone
Views: 8,407
Rating: undefined out of 5
Keywords:
Id: RL-ZnpE9hwM
Channel Id: undefined
Length: 49min 57sec (2997 seconds)
Published: Mon Apr 03 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.