ClippyGPT - How I Built Supabase’s OpenAI Doc Search (Embeddings)

Captions
Supabase hired me to build ClippyGPT, their next-generation doc search, where we can ask our old friend Clippy anything we want about Supabase and it will answer. For those who haven't heard, OpenAI released ChatGPT, an insanely capable large language model that aims to change the way we interact with computers using natural language. But I'm less interested in ChatGPT on its own and more interested in how we can use that technology in our own custom applications, and that's what this video is all about. We'll dive into a new field called prompt engineering and best practices for building custom prompts. We'll talk about the number one challenge people face when designing custom prompts, plus a number of other challenges that come up in the real world, such as: How do we feed GPT-3 a custom knowledge base it wasn't trained on? How do we get past the token limits? How do we stop GPT-3 from making up wrong answers, or hallucinating?

By the way, who is Supabase? Supabase is an open source Firebase alternative. You can use Supabase as the backend provider for your platform, the same way you could use Firebase. The difference, apart from being open source (and the number one thing I love about Supabase), is that the entire platform is built on Postgres, one of the leading production-grade SQL databases. And why did Supabase want ClippyGPT? Like any good developer-focused platform, providing high-quality documentation is key, and if you have a big site with a lot of documentation, how do you make that content easy to discover? Up until now, Supabase used a third-party tool called Algolia as their search service, but that only returns links back to the documentation, what I'd call traditional search. Now that we have the power of large language models like GPT-3, let's improve this experience by returning the answer to the question right then and there.

Okay, we're going to get right in here. I've zoomed in my VSCode terminal quite a bit so those on smaller screens can hopefully see what I'm doing (hopefully I don't regret this!). The way I'm going to go about this is to walk you through the different pieces of code I've already built and show you what it took to build this thing. My goal is to present it in a way that lets you take these exact same ideas over to your own application and do the same thing. Right now we're looking at the Supabase monorepo. If you want to follow along yourself, Supabase is open source, as we've talked about, so you can head over to the Supabase GitHub organization; it's the main supabase repository there. This monorepo contains a couple of different projects, one of which is the documentation website, which is what we're going to focus on today. Fun fact: the documentation website is built entirely on Next.js, which, in my opinion, has made it a really great experience to work with.

I'm going to split the process I took into three main steps. Step one: we pre-process the knowledge base. In Supabase's case this is their documentation, which, as we'll see in a second, is all MDX files.
Step two: we store that content in a database and generate a special thing called embeddings on it; I'll explain why we're doing that and what the purpose is in a second. And step three: we inject this content as context into our GPT-3 prompt.

So how did I know to go about it this way? Let's start with the number one challenge people face when they go to customize GPT-3 with their own knowledge base. Here's the problem: GPT-3 is general purpose. It has been trained on millions of pieces of text so that it can understand human language. Sure, it might be able to answer specific questions based on the information it was trained on (for example, "Who is the CEO of Google?"), but as soon as you need it to produce specific results based on your product, results will be unpredictable and often just wrong. GPT-3 is notorious for confidently making up answers that are plain wrong.

One solution you might think of: "Can I just take my knowledge base and insert it into my prompt every single time, before I ask the question? That should give GPT-3 all the context it needs, right?" Yes, it would, but you're probably not going to fit your entire knowledge base into a single prompt. For one, that would be pretty expensive, because you are charged per token, where a token is approximately four characters of English text. More fundamentally, it won't fit: OpenAI's latest language model today, text-davinci-003, has a limit of 4,000 tokens, and you need to fit everything within that limit. You also need to treat each request as a brand new context, because it is; GPT-3 has no memory between requests. And if you're thinking, "Well, ChatGPT does, doesn't it? If I ask it multiple questions, it remembers what I said before," that's just a trick: it's basically re-sending the entire message history as context every time you ask a new question.

There are a couple of different approaches to address this, but today we're going to focus on a technique called context injection. Context injection breaks the problem down into two steps. Step one: based on the user's query, you search your knowledge base (whether that's a database or whatever) for the most relevant pieces of information relating to that query. If, for example, the user asks "How do I use React with Supabase?", we first search our documentation for only the most relevant pieces that talk about React. Step two: we inject the top most relevant pieces into the actual prompt as context, followed by the user's query itself.
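(Before we get into the real implementation, here's a minimal sketch of that two-step shape in TypeScript. The helper functions are hypothetical stand-ins for the pieces we'll build through the rest of the video.)

```typescript
// Hypothetical helpers: a similarity search over the knowledge base,
// and a call to a completion endpoint. Both get built for real later on.
declare function searchKnowledgeBase(query: string): Promise<{ content: string }[]>;
declare function completePrompt(prompt: string): Promise<string>;

async function answerQuery(query: string): Promise<string> {
  // Step 1: find the knowledge-base sections most relevant to the query.
  const sections = await searchKnowledgeBase(query);

  // Step 2: inject those sections as context, followed by the query itself.
  const prompt = `Answer the question using only the context below.

Context:
${sections.map((s) => s.content).join('\n---\n')}

Question: ${query}

Answer:`;

  return completePrompt(prompt);
}
```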
So this approach, as opposed to something like fine-tuning (where you would actually retrain the model itself on your own custom data), does two things for us. Number one, it primes the prompt with the very specific information you want it to use in the answer. Number two, it always uses up-to-date information: on every single request, every single query from the user, you fetch that information from a database, which can be updated in real time. Fine-tuning would require you to retrain the model every time a new piece of information is added to your knowledge base.

Okay, so back to this. Let's take a look at the first step, pre-processing, and how we do it with Supabase. Open up the side view; under the docs project we'll head down to scripts and check out the generate-embeddings script. Before I can walk you through the script, I first need to explain how all the Supabase documentation is built and stored within this project. First of all, the documentation is in fact all stored within git here; it's not in some external knowledge base or database. It's all within this project, stored as markdown files, or specifically MDX files. If you haven't heard of MDX, MDX files are basically markdown with JSX built in, which is pretty cool. You get everything you'd expect from a regular markdown file: links, whatever, and even actual HTML tags (that part isn't MDX, actually; I believe that's GitHub-flavored markdown, which lets you inject some HTML). The part that makes it MDX is something like this admonition component, and, let's see, down here you can see we can also do exports, as well as imports at the top; these are things you'd expect from JavaScript or JSX files specifically, and we're basically merging markdown with that. Many of you have surely seen this all over: pretty much every open source project I've seen with documentation uses either markdown or MDX, and MDX gives you the benefit of nice custom-looking components within your documentation without having to jump ship from markdown entirely and do the whole thing in pure JSX. Pretty awesome.
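(For a rough idea of what one of these pages can look like, here's a hypothetical MDX file; the component name and export are illustrative, not copied from the Supabase docs.)

```mdx
export const description = 'Connect your React app to Supabase'

# Quickstart: React

Regular **markdown** works as usual, including [links](https://supabase.com/docs).

<Admonition type="note">
  JSX components like this admonition are what make the file MDX rather
  than plain markdown.
</Admonition>
```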
If you take a look here, pretty much everything Supabase calls their guides is broken down into these MDX files, and I can show you quickly how they map. I'm on the database MDX file, and if I go over to their documentation at supabase.com/docs, they have a section on database, /guides/database, that maps one-to-one to this MDX page. As you can see, "Every Supabase project comes with a full Postgres database": we get the exact same text here. So this is the markdown converted to HTML; there's an entire Next.js pipeline that does this for us, which is great. That's how this works under the hood. Our job now, as we said, is to pre-process this: we need to take all of this content and store it in our database. I know I haven't explained exactly why we need to do that yet; I explained the whole context-injection piece, but not yet why it's necessary to put the content in a database as an intermediary step. Be patient, we're going to get there.

So, back to generate-embeddings. Essentially, this is a script we run during CI (continuous integration) as a step: any time changes are made to the documentation, it runs this script, which walks through all the different markdown files and pre-processes them. All the magic happens within generate-embeddings. Again, I haven't explained what an embedding is yet; I'll get to that in just a second. I'm not going to go into every tiny detail, but I'll cover the high level in case you want to do something similar; this is all open source, so definitely reference it on your own time.

First things first, we grab the content: literally a string for every single one of these markdown or MDX files. We have a handy function we've created called walk that goes through pages (literally everything here is under pages), walks each of them recursively, and grabs all the different files, giving us a list of all the file names. Then we do a little bit of filtering, such as only including files that are MDX files and ignoring specific files like 404, which we don't really need to process. Then we move on to the magic: we loop through every single markdown file and process it.

So let's talk about why this processing is necessary. We have this magic function, which I can show you, that processes the MDX content for search indexing. It extracts the metadata, because our MDX files actually export some metadata, similar to front matter. It strips out all the JSX, because later on, when I feed this content to our actual GPT-3 prompt, I don't want it to get confused by the JSX; for now, that all gets stripped out. And then it splits the content into subsections. The reason is that when we later inject this as context into the GPT-3 completion model, it's better if we can work with smaller chunk sizes. If we could only pass context in as entire markdown files, we'd be limited to exactly that: the whole markdown file, all or nothing. If we split it into multiple chunks, then when the user has a query about React, maybe this part of this markdown file is most relevant along with a chunk from some other markdown file, and we can combine those together as context. In general, this is the best-practice, recommended approach. For Supabase it's nothing too sophisticated: we break the content down by header, so every time the parser comes across a new markdown header (H1, H2, or H3), it considers that a new section and splits it into its own chunk.

Now you might be wondering how we do things like stripping the JSX. Are we going through this markdown looking for JSX using, I don't know, regexes or something? Thankfully, we're not using regex for that. We're getting a little bit low level and processing the markdown ourselves, which is pretty fun, actually. We're using the exact same tools that probably most markdown tools are using: the tools provided by the unified ecosystem (unifiedjs). These folks have done an amazing job of thinking about every possible component when it comes to parsing and abstract syntax trees: taking markdown files, JavaScript files, MDX files, many different types, and creating a really nice pipeline for processing them. Number one, it takes a markdown file, for example, and breaks it into what they call a syntax tree; from there you can walk every element of that markdown file and do what you want with it. Then there are extensions, which is where the MDX side comes in. So we basically get our syntax tree, filter out the JSX elements, and return the rest.
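(Here's a hedged sketch of what that strip-and-split step can look like with the unified tooling. This is not the actual Supabase script, just a minimal version of the same idea: strip the MDX-specific nodes, then chunk by H1–H3 headings.)

```typescript
import { unified } from 'unified';
import remarkParse from 'remark-parse';
import remarkMdx from 'remark-mdx';
import { filter } from 'unist-util-filter';
import { toMarkdown } from 'mdast-util-to-markdown';
import type { Root, RootContent } from 'mdast';

export function splitMdxByHeading(mdx: string): string[] {
  // Parse the MDX into an mdast syntax tree.
  const tree = unified().use(remarkParse).use(remarkMdx).parse(mdx);

  // Drop the JSX/ESM node types that remark-mdx adds on top of markdown.
  const mdxTypes = [
    'mdxjsEsm',
    'mdxJsxFlowElement',
    'mdxJsxTextElement',
    'mdxFlowExpression',
    'mdxTextExpression',
  ];
  const stripped = filter(tree, (node) => !mdxTypes.includes(node.type)) as Root;

  // Start a new section every time we hit an H1-H3 heading.
  const sections: RootContent[][] = [[]];
  for (const node of stripped.children) {
    if (node.type === 'heading' && node.depth <= 3) sections.push([]);
    sections[sections.length - 1].push(node);
  }

  // Serialize each non-empty chunk back to plain markdown.
  return sections
    .filter((children) => children.length > 0)
    .map((children) => toMarkdown({ type: 'root', children } as Root));
}
```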
Now, worth noting: right now the solution isn't perfect. What about all the information inside the JSX itself? Are we going to lose that? Yes, actually; right now we do nothing to keep it, so that's a real problem: we are losing some information. This is a version-one, broad-strokes approach; in the future we'll definitely need to get more fine-grained and find a way to extract meaningful content from the JSX.

So, again: why did we do all this work? Why do we even need to process the content in the first place, and why are we storing it in a database? I think now is the time to talk about embeddings. Basically, the fact that we're using embeddings to solve this problem is what requires us to use a database, which isn't necessarily a bad thing; this is somewhat cutting edge, actually, when it comes to the ability to do this on Postgres. So let's take a look. I'm going to skip a little here, this is just a bunch of pre-checks, and I'll explain the database in a second, but down here we get right into the embedding portion.

What is an embedding? I told you earlier that we'll be injecting the most relevant pieces of information into the prompt, but how do we actually decide which information is most relevant? Introducing embeddings. An embedding takes a piece of text and spits out a vector, a list of floating-point numbers. How is this useful? Well, this vector of floats stores very meaningful information. Let's pretend I have three phrases: number one, "The cat chases a mouse"; number two, "The kitten hunts rodents"; and number three, "I like ham sandwiches". Your job is to group the phrases that have similar meaning, and to do it I'll ask you to plot them onto a chart, where the most similar phrases are plotted close together and the most dissimilar ones are far apart. When you're done, it might look something like this: phrases one and two, of course, are plotted close to each other, since their meanings are similar. Note that they didn't actually share any vocabulary, just similar meaning, and that's important. We'd expect phrase three to live somewhere far away, since it isn't related at all. And if we had a fourth phrase, "Sally ate Swiss cheese", perhaps it would sit somewhere between phrase three (because cheese can go on sandwiches) and phrase one (because mice like Swiss cheese). The coordinates of these dots on the chart represent the embedding. In this example we only have two dimensions, X and Y, but those two dimensions represent an embedding for these phrases. In reality, we're going to need way more dimensions than two to make this effective.
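(To make that chart intuition concrete, here's a toy worked example. The 2-D "embeddings" are invented numbers, as if read off the chart, and cosine similarity is one of the measures we'll meet properly in a moment.)

```typescript
// Toy 2-D "embeddings" for the four phrases; the numbers are made up
// purely for illustration.
const phrases: Record<string, [number, number]> = {
  'The cat chases a mouse': [0.9, 0.8],
  'The kitten hunts rodents': [0.85, 0.75],
  'I like ham sandwiches': [-0.7, 0.1],
  'Sally ate Swiss cheese': [0.1, 0.3],
};

// Cosine similarity: dot product divided by the product of magnitudes.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const mag = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (mag(a) * mag(b));
}

// Close in meaning -> similarity near 1; unrelated -> much lower.
console.log(cosineSimilarity(phrases['The cat chases a mouse'], phrases['The kitten hunts rodents']));
console.log(cosineSimilarity(phrases['The cat chases a mouse'], phrases['I like ham sandwiches']));
```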
So how would we generate embeddings on our knowledge base? Well, it turns out OpenAI also has an API for exactly that. They have another model, today called text-embedding-ada-002, whose purpose is to generate embedding vectors for pieces of text. Compared to our example, which had two dimensions, these embeddings have 1,536 dimensions.
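(For one chunk of documentation, the call looks roughly like this; a sketch using the v3-style openai Node SDK current at the time.)

```typescript
import { Configuration, OpenAIApi } from 'openai';

const openai = new OpenAIApi(
  new Configuration({ apiKey: process.env.OPENAI_API_KEY })
);

export async function embedSection(content: string): Promise<number[]> {
  const embeddingResponse = await openai.createEmbedding({
    model: 'text-embedding-ada-002',
    // Collapsing newlines was OpenAI's recommendation for this model.
    input: content.replace(/\n/g, ' '),
  });
  // A single 1536-dimension vector of floats.
  return embeddingResponse.data.data[0].embedding;
}
```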
Okay, great. So we take our documentation, generate embeddings for it, and store them in a database. What database are we going to use, by the way? Since we're building this for Supabase, what better platform to use as our database than Supabase itself? If you're following along and don't want to use Supabase, that's okay: the key components here are Postgres and the pgvector extension for Postgres, which is all you really need to get this working. It just so happens that Supabase has this extension, along with many others, built right into their Postgres image, and it's open source, so you can run it locally.

How do we store the embeddings in the database? This is where pgvector comes in. pgvector is a vector toolkit for Postgres: it provides a new data type called, you guessed it, vector, that's perfect for storing embeddings in a column. Not only that, we can also use pgvector to compare multiple vectors with each other to see how similar they are, and this is where the magic happens. If we have a database full of content with embeddings, then when the user sends us a query, all we need to do is, number one, generate an embedding for the query itself (for example, if the user asks "Does Supabase work with React?", we generate an embedding on that exact phrase), and number two, perform a similarity search on our database for the embeddings most similar to that query. Since we've already done the hard work of pre-generating embeddings on the entire knowledge base, looking up this information is super cheap. If you're not familiar with the linear algebra theory, the most common operations used to calculate similarity are cosine similarity, dot product, and Euclidean distance, and pgvector can do all three. In OpenAI's case specifically, the embeddings they generate are normalized, which means cosine similarity and dot product will actually produce identical results.

By the way, I always want to give credit where credit is due: I want to recognize Andrew Kane for building the pgvector extension. It's an amazing extension that I think is more relevant than ever today, and in the bit of interaction I've had, Andrew has always been super responsive and really great at keeping the extension up to date. Also worth noting: there is another Postgres extension called cube that is somewhat similar, but unfortunately it maxes out at 100 dimensions, which won't help us today, and it hasn't seen much maintenance in the last couple of years.

All right, so back here: practically, how are we storing these embeddings in our database? Since we're using Supabase, which is Postgres under the hood, it's literally as simple as using the Supabase client from their supabase-js library; kind of like a query builder, you can use their API to insert your data right then and there. Embeddings specifically, when they come back from OpenAI's API in JavaScript, show up as literally an array of numbers, and it turns out you can pass that array directly into the query-builder client and it will happily store them as proper vectors in the database.

Speaking of the database and its tables, let's take a quick look at that. I have a migration file here, under the supabase folder at the root of the project. If you have a local Supabase project, it will typically create a supabase folder within your project, and that's where you configure the project and create things like migrations, as well as Edge Functions, etc. If we take a look at our migrations, we have two tables, which we're calling pages and page sections. Why two tables? Because of what I was telling you earlier: we split our markdown pages into subsections for better context injection, so each page section is a chunk of a page. But I still want to keep track of the pages themselves, so we can record, for example, the path of the page and some of its metadata, and we link the two together through foreign keys. To actually use pgvector, after installing it on your database you'll need to run this line of code, if you haven't already, which tells Postgres to create the extension (in this case, only if it does not already exist); pgvector shows up as just "vector" within Postgres, so that's the word we use here. We create our page table, nothing too fancy, and then our page section table, also pretty standard apart from the last line. We create a column named embedding (the name is arbitrary, we could have called it whatever we wanted, but embedding matches its purpose best), followed by this brand-new vector data type given to us by the pgvector extension, where the size of the vector is the number of dimensions. Once again, OpenAI returns 1,536 dimensions, so that's the size of our embedding vector. Pretty straightforward.
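(Put together, the migration looks roughly like this. This is a sketch: the exact table and column names are assumed for illustration rather than copied from the repo.)

```sql
-- pgvector registers itself in Postgres under the name "vector".
create extension if not exists vector;

create table "page" (
  id bigserial primary key,
  path text not null unique,   -- used later for linking back to the docs
  checksum text                -- lets CI skip pages that haven't changed
);

create table "page_section" (
  id bigserial primary key,
  page_id bigint not null references "page" (id),
  heading text,
  content text,
  -- One vector per section; 1536 dimensions to match text-embedding-ada-002.
  embedding vector(1536)
);
```

On the insert side, supabase-js will happily take the raw number array for the embedding column, e.g. `supabase.from('page_section').insert({ page_id, heading, content, embedding })` (names as assumed above).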
Back to our code: when you're using the Supabase client, you literally just reference each of the columns in a regular old JavaScript object and pass in the information. Now, fun fact: right now we're only using this within a script on CI, but you can actually use the Supabase client in your browser, on the front end, to access your database; that's one of the key features of Supabase. And if you're like me and thinking, "What the heck, how is that going to be secure? There's no way I want to expose all the tables in my database to be arbitrarily queried from my front end, it just feels wrong": basically, this client is built on PostgREST. For those who haven't heard of it, PostgREST is a REST API layer on top of Postgres that dynamically builds a REST API based on the tables, columns, etc. that you have in your database, which is pretty amazing. And again, if you're like me and worried about security, don't worry, that's all covered here. Like most applications nowadays, it uses JWTs, and you have access to that JWT, and to who the current user is, within Postgres itself. So basically we're moving the authentication and authorization logic from your own custom application API layer directly into Postgres. Understandably, Supabase is a sponsor of PostgREST, as it's pretty core to their product. I'm going to leave it right there; I almost went down a whole other rabbit hole, but caught myself. If you're interested in learning more about Postgres, definitely let me know and maybe we can cover it in another video. By the way, Supabase also has a GraphQL extension, so, similar to the REST layer, you can also query the database using GraphQL; some pretty powerful stuff there.

Okay, so at this point we've fully covered the whole ingestion process: the embedding generation, how we designed our database, and how we store and insert the embeddings. By the way, there's a whole bunch of other code in here that I'm intentionally skipping; it's mostly checks. We keep a checksum on each document so that we're not regenerating embeddings every single time if nothing has changed; it's an optimization to only regenerate embeddings for pages that have changed, and there's a bunch of extra logic around that.

So now let's get into probably the most fun part of all this: the actual prompt itself, and the injection. To do the completion, we have a back-end API route we've created. Since we're using Supabase for this, naturally we used a Supabase Edge Function. I'm going to stop myself before I go down a huge rabbit hole into Edge Functions: basically, it's a serverless function, super common these days and available on many different platforms, and Supabase has their own version. Fun fact: Supabase's Edge Functions use Deno under the hood, which is kind of like a newer alternative to Node.js, built by the same original creator as Node.js after he decided there were lots of improvements he wished he'd made, and that's what Deno is. If you've never used Deno before, the syntax is very similar to Node.js, with a couple of changes, especially around imports and environment variables.

Let's scroll down to the meat of this Edge Function. First thing we do: OpenAI has a moderation endpoint, and as part of their terms and conditions they require you to make sure the content you're sending complies with their guidelines, so no hate speech, things like that. Since we're letting users dynamically put anything they want into this, we need to run it through their moderation API, which is free. If it passes, we come down and create the embedding. Remember, we need to create a one-time embedding on every single request for just the query itself, so we can use it to find the most similar content in our database. I'll show you how that similarity search works next.
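(Sketched with the same v3-style openai client as before; hedged, since the actual Edge Function runs on Deno, where the imports look a little different.)

```typescript
import { Configuration, OpenAIApi } from 'openai';

const openai = new OpenAIApi(
  new Configuration({ apiKey: process.env.OPENAI_API_KEY })
);

export async function moderateAndEmbed(query: string): Promise<number[]> {
  const sanitizedQuery = query.trim();

  // OpenAI's terms require screening user-supplied input; this endpoint is free.
  const moderation = await openai.createModeration({ input: sanitizedQuery });
  if (moderation.data.results[0].flagged) {
    throw new Error('Flagged content');
  }

  // One-time embedding for the query itself, used for the similarity search.
  const embeddingResponse = await openai.createEmbedding({
    model: 'text-embedding-ada-002',
    input: sanitizedQuery.replace(/\n/g, ' '),
  });
  return embeddingResponse.data.data[0].embedding;
}
```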
There are a couple of ways we could have gone about this. Number one would be using a pure Postgres SQL client, which you can do, by the way: when you use Supabase you're not locked into just their libraries, they actually expose the Postgres database externally, so you can connect to it directly from your back end. We could have done that and written some raw SQL, or used something like Knex.js to write our query. But instead, to keep things a little simpler, we're going to continue using Supabase's JavaScript library, which has a special function called rpc that essentially allows you to call a Postgres function. For those who don't know, Postgres itself can have functions, which I can show you right now. We're going to create a function called match_page_sections with a couple of parameters, and the way we've designed this function, it returns all the page sections that are relevant; in this case, only the top 10 page sections that best relate to the user's query.

This function lives in a second migration here, and we call it match_page_sections. This is how you design a function in Postgres. Essentially it's just a simple select query, and we're doing a couple of things. Since we have this relationship between the page section and the actual page, we join those tables so we can return the path; that path will be useful down the road when we want to provide links back to this content. We also return the content itself, the most important thing, so we can inject it into our prompt. And we also return the similarity: how similar this piece of content was to the original query, which we get from this operation right here. As we briefly talked about earlier, pgvector provides a couple of new operators, and this specific one is the inner product operator. It's negative by default: the way Postgres works, it can only sort the result of this operator ascending, so the assumption when pgvector was created was that if it can only sort ascending, the inner product should be negated so that the most relevant results come first. So here, to get the actual similarity, we just multiply by negative one before it's returned to whoever called the function. We also have a handy parameter called match_threshold that you can use to filter the results to only include page sections above a certain similarity threshold; of course we order by the similarity, and then we limit by match_count, which we also pass in as a parameter. Hopefully that's pretty straightforward. Side note: I had to throw in this "variable_conflict use_variable" directive. All this means is that since we're reusing the word "embedding" both as a parameter variable and as a column on page sections, Postgres would by default consider that a conflict. I'm basically saying: if you see "embedding" by itself, assume it's the variable; if I explicitly prefix it with the table, then of course it's the column. So if you were wondering why that's there, that's what it's all about.
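(Reconstructed from the description above, the function looks roughly like this; parameter names and return columns are assumed for the sketch.)

```sql
create or replace function match_page_sections(
  embedding vector(1536),
  match_threshold float,
  match_count int
)
returns table (path text, content text, similarity float)
language plpgsql
as $$
#variable_conflict use_variable
begin
  return query
  select
    page.path,
    page_section.content,
    -- <#> is pgvector's negative inner product (Postgres can only sort an
    -- operator's result ascending), so multiply by -1 to report similarity.
    (page_section.embedding <#> embedding) * -1 as similarity
  from page_section
  join page on page_section.page_id = page.id
  where (page_section.embedding <#> embedding) * -1 > match_threshold
  order by page_section.embedding <#> embedding
  limit match_count;
end;
$$;
```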
So, since we've hidden all that fancy logic inside a Postgres function, back here we can just use the Supabase client to do an RPC. It will look for exactly that Postgres function; we pass in the name along with these parameters, and again we can pass the resulting embedding directly into the function and it will just work. The result will be our page sections.

Now for the fun part: actually taking this content and injecting it into our prompt, and talking about the prompt itself. This first bit I'll breeze over quickly: GPT3Tokenizer comes from another library that basically tokenizes our content. I told you earlier that with GPT-3 everything is token-based, and that in English a token is approximately four characters; not a hard rule of thumb, but a fair generalization. If we can calculate the real number of tokens, though, that's quite helpful, so that's what we do here: we take all the content, parse out the tokens, and calculate the size. The reason is so we can limit the number of tokens in the request: number one, we have to stay within that 4,000-token limit, and number two, it gives us the opportunity to fine-tune how much context we actually want to pass in in the first place.
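(Here's a hedged sketch of those two pieces together: the rpc call and the token-budgeted context assembly. Function and parameter names follow the ones discussed above; the threshold, count, and token budget values are illustrative.)

```typescript
import { createClient } from '@supabase/supabase-js';
import GPT3Tokenizer from 'gpt3-tokenizer';

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!
);

export async function buildContext(queryEmbedding: number[]): Promise<string> {
  // Call the match_page_sections Postgres function through the client.
  const { data: sections, error } = await supabase.rpc('match_page_sections', {
    embedding: queryEmbedding, // the raw number[] works as a vector
    match_threshold: 0.78,     // illustrative values
    match_count: 10,
  });
  if (error) throw error;

  // Accumulate sections until we hit a token budget, staying well under
  // the 4,000-token model limit so there's room left for the answer.
  const tokenizer = new GPT3Tokenizer({ type: 'gpt3' });
  let tokenCount = 0;
  let contextText = '';
  for (const section of sections ?? []) {
    tokenCount += tokenizer.encode(section.content).bpe.length;
    if (tokenCount > 1500) break;
    contextText += `${section.content.trim()}\n---\n`;
  }
  return contextText;
}
```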
Okay, next we have the prompt itself. The first thing I want to mention is that this is literally as simple as it gets: no fancy templating language, just a JavaScript template literal, and we pass in our template variables directly. The indentation looks weird only because we don't want tabs at the beginning. Fun fact: there's a library called common-tags that gives you some really nice template-literal tag functions, some of which can strip indentation, so if we really wanted to, we could have used that to keep this nicely indented and have the indentation stripped; not going down that rabbit hole right now, just a fun fact.

So let's talk about this prompt. What I want to cover with you is a little bit of prompt-engineering best practice, and the reasons I engineered the prompt this way. The best way to visualize this is probably to copy it into another tool called prmpts.ai. Full disclaimer: this is a tool that I'm working on. You can think of prmpts.ai as the JSFiddle or the CodeSandbox of prompt engineering. You can come in here, create your own prompt with placeholder inputs, test it out, save it for later, and share the link with people; the aim is to be a platform where we can all collaborate on our prompts. Lots of features are planned; right now it's just simple freeform text input, but we're going to add different schema types, even the ability to use embeddings themselves, the ability to test a prompt and save those tests, etc. But let's stay focused. I'm going to replace this prompt with the one copied from the clipboard; placeholders here are done using double curly braces. So let me paste that in, and there's our prompt. Let's read through it out loud: "You are a very enthusiastic Supabase representative who loves to help people! Given the following sections from the Supabase documentation, answer the question using only that information, outputted in markdown format. If you are unsure and the answer is not explicitly written in the documentation, say 'Sorry, I don't know how to help with that.'" Then we have a label we're calling "Context sections", followed by a placeholder for the context text. Then we have the question itself: a label for it, followed by a placeholder. And we finish off by saying "Answer as markdown (including related code snippets if available)". Then we have the completion; "completion" just marks where GPT-3 will complete this prompt.
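(Reconstructed as the JavaScript template literal described here, from how it's read out above; the exact wording in the repo may differ slightly.)

```typescript
// contextText and sanitizedQuery come from the earlier steps.
declare const contextText: string;
declare const sanitizedQuery: string;

const prompt = `You are a very enthusiastic Supabase representative who loves to help people! Given the following sections from the Supabase documentation, answer the question using only that information, outputted in markdown format. If you are unsure and the answer is not explicitly written in the documentation, say "Sorry, I don't know how to help with that."

Context sections:
${contextText}

Question: """
${sanitizedQuery}
"""

Answer as markdown (including related code snippets if available):`;
```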
Down here we can actually type in our inputs. If I took real context from the documentation, this is where I'd paste it. Of course, we're not doing this manually: all of our code back here, as we just described, dynamically fetches those pieces from the database using embeddings (that's the whole point) and injects them there, but you can imagine that's what this field is for. And then there's the sanitized query. Right now we sanitize the query just by trimming the whitespace from the ends, very basic, but this is likely to get more sophisticated down the road. So here you can visualize it. Let's just pretend there's a piece of documentation that says... well, let's not pretend, let's check out the real docs. Okay, I'm on the pg_cron docs. pg_cron is an extension for Postgres, built into Supabase, that allows you to run cron jobs, and here's an example code snippet. Let's pretend the user asked a question like "How do I create a cron?", and with our embeddings saved in Postgres, say our algorithm came up with this code snippet. Likely it would come up with a bunch of these, but for now we'll keep it simple: this code snippet and maybe the text on how to, in this case, "Delete a cron"; well, okay, let's do "Run a cron", maybe that's a bit more practical.

Copy that, and it gets pasted in here; we actually keep the markdown formatting. Speaking of markdown, it turns out GPT-3 is really good at both understanding markdown and creating markdown too, and you'll see a little later that this is basically how we're able to produce these really high-quality responses that look nice: we get GPT-3 to output markdown itself, which we can then display using a markdown renderer. So back here we artificially add that in; this would be a SQL code snippet written in markdown, and this is how it would actually get injected, along with some other sections; we'd be able to fit more than just that in there. If we fill in those inputs, we can visualize how the prompt will actually get sent to the completion API. Down here you can adjust which family and model is used; text-davinci-003, as we talked about, is the latest today, and as more models come out we'll be able to use those, or as more families come out from different organizations we can test different models. Let's go ahead and run this, and there's a response: "You can create a cron using the cron.schedule method, such as the one below." It's outputting markdown; in this case, I think it basically copied that snippet directly, but check this out: "This cron sets a daily recurring job called vacuum at 3am GMT each day." Notice we didn't actually talk about that. This is where the generative language model becomes very powerful: it's doing some extra explanation that is very useful here, and it deduced this on its own. At this point we would take this completion and return it from our Edge Function to our end user, and since it's spitting out markdown, anywhere we have markdown (like the inline snippet there, or the multi-line snippet there), we can run it through a markdown renderer and get some really nice-looking results.

Before we finish the video, where I'll quickly show you the front-end side of all this, let's quickly break down this prompt and talk about some of its components and why I chose to build them that way. Disclaimer: prompt engineering is an emerging field, so some of these best practices are guaranteed to change and improve over time, but for now this is an aggregation of some of today's recommended approaches. The first thing we do is give the model an identity: "You are a very enthusiastic Supabase representative who loves to help people." What does identity do? It primes the model so that it understands its purpose prior to giving it a task. By saying "very enthusiastic", we're hoping this will, at a minimum, make the model as cheerful as possible: exclamation marks, things like that, when they make sense. And by saying it's a Supabase representative, whatever query the user sends, the answer will always be given within that context, from a Supabase representative. After identity we go into the task: "Given the following sections from the Supabase documentation, answer the question using only that information, outputted in markdown format." This part is very important: it's the instructions for the prompt, what I'm asking it to do. We want to improve the likelihood that we get the kind of result we want.
This next part is what I'm going to call a condition: "If you are unsure and the answer is not explicitly written in the documentation, say 'Sorry, I don't know how to help with that.'" Without this section, we're in danger of GPT-3 hallucinating. Hallucinating is the term we use when GPT-3 makes stuff up, and as we already talked about, GPT-3 is notorious for confidently giving you the wrong answer, especially when it comes to math, I've found. It's a language model, after all, so it's not amazing at math operations, but it will very, very confidently tell you that it thinks it knows the answer. You can even give it a math equation, ask it for the answer, and then also ask it for its confidence level, one to ten: it will proceed to give you the wrong answer and then claim ten-out-of-ten confidence, which is hilarious. So when you're creating a prompt for your own custom application, one that's going to represent your product to end users, make sure you have a condition in there to prevent the model from saying something you don't want it to say about your product, or from making something up entirely; both are bad.

Next we have the context. This part of the prompt is literally the context itself: it could be manually entered or, in our case, dynamically injected. Again, this practice is called context injection, but it is just another input, after all. The thing right above it we're calling a label. Labels help give the prompt structure: not only have we given the model a task, we're reinforcing that task by saying "here's the context I told you I was going to give you" (the Context sections label) and "here's the question I told you I was going to give you" (the Question label), followed by the query. Now, what's with the triple quotes? This is something that's recommended to make it very explicit to the model what your question is; OpenAI has recommended something like triple quotes for that. The other thing this can potentially help with is prompt injection: if people start asking the prompt to do something outside the scope of what you want, in this case asking Clippy to answer something unrelated to Supabase, keeping the question within these triple quotes can, at least at a basic level, help with that. And finally, at the end, we have our final label, which we're calling the answer: "Answer as markdown", again reinforcing that we really want this answer formatted as markdown, which in my experience it has done a great job of. Then this bit was added on later: "including related code snippets if available". For Supabase specifically, code snippets and examples are some of the most useful things in their documentation, so we wanted to give the model a little extra hint: if the context we injected contained a code snippet relevant to the query, include it if possible. We encountered situations where snippets were available but GPT-3 just decided not to include them, so little hints like this help coerce the model into giving you something closer to what you're looking for. I did write a blog article on all this stuff; I can throw it in the video description if that's helpful. It covers what prompt engineering is and lets you try out a couple of the examples in a playground.

So, the prompt has been covered; what's the last step? We're just using OpenAI's library again to call the completion endpoint, passing in the model, this prompt that we've crafted, the maximum number of tokens it should respond with (which you can control), and, in this case, the temperature. I'm not going to go super deep into temperature, but think of it as how deterministic you want the answer to be. A temperature of zero means that, given the exact same prompt multiple times, the model will produce an identical response each time, whereas with any temperature greater than zero, the higher you go, the more varied the response will be. Depending on the situation, sometimes that variance is good; in our situation, we prefer to keep the responses consistent when the query is consistent. Setting the temperature to zero also helps when you're testing different scenarios: it makes it a little easier to craft and tailor your prompt. And then at the very end, we return it back to the user.
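(As a sketch, using the v3-style openai Node SDK of the era; the max_tokens value is illustrative.)

```typescript
import { Configuration, OpenAIApi } from 'openai';

const openai = new OpenAIApi(
  new Configuration({ apiKey: process.env.OPENAI_API_KEY })
);

declare const prompt: string; // the full template from above

const completionResponse = await openai.createCompletion({
  model: 'text-davinci-003',
  prompt,
  max_tokens: 512,  // cap on the length of the generated answer
  temperature: 0,   // deterministic: same prompt, same answer
});

const answer = completionResponse.data.choices[0].text;
```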
This video would not be complete without showing you the end result, so let's take a look. Here we have it: our good friend Clippy, in the bottom right-hand corner. Something I want to note: the user interface I'm showing you right now is almost guaranteed to change and improve over time. I've been working with some very talented designers and front-end developers at Supabase, and they're doing an amazing job of making this thing look awesome. Oh, and by the way, for some of my followers: a lot of you watch my Blender videos, so I just had to mention that Clippy was in fact made in Blender. Here he is; I whipped up a little model of him, of course with a Supabase customization. Super fun to build and animate.

But back to this; let's show this thing off. First things first, we can click on Clippy's bubble, and, as we talked about, the whole idea is that we can simply use natural language to ask, in this case, Clippy anything we want, and ideally it will respond right then and there, like ChatGPT would, but catered to Supabase. And now that you know how the entire thing works on the back end, let's take a look. Let's start off with a simple one: "How do I run migrations?" Okay, check it out: you can run migrations by using Supabase's CLI, "supabase migration new new_employee", for example, and it walks you through those steps. Of course, it's using markdown snippets here; this fully integrates with Supabase's existing markdown styling and components, and even links work here as well: when you click them, you go straight to the documentation.
Another thing you can do, which is quite powerful now that we have a generative language model, is give it something a little bit custom. For example, if I say "How do I create a migration called sandwich table?", assuming I wanted a migration to create a table about sandwiches, let's see what it says. All right, check it out: we got something similar, but this time it has "sandwich table" placed everywhere instead, and it even gave us a sample sandwich table, which is kind of neat. What else can we do? What if we said "How do JWTs work in Postgres?" There we go: it talks about how Supabase creates users in Postgres, how it returns a JWT when it creates a user for the first time, etc. So once again, under the hood, what happened here is: we took this query, generated an embedding on it, searched our database (pre-processed with all the documentation from Supabase) for the embeddings that best match this query, found the top most relevant chunks of content, inserted them into our prompt as context, followed by the query itself, and basically let GPT-3 do the rest, using that context to give us a catered answer right here. Let's do one more: "Does this work with Next.js?" And there we go: "Yes, this works with Next.js!", so potentially the enthusiastic part of the prompt is contributing here. Then it goes on to talk about the Supabase auth helpers, which include a specific one for Next.js, and you can even copy the install command right then and there. Pretty awesome!

So that's it for today. Thanks for following along! I also want to mention that it's been an absolute pleasure working with the Supabase team. Props to Paul and Ant and the entire team there for building a great product. I love how everyone on the team jumps in and helps out where they can; it's made a project like this really enjoyable to work on. Thanks so much for watching today, and I hope to catch you down the next rabbit hole!
Info
Channel: Rabbit Hole Syndrome
Views: 60,194
Id: Yhtjd7yGGGA
Length: 41min 52sec (2512 seconds)
Published: Tue Feb 07 2023