Spring Tips: Spring AI

Captions
Hi, Spring fans! Welcome to another installment of Spring Tips. We're going to talk about something that's amazing, something that's near and dear, something that has been top of mind for a lot of us of late, especially since late 2022 when ChatGPT burst onto the scene: AI, and its potentially amazing and perhaps just a little bit scary implications. AI is a big, nebulous thing, and it represents literally decades of research and investment and R&D, so we have neither the time nor the ambition to cover all of it. What we need to know today is how you can leverage it from your application. I think that's what's very important here: in the same way that most people don't need to build their own SQL databases or their own message queues, most of us don't need to build our own LLMs. Instead, what we need is to be able to integrate them. So today we're going to talk about the Spring AI framework. This is a fairly new framework; indeed, it just graduated from experimental and it's not even GA yet.

So there's a whole pipeline here, and this is the idea of asking a question and, in the body of that question, adding enough information to better educate the LLM about the context associated with the question. That whole process of refining the data and getting it into a place where you can actually put it in the body is called retrieval augmented generation. The idea is that I make a request of an LLM. The LLM has been trained on a certain body of data up until a certain point, and it doesn't know about anything since that point. So if you want to teach it about up-to-date, live information in your database, for example, you need to stuff the prompt: you need to add any context required for it to be able to act on that request.

So today, my friends, we're going to talk about retrieval augmented generation with Spring AI and its various integrations along the way. We'll look at the support for working with data, we're going to look at the support for storing things in a vector database, and then we're going to look at the support, of course, for talking to a large language model. There are different implementations and integrations for things like OpenAI, Azure, Hugging Face, Ollama, Bedrock, Vertex from Google, etc., so you can take your pick, but we're just going to look at OpenAI.

And we're going to do that, as always, by starting our journey here at start.spring.io. We're going to use Maven. We're just going to call this "ai". At the moment, the AI project is not on the Spring Initializr, sadly. That's okay, though. So we're going to go ahead and bring in the JDBC support. There are a number of different vector store implementations out there; the one we're going to be using today is the PG vector store, and that's one of the supported options for Spring AI users. It's really convenient because all you need is a simple Postgres-based database with the right plugins configured, so we're going to use that. I'm just going to add the JDBC support. Do I need anything else? I guess I probably don't. Maybe I want the AOP support, right? I can add that manually. All right, let's just go ahead and hit Generate and open this up.

All right, so we have to go back to our build here. A couple of things: we're going to use the Postgres support; I forgot to add that. So let me just copy and paste all these definitions from another project, and we'll go through them line by line to make sure that we understand and are aware of every single dependency. Okay, very good. And we need a Spring AI version; at the moment we're going to be using the Spring AI 0.8 snapshot. Now, obviously this is subject to change; it's not even a GA release, but it's conceptually very interesting to understand the possibilities.

Okay, so we've pasted the dependencies into the class path. Let's see what we've got. We have spring-boot-starter-jdbc; that's for the PG vector store support, and of course it's always good to have JDBC around. We've got the web support, not because we're necessarily trying to build a web service but because we might use a web dependency; I'm not even sure if that's required. We have the Spring AI OpenAI Spring Boot starter, the Spring AI PDF document reader, and the Spring AI PG vector store Spring Boot starter. These dependencies allow us to work with the different pieces of machinery required to support our use case today. We're also going to bring in the AOP support and the retry support. One might argue that this should be part of the Spring AI dependencies, and perhaps it will work without it, but I'm going to cargo-cult it and we'll just use it. You can try removing it and see if everything goes well. We have Postgres, of course, and then the test support, which is duplicated. Good.

All right: AI application. We've got our public static void main. We know that we're going to be talking to data in a PDF. I want to be able to ask the LLM questions about data that it's going to discover in a PDF, and that data is here in the PDFs directory. I've got this file here called Medicaid Washington FAQs. This is about the Medicaid program in the state of Washington in the United States, and it's a notice from October 11th, 2023, so it's new; newer, at least as far as I know, than all the cutoff points for ChatGPT. So I'm going to be able to ingest this data and then ask the LLM questions about it. I want to be able to teach the LLM about this so that I can then query it and provide kind of an IVR experience. We're going to initialize the pipeline to read the data in, and then we're going to ask our LLM a question about what it has just read with that data.

So here's our initialization, our demo. Okay, very good. We want to have a method that, first of all, resets the vector store implementation. So, JdbcTemplate; we'll say template.update, "delete from vector_store". Fantastic. And now I want to have a way to read data in. That data, by the by, will live in a PDF on my desktop; we saw that, right? So @Value, file, home, Desktop, PDFs, and the PDF in particular is the Medicaid WA FAQs PDF. So here's the resource, there's the template, and we're going to delete from the vector store.

Let's connect, by the way, to that vector store. I have this Docker Compose file here that you can use to spin up your own vector store. I'll put that in the source code for this repository on github.com/spring-tips. You can see it's ankane/pgvector, v0.5.0, and I've got the usual environment variables specifying authentication credentials, a health check, and even pgAdmin if you want to be able to use that to administer the Postgres instance. Me, I just tend to stick to the command line, so it's fine. Note, of course, that the database is vector_store and the user is postgres; everything else is postgres, so user, password, and so on are all postgres. Let's start that Docker Compose instance up. There we go; now it's running in the background. I can connect to it: PGPASSWORD equals postgres, psql -U postgres -h localhost, postgres is the database, and then \c vector_store. Here we go; there's nothing in there at the moment. A brand new vector store installation. So we want to delete from it, if it exists.

We want to be able to read some data in. The goal here is to read some data, get it into a place where we can understand the text from the PDF, then put that in the vector store, and then ask questions of our LLM, pointing it to that vector store. So, first things first: the config. It is a bit of a mouthful, but basically we're just specifying how fast and loose we want to play with the text data in the PDF document. Then we have the reader, which in turn takes the resource and the config that we just created. And now we have a text splitter, which will tokenize the data we just read. Now we can actually write that data to the vector store, and in order to get that we'll just inject a pointer of type VectorStore. There are, again, a good many of these out there. There's actually an in-memory one that comes with Spring AI that is pretty easy to understand if you understand the math, which I don't purport to understand at all. Given some text, it's able to give you an idea as to whether it's closer to or further away from some other text, and we're going to use that to write data. We are not going to use the in-memory one, even though it's mathematically very simple; we're going to use the Postgres-based one. We'll say vectorStore.accept, and then we'll put the documents in there: textSplitter.apply, pdfReader.get.

Okay, so there you go, there's our updated code. We're writing data to the PG vector store. This configuration is going to configure our document reader for PDFs using the Apache PDFBox API. We then read it, split up the data, and write it to the data store. Okay, very good. Let's just try running it. Remember, though, we're dealing with PDFs, so behind the scenes things like missing fonts might vex you. You might be in a situation where you see errors about missing fonts. It's probably fine, but do be aware that it might be an issue.

Another issue: we need to configure our data source, naturally. Remember what we just talked about: spring.datasource.url is jdbc, postgresql, localhost, vector_store, and the username and the password are both postgres. So we have the database, host, username, and password. Let's go back to our code and restart. See those errors? PDFBox text; it's all about getting fonts. On my Mac, there are fonts in the PDF that I can't load on my local machine. Not a big deal; just be aware you might see some. You can always adjust the log levels for those errors if you want to. I haven't tried this, but you could probably do something like logging.level.blah equals debug; that usually works for most kinds of loggers.

Okay, so it's still going. It's reading a lot of data, it's doing a lot of work, but there you go: it's finished its work splitting up the document into two chunks. Has it finished writing the data? That's the real question here. And by the way, notice how this took a long time; this is not cheap. Okay, so there it is, we've got a database now. We've got something in there: select count, look at that, 13 rows. And we can ask questions like, what's in a given row? You've got the content, the metadata, the embedding, and the ID, and that's a vector. Okay, very good. So we now have information about the PDF in our database. We could rerun this every single time, but I hardly see why, so let's just refactor it into a separate method. We're going to call this setup, and we'll just comment that out for now. So we now have a separate method that we can use to reinitialize our database, which I've commented out.

Now comes the fun part: actually talking to the singularity, talking to our API. So let's go and build a client. In order for the client to do its work, we'll use the Spring AI ChatClient. Now, there are other clients out there. This is a model client; if you go back you can see that, and it's good for text messages, that kind of thing. But we also have support for image clients, so you can give an image prompt and get an image response, for all the different LLMs that support that. For example, ChatGPT 4 supports both text and DALL-E image-based generation. So depending on what you need, you inject the right implementation. There are going to be a lot more textual chat clients than there are image-based clients, but there are a few of those as well, so it's just very convenient to have that as an option.

Now we've got the chat client. You can imagine building a CRM, or some sort of enterprise app, where there's an AI assistant and you want the AI to respond with some knowledge about what you're doing and what the user is doing. So we need to write the prompt, and this is actually a whole discipline: prompt engineering. How do you write your prompts in such a way as to give the model enough information to answer questions appropriately? So let's go ahead and do this. I love multi-line strings; they're not just for SQL anymore. Look at that. Good stuff. So now here's our question, here's our prompt.

Okay, so there we go. This is pretty straightforward human-language text: a prompt. We're giving it a little bit of static, invariant information that it needs to know for all responses, and then we're going to give it some information from the documents that we've just ingested. Let's just make sure our human text is pretty. Okay, looks okay. It's an AI; it'll figure it out. So now we've got the documents stanza, and now we can ask it questions. We've stuffed the prompt. What we're going to do is send that prompt as a system message, which is sort of an overarching message that should be included in all other messages that get sent. So the list of similar docs equals this.vectorStore.similaritySearch, and then we're going to look for the message.

Okay, so we've got this prompt. You know, I want to hide this complexity a little bit behind a chat client of sorts, so let's do that here. Say static class KarinaAiClient. Not bad. So: String chat, String message. I'm going to put that prompt there. We'll say var listOfSimilarDocs equals this dot, and then we need the vector store: private final VectorStore. And put that as a bean. So this.vectorStore.similaritySearch, message; I guess that's it, actually. Then var docs equals listOfSimilarDocs.stream, map, get the content, collect, Collectors.joining. We're just joining all the data from the similar docs into one thing, and then we can use that to create a bunch of documents that we're going to put in that prompt.

So we're going to create a new SystemPromptTemplate, and the template will have the string that we just put in there; that's this thing right here, the prompt. And we're going to create a message, and that message will have some parameters: Map.of documents, that's this parameter right here, and the value of that parameter is the collection of documents that we just created. Now we can create a user message, and this, you can imagine, will be different for every user. The user message is the question being asked by the client. And then we have a prompt: new Prompt, List.of the system message and then the user message. Finally, we get the response from our AI implementation. So aiResponse equals the AI client, and we want the chat client injected here. Do we have that? No, we don't. So private final ChatClient, voilà. chatClient.call, prompt. And then we've got the response, so we want to say return aiResponse.getResult, getOutput, getContent. The reason there's a getContent and a getOutput is that you can actually also get streaming results, sort of line by line, but I prefer to wait for it all to get aggregated and then get the content. So: content, voilà.

All right, so there we go. We're returning a String in response to this question using our Karina AI client. Now let's inject that here in our little demo. I guess we don't need that; oh, let's just leave it in, it doesn't matter. So we've got the Karina AI client; now let's ask it a question. System.out.println, the Karina AI client's chat. Let's look at the PDF. Let's ask about something new. Here's something; this is a state change: what should I know about the transition to Consumer Direct Care Washington? Let's put that here. That's our question to the AI, which is going to give us a String, which we're then going to print out using System.out.println.

All right, let's run this whole thing again. Remember, we're not going to reinitialize the vector store; that should already have been done. So now we're going to talk to the thing. Oh, okay, the port is blocked; let's restart the existing one. And of course at this point it's going to take a little while; it's actually making a network call to the LLM and doing the actual work of getting us a response. Here's the response, though. Cool, huh? "CWA is responsible for providing employment", blah blah blah. Cool. So we got data from the AI, which is summarizing or synthesizing from what it has read in those PDFs, which we've given it in the RAG pipeline. It's summarizing and giving us a response based on our query.

This is super convenient, and it was very quick too. In just a matter of minutes we went from having nothing to a retrieval augmented generation based pipeline supporting OpenAI, in this case, but there are a good many other options out there. We're also using the PG vector store. And of course this is just the beginning; you've seen only the very simplest of examples here. But the idea is that you can now have, with just a little bit of configuration, an easy integration with any number of different LLMs.

I have been talking to OpenAI. Obviously, normally you'd need to specify that OpenAI API key. What I did was, before we started this program, I created an environment variable that looks like this, exported it in my shell, and put an actual value in there. You'll just have to trust me that I've done that, and you'll forgive me for not leaking my OpenAI credential.

Remember, this is extremely IO-centric. But remember, Spring Boot 3.2 supports virtual threads in Java 21; that's why I chose Java 21. So I do this, and now, sure, it might be taking a few seconds to get a response from the AI, but who cares? You're not losing that thread while it's parked waiting for the next byte to arrive from the AI. The virtual thread mechanism underneath, in Java 21, and used all over the place in Spring Boot once you set this property, will automatically park your waiting business logic somewhere, relinquish the actual thread, and let something else in the system use it until such time as that thread of execution is ready to continue, because the bytes have come back or the Thread.sleep has finished or whatever. So the cost of doing this has now gone down to basically nil. It's an amazing, amazing time to be a Java and Spring developer, my friends. All right, good luck! I can't wait to hear about what cool things y'all start building.
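The virtual-thread point made at the end of the captions can be illustrated with plain Java 21, outside of Spring. This is only a sketch under stated assumptions: `slowCall` is a hypothetical stand-in for a blocking request to an LLM, and the class name is made up for illustration. In Spring Boot 3.2 the equivalent behavior is switched on with the `spring.threads.virtual.enabled=true` property, which is presumably the property shown on screen.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class VirtualThreadSketch {

    // Hypothetical stand-in for a slow, blocking network call to an LLM.
    static String slowCall(int id) throws InterruptedException {
        Thread.sleep(200); // parks the virtual thread; the carrier thread is released
        return "response-" + id;
    }

    // Runs `count` blocking calls, one virtual thread per call, and collects the results.
    static List<String> runAll(int count) throws Exception {
        List<Future<String>> futures = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < count; i++) {
                final int id = i;
                futures.add(executor.submit(() -> slowCall(id)));
            }
        } // close() waits for every submitted task to finish
        List<String> results = new ArrayList<>();
        for (Future<String> future : futures) {
            results.add(future.get());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        Instant start = Instant.now();
        List<String> results = runAll(1_000);
        long millis = Duration.between(start, Instant.now()).toMillis();
        System.out.println(results.size() + " calls finished in " + millis + " ms");
    }
}
```

Because each waiting call only parks a cheap virtual thread, a thousand 200 ms calls should complete in far less time than running them one after another, without a thousand platform threads being tied up.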
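The retrieval and prompt-stuffing steps walked through in the captions (find the chunks most similar to the question, join them, and drop them into a `{documents}` placeholder in the system prompt) can be sketched without any framework. This is a toy illustration, not Spring AI's API: the word-count "embedding", the class, and all method names are assumptions made up for the example, whereas a real vector store relies on model-generated embeddings.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PromptStuffingSketch {

    // Toy "embedding": lower-cased word counts (an assumption for illustration only).
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine similarity between two sparse word-count vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    // Returns the topK chunks closest to the query, most similar first.
    static List<String> similaritySearch(List<String> chunks, String query, int topK) {
        Map<String, Integer> q = embed(query);
        return chunks.stream()
                .sorted(Comparator.comparingDouble((String c) -> cosine(embed(c), q)).reversed())
                .limit(topK)
                .collect(Collectors.toList());
    }

    // "Stuffs" the prompt: joins the retrieved chunks into the {documents} placeholder.
    static String stuffPrompt(String template, List<String> docs) {
        return template.replace("{documents}", String.join("\n", docs));
    }

    public static void main(String[] args) {
        List<String> chunks = List.of(
                "Caregivers transition to Consumer Direct Care Washington in 2023",
                "Fonts may be missing in a PDF",
                "Postgres stores vectors with the pgvector extension");
        String question = "What should I know about the transition to Consumer Direct Care Washington?";
        List<String> similar = similaritySearch(chunks, question, 2);
        String systemPrompt = stuffPrompt("Use the information in the DOCUMENTS section.\n\nDOCUMENTS:\n{documents}", similar);
        System.out.println(systemPrompt);
        System.out.println("USER: " + question);
    }
}
```

In the pipeline described in the captions, the system prompt built this way is sent alongside the user's question, so the model answers from the retrieved text instead of only from its training data.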
Info
Channel: SpringDeveloper
Views: 59,747
Keywords: Web Development (Interest), spring, pivotal, Web Application (Industry) Web Application Framework (Software Genre), Java (Programming Language), Spring Framework, Software Developer (Project Role), Java (Software), Weblogic, cloud foundry, spring boot, spring cloud, artificial intelligence, retreival augmented generation, data
Id: aNKDoiOUo9M
Length: 22min 48sec (1368 seconds)
Published: Wed Jan 31 2024