GraphRAG: Knowledge Graphs for AI Applications with Kirk Marple - 681

Captions
"We already have a great retrieval system to plug in to LLMs, and we'd already solved the media side of it that plugged into multimodal models. With graph RAG you can say: for this piece of content, what are all the entities related to it, and what are those entities related to in other pieces of content?"

Sam: All right everyone, welcome to another episode of the TWIML AI Podcast. I am your host, Sam Charrington, and today I'm joined by Kirk Marple. Kirk is CEO and founder of Graphlit. Before we get going, be sure to take a moment to hit that subscribe button wherever you're listening to today's show. Kirk, welcome to the podcast.

Kirk: Thanks so much. I've been a longtime listener and I'm glad to finally be part of this.

Sam: I'm excited to have you on the show, and I'm really looking forward to our topic. We're going to be digging into what you're doing at Graphlit, but in particular the broad space of graph RAG. Tell us a little bit about Graphlit and how you're approaching RAG as a space.

Kirk: For sure. We've been around for about three years. We started out building an unstructured data platform for everything: multimodal data, documents, audio, video. We got really interested in the knowledge graph side of it, pulling all that data into a knowledge graph to make it explorable, and then saw the integration of that with RAG come out over the last year, and how we could benefit from it.

Sam: We've known each other for a bit now. You attended our first TWIMLcon conference in San Francisco; that was in 2019. You started out trying to apply ML and AI to media. Talk a little bit about how that led you to the way you think about the problem today.

Kirk: A lot of the problem in RAG is the R, the retrieval side, but you have to have the data to retrieve in the first place. So we started focusing there. There wasn't at the time, and still kind of isn't, a Fivetran for unstructured data. That's what we're looking at: where are the data pipelines that make the data available to AI and ML models? So we started at the ingest side, pulling in all sorts of data. In my background I had a company in the broadcast video space, and I dealt a lot with file-based workflows there. There are a lot of parallels: pulling in data, running NLP on it, running computer vision. So we started with the content first: how do you get data into a system like this? Then search and retrieval is obviously a big step, and we start a lot with the metadata. How do you retrieve data? Well, first, how do you store the metadata, and then how do you retrieve it? Metadata filtering is common in databases now, and we were already doing a lot of that a couple of years ago. As RAG became the concept, we realized we already had a great retrieval system to plug in to LLMs, and we'd already solved the media side of it, which plugged into multimodal models very well. That's when we really focused on Graphlit as a platform that anybody could build an application on, and that's what we've really been pushing for over the last year.

Sam: Graph RAG is a concept that grew out of a paper that Microsoft published some time ago. To what degree is your system trying to implement the specific approaches from that paper, as opposed to the general idea of using graph relationships in a RAG model?

Kirk: It was really interesting to see that paper. We'd been doing a lot of that already and just hadn't been talking about it as widely. The first part of it is just how you build the knowledge graph, and that's what we had focused on first: doing entity extraction of people, places, and things, and creating the graph from the content. And as we built
up our RAG system, pulling in that data from the knowledge graph, graph RAG is essentially where we're at now. It was good to see it in a paper and to see other people looking at it, because it's something I've been thinking about for a couple of years. Honestly, a lot of this started with a podcast discovery platform I was trying to build six or seven years ago. It was a little too early. Now there are a bunch of great projects out there, transcription got cheaper, and it's just so much easier to build a platform like that.

Sam: Let's talk a little bit about that extraction. Where did you start with it, and how has it evolved? Is it a solved problem? If someone listening wanted to go about doing it, what are they going to find difficult? Let's dig deep into that part of the problem.

Kirk: What we found is there's a lot of overlap with the retrieval side: text extraction, text chunking, pulling those pieces together to do entity extraction, and really any NLP. You have to solve that part first, and then use a model. We've used Azure AI Text Analytics, and we've actually used LLMs for this, instructing the model to identify people, places, and things. What LLMs are also really good at is identifying things like places; address extraction, we found, is something LLMs are especially good at, and that was really difficult with NLP and the original algorithms. That extraction side happens during the ingestion pipeline, so that's how we look at it.

Sam: I've heard a little bit of the contrary, in particular that large language models aren't great in a lot of cases for traditional named entity extraction, and that the traditional models are still better and maybe more controllable. How do you
think about where to apply which?

Kirk: We've actually seen that as well, and we can use a mix. We've seen LLMs work great for specific cases, like events. We were working with a community website that wanted to pull out, okay, when was The Stance concert, and where was it? We were able to guide the LLM really well to extract an event. But I've seen other cases, like people and companies, where the classic AI text analytics, like Azure's, works way better. So we support both. During the same ingestion pipeline we can actually run both models, and you can instruct it: for places use GPT-4, for people and organizations use Azure, that kind of thing.

Sam: You would think you could give an LLM a paragraph of text or a document and say, okay, produce a JSON document that has all of the people identified, and it just doesn't work as well as you might want it to.

Kirk: There are cases where it definitely works, but what we found is that you get more noise in some cases, so it's something you want to try out and look at. We've definitely seen a mix. And honestly, the Azure- or Amazon-type models have similar noise problems, where they'll identify a term as a company and it's not really a company. So there's a data quality issue that feeds into both sides, and that's where it takes a bit of testing and evaluation to get it working right.

Sam: Have you been able to identify specific patterns that characterize the cases?

Kirk: One thing we've looked at, which we haven't released yet, is a chaining model: using an NLP-style model to identify the entities and then going back to refine that. We do support data enrichment today, where when we identify an entity like a company, we can call out to Crunchbase or
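The per-entity-type routing Kirk describes (places via an LLM, people and organizations via a classic NER service) might be sketched as below. This is a minimal illustration, not Graphlit's actual API: the extractor functions are toy stand-ins for real model calls, and all names are hypothetical.

```python
# Sketch of routing entity types to different extractors, as described above.
# Both extractors here are placeholders for real model calls.

def llm_extract(text, entity_type):
    # Stand-in for an LLM (e.g. GPT-4) instructed to return entities of one
    # type and emit JSON; here we just pick title-cased words.
    return [w.strip(".,") for w in text.split() if w.istitle()]

def ner_extract(text, entity_type):
    # Stand-in for a classic NER service such as Azure AI Text Analytics;
    # here we just pick all-caps tokens.
    return [w.strip(".,") for w in text.split() if w.isupper()]

# Configuration: which model handles which entity type.
ROUTES = {
    "place": llm_extract,          # LLMs handled places/addresses well
    "event": llm_extract,
    "person": ner_extract,         # classic NER worked better here
    "organization": ner_extract,
}

def extract_entities(text, entity_types):
    """Run the configured extractor for each requested entity type."""
    return {etype: ROUTES[etype](text, etype) for etype in entity_types}
```

The point of the configuration dict is that the routing is data, not code: swapping which model handles which entity type is a one-line change, matching the "places use GPT-4, people use Azure" instruction style described above.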
Wikipedia and enrich its metadata. But one thing I really want to try is, once you've identified the entity, to go back and re-identify: prompt the LLM to say, hey, I think Microsoft is in here; go grab everything you can find about Microsoft. That's an area I definitely want to explore a bit more.

Sam: So you've got these multiple methods to extract the entities, and you mentioned that that's part of the data ingestion. How is it used in the context of ingestion?

Kirk: What we call our content workflow is multi-stage. Ingestion is really just downloading the content from blob storage or a website. Then we have a stage we call preparation, which includes audio transcription or text extraction (such as PDF extraction). What we get out of it is a canonical form: essentially we store a JSON file of "here's the transcript" or "here's the text," with semantic chunking or page chunking. Then we have an extraction stage, and that's where entity extraction happens. So we have a state machine that the content goes through, and in our platform you can configure each stage of the workflow. At extraction you can tell it which model or which API to use and what you want out of it, and then we take the results and connect everything up in the knowledge graph. After that we have an enrichment phase, which is optional, where you can say: I've identified a company; now go get its address, that kind of thing.

Sam: You recently joined our generative AI meetup. You've been involved in our community, talked a little bit about this, and did a demo. One of the questions that came up in that conversation was about RDF and entity relationships from the traditional NLP world. Do you use those kinds of relationships
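The staged content workflow Kirk outlines (ingestion, preparation, extraction, optional enrichment) can be sketched as a simple pipeline of stage functions. This is an illustrative toy, not Graphlit's schema: the stage bodies and field names are invented placeholders.

```python
# A minimal sketch of the multi-stage content workflow described above:
# ingestion -> preparation -> extraction -> enrichment (optional).

def ingest(item):
    # Download from blob storage or a website; here we just mark it fetched.
    item["raw"] = f"<bytes of {item['url']}>"
    return item

def prepare(item):
    # Audio transcription / text extraction, producing a canonical JSON
    # form with semantic or page chunking.
    item["chunks"] = [{"text": "chunk 1"}, {"text": "chunk 2"}]
    return item

def extract(item):
    # Entity extraction over the prepared chunks (model is configurable).
    item["entities"] = ["Microsoft"]
    return item

def enrich(item):
    # Optional: call out to Crunchbase/Wikipedia for identified entities.
    item["enriched"] = {e: {"source": "wikipedia"} for e in item["entities"]}
    return item

# The state machine: content flows through configurable stages in order.
STAGES = [ingest, prepare, extract, enrich]

def run_workflow(url, stages=STAGES):
    item = {"url": url}
    for stage in stages:
        item = stage(item)
    return item
```

Because the workflow is just an ordered list of stages, "configure each stage" amounts to swapping a function (or its parameters) in the list, which mirrors the configuration-over-code approach discussed later in the conversation.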
in the graph that you construct, or is it more ad hoc?

Kirk: That's a great question. The way we currently do it is we base all of our entities on schema.org, JSON-LD style, so we're not inventing our own data model for that. I think that's important, to have canonical, reusable data, and LLMs actually know JSON-LD really well. So that's the first part. Today we're leaning more on entity-to-content relationships rather than entity-to-entity relationships. We could do both, but that's mostly what we're trying to see, because with entity-to-content relationships, with graph RAG you can then say: for this piece of content, what are all the entities related to it, and what are those entities related to in other pieces of content? So our graph leans a bit more on those entity-to-content relationships.

Sam: Okay. So there's less of a need or desire at this point to traverse entity-to-entity relationships to pull in a broader graph of content. It's more of a two-step as opposed to a fan-out.

Kirk: Right. Yohei has been working on some graph things you might have seen on Twitter, and there are different open source projects that are a bit more like "I work at X," that kind of relationship. We could definitely do that, but our graph, just because of what we're using it for, is not leaning in that direction today.

Sam: You were talking through your workflow, and you got through the ingestion part of the pipeline. What's next?

Kirk: The preparation stage and the enrichment stage and all those essentially end up with data in a vector database, a graph, and a document store. So we have a hybrid data storage model, and we can leverage object storage as well, so if we're caching
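The two-step, entity-to-content traversal Kirk describes (content to entities, entities to other content, with no deeper fan-out) can be sketched with plain dictionaries standing in for a real graph database. The data here is invented for illustration.

```python
# Sketch of the entity-to-content traversal described above: for one piece
# of content, find its entities, then find other content mentioning them.

# entity -> set of content ids that mention it (the entity-to-content edges)
MENTIONS = {
    "Microsoft": {"doc1", "doc2"},
    "Azure": {"doc1", "doc3"},
    "Neo4j": {"doc4"},
}

def entities_of(content_id):
    """Entities related to one piece of content."""
    return {e for e, docs in MENTIONS.items() if content_id in docs}

def related_content(content_id):
    """Two-step walk: content -> entities -> other content (no fan-out)."""
    related = set()
    for entity in entities_of(content_id):
        related |= MENTIONS[entity] - {content_id}
    return related
```

Note that an entity-to-entity model would add edges between the keys of `MENTIONS` and require deeper traversal; the two-step walk stops after one hop back to content, which is the design trade-off discussed above.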
things, that's essentially where they go. We use it all together: the graph replaces the relational index, and it holds the relationships between all the content. We have collection patterns, we have the entity-to-content patterns. This is something I came up with over the last several years and kept refining, and it's worked really well for us. We can walk from one piece of content either via similarity, from the vector angle, or through the metadata and the graph. So we have that hybrid approach: essentially a searchable knowledge graph.

Sam: What's been your experience working with the vector databases? Do you support all of them, or do you have preferences? Are there key features you're using in your system that are not universally supported, or are you just using the basic capability of the vector databases?

Kirk: That's a good point. We looked at all the major players and talked to a lot of folks; there are a lot of great ones out there. We're Azure-native today, so we're leveraging a lot of Azure data services, for example for our graph database, and we were actually already using Azure AI Search (then Cognitive Search) for our keyword search. Right at the time I was evaluating vector databases, they came out with their vector index support, and I tried it and it worked. It's nice because it supports metadata filtering, keyword, vector, and hybrid all in the same box. So we're not plug-and-play with vector databases like some other systems; instead we pick different things, connect them together, and have a managed platform on top. And it works great. They even just increased their performance and updated their storage. We're actually going to be
at Microsoft Build next month showing this off, in that Microsoft domain, but it has worked great for us. We could theoretically swap it out if we wanted to, but we've had some really good experiences so far with this one.

Sam: We're broadly seeing, as RAG and vector databases become more popular, vector becoming a layer on top of a lot of different data stores. I just came back from Google Cloud Next, and they did a similar thing: BigQuery now has a vector layer, and AlloyDB, which is their Postgres, now has a vector embeddings layer. My question is, do you see the same thing happening with graph? Are we moving towards a converged world where you put your data in and you have all these different semantic abstractions on top of it, where you're not replicating the data? Or do you think graph is so distinct that they remain disparate systems?

Kirk: It's a really interesting point, and something I've thought a bit about, because when I was first looking for the magical database I really wanted, it didn't exist, and I had to create this Frankendatabase from a couple of different things. The searchable side, keyword search in a graph, I know a couple of vendors have solved in that world; I would envision vector comes next. You could integrate something like that with the graph. The big problem we had with graph databases, honestly, is the payload: you can't store a ton of data in a node. That was a big reason we came up with this three-tier storage model, where the graph is the index, the heavy metadata is in a JSON store, and the really heavy content is in the file system, in object
storage. So we use that sort of HSM model of layered storage, and it's worked great; we only pull content off disk if we really need to. But to your point, I don't think it's a done deal that it'll go there. We'll have to see. I know they've started adding keyword search, though, so I wouldn't be surprised to see vector.

Sam: Along the same lines: vector databases have been around for quite a long time, and we were making progress in shifting from keyword-based search to more semantic, embedding-based search. Most of the vector database vendors we see now started as tools to support that search use case; RAG came along and popularized the whole space, and then vector started getting pulled into every data store. Do you think graph RAG will have the same impact on graph databases? They've been around for a very long time, longer than vector databases, I think. Neo4j, for instance; I've known those folks for quite a long time. They've always had their place, but they've never had the "vector moment" that vector databases recently have. Based on what you've seen with graph RAG, do you think it's the killer app for graph databases?

Kirk: I think it's definitely possible. Graph databases have been one of those background things that are really useful for some specific use cases, often on the algorithmic side, like running heavy algorithms on a very large graph. The idea of more of a property graph, the entity graph, and graph RAG, I think, could play out like what happened with vector and RAG. There are several vendors that made
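The three-tier storage model described above (graph as a light index, heavy metadata in a JSON store, full content in object storage) can be sketched as a toy class. Class and method names are illustrative assumptions, not Graphlit's implementation.

```python
# Toy sketch of the three-tier storage model described above: the graph
# holds only light nodes/edges, heavier metadata lives in a JSON document
# store, and full content bytes live in object storage.

class TieredStore:
    def __init__(self):
        self.graph = {}      # node id -> set of edge targets (light index)
        self.metadata = {}   # node id -> JSON-ish metadata dict
        self.objects = {}    # node id -> full content payload

    def put(self, node_id, edges, meta, payload):
        self.graph[node_id] = set(edges)
        self.metadata[node_id] = meta
        self.objects[node_id] = payload

    def browse(self, node_id):
        # Cheap path: graph + metadata only, no object-storage read.
        return self.graph[node_id], self.metadata[node_id]

    def fetch(self, node_id):
        # Expensive path: only "pull it off disk" when content is needed.
        return self.objects[node_id]
```

The design point is that traversal and filtering (`browse`) never touch the heavy payload tier; `fetch` is reserved for when the full content is actually required, which is the HSM-style layering Kirk describes.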
names for themselves that way, that created huge companies out of just that plug-and-play. So we'll have to see. The value of graph RAG is still a little TBD; we're still exploring it, but I think that's what makes it interesting. There are a lot of interesting paths you can take to see how to squeeze out the value. And I'm optimistic. I really think that more people putting these relationships in a graph, rather than just a typical relational database, gives them more opportunity to explore their data in new ways.

Sam: You go through this ingestion process that populates your four interrelated data stores. Is that the end of the pre-processing step? And then we're on to the next half, which is what happens at the time of a query?

Kirk: Pretty much, yeah, other than enrichment, which is optional. Once the data is in the data stores and the vector index (if you call our API and say, hey, ingest a URL, the footprint becomes the extracted text and the data in the graph), then it's all about retrieval. You can make a query, and that query can be a mix: keyword only, vector, or hybrid. On the metadata side, we'd already been doing a lot with metadata filtering, because one of our key thesis points was to index everything in time and space; we were actually working in geospatial as well. So a lot of our metadata is not just title, author, and keywords; it's what lens is on the camera, what the GPS location of the image or the video is. We have a really thorough set of metadata that we can extract from any media, and that's when it becomes really interesting for metadata-driven RAG: asking questions about locations, asking questions about time ranges.
And then the graph RAG part is really pulling out those entities and asking questions about where those entities are mentioned. So everything drives towards that retrieval model.

Sam: It strikes me that there's a lot of potential complexity in the fusion of these different retrievals. Folks who listen to the podcast have heard me talk previously about how it's easy to get a RAG demo up and running, but getting something really good into production is difficult, in part because there's a lot of tuning that goes into the retrieval. And that's with just one data store, with just your embeddings and chunks; tuning all that can be difficult. Now you're talking about layering in metadata and graph, and concepts like reranking. What does reranking mean across three different sources of information? How do you approach all that?

Kirk: We take a layered approach. Retrieval comes first, and it definitely touches the graph, the search index, and the vector index, so the retrieval step happens first but applies to that triad of data stores. Then we actually just added support for the new Cohere reranking model. The output of retrieval is essentially a ranked list of sources, and each of the sources has metadata that we pulled from the data stores; we then provide that to the reranking model, or do it ourselves.

Sam: When you say a ranked list of sources, do you mean source documents, or sources as in your three different content sources: metadata, vector, graph?

Kirk: Document chunks or document sections, things like that, or chunks of an audio transcript. A source is an abstraction for, here's
a piece of text that we found, probably in some content.

Sam: Got it. So you somehow execute a query across these multiple systems, you get back a bunch of chunks, and from a reranking perspective you treat them equally: you're doing the best you can to rank them based on the content of the chunks, as opposed to the context in a graph, or the neighborhood in an embedding space, or something like that?

Kirk: At least today we are. That's something we're exploring, though. One dimension is time, sort of time relevance, doing time clustering; we're looking at geo-clustering; we're looking at some things like that where the graph can help as well. But what's in production today is essentially just the content sources getting reranked, and those content sources come from retrieval. We do support retrieval strategies for things like expanding a chunk into a semantic chunk, which is like a section of a document, or chunks of a transcript. We call those strategies; that's where the knobs are that people can turn during the retrieval step. And then we just added the reranking strategy, which initially supports Cohere, and that's really helped. The relevance you get out of your vector database, or even the output of the search, is not exactly in the order you're expecting; that's why these models exist. LLMs can adapt to misordered data, but it's the filtering that I've found valuable: throwing away the irrelevant data lets us have a low-pass filter, getting rid of stuff that doesn't make sense and could confuse the LLM.

Sam: Maybe this is a good time to introduce the topic of evaluation and how you think
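The rerank-then-filter step described above can be sketched as follows. The scoring function here is a deliberately toy stand-in for a reranking model such as Cohere Rerank; the "low-pass filter" is the threshold that discards low-relevance sources before they reach the LLM.

```python
# Sketch of rerank-then-filter: score retrieved sources against the query,
# sort by relevance, and drop everything below a threshold.

def score(query, text):
    # Toy relevance score: fraction of query terms present in the text.
    # A real system would call a reranking model here instead.
    terms = query.lower().split()
    hits = sum(1 for t in terms if t in text.lower())
    return hits / len(terms)

def rerank_and_filter(query, sources, threshold=0.5):
    scored = [(score(query, s["text"]), s) for s in sources]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # The "low-pass filter": keep only sources above the threshold.
    return [s for sc, s in scored if sc >= threshold]
```

As Kirk notes, LLMs can tolerate misordered context, so the filtering step (removing irrelevant sources entirely) is often where the practical gain is.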
about relevance and the quality of a retrieval.

Kirk: A lot of it, well, everybody goes through that ad hoc phase of "okay, it looks good."

Sam: The vibe check.

Kirk: The vibe check, exactly. We started there as well, and we've actually just started working with a vendor in this space that has more automated evals; hopefully by next week we'll have some details on that. We're starting to lean more into it, because for me (I'm not a data scientist; I came from a more traditional software background) a lot of this is like unit testing and integration testing. You have to have suites of these tests to run when a new model comes out, and I look at it that way. So we are trying to automate it, more than just the vibe check, and I think there are some good vendors and good projects out there that really help with that.

Sam: Are you thinking about dynamic optimization, DSPy, those kinds of things? Dig into that whole prompting space.

Kirk: I would call what we do more of a prompt compiler, so it is dynamic prompting. We're not just using static phrases, although there are static paragraphs we use for some of the instructions and guidance. One of the things I came across, actually when I started implementing Cohere months ago, is how much they like XML: Cohere prompts really like XML. And I realized that OpenAI likes it as well, and pretty much most of the other models do too. So we have an XML template that we compile to, which has different sections: a context section at the top, then instructions and guidance. We've broken it out so that guidance is what not to do, and instructions are what I want you to do. And then we
compile the sources, and then the user prompt piece at the bottom. We've had to do some things dynamically based on the model. For example, Haiku, I think it was, forgot the JSON schema; you had to remind it of the schema at the top and the bottom. I've seen that with a couple of different models. Those are the kinds of things it lets us be dynamic about, on the fly. So there's a structure, but it is a compiler that we dynamically output based on the incoming query and all the retrieval strategies, that kind of thing.

Sam: Is it an LLM that's compiling this into a prompt? Is it a set of rules or heuristics, or a combination?

Kirk: The compiler is just code; honestly, it's straightforward. We are using LLMs for things like prompt rewriting: we have a couple of strategies for optimizing, first for semantic search, versus rewriting the prompt. We actually have an experimental one for multi-query now, which breaks a prompt into multiple queries, and that works really nicely. I know LlamaIndex and LangChain have seen those kinds of results as well. We also use LLMs for things like summarization of the conversation history; we support a windowed conversation as well as a summarized history. But the compiler itself is just code: it looks at all the context it has and then generates essentially a string that gets put into the LLM.

Sam: When you say the LLMs like XML, how much stronger is that statement than "they like structure"? Do they like that particular kind of structure, as opposed to headings and paragraphs, YAML, JSON? Is XML, for whatever reason, just some magic pixie dust
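A prompt "compiler" of the kind Kirk describes, assembling XML sections (context, instructions, guidance, sources, user prompt) into one structured string, might look like the sketch below. The section names follow the conversation; the real template is presumably more elaborate.

```python
# Minimal sketch of an XML prompt compiler: dynamic sections are assembled
# into one structured prompt string, as described above.

def compile_prompt(context, instructions, guidance, sources, user_prompt):
    parts = [f"<context>\n{context}\n</context>"]
    # Instructions: what the model should do.
    parts.append("<instructions>\n" + "\n".join(instructions) + "\n</instructions>")
    # Guidance: what the model should NOT do.
    parts.append("<guidance>\n" + "\n".join(guidance) + "\n</guidance>")
    # Compiled retrieval sources, one element each.
    src = "\n".join(f"<source>{s}</source>" for s in sources)
    parts.append(f"<sources>\n{src}\n</sources>")
    # The user's prompt goes at the bottom.
    parts.append(f"<user>\n{user_prompt}\n</user>")
    return "\n".join(parts)
```

Because the compiler is plain code, model-specific quirks (like repeating a JSON schema at the top and bottom for some models) can be handled by conditionally emitting extra sections rather than by maintaining separate prompt templates.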
that makes them work better?

Kirk: I think it's a bit of both, but I learned it from how Cohere documents it: they say, here's our preferred prompting format, using XML to create sections. And when I backported that, I tried YAML, I tried JSON, I tried a couple of other things. I don't know if I was a little surprised, but it's probably trained on a lot of data like that. What I saw was that we didn't have to change our method depending on the model as much; it was a common thread. Pretty much every model I've tried listens to the prompt very well when it's structured like that. Could I do it as YAML instead of XML? Maybe. I found OpenAI's models are definitely more resilient, and you can give them more options for the structure, but Cohere would then have a downside, at least with their original models, where it wouldn't listen to the structure. What we were looking for was something that worked across everything and could be a standard approach for a given use case.

Sam: Are you using multiple models and orchestrating them, trying to identify what's best or cheapest for a given step? Are you defaulting to a single model that you like best? How do you think about the model space?

Kirk: We let developers pick. We have what's called a specification; it's a preset where you can pick your model, your token limit, system prompt, all that kind of stuff. Depending on what conversation they're having, they can use a different specification for, say, JSON extraction than for having a conversation. So we hand that choice to the developers that use our platform. We're currently using GPT-3.5 16K as our default if they
don't pick anything else, but I'm actually evaluating moving over to Haiku as our default. We'll probably do that before the end of the month, just because I've really liked that model from a price, performance, and quality standpoint. So I'm thinking that's probably going to become our default.

Sam: You've mentioned a couple of things that have given me the picture of building blocks that a developer might conceptually assemble their application from. Are you presenting them as building blocks? Is there a library of building blocks, or are these just the natural steps someone goes through? How does the product present itself: client libraries, APIs, a GUI?

Kirk: It's an API-first platform. We have a developer portal; you can sign up for free, get an API key, and start using it immediately. We have a GraphQL API, which is the native API, but we actually just released, this week, native SDKs for Python and TypeScript (for a Node backend). They hide the complexity of GraphQL, so it's a simpler, type-safe experience, which we wanted. I just updated a bunch of Streamlit apps we have to use the SDKs, and it's been great to see the developer experience. On our homepage now it's two lines of code to ingest a website and prompt a conversation. We've abstracted it down to that.

Sam: This is an interesting point. A lot of what we've spent the last 40 minutes talking about, the developer doesn't have to think about. This is stuff you're doing under the covers to enable the developer to pass you a URL and then be able to prompt against it.

Kirk: Yeah, and I think this is a really big difference in how we approach this versus a lot of the open source projects that have been
around and have really led the way for RAG awareness, but we take a more content-first approach. It's almost like a CMS: you're just putting data into this content management system, and anything you want to change is configuration. We have a workflow object you can create that says, here's how I want you to ingest data, here's how I want you to prepare it, extract from it, enrich it. Then all you do is say, point me at a website and use this workflow. What we're really pulling from is the configuration-as-code model, like GitHub Actions: here's your workflow, it's static, and as I check in code it's just going to use that and build it. That's the approach we took.

Are those workflows predefined, or are they defined in code by the developer?

The latter, but we have built-in ones too. If you do nothing else, we'll transcribe your audio and put it in the vector database; you don't have to do anything. A lot of it is opt-in: do you want to change your transcription model, or do you want to use GPT-4 instead of GPT-3.5 for summarization? They're hooks, or tools, you can use to insert yourself into the process, and you can configure all of that. They're reusable, so a developer would probably have a set of them for their application, maybe one for data extraction and one for conversations, reusable across the platform, and as they ingest data they'd say, okay, use this workflow. That's really what I was thinking about last year: it's not building blocks per se, because we have the building blocks already connected, but you can turn knobs on each of them. It's an opinionated approach, and we're a bit different in that way, but we feel it makes for a really nice developer experience, where it's really simple to get started and then you can tune. You're not having to think about cloud infra, you're not having to think about the LLMs; it just works.

We jumped right in and started talking about RAG, and I think for a lot of people the canonical use case is a chatbot: I issue a prompt or question about a document or some piece of content, and I get a response based on knowledge the LLM has beyond its pre-training corpus. But you feel strongly that there are interesting use cases beyond chatbots. Talk a little about the ones you're excited about and how you've seen them play out.

For sure. I think the chatbot or copilot experience is the first order: it's direct, you're seeing RAG put directly on a page, which is great, and there's a lot of value there. But what we're seeing is RAG as a pattern, almost like SQL for unstructured data, a way to format data so you can use it for so many things. Take content repurposing: we have something called publishing, where you can point at the data, filter to create your subset, summarize it, and then publish it in a new format like a blog post, a long-form article, or social media. We really see that content-publishing angle as the second order of how you can use RAG as a piece of functionality to deliver more value, because it's using RAG in multiple ways: it's finding all the data, it's using LLMs to do different things, and then creating things like audio summaries. For example, I have a feed on
my email, and every morning it reads through the email, summarizes it, creates an ElevenLabs audio summary, and posts it to Slack. You can do that with three or four API calls with us, and that's the kind of thing I really want to see developers build with those building blocks. That's what excites me: being able to repurpose content in new ways. We're even looking at dynamic image generation, using LLMs to create the text to put into an image template and generate marketing copy or marketing graphics.

If you think about the content generation use cases that have played a big role in driving LLM and image generation popularity, they're static in the sense that you stuff things into a single prompt to direct the creation of some content asset. If I want to write something, I might give it a bunch of resources stuck into a prompt. What you're describing is that, just as we might use RAG to create context dynamically for an interactive system, we can also use RAG to create context dynamically for these generation use cases. So you've got a prompt that dynamically drives the generation of a context that results in a document, an audio or video piece, a blog post, an image, whatever.

Yeah, and we're looking at integrating with things like HeyGen for video clips with avatars; that's the kind of stuff I think becomes interesting. And the one thing is, when you publish, it becomes a piece of content in the system: we re-ingest it, so it becomes searchable automatically. Eventually, and we've been waiting for feedback on this, we could auto-publish, put it in your Google Docs for you, or put it on SharePoint, or whatever people want; that part's not hard. I could see that being the connective tissue of dynamic content generation. And I'll just bring up agents for a quick sec: once you have that piece of content, you could run an editor-agent workflow, like, hey, I published this; use this LLM prompt, edit it, and write it back into the system. You could have all these editor relationships, or: go generate me a marketing graphic based on this piece of content and paste it back in. That's why I say we're kind of a CMS with LLMs built in. I think that's the long-term value: it becomes this ecosystem for content generation that may be interactive or may be offline, and it can work both ways.

And I'm taking that agent discussion as a future-direction thing, something that's possible based on the foundation, not something you've gotten very far with today?

We have one thing we call alerts, which is like a very simplistic one-step agent. It's basically the Slack alert of my email: a two-step, summarize-and-publish and then notify. And we're looking around; there's CrewAI and AutoGen and all these different things that are great. Do we integrate with them, or do we try to do something of our own? I think there's so much innovation there. The one thing I will say I really like: I've worked a lot with actor models in the past, dynamic workers that interact, and I've started to hear on more podcasts that there's a lot of learning people can take from the distributed-systems actor-model world and apply to agents, and I think the ones that I like the best are kind of
learning from those. It's a solved problem in a lot of ways, how to spin up things like durable entities and that kind of thing. I just hope we don't try to reinvent the wheel there. That part of it should hopefully be the easier part; it's really the memory for them, and the workflows, that will be the more innovative part.

It's making me think about message queues and message passing, all this infrastructure we've already figured out for distributed systems. We don't necessarily have to reinvent all that, and I don't know that a lot of the stuff I've seen has reinvented it in a good way; it's all single process. Let's not start there.

Well, today we're an event-driven system; we're built that way from scratch. Alerts and feeds are actually kind of agents already: they're a daemon process sitting there doing things on a periodic basis and running different events. We actually use a thing called Durable Entities from Microsoft Azure, which implements the actor model with state, and it's like AWS Step Functions in a way, where you can run different steps and define a workflow. To us, that's our agent model: we can construct those dynamically and let Azure run them. They handle all the retries and provisioning, and it just works. So if we come out with something agent-like, it's probably just going to be a layer on top of that, because it's a nice pattern already.

Is it open source? Is it a service? On-prem, off-prem?

We're a cloud platform-as-a-service, so it's closed source today. The SDKs are open; we just open-sourced the SDKs on top of it, and all the sample apps are open source. But we take advantage of a good number of Azure backend services, so we don't do on-prem today. We're going to release in the Azure Marketplace this quarter, so you can essentially have your own sandboxed ecosystem of Graphlit running in your own Azure subscription. We do have ingest from all the clouds, but currently it's an Azure-native service.

Is multicloud a priority, or is it wait and see?

It's wait and see. We could have gone the Kubernetes, really abstract, multicloud route, but there were reasons I thought, okay, if we lean in on one cloud, we can get a lot more benefit. The managed databases we're leveraging, especially, solve a lot of problems for us. I've seen other companies struggle with their Kubernetes infrastructure and everything on that side, and we decided to build on the shoulders of a couple of those services. There's still a lot we do on the ingest pipeline side, but I'm not worrying about managing a database right now. We're still a somewhat small company, so it's a question of which basket you want to put your eggs in at this point.

It's been great to see the progress you've made and how it's evolved over the past two or three years since you first spoke with me about it, and I'm looking forward to seeing how it continues to evolve.

Thanks so much. There have been so many interesting topics around RAG. I think with graph RAG we're still in the early days, and that's what's exciting about it. We hope we have the underpinnings of it, but I think we're still going to learn so much more over the rest of the year.

Well, thanks so much, Kirk, for jumping on and sharing a bit about what you're working on.

Yeah, happy to be here. Thanks so much.
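The XML-sectioned prompting approach discussed in the conversation can be sketched in a few lines of Python. The section names below are illustrative assumptions, not Cohere's or Graphlit's actual format; the point is that one structured template can be sent to different models unchanged.

```python
def build_prompt(instructions: str, sources: list[str], question: str) -> str:
    """Assemble a prompt with XML-delimited sections, so the same
    structure works across models without per-model rewrites."""
    # Wrap each retrieved chunk in its own <source> tag inside <context>.
    context = "\n".join(f"<source>{s}</source>" for s in sources)
    return (
        f"<instructions>\n{instructions}\n</instructions>\n"
        f"<context>\n{context}\n</context>\n"
        f"<question>\n{question}\n</question>"
    )

prompt = build_prompt(
    "Answer using only the provided context.",
    ["Graphlit ingests unstructured data into a knowledge graph."],
    "What does Graphlit ingest?",
)
print(prompt)
```

A template like this is what lets the retrieval layer swap in different context chunks per request while the model-facing structure stays constant.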
Info
Channel: The TWIML AI Podcast with Sam Charrington
Views: 3,668
Keywords: AI, artificial intelligence, data, data science, technology, TWIML, tech, machine learning, podcast, ml, RAG, retrieval augmented generation, Kirk Marple, Graphlit, GraphRAG, unstructured data, knowledge graph, entity extraction, LLMs, large language models, data ingestion, metadata, vector databases, graphs, semantic search, dynamic prompting, embeddings, API, cloud platform as a service, graph databases, retrieval evaluation
Id: v4sAeY06ngs
Length: 46min 52sec (2812 seconds)
Published: Mon Apr 22 2024