‘It’s Like ChatGPT And Wikipedia Had A Baby’: Perplexity AI’s CEO On Building Personalized Search

Video Statistics and Information

Captions
So — Perplexity. How many people use Perplexity? Is it on your phone right now? That's got to make you feel good.

Thank you, yeah.

You don't meet somebody every day who reached unicorn status in less than two years, and that is definitely a sign of the times in AI, so congratulations. We'd love to hear your version of the founder journey, because it's been a very, very short timeline, and your stepping stones include Google and OpenAI, and now your startup — and before that, Berkeley. Compress that down: your journey to get to where you are today.

Can everyone hear me okay? Firstly, you said it right: the whole of AI is basically running on exponential time, so regular time feels logarithmic relative to the speed at which AI is moving. So yeah, it feels amazing — exhilarating and stressful, all at the same time. There's this phrase, "uncomfortably exciting," where you feel the adrenaline all the time and you feel like you'll run out of business any day, but at the same time your business is growing really fast. That's what it feels like: inflection points every single day.

Wow. So, there's tons going on in the tech stack. But I'll start with a quick story of my own journey with Perplexity. I had a crazy surgery just a few months ago, and I was sitting there in the hospital before and after the surgery, and I could barely see my screen. The only thing I could use was Perplexity, because first of all, I'm one of those people who's always worried: is what the doctors are telling me true? If these are my blood reports, what do they mean? So I was using Perplexity all the time — nice large fonts, giving me a five-sentence summary of what I needed to know. So I encourage everybody to look at Perplexity. The magical aspect of Perplexity, for those of you who have tried it, is the
summarization — less hallucination, with links out there.

To get a little more technical: you're using underlying large language models right now, but what do you see going forward? Are we going to see the Transformer continue to win out over the next two years? Yann LeCun is going to talk about joint embeddings with JEPA; of course there's the progress in RAG — I know you have certain types of RAG and fine-tuning that happen on the fly. Tell us a little about what's under the hood, and what allows you to keep this moat ahead of everybody else.

Right. So Perplexity is not bound to the Transformer or to any one model. It's more a software system that takes an existing state-of-the-art language model and a state-of-the-art RAG system and orchestrates the search and the LLM together into an answer engine. LLMs — whether Transformers, state-space models, or any other architecture — are amazing reasoning engines: you can prompt them, you can instruct them, you can basically program them to do amazing things. You want to harness the reasoning power of these engines together with the facts that exist in the form of regular search indexes, which have been built over the last two decades, and combine those two powers to create a new entity called the answer engine. It's like ChatGPT and Wikipedia having a baby together — and that is Perplexity. Why? Because regular ChatGPT lets you chat with this really general reasoning model, and it feels amazing, but it doesn't always get things right. Wikipedia has almost all the accurate facts in the world — extremely well peer-reviewed, with lots of citations and references you trust when you read it, and you can learn about any topic — except it's not personalized to you: the same Wikipedia page is written for everybody. You might be reading the Wikipedia page about the Oppenheimer movie and only be interested in certain things, like the actor; someone else might be interested in how it was filmed. You shouldn't have to read through the whole thing — it's just not the best experience. What would it look like if we combined the two, took the world's knowledge, and gave it to you in a personalized, conversational manner, at your fingertips, everywhere? That's what Perplexity is.

Share maybe a little more about what's under the hood, then. Clearly you're still using somebody else's APIs.

That's right. The user doesn't care whether you use your own models or OpenAI's or Anthropic's or Mistral's or Meta's — they really don't care. When they come and use the product, they just want good answers: personalized, fast, accurate answers. So our models are not our moat. Our moat is how we orchestrate many different models in one single product: lowering the cost without compromising on accuracy, making sure the answers are as accurate as possible (so, lower hallucinations), making sure the latency is really good, and making sure the answers are presented in a readable manner. We can still do a lot here; we're not there yet. Nobody wants to read a big chunk of paragraphs for any question they ask, so you have to go above and beyond to understand the user's intent and present the answers differently according to the type of query. Nailing all of these things together cannot be done with one giant model doing everything. Instead you have to really understand how to work with different types of queries, and what the answer format should be, using a family — an orchestra — of models: one model to understand what type of query it is; one model to reformulate and expand the query into multiple queries and go collect pages relevant to each of them; one model to aggregate and summarize all the content you picked; one model to chunk all the documents into different embeddings; and another model to suggest the next questions to ask. All of these are working in parallel when you ask a question on Perplexity — and you don't care. That's the thing: you don't care what happens under the hood when you're using the product. It should just work, it should be fast, and it should feel good. It's the same as that famous Jeff Bezos line: internet or no internet, it doesn't matter — what matters is the customer experience. Same thing here: GPT or Claude or Llama or Mistral, it doesn't matter; what matters is that the answer is accurate and readable.

That's great advice, especially for all the people doing startups — and there are a lot of them around here, as you would know. One specific follow-up on that: if somebody were starting a company today, a year or two from now do you see Perplexity, or these other startups, operating on open-source foundation models or calling APIs like GPT-4?

It's not a dichotomy. If you start a company and you want to build a product and get it into the hands of people, do the easiest possible thing: take someone else's API — whether it's GPT-3.5 or Haiku or the latest Mistral model offered as an API by any of these providers, it doesn't matter. Just take it, get the product into the hands of people at whatever minimal cost, and collect a lot of data: see where it's failing and where it isn't. Then go collect a dataset that addresses those failures — and then you have to go back to open source, because that's what lets you controllably fine-tune things. There's only so much you can do with prompt engineering. The most capable models, like GPT-4 or Claude Opus,
are really amazing — just prompt engineering gets you the solution — but that's still pretty expensive. We're all supposed to be building businesses here, and to do that you have to be profitable at some point, so you cannot keep spending so much on these really large frontier-model providers. You have to go back to open source and specialize the experience to what your product is supposed to do, and that lets you achieve better cost efficiency.

That's Mark Gorenberg, the chair of the MIT Corporation — and as soon as you said that, he started taking notes. So you know what you're saying is worth remembering.

Thank you.

On that note — you know the difference between open source, open weights, and open APIs. If you start thinking not about B2C but B2B, and especially about creating startups or solutions for enterprises, whether on-prem or privacy-preserving, how do you see that? Right now you have a B2C product, but companies like Anthropic are really targeting B2B products.

We will go there too. First of all, a lot of people tell us they love using Perplexity Pro, the subscription plan, but they're banned from using it at work. Microsoft today bans Perplexity for all their employees — because they want everyone to use their own products. It's not a new thing; they've done that to Google in the past. But in general, there's skepticism from every employer about letting people use these AI answer bots or chatbots at work, even though they know their employees benefit massively from them in productivity, time saved, and how fast they can get work done. So for any answer bot in the market today, if you want it adopted by as many people as possible — especially when the value provided is much greater during work time than personal time — you should figure out a way to be adopted in the enterprise. That requires you to work on compliance, security, data retention, and even having self-hosted versions of your product on the client's own infrastructure. All of these things should be worked on, and we will work on them too. Then there's the other part: if you're not the foundation-model provider yourself, you need the right SLAs with the provider to make sure they delete the data periodically — both Anthropic and OpenAI provide this — and if you're going to have your own models and let others use them, you should provide it too. Those are the current ways we're working on it. Fine-tuning — taking someone else's model and fine-tuning it on your own data — carries an even higher level of requirements: you need to make sure the data warehouses are secure. This is where companies like Databricks or Snowflake are pretty well positioned, because they deliver the full end-to-end package to the customer. We're not going to work on that market yet.

Can I go back to your founder journey for just one second? Your company is growing like this, and your business plan is very, very aggressive — and I don't usually associate that with PhD backgrounds; you associate it with people who drop out after freshman year. What is it about you and your journey that got you on this trajectory? Because there is a version of the world — which I think you very likely believe in — where people don't do much Google searching three years from today; they just talk to Perplexity.

First, going back to the PhD, I just want to mention one thing. I applied to MIT and Berkeley. Berkeley admitted me; MIT rejected me. I went and told my mom, "Hey, I got into Berkeley." She said, "What is Berkeley? Did you get into MIT?" I said no, I didn't get into MIT, and she was so disappointed. I told her, "Hey, Berkeley
is also a good college, you know." And she said, "I don't even know what Berkeley is; I only know MIT." That's the extent to which MIT is popular. And I'm very happy to be at MIT for the first time.

As a professor, I always tell the students here: although there are many examples of people dropping out and starting Microsoft and Facebook and so on — be cool, stay in school. Because if you look at the current AI trend, most of the leaders have finished their PhDs.

That's true, that's true. So, going back: what does a PhD prepare you for? I think it prepares you for taking risks — that's a common feature between doing a PhD and doing a startup. When you're a new student, there are lots of low-hanging-fruit ideas that the senior students in your own lab have ready-made for you, because they don't have time to work on them and they want somebody else to help. You can get four or five papers published in a year if you just follow the instructions of your professor or your senior grad student, and generally all the advisers want their students to do that — to feel productive, to get some wins, to settle into the new place, and then take the risk. That's just established wisdom in grad school. I did that too: the first paper I wrote was someone else's idea, not mine. But I didn't like it; I wasn't enjoying it, and I felt, okay, this is the time to take risk — why am I even here? I'd rather go work at Google Brain or some other research lab if I'm not taking risks. And the thing about taking risk in a PhD is that your adviser doesn't advise you anymore; he's like, "Yeah, go do your thing, and if it works out we can consolidate and figure out a paper." So, for a period of about eight months — I was an intern at OpenAI in 2018; my PhD had started in RL — I saw this guy Alec Radford build GPT-1, the first version of GPT. It's amazing to think that was just six years ago. At the time, everybody ridiculed it; nobody even understood it — "oh, it's just yet another autocomplete." But some of us knew this was real. So I had to go back to my adviser and say — you know the whole "if you're in crypto, pivot to AI" thing? That was the moment for us: if you're in RL, pivot to generative models. (It was called generative models then, not generative AI.) I spent those eight months without publishing any paper, but it was truly worth it. It helped me shape my fundamentals in deep learning, generative models, and unsupervised learning, and that's what helps me do all of this today. So I think a PhD generally teaches you to take a lot of risk, to be aggressive — the whole "shoot for the fences" thing. It's not a truncated outcome distribution; it can change your life.

So that OpenAI experience is a perfect segue into my next question. You were working with GPT-1. One of our partners actually wrote a book with GPT-2, and it was terrible — but it was obvious to her that the fact that it could write a book at all was indicative of something that, a year later, would be massive. So I think your experience with GPT-1 must have given you early insight into where things are going. Now, the next iteration is multimodal, and that's only a year away, and we're going to be shocked yet again. What is Perplexity going to look like on those foundation models?

The first idea of Perplexity that I pitched to one of our first investors, Elad Gil — who's an MIT alumnus — was: you wear glasses, and the best way to disrupt Google is to ask questions through the glasses. They have a microphone, they understand what you're saying, and they give you an answer. That way, people don't have to type into a search bar, and they get answers read out to them on
their AirPods. I think this will happen — whether we do it or somebody else does it doesn't matter; it's inevitable. Search will be on the go, everywhere, as hands-free and device-free as possible, voice-based, and using vision inputs, not just text. That's what I think true multimodal search is. And then, asking questions anywhere: imagine you could talk to any device, ideally — it doesn't have to be just your phone. You go to a shopping mall, and near any row of dresses you can just ask questions to some device there, about anything — it's almost like a concierge, a bunch of concierges everywhere. Or you can point your finger at something and ask questions about it. I think all of these things are possible. We have to reimagine hardware along with all these new models, and that's what I'm excited about — how Perplexity can play a role there.

So this is sort of the Tony Stark — you know, Iron Man, also an MIT guy — the Tony Stark vision of being able to interact with and augment your world. This does require — I mean, you described Perplexity as really an orchestration engine that brings all these models together, and once it's multimodal, in the physical, dimensional world, it's only going to get more exciting — a lot of that will be dependent on hardware. On one end we're seeing Blackwell and trillions of transistors; on the other hand, if it's mobile and augmented, it's going to be edge computing. Given that, share your thoughts on the scaling laws — a trillion parameters, a hundred trillion parameters — and also on companies like Cerebras and Groq. What's the general landscape for compute and hardware?

I think NVIDIA hardware is going to be the best for training; nobody can compete there at all. Inference is where it's more subtle. Right now, you'd think that since the H100 is a bigger beast than the A100 and likely better for training, for inference you could use cheaper GPUs — but that turns out not to be the case. It turns out you can pack more density per chip, because the chips are more powerful: instead of buying 64 or 128 GPUs to serve more users, you can buy something like 24 H100s instead of 128 A100s and serve the same number of users, because you get higher throughput with fewer chips — and you spend less, too. It helps you scale better in terms of server racks to use more powerful chips and fewer of them, if you're doing inference in the cloud. But if you want to do inference on the device itself, that's where I think NVIDIA doesn't necessarily play a big role — they have a lot of edge GPUs too, but Google has TPUs and Apple probably has the best hardware for on-device. At some point you have to build custom compilers if you're building your own hardware there, and it's not clear to me what the software consolidation is going to be — I don't see CUDA having a massive moat for on-device inference. So that's where I think some people can build new companies. As for Groq, I think what they're doing is very exciting. Even if they're not replacing NVIDIA for inference, they're definitely going to bring prices down — otherwise NVIDIA's margins are going to stay so high that nobody can compete. So I hope it brings inference prices down when there's more competition, and that's great for everybody, because we can scale our products to more users without having to pay as much.

I mean, some of these hardware companies are four to five times better than NVIDIA in price-performance for training,
and even more on inference. So, do you see one dominant player emerging — it has emerged in training — do you see dominant players emerging in inference or edge devices? You mentioned Apple, but setting that aside.

If you decouple inference into cloud inference and on-device inference: there's not going to be one dominant player for on-device. But for cloud inference, even today I would say NVIDIA is the dominant player, and for other people to compete there, they need to move really, really fast on the software that helps people deploy on their hardware. That's what's lacking today — everybody still works on NVIDIA software; they use TensorRT for inference. And NVIDIA is moving really, really fast — that's why they're our investor too. We work with them on TensorRT; we work with them on fine-tuning infrastructure like Nemotron. They're just moving really fast, so whoever wants to compete there should move equally fast on the software stack.

We only have time for one more question, according to John there, and I really have to ask you this, because I was looking at your LinkedIn profile. Your founder journey makes it seem like: PhD, OpenAI, brilliant idea, Perplexity. But somewhere in there you ended up seed-investing in Mistral, Cognition Labs, Groq, ElevenLabs. There's got to be a story there — how did that happen?

Someone on Twitter said I just invest in hype companies — actually, that's what I do. I don't have any time to evaluate all of them. Sometimes they ask me to invest, and if the products are good and feel good to use, I just invest. There's no time to evaluate — what's the business model, what's your moat? — because if somebody did that to me or to Perplexity, it would be ridiculed fast: how are you ever going to compete with OpenAI or Google? For every startup, you can come up with thousands of reasons why it wouldn't work. If you only focus on that, you're never going to be able to invest in any of these companies. Rather, you focus on: what does the successful outcome look like in case it works? And if that seems pretty big, and the downside is not that high either, you should just go for it. These are asymmetric bets.

That makes it sound like all these insanely awesome startups came to you. Did it all happen in the last 18 months?

We share an investor with ElevenLabs — Nat Friedman, one of the most popular AI investors — and he connected Mati and me. And we are customers of ElevenLabs: Perplexity's voice features and our Discover Daily podcast are all powered by ElevenLabs voice, and we work very closely with them on voice-to-voice. For Groq, we got connected through Chamath; he's an investor in Groq. Mistral was similar — I knew some of their founders before, and we use their models, like a lot of people here. So there's synergy between what we do at Perplexity and these companies: if Groq works out, we'll do inference on Groq; we use Mistral's models; we use ElevenLabs' software. There's a lot of synergy between my investments and how Perplexity can make use of these companies' services or products.

I just want to point out how mind-blowing all of what you just said is — all in the last 20 months. The founding of the company, the unicorn valuation, the investments — everything you just described. That is completely unprecedented in world history, to be moving through life at a pace like that. And I know we have 30 seconds left: what would you share as the top three sectors, or even top three startup ideas, that folks should be thinking about?

I think on-device would be really cool — I'm learning more about it myself. Somebody was telling me yesterday that if they were to start a company today, they would really, really work on a 7-billion-parameter model that's as good as Claude Sonnet — somewhere in the GPT-3.5-to-4 capability spectrum — and that model could let you control the phone, control the OS. That's the sort of company you'd expect OpenAI or Anthropic not to over-invest in, because they want to go after the most general, extremely capable reasoning model that lives on the cloud and can do everything. But this is the sort of model that will help you build a lot of value — every consumer can just use it. Some of you might have seen MKBHD's criticism of the Humane Pin; while it was harsh, there's one major point there: the latency is very poor, and that can only be addressed if the models live on the device.

Fantastic. Absolutely an honor and a pleasure. Thank you.

Thank you for coming.
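The "orchestra of models" flow described above — one model to classify the query, one to expand it into sub-queries, parallel retrieval of pages for each, then aggregation and summarization with citations — can be sketched roughly as follows. This is a minimal illustration only: every function name and the canned stub results are hypothetical stand-ins, not Perplexity's actual implementation, and a real system would call LLM and search-index APIs where the stubs return fixed strings.

```python
"""Minimal sketch of an answer-engine orchestration pipeline (hypothetical)."""
from concurrent.futures import ThreadPoolExecutor

def classify_query(query: str) -> str:
    """A small classifier model would label the query type to pick an answer format."""
    return "question" if query.endswith("?") else "navigational"

def expand_query(query: str) -> list[str]:
    """A reformulation model would fan one query out into several sub-queries."""
    return [query, f"{query} explained", f"{query} examples"]

def retrieve(subquery: str) -> list[str]:
    """A search index would return relevant pages; here, canned snippets."""
    return [f"snippet about '{subquery}'"]

def summarize(query: str, docs: list[str]) -> str:
    """An LLM would write an answer grounded in the retrieved snippets, with citations."""
    cited = "; ".join(f"[{i + 1}] {d}" for i, d in enumerate(docs))
    return f"Answer to '{query}' based on: {cited}"

def answer(query: str) -> str:
    kind = classify_query(query)          # 1. understand the query type
    subqueries = expand_query(query)      # 2. reformulate / expand the query
    with ThreadPoolExecutor() as pool:    # 3. retrieve pages in parallel
        results = pool.map(retrieve, subqueries)
    docs = [doc for page_list in results for doc in page_list]
    return summarize(query, docs)         # 4. aggregate and summarize with citations

print(answer("what is an answer engine?"))
```

A production version would add the remaining stages mentioned in the interview — chunking documents into embeddings and suggesting follow-up questions — as further specialized models running alongside the retrieval fan-out.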
Info
Channel: Forbes
Views: 67,643
Keywords: Forbes, Forbes Media, Forbes Magazine, Forbes Digital, Business, Finance, Entrepreneurship, Technology, Investing, Personal Finance, AI in search engines, AI, artificial intelligence, how do search engines use AI, the future of search, the future of search engines, Perplexity AI, perplexity, AI of the future, the future of AI, how to make money in AI, how to make money from AI, AI entrepreneurs
Id: w8M76fuyn8o
Length: 25min 9sec (1509 seconds)
Published: Mon May 20 2024