Okay Rabbit R1, You Have Our Attention!

We have one more thing to talk about. Surprisingly, we've dedicated the entire last section of the pod to just one thing, because it's captured our attention so impressively that we have so many thoughts about it. Now, it was initially teased on social media. I'll just say it's called the Rabbit, right? So this is this AI device assistant thing. I think it's called R1. Sorry, the company is called Rabbit, and we didn't know what it was called, but it was some Rabbit AI thing, and now we know it's called the Rabbit R1. Which is like Rabbit... Rabbit One. R... well, it's like the, yeah, the Essential Phone PH-1. But that was "phone," PH. Yeah, whatever, it's R1. The chip in the Vision Pro is an R1. No, it's the M... M2 plus R1. That's the other one, I forgot about that one. R1, what is the R1? Oh, the R1 chip, to process visual data. We're off topic.

Anyway, it's the Rabbit R1, and what the thing is, is a piece of hardware that has an AI built into it that can do lots of things based on natural language input: lots of helpful things, lots of things that would normally take you a bunch of taps on a smartphone, lots of things that you might want to perform on a regular basis. It has a microphone, it has a speaker, it has a camera built in, it has a little screen on the front, and it's a little Rabbit AI that does all these things for you.

Now, my first exposure to it, and I think yours was too, was a video of a bunch of people holding a blurred item, because it wasn't revealed yet, giving it prompts. Mhm. I'm going to actually play that video so you guys can hear some of the insane things they were asking this thing to do.

Order me an Uber and find me a good podcast to pass the time. Oh, and tell everybody that I might be late. That was delicious. Check the fridge and order the ingredients to make that again tomorrow. Create a route that works with my goals, then start the best playlist to keep me motivated. Watch what I'm doing here. Process all my new photos today, just like this. Find us a nice
restaurant near here, then get us there.

So you get the idea. It's very focused on natural language input, and then it's just a super helpful thing that carries out the task for you. Now, I've been thinking about AI assistants a lot lately, just because we know Google Assistant is going to get better, we're waiting for Siri to get better, and we already know Alexa and stuff are getting better. And I think the idea is that we want them to be good at everything a human assistant would be good at, right? Because most people don't have a personal assistant. So if your phone can be as helpful as possible, and you can just tell it what to do and it can do it, that would be the ideal solution. But they've never quite gotten there. You have to speak in a certain way. When you wake up and talk to Alexa, you have to say, "Tell me what the weather is in this city for this day." There are a lot of things you have to be specific about, and you have to know how to prompt it in order to get it to do exactly what you want. Yeah. And a lot of other things it just straight up can't quite do. And so the idea behind why I think people are so compelled by this is: wouldn't it be cool if I could just talk to it like a human, and it could just figure out what to do with AI and do it? I think that's super cool. I also think it's super difficult, so I'm curious how well this thing is actually going to work in real life.

And now we have a 25-minute keynote with a whole bunch more examples, and it's very much, you know, an Apple-style, Steve Jobs-esque, black background, co-founder on the screen telling me smartphones are $700 and you have to tap the screen eight times to call an Uber; the thing we just made is $200, and you just tell it to call an Uber and it does it for you. And the proposed solutions to all the problems make perfect sense. I'm just very interested, and probably a little skeptical, about how good it will be at doing these things without just turning into a phone. So I'm curious what
you guys thought when you saw the keynote and the announcements, and what this Rabbit AI could actually be in maybe best case versus worst case.

Yeah, so the promo video that you showed, especially because they were not showing the actual device or how you were going to interact with it in a touch way, was kind of funny, because she just says "call me an Uber." She doesn't say where, which is kind of a problem. Hypothetically she should be saying "call me an Uber home," yeah, or something like that. And I was just going to say, at the end she says "and tell everyone I'm going to be late." Imagine telling... "tell everyone," just, like, your whole contact list: "I'm going to be late."

And I think with that, even the believers would say it's got to be context aware. It knows where you are, it knows that you just finished some appointment, and it knows you're going to go home. So when you say "call me an Uber," it knows your location from GPS and it's going to know to call you an Uber home, and hopefully gets it right. And when you say "tell everyone I'm going to be late," it knows about your next appointment at home and it knows who's there, and so it knows to tell... like, it's going to hopefully process all of this stuff and all this context and get it right. Hopefully. It just seems like there are so many things it has to get right.

Yeah. They call this a large action model, because with large language models the foundation is a transformer, and the best part about a transformer is that it understands the context of words, so you don't have to use extremely specific words to trigger the assistant in the right way and interact with it in the right way. And if you've ever had someone who's not super tech-savvy try to interact with a virtual assistant, usually it says "sorry, I don't understand," because they don't use the right trigger words. Exactly. So theoretically, the best part of a large language model that you would interact with using your voice is that you could basically say anything, and then
it would be able to do it. Now, the next step of that is what Rabbit's trying to do here, where you say things to it, it understands the context of what you're saying, and it can act on those context clues. Yeah. That's clearly a very big leap forward, because right now all you can really do with these large language models is talk to an assistant and have it chat back to you. Talk back. So actually being able to interact with your apps is very interesting. And the reason that in the commercial they had her go "call me an Uber... oh, and, um..." is that they had her do that on purpose, because it shows you that this is the way that humans actually interact with each other, and you don't have to use very specific keywords to make it work.

The way that you plug in your apps is really interesting. There's a whole setup. Yeah, you can log in. So it has a little LCD screen that kind of looks like a little phone. It kind of looks like a, um, what is the... the Playdate. The Playdate. It's made by... right. And it's a bright orange design, and it's got this swiveling camera and all this... it's beautiful. It's beautiful, but it is the size of a phone. Yeah. Well, not a whole... like, it's like a Galaxy Z Flip. It's big enough to be... like, the Humane Pin's whole thing was that it's a small thing that connects to you. This AI hardware is, like, it's not your phone, but... they specifically said this is not getting rid of your phone, but it still is a phone-sized object that you're carrying around, probably exactly how you're carrying your phone around. Right. So I just wanted to throw that out there: yeah, it is about phone size. Marques, you're going to need more pockets. Yeah, I've run out of pockets now.

Yeah, I mean, I have to give them some credit. There are things about this that actually kind of surprised me, where I said, okay, that makes a lot of sense. He kept mentioning that the phone was supposed to be a thing that saved you time, but at this point it's become a thing
that just distracts you. I have so many thoughts on that. Yeah, I mean, I don't think phones are the problem. I think addictive apps are the problem, things like social media apps, and maybe either just using your reduced screen time thing or not having your social media apps on your phone could do this. But the big question here is, do you need a dedicated hardware device, or could this one be an app? Or, too, won't most of these features probably get added to Google Assistant and Siri within the next year? Within the next year.

Yeah, he was very adamant about saying that we've been in this ecosystem that's app focused, and that using apps always consists of multiple touches and different drop-down menus and all sorts of different things, where if you can just speak what you want to do, it's much easier to do that than to know exactly how to use every single app. Which kind of feels like... you know, our generation and the generation under us are very good at controlling apps, so it's one of those weird things where this sounds like it's trying to... we've talked multiple times before about large language models being great as assistants, because then people can talk to them with normal words, and somebody troubleshooting something doesn't need to call us, they can just ask normally. That kind of sounds like what they're pointing to, but it's also new technology that... yeah, I don't know.

But, um, where do I start? There's a lot there. So a lot of people were saying if this was an app nobody would care about it, so they had to make it a hardware product and they had to make it really cute and fun, and that's why they got Teenage Engineering involved. Which makes sense. I mean, fair. I was looking into this company. They secured their Series A funding round in October, two months ago, or three months ago. They secured their Series B funding round in December, one month ago. So this is an extremely new company that's breaking into
the scene really, really fast. We were joking yesterday, it's literally the opposite of Humane, where Humane hypes their product for two years, kind of pivots the point of their product because they were making it before ChatGPT even came out, and then they added AI to all their stuff afterwards. Yeah. Whereas this is, like, $200 versus $600. There's no subscription required, although it does have a SIM card tray, so if you want it to have access to stuff all the time, you do have to pay for another cellular plan, which sucks. Mhm. Or I guess you could tether off of your phone or use Wi-Fi more often, but you're going to want to be able to use it whenever, wherever. Yeah. So, yeah, the Humane Pin is also a projector under your hand, which is a whole bunch of other... a whole bunch. The hilarious part about the Humane AI Pin is that the Rabbit AI's entire thing is there are too many taps involved to do something on your phone, and the Humane AI Pin takes even more gestures to do a single task. That's fair.

Okay, so I have a bunch of things to talk about. I think the first part is that the idea that apps take too many taps is debatable. If you use the example of calling an Uber: yes, there are a bunch of different apps you can use to call a ride to go somewhere, or to do anything, and the setup process, which we're going to touch on, is you just connect your Rabbit to all of your accounts, your Spotify, your Uber, your Amazon, your eBay, everything, so it just knows with the AI where to go to do every task. So you need to call an Uber on your phone. That's: open up your phone, open the Uber app, tell it where you want to go, hit confirm, hit UberX, confirm yes I want to go there, and it goes to the GPS location of your phone. Maybe it's five taps or something, I don't know. And the idea with this, best case scenario, is you open it up and you just say "call me an Uber home." And it's push-to-talk, so it's not listening all the
time, so it's nice. You don't need a special keyword, you just talk to it like a human: "call me an Uber home." And it knows, based on your GPS location, where you are and where home is, and it just finds an Uber, and you just say confirm once and you're good. So it saved you taps, mhm, and it saved you thought, because you just talk to it in a natural way. But what are the chances Uber just improves their app, or Google Assistant improves with some API plugs, to the point where in two months it can do exactly what Rabbit can do? Extremely high chance. I think that's very likely. Yeah.

So that was the first thing that came to mind. But then the second thing is that multimodal AI is still really cool. They did this demo where you point it at a fridge and say, "Hey, what can I make with these vegetables that I stacked up perfectly in the front of my fridge?" Perfect fridge. The fridge that only uses the front four inches of each shelf. Nothing in the drawers, everything's in the front. "What can I make with this?" All vegetables and eggs, and they made an omelette. Yeah, you can always make an omelette, trust me. Yeah. But the multimodal AI thing is still really fascinating, because I think in general that's where we're all going. Prompt engineering has become less and less of a skill, and it's more just, can we make it as easy as possible for the user to just talk to it like normal and get what they want out of it? I think that's all great. I just think this could have been a $50 app. I think you're right. And it would have gotten less attention, but it would have been just as useful if I just open up the Rabbit app and just say "call me an Uber" and it...

Do you remember when SoundHound came out, before Google Assistant was any good, and it was way better than Google Assistant for the time? And then eventually it just kind of got worse and worse... because it was better at those context clues, um, and then eventually it lost relevance. Yeah, because the big players got
into it. And when I see this, I see a small device that has a camera and some processing power. It has a 2.3 GHz MediaTek processor, 4 gigs of memory, 128 gigs of storage. If I could find something that you still have to pull out of your pocket, and you still have to have your phone, you still have to charge... And it has all-day battery. All-day battery. It has a camera. If I'm trying to think of something that has internal components with better specs, a better camera, a better screen, that I'm already taking out of my pocket, that you're already charging... I could do all of this, yeah, if it had an app to tell it to do all of this. If this could be... this is just... Or if they just update the existing assistants. That's the thing. Google made a big push in, like, 2021 where they added these API plugs for apps so that you could do things with Google Assistant. I remember they sent me... Try something quick. ...they sent me those self-lacing Nike training shoes along with, like, a MyFitnessPal thing, because they launched this plug-in with MyFitnessPal where you can say "Hey G, add this to my MyFitnessPal" and it just does it. Right. So they already have those kinds of things, but the missing link right there was the better natural language processing through transformers, which will get added to Google Assistant within, like, a year. Yeah.

And this isn't to take away from some of the impressive things that they did software-wise and AI-wise, like the, uh, LAM they showed pretty briefly. I wish they explained it a little more, but just kind of how it can take action in different apps and how it's connected to that. And then also they did the "watch me do this" with Midjourney in Discord, which was really cool, and you can train it. But all of that could be an app on better hardware that you own already. Because if it's not replacing a phone... this can't make phone calls as far as I can tell. Nobody... I think someone said it can make phone calls. Then it's weird calling it
not a phone, because it is kind of a phone. Like, this is just a different phone, but you don't scroll an existing UI. Yeah, it's not an app-based operating system, which is what they were very adamant about. Remember when we got new launchers every couple of months? This would have been a sick launcher. This would be a sick launcher on an Android phone, if it's just a Rabbit launcher and you open it up and it's just a full-screen rabbit, which is what they're doing. It's a full-screen rabbit, and you talk to the rabbit. You're like, "call me an Uber," and the rabbit bounces a few times and it's like, "all right, I did it." Yeah. "I used the Uber app behind you and I called you an Uber." Yeah, and now it's working.

I want to try real quick. You want me to try and call? I just want to see how close Google Assistant gets to just doing what this does. Just "call me an Uber home." "Call me an Uber to New York airport." "You'd like me to call you by the name 'an Uber to New York airport'?" Clearly the context clues are not... I'm so glad you guys laughed, because I just thought it was right and I was about to hit yes. "Call me an Uber to New York airport." "Hi, An Uber to New York Airport, I'm Google Assistant." Okay, so that did not work. Yeah.

Okay, wait, regarding the R1, I do have two things I want to say. Yeah. One: last year, which was only like two or three weeks ago, I said that the trend of 2024 was going to be colorful things, and here we are. It's, like, a bright orange. It's only one color. The best part about it... I know, it looks great. The second thing is, I feel like with this, and all AI stuff in general, it assumes a level of trust that I don't have with my phone right now. Like, I will call an Uber and still look at the map and make sure this guy is coming to me. I want to just... If your level of trust is in privacy things, because that's what I first thought... Well, okay, that's different. What you're mostly saying is execution. Execution. Also privacy was a
whole other thing, but I just mean, will this thing work? Yes. Most people don't even trust their technology to do the thing they're doing. I also think we have a natural tendency to shop a little bit. Yeah. Like, yes, calling an Uber is fine if you are not price sensitive to the ride and you're just like, "get me home, I don't care, make it an Uber." But some people also have Lyft and they'll check both. And they had this whole segment in the thing where it was like, "Plan me a trip. Plan me a vacation to London, let me see some cool sights, and let it be a relaxed schedule." And it's like, "All right, I've got your flights, your hotel, and everything." I'm like, if you trust it that much... I really don't want to just go with the first option. I'd rather shop, and I think that's a natural tendency. If you were able to just say, "call me a car home, make it the cheapest option, I don't care about comfort," and it was able to search Lyft and Uber and then figure out which was cheaper and then call it... And then you get a one-star Uber ride home because it's the cheapest. Yeah. There's a lot of stuff with shopping around. You kind of want to find that balance. An airplane, I would never trust it to just be like, "get me a flight there." And it did say "make us all sit together," so it's like, are we all sitting in the back of the plane on the cheapest flight, and I have no carry-on luggage? Or, like, yeah, you're next to the emergency door. Yeah, right next to the plug. So my phone flies...

The very interesting thing about this product is that it is a universal kind of... it's a universal product that can access any app and do anything, theoretically. So we talked a little bit about this earlier, but there's this training mode where, effectively, it's on your computer, it's like a web-based portal, and you show it a website or a service that you're trying to interact with, and you're basically training the AI on what it means to interact with that service and the way that you would interact with it. And then
once you're done training it, theoretically you can just use prompts on your phone, or not on your phone, maybe in the future on your phone, on the R1, and then it's able to go through that process for you. So in the demo that they played in the introduction video, it showed him creating an image through Midjourney, through Discord, and it recorded the session, and then when he says "create this thing," it just pulls up the image as Rabbit... it goes through the whole process. Like macros. It's like a macro. So in the off chance that one of the 20 apps that they support off the bat isn't one of the things you want to use... like, let's say they only have Spotify and Apple Music, and there's a song you want to listen to on SoundCloud. Well, then you have to train it for SoundCloud, right? Do the UI on the computer, make sure it knows what it means to go to SoundCloud to look for something, and then when you say "R1, play me that song from last night on SoundCloud," it will know what that means and how to do it, because you trained it.

I do like that as a concept, because Google in June 2023 released Google Assistant for developers, where developers can plug into the Google Assistant so that you can have an open API, or you can have it access your app and do different actions. But APIs for developers are famously... like, this happens all the time. Razer will release some crazy new RGB API and nobody uses it in their game or their application. They have a couple of partners that do it at launch and nobody else does it. So being able to theoretically access anything by just training it yourself is a very cool concept. Yeah.

Now, this is going to be hilarious to see how this actually works in real life, because if you train something to do it, and then you're out and you need it to be able to do that, and then it just doesn't work, then you've got to pull out your phone, and that's a problem. Or if the UI changes on the site or something changes... like, if there's one thing I know about
macros, it's that you spend 45 minutes setting it up, and then the next week the website tweaks something a little bit in the background and it breaks everything you built, right? That's just a thing that happens. So, yeah. Or the web page is scrolled down by 25% and so it clicks on the wrong thing.

I will say, that feature is the reason it was in my cart for like an hour, and I was debating buying it for so long. That feature could be sick. I think there are kind of two ways I see this going. One is this turns into an app on your phone, and I think this company gets bought out and some other company uses it in that. So I think that's the way it goes where it makes a ton of money. The way it goes where people buy this R1 hardware product is using the learning thing, by figuring out very specific tasks that make their workflow a little easier. Because I don't think $200 is the worst price, the product looks cool, and if you can figure out some very specific thing that you nail every time... Adam brought up, remember the Spotify Car Thing? That was, like, yeah, really dumb and no one really wanted it, but then people started finding ways to attach it to their computer as a... yes, side screen. I think people are going to find some hacky, really cool methods using the learning manual, or module, and this just being a cool-looking device. Yeah, some people are going to do some cool stuff with it.

This is a company that actually should have a store where you can upload your actions so that other people can just download them, because that would be really cool if other people could just train it for you and then you pull it down.

My question was also going to be, does this live on the hardware? Like, what happens if I buy this thing, I teach it a bunch of different macros, I'm doing the thing, and in a year or two the company goes out of business or gets bought? Then what? Does it still work? Is it just a pretty paperweight at that point? Like, well, it's
$200, which is not a small amount of money, but I think when you compare it to, like, the Humane AI Pin... They took shots. They did, a lot. They took a lot of shots. They did, directly. They showed it on screen. Yeah.

Yeah, that's a valid question. I don't know if that's planned. You know the Steam Deck... Stream Deck. Stream Deck. The Stream Deck, with a bunch of buttons, and you can map those buttons to macros? This is like a tiny, internet-connected, beautiful Stream Deck. Every time we start a new episode, I press a button and the new folders pop up on my desktop, perfect file structure, all that stuff. But there are certain things that I do with a visual component where this would be super cool, and I'm wondering if this would work. If that teaching process can actually work for people, people are going to love it just for that one function, and I think that's awesome. People will spend 200 bucks for that. Edit the whole podcast. Yeah, hit the button, learn from me editing this episode, and edit the next episode. Yeah.

Yeah, it'll be interesting to see if that ends up being the thing that most people gravitate to, but in the meantime it is fascinating to see if it can pull off any of these other really broad tasks. I think there's an analogy to the Palm phone with this device, because they don't want it to replace your phone, he says, yet. But it can do almost everything that your phone could do. I don't think it can do messaging. If it can't do messaging or calls, then that's the problem with it, I think. Then you can't... so, like, the one... sorry to interrupt a little bit. You can. But the one thing about the Humane Pin that I'll give it credit for is it still has that connectivity. I think they're trying to say, throw your phone in your backpack, like the Palm phone. You don't need all the things on your phone when you're walking around. You can get some basic things out of a small thing on your... But with this, you still need your phone in your
pocket, because there's still going to be... hell, even ordering an Uber, you might need to pull your phone out to know what address you're going to. So now you're looking at your phone and speaking into a thing in your other hand. Well, this has a screen, though, and it can probably show you. But can you, like, search up... I guess you could say, "Hey, look in my email for this address I got," like that. Yeah. This assumes there's a digital trail for everything. Like, what if you just told me about this new pizza place down the block, and I'm like, "Oh, take me to this new pizza place," and it's like, "What are you talking about?" It's got to remember everything it hears and sees, if it's poking into your email, your calendar, everything. For it to be fully context aware, it needs the context of all the things the human assistant...

They never mentioned phone calls or text messages, right? I guess it said... they never showed... They said "tell." I remember someone talking about that. But so then this is just a phone. A smaller, cheaper, less powerful phone with much better natural language and a much worse screen and camera. I'd be willing to try to use this as my phone for a week. Here's the thing, I like the idea. If this idea actually got built out and people found use in it and people could use it, I would like a premium version of this, because this is, like, cheap plastic. It'd be kind of cool to have, like, an OLED, made of metal. Yeah, I'm shocked, it looked like an LCD. Definitely an LCD. I would love an OLED.

I honestly thought you were joking and you were just going to list all phone stuff. I literally thought you were just joking when you were saying that, because that's basically what you said. If I'm like, "I just wish this was a phone"... Yeah, I mean, I might just be coping right now. If I'm just going to go to the city to have drinks with my friends, and all I need to do is be able to... I mean, I guess I use, like, Apple Pay or Android Pay, Google Pay, to be able to... it's
now... I wanted, like, pulling this up, like, "pay this cashier $5.75," and then you hold it next to the... Yeah, okay, that's the other problem. That's the other major problem, is that everything is voice based. Yeah, that's a huge problem. They do have a keyboard option that you can type in, I think. Yeah, and you have to press on it, so it is a touchscreen, because you have to confirm. Having to say what you want to do when you're around people is very frustrating. It's a lot of talking. It's a lot of talking and listening. Yeah. So it basically assumes no social media and no messaging. Mhm. And the messaging part is the thing that is going to be difficult. So it's supposed to be able to tell your friends you're going to be late, but if one of your friends replies to "tell everyone"... It will probably speak it out loud and then also show it on screen, maybe. But how would they reply? Is it a phone number? A separate... messaging number? It could be a data-only SIM, but maybe... that was not really detailed. Is it Android? Are you a green bubble? Probably. Not if you can teach it to be a blue bubble and then teach it to, like, scrape... They specifically, actually, made a slight at Beeper in the keynote, which I think was a slight at Beeper. They said, "We're not taking advantage, tricking other companies' infrastructure." I was like, wow, they're taking shots at literally... You can tell how recently all of this came together by how recent the slights are. That's the funny thing. Yeah.

Yeah, I don't know, the product seems actually fairly built out, considering they only got funding in October. But it seems very interesting. I mean, we've been talking about it for, like, 20-something minutes, so mission accomplished. Twitter has not... I've seen almost nothing on Twitter about this. I've been tagged in nothing. Not paid for yet. Some people are basically tagging me in the keynote being like, you've got to review this, this is
interesting. The thing is, if Google Assistant adds this feature, or Siri could just do this, then you just don't need another product. I don't want to charge another thing, you know? That's the main thing for me. It only lasts a day. I don't want to charge another thing, and I don't want to pay for another cellular plan just to have another device. To give them credit, it's been a long time since Siri's gotten meaningfully better. Yes. But I fully agree, it feels like Google Assistant is on the doorstep of doing a lot of stuff. This is happening at I/O. We're going to see this product launch as Google. Yeah, and it's going to be orange. Orange mode.

This feels like the prime question of "product or feature." Yes, I think so, 100%. Yeah, it's like, um, it's Clubhouse. Clubhouse. Yeah, this is the hardware Clubhouse. Wow, that sounds so mean. I'm sorry, Rabbit, I don't mean it like that. Yeah, the platform-or-feature thing. I actually think there's going to be a really cool modding community that does some really cool things with these later, and it's going to be super niche, and there are going to be a million Discord servers. I'm okay with that. I think it's cool. And since it's cheaper, I think that'll be the best part. Yeah. Probably not what you're looking for as a company with probably hundreds of millions of dollars in capital funding, but I think it's very smart of them to make it out of really cheap materials for a relatively affordable price, because it's an untested kind of product that they don't even know if people want yet, or if it'll work. And so if they can get it in more people's hands and then the ecosystem gets built out, then maybe they turn into a company where they can make more premium versions of it and stuff. Whereas Humane is going ultra premium: "We spent a ton of R&D money on the projector," you know? And the Rabbit still looks better than it. Yeah. Although they are still a bit different. They obviously are AI hardware products that are directly competing with each other,
although, yeah, one has a screen, one has a projector, one's a pin, one's half of a phone. I have a question. I'm confused. Do you think the Rabbit R1 runs Android? I think it does, probably. It almost definitely does, right? Yeah. So it is a green bubble. It's a green bubble. Almost definitely. But would you have to do it from... yeah, I don't know. Would you rather be a green bubble or never message ever? You know what's wild? There are so many questions about this product, and it comes out in March. It comes out on Easter. I'm going to try to review it. I want to try it. I want to try every single prompt that was in the commercial and see exactly what it does. Connect all my accounts and just send it. We're apparently getting review units fairly soon. I want to try it. Yeah. Not if they listen to this episode. Well, I want to give them a chance for it to work well. It could be awesome. No, I want to try it for sure. I am delighted by the idea. I just... there are so many... That's a great way of putting it. The idea is so delightful, and I would love for it to be true. I just have a hard time believing that's going to be reality. Similar to the VinFast pickup truck at CES. Mhm. I'm delighted by the idea. Yeah. How dare you say that about... We'll see. We'll see how the $200 price tag makes all of these questions far more palatable. Yeah. If this was like $800, everyone would laugh at it and no one would buy it. But, yeah. Well, we shall see. All right, we shall see.

Go back to that podcast I was just watching and hit like and subscribe, and by the way, leave a comment telling my friends to watch. "Who are you talking about?" "My friends." "What friends?" Touché. "Confirm?" Question. [Music] Confirm.
Channel: Waveform Clips
Views: 679,720
Keywords: Waveform, WVFRM, Podcast, MKBHD, Marques, Brownlee, Andrew, Manganelli
Id: eAUNvovwSlQ
Length: 31min 21sec (1881 seconds)
Published: Fri Jan 19 2024