OpenELM: Apple's New Open Source LLM (OpenAI Competitor?)

Video Statistics and Information

Captions
Apple's been on a bit of an open-source tear lately. I already covered Pkl, and it seems like a lot of y'all loved the idea of an Apple-made JSON alternative. Now they're challenging something a little bit bigger: all of the open-source machine learning models we've seen published by everybody from open-source companies like Mistral to Meta themselves with Llama. There's a bunch of things here I just never expected to see, be it a Hugging Face account from Apple or an open-source GitHub repo actually acknowledging AI/ML stuff. Historically, Apple has had AI in their tools, their technology, their operating system, and certainly their hardware, but they went out of their way to never say "AI" in any of their conversations. Yet here we are, with Apple making two unexpected drops, just sharing a bunch of CoreNet deep-neural-network training stuff. Yeah, this is not what I expected at all, but here we are; we have a lot to talk about. And man, chat is dropping other important links: ml-explore is Apple as well. This is crazy. The speed at which Apple went from quietly being a powerhouse in AI stuff to loudly being one is very unexpected.

This was the end of last year? Okay, I didn't know they were doing all of this already; good to know this already existed. This video is a Llama v1 7-billion-parameter model implemented in MLX, running on an M2 Ultra. Okay, I knew about most of this, that you could run Llama on Apple silicon; I didn't know it was through things Apple had actually open-sourced. That's cool (a rough sketch of what running a model through MLX looks like in code appears a little further down), but that's not what we're here to talk about today.

We're here to talk about the fact that Apple has a Hugging Face account. For those of y'all who aren't deep enough into AI to know what Hugging Face is: it's kind of like CodeSandbox for AI stuff, where you can play with AI models and training stuff in the browser. Obviously they're running a bunch of hardware behind the scenes, and you have to, like, spend money renting a GPU to try it all, but it gives you a single place, a sandbox, to play with all sorts of different models for all sorts of different things. I've even generated assets for YouTube thumbnails using Hugging Face in the past. So it's cool to see Apple not inventing all their stuff off in a corner, but actually throwing the things they're building into the place we're already hanging out, which in this case is Hugging Face. We have their OpenELM pretrained models, which, if I click these, should actually let me go to GitHub for them too.

The Apple Sample Code License. Let's read this license; I'm actually curious. I love that Apple refuses to use MIT or something; they wrote their own. "This Apple software is supplied to you by Apple Inc. in consideration of your agreement to the following terms," yada yada; if you do not agree with these terms, don't modify. "Apple grants you a personal, non-exclusive license, under Apple's copyrights, to use, reproduce, modify and redistribute the Apple Software, with or without modifications, in source and/or binary forms; provided that if you redistribute the Apple Software in its entirety and without modifications, you must retain this notice and the following text and disclaimers." Actually a pretty generous license; this is like an Apple-flavored MIT. I'm into it.

OpenELM is the new open-source model that they're giving out as Apple. How crazy is it that Apple released a language model product whose name starts with "Open"? I did not expect this at all. "We introduce OpenELM, a family of open-source efficient language models."
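As an aside, the video doesn't show any code for that MLX demo, so here is a minimal sketch of running a model on Apple silicon through MLX using the community mlx-lm package; the package choice and the model ID are my assumptions, not something from the video:

    # Requires Apple silicon and `pip install mlx-lm`.
    from mlx_lm import load, generate

    # Any MLX-converted checkpoint from the mlx-community org on the
    # Hugging Face Hub should work; this ID is illustrative.
    model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")
    print(generate(model, tokenizer, prompt="The capital of France is", max_tokens=32))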
Another thing I have to call out: it's very rare for Apple to directly cite the individuals who did a thing at Apple. I have the example from forever ago where Apple filed a pull request to OBS under this weird "Developer Ecosystem Engineering" GitHub account, this really weird, almost spooky account coming from Apple that was used to anonymously contribute to things like OBS. Not anonymous in the sense that we didn't know it was Apple, but anonymous in the sense that we didn't know who at Apple did it, because it was done by Apple as an org. This account was weird. The profile picture before was this weird, super low-resolution Apple-on-a-black-background thing; they've since changed it to an M3 Max logo, which is weird. But this is Apple's vague, anonymous GitHub account that they've used for contributions in the past. So going from that to just straight-up naming the individuals who did the thing is a huge shift in how Apple works.

Yep: "OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy." In other words, instead of giving every transformer layer the same width, parameters are allocated non-uniformly across layers. "We pretrained OpenELM models using the CoreNet library. We release both pretrained and instruction-tuned models with 270 million, 450 million, 1.1 billion and 3 billion parameters." This is a weird variety compared to, like, Llama 3; 8 billion and 70 billion parameters are the two models Meta put out with Llama 3, versus 270 mil, 450 mil, 1.1 bil, 3 bil. Very strange spread of options here.

For those who aren't familiar with the concept: parameters are the pieces of learned information that the training process tunes to build the model's mapping of what goes where. In the end, an LLM is effectively just autocomplete. You put in a word, and based on the statistics of the enormous amount of information it was trained on, it predicts what the next word is most likely to be, and it does that over and over again to generate results. So what's notable here is that the models they're releasing range from 270 million parameters up to 3 billion, versus what Meta is doing with a 70-billion-parameter model. Interesting to see the variety here.

That said, these smaller models tend to be much smaller downloads: where a 70-billion-parameter model might be hundreds of gigs, a 270-million-parameter model might be a gig or less. Llama 3's 8 billion parameters they somehow got down to 4.7 gigs, and the 70-billion-parameter model is 40 gigs; at full 16-bit precision, 8 billion parameters alone would be about 16 gigs, so those downloads are almost certainly quantized (a back-of-envelope sketch of this follows a little further down). Obviously the 70B one takes much more time to run and is way more data; most people can't even fit it all in memory, versus the 4.7 gigs you can throw in memory. So, as simple as the difference might seem, these behave entirely differently, and the 8-billion-parameter model is going to run much faster than the 70-billion-parameter one. Depending on what you're doing, it makes sense to use them differently.

I am curious to see how long it's going to take for Apple to sneak their stuff in here, because it seems like they want to be part of these communities, judging by what they've posted. "Our pre-training dataset contains RefinedWeb, deduplicated PILE, a subset of RedPajama" (I don't know what any of those things are, I'm not that deep) "totaling approximately 1.8 trillion tokens." Saying they have that much data to work with is nuts. "We provide an example function to generate output from OpenELM models loaded via the Hugging Face Hub in generate_openelm.py"; a rough sketch of that flow follows below.
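Here's roughly what that Hub-loading flow looks like with the transformers library. This is a minimal sketch; the exact model ID, the trust_remote_code flag, and the Llama 2 tokenizer pairing are my reading of Apple's model card, not something shown in the video:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # OpenELM's model code lives in the repo itself, hence trust_remote_code.
    model = AutoModelForCausalLM.from_pretrained("apple/OpenELM-270M", trust_remote_code=True)
    # OpenELM ships without its own tokenizer; Apple's card pairs it with
    # Llama 2's (a gated repo, so this needs an accepted license and HF token).
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

    # "Autocomplete over and over": generate() predicts one token at a time,
    # appends it to the context, and repeats up to max_new_tokens.
    inputs = tokenizer("Once upon a time there was", return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(output[0]))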
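And the back-of-envelope math on those download sizes. The bits-per-parameter figures are my inference that the 4.7 GB and 40 GB files are quantized to roughly 4.5 bits per weight; the video doesn't state that:

    # Download size in GB ~= parameter count * bits per parameter / 8 / 1e9.
    def approx_size_gb(params: float, bits_per_param: float) -> float:
        return params * bits_per_param / 8 / 1e9

    print(approx_size_gb(8e9, 16))    # ~16 GB: Llama 3 8B at full fp16
    print(approx_size_gb(8e9, 4.7))   # ~4.7 GB: matches the 8B download
    print(approx_size_gb(70e9, 4.6))  # ~40 GB: matches the 70B download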
"The release of OpenELM models aims to empower and enrich the open research community by providing access to state-of-the-art language models trained on publicly available datasets. These models are made available without any safety guarantees. Consequently, there exists the possibility of these models producing outputs that are inaccurate, harmful, biased, or objectionable. Thus it is imperative for users and developers to undertake thorough safety testing and implement appropriate filtering mechanisms tailored to their requirements." Yep, the usual disclosure.

There's an important piece here I want to talk more about, though: this is trained on publicly available datasets. One of the interesting things about Facebook versus a company like Apple is that Facebook has a shitload of data that they arguably probably shouldn't. Facebook knows so much about you; Apple doesn't. They've leaned really heavily into privacy, so much so that it's why Siri sucks. If you've used an Android phone and an iPhone, it's pretty apparent how much better Google Assistant is than Siri, and a lot of why is the amount of data Google has access to: how users use the phone, how responses land, what things the user's done recently that might be relevant to what you're asking for. There's just so much more data Google can draw on to make better recommendations, whereas Apple doesn't have that information; they're relying entirely on their ML. As such, they've leaned heavily into really powerful ML chips on device, because they don't want to have this data on their servers. But because they don't have the data on the server, the quality and quantity of data they have just aren't there. So they have to train on public data, because they don't really have much else. It's interesting that they're working so hard to lean into the open side, but it also makes sense, because with the way they position themselves, they kind of have to. It also means things are going to be more private and trusted, because you can go look at and use all of these things yourself.

Let's take a look at the GitHub for CoreNet, though; this is one of the most interesting pieces. "CoreNet: a library for training deep neural networks." It already has 1.1 thousand stars, even though it came out a little bit earlier today. Up until this point, most open-source models have been just that: an open-source model, not the actual training code. The way the model was created isn't open-sourced, and the data might not even be open, but the model that results is. Imagine a bakery that gives away their cake for free but doesn't tell anyone how they made it; Apple is now showing you exactly how they made the cake. Very interesting (a generic illustration of what a library like this wraps follows after this paragraph). "CoreNet is a deep neural network toolkit that allows researchers and engineers to train standard and novel small- and large-scale models for a variety of tasks, including foundation models (e.g., CLIP and LLM), object classification, object detection, and semantic segmentation. Below is the list of publications from Apple that use CoreNet." That's a lot of things Apple has published about this stuff that I never would have guessed: pseudo-supervision for visual enhancement, a fast hybrid vision transformer using structural reparameterization... they're all-in on this. Holy, I knew Apple was like a closeted AI powerhouse, but seeing it laid out in front of me like this is kind of insane. I did not expect this.
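To be clear, CoreNet's actual APIs aren't shown in the video, so the following is not CoreNet code; it's a bare-bones, generic PyTorch training loop to illustrate the kind of machinery a training library automates (data pipelines, optimizers, schedules, checkpointing, distributed runs) when a company ships training code rather than just weights:

    import torch
    import torch.nn as nn

    model = nn.Linear(10, 2)                 # stand-in for a real network
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for step in range(100):
        x = torch.randn(32, 10)              # stand-in for a real data loader
        y = torch.randint(0, 2, (32,))
        loss = loss_fn(model(x), y)          # forward pass
        optimizer.zero_grad()
        loss.backward()                      # backward pass
        optimizer.step()                     # weight update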
How's their stock doing? Ah, yeah, suddenly this makes much more sense: Apple does not want to be left behind. Apple has always been positioned such that, simply due to the nature of Apple silicon, they've had a huge competitive advantage in AI stuff, but they've kept that really quiet. Normally they don't do these things so publicly, but because there is so much buzz around AI (the concept and the idea, rather than the actual functional uses), it seems like they've chosen to just let the world know: "hey, by the way, we're good at AI too; in fact, we're one of the best." Because this is nuts.

How recently did all of these things come out? So, none of this is super new; Apple's always quietly had these kinds of crazy research efforts going on. Another fun industry secret that's not that secret: Apple helped design the USB-C standard. They're one of the biggest contributors to it; they probably contributed the port design itself. Apple is not scared to contribute things to the greater ecosystem; they just don't do it without reason. So for them to make this GitHub repo and put out all of these releases, it's clear they're trying to signal to the world: "hey, by the way, we know what we're doing here; here's what we've done, if you want to join along and try it out too." I just did not expect this to be as absurdly large as it is. They're even using Git LFS because the repo is so massive. And of course, good old Jupyter notebooks; I never expected to see a Jupyter notebook instruction set in an Apple repository.

More fun points being made by chat: Apple's also heavily involved with Matter and the Internet of Things stuff, as well as Qi2, which is basically MagSafe. The Qi ("chee", "chai", whatever you want to call it) charging standard: they were very involved in it, and they pushed it really hard. They easily could have made up their own when they added wireless charging, but instead they used the Qi standard and then pushed it. They've been forced to adopt RCS, the SMS alternative, in the EU, but their hesitation is that RCS has no encryption standard. Rather than paying Google to use theirs and letting Google have access to all that data, they've actually proposed an open standard for encryption in RCS, and they're trying to get that in before they have to comply. Apple's generally pretty good about these types of things; I've just never seen them go from zero to 100 quite this fast. It's weird. Also, yes, I know USB Type-C is basically an inverted Lightning connector; the design of USB-C is fascinating.

"You cut costs by just making it a torrent." Yep, the fact that so many of these models can just be torrented is insane. I did a video about Mistral dropping models this way before; nobody cared. The Mistral Twitter is such a wild ride. They did their original announcement in June last year, didn't say anything for a while, and then in September they just dropped a magnet link. If you don't know what a magnet link is, you're not into piracy enough: it's a URI (something like magnet:?xt=urn:btih:<infohash>&dn=<name>) that you drop into your torrent client, and you just get the file. This is how they've released all their models: they just tweet the magnet link. This one they forgot to give any explanation for, so they went and added it after, but, like, yeah, the "lol" at the end, and their banner image is Word Art. It's something else; it's surreal. So this has been the state of open-source AI for a while: this company Mistral showing up and just memeing their way to the top. And now we have the furthest opposite, which is Apple and CoreNet, on their actual Apple GitHub. They're not throwing this on the developer-experience account or whatever; they're putting this on the actual Apple GitHub and really taking part in this ecosystem.
I never would have thought, I genuinely could not have imagined, that Apple would come out this hard, guns blazing, to be a meaningful player in the open-source AI space. This just made the next WWDC significantly more interesting. What do you guys think, though? Should I spend more time covering this stuff? I know I don't normally talk about AI on the channel, but I have been playing with it a lot more, so let me know in the comments if I should make more content about this stuff, or maybe even play with these models in the future. Until next time: peace, nerds.
Info
Channel: Theo - t3.gg
Views: 72,924
Keywords: web development, full stack, typescript, javascript, react, programming, programmer, theo, t3 stack, t3, t3.gg, t3dotgg
Id: tkZ-ajarTks
Length: 12min 26sec (746 seconds)
Published: Sun Apr 28 2024