Setting Up BEST Local LLMs for Obsidian AI With LM Studio

Video Statistics and Information

Captions
Hey, Mike here. Today we are going to go over a specific local Obsidian AI called Stable LM 2, a 1.6-billion-parameter model made specifically to run on most modern machines that don't have a lot of RAM or GPU power. I highly recommend at the very least 8 GB of RAM to run this model efficiently, and 16 GB if possible.

First things first, you will want to head over to LM Studio. Once you are on their site, you will see three different download links: one for Mac, one for Windows, and one for Linux. Unfortunately, anyone running an older Mac with an Intel processor is out of luck here; this only works on the Apple silicon devices (M1, M2, M3). And a friendly reminder to anyone who already has LM Studio downloaded: they just released their latest version, which has a lot of cool new features, and their search page in particular is much improved.

I get these questions a lot on my YouTube videos, so this is a perfect little FAQ piece. What are the minimum hardware and software requirements? As I just mentioned, on Mac this only works on Apple silicon (M1, M2, M3), and you also need to be on macOS 13.6 or newer. For Windows or Linux, you need a processor that supports AVX2, which typically means newer PCs, and as I mentioned at the beginning of the video, at least 16 GB of RAM is recommended; if you're using a PC instead of an Apple silicon Mac, 6 GB of VRAM or more is also recommended. Another important thing to understand: does LM Studio collect any data? The answer is no. One of the main reasons for using a local LLM is privacy, and LM Studio is designed for that; your data remains private and local to your machine. Music to my ears. Anyway, let's continue.

Once you have it downloaded, all you have to do is hop into the little search bar, type in "stablelm2" (no spaces), then hit Go or just press Enter, and you will be hit with this
search result page. Again, if you've used LM Studio in the past, this will look new to you. The way I have it set up is that I sort by most likes and then filter so that the compatibility guess is selected; that way I have a pretty good guarantee that the model I download will work for my setup. One thing to note here is that you no longer get the little widgets on hover telling you whether a model is recommended or not. Even the one I downloaded in the past: if you remember from the previous video, I did download Stable LM 2 back then, but unfortunately it did not work at the time. Of course, the reason I'm making this video is that LM Studio updated things so that it now works, which is why we're going through the whole testing process. But before, you would have those little signs indicating whether something was a recommended download or not.

So the question now is: which one of these do you download? That's why this new addition is so useful, because you can read up and remind yourself which one to pick. Let's go over it real quick. It all boils down to this part right here: picking the best quantization level typically involves making tradeoffs between file size, quality, and performance. Higher quantization bit counts (4-bit or more) generally preserve more quality, whereas lower levels compress the model further, which can lead to a significant loss in quality. Choose a quantization level that aligns with your hardware's capabilities and satisfies the performance needs of your task. Long story short: anything at or above Q4 (4-bit quantization) generally preserves more quality, and anything below that compresses the model further but also leads to a significant loss in quality. You can see the file
sizes on the right-hand side, right next to the download button. The lowest quantized model here, Q2, is only 694 MB, but the Q8 at the high end is more than double that. With that in mind, I will be downloading the 1.75 GB model; there's really no reason for me not to, especially because I know I can run 7-billion-parameter models just fine. Let me show you: I have a MacBook Pro M1 with 16 GB of memory, and because I can run 7-billion-parameter models comfortably, I can certainly run this 1.6-billion-parameter model at the Q8 quantization.

Another model we will be testing is the NeuralBeagle model. Just type that in, and if you have results sorted by most recent, it should be the first result, by mlabonne (39 hearts at the time of recording). To really simplify how I select which models to download, I just pick whichever one says full GPU offload is possible. Again, these are 7-billion-parameter models, and as you can see, I have an estimated total of 16 GB of RAM. Let me show you what happens if I try to download a 13-billion-parameter model: just type in "13B" and select any 13B model; it doesn't matter which one, I'm just showing an example. Right off the bat, some of these show up as "full GPU offload possible," so they should run pretty decently on my machine. In the middle range it says "GPU offload may be possible": the model might fit partially in your GPU's memory, which can often considerably speed up inference, and that's a good thing. So if you really want to push your limits, downloading one of the "GPU offload may be possible" models is still a decent idea. If you get into the red danger zone, it says "likely too large for this machine." One other thing you
have to keep in mind is that if you are not using the full model file, which usually means 8-bit quantization, you are not getting the same quality as the versions tested and benchmarked on the various leaderboard websites. Let's be honest: the 15 GB version is going to perform much better than, say, the 6 GB version, because you are trading quality for size. Always keep that in mind when downloading these models.

Just to recap, these are the models I recommend you download, both for testing purposes (we will conduct the full test across all of these models in tomorrow's video) and because it's good practice for when the really good open-source models arrive, for example the upcoming Mistral and Llama 3 models. Those are really supposed to break through where we currently are, which is strange to say, because six months ago no one could have imagined today's lineup of high-quality models. Looking ahead to what will be on the market by June or July of this year, it's definitely something to look forward to.

Anyway, as I mentioned, the first one you want, which we discussed at the beginning (this is just a quick recap), is Stable LM 2: type "stablelm2" (no spaces) and hit Enter. It should be by second-state, and the one you're selecting is the Zephyr model specifically. Head all the way down to the last green available option; if you have a somewhat modern computer, all of these should be green, because they are just 1.6-billion-parameter models. Download it (mine's already downloaded, of course) and then we'll head on to the next option. Now let's talk about the 7-billion-parameter model I have been using as my daily driver: the NeuralBeagle model. Type in "neural beagle" and it should be the first option you see, by mlabonne.
Head all the way down and make sure you download the best version you can for your machine. Last but certainly not least, we will download another 7-billion-parameter model, but this one is uncensored. Type in "dolphin 7b dpo laser" and hit Enter, and you'll see this specific model: Dolphin 2.6 Mistral 7B DPO Laser, by TheBloke, posted January 10th. Same thing: head all the way down to the last green option and download that model.

After you have downloaded all three of these models, head over to the little folder icon. Your screen should show three separate versions of three different models. First things first: for the Stable LM version, you want to select the Zephyr preset, so make sure you click that. For the other two, choose ChatML, because I found ChatML works best for those two models. And that's it for now.

I'm going to stop the video here because I don't want it to run too long and get a little boring. I will provide all of the testing and go through all of your questions; I've read and noted down all of the questions I found on the YouTube video I posted yesterday, and again, even on today's video, if you have any questions you'd like me to run through on each of these local LLMs while we're doing all of this within Obsidian, let me know. In the meantime, if you want to play around with these models, feel free to hop into the little chat icon in LM Studio, select one (for example, the Stable LM here), and type in "What is the Obsidian note-taking app?" Literally just talk to it; it will start writing, and you'll see how much it knows about Obsidian. So far we have quite a lot of good questions, and I've added all of them to the already quite lengthy test we will run for each of these LLMs. As always, thank you so much for watching, and a special
thank you to all of my patrons, both on Patreon and now on YouTube. Remember: ask away anything related to productivity, note-taking, artificial intelligence, and of course Obsidian, and I will reply to you as soon as possible. Make sure to like and subscribe, and I will see you in the next video.
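The quantization tradeoff and the "full offload / may be possible / too large" compatibility guess discussed in the captions can be approximated with a rough back-of-the-envelope calculation. This is a sketch, not LM Studio's actual heuristic: real GGUF files run somewhat larger than parameters times bits per weight (some tensors, such as embeddings, are kept at higher precision), and the 60%/90% thresholds below are invented for illustration.

```python
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough model file size: parameters x bits per weight, in gigabytes.

    Real GGUF files are somewhat larger because metadata and some
    tensors (e.g. embeddings) are stored at higher precision.
    """
    return n_params * bits_per_weight / 8 / 1e9


def offload_guess(model_gb: float, memory_gb: float) -> str:
    """Mimic LM Studio's traffic-light compatibility guess.

    The 60%/90% cutoffs are illustrative assumptions; the real
    heuristic also has to leave room for the context/KV cache.
    """
    if model_gb <= 0.6 * memory_gb:
        return "full GPU offload possible"
    if model_gb <= 0.9 * memory_gb:
        return "GPU offload may be possible"
    return "likely too large for this machine"


# Stable LM 2 1.6B at Q8 is roughly 1.6e9 params x 8 bits = ~1.6 GB,
# in the ballpark of the ~1.75 GB file shown in LM Studio's search.
print(quantized_size_gb(1.6e9, 8))   # ~1.6
print(offload_guess(1.75, 16))       # full GPU offload possible
print(offload_guess(15.0, 16))       # likely too large for this machine
```

This also explains why the 13B models in the video straddle all three zones on a 16 GB machine: the file size alone puts the larger quants past what the memory can comfortably hold.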
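Besides the chat tab shown in the video, LM Studio can also run a local server that speaks the OpenAI-compatible chat-completions format (by default at http://localhost:1234/v1), which is handy if you want to query these models from scripts or plugins. A minimal sketch of building such a request; the model name is a placeholder, since LM Studio answers with whichever model you have loaded:

```python
import json
import urllib.request


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions payload for LM Studio's
    local server (Local Server tab; default http://localhost:1234/v1)."""
    return {
        "model": model,  # placeholder; LM Studio serves the loaded model
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }


payload = build_chat_request(
    "stablelm-2-zephyr-1.6b",  # hypothetical name for illustration
    "What is the Obsidian note-taking app?",
)

# To actually send it, LM Studio's local server must be running:
# req = urllib.request.Request(
#     "http://localhost:1234/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
print(json.dumps(payload, indent=2))
```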
Info
Channel: SystemSculpt
Views: 9,569
Keywords: Obsidian AI, LM Studio, Local LLMs, Artificial Intelligence, Obsidian Notetaking, Productivity Hacks, Automation, Programming, AI in Obsidian, SystemSculpt, Tech Tutorial, Obsidian Plugins, Obsidian Setup, Knowledge Management, Obsidian Customization, AI Tools, Obsidian Workflow, LM Studio Tutorial, Advanced Note-Taking, Tech Tips, Note-Taking Apps, Obsidian Extensions, AI Automation, Obsidian Features, Obsidian for Tech Enthusiasts, Obsidian Mastery, Digital Productivity
Id: 7OcwwYtKsec
Length: 10min 52sec (652 seconds)
Published: Tue Jan 30 2024