Setup Tutorial on Jan.ai. JAN AI: Run open source LLM on local Windows PC. 100% offline LLM and AI.

Video Statistics and Information

Captions
Hello everyone, thanks for stopping by to check out this video. I'm going to go through a very thorough walkthrough and technical review of this software and how to get it up and running. If you've watched some of my other videos, you know I use Windows, and this will be done on a fairly high-spec Windows 11 machine, which was necessary in order to run this new AI software locally. It's called Jan.ai, it's free, and there is a ton of code that comes with it: it's all open source on GitHub. It gives you access to all these different LLMs in a really nice interface, and because it's all open source, you can change the code if you have the ability to. Basically, it puts AI right inside your computer. I can't say I've seen anything this impressive up to this point as far as having access to all this technology, so it's very exciting, and I hope you stick around and watch. There will be parts of the video where I scroll through all of the LLMs, in case you can't run this yourself and want to see what's available, and I'll put time codes in the video so you can skip around if you don't want to see that.

With that said, let's take a look at the homepage of Jan.ai. I'll point out that you can install it using a packaged installer, or you can build it from the open-source code, and I'll be taking the open-source route. Here is the Jan.ai website; I have it running in dark mode, which you can switch in the upper right. It says: "Bringing AI to your desktop. Open-source ChatGPT alternative that runs 100% offline on your computer." I'll go through their readme files and specifications, because there are a lot of steps, and I wanted to show all of that in this video in case you're interested in setting this up. Scrolling down, on the right is what it looks like once it's up and running: you can change all these parameters, similar to OpenAI's ChatGPT, customize different parts of it, ask it questions, and change your models in the bottom left. They do say, "Warning: Jan is in the process of being built, expect bugs," but while I was testing it I did not have problems, so it looks pretty stable. It's been under development for about six months, and their core team believes AI should be open source; Jan is built in public.

There's a desktop app where you just download and install an executable on your machine, and it does everything behind the scenes. However, I was more interested in getting the open-source code up and running, because then I could customize it any way I wanted and really make it my own. The mobile app, they say, is coming soon: take your AI assistants on the go, seamless integration into your mobile workflows, with elegant features. It's offline and local-first: conversations, preferences, and model usage stay on your computer, secure, exportable, and can be deleted at any time. It's also OpenAI-compatible: Jan provides an OpenAI-equivalent API, a programming interface, at localhost:1337 that can be used as a drop-in replacement with compatible apps.
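Since Jan advertises an OpenAI-equivalent API on localhost:1337, you can talk to it from a script. Here's a minimal Python sketch; the /v1/chat/completions path and the model ID are my assumptions based on the OpenAI-compatible claim, not something shown in the video.

```python
import json
from urllib import request

# Assumed endpoint: Jan advertises an OpenAI-equivalent API on port 1337;
# the path below mirrors OpenAI's chat completions route and may differ.
JAN_URL = "http://localhost:1337/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256):
    """Build the URL and JSON body for an OpenAI-style chat completion."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return JAN_URL, json.dumps(payload).encode("utf-8")

def ask_jan(model: str, prompt: str) -> str:
    """POST the request to the local server (Jan must be running)."""
    url, body = build_chat_request(model, prompt)
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

With the local server enabled, `ask_jan("llama-2-7b-chat", "Say hello")` should return the model's reply; the model ID has to match one you've actually downloaded.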
On the site you can see which features are completed and which are only partially done, along with their status, and it looks like they're also building an assistants framework. The local server is what I'll get up and running in this video; I'll show all the steps I went through, the requirements, and so forth. It was a very smooth process. There's other information up top if you want to look at the API reference, various documentation, and developer information.

Here I've pulled up the architecture page in their documentation. Jan has a modular architecture and is largely built on top of its own modules. It uses a local, file-based approach for data persistence, and it provides an Electron-based desktop UI, which does look really nice; I'll be showing that in the video and how it works with different models. It also provides an embeddable inference engine written in C++ called Nitro. Here we can see the modules, their descriptions, and the API docs, with links to all of that. Their documentation is still under construction, but you can find some really good things there.

With that said, let's start getting into the details of how I set all of this up and how it's running locally. Here is the GitHub page for the application. It describes itself as the open-source ChatGPT alternative that runs 100% offline on your computer, on any hardware from PCs to multi-GPU clusters. There's a little demo video on the GitHub page where you can see it working, typing in questions and browsing all the different models you can choose from. There's also some troubleshooting information (as they mention, it's in development, so you might have problems) and instructions on how to reset your installation; I didn't have to do that because I didn't have any issues.

There are some prerequisites, so I had to install some software on this computer to get it working. On the left is a command prompt showing some commands I ran, and the versions involved: node 20.10.0 and npm 10.2.3, plus make, which was important; the whole thing would not build without it. In the left window you can see where I installed make with Chocolatey (choco install make), and I also had to install yarn, version 1.22.2. Down at the bottom, make -v reports version 4.4.1, which worked. Once I had all those things installed, the next step was to clone the repository, and there is a ton of code, which I'll show later in the video just to give you an idea of how much is part of this system.

Before going through all the installation details, I've popped into the running application. You can create new threads or delete them, and there's a section called the Hub that shows the different models. I asked it a question here, as a helpful assistant, about making YouTube videos. You can see the RAM and CPU usage: 30% of the RAM is used and only 1% of the CPU, along with the actively running model I chose, which is 3.8 GB; we'll look at those in more detail shortly. There are settings where we can customize the application, a list of all the extensions, a few experimental modes, and right there an option to enable GPU acceleration for NVIDIA GPUs, which is on because my system supports it.
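The prerequisite version checks above can be scripted. This is a small Python sketch; the minimum versions are simply the ones that worked on my machine, not official requirements, and the tool names are assumed to be on your PATH.

```python
import re
import subprocess

# Minimums are the versions that worked in this walkthrough (node 20.10.0,
# npm 10.2.3, make 4.4.1, yarn 1.22.x); Jan's docs may state different ones.
MINIMUMS = {"node": (20, 10, 0), "npm": (10, 2, 3),
            "make": (4, 4, 1), "yarn": (1, 22, 0)}

def parse_version(text: str):
    """Pull the first x.y.z version number out of a tool's --version output."""
    m = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if not m:
        raise ValueError(f"no version found in: {text!r}")
    return tuple(int(g) for g in m.groups())

def meets_minimum(text: str, tool: str) -> bool:
    """Compare a version string against the minimum recorded for the tool."""
    return parse_version(text) >= MINIMUMS[tool]

def check_tool(tool: str) -> bool:
    """Run e.g. `node --version` and compare the result against the minimum."""
    out = subprocess.run([tool, "--version"],
                         capture_output=True, text=True).stdout
    return meets_minimum(out, tool)
```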
I did not turn on the experimental mode, and I did not enable the API server yet; I'll probably do that in another video, since there's so much to cover in this one. There are also some keyboard shortcuts listed, which can be helpful for using the application, and clicking there takes you back to the main chat window.

Then we have all these parameters on the right side; again, this is all running inside the Electron application, and after I got those tools installed I was able to get it up and running. This is a chat running against an open-source model on my local computer, which is really awesome, and during my testing it gave good answers. I've only tried a couple of the large language models, or LLMs, so far because there are so many; you'll see just how many are available in another part of the video. Right there you can choose the current one, which is 3.8 GB, or OpenAI's GPT-3.5, GPT-3.5 Turbo, or GPT-4 if you have a key.

Now here is the Hub, where you can see the different models. There are recommended models, and you can click a link and see details: this one, for example, is a 4-bit quantized iteration of an Instruct 7B model, and it tells you different information about each model. Looking at some of the other entries: one emphasizes mathematical and logical abilities, one is an experimental merge of GreenNode LM and LeoScorpius, and one was trained using OpenChat AI feedback. You can also see the sizes of these models getting larger; this one is based on the Llama 2 architecture. Here is where you can hook up OpenAI if you have keys for that; it walks you through getting it set up, so you could do a comparison between OpenAI and some of these other models and see how you like the results.

This one is a 1.1-billion-parameter Llama model trained on a 3-trillion-token dataset; this is DeepSeek Coder 1.3B, trained on two trillion tokens; that's a 2.7-billion-parameter model; Llama 2 Chat 7B Q4 is an iteration from Meta AI; and CodeNinja is a fine-tune of OpenChat 3.5. The model sizes are getting a little larger here, so I'll scroll through these in case you're interested in what's available. Basically, once you find one you're interested in, you first have to download it, and looking at the sizes, we're now up to 22 GB, so they're getting quite large; you'll need enough hard drive space and internet bandwidth. There's a download button, and it shows progress as these different LLMs download; once they're on your machine, it's really easy, as you'll see later, to switch between them, and the bottom left of the screen again shows which model you're actively using. Getting down to the last few: this one is 24.62 GB, this one is 38.58 GB with what looked like 70 billion parameters, and Meta AI's Llama 2 Chat 70B model is 40.90 GB.

Scrolling back up to my question, "What are your recommendations to grow a YouTube channel fast," it gave a bunch of different answers. Here again I'm just highlighting the parameters in the user interface: I set the maximum tokens to 4,096, you can change the prompt template and drag it to make it larger, so there are really nice ways to customize this, and you can customize the engine parameters too. So there was my question about growing a YouTube channel fast, and now I'll ask it something else: can you summarize the top 10 tips to make a new YouTube channel?
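A quick aside before we look at the answers: a back-of-the-envelope calculation shows where those Hub download sizes come from. A 4-bit quantization stores roughly half a byte per weight, so a 7B model lands near the 3.8 GB shown in the app. The 10% overhead factor below is my guess for embeddings, scales, and metadata, not a number from Jan.

```python
def quantized_size_gb(params_billion: float, bits_per_weight: float,
                      overhead: float = 1.1) -> float:
    """Rough on-disk size of a quantized model: params * bits / 8 bytes,
    plus ~10% overhead (the 1.1 factor is an assumption, not Jan's figure)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8 * overhead
    return bytes_total / 1e9

# A 4-bit 7B model comes out near the ~3.8 GB Jan reports for Llama 2 Chat
# 7B Q4, and a 4-bit 70B model lands in the tens of GB, which matches the
# larger downloads in the Hub.
```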
We get to see how fast it works. I didn't scroll to the bottom at first, so we couldn't see it working, but there are our top 10 tips, and you can see it took 33 seconds; that one took a little while, though a lot of them are fast. Let's ask another question: can you expand in more detail on your top three? You can see in real time how many seconds it takes as it explains in more detail, just like I asked: create high-quality content, consistency is key, and so on. Now I'll ask it something more specific: do you know who the top YouTube creators are? It gives an answer about different creators and how many subscribers they have. I haven't verified how accurate this information is, but I've heard of some of these YouTubers, I know they're popular, and some of them I've watched myself. Next: do you know who the fastest-growing YouTube channels have ever been in the history of YouTube? It answers that there have been several fast-growing channels in the platform's history and lists some of them; again, I've heard of some of these creators and know they're pretty large channels. Then I asked it to tell me how much money each of those channels makes per year on average since they started. It operates pretty quickly compared to some other chat services out there; many times it starts to respond immediately, and I like how it streams the output and lists it in a bullet format. There you can see it reporting how many millions of dollars some of these YouTubers have made and when they started: 2013, 2007, 2014, and so forth.

Next I'm going to look at a configuration file, and we'll also start downloading another large language model so we can try a different one. Going into the list of all the LLMs, I decided to choose this one, so let's initiate the download; it's 8.6 GB. I'm selecting WizardCoder Python, which demonstrates high proficiency in specific programming languages. I'll let this download; it takes a little while depending on your internet speed, and at the bottom left it shows that a new model is downloading. Then we'll use it.

While we're waiting, I'll show some more detailed information inside the code I downloaded. As I explained earlier, you get the code to your machine with a git clone command. Here's a command to run called nvidia-smi: you must have an NVIDIA driver reporting CUDA version 11.4. My machine has that version, so I was able to proceed; if you don't, you'll have to upgrade your drivers or possibly your video card. There is the git command I used, git clone followed by the name of the Jan repository, and you can see there are a lot of files and folders; I'll go through some of them to give you an idea. The one command that started all of it was make dev, which builds a local copy of the desktop software we saw running. If you don't have make installed, this will never work; down at the bottom I'm showing make -v to confirm that I do have make installed, as I showed how to do earlier. Without a proper version of that utility, it won't be able to build. The build did take a little while once I ran that command, maybe a couple of minutes (I didn't show that in the video). I'm also showing the package.json file.
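Before digging into the build files, here's a small sketch of the nvidia-smi check described above. It parses the "CUDA Version" field that nvidia-smi prints in its header; the minimum of 11.4 is the version mentioned in this walkthrough, so treat it as an assumption rather than Jan's documented requirement.

```python
import re
import subprocess

def cuda_version_from_smi(smi_output: str) -> tuple:
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi's header."""
    m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if not m:
        raise ValueError("CUDA version not found; is the NVIDIA driver installed?")
    return (int(m.group(1)), int(m.group(2)))

def cuda_ok(minimum=(11, 4)) -> bool:
    """Run nvidia-smi and check the driver's CUDA version against the minimum."""
    out = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
    return cuda_version_from_smi(out) >= minimum
```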
In the package.json file you can see the dev command and what it runs to get things started; it mentions Electron, and there are some dev dependencies. Here is the Makefile, with different sections for building the UI kit; you can see it checking whether it's on a Windows operating system and which command it runs in that case (again, this is a Windows 11 machine on the latest build), otherwise it runs a different command. Then it installs the yarn dependencies and builds the core and extensions, so there are a lot of commands it has to run; on Windows it runs a PowerShell command, and as it builds, you can see everything scrolling by in the console window. I actually ran it inside VS Code, the free editor, which I don't think I mentioned, and then it does some cleanup. That was the Makefile. There's also a license file, where you can see which license it uses; everyone is permitted to copy and distribute verbatim copies of the license document. There's a Dockerfile, and a demo GIF that shows the system up and running, which is nice for seeing what to expect, plus a walkthrough of the Hub. They also explain how to contribute to the source code on GitHub if you want to.

Now I'll click through a few of the files and folders. There is so much code in here; it's truly amazing what's in this repository that I pulled down from GitHub. There must be hundreds of files; I don't know exactly how many man-years of work it represents, but it's a lot, and later in the video I'll show some statistics and charts from GitHub about how much activity there's been, which I believe started around August.

So now I'm clicking through a few files and folders. This is an index.ts file, and on line 10 you can see it using localhost, 127.0.0.1. There are basically configuration parameters like this throughout, which is one of the nice things about setting the application up from source: all the code is there, so you can make whatever changes you want and turn it into whatever you need if it doesn't work the way you want. There it says: use offline LLMs with your own data, run open-source models.

Here is one of the model files. I went to the models directory, which is important; there are a bunch of files in there, and in this code window you can see how each one lists the source URL (this one came from Hugging Face), the ID, the name, the version, and that kind of information. There are dozens of these in the models directory, which also means it can be configured for other models as things change in this space. CodeNinja, I believe, was the other model I used; it's good for coding tasks. There are a lot of README documents, because there are so many folders; all the code is organized by purpose, with documentation about engineering, development, community, and many other things, all nicely written. So it's an alternative that runs on your own computer with a local API server, kind of like ChatGPT, as you already saw. The web folder has a ton of documents, the UI kit has all kinds of things in it, and you can see all the different folders. If you pulled down this code you can look at it locally, or you can just browse it on GitHub; it's really awesome to have access to all this code.
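Based on the fields visible in that models directory (a source URL, an ID, a name, a version), here's an illustrative Python sketch of loading and validating such a config. The field names and example values are my guesses at the shape, not Jan's exact schema.

```python
import json

# Field names follow what the video shows in the models directory; treat
# them as illustrative placeholders, not Jan's actual schema.
REQUIRED_FIELDS = ("source_url", "id", "name", "version")

def validate_model_config(raw_json: str) -> dict:
    """Parse a model config and make sure the basic fields are present."""
    cfg = json.loads(raw_json)
    missing = [f for f in REQUIRED_FIELDS if f not in cfg]
    if missing:
        raise ValueError(f"model config missing fields: {missing}")
    return cfg

# Hypothetical example in the spirit of the files shown in the video.
example = json.dumps({
    "source_url": "https://huggingface.co/some-org/some-model.gguf",
    "id": "llama2-chat-7b-q4",
    "name": "Llama 2 Chat 7B Q4",
    "version": "1.0",
})
```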
All these people have been working on it, and you can make it your own and do what you need with it, or just use it as-is; as I showed, it's very functional. I think it's very exciting to have this kind of technology at our fingertips, to use directly or to change, and of course new features should arrive pretty rapidly. I'll be keeping an eye on this open-source project, and I believe I'll be making other videos about it in the future. Here I'm showing another README about a different aspect of the software: creating your own plugins, templates, and extensions, so the product is configurable. And there's the electron folder, which is the program we're looking at; it has a lot of files, folders, and code available to change if you need to make it exactly what you want. We already looked into the docs, and there's a lot of good information there.

Back on the Hub page, I'm looking at what's been downloaded and what's in use. Currently Llama 2 Chat 7B Q4 is in use, which was thoroughly trained on extensive internet data. Now I'm going to change to WizardCoder Python, do a little configuration, and then start asking it some programming questions. I set the instructions to: you are a helpful programming and software engineering assistant. I haven't used this model before, so this is my first time with it; I'll ask it some questions and we'll see how it goes. First: what are popular languages that you know about? It says it knows Java, Python, C++, and JavaScript, though I believe it knows more than that. Then I asked: do you know C? It took 3 seconds and said yes. What are the total languages you can write code for? It mostly repeated what it already said: it can write code in many different languages.

Now I ask: can you write me some code in JavaScript, HTML, and CSS for a tic-tac-toe game? It tells me it can, but it doesn't actually write it, so I prompt it a couple of other ways. I tell it to go ahead and do it, and it says sure, it can help with that. Okay, will you write that code now, for a JavaScript file, an HTML file, and a CSS file? It says it'll start working on it. This is a situation where it differs from OpenAI's ChatGPT, which will just start writing code the first time you ask. After a couple more questions it tells me it will email me the code, which is kind of funny.

So now I ask a different question: what are your top tips for writing JavaScript to process JSON data, and can you write sample code and show me now? Here it actually starts writing code: sample code to process JSON data in JavaScript. It sets up some sample data, does a JSON.parse, and logs the result to the console. Then I asked it to write a full program to parse a CSV file and output it as JSON data, and I told it the name of the file to use. That one is a little more detailed, it even has comments in the code, and you can copy and paste the code like you can in ChatGPT.

Here I wanted to show the resources used by the running programs: Jan is at 52.8 MB, but Nitro is actually at 5.6 GB, so it's a very large footprint on your computer, and Visual Studio Code, where it's running, is using almost 3 GB. Right below that, memory usage is up to 48%, so it has jumped; there's not much else running on the system at this point besides this program and some browser windows.
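The CSV-to-JSON program I asked the model for can be written in a few lines. Here's a Python version of the same idea, with hypothetical sample data (the file name and columns from my prompt aren't shown, so these are stand-ins):

```python
import csv
import io
import json

def csv_to_json(csv_text: str) -> str:
    """Convert CSV text (first row = headers) into a JSON array of objects,
    the same transformation I asked the model to write in JavaScript."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)

# Hypothetical input, echoing the YouTube-stats questions from the chat.
sample = "name,subscribers\nAlice,1200\nBob,3400\n"
```

Note that `csv.DictReader` keeps every value as a string; converting numeric columns would take an extra pass.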
Another aspect of all this: if you saw where I mentioned the software you had to install, it talks about CUDA. I'm showing the screens I went through to install that piece of software, and you can see it installing all these different components. Sometimes there were versions I already had on the machine, but in a lot of cases this was all brand new, so I had to let it download and install. In the advanced options, you can see one of the things it tells you to use is nvcc, right there inside that block; you have to have that installed for this program to work. I didn't change any of the options, I took all the defaults, but I wanted to show the extent of everything it installed. nvcc, I believe, is the compiler it's going to use, and this is version 12.3, which as of making this video is the most recent. You may have to install this software, which you can get from the NVIDIA website.

Now I'm back in the GitHub repository. As I was saying, you can look at all the code on GitHub even if you didn't download a copy; sometimes I like looking at it there instead. Here is the organization's home page, showing popular repositories: Jan, which is the one we've been looking at; a sample app; and Nitro, "a fast, lightweight, embeddable inference engine to supercharge your apps with local AI; an OpenAI-compatible API." It shows the repositories and how much activity has been going on with them. Some of the other repositories out there, like LangChain, I've talked about in other videos and used in other software I've been experimenting with.

So again, yes, this is an open-source alternative to ChatGPT that runs 100% offline on your computer, and so far in this video you've seen how that's all possible. There have been changes made as recently as 33 minutes and two hours ago, and there have been 1,771 commits; we can see the latest released version and the 24 contributors on the project. Here are some insights: contributions started around August 2023, which is when code changes began being checked into the repository. Excluding merges, 20 authors have pushed 431 commits to the default branch and 542 commits to all branches; 479 files have changed, with 12,665 additions and 6,905 deletions, and you can see who's been contributing the most. There are 105 new issues, 247 closed issues, and 199 merged pull requests, so it's very busy, with a lot of activity. Other insights again show activity starting around mid-August 2023, with spikes up through the end of 2023: 599 closed issues, which is a lot, and 13 open issues right now. It's pretty interesting to read through these to see what bugs and feature requests are being worked on.

Now I'm looking at CUDA, which is what I was showing in those other screenshots. There are actually samples, and there's the CUDA Toolkit, which I'll open up. This is what I had to download; it prompts you through choosing Windows or Linux, your operating system and architecture, and whether you want a local installer, and it's 3.1 GB. I had to download and run that before any of this would work. There are some prerequisites they talk about, and they have CUDA samples, which I'll try to look at and run in another video now that this platform is up and running on this machine.
They have an installation guide for Windows covering the system requirements, the supported operating systems, installing the CUDA development tools, and so forth. It also goes into building the CUDA examples, or samples, which requires Visual Studio; that is installed on this machine, but I'm not going to go into all of that detail in this video. I'll attempt to make another video sooner rather than later about those CUDA samples: opening them in Visual Studio, getting them to compile, and running them. It's very nice that all of this is set up and available for free to download and try; there's a lot available in the GitHub NVIDIA CUDA samples repository. They also have a quick start guide. The CUDA Toolkit provides a development environment for creating high-performance, GPU-accelerated applications; with it you can develop, optimize, and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms, and supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library, so you get a whole lot, and again it's all free. Now that I've installed it on this machine, there are other things I can explore and try out: they have tutorials, video walkthroughs, other products that use it, blogs that are up to date into 2024, and the latest news, all on the developer.nvidia.com CUDA Toolkit site. I'll include links to some of these pages in the video description so you can access them easily.

Well, I'm just about done with this video. I know there were a lot of details, as I said there would be; it's probably one of my longer videos, but there was a lot to do to get this up and running on the machine. Now I have access to the source code, and if I wanted to fork it, I could pull down more recent versions as they make changes. I'll be keeping an eye on this project; it does seem to be the most advanced capability I've seen so far in this realm of running local, customizable, ChatGPT-like programs with all these different models, and there's so much code, as we saw, available if you pull it down and get it running this way. It's really amazing, and I hope you enjoyed this video and that it helps you if you're going to attempt to get this running on your own machine. Thank you for stopping by, for watching, and for subscribing to my videos; I'll be making more in the near future. Have yourself a great day, and I'll talk to you soon. If you like this channel, please like and subscribe and hit the bell icon to be notified when I post new content.
Info
Channel: ScottDevOps
Views: 3,656
Keywords: Setup Tutorial on Jan.ai, 100% offline LLM and AI, This is the BEST AI YET, Bring AI power to your desktop, Make AI powered applications, Exciting new AI product, AI Source source code, Run LLMs locally, how to do chatgpt locally?, chat with AI on your private local computer, use multiple different LLMs, This is Private GPT Use AI MAKE files, Use AI CUDA software, Run Local LLMs on your own hardware, open source ChatGPT alternative, 100% AI offline mode, local AI on PC
Id: ZCiEQVOjH5U
Length: 49min 15sec (2955 seconds)
Published: Sat Jan 06 2024