I built an AI server for my daughters.
Well, first it was more for me. I wanted to run all of my AI locally. And I'm not just talking command line with Ollama. No, no, no. We have a GUI, a beautiful chat interface, and this thing's feature-filled. It's got chat histories and multiple models; we can even add Stable Diffusion. And I was able to add this to my notes application, Obsidian, and have my chat interface right there. I'm going
to show you how to do this. Now you don't need something crazy like
Terry (that's what I named my AI server). It can be something as simple as this laptop; I'll actually demo the entire setup on it. So the computer you're using right now, the one you're watching this video on, will probably work. And
seriously, you're going to love this. It's customizable, it's wicked fast,
way faster than anything else I've used. Isn't that amazing? And again, it's
local, it's private, and I control it, which is important because I'm giving it to my daughters. I want them to be able to use AI to help with school, but I don't want them to cheat or do anything else weird. And because I have control, I can put in special model files that restrict what they can do and what they can ask, and I'll show you how to do that. So
here we go. Get your coffee ready. We're about to dive in, but
first let me have you meet Terry. Now Terry has a lot of muscle. So
for the case, I needed something big. I got the Lian Li O11 Dynamic EVO XL. It's a full-tower E-ATX case, perfect to hold my ASUS ProArt X670E-Creator motherboard. This
thing's also a beast. I'll put it in the description
so you can look at it. Now, I also gave Terry a big brain.
He's got the AMD Ryzen 9 7950X: that's 4.5 GHz and 16 cores.
For memory, I went a little crazy. I've got 128 gigabytes of the G.Skill Trident Z5 Neo; it's DDR5-6000 and way overkill for what I'm doing. I think I got a Lian Li water cooler for the CPU. I'm not sure if I'm saying Lian Li right. I don't know, correct me in the comments; you always do. And then, for the stuff AI loves, I got two RTX 4090s, the MSI Suprim Liquid, and they're liquid cooled
so they could fit on my motherboard, with 24 gigabytes of memory each, giving me plenty of muscle for my AI models. For storage, we've got two Samsung 990 Pros, two terabytes each, which you can't see because they're behind stuff. And also a Corsair AX1600i power supply, 1600 watts to power the entire build. Terry is ready. Now, I'm surprised to say my system
actually posted on the first attempt, which is amazing. But what's not amazing is the fact
that Ubuntu would not install. I tried for hours, actually for a whole day, and I almost gave up and installed Windows. But I said, no,
Chuck, you're installing Linux. So I tried something new, something
I've never messed with before. It's called Pop!_OS by System76. This thing is awesome. It worked the first time. It even had a special image with Nvidia drivers built in. It just stinking worked.
So I sipped some coffee, didn't question the magic and moved on. Now if you do want to build something
similar, I've got all the links below. But anyways, let's talk about how to
build your very own local AI server. First, what do you need? Really all
you'll need is a computer. That's it. It can be any computer running Windows,
Mac or Linux. And if you have a GPU, you'll have a much better time. Now,
again, I have to emphasize this, you won't need something
as beefy as Terry, but the more powerful your computer
is, the better time you'll have. Don't come at me with a Chromebook
please. Now, step one: Ollama. This is the foundation for all of our AI stuff and what we'll use to run AI models. So we'll head on over to ollama.ai, click on download, and they've got a flavor for every OS. I love that.
Now if you're on Mac, just download it right now and
run it. If you're on Windows, they do have a preview version, but
I don't want you to do that. Instead, I want you to try the Linux version.
We can install it with one command. And yes, you can run Linux on Windows
with WSL. Let's get that going real quick. First thing I'll do is go to the start
bar, search for terminal, and launch my terminal. Now, this first bit is for Windows folks only; Linux people, hang on for a moment. We've got to get WSL installed, or the Windows Subsystem for Linux. It's only one command: wsl --install. That's it. I'll hit enter, and that's going to start doing some stuff. When it's done, we'll set up a username and password.
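Spelled out, that's the whole thing (run it from a regular Windows terminal):

    # installs WSL with Ubuntu as the default distro
    wsl --install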
I got a new keyboard, by the way. Do you hear that? Link below. It's my
favorite keyboard of the entire world. Now some of you may have
to reboot. That's fine. Just pause the video and come
back. Mine is ready to go though, and we're rocking Ubuntu 22.04. It's still amazing to me that we're running Linux on Windows; that's just magic. Right now we're about to install Ollama, but before we do that, we've got to do some best-practice
stuff like updating our packages. So we'll do a sudo apt update, and then a sudo apt upgrade -y to apply all those updates.
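Both commands, for the copy-pasters:

    sudo apt update       # refresh the package lists
    sudo apt upgrade -y   # apply all available updates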
And actually, while it's updating, can I tell you something about our
sponsor, ITPro by ACI Learning. Now in this video, we're going to
be doing lots of heavy Linux things. I'm going to walk you through it. I'm going to hold your hand and you may
not really understand what's happening. That's where ITPro comes in. If you want to learn Linux or really anything in IT, they are your go-to; that's what I use to learn new stuff. So if you want to learn Linux to get
better at this stuff or you want to start making this whole hobby thing your
career, actually learn some skills, get some certifications, get
your A+, get your CCNA, get your AWS certifications, your Azure certifications, and go down
this crazy IT path, which is incredible. It's the whole reason I make this
channel and make these videos. Check out ITPro: they've got IT training that won't put you to sleep. They have labs, they have practice exams, and if you use my code networkchuck right now, you'll get 30% off forever. So go learn some Linux, and thank you to ITPro for sponsoring this video and making things like this possible.
And speaking of... my updates are done. And by the way, I will have a guide
for this entire thing. Every step, all the commands, you can find it at the
free NetworkChuck Academy membership. Click the link below to join and
get some other cool stuff as well. I can't wait to see you there. Now we
can install Ollama with one command. And again, all commands are below. I'm going to paste in a nice little curl command, a little magic stuff, and I love how easy this is. You just sit there and let it happen.
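For reference, the install one-liner looks like this (grab the current command from Ollama's download page in case it's changed):

    # Ollama's official Linux install script
    curl -fsSL https://ollama.com/install.sh | sh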
Do you not feel like a wizard when you're installing stuff like this? And the fact that you're installing AI right now? Come on. I noticed one thing real quick: Ollama did automatically detect
that I have an Nvidia GPU and it's like awesome, you're going
to have a great time. If it didn't see that
and you do have a GPU, you may have to install
some Nvidia CUDA drivers. I'll put a link for that below, but
not everyone will have to do that. And if you're rocking a Mac
with an M1 through M3 chip, you're going to have a good time too; they'll use the embedded GPU. Now, at this point, our Mac users, our Linux users, and
our Windows users are all converged. We're on the same path. Welcome.
We can hold hands and sing. That's getting weird. Anyways, first we have to test a few things
to make sure Ollama is working. And for that, we're going to open our
web browser. I know it's kind of weird, just stick with me. I'm going to launch
Chrome here, and in my address bar I want to type in localhost, which is looking right here at my computer, and port 11434. Hit enter. And if you see this message right here, you're good to go, and you're about to find out why. Port 11434 is what Ollama's API service runs on, and it's how all our other stuff is going to interact with it.
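If you'd rather check from the terminal, a quick curl does the same job:

    curl http://localhost:11434
    # expected reply: "Ollama is running"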
It's so powerful. Just check this out. I'm so excited to show you
this. Now before we move on, let's go ahead and add
an AI model to Ollama. And we can do that right now with ollama pull; we'll pull down Llama 2, a very popular one. Hit enter, and it's ready. Now let's test it out real quick: we'll do ollama run llama2.
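Both commands together:

    ollama pull llama2   # download the model (around five gigabytes)
    ollama run llama2    # start an interactive chat session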
And if this is your first time doing this, it's kind of magic. We're about to interact with a ChatGPT-like AI right here, no internet required. It's all just happening in that five-gigabyte file. Tell me about the solar eclipse. Boom. And you can actually hit Ctrl+C
to stop it. Now I want to show you this. I'm going to open up a new window. This is actually an awesome
command. With this wsl command, I'm just connecting to the same instance again in a new window. I'm going to type in watch -n 0.5 (that's 0.5, not 5) nvidia-smi. This is going to watch the performance of my GPU right here in the terminal and keep refreshing it.
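Spelled out:

    # re-run nvidia-smi every half second to watch GPU usage live
    watch -n 0.5 nvidia-smi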
So keep an eye on this right here as I chat with Llama 2: give me a list of all Adam Sandler movies. And look at that GPU go. Ah, it's so fun. Now can I show you
what Terry does real quick? I've got to show you Terry.
Terry has two GPUs; they're right here, and Ollama can actually use both of them at the same time. Check this out. It's so cool. List all the Samuel L. Jackson movies. And look at that. Isn't that amazing? And look how
fast it went. That's ridiculous. This is just the beginning. So
anyways, I had to show you Terry. So now we have Ollama installed; that's just our base, remember. I'm going to say bye, so /bye to end that session. Step two is all about the web
UI. And this thing is amazing. It's called Open WebUI, and it's actually one of many web UIs you can get for Ollama, but I think Open WebUI is the best. Now, Open WebUI will run
inside a Docker container. So you will need Docker installed
and we'll do that right now. So we'll just copy and paste the
commands from the NetworkChuck Academy. This is also available
on Docker's website. First step is updating our repositories and adding Docker's GPG key, and then with one command we will install Docker and all its goodies. Ready, set, go. Yes, let's do it.
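(By the way, if you ever want the short version, Docker also publishes a convenience script that does roughly the same thing; review it before you run it:)

    curl -fsSL https://get.docker.com -o get-docker.sh
    sudo sh get-docker.sh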
And now, with Docker installed, we'll use it to deploy our Open WebUI container. It'll be one command you can simply copy and paste. This docker run command is going to pull the image and run the container from Open WebUI. It's pointing at your local computer for the Ollama base URL, because it's going to integrate with and use Ollama, and it's using the host network adapter to make things nice and easy. Keep in mind, this will use port 8080 on whatever system you're using. And all we have to do is add sudo at the beginning and hit enter.
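The command looks roughly like this; treat it as a sketch based on Open WebUI's docs, and check their README for the current image tag:

    sudo docker run -d --network=host \
      -v open-webui:/app/backend/data \
      -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
      --name open-webui --restart always \
      ghcr.io/open-webui/open-webui:main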
Let it do its thing. Let's verify it real quick: we'll do a little sudo docker ps.
We can see that it is indeed running. And now let's go log in.
It's kind of exciting. Okay, let's go to our web browser and
we'll simply type in localhost:8080, and whoa, okay, it's really zoomed in. I'm not sure why; yours shouldn't do that. Now, the first time you run it, you'll want to click on sign up right
here at the bottom and just put your stuff in. This login info is only
pertinent to this instance, this local instance. We'll create
the account and we're logged in. Now just so you know, the first account you log in with or
sign up with will automatically become an admin account. So right now, you
as a first time user logging in, you get the power. But look at this.
How amazing is this? Let's play with it. So the first thing we have
to do is select the model. I'll click that drop down and we
should have one: Llama 2. Awesome. And that's also how we know our connection is working. I'll go ahead and select
that. And by the way, another way to check your connection is
by going to your little icon down here at the bottom left and clicking
on settings and then connections. And you can see our Ollama base URL is right here, if you ever have to change it for whatever reason.
Now, with Llama 2 selected, we can just start chatting. And just like that, we have our own little ChatGPT that's completely local, and this sucker is beautiful and extremely powerful. Now, first things first: we can download more models. We can go out to Ollama and see what they have available; look under their models to see the list. CodeGemma is a big one. Let's try that. So to add CodeGemma, our second model, we'll go back to our command line here and type in ollama pull codegemma. Cool, it's done. Once that's pulled, we can go up here and just change our
model by clicking on the little dropdown icon at the top. Yep, there's CodeGemma. We can switch. And actually, I've never done this before, so I have no idea what's going to happen. I want to click on my
original model, Llama 2. You can actually add another model to this conversation. Now we have two here. What's going to happen? So CodeGemma is answering first.
I'm actually not sure what that does. Maybe you guys can try it out and
tell me. I want to move on though. Now some of the crazy stuff
you can see right here; it's almost more featured than ChatGPT in some ways. You've got a bunch of options for
editing your responses, copying, liking and disliking it to help it learn. You can also have it read things
out to you, continue response, regenerate response, or even just
add stuff with your own voice. I can also go down here and this is crazy. I can mention another model and it's
going to respond to this and think about it. Did you see that? I
just had my other model talk to my current one. That's just weird, right? Let's try to make 'em have a conversation.
They're going to have a conversation. What are they going to talk about? Let's bring back in Llama 2 to ask
the question. This is hilarious. I love this so much. Okay, anyways,
I can spend all day doing this. We can also, with this plus sign, upload files. This includes a lot of things. Let's try it. Do I have any documents here? I'll just copy and paste
the contents of an article, save that and that'll be
our file. Summarize this. You can see our GPU being used over here.
I love that so much. Running locally. Cool. We can also add pictures
for multimodal models. I'm not sure CodeGemma can do that; let's try it out real quick. So it can't do it, but there is a multimodal model called LLaVA. Let's pull that down real quick: ollama pull llava. With LLaVA pulled, let's go back to our browser. Once more, we'll refresh it, change our model to LLaVA, and add the image. That's really scary. There we go. That's
pretty cool. Now here in a moment, I will show you how we can generate
images right here in this web interface by using Stable Diffusion. But first
let's play around a bit more. And actually the first place I want
to go to is the admin panel. For users, we have one user, and if we click on the top right, we have admin settings. Here's where a ton of power comes in. First, we can restrict people from signing up.
We can say enabled or disabled. Now, right now, by default it's
enabled. That's perfect. And when they try to sign up, initially they'll be a pending user until they're approved. Lemme show you. So now, real quick,
this server on your laptop or computer or whatever it is, they can access it from anywhere as
long as they have your IP address. So lemme do a new user signup
real quick just to show you. I'll open an incognito window, create
account, and look. It's saying, Hey, you got to wait. Your
guy has to approve you. And if we go here and refresh our page
on the dashboard, there is Bernard Hackwell. We can say, you know what, he's a user; or click it again, he's an admin. No, no he's not. He's going to be a
user. And if we check again, boom, we have access. Now what's really cool is if I go
to admin settings and I go to users, I can say, Hey, you know what? Don't
allow chat deletion, which is good if I'm trying to monitor what my daughters are up to in their chats. I can also whitelist
models. So you know what, they're only allowed to
use Llama 2 and that's it. So when I get back to Bernard
Hackwell's session over here, I should only have access to Llama 2. It's pretty sick, and it becomes even
better when you can make your own models that are restricted. We're going to mosey on over to the section called Modelfiles right up here, and we'll click on create a model file. You can also go to the community
and see what people have created. That's pretty cool. I'm going to show you
what I've done for my daughter, Chloe, to prevent her from cheating.
She named her assistant Deborah. And here's the content. I'm
going to paste it in right now. The main thing is up here where it
says FROM, and you choose your model; so FROM llama2. And then you have your system prompt, which goes between three double quotes. I've got all this telling it what it can and can't do and what Chloe's allowed to ask, and it ends down here with three double quotes. You can do a few more things, but that's the core of it.
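Here's a trimmed-down sketch of the shape (my real prompt for Chloe is longer; the FROM line and the triple-quoted SYSTEM block are what matter):

    FROM llama2
    SYSTEM """
    You are Deborah, a friendly homework helper.
    Guide the student toward answers. Never write essays,
    papers, or assignments for them; if asked, refuse and
    explain that doing the work for them would be cheating.
    """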
I'm just going to categorize it as an assistant for education, then hit save and create. Then I'll go over to my settings once
more and make sure that for the users, this model is whitelisted. I'll add one
more: Deborah. Notice she's an option now. And if Bernard tries to use Deborah and says, Deborah, write a paper for me on the Civil War, he's immediately shut down: hey, that's cheating. Now, Llama 2, the model we're using, is okay, but there's a better one called Mixtral. Lemme show you on Terry. I'll use Deborah, or Deb, and say, write me a paper on Benjamin Franklin. Notice how it didn't write it for
me, but it says it's going to guide me. And that's what I told
it to do to be a guide. I tried to push it and it said
no. So that's pretty cool. You can customize these prompts, put in some guard rails for people that
don't need full access to this kind of stuff. I think it's awesome. Now, Open WebUI does have a few
more bells and whistles, but I want to move on to
getting Stable Diffusion set up. This thing is so cool and powerful. Step three: Stable Diffusion. I didn't think image generation done locally would be as fun or as powerful as ChatGPT, but it's more. It's crazy; you've got to see it. Now, we'll be installing Stable Diffusion with a UI called Automatic1111. So let's knock it out.
Now, before we install it, we've got some prereqs, and one of them is an amazing tool I've been using a lot called pyenv, which helps us manage our Python versions and switch between them, which is normally such a pain. Anyways, the first thing we've got to do is make sure we have a bunch of prerequisites installed. Go ahead and copy and paste this from the NetworkChuck Academy and let it do its thing for a bit.
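It's roughly the standard build-dependency list from the pyenv wiki, something like this (if the Academy version differs, use that one):

    sudo apt update && sudo apt install -y build-essential libssl-dev zlib1g-dev \
      libbz2-dev libreadline-dev libsqlite3-dev curl git libncursesw5-dev \
      xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev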
And with the prereqs installed, we'll copy and paste this command, a curl command that'll automatically do everything for us. I love it. Run that. And then right here it tells us we need to add all this, or just run this command to put it in our ~/.bashrc file, so we can actually use the pyenv command. I'll just copy this, paste it, and then we'll type source ~/.bashrc to refresh our terminal. And let's see if pyenv works: pyenv -h to see if it's up and running. Perfect.
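The whole sequence, spelled out (the exact export lines come from the installer's own output, so copy what it prints for you):

    curl https://pyenv.run | bash   # the pyenv auto-installer

    # lines like these go in ~/.bashrc; the installer tells you exactly what to add
    export PYENV_ROOT="$HOME/.pyenv"
    export PATH="$PYENV_ROOT/bin:$PATH"
    eval "$(pyenv init -)"

    source ~/.bashrc   # refresh the terminal
    pyenv -h           # see if it's up and running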
Now let's make sure we have a version of Python installed that will work for most of our stuff. We'll do pyenv install 3.10, which will of course install Python 3.10, a version everything here plays nicely with. Excellent, Python 3.10 is installed. We'll make it our global Python with pyenv global 3.10.
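Both commands:

    pyenv install 3.10   # build and install Python 3.10
    pyenv global 3.10    # make it the default python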
Perfect. And now we're going to
install Automatic1111. The first thing we'll do is make a new directory, mkdir for make directory; we'll call it stablediff, and then we'll jump in there: cd stablediff. And then we'll use this wget command to grab the webui.sh script. We'll type ls to make sure it's there. There it is. Let's go ahead and make that sucker executable by typing chmod +x webui.sh. Now it's executable, and we can run it: ./webui.sh.
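The whole sequence (the wget URL points at the launcher script in the AUTOMATIC1111 GitHub repo; the exact link is in the guide below):

    mkdir stablediff && cd stablediff
    wget https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh
    ls                  # make sure it's there
    chmod +x webui.sh   # make it executable
    ./webui.sh          # run it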
Ready, set, go. This is going
to do a lot of stuff. It's going to install everything you need for the Stable Diffusion web UI: it'll install PyTorch and download Stable Diffusion. It's awesome. Again, a little coffee break.
Okay, that took a minute, a long time. I hope you
got plenty of coffee. Now it might not seem like it's ready, but it actually is running and you'll
see the URL pop up around here. It's kind of messed up, but it's running on port 7860. Let's try it out. And this is fun. Oh my gosh. So, localhost:7860. What you're seeing here is hard to explain; lemme just show you. And let's generate. Okay, it got confused; lemme take away the Oompa Loompa part.
But this isn't being sped up; this is how fast it is. No, that's a little terrible. What do you say we make it look a little bit better? Okay, that's terrifying. But that's just one of the many things
you can do with your own AI. Now, you can actually download other models. Lemme show you what it looks like on Terry; my new editor Mike told me to do this one. That's weird. Let's make it take more time. But look how fast this is. It's happening in real time as
I'm talking to you right now. But if you've ever made images with
GPT-4, it just takes forever. But I just love the fact that this is
running on my own hardware, and it's kind of powerful. Lemme know in the comments below which is your favorite image; actually, post it on Twitter and tag me. This is awesome. Now, this won't be a deep dive on Stable
Diffusion. I barely know what I'm doing. But let me show you real quick how
you can easily integrate Automatic1111 (did I type enough ones? I'm not sure) and its Stable Diffusion inside Open WebUI. So it's just right here, back in Open WebUI. If we go down to our little
settings here and go to settings, you'll see an option for images. Here we can put our Automatic1111 base URL, which will simply be http://127.0.0.1:7860 (which is the same as saying localhost, port 7860). We'll hit the refresh option over here to make sure it works. And actually, no, it didn't. And here's why; there's one more thing you've got to know. Here we have the Stable Diffusion web UI running in our terminal, and hitting Ctrl+C is going to stop it. In order to make it work with Open WebUI, we've got to run it with two switches. So let's go ahead and run our script one more time, webui.sh, and we'll add --listen and --api.
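That relaunch looks like this:

    # --listen exposes the UI on the network; --api enables the API Open WebUI talks to
    ./webui.sh --listen --api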
Once we see the URL come up... okay, cool, it's
running. We can go back over here and say, why don't you try that
again buddy? Perfect. And then over here we have
Image Generation (Experimental); they're still trying it out. We'll switch it on and we'll say save. So now, if we go to any prompt... let's do a new chat and we'll
chat with Llama 2. I'll say, describe a man in a dog suit; this is for a Stable Diffusion prompt. A bit wordy for my taste. But then notice
we have a new icon. This is so neat. Boom. An image icon. And all we have to do is click on that
to generate an image based on that prompt. I clicked on it, it's doing
it. And there it is right in line. That is so cool. And that's really
terrifying. I love this. It's so fun. Now this video is getting way too long, but there are still two more
things I want to show you. I'm going to do that really quickly
right now. The first one is, it's just magic. Check it out. There's
another option here inside Open WebUI, a little section right here called Documents. Here we can simply add a document; I'll add that one from before, and it's there, available for us. And now, when we have a new
chat, I'll chat with CodeGemma. All I have to do is type a hashtag, pick the document, and say, let's talk about this: give me five bullet points about this. Cool. Give me three social media posts. Okay, CodeGemma... lemme try it again. What just happened? Okay, let's do a new prompt. Oh, there we go.
And I'm just scratching the surface. Now the second thing I want to show you,
last thing: I am a huge Obsidian nerd. It's my notes application; it's what I use for everything. It's a pretty recent obsession; I haven't made a video about it yet, but I plan to. One of the cool things about this very local, private note-taking application is that you can add your own local GPT to it, like what we just deployed. Check this out. I'm going to go to settings, then community plugins, and browse for one. I'm going to search for one called BMO Chatbot. I'm going to install that and enable it.
And then I'm going to go to its settings; I'll have BMO Chatbot there. And right here I can set up an Ollama connection, which is going to connect to, let's say, Terry. So I'll connect it to Terry and I'll choose my model; I'll use Llama 2, why not. And now, right here in my note, I can have a chatbot come right
over here to the side and say like, Hey, how's it going? And I can do
things like look at the help file, see what I can use here.
Ooh, turn on reference. So I'm going to type reference on, and it's now going to reference
the current note I'm in. Tell me about the system prompt. Yep, there it is. And it's actually going through and
telling me about the note I'm in. So I have a chat bot right there, always available for me to ask
questions about what I'm doing. And I can even go in here
and highlight this, do a little prompt, and select generate. It's generating right now, just generating some stuff for me. I'm going to undo that.
Let me do another note. So I want to tell a story
about a man in a dog suit. I'll quickly talk to my chat bot and start
to do some stuff. That's pretty crazy. And this, I think, is just scratching the surface of running local, private AI in your home on your own hardware. This is seriously so powerful, and I can't
wait to do more stuff with this. Now, I would love to hear what you've
done with your own projects. If you attempted this, if you
have this running in your lab, let me know in the comments below. Also, do you know of any other cool projects
I can try and make a video about? I would love to hear that. I think
AI is just the coolest thing, but also privacy is a big concern for me. So to be able to run AI locally and
play with it this way is just the best thing ever. Anyways, that's all I got. If you want to continue the
conversation and talk more about this, please check out our Discord community. The best way to join that is to jump
through our NetworkChuck Academy membership, the free one. And if you
do want to join the paid version, we do have some extra
stuff for you there too, and it'll help support what we do here. But I'd love to hang out with you
and talk more. That's all I got. I'll catch you guys next time.