I trained an AI with over 10,000 memes
to see if AI is funny. Insert your image, wait for the prompt, and automatically get your meme ready. So in today's video I'm going to walk you through how I got this data, how I trained the model, and some strange things that I've encountered along the way. So first, it's worth knowing how unfunny AI can be right now. I found this image here on Twitter showing a graph that an AI was asked to generate a funny caption for: "Started from the bottom, now we're here. Just hope my battery doesn't die before I cash out." Like, I don't even think my grandma
would laugh at that. But before we get started,
we have to ask ourselves: what is a meme, and what makes them so funny? Memes have evolved a ton since the very first internet meme. The very beginning was associated with the rise of user-generated content; the Dancing Baby and Hamster Dance were the kinds of things that showed the internet's potential. Then we have the classic era
that could be associated with the ease of sharing content online, like Chocolate
Rain, the Rickroll, or YouTube Poops. Then the idea of rage comics and advice animals came along. These images were crude, easy to make, and almost made you feel included by repeating faces or lines. And after this, memes could arguably be described just by irony: making fun of old or current trends in completely over-the-top ways, like over-editing gaming videos or adding filters so much that it deep-fries the meme. Memes have changed a ton. I like this quote right here: a meme
is a piece of media that is repurposed to deliver a cultural, social, or
political expression, mainly through humor. As of recording this video, the Apple App Store is currently under scrutiny, so this meme has found a way to be funny. But even being able to capture something like this is funny as well. All right, history class is over. Let's get into the code. First, we grab as many memes as humanly possible. We take the text of the meme and the image and run it through a large language model
to explain everything in detail. We take these images and text data so we can fine-tune a large language model to give a good meme caption based on what it was given. Then we create an interface that's able to create a meme from an image input, related news articles for modern context, and other data. The idea is to make memes that are somewhat funny and at least relevant to the subject matter, which AI can't really do that well. That means when a news story happens, you have a relevant meme ready to go. This is such a stupid thing to work on.

First, the data collection. The issue we have with AI memes, and this is going to be a hot take here: they're just not good. Boring and unfunny. However, there are a lot of websites out there that help users
generate their own memes, as well as create the templates. Like, for example, we have a bunch of trending templates right here that people are using for recurring memes. And what makes this significant is that it provides us with the actual meme in its blank form, with the community making posts about it, so we can see what they think is funny about it. And this helps us understand what the meme is about and the relationship between why it's funny and the caption of that meme. So this gives us a good training framework: provide the blank version of the meme along with the captions given to it, to train on what is considered funny in this particular image. But for now, let's scrape all of these meme templates as well as the memes made from them. We can also grab the meanings behind popular memes on wiki sites like Know Your Meme,
which honestly is the perfect database for these types of things. If you're familiar with web scraping,
you know I can't just use the old methods. A lot of websites force
you to use JavaScript to navigate their whole website,
which makes things really tough. So we have to simulate
a whole entire browser to get the information ourselves
and act like a user. So I'm going to be using Bright Idea
scraping browser, which will be a proxy to my browser
automation tool. I've been using bright Data for a while
now for a lot of my AI related projects, and they were kind enough to sponsor
today's video. The World of Web scrapers
as a gigantic rabbit hole. You'll go down. One of the biggest rabbit holes
is using proxies, by. With bright data helps
you get all of the information that you need using their proxy network,
like the one I'm using right now, a player or their massive proxy
network of IP addresses worldwide. Okay. So I wrote a script here
to start gathering these images. Let's just insert this one line of code
here and look at the remote browser to see if everything's fine. Let's go. I mean, it's kind of cool. It's kind of like, you know, hacker vibes. Now I can just leave this running
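In case you're curious, that connection really is only a few lines. Here's a minimal sketch assuming Playwright; the wss:// endpoint and credentials are placeholders you'd get from your Bright Data dashboard, and the target URL and CSS selector are purely illustrative:

```python
from playwright.sync_api import sync_playwright

# Placeholder CDP endpoint; the real wss:// URL and credentials
# come from the scraping-browser provider's dashboard.
CDP_ENDPOINT = "wss://USER:PASS@brd.superproxy.io:9222"

with sync_playwright() as p:
    # Connect to the remote scraping browser over the Chrome DevTools
    # Protocol, so every request is routed through the proxy network.
    browser = p.chromium.connect_over_cdp(CDP_ENDPOINT)
    page = browser.new_page()
    page.goto("https://example.com/meme-templates", timeout=60_000)
    # Hypothetical selector; adjust it to whatever site you're scraping.
    for img in page.query_selector_all("img.template"):
        print(img.get_attribute("src"))
    browser.close()
```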
for a while. Right. Data can also save captures
and rotate the proxies for me automatically
so I don't even have to worry about that. All right, let's see. Wow. So we're able to grab a ton of meme collections
here, like a whole lot. Now, the reason why having a template with the example captions alongside it is perfect is because it allows us to teach the model why a particular meme is funny. Then, for all of those memes, we can apply their descriptions to the template itself. So when the user decides to get a meme, it will understand which meme to choose based on the relevant context. And all of this work just to make some crappy memes. Like, it's seriously pathetic. Onto the million-dollar question:
what makes something funny? So I'm going to use this model
that's pretty popular on Hugging Face called LLaVA. I hope I'm saying that right. I'm going to have it read the meme and tell me what the photo is about, and LLaVA will give me an incredibly detailed reason why it's funny. Then, for all the memes that template is a part of, I'll transcribe them into text and also make sure they match with that meme during training. Love that. OpenAI
has their GPT-4 Vision model. I don't know what it's called exactly; it's what they use in ChatGPT. But it has a 100-per-day rate limit. Like, I'm sorry here, but I have like a million memes. You think I'm going to wait 10 million years to do this? So I'm going to write this Python script
here that was able to run the whole thing on my own machine, which is great. I used Ollama, which just came out for Windows recently. Highly recommend it if you have a beefy computer.
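The core of that script is tiny. Here's a sketch assuming the ollama Python package with the llava model already pulled; the prompt wording and image path are my own:

```python
import ollama

def describe_meme(image_path: str) -> str:
    # Ask the local vision model to describe the image and explain the joke.
    response = ollama.generate(
        model="llava",
        prompt="Describe this meme in detail and explain why it is funny.",
        images=[image_path],  # hypothetical path to one scraped meme
    )
    return response["response"]

print(describe_meme("memes/example_meme.jpg"))
```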
So, like this one, for example: "The image shows a chalkboard sign on an easel. The sign reads: no hipsters, don't be coming in here with your hairy faces, vegan diets, your sandal-wearing, no-waste mugs, no brews, no hamsters. The chalkboard sign is on the sidewalk in front of a building which appears to be a shop or cafe, with a sign that says hipsters. The sign seems to be humorous and directed at those who fit certain stereotypes often associated with hipsters, suggesting that there will be," my God, "no entry for individuals matching the characteristics listed on the sign. The style of the image is informal, taken outdoors," and my God, that is a long description.

Or this one: "The image shows an open laptop with a screen: all your files are exactly where you left them. The laptop appears to be an Acer model. The desk or table on which the laptop sits has a blurred background but seems to have a brown surface. The overall setting suggests an indoor environment." So, I mean, we have a ton of data here. A concerning amount, to be perfectly honest. Next, we need to fine-tune a large language model
to use our data to create funnier memes. So the one I found here is called SPHINX, which is a multimodal large language model where you can use images as a prompt. This would be perfect, because I want users to submit their own photos as memes. Now, the documentation isn't exactly fun to follow. Fine-tuning large language models
is like turning the knobs on the controls of the large language model itself. Whenever these knobs are turned, it produces a slightly different result, closer to the one you want. So for things like GPT-4, they have fine-tuned versions of their model to serve their purpose as a general AI that can do anything. And for my scenario, I want to adjust my knobs so that it can make a funny meme based off the scenario I give it. To do this, I'm giving the model training data that looks like this: a prompt, the response a large language model would give, and the path to an image.
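As a rough illustration, a single training record might look something like this; the field names and contents are my own invention, since the exact schema depends on how you format the data for your trainer:

```python
# Hypothetical example of one fine-tuning record.
example = {
    "prompt": "Caption this meme template: a man sweating while deciding "
              "between two buttons. Context: <summary of a news article>",
    "response": "Button 1: ship the feature. Button 2: fix the bugs.",
    "image": "templates/two_buttons.png",  # path to the blank template
}
```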
There's this project called Unsloth, which makes fine-tuning models really fast and supports some of the most popular instruct models. So for me, I'm going to use Mistral 7B Instruct v0.2, which at the time has some of the best benchmarks. Since I can fine-tune this locally, it means I can iterate as much as I need to, and the documentation is pretty good. They made it so you can even do it on Google Colab if you wanted to. Sadly, it'll just be trained on the captions and image descriptions, not the images themselves. We're going to be waiting a long time for this.
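If you want a feel for the fine-tuning code, here's a condensed sketch following Unsloth's documented pattern at the time; the 4-bit checkpoint name, LoRA settings, and the one-record toy dataset are assumptions for illustration, not my exact setup:

```python
from datasets import Dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load Mistral 7B Instruct v0.2 in 4-bit so it fits on a consumer GPU.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-instruct-v0.2-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; only these small matrices get trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Toy stand-in for the real meme dataset, already formatted as chat text.
meme_dataset = Dataset.from_list([
    {"text": "<s>[INST] Caption this meme: ... [/INST] funny caption here</s>"},
])

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=meme_dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(per_device_train_batch_size=2,
                           max_steps=500, output_dir="outputs"),
)
trainer.train()
```

LoRA is what makes this feasible locally: instead of updating all seven billion weights, you train small adapter matrices on top of the frozen model.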
So something that I have noticed already is that despite the language changing and some of the captions coming out, you know, somewhat humorous, it's still not funny. Really. So at this point I almost just completely gave up on the project, but then I started to realize a truth that made me a bit uncomfortable. Maybe AI doesn't know how to be funny because we human beings don't really know how to explain what's funny in the first place. Let me explain. Why is it
that this image here is not funny? "Did someone order a fillet minion?" You seriously didn't laugh at that, did you? But then, when you do this to an image, maybe a slight chuckle happens. This is at least funnier than this image. With this specific example, deep-frying a meme is a form of parody of the low-quality types of images you see, juxtaposed with content that might be familiar to us. This led me down a gigantic rabbit
hole of why things are even funny in the first place. Reading the theories, I decided that
this would be the perfect way to add an element of surprise
to the meme as well. So I decided to split them into six different humor types: unexpected, exaggeration, absurdity, wordplay, juxtaposition, and incongruity. Incongruity, incongruity. Right. I hate that word. So instead of just telling the AI "be funny," we are now telling the AI how to be funny in a very specific way, without the user knowing. So the AI is adding humor by just being random, I guess you could say. Very quirky.
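Concretely, the trick is just injecting one of those categories into the prompt at random. A tiny sketch, where the category list is the one from above and the prompt wording is my own:

```python
import random

# The six humor styles; one gets picked without the user knowing.
HUMOR_TYPES = ["unexpected", "exaggeration", "absurdity",
               "wordplay", "juxtaposition", "incongruity"]

def build_prompt(image_description: str) -> str:
    style = random.choice(HUMOR_TYPES)
    return (f"Write a short meme caption for this image: {image_description}\n"
            f"Make it funny specifically through {style}.")
```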
Part of why memes are so funny in the first place is not because of what the caption is. When we're talking about memes, the large language model knows how to copy the language that these memes produce, but that's about it. A huge part of memes is the context that goes with them, which is massive when it comes to the virality of memes. And one of the biggest sources of
memes is current events that go on in the world. So a feature I'd love to try is getting
the most current events in the world right now and using that as a context
for my large language model. Let's give it a try. I'm going to use Bright Data's proxy network again, plugging it into my web scraper to get news articles really easily. Again, let's just copy and paste this line of code and we're off to the races.
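Under the hood, that's more or less one HTTP request routed through the proxy. A sketch with placeholder credentials, since the exact endpoint format depends on your proxy zone:

```python
import requests

# Placeholder proxy URL; substitute your provider's host and credentials.
PROXY = "http://USER:PASS@proxy.example.com:22225"

def fetch_article(url: str) -> str:
    # The news site sees the proxy's rotating IP instead of mine.
    resp = requests.get(url, proxies={"http": PROXY, "https": PROXY}, timeout=30)
    resp.raise_for_status()
    return resp.text  # raw HTML, to be parsed for headline and body
```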
So the memes are starting to come out really funny, especially when I attribute them to modern events that are happening. So let's just find a way to make this so everyone can use it. Now, a meme format
that is super common amongst all of the internet is labeling objects
within a scene. So you'll often see it as a picture of something, and then labels over top of the objects to represent some sort of metaphor for what's happening in that scene. This is usually an extreme scenario of some sort that is represented as a relatable scenario. So I was thinking about creating some custom code
that could recreate these types of images really easily. Now, because you're seeing the highlight
reel, this was much harder than expected. Let me explain. First,
we get the information from the news link. We get the meme that it most relates to for the most context. We then add a category of humor to it. If you're interested in the more raw
and technical details, I have a second channel
where I go into the nitty-gritty. Check it out if you want. So when it comes to the images we create, we need to be able to locate these objects within the scene. Throughout my entire project, I tried using something called YOLO, which is a fantastic library for object detection. I'm sure it'll be fine for most scenarios where you'd use it; it's seriously a great library, I recommend it. But it's unable to identify what this thing truly is, because it's trained on real data. I mean, one time it said it was a dog. So this
model called OWLv2 combines large language model reasoning and understanding with open-vocabulary object detection, which is pretty crazy. And this is perfect for my use case, because it allows us to identify objects of interest right at the beginning of the software. So we just have to change a couple of lines of code and we're good.
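Those couple of lines look roughly like this, using the OWLv2 checkpoint on Hugging Face through the transformers library; the image path and text queries are just examples:

```python
import torch
from PIL import Image
from transformers import Owlv2ForObjectDetection, Owlv2Processor

processor = Owlv2Processor.from_pretrained("google/owlv2-base-patch16-ensemble")
model = Owlv2ForObjectDetection.from_pretrained("google/owlv2-base-patch16-ensemble")

image = Image.open("scene.jpg")  # hypothetical input image
queries = [["a person", "a car", "a street sign"]]  # free-text labels, not fixed classes

inputs = processor(text=queries, images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits into boxes and scores in the original image's coordinates.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs, threshold=0.3, target_sizes=target_sizes
)[0]
for box, score, label in zip(results["boxes"], results["scores"], results["labels"]):
    print(queries[0][label], round(score.item(), 2), box.tolist())
```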
The last part is to create the user interface behind this whole entire operation. For this, I'm going to use what everyone is using nowadays: Streamlit, which is this open-source project in Python that gives you a bunch of cool UI elements to build applications easily. What's great about this is that I was using just the command line to run my scripts over and over and over again. Trust me, it was horrible. But rather than having to build a whole API and create a front end, I can just plug in my functions and it deals with everything, all the UI elements, for me.
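To give you an idea, a stripped-down version of the app is about this much code; generate_meme here is a placeholder standing in for the whole caption-and-label pipeline:

```python
import streamlit as st

st.title("AI Meme Generator")

uploaded = st.file_uploader("Upload an image", type=["png", "jpg", "jpeg"])
news_url = st.text_input("Optional: a news article URL for context")

if uploaded and st.button("Make me a meme"):
    with st.spinner("Generating..."):
        # Placeholder for the real pipeline: describe the image, pick a
        # humor type, caption or label it, then render the final meme.
        meme = generate_meme(uploaded.read(), news_url)
    st.image(meme, caption="Your meme")
```

Then you just run `streamlit run app.py` and it serves the page locally.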
Okay, I have something that is sort of done and ready to be shown off. Again, all of the code is open source and you can try it out yourself. And here we go. So here's the application right here. "Candy faces uncertainty due to Chinese imports." Okay, I think that'd be a good one. And I have four demo images I kind of want to give a try with this whole labeling feature. So let's try this one right here. So it came back with "licorice glass." Okay, never mind. Now let's see if we do the Drake and Kendrick Lamar beef and see if we can get a meme out of it. So I'll put it in there. Hopefully a Wikipedia article works, I don't see why not. Oh my God, if it can do this one,
it'll be so funny. "A kind of guy, right?" I mean, this is not really necessarily funny. I mean, I kind of expected it to be. Like, it's funny how it just put it in this type of language: "Taylor Swift debuts revamped Eras Tour set list with..." Yeah, you get the point. Okay, let's
try this starting car right here. Sometimes it doesn't come out good. I mean, this image wasn't necessarily
going to be funny. I do have some code in there
that makes it so it can do a caption rather than, like, a label. But sometimes it just doesn't work out. Outside of this, though,
here are some of my favorites that came out
that you may have already seen before. Now, this video was tough for me because I wasn't sure if I was even going to release it, although I had a blast working on this project. But as you can obviously tell from the different backgrounds all over the place, like future and present, it took so long to make this. Sometimes I see people online saying that, like, you know, I'm a great programmer, I could run circles around people. Like, it's just not true. Part of me wanted to release this video because it was just a nice way
to show that, you know, I make mistakes, I write bad code, and sometimes it doesn't work. Again, I have my second channel as well, where I go into more detail. Check out the livestream as well. I also have a hackathon coming up soon
here, so be on the lookout for that.