Prompt Engineering - A New Profession?

Captions
And now for something completely different. I was getting close to it all, kind of getting the virtual avatars, doing things, scanning myself, but it was still annoying for a lot of the tasks: cleanup, rendering, and so on.

OK, so by now I guess most of you in the tech world have heard about DALL·E: you type in a text and it gives you an image. The image doesn't need to exist; it doesn't care. It just builds on top of so many different images that exist in the world, and it builds something creatively new. DALL·E: OK, "an astronaut on a horse". You can type in different things, but it might get things wrong, things that aren't actually possible in the real world (although this one could be possible, especially in Belgium, but it is not something that's actually happening). And some weird stuff that you'd probably never see, like Luke Skywalker removing the sand with a vacuum cleaner.

So this DALL·E is part of OpenAI, and OpenAI is pretty much closed source. Then came another way of interfacing with this kind of model, of querying it: instead of an API they used a chat bot, and it was called Midjourney. But then, lo and behold, nobody is going to stop at that kind of closed software; there's always that one group that will think about making things open source. Thankfully: Stable Diffusion. They use a slightly different model than DALL·E, but at least you can download the software and run it yourself. Training it yourself would take forever (they spent, what was it, $600K on generating the model, so good luck at home), but at least it's reproducible, and you don't have to build a whole model to use it. You can just download it and generate your own images, without relying on an API or getting credits from somewhere else. So again open source prevailed, and it is now surpassing OpenAI, which wasn't really that open, with real open AI.

You can even search the dataset that was used, because sometimes you don't know what images are in a dataset, and for copyright reasons you want to know. And as you've probably heard in many of the data and machine learning talks, the data that you put in will determine what is generated. Here is a dataset that got adapted for Japanese culture: the image on the left is "a salaryman, oil painting" in a European style, and next to it the same prompt in a Japanese style, but they needed a different set of images to do this.

And of course the whole explosion after this: whether it's your Photoshop or your Figma, all the graphical tools all of a sudden have this. You don't have to design things anymore; just type the text and the image will appear. Well, almost. And they're able to do cool things: you type in one text, then you type another text, and then you type in all the text. So first "a brightly lit room", then "standing in the church", and you just keep adding things to the scene. Again, pretty mind-blowing, and this technology is available to everybody; it's not something you have to pay a lot of money for.

And then it started to get completely creative. You generate a baby Yoda, you generate a galaxy, and then you start connecting the images with pieces in between. So again, first more stars, because you can never have enough stars in Star Wars, thank you. And then you start removing things, so you let the AI fill the gap: a seamless transition between the two generated images.
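As a concrete illustration of the "download it and run it yourself" workflow mentioned above, here is a minimal text-to-image sketch using the Hugging Face diffusers library. The model id, prompt, and sampler settings are illustrative assumptions, not something shown in the talk:

```python
# Minimal Stable Diffusion text-to-image sketch using the Hugging
# Face diffusers library. Assumes `pip install diffusers transformers
# torch` and a CUDA-capable GPU; the model id is one public Stable
# Diffusion checkpoint, chosen here only as an example.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision so it fits consumer GPUs
)
pipe = pipe.to("cuda")

# The "prompt engineering" part: the text is the whole interface.
prompt = "an astronaut riding a horse, photorealistic"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("astronaut_horse.png")
```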
Maybe a little bit closer to home for the people creating storyboards, who are always kind of annoyed that they can't find the image they want to use in the storyboard: this was an experiment, "a customer paying at the kiosk, in black-and-white fine-line style". So think about it as a creative tool, not just for your Photoshops and so on.

And we talked about the game engines: in the game engines they also started creating plugins. There's a wall in the game engine and we just want to have a poster there; type in what poster you want, in what style. Quite easy, so you don't have to copy and paste the image or go look for it. And then for brainstorming, maybe the visual cues inspire you from one thing to the other.

But it doesn't always have to be creative images. It could also be medical images: maybe you need a bigger dataset, or you have things that you need to test. So this generative work is not just confined to creative art.

What about a house? You always want to express to the architect, "this is the house that I really want", so you start typing the text and it just generates a house style that doesn't exist, because you can't find the image on the internet, but it's exactly what you want. And you can go beyond 2D: for 360° images it's a similar thing.

Then you can do different things with the text: you can type what you want to select, to cut, or to mask. Here you would type "select business suit", and all of a sudden, instead of having to manually select all the dots, it knows this is the business suit, and you can just say delete this, or replace it with a battle suit.

Some more fun ways: you take your Lego, you build it, and you actually create a realistic image out of it for your kids, so they know how it's going to look in the real world. And even storytelling: one scene, I take an image, and then the story actually guides the image. I just write the next part of the story and I can keep building on this. And even in a scene where I'm a cameraman and I have one shot, I can ask it, how would this look from a drone, how would this look from the side, and the AI starts generating that. Otherwise you'd have to go back to the scene and start thinking about what you didn't shoot.

So it's almost as if this is a new job, prompt engineering: you just type in text and magically the things you actually wanted appear. Somebody said (maybe it came more from Copilot, from GitHub) that programming is starting to become a conversation with your computer: you express the intent that you want, and the computer will try to generate what you want. So instead of writing what we know as real code, or hardcore code or whatever you want to call it, this is a new way of getting a result.

There are meetups, there's a newsletter, and there are search engines: you want to search for "wolf ears", it gives you the images, and then you want to know which prompts were used to generate those images, so maybe you can write better prompts as well. You can even sell prompts if you have a very descriptive one. There is also completion: you type part of a prompt and it suggests a lot of different styles you can add, so you get better at typing the prompts, like a Grammarly for prompts. And then you have macros: you group certain images and say this is a concept, and then I want to have that concept used in my prompt. So I know "cat-toy" was a bunch of pictures, I know the style I specified was a bunch of pictures, and this is how they actually get used together. So it's using concepts, a kind of grouping of different generated images.

Sometimes the result looks too realistic, or too strange, the uncanny valley, so it's useful to give it a little bit of a negative prompt, so that it's not that clean an image.
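The text-driven selection ("select business suit") and the negative-prompt trick described above can be sketched with public models: CLIPSeg to turn a text label into a mask, then a Stable Diffusion inpainting checkpoint to replace the masked region. The model ids, the 0.5 threshold, and the prompts are assumptions chosen for illustration:

```python
# Sketch of the txt2mask idea: CLIPSeg turns a text label into a
# segmentation mask, then an inpainting pipeline repaints only the
# masked region from a new prompt. Model ids and the 0.5 threshold
# are illustrative; a GPU is assumed.
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation
from diffusers import StableDiffusionInpaintPipeline

image = Image.open("person.png").convert("RGB").resize((512, 512))

# 1. "Select business suit": text-driven segmentation with CLIPSeg.
processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
segmenter = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")
inputs = processor(text=["business suit"], images=[image], return_tensors="pt")
with torch.no_grad():
    logits = segmenter(**inputs).logits  # low-resolution relevance heatmap

mask = torch.sigmoid(logits) > 0.5  # threshold the heatmap into a binary mask
mask_img = Image.fromarray(
    (mask.squeeze().numpy() * 255).astype("uint8")
).resize(image.size)

# 2. "Replace it with a battle suit": inpaint only the masked pixels,
#    with a negative prompt to steer away from uncanny artifacts.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")
result = pipe(
    prompt="a futuristic battle suit",
    negative_prompt="blurry, deformed, low quality",
    image=image,
    mask_image=mask_img,
).images[0]
result.save("battle_suit.png")
```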
And then again, we're engineers, so we want to reverse engineer things: we have the image, so what could have been the prompt for this image?

But it doesn't stop at images. Again, this is the explosion since I gave the last talk in maybe August; everything I'm talking about here is brand new since then. Google started to do text-to-video, because why not, you have the same data. Meta couldn't stay behind; they have the same thing, but they can also create a moving image from a static image. You can specify a few images and it creates a transition from one image to the other, and you can take one image and it creates multiple variations of that image.

But what about texture, not just the movement, not just the image? I want to say "a shiny metal swan", so I'm specifying what I want as the texture to use. Or maybe I have a video of a tennis court, but I want to change the background by specifying a prompt. These are things that exist, right? This is not some random thing; this is really existing software.

The same thing can be used to generate motion. A random figure all of a sudden walks a certain way, and you remember that before, I had to buy all these motions; now I can just generate them from the images. "Throws a ball": just generate it. And I can give it a sequence: first there is a pretty realistic bear swimming, then he pauses on the water, and then I want to keep him swimming, and so on. So it's a complete story that I'm specifying to generate a video from.

And then from that image you can get a 3D model, and you can change the lighting to change how the image is rendered. Or you start with the 3D model and start adding things: this is no longer a generated picture or a motion or a video, this is a 3D model that I'm specifying with text, and I can give it different properties. How about this: I have an image, it got generated, then I create a 3D version out of it, and then I start manually editing in 3D how the scene needs to look. Whoa. Sorry, but this kind of thing keeps blowing my mind. And why stop at an image? Why not create a whole world? I took a background image and did a 360° scan to create a whole world.

And this is an example of how video editing will start looking: a city street, boom, make it look more cinematic, boom, remove the object. It completely changes the manual labor that we're used to. Maybe Google will start changing too: it will not just ask us for a query, it will ask us what AI we want to have applied, what we want to do. And in the future, as always, the engineers will ruin it: we're starting with simple prompt engineering, and you'll probably end up with prompt flow, prompt lake, prompt queue, complex stuff. But hey, let's enjoy the moment right now.

Here is a bunch of books I recommend; I'll keep the slide up later. To summarize: my simple journey of creating myself for a video has not ended, but I was amazed by all the things that I learned and by how things keep changing. When you're working at, let's say, a regular company, you have a form and HTML and an API; this field is completely different, so I had to learn so many new things to get going. I hope this actually inspires you to start looking at this technology, at the real code, and maybe start working on it and contributing to it.
My hope is to have a conference about this here in Belgium, somewhere in March or April, where I try to get engineers in the room who can explain what is actually happening in the things I've shown, because I just gave an overview of everything people are doing. I hope I can see you there. You can follow me on Twitter to get more announcements, and if this talk was of any interest to you, please come talk to me afterwards. Thank you for your time, and go enjoy the movie later. [Applause] [Music] Thank you.
Info
Channel: Patrick Debois
Views: 7,323
Keywords: promptengineering, dall-e, stable diffusion, motion, runwayml, txt2mask
Id: w102J3_9Bcs
Length: 15min 21sec (921 seconds)
Published: Sun Oct 16 2022