Consistent Characters in the New ChatGPT-4o, A Deep Dive

Captions

OpenAI's spring announcement introduced GPT-4o, a revolutionary multimodal flagship model, and its desktop app for Mac. The "o" stands for "omni", which represents the model's multimodal capability to reason across voice, vision, and text in real time. ChatGPT-4o will be available to both free and paid users, with paid users enjoying higher limits and access to voice chat. Free users will now have access to features previously exclusive to paid users, including vision capabilities, the memory function, data analytics, web browsing, the GPT Store, and custom GPTs. Currently, ChatGPT-4o is available to select users through the API and is being gradually rolled out on the ChatGPT website.

OpenAI's demos highlighted the voice feature and humanlike conversations with the GPT, but a notable introduction on their website was the prompting examples under "Exploration of capabilities". One example that caught my attention showcased creating consistent characters through interactions with the GPT. In this example, an image of a mail lady named Sally was generated, and subsequent generations referred to her by name, producing consistent results. We're going to try to replicate this example.

We're now on the ChatGPT website, which has a new URL, chatgpt.com, previously chat.openai.com. If GPT-4o has been rolled out to your account, you'll see it in the model selection dropdown. As a paid subscriber, I'm unsure if the new GPT is available to free accounts yet, but OpenAI promises it will be available to free users starting May 19th. In OpenAI's announcement there was no mention of DALL-E 3 being freely available to all users, so I turned to ChatGPT for clarification. Unfortunately, the response indicated that DALL-E 3 will only be accessible to paid users for the time being. As it stands, a paid subscription is required to generate images in ChatGPT, including with custom GPTs for DALL-E 3. If you have a free account and have access to DALL-E 3, please let us know in the comments.

So now let's copy the prompt from the first example. It says that these are standalone conversations, so let's download the image and start a new chat. I'm going to upload the image and enter the input from the example. As you can see, the memory feature has been updated. This feature allows users to teach ChatGPT to remember details and preferences from their conversations. Previously exclusive to paid users, it's now available for free. You can manage the memory by clicking here, where you can delete individual data or clear the entire memory. The image description has been saved, as shown.

I'll copy and paste the prompt for the next image, hoping Sally's appearance will match the reference image. Well, I don't know who this is, but it's not Sally. Maybe the next prompt will succeed. Her outfit is more consistent, and she's indeed a woman. However, generating more images using the example prompts yields inconsistent results. Honestly, I doubted that simply using a name in the prompts would produce consistent characters, and it seems I was right. In the past, you needed to reference the seed number or generation ID of the desired image and include it in the prompts to achieve consistency. If you have any insights into what went wrong, please share your thoughts in the comments.

When creating consistent characters in DALL-E 3, three methods are commonly employed. The first is to reference the seed number, a unique code that initializes image generation, ensuring reproducibility and variations; using the same seed number will generate the same image or its variations. The second is to reference the generation ID, a unique identifier assigned to each generated image that distinguishes it from others, even those generated using the same seed number; it is useful for referencing specific images and tracking changes. The third is prompt refinement through feedback looping, a process where user feedback is used to refine the prompt, achieving consistency through iterative adjustments. This method can be combined with either the seed number or the generation ID. We will test all three approaches to determine their effectiveness.

Before we begin, I'll seek guidance on creating consistent characters. According to the GPT, providing details like name, age, etc. is best for maintaining consistency across image generations. Next, I'll ask the GPT to create a character description of a young boy, which will serve as the basis for generating images with a consistent appearance. With this information, I'll create a prompt for DALL-E 3, modify it if necessary, and start generating images. This first image is a good match for the description, so I'll use it as a reference for the remaining images. I'll also retrieve the seed number and use it in every prompt.

Let's experiment with changing the boy's facial expressions. First, I'll generate an image of the boy looking afraid, using the same seed number to maintain consistency. The result looks similar except for the eye shape, which is likely influenced by the afraid expression. Next, I'll request a happy expression. However, the boy's appearance is drifting away from the reference image, particularly his eyes. It seems like the generator is carrying over the eye shape from the previous image. To address this, I'll inform the GPT that the boy looks different from the original image. Despite its efforts to match the reference, the
results aren't consistent. Perhaps our description needs more specificity. Let's ask it to describe the first image in detail, focusing on the boy's appearance. Then I'll ask it to remember the description, and it will be saved to memory. With this refined description, let's try generating a happy image of the boy again. This time the result looks more consistent with the original image.

I'll continue requesting images with different expressions and scenes, providing feedback whenever the results are inconsistent, and sometimes the prompt was automatically updated. For instance, when I asked for an image of the boy laughing, the initial result had him looking different, with his hair parted on the wrong side. I informed the GPT of the issue, and after a few more generations I finally got a consistent image of the boy laughing. Sometimes it will repeat the same inconsistent image, but occasionally instructing it to modify the prompt resolves the issue.

My next attempt was to have the boy hold a baseball bat, but it didn't work out. At first I thought widening the image might help, but that didn't work either. Then I asked the model to zoom out and, surprisingly, the baseball bat appeared, but the boy's appearance had changed. After providing feedback that the boy looked different, I eventually got a consistent image. I'm unsure if the image width or zoom-out instructions made the difference, or if simply rerolling the prompt was the key.

I also tested the model's ability to maintain consistency with the boy wearing different clothes. As you can see, the red shirt was changed to blue, and while the rest of his clothes remained the same, his face shape appeared wider. After a few generations I got an image that was close but not quite satisfactory, so I asked it to review our conversation so far and suggest additional details to include in the boy's description for a consistent appearance. ChatGPT generated an updated prompt, which I used to get a new image. While the boy looked like our character, he appeared a
bit older. After rerolling the prompt once more, I finally achieved a consistent look. However, I wanted him facing forward, so I provided further feedback on what was wrong with the image. After some persistence, I finally got a consistent image of the boy facing forward and wearing a blue shirt. Based on the generations so far, using the seed number has yielded good results, though not perfect.

Next, I'll test the generation ID, using the same prompt and following the same steps as before. Interestingly, I didn't notice a significant difference between the two methods. In both cases I had to provide feedback to refine the prompt; sometimes I received a good result on the first try, while other times I had to reroll the generation multiple times.

Now I'll test the feedback loop method, which is essentially what I did previously, but without referencing a seed number or generation ID. However, achieving a consistent look for the character proved more challenging, even with feedback on what was wrong with the images. After numerous generations, only two images maintained a consistent appearance. I believe the seed number and generation ID methods, combined with refining the prompt through feedback, are the most effective of the three methods. However, I wonder if there's a more efficient approach.

While not necessarily better, DALL-E 3's image editing tool is certainly helpful. To access it, simply click on the image you want to modify and select this icon to make a selection. You can adjust the selector size, then click and drag the circle over the area you want to change. After making your selection, enter the prompt to modify the selected area. For instance, I'll type "blue shirt". The boy looks identical but now wears a blue shirt. But what about changing facial expressions? Let's try a sad expression. I'll select his whole face, as most of it would change to convey a sad expression, like sad eyes and a frown. The prompt will be "sad expression". Unfortunately, that didn't quite work. Maybe I need to refine the prompt.
Let's start again, selecting the whole face, and this time I'll use the original prompt that generated the image. I'll paste and modify it to get a sad expression. That didn't work either. Perhaps I'm selecting too much, so this time I'll select only the mouth and use the prompt "sad expression". Ah, that's what I wanted: our boy with a sad expression, but otherwise identical. I'm curious about the prompt ChatGPT used to generate this image. It's the same as the original prompt, except it changed "bright red shirt" to "bright blue shirt" and added details for a sad expression, including some I didn't specify, like slouching. It might have even incorporated parts of my previous prompt, like "big frown" and "sad eyes". You can also use this tool without making a selection, as I did in this example.

Remember, earlier in this video I attempted to create consistent images of a mail lady following the prompts on OpenAI's website, but failed. This time I'll try using the seed number to achieve consistency. I didn't refine the prompts, as I wanted to closely follow the example prompts on the website. Starting with the first prompt, I generated an image, then asked ChatGPT to describe the image, focusing on the woman's features. Next, I asked it to craft a prompt for DALL-E 3 based on this description to generate more images of the mail woman with a consistent appearance. I combined this prompt with a modified version of the next example prompt and added the instruction to use all the features of the woman from the image with the seed number. So what do you think? Do you agree that these images look more consistent than my initial attempt? I'm pleased with the improvement, and I think the seed number played a key role in achieving this consistency.

After experimenting with various approaches, I've come to a key realization: none of these methods are foolproof, and you should expect to refine prompts and reroll generations. However, leveraging ChatGPT can significantly help. Ask it to describe your character and even
generate a prompt based on an image. I noticed that in my last example, achieving a consistent look required fewer generations, possibly because I utilized ChatGPT's prompt earlier in the process.

Notably, I didn't discover any new features in ChatGPT-4o that would make creating consistent characters easier. All the techniques I used in this video, such as image description, memory, seed number, generation IDs, and prompt refinement through natural language interactions, were already available in previous models. This leaves me wondering how the OpenAI team managed to get consistent images of Sally using only the provided prompts. Have you cracked the code on creating consistent characters using some new approach introduced in GPT-4o or DALL-E 3? If so, we'd love to hear about your approach in the comments below. If you found this video useful, please hit the like button and subscribe to our channel. Thank you for watching.
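
The workflow described in the captions (a fixed character description, a seed number or generation ID typed into the chat, and corrections accumulated from feedback) can be sketched as plain prompt assembly in Python. This is only an illustration under stated assumptions: `CharacterSheet`, `add_feedback`, and `build_prompt` are hypothetical names, the description text and the seed value are placeholders rather than values from the video, and the sketch only builds the chat message text. It does not call any OpenAI API, and it does not assume that a seed or generation ID is an official API parameter.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CharacterSheet:
    """Hypothetical container for the details repeated in every prompt."""
    name: str
    description: str                  # detailed appearance, e.g. one written by ChatGPT
    seed: Optional[int] = None        # seed reported for the reference image, if used
    gen_id: Optional[str] = None      # generation ID of the reference image, if used
    corrections: List[str] = field(default_factory=list)  # feedback gathered so far

    def add_feedback(self, note: str) -> None:
        """Record what was wrong with the last image; skip blanks and duplicates."""
        note = note.strip()
        if note and note not in self.corrections:
            self.corrections.append(note)

    def build_prompt(self, scene: str) -> str:
        """Compose one chat message: scene, fixed description, reference, fixes."""
        parts = [
            f"Create an image of {self.name}: {scene}.",
            f"Character description: {self.description}",
        ]
        if self.seed is not None:
            parts.append(f"Use seed number {self.seed}.")
        if self.gen_id is not None:
            parts.append(f"Reference generation ID {self.gen_id}.")
        if self.corrections:
            parts.append("Keep these corrections: " + "; ".join(self.corrections) + ".")
        return "\n".join(parts)


# Placeholder values for illustration only (not taken from the video).
boy = CharacterSheet(
    name="the boy",
    description="young boy, short brown hair with a side part, bright red shirt",
    seed=1234,
)
first = boy.build_prompt("laughing")
boy.add_feedback("hair must stay parted on the same side as the reference")
second = boy.build_prompt("holding a baseball bat")
print(second)
```

Keeping the description and the reference in one place means every new scene repeats them verbatim, and each piece of feedback carries forward instead of being retyped. As the captions note, rerolls may still be needed.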

Info

Channel: AI Concoction

Views: 2,801

Keywords: openai, texttoimage, text to image, character design, tutorial, ai

Id: ubIkh_sJff4

Length: 14min 37sec (877 seconds)

Published: Mon May 27 2024