Parseq tutorial 1: Fine grained control of Stable Diffusion and Deforum – Keyframing & Interpolation

Captions
Hi everyone, welcome to the first Parseq tutorial video. Parseq is a tool to help you create AI-generated animations like the ones you're seeing at the moment. More specifically, Parseq is a parameter sequencer that integrates with the Deforum extension for the Automatic1111 UI for Stable Diffusion. It gives you a lot of control over the parameters you can feed into the AI image generation process, enabling you to create detailed changes in camera position, prompts and other parameters, as well as helping you synchronize the animation to music.

The interface can be a little intimidating at first, but I'm hoping to show you it's quite straightforward once you get the hang of it. I'm the author of Parseq, so if you have any feedback or suggestions, don't hesitate to get in touch in the comments.

Before we get started, there are a few things you need in place. Firstly, make sure you can browse to "sd-parseq.web.app" and that you can see something similar to this. I'm also assuming you have access to a working version of the Automatic1111 UI, either locally or via Colab, it doesn't really matter which. And finally, in that Automatic1111 UI, you should have the Deforum extension installed, so you can see the Deforum tab. Make sure you can find the Parseq section somewhere; it should be under Init, right at the bottom of the screen.

And this is the video we're going to create today. It's a simple animation with a single prompt and a few camera movements. The result is pretty simple and something you could do without Parseq, but following along will help you get familiar with the basic concepts of Parseq so you can go on to do more complex things. In subsequent tutorials we'll look at basic beat synchronization, more advanced prompt manipulation, and advanced audio synchronization using the audio analyzer.

Let's take a little tour of the Parseq UI. Most importantly, up here, if you ever want to buy me a coffee there's a button right here to do that. :) Shout out to the 3 or 4 people who've done that already, thank you very much. There's also a link to the docs here, which are quite complete and a good reference once you've got to grips with the basics.

Over here at the top left is a status view that shows you the current status of Parseq's processing. As you'd expect, green means good. There are also links to some more advanced parts of Parseq, which we'll cover in future videos.

The top-right section up here is a document manager. This is where you can do things like revert to a previous version of the document you're working on. You can load other documents you've worked on on this system; everything is stored in browser local storage, unless you explicitly upload and export things. And you can also load Parseq documents that have been shared with you.

And finally, you can share your own documents: you can either upload to a URL, which you can share with people, or you can just copy and paste the keyframe definitions, which people will be able to load into Parseq using the load dialog we saw previously.

Just below that is the document title, which you can edit. Parseq will pick a name for you by default, but you can change it to whatever you like.

Below that is the prompt section, where you can put in any prompt that works with Automatic1111, including weights, composable diffusion, and so on. Parseq also allows you to use Parseq formulae in the prompt.
We'll talk more about that in a future video, but it gives you a lot of control over the way the prompt evolves over the course of the video.

Next, we get into the real heart of Parseq, which is this grid here. This is where you define the keyframes that will result in parameter changes over the course of the video. Each row is a keyframe; each pair of columns represents a parameter, or field. "Parameter" and "field" are terms I use interchangeably for these values. We'll learn why each field has two columns soon, but in short, one of them defines a value at that keyframe, and the other defines the interpolation, which is basically a mathematical formula describing how the value should evolve between keyframes.

Below that is a graph showing a visualization of the parameters. The nodes represent keyframes, and between the keyframes are the actual rendered values that will be used at each frame. You can edit keyframes directly on the graph, you can add keyframes as well, and you can also remove keyframes by shift-clicking. The instructions are right here. In some cases, editing the nodes directly on the graph won't do exactly what you expect, because the value of the keyframe is actually being influenced by the formula, not just the position of the node, but that'll make more sense as you use Parseq and get to grips with it.

You can choose to show either absolute values or a percentage of the maximum value. That's useful if you're trying to compare values that have very different absolute values. You can also show and hide fields by clicking on the sparklines here, which is handy. The sparklines give you an overview of how the different parameters change. By default, the ones that are completely flat and don't have any variation are hidden, so you can expand out to see all of the parameters by ticking that box.

And then finally, at the bottom is the rendered output. This is the actual JSON data structure that Deforum will use while it's rendering to get the actual value of each of those parameters during the rendering process. You don't really need to understand what's going on here, but you might find it useful in the future when you're trying to debug, find out exactly why something is the way it is, or check that your prompt is the way it should be.

There's also another copy of the status screen here, so you don't have to keep scrolling up and down. I will probably move that somewhere more sensible, like a fixed footer, in the future.
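If it helps to make the grid and the rendered output more concrete: conceptually, Parseq expands a handful of keyframes per field into one value for every frame. Here's a rough Python sketch of that idea using the default linear interpolation; it is not Parseq's actual schema or code, and the structure and names are purely illustrative.

```python
# Rough sketch of the keyframe-and-interpolation idea, NOT Parseq's actual
# schema or code; the structure and names here are purely illustrative.

def render_field(keyframes, total_frames):
    """Expand a few keyframes into one value per frame using linear
    interpolation, holding the last keyframe's value after the final
    keyframe. Assumes there is a keyframe at frame 0."""
    keyframes = sorted(keyframes, key=lambda k: k["frame"])
    values = []
    for frame in range(total_frames):
        # Find the keyframes surrounding this frame.
        prev = max((k for k in keyframes if k["frame"] <= frame),
                   key=lambda k: k["frame"])
        later = [k for k in keyframes if k["frame"] > frame]
        if not later:
            values.append(float(prev["value"]))   # hold the final value
            continue
        nxt = min(later, key=lambda k: k["frame"])
        t = (frame - prev["frame"]) / (nxt["frame"] - prev["frame"])
        values.append(prev["value"] + t * (nxt["value"] - prev["value"]))
    return values

# Two keyframes for a hypothetical field, e.g. a z-translation ramp:
zoom = render_field([{"frame": 0, "value": 10}, {"frame": 120, "value": 128}], 121)
print(zoom[0], zoom[60], zoom[120])   # 10.0 69.0 128.0
```

Cubic spline and step interpolation, which come up later in this tutorial, only change how the values between keyframes are computed; the keyframes themselves stay the same.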
Alright, that's a whirlwind tour of the Parseq interface. Don't worry if that doesn't all make sense quite yet, or if there are things you won't remember: that's fine, you'll get familiar with it as you use Parseq.

Okay, let's get started creating our video. The first thing I'm going to do is start from a blank slate. When you open Parseq, the default document already has quite a lot of content. It's intended to give you an idea of what's possible with Parseq, but today I'm going to nuke all that and start from scratch. There are a few ways of deleting everything, and soon I'll add a button to make it really trivial, but what I'm going to do right now is import a blank document, which I'll share with you in the comments. It's conveniently in my clipboard, so I'm just pasting it and importing it. I'll just tweak the name.

And you can see the document is pretty much empty: the prompts are blank, and I have just two keyframes, at frame 0 and frame 120, which are the first and last frames of the video. The only field that's set is the seed, at -1. Everything else will just use default values.

Next we need a prompt. I wouldn't recommend figuring out your initial prompt in Parseq, which is why I've switched over to the Stable Diffusion UI. Finding a good prompt here and coming back to Parseq when we're happy with it is going to be much easier.

Alright, so here's my prompt: "an angry frustrated shouting crying trendy man sitting at a desk desperately trying to produce pretty animations on a computer", with a bunch of qualifiers describing how I'd like it to be, and then a pretty standard negative prompt with some magic incantations I don't really believe in, but chuck in anyway.

And I'm going to use the Deliberate model today; you can choose any model you like. So I'll just switch over to that and let's see what this generates. Okay, that's good enough for me. It's pretty cool, there are some trademark messed-up hands, which is fine. I'm happy with that, so I'm going to go over and copy my positive and negative prompts into Parseq.

Now, let's take a look at our grid. For the purpose of this exercise, the only things I'm going to be interested in are the seed, the strength, and a 3D zoom, which is a translation on the z-axis.

For the values, let's keep -1 for the seed, set on frame 0. There's no other value set on the last keyframe for the seed, and nothing in the interpolation column, which means -1 will be held for every frame. This will result in a random seed for each frame, which behaves just like in straight-up A1111, because the -1 value is passed directly to the underlying engine.

Strength in Deforum is a value between 0 and 1 that influences how much the previously generated frame impacts the current frame. So if you want some kind of consistency, you need a strength that's reasonably high. But if you go too high, you won't get any variation, or you'll end up with artifacts. If you go too low, you'll get too much variation frame-to-frame. I'm going to go with a fixed value of 0.6, so I just enter that value on frame 0 and make sure nothing else is set for strength.

All set for our first try. We'll come back to Z translation later. Now, "render on every edit" is ticked by default. This means that whenever I edit the grid, the Parseq output is recalculated automatically and the graph is redrawn. I can see in my graph my constant values of -1 and 0.6 for the two parameters I care about, and the default value of 10 for Z translation.

All the way at the bottom is the output Parseq generates. We'll just copy this into Deforum for now. There's a quicker way to do this, so you don't have to keep copying and pasting back and forth whenever you make changes in Parseq, and I'll show you that soon, but for now let's just copy and paste.

Now that we're in the Deforum UI, there are three important things we need to take care of before we even think of pasting our Parseq manifest in. Firstly, let's make sure the frame rate that Deforum will use matches what we've set in Parseq. If you don't do this, you'll end up with out-of-sync results. I've got 20 frames per second in both places. Secondly, let's make sure the total number of frames Deforum will generate matches what we've defined in Parseq. Technically I have 121 frames in Parseq, because I've got frames 0 to 120, and Deforum is only going to generate 120, but that's okay, I don't mind losing the last frame.
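The arithmetic behind these checks is simple, but worth spelling out; here's a quick sketch using the numbers from this tutorial.

```python
# Quick sanity check of the timing settings used in this tutorial.
fps = 20                          # must match in Parseq and in Deforum
last_parseq_keyframe = 120        # final keyframe in the Parseq grid
parseq_frames = last_parseq_keyframe + 1   # frames 0..120 inclusive = 121
deforum_max_frames = 120                   # what Deforum is set to generate

print(parseq_frames, deforum_max_frames, deforum_max_frames / fps)
# 121 120 6.0  -> roughly a six-second clip at 20 fps
```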
Lastly, make sure you set the correct animation mode. I don't really care right now because I'm not doing any movement, but when we start playing with Z translation we'll switch this to 3D, otherwise we won't see the effect of the parameter change.

Now that we've checked all that, we're ready to paste our Parseq manifest into the Deforum extension UI, and we can hit generate. And I'm getting what I expect: a desperate-looking digital artist changing on every frame because of the random seed, but with some influence from the previous image to create a little bit of consistency frame-to-frame.

Great, that's finished generating. Now let's take a look at what we've got. And so we have our suitably frustrated person struggling to generate some pretty animations on his computer. I know how he feels. A few months ago this animation would have been pretty mind-blowing. Nowadays it's not all that special. So let's see what else we can do.

There's no escaping the fact that AI image generation involves a lot of trial and error; therefore, it's really important to be able to control exactly what is changing each time you try something new. Currently we're using random seeds, so we'll get different images on every attempt, even if we don't change anything. This will make it really hard to compare our efforts as we play with other parameters. Let's look at how we can avoid this problem.

Let's take the random seed that was used for the first frame and paste that into Parseq instead of -1. This will result in a fixed seed for the whole 120 frames. Unfortunately, if you try to generate an animation with a constant seed in Stable Diffusion and you're using a strength greater than 0 or 0.1, the results will be really messy. We can quickly generate this and see how bad it's going to look.

I don't feel like copying my Parseq manifest around anymore, so I'm going to upload the output. This is something you can only do once you've logged in to Parseq. I'm also going to tick "upload on every render", so every change I make gets uploaded to the same URL. Now I can copy and paste the link that I get here into Deforum, instead of the Parseq manifest JSON object, and Deforum will pick up my Parseq changes every time.

Alright, let's generate this, and pretty quickly we can see it's degrading. The input isn't varying enough, so the Stable Diffusion artifacts are being accentuated on every frame. People call this kind of result overcooked, deep-fried, or spaghetti-like. Let's interrupt that because it's rubbish, and see how we can get the seeds to change on every frame in a predictable way in Parseq.

Let's say we want to start with seed 10. I can put that on my first keyframe. To make the value increase by one on every frame, I just need to put 130 as the value on the last keyframe. By default, Parseq will interpolate linearly between the keyframes. There's an easier way to do this, which I'll show you once we've talked about interpolation functions, but this will do for now.

Now we can go and render that without needing to copy and paste anything; Deforum will pick up the changes automatically. And we now have a video with a predictable, non-random seed that changes on every frame, but will yield the same seed every time we do that generation.
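If you want to convince yourself why 10 on the first keyframe and 130 on the last gives an increase of exactly one per frame, here's a tiny sketch of the linear interpolation involved.

```python
# Why linearly interpolating the seed from 10 (frame 0) to 130 (frame 120)
# bumps it by exactly 1 per frame: the slope is (130 - 10) / 120 = 1.
start_frame, start_seed = 0, 10
end_frame, end_seed = 120, 130

def seed_at(frame):
    t = (frame - start_frame) / (end_frame - start_frame)
    return round(start_seed + t * (end_seed - start_seed))

print([seed_at(f) for f in (0, 1, 2, 60, 120)])   # [10, 11, 12, 70, 130]
```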
Okay, now it's time to work on that zoom effect, or Z translation. Just like we did for the seed, we can put in an increased value, say 128, for the Z translation on the final keyframe. Because we're uploading all changes, we can flip right over to the UI and hit generate, and as expected we've got a linear zoom in the resulting animation.

Well, that's a bit boring. How about we make the zoom bounce around a bit? One way to do that is to add more keyframes. There are a few ways to add keyframes, but the most intuitive is just to double-click on the graph. Then you can find the node that corresponds to Z translation (sorry, the color is kind of hard to see; that'll be fixed soon), and once you've found it you can drag it around. I also want to zoom out, so I'll just edit the value manually in the grid and make it negative, and now let's add a few more nodes in the graph.

An interesting side note while we're here is that Parseq tries hard to let you work with absolute values. If you're using Deforum directly, you might be used to using relative values for some properties: for example, if you specify a value of 1 on a rotation in Deforum, that means move by one degree on each frame. In Parseq it would mean move by one degree on *that* frame, and then stay at that position. So when you're using Parseq, you have to think in terms of the absolute values you want to land on at each frame, not the relative change you want to apply. There are ways of changing this, which we'll talk about in future videos.

Okay, let's give that a go. I always find the units for Z translation to be a bit confusing, so I'm not sure how far we actually zoom in and out, but let's give it a go... okay, that's our video, and that's quite fun. Now, we're currently using Parseq's default way of travelling between keyframes, which is linear interpolation. One of the tool's main strengths is giving you a lot of control over that interpolation algorithm, so let's see what else we can do.

As you might have guessed, we're going to start modifying the interpolation column for Z translation. I can enter C (upper case) here for "cubic spline" on frame 0, and that will carry to all keyframes for this field, and we immediately see the variation becoming smoother. This is just the beginning of what's possible: you can use all sorts of mathematical expressions here, conditionals, and built-in functions for sine waves, pulse waves and Bezier curves; check the docs for the full range of possibilities available today. But for now, let's just see what it looks like with cubic spline interpolation.

Okay, that's finished rendering. Let's just remind ourselves of what it looked like with linear interpolation from our previous generation, and you can kind of feel those harsh angles. If we load the latest render with cubic spline, it immediately feels much smoother and bouncier, as you'd expect given the curves.

There's a much simpler interpolation, which we refer to with just S (upper case): it takes the value seen at the last keyframe and sticks to it, as we can see here in the graph. Let's set that on our seed. That might seem like a mistake, because we know fixed seeds are bad unless strength is extremely low, but here's a trick: there's a variable called "f" (lower case) that we can reference in our interpolation function, and it simply represents the frame number. We can also enter arbitrary mathematical expressions here, so "S+f" will start from the value of the keyframe and then add on the current frame number. The result is a nice, linearly increasing seed. And now, if we want to change the starting seed, we only need to update one keyframe.
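Here's a toy sketch of those two building blocks, S and f, and the combined "S+f" expression; it's just an illustration of the behaviour described above, not Parseq's actual expression engine.

```python
# Toy illustration of the expressions above, NOT Parseq's real expression
# engine. "S" holds the value of the most recent keyframe, "f" is the
# current frame number, and "S+f" simply adds the two.
keyframes = {0: 10, 120: 10}   # illustrative values: seed 10 on the first keyframe

def S(frame):
    """Step interpolation: value of the last keyframe at or before this frame."""
    return keyframes[max(k for k in keyframes if k <= frame)]

def s_plus_f(frame):
    return S(frame) + frame    # "S+f"

print([s_plus_f(f) for f in (0, 1, 2, 119, 120)])   # [10, 11, 12, 129, 130]
```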
So you might be starting to get a feel for what you can do with interpolation functions. Another tip is that there's nothing stopping you changing the interpolation function on a given field part-way through the animation: for example, I could start off with my Z translation being (S)tep interpolated, then switch to (C)ubic spline part-way through, and you'd get the result you'd expect.

Deforum can of course do way more than just zooming, so why don't we do a bit of a showcase of a few of the other parameters that are available? Let's do some 3D rotations over X and Y, a bit of Z rotation, which is basically a 2D rotation, and then some translations over X and Y. I'll quickly enter some keyframe values and some interpolation functions for each of those fields, separating things out so we're not trying to do everything at the same time.

And so you can see on the graph we'll start off with a smooth zoom in, then we'll do our rotations along each of the axes, and then finally we'll do a translation on the x-axis followed by the y-axis. Hmm, I think it would be better to zoom out at the start, so I'll just whack in some minus signs up here and generate again... and everything we expected to see is happening here. Our poor animator looks decidedly worried, but hopefully after watching this tutorial he'll feel a bit better. :)

That's all for today, but in future tutorials we'll be looking at using oscillators for simple beat synchronization, using Parseq expressions right in the prompt to evolve the prompt during the animation, and also advanced audio synchronization with the Parseq audio analyzer, which takes an audio file as input and automatically generates keyframes from it.

Thanks for watching. Don't hesitate to let me know all about your experiences with, and ideas for, Parseq, and come and say hi on the Deforum Discord. See you soon!
Info
Channel: Robin Fernandes
Views: 17,412
Keywords: stable diffusion, deforum, parseq, tutorial, audio synchronisation, zoom
Id: MXRjTOE2v64
Length: 18min 42sec (1122 seconds)
Published: Fri Feb 10 2023