Hi everyone, welcome to the
first Parseq tutorial video. Parseq is a tool to help you create AI
generated animations like the ones you're seeing at the moment. More specifically
Parseq is a parameter sequencer that integrates with the Deforum extension for
the automatic1111 UI for Stable Diffusion. It gives you a lot of control over the parameters
you can feed into the AI image generation process, enabling you to create detailed changes
in camera position, prompts and other parameters, as well as helping you
synchronize the animation to music. The interface can be a little intimidating
at first, but I'm hoping to show you it's quite straightforward once you get the hang of it. I'm the author of Parseq so if you
have any feedback or suggestions, don't hesitate to get in touch in the comments. Before we get started, there's a few things
you need in place. Firstly, make sure you can browse to “sd-parseq.web.app”, and that
you can see something similar to this. I'm also assuming you have access to a working
version of the automatic UI – either locally or via Colab, it doesn't really matter
which – and finally in that automatic UI, you have the Deforum extension installed, so you
can see the Deforum tab, and make sure you can find the Parseq section somewhere – it should be
under Init, right at the bottom of the screen. And this is the video we're going to
create today. It's a simple animation with a single prompt and a few camera
movements. The result is pretty simple and something you could do without Parseq, but
following through with this will help you get familiar with the basic concepts of Parseq
so you can go on to do more complex things. In subsequent tutorials we'll look
at basic beat synchronization, more advanced prompt manipulation, and advanced
audio synchronization using the audio analyzer. Let's take a little tour of the Parseq
UI. Most importantly up here if you ever want to buy me a coffee there's a
button right here to do that :). Shout out to the 3 or 4 people who've done
that already, thank you very much. There's also a link to the docs
here which are quite complete and a good reference once you've
got to grips with the basics. Over here at the top left is a
status view that shows you the current status of Parseq’s processing.
As you'd expect green means good. And there's also some links
to some more advanced parts of Parseq which we'll cover in future videos. The top right section up here is
a document manager. This is where you can do things like revert to a previous
version of the document you're working on. You can load other documents
you've worked on on this system; everything is stored in browser local
storage, unless you explicitly upload and export things. And you can also load Parseq
documents that have been shared with you. And finally you can share your own
documents: you can either upload to a URL, which you can share with people, or you can just
copy and paste the keyframe definitions that people will be able to load into Parseq using
the load dialog that we saw previously. Just below that is a document title
which you can edit. Parseq will pick a name for you by default but you
can change it to whatever you like. Below that is the prompt section, where you can
put in any prompt that works with automatic1111 including weights, composable diffusion, and
so on. Parseq also lets you use Parseq formulae: you can put those in the prompt.
We'll talk more about that in a future video, but that gives you a lot of control over the way the
prompt will evolve over the course of the video. Next, we get into the real heart of Parseq,
which is this grid here. And this is where you define the keyframes that will result in
parameter changes over the course of the video. Each row is a keyframe; each pair of
columns represents a parameter or a field. “Parameter” or “field” are terms I use
interchangeably for these values here. We'll learn why each field has two columns
soon, but in short one of them defines a value at that keyframe, and the other
defines the interpolation, which is basically a mathematical formula describing
how the value should evolve between keyframes. Below that is a graph showing a visualization of
the parameters. The nodes represent keyframes, and between the keyframes are the actual rendered values that will be used at
each frame between the keyframes. You can edit keyframes directly on
the graph, you can add keyframes as well, and you can also remove keyframes by shift
clicking. The instructions are right here. In some cases editing the nodes directly on the
graph won't do exactly what you expect because the value of the keyframe is actually being influenced
by the formula, not just the position of the node, but that'll make more sense as you
use Parseq, and get to grips with it. You can choose to show either absolute values, or
a percentage of the maximum value. That's useful if you're trying to compare values that
have very very different absolute values. You can also show and hide fields
by clicking on the sparklines here, which is handy. The sparklines give you an
overview of how the different parameters change. By default the ones that are completely
flat and don't have any variations are hidden, so you can expand out to see all of
the parameters by ticking that box. And then finally at the bottom
is the rendered output. This is the actual JSON data structure that will
be used by Deforum while it's rendering, to get the actual values for each of
those parameters during the rendering process. You don't really need to understand what's going
on here, but you might find it's useful in the future as you're trying to debug or find
out exactly why something is the way it is, or check that your prompt is the way it should be. There's also another copy of the status screen
here, so you don't have to keep scrolling up and down… um… I will probably move that somewhere
more sensible, like a fixed footer, in the future. Alright, that's a whirlwind tour of the
Parseq interface. Don't worry if that doesn't all make sense quite yet, or there’s
things that you won't remember: that's fine, you'll get familiar with it as you use Parseq. Okay let's get started creating our video.
The first thing I'm going to do is start from a blank slate. When you open Parseq,
the default document already has quite a lot of content. It's intended to give you
an idea of what's possible with Parseq, but today I'm going to nuke all
that and start from scratch. There's a few ways of deleting everything and
soon I'll add a button to make it really trivial, but what I'm going to do right
now is import a blank document which I'll share with you in the comments. It's conveniently in my clipboard so I'm just pasting it and importing
it. I'll just tweak the name. And you can see the document is pretty
much empty: the prompts are blank, I have just two keyframes at frame 0 and 120 –
these are the first and last frames of the video. And the only field that's set is the seed at -1.
Everything else will just use default values. Next we need a prompt. I wouldn't recommend
figuring out your initial prompt in Parseq. That's why I've switched over to the Stable
Diffusion UI. Finding a good prompt here and coming back to Parseq when we're happy
with it is going to be much easier. Alright, so here's my prompt: “an angry
frustrated shouting crying trendy man sitting at a desk desperately trying to
produce pretty animations on a computer”, with a bunch of qualifiers describing how I'd
like it to be, and then I've got a pretty standard negative prompt with some magic incantations I
don't really believe in, but I chuck in anyway. And I'm going to use the Deliberate model today,
you can choose any model you like. So I'll just switch over to that and let's see what this
generates. Okay that's good enough for me, it's pretty cool, there's some trademark
messed up hands which is fine. I'm happy with that, so I'm gonna go over and copy my
positive and negative prompts into Parseq. Now, let's take a look at our
grid. For the purpose of this exercise the only things I'm going
to be interested in are: the seed, the strength, and… let's do a 3D zoom
which is a translation on the z-axis. For the values let's keep -1 for the seed, set
on frame 0. There's no other value set on the last keyframe for the seed, and nothing
in the interpolation column, which means -1 will be held for every frame. This will result
in a random seed for each frame, which behaves just like in straight-up a1111, because the -1
value is passed directly to the underlying engine. Strength in Deforum is a value between 0
and 1 that influences how much the previously generated frame impacts the current frame.
So if you want some kind of consistency you need a strength that's reasonably high. But if
you go too high you won't get any variation or you end up with artifacts. If you go too low
you'll get too much variation frame-to-frame. I'm going to go with a fixed
value of 0.6. So I just enter that value on frame 0 and make sure
nothing else is set for strength. All set for our first try. We'll
come back to Z translation later. Now “render on every edit” is ticked by
default. This means that whenever I edit the grid, the Parseq output is calculated
automatically and the graph is drawn. I can see in my graph my constant values of -1
and 0.6 for the two parameters I care about, and the default value of 10 for Z translation. All the way at the bottom is
the output Parseq generates. We'll just copy this into Deforum for
now. There's a quicker way to do this, so you don't have to keep copying and pasting back
and forth whenever you make changes in Parseq, and I'll show you that soon, but
for now let's just copy and paste. Now that we're in the Deforum UI there's 3
important things we need to take care of before we even think of pasting our Parseq
manifest in. Firstly let's make sure the frame rate that Deforum will use matches
what we've set in Parseq. If you don't do this you'll end up with out of sync results.
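
To see why a mismatch matters, here is a rough Python sketch (not part of Parseq or Deforum; the helper name and the 30 fps figure are made up for illustration). A keyframe placed on frame 60 is meant to land at 3 seconds when planned at 20 frames per second, but lands a full second early if the frames are played back at 30.

```python
def frame_to_seconds(frame, fps):
    """Time (in seconds) at which a given frame is shown at a given frame rate."""
    return frame / fps


PARSEQ_FPS = 20       # frame rate used in this tutorial
MISMATCHED_FPS = 30   # hypothetical wrong setting on the Deforum side

keyframe = 60
intended_time = frame_to_seconds(keyframe, PARSEQ_FPS)    # where we planned it
actual_time = frame_to_seconds(keyframe, MISMATCHED_FPS)  # where it really lands
drift = intended_time - actual_time

print(f"planned at {intended_time}s, plays at {actual_time}s, drift {drift}s")
```

The drift grows with the frame number, so anything synchronized to music gets further off as the video goes on.
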
I've got 20 frames per second in both places. Secondly let's make sure the total number of
frames Deforum will generate matches what we've defined in Parseq. Technically I have 121 frames
in Parseq because I've got from frame 0 to 120, and Deforum is only going to generate 120, but
that's okay I don't mind losing the last frame. Lastly make sure you set the correct
animation mode. I don't really care right now because I'm not doing any movement,
but when we start playing with Z translation we'll switch this to 3D else we won't
see the effect of the parameter change. Now that we've checked all that we're
ready to paste our Parseq manifest into the Deforum extension UI, and we can hit
generate. And I'm getting what I expect: a desperate looking digital artist changing
on every frame because of the random seed, but with some influence from the previous image to
create a little bit of consistency frame-to-frame. Great that's finished generating. Now
let's take a look at what we've got. And so we have our suitably frustrated person
who's struggling to generate some pretty animations on his computer. I know how
he feels. A few months ago this animation would have been pretty mind-blowing.
Nowadays it's not all that special. So let's see what else we can do. There's no escaping the fact that AI image
generation involves a lot of trial and error; therefore, it's really important to be able to
control exactly what is changing each time you try something new. Currently we're using random seeds
so we'll get different images on every attempt, even if we don't change anything. This will
make it really hard to compare our efforts as we play with other parameters. Let's
look at how we can avoid this problem. Let's take the random seed that
was used for the first frame and paste that into Parseq instead of -1.
This will result in a fixed seed for the whole 120 frames. Unfortunately if you try to
generate an animation with a constant seed in Stable Diffusion and you're using a strength
greater than 0 or 0.1, the results will be really really messy. We can quickly generate
this and see how bad it's going to look. Um… I don't feel like copying my Parseq
manifest around anymore, so I'm going to upload the output. This is something you can
only do once you've logged in to Parseq. I'm also going to tick “upload on every render”, so
every change I make gets uploaded to the same URL. Now I can copy and paste the link that
I get here into Deforum instead of the Parseq manifest JSON object, and Deforum
will pick up my Parseq changes every time. All right let's generate this, and pretty quickly
we can see it's degrading. The input isn't varying enough so the Stable Diffusion artifacts are being
accentuated on every frame. People call this kind of result overcooked or deep-fried or
spaghetti-like. Let's interrupt that because it's rubbish, and see how we can get the seeds to change on
every frame in a predictable way in Parseq. Let's say we want to start with seed 10. I can
put that on my first keyframe. To make the value increase by one on every frame I just need to put
130 (that's 10 plus 120 frames) as the value in the last keyframe. By default Parseq will interpolate linearly between
the keyframes. There's an easier way to do this which I'll show you once we've talked about
interpolation functions, but this will do for now. Now we can go and render that without
needing to copy and paste anything: Deforum will pick up the changes automatically. And
we now have a video with a predictable non-random seed that changes on every frame, but will yield
the same seed every time we do that generation. Okay now it's time to work on
that zoom effect or Z translation. Just like we did for the seed, we can put
in an increased value, say 128, for the Z translation on the final keyframe. Because we're
uploading all changes we can flip right over to the UI and hit generate, and as expected we've
got a linear zoom in the resulting animation. Well that's a bit boring – how about
we make the zoom bounce around a bit. One way to do that is to add more keyframes.
There's a few ways to add more keyframes, but the most intuitive is just to double click on the
graph. Then you can find the node that corresponds to Z translation – and sorry the color is kind
of hard to see, that'll be fixed soon – but once you've found it you can drag it around.
I also want to zoom out so I'll just edit the value manually in the grid and make it negative…
and now let's add a few more nodes in the graph. An interesting side-note while we're here is that
Parseq tries hard to let you work with absolute values. If you're using Deforum directly,
you might be used to using relative values for some properties, for example if you specify a
value of 1 on a rotation with Deforum that means move by one degree on each frame. In Parseq it
would mean move by one degree on *that* frame, and then stay at that position. So when you're
using Parseq you have to think in terms of absolute values you want to land on in each
frame, not the relative change you want to put in. There are ways of changing this which
we'll talk about in future videos. Okay let's give that a go. I always find the units
for Z translation to be a bit confusing so I'm not sure how far we actually zoom in and out,
but let's give it a go… okay that's our video, that's quite fun. Now we're currently using
Parseq's default way of traveling between keyframes which is linear interpolation.
One of the tool's main strengths is giving you a lot of control over that interpolation
algorithm. So let's see what else we can do. As you might have guessed we're going to
start modifying the interpolation column for Z translation. I can enter C (upper case)
here for “cubic spline” on frame 0, and that will carry to all keyframes for this field and we
immediately see the variation becoming smoother. This is just the beginning of what's
possible: you can use all sorts of mathematical expressions here, conditionals,
built-in functions for sine waves, pulse waves, Bezier curves – check the docs for the full
range of possibilities that are available today. But for now let's just see what it looks
like with a cubic spline interpolation. Okay, that’s finished rendering. Let's
just remind ourselves of what it looked like with linear interpolation
from our previous generation, and you can kind of feel those harsh angles.
If we load the latest render with cubic spline it immediately feels much smoother and
bouncier as you'd expect given the curves. There’s a much simpler interpolation which
we refer to just with S (upper case): it just takes the value seen at the last keyframe
and sticks to it as we can see here in the graph. Let's go and set that on our seed. That might seem like a
mistake because we know fixed seeds are bad unless strength is extremely low, but here's a trick:
there's a variable called “f” (lower case) that we can reference in our interpolation function that
simply represents the frame number. We can also enter arbitrary mathematical expressions here, so
“S+f” will start from the value of the keyframe and then add on the frame number at the current
frame. The result is a nice linearly increasing seed. And now if we want to change the starting
seed, we only need to update one keyframe. So you might be starting to get a feel for
what you can do with interpolation functions. Another tip is that there's
nothing stopping you changing the interpolation function on a given
field part way through the animation: for example I could start off with my Z
translation being (S)tep interpolated, and then switch to (C)ubic spline part way through,
and you would get the result that you'd expect. Deforum can of course do way more than just
zooming. Why don't we do a bit of a showcase of a few of the other parameters that are
available. Let's do some 3D rotations over X and Y, a bit of Z rotation which is basically a 2D
rotation, and then some translations over X and Y. I'll quickly enter some keyframe values
and some interpolation functions for each of those fields separating things out so we're
not trying to do everything at the same time. And so you can see on the graph we'll
start off with a smooth zoom in and then we'll do our rotations along each of the axes and then finally we'll do a translation
on the x-axis followed by the y-axis. Hmm I think it would be better
to zoom out at the start, so I'll just whack in some minus
signs up here and generate again… and everything we expected to see is happening
here. Our poor animator looks... uh... decidedly worried but hopefully after watching
this tutorial he'll feel a bit better. :) That's all for today, but in future tutorials
we'll be looking at using oscillators for simple beat synchronization, using Parseq expressions
right in the prompt to evolve the prompt during the animation, and also advanced audio
synchronization with the Parseq audio analyzer, that takes an audio file as input and
automatically generates keyframes for it. Thanks for watching. Don't hesitate to let
me know all about your experiences with, and ideas for, Parseq, and come and say
hi on the Deforum Discord. See you soon!
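
As a text companion to the seed trick shown in the video, here is a rough Python sketch of how the S (step) interpolation and the “S+f” expression behave. It is an illustration under my own assumptions, not Parseq's implementation; the function names are made up, and it assumes there is always a keyframe at or before the frame being evaluated (frame 0 in this tutorial).

```python
def step_value(keyframes, frame):
    """S: hold the value of the most recent keyframe at or before `frame`."""
    active = [f for f in keyframes if f <= frame]
    return keyframes[max(active)]


def step_plus_frame(keyframes, frame):
    """S+f: the held keyframe value plus the current frame number."""
    return step_value(keyframes, frame) + frame


# One keyframe is enough: the starting seed on frame 0.
seed_keyframes = {0: 10}
seeds = [step_plus_frame(seed_keyframes, f) for f in range(121)]
print(seeds[0], seeds[1], seeds[120])

# Changing the starting seed only means editing that one keyframe.
other_seeds = [step_plus_frame({0: 500}, f) for f in range(121)]
print(other_seeds[0], other_seeds[120])
```

This gives the same one-per-frame seed increase as the two-keyframe linear version, without having to work out the value for the last keyframe by hand.
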