♪ ♪ Zak Parrish: Hi. My
name is Zak Parrish. I am a tech artist at Epic Games. I am attached to support
and developer relations. What that generally means is
that I am never at the office. I am always flying around to
awesome cities like Montreal, visiting developers and helping
them all make their games better. Today, I am going to talk to you about
profiling and optimizing in UE4. When I was asked to give this talk, I was
really excited for a second because I gave this talk two years ago at MIGS.
Some of you might have even seen it. It was cute.
We talked about the CPU profiler, the GPU profiler, and we did some things
in the engine and it was all great. I thought, "Cool. I will
just resurrect that talk." However, I discovered that I have
a lot more I should cover to make you effective at keeping
your games from slowing down. I started adding on some of what I picked
up from other licensees. In time, my outline got way too long. I actually have more information here than
we can ever hope to cover. I am going to go through some of this
a little quickly, mostly as review. When we get to actually using tools
like the CPU and GPU profilers, how to use it, and what buttons to
click is already available on YouTube. If you need everything step-by-step,
go to YouTube. Today, I am going to bombard you with information you should know about
keeping your games from slowing down. I see some of you have notebooks out,
and that is exciting. You will need them. Here is what we are going to cover today. We will start off with some
profiling best practices to get your profiling environment set up. We will talk about the profiling process and how we go about
profiling things at Epic Games. We will highlight the engine
profiling tools that are available. We are going to talk about
Blueprint optimizations, which I am pretty
passionate about. I love Blueprint; it absolutely
changed my life because it allowed me to make games
for the first time. I am not a programmer at all. We will talk about
how dangerous that can be. We will talk about
draw thread optimizations, CPU rendering optimizations,
and GPU optimizations. We will talk a little bit about
optimizing for a device, like a console. We are going to take a look
at some networking, content streaming, and then I have a few
links for you when we are all done. I will start off with
profiling best practices. How are we doing on time? This is great. I am ahead of schedule
for the first time in my life. By the way, I just want to mention this is
a room full of game developers and you are all here before 11 am.
Give yourselves a round of applause. [Audience laughing/Applause] How often should we profile? How and when do we approach profiling? Profile early and often. You should not wait until the end. I come across studios that
profile and check towards the end as they are getting ready
for a vertical slice or something very important. That is generally a bad habit.
You don't want to do it. Conversely, I always like to advise
not to over-optimize. Early optimization - meaning
if you are stressing over every vertex and
shader instruction, you are probably taking too
long to accomplish too little. You should only be optimizing after you have profiled
and you know what you are trying to do is a problem. Some of you are from the enterprise world. You probably know if you bring in a
many billion triangle mesh from CAD or something, you know you are
going to have to optimize that. Within that scope, don't spend too long
until you know it is going to be an issue. Make sure you test on your
target hardware as soon as possible. As soon as you know you are
shipping on Xbox One, you should be deploying to Xbox One. I don't care what your project is. If all you have right now is a little test
scene, get it on Xbox One and get into the habit of deploying
and testing there. Also, make sure everyone on your team who is touching the engine knows
the basic profiling process. They should know at least 30% of what I am
going to cover today. That way everybody on your
team can be responsible for checking in content that is not
going to slow your game down. That is how we do things at Epic Games. If you are checking in game assets, you are responsible for making sure we
don't drop framerate because of them. Next, profiling in a build. Generally, the safest way to profile is to make a full build of your game
and profile from within that. If you try to profile within the editor, there is some noise.
We will talk about that in a minute. Make sure you minimize all of the other noise and things which can make
your numbers unreliable. Turn off everything you are not using.
Make sure you turn off VSync. The console command for that is r.VSync 0. A quick note. I see the camera
phones going up in the back. By all means, please do take pictures. But do know we are recording this, and we will find a way
to get this slide deck out to everybody.
It is too important not to. We are not going to hoard this. Next, turn off framerate smoothing. If you haven't seen this you
should know where it is. Under Project Settings >
General Settings > Framerate group. If you noticed, a few versions ago
we made the Project Settings searchable. Go to Project Settings and search for
"smoothing". You will find it really fast. Next, make a test build. This is something we
discovered on Paragon. It was common for us to do
development builds for that project. When we did that, the draw thread
was giving us unreliable numbers. Do a test build if you know
you are going to be profiling. Profiling from within the editor.
This slide is a wall of text. You are going to see a lot of
these today. I am sorry. I wanted to include pictures,
but there is not enough room. I wouldn't do serious profiles
from within the editor. For day-to-day things like
checking in a new asset, a new shader, a new Blueprint, etc., it is okay to profile
from within the editor. Just know you should be testing
on builds often as well. The editor is going to add
a little bit of noise to some things. You can mitigate that. You can minimize that, but
know it is still happening. If you have to test
from within the editor, make sure you are playing
in Standalone mode. Do not profile from within PIE
(Play in Editor). It is a bad idea. Make sure the editor is set to
not update in realtime. In the upper left corner of the Viewport,
click the drop down and uncheck Realtime. Then, make sure you minimize the editor so it is using the least amount
of computer resources. Turn off framerate smoothing, and turn off
VSync. The profiling process and
how we actually go about it. First, you are going to
identify the bottleneck. Find out where your problem really is. If you start cleaning up shaders to speed up the GPU and it turns out
you had too many objects on the screen, you are doing it wrong. Make sure you know where the real problems
are before you try to do anything. Your bottlenecks will either be the
game thread or the render thread. The render thread breaks up
into the CPU or the GPU. The bottlenecks will go back
and forth as you optimize. If you were bottlenecked on
the CPU and you fixed that, your next bottleneck might be the GPU. You might have to just go down the
list until you hit target framerate. A really quick check... I will bring this up a few times today. I am going to repeat a few things,
but it is because they are important. Use the console command r.ScreenPercentage. If you haven't used this command while
testing your game if you want to know... First, you should know if you are
game thread bound or render thread bound. If you are on the game thread, you are
making bad decisions in code or Blueprint. That is one set of problems. If you are render bound,
then you are trying to draw too many assets on the screen, or your shaders could be too expensive,
and that is a whole other set of problems. You should generally know
which set of problems you have. If you want to know if you are
render CPU or render GPU bound, set your resolution lower. You can use the console command
r.ScreenPercentage 10 to decrease to 10%. If the game suddenly speeds up,
then you were GPU bound. You basically told the
GPU to do less work. Once you know where you are bound,
search for bottleneck scope problems. I kind of already talked about that. At a high level, a game thread
bottleneck is code or Blueprint. CPU render is for object count,
draw calls, any culling operations, and basically anything that
sets up your scene. At a very high level
when you are rendering, your CPU is building your scene and then your GPU is going to draw it
and make it look nice for your monitor. Automation. Some teams like to set up
automated tests to figure out what things look like and if they
are hitting the target framerate. In a lot of cases,
you can do this with a cinematic. Feel free to set up a Sequencer fly-through of
a camera pass that goes through higher-end sections of your game. You can run some console commands
to record all of the data. We will talk about those commands
when we are done. Not all teams do this.
We don't do this on Paragon. We have a few technical artists who know the Monolith map really well
and they know where to look for the real bottlenecks. If you have people like that
on your team, that is cool. If not, something like a
cinematic can really help. Next, how do we measure? We don't really care about
the framerate number. You should be measuring all
things in milliseconds. Make sure you are using
stat unit and not just stat fps. If you like seeing the fps number on the
screen, you can type "stat fps" to see that. I wouldn't recommend it. Every stat you have running
and every metric is stealing a little bit of your
processing power. I would turn off stat fps. The stat unit command gives you
a breakdown of time per frame. It is not always color-coded. It is only color-coded if you
are using stat unitGraph. This gives you the total
time for each frame. It gives you the game thread,
which is the C++ code or Blueprints. It gives your draw and GPU,
which are the CPU and GPU render times. You can also use stat unitGraph, which
shows a line graph for playback. We don't use those much at Epic. Occasionally, it is nice to
have stat unitGraph going because you can see if you have
a regular heartbeat coming through, which usually means something
is getting in the way of your test. In a lot of cases, the reason is
other software is hitting your GPU. Next, part of our process. Probably what anybody at Epic does first
when they start profiling is to start throwing switches and see
what gets better or see what breaks. That is an easy way to find bottleneck
issues and figure out where they are. Sometimes the best first step
is to go through this process. It is usually as simple as just showing and
hiding different problem areas so you can see what is affecting you the most. It is also useful to see what happens
if you render at different resolutions, which we already talked about a little. We talked about stat
ScreenPercentage a minute ago. We will go into depth so
everyone understands. This is mostly useful for problems
unrelated to the game thread. It is for rendering issues only. You can use stat unitGraph to figure
out what your milliseconds are. Then use r.ScreenPercentage 10,
or any number lower than 100. You could set it to 50. I think 10 is usually what we use. This takes your overall resolution
and you're only showing 10% of that. You are sending fewer pixels to the GPU. The GPU is doing less work. If your game immediately speeds
up, your GPU is over-taxed. It is a great way to know where your
bottlenecks lie in just a few seconds. Oh, I missed a slide. Show flags. I am surprised at how
often this is overlooked. Show flags are a cool way to diagnose issues you might have or
different bottleneck locations. You can turn things on
and off in your scene. You have commands to show or hide things
like Static Meshes, Skeletal Meshes, etc. Another common thing I didn't
list is instanced foliage. You can type "show InstancedFoliage". A quick story. While I was putting this presentation together,
I thought I might have time to go into the editor and have you click
on a few things. While I was doing that... Have you all seen the Epic Games
Launcher and the Learn tab? On the Learn tab, there is a project called
Landscape Mountains. It is an old hang gliding game
we put out a while ago. I loaded that up, and I think the
last time we saved it was 4.11. I updated it to 4.17. My framerate dropped significantly. All I did was enter show InstancedFoliage
and turned that off. Suddenly, my framerate skyrocketed. Between 4.11 and 4.17 LODs broke, but I was able to diagnose that
very quickly by turning something off. Make sure you are going
through this process. If you know you have a lot of
Static Meshes in your scene, turn them off and see how
much faster the game gets. You know it is going to get faster, but knowing how much
will help you understand should I be using LODs
a little more aggressively? Should I be using less transparency? Should I be tweaking lighting
and use fewer lights, etc.? Diagnostic tools. Let's talk about what we have. We have already talked
about a few stat commands. These give you realtime statistics
on different parts of the engine. There are a lot of stat commands.
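[Editor's sketch] The habit of reading stat unit in milliseconds can be written down as a small calculation. This is a minimal sketch, not engine code: the class `UnitSample`, the helper names, and the sample numbers are all made up for illustration. It assumes the usual rule of thumb that whichever of Game, Draw, or GPU is largest is the likely bound.

```python
from dataclasses import dataclass


@dataclass
class UnitSample:
    """One hand-copied reading of the stat unit numbers, in milliseconds (hypothetical type)."""
    frame_ms: float
    game_ms: float
    draw_ms: float
    gpu_ms: float


def fps_to_ms(fps: float) -> float:
    """Convert a target framerate into a per-frame time budget in milliseconds."""
    return 1000.0 / fps


def likely_bound(sample: UnitSample) -> str:
    """Rule of thumb: the largest of Game, Draw, and GPU is the likely bottleneck."""
    timings = {"Game": sample.game_ms, "Draw": sample.draw_ms, "GPU": sample.gpu_ms}
    return max(timings, key=timings.get)


def over_budget_ms(sample: UnitSample, target_fps: float) -> float:
    """How many milliseconds the frame exceeds the target budget (0 if within budget)."""
    return max(0.0, sample.frame_ms - fps_to_ms(target_fps))


# Illustrative numbers only, not measurements from any real project.
sample = UnitSample(frame_ms=22.0, game_ms=9.5, draw_ms=12.0, gpu_ms=21.8)
print(likely_bound(sample))
print(round(over_budget_ms(sample, target_fps=60), 2))
```

Working in milliseconds also shows why the fps number misleads: going from 30 to 60 fps means saving about 16.7 ms per frame, while going from 60 to 90 fps means saving only about 5.6 ms.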
We are not going to cover them all. I am going to cover the ones
I use most commonly when I am doing different profiles. Here are some of the more common
commands you should know. We talked about stat fps and stat unit. We have a whole slide dedicated
to stat scenerendering. Stat gpu is relatively new. You can get realtime feedback on
what the GPU is seeing. We have stat streaming for
texture streaming stats. Stat emitters and stat
lighting are great if you are trying to figure out how
much your effects cost. Stat scenerendering is important. Always remember this and always use it. This is the only place where you
can see where your draw calls are. Draw calls are a major culprit
for slowing down your CPU. Basically, you just have too
many things on the screen. This could be individual meshes, too many Materials,
a combination of both, etc. Also, this is a good place to see how long
shadows are taking to render. Shadows, decals, post processing, lighting
are all broken down line by line. Run your game, type "stat SceneRendering"
and all of the data is right there. Stat GPU is relatively new.
It has only been out for a few versions. The engine has a GPU profiler already. This gives you realtime metrics
on what the GPU is doing. It is not very detailed, but it can
help you get a high-level overview in general terms of what things cost. For example, you can see the
shadowed lights are costing a lot. I should look into that. You can start decreasing the number of
shadows you are casting into the scene. In most cases, you still want
to use the GPU profiler. That is really the only way
you are going to find individual lights. In the GPU profiler, you only sample
a single frame. You break the frame down and
see everything that happened. Using the GPU profiler,
you can see specifically which lights are shadowing and
how long each shadow takes to render. You can make more intelligent
decisions that way. At a high-level,
stat GPU can really help you. Next, we have some
optimization view modes. These view modes kind of
turn you into the Predator. They are kind of easy to forget about if you have been jamming
and staring at the Viewport all day. There are some cool modes
to help you see how much you are doing in your scene
at any given moment. The view modes enable you to see different
problems live in your scene. There are a few view modes that not everyone knows about, which I explain
to people every now and then. Let's take a look. First, Shader Complexity mode.
Hopefully you have all seen this. Shader Complexity mode shows you how much work your
GPU will have to do to draw your shaders. Probably what is most useful out of
this is seeing where your overdraw is. By the way, this scene is horrible. When you open it up and see this much red,
you should generally be afraid. This works by color-coding your
whole screen from green to red. Green is generally good.
Red is generally bad. White is absolutely terrible. Just be really careful about the shaders. As transparency starts to stack up, you will see overdraw issues and
you will see the colors start to pop out. You can see some overdraw on the horizon. It may be okay, just be aware of overdraw
and that it gets expensive quickly. Next, we have Quad Overdraw mode. This is a relatively new view mode that
hasn't been in the engine very long. A lot of people don't know it is there
or they don't know why it exists. This mode is a way to keep track of how
polygons are being used on your screen. This mode is important if you are on the
forward renderer because of MSAA and how much anti-aliasing is going to be
hammering all the edges of your polygons because it is going to draw the pixels
at the edges over and over. There is a direct relationship between how you are using your polygons and how efficiently and fast you will
be able to render in forward rendering. Keep that in mind. Even outside of forward rendering, Quad Overdraw mode is a nice way
for you to quickly visualize whether or not you should be using lower
LODs. I will jump forward for just a second and
talk briefly about how this works. Your GPU - in order to...
The colors are amazing. These are supposed to be red or pinkish.
Maybe you can see that. A lot of the operations your GPU
is performing are not taking place on individual pixels, but on 2x2 sections of pixels
which we refer to as quads. It is an optimization so you do
not have to do as much work. If you have to re-draw
all 4 pixels of a quad, you are probably not using the GPU
as efficiently as you could be. As you can see, things like
long or really tiny triangles cause extra processing that is just
going to be discarded. You have to watch out for that. I will put this in practical terms
for all the artists out there who don't want to read up on
GPUs, which you should. When you are looking at an opaque
object on the deferred renderer and you see a lot of green,
that means all 4 pixels of that quad had to be
recalculated over and over. You should probably be using lower LODs. It just means you have a lot of pixels
that just rendered in a tiny 2x2 section. You should have a lower LOD
that you are switching to. Next, Shader Complexity
and Quad Overdraw modes. This slide is the last two slides
I talked about combined into one. It is pretty cool to see
this all in one frame. I do find myself going back to the individual modes to
diagnose specific problems. It is neat to see it all at the same time. Light Complexity mode. I love this mode. This mode allows you to visualize
the cost of scene lighting. It is just direct lighting, so it is
not giving you the cost of shadows. One of the major causes
for GPU slowdown is the number of dynamic lights
you are trying to draw at any given time. Light Complexity mode is a great way
to quickly visualize that. If I walked into this
room in this game and suddenly I wondered why the
game was getting slow, I could switch to this view
and see why. Light Complexity mode color-codes from
cool colors up to warm, to nuclear white. Where you see white, you should
probably rethink your approach. Next, Lightmap Density mode. Lightmaps are awesome. Lightmaps are a way for you to handle
light bakes in an intelligent way. I often find if you don't pay attention,
people will often keep their lightmaps at a higher resolution
than they probably need. In many cases, your Lightmass light
bakes really just need soft shadowing to make it look really nice. You probably don't need a high resolution
on everything. It is also easy to lose track.
This view mode will save you some time. Instead of having to go through a spreadsheet or click on several things
to determine their lightmap resolutions, you can turn on Lightmap Density
in the Viewport and immediately see where
you might have problems. This is just an outline around a door.
It is very flat. I would change that to a lower resolution. If you make a habit of keeping things
as low resolution as you possibly can, your light bakes will be a lot faster. Next, Stationary Light Overlap. This mode is also easy to forget,
especially when you have a lot in a scene. When you are using stationary lights, you can have up to 4 stationary lights
affecting any given object. The 5th light will be
converted back to a movable light so it is fully dynamic.
Fully dynamic shadows are very expensive. The engine does the switch for you
based off radius, brightness, etc. It basically tries to figure out which is
the least important light and switch it to dynamic. You don't want that to happen. The Stationary Light Overlap mode
is basically going to alert you that you have too many stationary
lights in one area. Here is an important point. If you have a stationary sun,
for example. If you have a non-realtime time of day
and use a big stationary light as a sun, you only have 3 lights left for
every other object in your scene. Just be aware of that. Next, LOD Coloration. I used this in the
landscape example from earlier. In the distance,
you can see the trees remain white. None of them were using LODs. LOD Coloration is exactly
as the name implies. It color-codes your entire scene based on
how you are using LODs. You can learn a lot about why your CPU is
slowing down by turning on LOD Coloration. Next, some profiling tools. Again, we are going to be covering the session front end
for CPU and GPU profiling. We will talk about the GPU profiler. We are not going to go click by click.
I am sorry. I wanted to, but there
wasn't a lot of time for it. We have a lot of cool tutorials online to
show you how to use those tools. Most of you don't have
your laptops anyway, so it is okay. CPU profiling takes place with
the Session Frontend. We have a Profiler tab. It is an integrated tool
that allows you to keep track of what is going on each tick. It is a good way to get performance data
on your gameplay code (C++ or Blueprint). It works by measuring a segment of time. When you start playing your game,
use the stat commands: stat startfile and stat stopfile. In between when you enter the commands,
the game is recording everything. You can open the recording
in the Session Frontend, and you can see everything
that happened on every tick. You can look at individual frames or an
average over a given selection of frames. It is very cool. My favorite thing is that you
can go into world tick and see how long individual Blueprints
are taking on the game thread. You can go down and see how long
individual nodes are taking to process. If you have written some Blueprint nodes
or have engineers helping you, you can see how long the work is taking to
process on the CPU. The Session Frontend can also be used for
CPU rendering as well as GPU rendering. On the left hand side, all of the
rendering options are listed. You can see how long they
are taking in milliseconds. It is a very useful tool. Next, the GPU profiler. The purpose of the GPU profiler is to take a single snapshot of a frame
while you are playing, and give you a breakdown of how long
aspects of that scene took to process. It is pretty easy to use. You can press "Ctrl + Shift + ,"
while you are in the editor. Or, you can use the command
ProfileGPU, which is not on the slide. You can just type "ProfileGPU",
and it will bring up the profiler. If you are running the editor, this window will
open and you will get this visual breakdown. If you are in a build and
not running the editor, it is going to output all of the same
information out to the log. You can open the log in any text editor
and see how long things are taking. The GPU profiler is very useful because it
gives you information like how long the base pass is to set up
all your lights and Materials. It tells you all the times
for lighting and shadows. It tells you how long post
processing is taking. At the top, the GPU profiler has
a bar graph you can mouse over to see what portion of the scene renderer
is taking the most time at a glance. It is very, very useful. Next, tracking slow frames. How do you know where
slow downs are happening? We have a command specifically
for knowing where your game is hitching
called stat dumpHitches. You should know about this command. The command is used to dump any hitches
that run over a given time threshold. You can set the number by
typing t.HitchThreshold 0.XX. For example, if you set it to 0.05, when you run dumpHitches it will dump
everything that takes over 0.05 seconds (50 milliseconds) to run. It is a great way to track down things that
are running over a given amount of time. At the bottom, there is an
example of how we use this to create framerate reports
on Unreal Tournament. This is actually a report
that goes out every week. There is also the command memReport. There is a good blog post on this,
so I am not going to go into depth. On the Unreal Engine blog,
search for "memReport". Essentially, this gives you a breakdown of
how memory is being used in your game. You can see memory issues and make
intelligent decisions from there. The commands startFPSChart and stopFPSChart
give you a chart to figure out how your framerate may be increasing or
decreasing during parts of your game. If you combine this with a Level Sequence
(like a camera fly through) it can be a nice way to automate
where your game is getting slow and where parts of your
environment are being affected. Blueprint optimizations. How are we doing? We
are moving through. Blueprint is my favorite thing. Blueprint is the coolest thing since
the other side of the pillow. However, Blueprint makes it really easy for visual people who are artistic
and not coders, like me, to make their own gameplay logic. In a lot of cases, those are the last
people you want touching gameplay logic. It is easy for people
to make bad decisions. When I go talk to people about
using Blueprint, there are common problems that
especially beginners run into. We are going to focus on those. Yes, I do generally refer to this as
keeping the kids from eating the Crayons. But the green Crayon tastes so good. All right. Blueprints make it very easy
for people to do bad things. We find the best results come
from some sort of mentorship. If you are at a studio and you want
people to be successful with Blueprint, have them sit down with engineers
from time to time. Ideally, have an engineer look
over shoulders or review Blueprints and correct them when they are
doing something bad. How many of you are programmers?
Raise your hands. We can still be friends. If you are an engineer or the engineers on
your team don't want to do this, they don't want to explain why you are
making bad decisions, I assure you. But it is better for everyone. It will take the visual people
who aren't used to thinking like coders, and it will
make them more efficient. Take the time to show them why
they are doing something incorrectly. Some common challenges you will run into. A reliance on Tick functionality. I love Tick. You can plug whatever you want into
Tick and that thing magically works. You should never use it in Blueprint. If you think you should use it in Blueprint,
you should not use it in Blueprint. If you try to use it,
you should probably be smacked. Challenges: Over-use of expensive functions,
(things that iterate on many objects), and abuse of hard references. All the engineers in the room are probably
nodding, thinking every scripting language in the world has
these exact same rules. That is the thing about Blueprint that the visual and artistic people
kind of forget. Blueprint is just a scripting language. It lives on its own VM layer. Blueprint has its own layer of processing,
so it is slower than native code. You have to follow these rules. We will talk about how to get around some
of them. Relying on Tick. Ticking means you are calculating
on every frame. You might think you want to tick;
you probably don't want to tick. You certainly don't want
it in a scripting language. Blueprints should almost never
need to tick. Sometimes you will run into something
like updating a Material, or you have a HUD for a fighter jet game
and you really think it should be updating on Tick.
You are probably wrong. You can probably update slower like 20 or 10 times per second
and no one would ever know. We will talk about that in a minute. Remember to uncheck Enable Actor Tick in the Class Defaults on every Blueprint
that is not ticking. It is easy to overlook. Since we designed Blueprint
to make visual people happy, we wanted them to be able to
plug something into Tick and see it work. By default, all Blueprints are ticking. In the Class Defaults, disable
Enable Actor Tick if you are not using Tick. Next are some alternatives to
Tick you should know about. Look into timers. Depending on what you are
doing, you may use a Timeline. If you have a curve based animation, maybe
consider timelines. There are cheaper ways than timelines
if you are doing anything linear. Manually enabling/disabling Tick on demand
is also very cool. Let's say you did write something,
and you say, "This has to tick. I know my engineer told me not to, but it just looks so much better
when it is ticking." Cool. You can leave Tick off until
you need to do that particular thing. Then, enable Tick. The Tick function will run and do whatever
it is supposed to do. When it is finished, you can disable Tick. You are only ticking when
you absolutely have to. If you do that, make sure you at least
think about adjusting Tick Frequency. This is also available
in the Class Defaults. By default, Tick Frequency
is set to 0.0, which means you are ticking
on every single frame. If you set that to 0.1, you are now only
ticking that Actor 10 times a second. Use Tick intelligently. Expensive functionality. Some functions in Blueprint are just
expensive by the nature of how they work. For example, Get All Actors of Class,
spawning objects into the world, or anything that needs to iterate
over a large list of objects. I have been to studios where people use
Get All Actors of Class, plug it into an array, doing a For Each on that array,
and doing a loop on each thing in that array,
and doing that on Tick. Then, they can't quite figure
out why their framerate drops to 5 frames per second
when they run that Blueprint. They tell me that our software is broken.
It's awesome. Try not to use these at all,
if you can get away with it. If you can find ways not to query
the scene for lots of things, that is great. If you can have those things call up
to a master class and say, "I exist. You should take note of me."
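[Editor's sketch] The "I exist, take note of me" idea, shown outside the engine. This is plain Python, not Blueprint or Unreal API: `PickupRegistry`, `Actor`, `Pickup`, and `find_pickups_by_scan` are hypothetical names made up for illustration. The Python `set` is only a loose stand-in for the kind of container the talk has in mind.

```python
class PickupRegistry:
    """Hypothetical manager that pickups report to when they are created."""

    def __init__(self):
        self._pickups = set()

    def register(self, pickup):
        self._pickups.add(pickup)

    def unregister(self, pickup):
        self._pickups.discard(pickup)

    def snapshot(self):
        """Return a copy so callers can iterate while pickups come and go."""
        return set(self._pickups)


class Actor:
    """Stand-in for any object in the world."""


class Pickup(Actor):
    def __init__(self, registry):
        # The pickup announces itself once, instead of being searched for later.
        registry.register(self)


def find_pickups_by_scan(world):
    """The expensive alternative: walk every actor in the world on each query."""
    return [actor for actor in world if isinstance(actor, Pickup)]


registry = PickupRegistry()
world = [Actor() for _ in range(10_000)]
world += [Pickup(registry) for _ in range(3)]

# Same answer either way, but the scan visits 10,003 actors per query,
# while the registry already holds just the 3 pickups.
print(len(find_pickups_by_scan(world)), len(registry.snapshot()))
```

The cost of the scan grows with everything in the world; the cost of the registry grows only with the things that registered.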
That is always faster. That is the kind of thing you will
learn from mentoring with an engineer. Make sure you are doing that. If you have to use things like this,
know the tools you have on hand. Make sure you know about TSets
and try to utilize those. Try to find infrequent moments
to do the heavy things. Maybe at Begin Play, maybe after a delay
on Begin Play; it kind of depends. Sometimes if too many things are using
Begin Play to do the heavy processing, it can take a while
for your game to start. Just be careful about it. Be aware of heavy Construction Scripts
because they can hurt your spawn times. Consider placing those things
in the world beforehand. Another one that I didn't list is
to watch out for any Blueprint that has a lot of components because
it takes a really long time to spawn those. This is not obvious. I run into a lot of studios
that suffer from this. This is using hard references
in Blueprint. It is very easy for Blueprints to generate
references to each other. The problem with references is when you load an asset
(like a Blueprint) into the game, it has to load all of the
references to other Blueprints too. If all of those Blueprints have
references to other Blueprints, then those have to load in
all of their things too. This can actually spider web into a situation
where, when you load in your game character, you load in 5/8 of your game
all at once. You can't understand why
load times are bad, and you call up Epic. You say,
"Hey, your stuff is broken". Watch out for this.
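
Editor's sketch: the "spider web" is a transitive closure over hard references. The Python below is a toy model, not engine code; the asset names and the `hard_refs` table are invented for illustration. It walks the reference graph from one asset and collects everything that would have to load with it.

```python
from collections import deque


def assets_loaded_with(root, hard_refs):
    """Return every asset pulled in by loading `root` (breadth-first walk)."""
    loaded = {root}
    queue = deque([root])
    while queue:
        asset = queue.popleft()
        for dep in hard_refs.get(asset, ()):
            if dep not in loaded:  # also guards against reference cycles
                loaded.add(dep)
                queue.append(dep)
    return loaded


# Hypothetical project: each key hard-references the assets in its list.
hard_refs = {
    "BP_Character": ["BP_Weapon", "BP_HealthPickup"],
    "BP_Weapon": ["BP_Projectile", "BP_Enemy"],
    "BP_Enemy": ["BP_Boss", "BP_Character"],  # a cycle back to the character
    "BP_HealthPickup": [],
}

loaded = assets_loaded_with("BP_Character", hard_refs)
```

In this made-up graph, loading the character drags in all six assets, while loading the health pickup loads only itself, because it references nothing else.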
It is really easy to do. It can sneak up on you. Has anyone ever used
Cast To nodes in Blueprint? Only one or two people in this room
have ever used Blueprint. That is awesome. Cast To nodes make a hard reference. Cast To nodes are fine to use if you
know you need them. That is great. My example is if you have
a Pickup class like a health pickup. When someone runs over it, you
want to cast to the Player to make sure it was the Player;
that will probably be fine. On the other hand,
if you said the Player has to cast to every Pickup type in the game,
that is probably inefficient. Start thinking about it in
terms of flow of information. Also, make sure to use
Blueprint Interfaces for things like that. The easy way to solve
that problem is to get the hang of Blueprint Interfaces, where a message
can simply be ignored by the receiver. If the receiving class understands
what to do with the message, and implements that Interface,
then great. If they don't, it won't crash your game. Basically, you can cast or get
a reference if you need to, but do it to a generic class like Actor. Then, just send a Blueprint Interface
message to that Actor. Other Blueprint optimizations. These are general tips for
anybody ever using Blueprint. Avoid doing too much of any one thing. It is the same rule as
any scripting language. Watch out for too much
functionality in one class. There is a studio I ran into
a long time ago where they modified the editor
so if you added more than 100 nodes to any Blueprint,
the moment you added the 101st, a modal window popped up that said,
"Go get a programmer right now". It is a real thing.
I was kind of impressed. You don't want to have a
lot of functionality. On that note, have you played Robo Recall? Most of that game is done in Blueprint.
It is actually pretty cool. There are some generic classes
on the back end, but most of the functionality
is totally defined in Blueprint. If you open those Blueprints,
there is a lot there. Just understand it took us
a lot of tweaking to get just right. Make sure you avoid class hierarchies
that run too deep. Watch out for having too many
components within a class. It just makes those classes expensive. It makes them hard to spawn in. If they are Scene Components,
it can make it hard to transform around because they have to be updated
on every frame as well. Avoid too much high-end math. I have seen people who are
really proud of some cool math equation they did with traces
or something in Blueprint, and they went through a lot of effort
to recreate something like a dot product. Don't do that sort of thing. UE4 has a math expression node. If you haven't seen it, right-click in the
editor and type "math expression". You can type an expression and it will
build a graph for you on the back end. That system is already
optimized to go fast. If you have to do high-end math,
try to do it that way. When all else fails and
you just can't make your Blueprints fast, call one of the many engineers in the
room right now and have them nativize it. Generally in our case, that means the engineers build a brand new class
entirely from scratch in 3 lines of code. That is fine. Then, you can re-parent your descending
class up to the new native class. Then for the most part,
things should just work. All right, knowing what
Actors are ticking. There is actually a command for that too:
dumpticks. This command will list everything in your
scene that is ticking right now. You probably can't read it, but it tells you how many Tick functions
are being used, how many of them have Tick enabled,
and how many of them don't. This data can be really useful as well. That is everything on Blueprints.
Now, we can move on to rendering. This is where your scenes can
become really slow and terrible. We will start with draw thread
optimizations, which are things on the CPU. It is exciting. Bottlenecks at the draw thread are usually
caused by things like too many draw calls, or occlusion queries,
simulating too many particles, etc. It really comes down to attempting to
do too much in your scene at once. Every time the draw thread is
slow, you are doing too much. Find everywhere you can to put
fewer things on the screen. It could be fewer objects.
You can use fewer vertices by using LODs. Think in that direction. Know about the Actor Merge tool located
under Window > Developer Tools. The Actor Merge tool takes a selection of
meshes and combines them into a new asset that will replace the
original in your scene. We use the Actor Merge tool a lot to
decrease the total number of draw calls. It actually makes a new mesh out
of whatever you had selected. Use Mesh Reduction to lower the complexity
of your mesh when setting up your LODs. The Actor Merge tool works best with
many meshes that have the same Material or as few
Materials as possible. If you try to combine 20 meshes and each has its own Material,
you are not benefiting
from the tool because every Material is going to make a draw call anyway. You are kind of right
back where you started. In the Actor Merge tool, there is an
option to combine Materials into one. It basically folds everything together into
one Material by atlasing all the textures. Understand that this functionality
requires a Simplygon implementation. Next, Instanced Static Meshes.
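
Editor's sketch, before moving on: the Actor Merge point about Materials can be put in rough numbers. The model below is a deliberate simplification (one draw call per Material slot, nothing else counted) and the function and Material names are hypothetical; it is not how the engine reports draw calls.

```python
def draw_calls_separate(meshes):
    """Simplified count: one draw call per Material slot on each mesh."""
    return sum(len(materials) for materials in meshes)


def draw_calls_merged(meshes):
    """After merging into one mesh: one draw call per distinct Material."""
    return len({material for materials in meshes for material in materials})


# 20 meshes that all share one Material: merging collapses 20 calls into 1.
shared = [["M_Stone"] for _ in range(20)]

# 20 meshes with 20 different Materials: merging saves nothing.
unique = [[f"M_Prop_{i}"] for i in range(20)]

shared_before, shared_after = draw_calls_separate(shared), draw_calls_merged(shared)
unique_before, unique_after = draw_calls_separate(unique), draw_calls_merged(unique)
```

Under this model the shared-Material case drops from 20 calls to 1, while the 20-Material case stays at 20, which is the "right back where you started" situation.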
These are awesome. Instanced Static Meshes are
a way to create multiple copies of any given mesh, but they will all
be considered the same mesh object. Right now you can only create
them in the editor by way of Blueprint or maybe your own native code class,
like if you were to write an art tool. It is pretty easy to make
a Blueprint that will do this. Generally what I do is
create two Blueprints. One will be a placement Blueprint. The placement Blueprint does nothing; it just holds a Static Mesh Component
you can point at any mesh in your game. You can use that to figure out how the
tables and chairs will be positioned. Then, you have another Blueprint that just collects all of the meshes together, figures out how many meshes it needs, and then populates the Instanced Static
Mesh Component to replace those things. If it is really clever,
it will at least turn off or hide or maybe delete the placement meshes
that were already there. Generally, I like to hide
the placement meshes. They weren't ticking anyway. If I need to replace some things later,
that makes it easier. Also consider Hierarchical
Instanced Static Meshes. These are kind of an advanced
version of Instanced Static Meshes. They handle their own
occlusion and visibility. It is essentially what we are using on the
back end for foliage. Hierarchical LODs are something you should
know about. We use these in Paragon. Paragon is an important case study
in lots of situations because it is a very pretty game that has to run
at 60 Hz on console at 1080p. Know about that. Hierarchical LODs allow
you to combine multiple meshes and then reduce the single mesh
that has been combined. You have 20 meshes. The farther you are away, the meshes lump
together and that one mesh will be simpler and simpler as you get
farther and farther out. It is really cool for things like cities or
any large conglomeration of complex things. Hierarchical LODs do not support
Instanced Static Meshes. Know UE4 will fail if you start
adding Instanced Static Meshes. I have run into that a few times,
even on this trip. Hierarchical LODs do require Simplygon. Next, GPU optimizations. Making the most of all the pixels you are
drawing on your screen. Vertex shader optimizations. Be careful how much you use
things like World Position Offset. By all means, don't be afraid of it. We use it in a lot of places. Just know that it has a cost,
and that cost can add up over time if you have too many objects using it. Vertex color can also be costly. On Paragon, we removed vertex color
from all of our assets and just put it in manually where we need it because the cost
was adding up to be too much. Pixel shaders. I just have some dos and don'ts
to make your lives a little easier, instead of doing several
individual case study problems. Don't use too much math. Again, there are the people who
will make really high-end math equations to solve something on a pixel shader,
but that quickly gets really expensive. Obviously, don't use too many textures. The engine will alert you
if you do that anyway. Watch out for procedural functions. We make a lot of cool nodes
that do cool effects on your shaders. Noise is my favorite. We have a really cool Noise
node in the engine now. I wouldn't use it for anything I
was shipping, just to be clear. You can definitely use the Noise node with a Render Target bake
to make tileable textures with it. I wouldn't use a Noise node for anything I
intend to have on my screen. Too many Material Layers
is another common don't. If you are familiar with Material Layers, they are a way to specify
a type of Material (e.g. skin, leather, denim, steel, etc.) and layer them all together
to make your final product. We do this all the time in Paragon
for characters that have armor, etc. You have to be careful because you are
essentially taking that final Material and adding all the instructions of all
of the Material Layers. If the layers themselves
are really complex, it can get to be expensive. Watch out for a reliance on if statements. I run into a lot of cases
where people believe that if you have an if statement, you are
using one side of the graph or the other. You are not. The pixel shader
has to calculate both sides. Good ideas for pixel shaders. Here are things you should be doing. Texture look-ups instead of using math,
if possible. You should bake anything you can
out to a Lookup Table. Compress grayscale maps
into single textures. You can add a grayscale map
into a channel for RGBA. You will see we do this
a lot in the engine. Minimize your Material Layer usage. Don't use more Material
layers than you have to. When you do Material layering like
that, keep the layers very simple. Don't worry about putting a high-end
normal map for your character inside here. Layer it over the top of the Material. Little things like that can save you some
time and hassle. Use Switch Parameters to turn
off the things you don't need. Your individual instances of a
Material can be simplified. It does require you to compile
another copy of the Material. If you use something like
a Switch Parameter and for instance A you have it on,
and for instance B you have it off; you would have
to compile both copies. That is better than not. Next, Material instruction counts. If you make content for the engine,
you may want a picture of this slide. Always pay attention to
Material instruction counts. This number is not accurate in the
Material Editor until you click Apply. It is an easy misconception. I will often open a Material and
first I will move a node and click Apply to recalculate and make sure that I
know what those numbers are. On Paragon, we actually
have an instruction budget. A lot of people asked me this. For characters, it is
about 350 instructions. On the environment, it is 150-200
with an absolute maximum of 255. If you go over that,
you have to start cutting down. It is cool to pay attention to this because
depending on what you do in your shader, you will see some operations will
really increase the instruction counts, and others you almost don't notice. Keep an eye on that number. Dealing with overdraw. Overdraw is probably one of the leading causes
of hurting your GPU; it will slow your game down terribly. One thing you can do
about overdraw is do less. Have fewer blades of grass, fewer
visual effects, fewer sheets of fog, etc. You will probably have to do that anyway,
just so we are clear. Minimize the area of
geometry for overdraw. Did you all see the kite demo we
did at GDC a couple years ago? That had a lot of grass and flowers
as far as the eye could see. We took all the polygon sheets
for the grass and cut them out all the way around the texture. There is almost no overdraw edge
around the whole thing. Don't be afraid of doing things like that. For particle systems, make sure you are
always using the Particle Cutout property. You should always be using this
for translucent particles. It is pretty simple.
It is in the Required Module. You assign a texture and it will snip out your particle sprite sheet to reduce it down
to the size of your texture. It will key off of alpha, off of brightness,
or any of your individual RGB channels. Next, managing texture resolution. You can author textures at pretty
much whatever resolution you like. I should be really
careful when I say that. 8k textures are probably over the top,
but I guess you could do it if you really wanted to. Keep in mind that you are probably
not always using full resolution. The engine is going to try to mip
those textures down, and very aggressively in some cases. Always make sure your textures are all
in powers of 2 so you can mip them down. Use the Texture Streaming views to see
the mip levels for any given texture. We are going to cover this
a little bit at the end. You can use the Statistics panel
under Window > Statistics to see what textures you have loaded
in your level at any given time. You can see what resolution
they were authored at, and you can see what resolution
they are actually drawing at. You can actually see what
mip level they are using. You can start to notice patterns
really quickly that way. We have this really awesome
8k texture on this rock that is only ever using
a tiny 64 x 64 mip. You can start making decisions like that. Next, lighting considerations. Dynamic lights are expensive. Hopefully everybody knows
that at this point. Small, unshadowed lights
are relatively cheap. You want to be careful
how you use those too. You can have a lot of these lights
in the deferred renderer. On the forward renderer, the rules are
a little bit different but not so bad. A few things. Minimize the number of
dynamic lights at all times. Try to keep the number of dynamic lights
as low as you can. I don't talk much about static lights. You bake those and then forget about them. At that point, they are really
just the cost of your lightmaps. Minimize the number of things
dynamic lights have to affect. Dynamic lights, particularly when
you are using dynamic shadows, have to process the geometry of
whatever they are casting shadows on. Dynamic lights against a very high resolution
piece of geometry gets expensive too. Know about that sort of thing. Minimize dynamic light radii. Keep the radii really tight. Cast shadows from as few
dynamic lights as possible. At the studios I run into, this has gotten better,
and usually not every light is casting shadows, but they use
a lot more than they need. Your scene can usually look very good with
few shadow casting lights. Watch out for stationary light overlap. It will sneak up on you
because it is easy to lose track of. Obviously, bake lights wherever you can. I know that baking takes time.
Artists tend to hate that. At Epic, we have nightly light bakes. We generally only use preview or low
quality builds for the most part. We only do a production build
right before we ship something. Keep it simple for as long as
you possibly can. Don't assume that you need dynamic lights.
You probably don't. I have run into folks,
even on this trip, who say they had to use dynamic lighting because... When you look at their scene,
they are in a world that is very static. Things are not changing. They are not
knocking down walls very often, etc., so you should probably be baking that
whenever you can. Use Mesh Distance Field shadows. If you can afford Mesh Distance Field
shadows, they are pretty cool. Not all projects are right
for Mesh Distance Field shadows. Sometimes on console they can be a little
heavy depending on what you are doing. We use Mesh Distance Field on Fortnite and
Fortnite just went out on Xbox One. They are cool. They give you Distance
Field Ambient Occlusion (DFAO), but they also give you nice distance field
shadows which we use at a distance. You see what I did there. Watch out for dense shadow cascades. Make sure you are aware of Shadow Bias,
which will clean up some artifacting. You can overuse that as well.
Like with a dynamic sun, you will want to limit the
angles that your sun can move so you don't have to
use Shadow Bias too much. Always keep your lightmap resolution
as low as you can, which we kind of already covered. More lighting considerations. Avoid light functions if at all possible. They look really cool. We use them. If we need to do cloud shadows
across the ground, we will generally put a light function
on the directional light. Know those things get expensive,
and don't do it if you don't have to. IES profiles are much cheaper,
so use those if you can. Lit translucency gets really expensive. One random case study. I was helping a studio out with
a game they were putting together and they couldn't figure out why
lit translucency was hurting them so much. If you remember, we released a lot
of content when UE4 came out. We have the one effects cave
that looks kind of like Skyrim. There are a lot of lit
translucency particles in there. They had migrated that asset
into their project and then copied it over and over again
to make ground fog. I found that and said,
"Just because we did it to show off how cool the engine is
doesn't mean you should ship with it". Just a random thing. Just because we did it and it is cool in a tech demo,
doesn't mean you should ship with it. I said that to everyone. Next, cull dynamic lights
as early as possible. Turn them off if you can. Spot Lights are cheaper than Point Lights, at least by a little bit. Don't be afraid to
completely fake shadows. For several of our VR applications, we will use capsule shadows
because they are cheap. In some cases, we will actually do
completely fake and old school 1997 blob shadows where we have a piece of
geometry with a modulated Material on it. Optimizing for device. I am going to go through this
kind of quickly because of time. You do have some device optimizations. We try to promote the "build once,
deploy anywhere" technique. To keep your life easy,
you shouldn't have to be doing a lot of bespoke individual assets for
the different platforms. All things being equal, this just works. Many of the content tools
in UE4 have options for making intelligent content that will
cook to many different platforms. If you didn't know... Actually, a lot of the things I am going
to talk about are things we are using on Paragon to keep it running
at 60 Hz on console. Make sure you are aware of
the Device Profiles editor. It gives you access to a list of every
device you will probably want to ship on: every mobile device,
every console, if you have the console content
installed in your editor. You can change cvars based on that. You can say, on PS4
I want lower quality shadows or I want lighting to be calculated
in a lighter way. This enables lower performance devices
to make different decisions. A nice thing about this is you have some
per device Material optimizations as well. There is a QualitySwitch node that allows you to take your Materials and
set them to Default, High, or Low quality. You can use a cvar to switch out which
one of the inputs you want to use. If you go into your Material
and search for QualitySwitch, you will see as a default setting:
Low, Default, or High. You will use several of these in the
different inputs of your Material. When you cook, you can say on PS4
use the Default version of that Material, and on PC we are going to
use the High version. That is exactly what we do on Paragon. A quick note on network optimizations. Common problems for networking are usually
pretty obvious. You are doing too much and
you are doing it too often. Obviously, replicate as little as you can. I want you to walk away from here knowing
about the net.* commands. There are a few of these net.* commands. If you go into the console and type "net.",
you will see a long list of the commands. Probably the most important
is net.DumpRelevantActors. This command was just cleaned up for 4.19,
so it is going to work a lot better. Run these commands on the server often,
so you know what is actually going on. That is easy. Use cheat net.* and
whatever your command is. Be aware that you can do that. There are some features
coming soon to 4.19. Some of our Battle Royale efforts are
to make the engine better for everyone. We have the stat net command. This has been improved for 4.19 to show
more reliable data, and it can be used
from the client reliably. We have Network Relevancy
view mode coming in 4.19. Essentially, this color-codes your screen
so you can see what is currently relevant, what is not relevant, and what is
dormant and will be relevant. It is really nice for identifying what
things are affecting your bandwidth. Content streaming. Again, we will go over this
kind of quickly as well. Textures streaming in and out
of your scene can cause pops if you are not managing
or staying on top of it. We have a few tools to
make this easier on you. The texture streaming view modes were
added relatively recently to the engine. I want to say 4.15. Then, we have the stat streaming metrics. We have Primitive Distance Accuracy. You will find this under
View Modes > Optimization, where you will see there are 4 different
streaming view modes. Primitive Distance Accuracy view mode
shows you how efficiently you are mipping your textures
based on the distance from the camera. It is a way for you to see what decisions the texture streamer
is making based on distance. It color-codes to let you know
if you have too few mips, too many mips, or if you are
on target which is the color white. We can see the same information
based on UV densities. The way you lay out your UVs is
also taken into account by the streamer. This is a way to do the same thing,
but based on how the UVs are laid out on your objects. We have Material Texture Scales Accuracy
view mode. If you have been scaling your UVs
to change the way textures look, that gets taken into account by the texture streamer as well. This view mode is about showing you
errors, which are areas where you may have the worst possible outcomes
and you need to check texture scaling. Finally, we have the
Required Texture Resolution view mode. This shows you the difference
between what the GPU thinks it should render for texture
resolution, and what it is actually getting. The view mode is color-coded
to help you see where you may want to make changes
to your texture resolution. Finally, we have Stat Streaming. These are realtime metrics on how much
memory you are using based on what the streaming system thinks it needs and how much memory is being used.
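
Editor's sketch: the gap between authored resolution and the mip actually drawn is simple power-of-two arithmetic. The helper names below are hypothetical and this is not the streamer's algorithm; it only works through the earlier example of an 8k texture that only ever draws its 64 x 64 mip, assuming square power-of-two textures.

```python
import math


def mip_dimensions(size, mip):
    """Edge length of a square power-of-two texture at a given mip level."""
    return max(1, size >> mip)  # each mip halves the edge length


def mip_level_for(authored_size, needed_size):
    """Mip level whose edge length equals needed_size (both powers of two)."""
    return int(math.log2(authored_size // needed_size))


def texel_fraction_used(authored_size, needed_size):
    """Share of the top mip's texels that the drawn mip actually contains."""
    return (needed_size / authored_size) ** 2


rock_mip = mip_level_for(8192, 64)
rock_fraction = texel_fraction_used(8192, 64)
```

For the rock, the 64 x 64 mip is seven levels down from the authored 8192 x 8192, so only 1 texel in 16,384 of the full-resolution data is ever seen, which is the kind of pattern the Statistics panel and the streaming view modes help you spot.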
Know about that. Level streaming. Most games should probably be using
Level Streaming to some degree. It is not always about breaking your game
into individual levels in the classical sense. We use it more like a layering system
for most of our games. It is nice to be able to stream out
the parts of the world that you don't need. We have a lot of documentation and
examples on how this works. Be sure to check out the
Content Examples project. Level streaming is how we collaborate
on levels at Epic. If you are not doing this,
you should. Every now and then, I run into
studios who haven't started doing it. For example, in Paragon,
we only have one world and we don't need to stream
in and out very much. We break it up into levels for Lighting,
Base Geometry, Collision, Foliage, etc. This is kind of a simplified view. It is cool because in Paragon
we also have a Gameplay level. All of the towers,
spawners, and all the MOBA elements are stored on one streamed level so it is
easy for everything to communicate. World Composition is another approach
to Level Streaming. World Composition is designed
for very, very big worlds. We did a lot of polish on World Composition
for the kite demo a few years ago. World Composition takes a large piece of
landscape and breaks it into a grid. Then, the level will automatically stream
in and out as you move through it. World Composition does not work with
old school streaming volumes. The UE3 streaming volume system
immediately stops working when you use World Composition. Know that it is very easy
to make a Blueprint that will do the exact same job. You can create that Blueprint, use those in lieu of old streaming
volumes, and everything works the same. I have some further information on things
you should at least be aware of. You can get into this and review
some of what we have talked about. How to Scale Down and Not Get Caught
is a talk that Martin Mittring, formerly of Epic,
did several years back. It is still on YouTube and
is worth checking out. It is how we put together
the Vulkan demo a few years ago. The Tech Art Aid videos on YouTube
by Oskar Świerad are very good.
I have checked out his work. It is informative because he breaks down
all of the different passes on the GPU. It is really juicy stuff. Read articles. Read as much
as you can about how GPUs work. The article I really love is
GPU Performance for Game Artists. It is on http://www.fragmentbuffer.com. He has broken down all of the
different GPU passes to make sense of them for
how you are building your content. It is very useful. I have used up most of the time I have. I am not going to take questions
right now; I am going to get offstage. If you have questions,
please come and find me. I am going to be up on the terrace.
I would be happy to talk to you all. Thank you for dealing with my walls of
text and barrage of talking to you. You are all great.
Enjoy the rest of Dev Day.