>> In C and C++ you probably spent a lot of time thinking
about memory allocation, but there might be cases
where you want to think about those same things in .NET. If you want to learn more, there is a really cool profiling tool
called the .NET Allocation tool, which we're going to
learn more about on this profiling episode of
Visual Studio Toolbox. [MUSIC] >> Hey, everyone. Welcome
to Visual Studio Toolbox. I'm your host, Leslie Richardson. Once again, I am joined
by Sagar Shetty, who's a PM on the Visual
Studio Diagnostics team. Hi, Sagar. How's it going? >> Good, Leslie.
Thanks for having me. >> Absolutely. Once again we are back for another episode of
our profiling series. What are we going to
talk about today? >> We're going to go through
another deep dive for another tool within the Performance
Profiler in Visual Studio. Today, we're going to talk about the .NET Object Allocation
Tracking tool. A bit of a mouthful, but it's one of two memory profilers in the
Performance Profiler suite. It's basically designed to
show you the top functions, the code pathways, and different types that are allocating the most amount
of memory in your code. >> Cool. I don't know if
you ever simplify that down to NOAPPS. No apps. >> Yeah. Normally, we just call it the .NET Alloc Tool internally
and that's short enough. >> Yeah, that sounds cool. Usually, when I think about memory allocation or just
allocating anything, I typically think of C or C++ where you have to
do all that manually. But in .NET, usually, the garbage collector takes care of all that stuff for you, so why have the .NET Allocation tool? >> Garbage collection, that's
definitely a great point, Leslie. Especially with .NET, it's gotten to a point where there is the
garbage collector that, like you said, does a lot of that
memory management automatically. That being said, profiling
at a high level is about getting the optimizations
that you want to the highest degree. Even though garbage
collection can help you with some of
that memory management, there are still optimizations to be had based on the way that
you write your code, and hopefully using our tool, we can help surface some of those optimizations that you can
do on top of garbage collection. >> Can you describe a
little bit about what those specific use cases
would be for a .NET user who might want to use this tool? >> Yeah, absolutely. I think the easiest way to explain
is to jump right in. Let's go into VS and then
start looking at some of the views and I think some of those scenarios will become more apparent. >> Awesome. >> Going into VS, again, just like all the other tools to get into the Performance Profiler, it's the same workflow. You go to the context menu, you can go to Debug, and then Performance Profiler, or just use the keyboard
shortcut Alt + F2. Then we get to this page. To give a little bit more of a background with this particular tool, so today we're talking
about the .NET Alloc tool, so I have this box checked. This tool is going to be good for really any managed scenario
or managed application. It's good for all flavors of .NET: .NET Framework, .NET Core, ASP.NET, etc. >> Works with Framework too? >> Yeah. >> Wow. It really is a pretty comprehensive
tool for managed code in that regard. For native, you're not
going to use this tool. Just based on the way it's
architected, it uses ICorProfiler, the profiling interface
for the .NET runtime, so it really is more of a
managed experience. Go ahead. >> Is there a similar tool for
C++ users that they can use? >> Yes. On the C++ side and for memory analysis
on the native side, you're going to end up using
the Memory Usage tool. Now, these two tools
aren't exactly alike, but that will give you
some insights in terms of where your code is spending a lot, from a memory perspective. Going back to the .NET Alloc tool, another thing I'd like
to call out is in this particular settings window
if I click on this gear icon, we come to this particular window. Now, in the past, we've talked about a few
different data collection methods and when Esteban was talking about the CPU
Usage tool, for example, something we talked
about was sampling, which essentially was a
data collection technique where you're taking snapshots of the performance data in your code and
stitching them together. With the .NET Alloc tool, on the other hand, you can use sampling if you want (you can switch it over to this), but by default, it uses a slightly different data
collection technique called instrumentation. Instrumentation is
essentially this: if sampling is taking snapshots and
stitching them together, instrumentation is like a video. It really is much more
detailed, and it's giving you exact call counts and
very fine-tuned, precise, and accurate data. That's cool because you
can get exact call counts. We'll see some of those
values in the example today. One thing I will caution users using this tool is that as a result, this data collection technique
has a pretty high overhead. To counteract that, one
recommendation I have is to keep traces as
short as possible. I will reiterate this at
the end of the video, but performance is definitely something we're working
on with this tool in terms of improving
and speeding it up. I can assure you this is something our engineering team is
very hard at work on. But in general, just as
far as best practices, keeping traces short
would probably be something I'd recommend
with this tool if you want to use instrumentation. >> What is the default set to
like frame-wise or however you measure how much or how
little gets tracked? >> With this tool, it's
going to literally track every single object allocated. As we'll see once we run the
tool and go into the views, it will look at different
object allocation types, and for each different
object type in your code, it will show you how many
objects were created and how much memory that
total is taking up. So by default, the
instrumentation is tracking all of them and grabbing
all of those objects, and it's measured on an object level. >> Sweet. Well, I'd like
to see it in action. >> Yeah. I'm going to
close out that window. I have an Uploader app, and I'm going to go ahead and click "Start" and start the
profiling session. The app we have today is
actually a WinForms app. What this app is
essentially doing is helping us visualize
prime numbers. It's a pretty simple app. You have a minimum
value and a maximum value, and for each of those values
an ellipse is created, and each ellipse corresponds
to a specific number. For numbers that are prime, they're indicated by a green icon, and for numbers that
aren't, it's yellow. We'll exercise this a few times. Click "Stop". This app does quite
a lot of memory allocation, so it's a good way to show
off this particular tool. >> In this case is a
object being created for each number or each dot that
was on the screen there? >> Yes. We're going to dig
into the source code a bit more but for each
ellipse that's creating, and in this case, an ellipse objects. We're using an ellipse class, and so there will be an
ellipse object being created. So yeah, a lot of memory
is being allocated. >> Yeah, I can imagine. [inaudible]. >> Ramp up, max up to 10 million. >> Exactly. Awesome. Now we're looking at the report generated
by the .NET Alloc tool, and there's a couple of views here. The first thing I want to point at are the graphs at the
top of the screen. In the highest graph, we have live objects. This is
just a total count of the live objects that are allocating memory
within your code. This is objects across
all different types. In some of the views when
we go down to the tables, we can look at and categorize
objects by specific type, but this graph is just showing an
all-up total count measurement. That's live objects. The second graph is the
object delta, so the change. This is showing anytime you
have large spikes in objects, and then as we alluded
to earlier, Leslie, with garbage collection cleaning up
and reclaiming some memory, you have these red bars
every once in a while. The red bars are actually indicating where garbage collection
is occurring in your code. In general, you can think
of it where green bars are just adding on more and more objects. At the beginning,
you'll see large deltas because it's just the way
the math works out, right? >> Exactly. >> Yeah, exactly. When
you're initializing and you don't have many
objects to begin with, even adding a few more objects is a lot percentage-wise, because this is
on a percentage basis; over time that tends
to dip a little bit. But the really important thing
with these graphs or swimlanes, as we call them within
the Performance Profiler, is you can time select
and filter down by that. For example, I can select
a range on this graph. What that will do is
filter down the data in my tables by that time
range I selected. So if I was really interested
in a bit of garbage collection, or here's an area where there
was a lot of app activity, I could filter down and then look at specific time ranges and really dig deeper for that
specific time range. Another thing I'll point
out, and this isn't as applicable for this tool
but just to reiterate, when you're running multiple of our tools in conjunction
with each other, we will stick swimlanes
at the top for any tool that has one, such as the CPU
Usage tool, and if you want to filter by the same time
range across all tools, doing the filtering at the top
will apply that time range to all the tools
and all the reports. With this tool, you're generally running it by itself because
of the high overhead, but I just wanted to
call that out again. Those are the graphs, and I'm just going to
clear the time selection for now so we're looking
at everything all up. Now we dig into the tables, where I think there are
a lot of insights. Generally, there's a
few different ways that you can start your
investigation for today. I'll start with the allocations view. The allocations view is
essentially showing you a bunch of different object types or classes
or structures within your code. There's a very long list here, and Ellipse has
bubbled up to the top. For each one of these
types, we're showing you the number of allocations. This is the number of, in this case, ellipses or objects of that
type created within your code. Furthermore, in addition to just the number of
allocations of that type, we're showing you
the actual amount of memory that's being taken up. So that's the bytes column is showing you across all
of those allocations. Then, also, average size. That's just the division
between bytes and allocations. >> It looks like a lot of allocations going on for
pretty much all of these. >> Yeah, and so we'll dig into ellipse a little bit more in
the source code in a second. But one more thing I
want to talk about as far as the types is generally they fall into two categories
and two subcategories. The two main categories are
value types and reference types. Actually if you notice, this is something we've modified
over the last few years. We've actually added in icons
into this particular view. This blue icon over here indicates a value type and this yellow one indicates
more of a reference type. So, what are those? Value types are things
like, as we see here, a double or an integer
or a Boolean even. I said an integer, for
example, is a value type. Whenever you create a variable of a value type
such as an integer, what happens is a
specific memory address is set aside and that's where the variable is
initialized and stored. In the case of a value type, the value of that variable and the variable itself exist at
the same memory location. In the case of a reference type, things work a little bit differently. Reference types are
things like strings; a class is a reference type, an array is a reference type. In the case of a reference type, where a variable exists and where its value exists are
actually two separate spots. A string might be initialized at a specific memory address
and then its actual value, the contents of the string are at a slightly different
memory address. At the memory address where
the variable is stored, it actually also contains a pointer to the place where
the value is stored. The reason why this
is important, Leslie, is because of how these two types are stored in
memory within the .NET runtime. If you think about our memory, it's essentially like a physical
block, it's a finite resource. There's a limited amount of space we have to store data
and allocate memory. Within that memory,
the way it's managed in .NET is, at a high
level, two partitions: there's the stack and the heap. The stack is generally a partition where things that
are more short-term are stored, local variables,
things of that nature. The heap is a partition where more long-term objects
are stored in general: objects and things of that nature, things that
are more long lived. This is oversimplified
and more high level, but that just gives you a sense of where those two things are stored. The reason this is important is even though value types are
stored on the stack, if you have an integer and
you cast it to an object, that actually becomes
a reference type; this is called boxing. So then it ends up being
stored on the heap as well. Now, you have a value
type that's taking up memory on the stack and the heap. In other words, it's taking
up twice as much memory. You have to be on the
lookout for value types. That's why we surface them
with these icons here. So yeah, we have value types, we have reference types.
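To make that distinction concrete, here is a minimal C# sketch (illustrative only, not from the demo app):

```csharp
int number = 42;        // value type: the variable and its value live together
string label = "prime"; // reference type: the variable holds a reference,
                        // and the string's contents live on the heap

// Boxing: casting a value type to object copies the value into a new
// heap allocation, so it now takes up memory in both places.
object boxed = number;

// Unboxing copies the value back out of the heap object.
int unboxed = (int)boxed;
```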
Also, if we go back and look at this backtrace, you may notice I mentioned
that there are two sub-types. There are also these blue icons with these buckets
under them and also, we have the yellow icons with
the buckets under them as well. In that case, those are value type collections and
reference type collections. Again, the blue icon
is the value type, so this bucket is
showing a collection, and then the yellow one with the bucket is showing
a collection as well. That's essentially just
taking a value type and showing a collection of it. In this case, this type is EffectiveValueEntry and then
it's like a list of that type. Then, in this case, here we have System.Object
and it's like a list or an array of objects or
a collection of objects. That's what those icons are showing. Anyway, that's a bit of a tangent
backstory on different types, just thought I'd go into that. But getting back into
the code a little bit. Right now, we have this
sorted by bytes and so whatever has the highest number
of bytes bubbles up to the top. This is taking up a lot of memory. What if we wanted to investigate
this a little bit more and see what's happening
in our code here? If you double-click on this line, what we show you in this right panel, and let me modify this window
a bit, is a backtrace. After we click on this ellipse type, we want to look through
the backtrace and see where in the code it's
being allocated a lot. We see this GeneratePrimes function allocating a lot of memory
and a lot of bytes. Now, what I want to do is ultimately go back to source code
and see if there's any modifications that can
be done to optimize this. I can right-click and hit "Go To
Source File" and in this case I have the code up and so we come
to this GeneratePrimes function. >> Cool. I'm a little surprised because in some of the
previous tools that we talked about in past episodes like the CPU usage tool and
I think memory tool, database tool, they all had hot path function tables with little fire icons
that indicated: here are the functions that you should consider honing in on because they're hotspots for the CPU usage or memory perf issues,
that sort of thing. >> Yeah, exactly. We'll actually touch on that again with this tool. In fact, just a sneak peek, we have that same functionality
in the Call Trees window, and so we'll talk about that a
little bit later and here it is. But yeah, we'll come back
to that and for now, keep going through
the Allocations view. But yeah, the
Expand Hot Path is definitely a useful feature and we do preserve it in
this tool as well. >> Cool. >> Now, let's see.
We're looking back at some source code and
trying to see where we can optimize and see why ellipse
is allocating so much memory. We have this GeneratePrimes function, we have these long values, the min and max that we saw
in the application before. If we scroll through here, we see we have a for loop
going from min to max. Within this for loop, we have an ellipse, which is in this case a class, and so we're creating
a new Ellipse object for each different
iteration of this for loop. That's why, when we
ran the application before, there were a lot of ellipses. >> Yeah.
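As a rough reconstruction of what that loop looks like (the actual demo source isn't shown; WPF-style Ellipse and SolidColorBrush types, the IsPrime helper, the canvas container, and the sizes are all assumptions):

```csharp
void GeneratePrimes(long min, long max)
{
    for (long i = min; i <= max; i++)
    {
        // A brand-new Ellipse is allocated on every iteration,
        // which is why Ellipse dominates the Allocations view.
        var ellipse = new Ellipse { Width = 4, Height = 4 };

        // A brand-new SolidColorBrush is also allocated every iteration,
        // even though the two colors never change.
        ellipse.Fill = IsPrime(i)
            ? new SolidColorBrush(Colors.Green)
            : new SolidColorBrush(Colors.Yellow);

        canvas.Children.Add(ellipse);
    }
}
```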
>> One thing I will point out here is that, based on the nature of this particular
application and this visualization, even though we are
allocating a lot of memory towards ellipses,
for this visualization, we actually do want each of those ellipses because
we wanted an ellipse, printed out or shown for each number. So yes, it is taking up a lot of memory but
you just have to think about, is this an actual bottleneck or
are you willing to live with this? Because if we were to want to
improve this, for example, there's only so much we
can do to get around this particular issue based on the way we've currently
implemented this code. Right now, yes, even
though we are creating a new ellipse for each
iteration of the for loop, we actually want to
do that and paint it a specific color based on whether
it's a prime number or not. If we wanted to not
have to use an ellipse, we'd have to really think about how to instruct this code
very differently. It might not be very easy to do that. We probably wouldn't use
an ellipse class seeing as the height and the
width are the same, maybe we use like a circle
or something different. Even though we can fix
this or optimize that, that's not what I
want to focus on for this particular demo
because ultimately, it might be a bit more involved. A question I might
have instead though is this function seems to be getting a lot of bandwidth
and doing a lot of work. Yes, there's a lot of type ellipse
being allocated and created. But are there other types within this exact same
function that are also being allocated a lot and is there
another optimization to be had? To answer that question, I want to go over to another one of our views within the diag session. The first thing I want
do is actually copy this function because I want to
investigate this function more. The question I want to ask
myself now is not so much what are the top types being
allocated for a specific function, in particular that function
I was just looking at, what are the top types that
that function is allocating? Actually, we have a Functions view
that can help you do just that. This is the Functions view showing you similar data to what
we were looking at before, but just grouped differently and
visualized a bit differently. Something I want to emphasize with the Allocations view
and the Call Tree view, which we'll look at shortly here, and the Functions view is
you're looking at similar data, it's just grouped a
little bit different. It's like a slightly different
pivot table, if you will. In the case of the Functions view, we have the process ID up
top and then within that, we have different modules
and then within modules, we have specific functions. Now, I had a function in mind
that I was interested in. We actually have the search bar here. I'm going to paste GeneratePrimes
in here and hit "Enter". Now, it's going to bring me straight
to that function of interest. When I come to this
function of interest, if I expand this particular node, I see all of the top allocation
types for this specific function. We have the total allocations
for this specific function; we see the self allocations, which is the amount of allocations that
just this one function is doing; total includes what this function generates
and all of its children. Then, we also have the
self size in bytes, so the actual amount of memory.
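To make the self-versus-total distinction concrete, here is a tiny hypothetical example (not from the demo app):

```csharp
static int[] Parent()
{
    // Self allocation for Parent: this array.
    var data = new int[1000];

    // Child's string counts toward Parent's *total* allocations,
    // but not toward Parent's *self* allocations.
    Child();
    return data;
}

static string Child()
{
    // Self allocation for Child: this string.
    return new string('x', 100);
}
```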
But if we dig into the types again, we see okay, Ellipse is at the top; there are a lot of ellipses being generated within this function. But if we look at this a
little bit more, Leslie, there are other types that are also allocating in quite high quantities, even more so than the ellipse. We have 30,000 colors
being allocated and 30,000 colored brushes
being allocated. They don't take up quite as
much memory as the ellipse, but they still take
up a sizable amount; over a million bytes
for each of these. Now, with this
information in mind, I want to go back to source
code and say, hey, look, let's look at that function again, but not so much focused
on the ellipse, but on these other types, and maybe there are other optimizations to be had. If we optimize this, even though it wasn't the top type, it's still memory we're saving. I want to go back to the source code. I can do that by right-clicking
and saying "Go To Source File" and I have the
code up, so let's go back here. As we're looking at this code again, if we look at this for
loop more closely, so yes, we're creating an
ellipse for each iteration, but we're also filling it
with a specific color. That color is actually not changing. But based on the way we
have this implemented, we're creating another solid
color brush object each time. In this case, this is
the color of yellow and in the case of the prime
fill color, we had a green. We're creating another solid color brush object and making that green. But we're creating a new object
each time in this for loop too. >> Yeah, that's a lot. In the case of the Ellipse, like I was saying before, yes, there are ways we could
optimize that, but it would be a little
bit more involved. We probably wouldn't
use the Ellipse class. But in the case of the fill color, we can actually do something
pretty quick here. I have the code
commented out here as an example: we could pull out this new
solid color brush object and bring it up to a
static member right here, and we could assign that to
a variable like Fill Color and then prime fill color in the
case of the yellow and green. Then instead of doing fill equals
this new instance of an object, we just assign it to this static
member fill color up here. That's what I have
commented out here.
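A sketch of that change, under the same assumptions as before (member names like FillColor and PrimeFillColor mirror the transcript but are not confirmed source):

```csharp
// Allocate the two brushes once, as static members, instead of
// once per loop iteration.
static readonly SolidColorBrush FillColor = new SolidColorBrush(Colors.Yellow);
static readonly SolidColorBrush PrimeFillColor = new SolidColorBrush(Colors.Green);

void GeneratePrimes(long min, long max)
{
    for (long i = min; i <= max; i++)
    {
        var ellipse = new Ellipse { Width = 4, Height = 4 };

        // Reuse the shared brushes: no per-iteration Color or
        // SolidColorBrush allocations anymore.
        ellipse.Fill = IsPrime(i) ? PrimeFillColor : FillColor;

        canvas.Children.Add(ellipse);
    }
}
```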
I won't rerun the code because it might take a little while, but basically, if you do that, what you'll see is that
all of them will go away. Because we're not
necessarily creating a new solid color brush object every single time for every
iteration of the for loop, we just have that static
member that we declared at the top and that's not changing, and then we just reference
it each time within the for loop and it just gets
painted that particular color. So that will save us
a lot of allocations. Because if you come back
to the Functions view, this function alone had 30,000-plus allocations of both Color and SolidColorBrush, and over
a million bytes each. All of that will go away, essentially. Something I want to point out here is the nature of the investigation: it's up to you to figure out
what you want to optimize. Maybe you do want to
optimize and go after the top frame, which was
Ellipse in this case. But as we pointed
out, the code change might be more expensive
or more involved. So it's ultimately up to you to
figure out what's worth your time, and what the trade-off is. But in the case of color
and solid color brush, that's a quick win, and it's still only going
to help your application. So I just wanted to point that out. >> Yeah. I think that's a theme for a lot of profiling
investigations like that. Ultimately, it's like, "Okay. How badly do you want
to fix this path, if it means having to
modify your code in a way that maybe doesn't make
sense depending on the context?" >> Totally. In software development you have to deal with these
trade-offs all the time. Ultimately, on a profiling team, what we're trying to show
you is just data and insights into how your application
is actually performing. What you want to do with that
data ultimately is up to you. But you have to ultimately decide where you want
to spend the time, and what's worth it. >> Not that it's been quoted
a bazillion times already, but with great power comes
great responsibility. >> Great responsibility.
Exactly. We talked a little bit before about how we're showing you different pivots on similar data between the Allocations,
Functions, and Call Tree views. Now, I want to dig into that
third view, the Call Tree View, and we alluded to it earlier, Leslie with the Expand
Hot Path feature. So what the Call Tree View
is showing you is just what are the code pathways that are allocating the
most amount of memory. So with the allocations view, you're filtering by a specific
datatype or object type. With the functions
view, maybe you had a specific function you want to
really drill down into and say, "Okay, I'm interested
in this function across the entire time span," or, if you're using the swimlane
to filter down the time span, what are all the
allocations happening here? The Call Tree View is just saying, "Okay, maybe I'm not interested, or focused on a specific
object type yet, or I'm not focused on a
specific function yet, but what are just the code pathways? Just show me the paths where a lot
of allocations are happening." One thing you can do with the Call Tree View is
expand nodes individually. But as we alluded to earlier, what I would recommend
people do is start at a node of interest and use
the Expand Hot Path feature. What this is doing is essentially
showing you where most of your allocations are
happening for a given path. To walk through some of the
metrics in this view again, so at any given node, we have the total amount
of allocations happening. So that's all the allocations at this particular frame and
then all of its children. We have self allocations, which is all the allocations
just at this particular level. So this is native, but if
we wanted to look at Main, for example, Main is
allocating different things. Then we have the bytes
in terms of memory, not in terms of the number of
objects, but in terms of the memory. Then we have the Module Name, which is essentially showing what module that function
is associated with. Sometimes they'll be associated
with multiple modules. So with the Expand Hot Path algorithm is essentially doing is saying, "Hey, as we're walking down this, if there's a lot of self allocations happening
within total allocations, that means you should go into the next function and dig
into that a little bit more because that function is contributing to a lot
of applications." So I started it up here and
then you'd expand Hot Path. What it brings us down to
is essentially two things. One is this Generate
button click method, which is certainly allocating a
lot of memory because that's the button that's essentially
triggering that visualization. Then also this Allocations frame. So walking through each of
these two individually. The allocations frame is saying, hey, at this particular
node right above it, which in this case would be something like Application.Run in System.Windows, what are all of the top allocations happening for this particular method? So it's similar to the Functions view, but it's specific to this particular Call Tree
and Call Path there. That's something important
to note because there are functions in the Call Tree View as well as functions
in the functions view. Someone might ask,
what's the difference? The difference is the function within the functions view is
looking at the data all up. It's combining the functions
across all the times it's being called and adding the allocations across
all types within that. The Call Tree View is looking
at a specific call stack. Any given function might
be called many ways, and if we drill down into
these different nodes, you'll see a lot of
the same functions being called multiple times, but it's showing you
each different iteration of a specific call stack. Something you can see
in the Hot Path is the specific allocations
for a node of interest. Also, it will sometimes end with
another function to look at. I started the Hot Path
from this highest node. Something to note is you can really
start at any level you want. Let's say I wanted to look at
the GeneratePrimes function. I can start the Hot Path here
too and it'll show me the allocations or other UI external calls
that are happening as well. This is just another
view, another pivot on that data, and it's allowing you
to go through the Call Trees, and ultimately see what code pathways are allocating the
most amount of memory. >> That can be really useful. It seems for a lot of
.NET peeps out there, especially if you're
dealing with a lot of graphics-intensive things like that application with
the prime numbers. >> Yeah. Something I want to call out again is, especially in this view, this is a time where you're really probably going to
engage some of the time filtering if you're
really interested in garbage collection and you want
to see what the calls were. I don't necessarily care about
all the functions in the world, but what were the
functions being called at this particular time period, because
you can combine the graphs with this view. Again, we're improving
performance; it's a bit slow, but it will show up eventually. Yeah, you can combine
those two views together. >> Awesome. You mentioned that [inaudible] is still a
work in progress for this tool. So anything else on the
roadmap for .NET allocation? >> Yeah, so this actually segues
perfectly into our last view, which is the collections view. I'll clear that selection and then go to the collections view. Admittedly, this view is pretty young right now and
so we want to work on it. But essentially, as we
alluded to earlier, there's a limited amount of
memory you have to work with. It's a question of how to
allocate and manage it best. Luckily, as we mentioned previously, .NET does a good job of having the garbage collector come
through and automatically scan the heap portion in
particular of the memory, and it looks at which
objects are allocating memory but are not
being used, and cleans that up. So what the collections
view shows you is instances where garbage
collection occurred. If I click on a specific
row within this table, I see first of all the number of objects that were collected
and how many survived. Then we also get our pie
charts over here, which show you the top types within each
of those garbage collections: what were the types that
went away and what survived?
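The Collections view surfaces this from the trace, but you can get a rough feel for the same collected-versus-survived idea in plain code with the GC class (a standalone sketch, unrelated to the demo app):

```csharp
using System;

class GcSketch
{
    static void Main()
    {
        long before = GC.GetTotalMemory(forceFullCollection: false);

        // Allocate a burst of short-lived objects, like the demo's ellipses.
        for (int i = 0; i < 100_000; i++)
        {
            var temp = new byte[64];
        }

        // Force a collection so the short-lived objects are reclaimed.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        long after = GC.GetTotalMemory(forceFullCollection: true);
        Console.WriteLine($"Gen 0 collections so far: {GC.CollectionCount(0)}");
        Console.WriteLine($"Heap bytes before: {before}, after: {after}");
    }
}
```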
Like I said before, this view is more in its infancy. Of course, you can also
still time filter, you can still see on the graph
where the red bars are occurring. But what we want to do here, and we're still working on designs and the best
way to bring this out, is what are the actionable
insights from this code? So yes, this is showing me a little bit about where
garbage collection is taking place and what
are the top objects that are surviving
or being collected. But we want to, in the future, show
you more insights all around. How do you go back to
source and what are the optimizations within
your code that you can do? Maybe it's how to not have garbage
collection happen as often, or have it run more
efficiently or better. That's something we want to look at, improving this view as well
as the performance of the tool. >> Awesome, so many
tables to choose from. So many options. Great. >> A lot of different
[inaudible] on the same data. >> I like options personally. The more customization the better. >> Absolutely. >> Thank you so much [inaudible] for sharing the .NET allocation tool. If people want to try
this out or learn more about this particular
tool, where can they go? >> We've got Docs as always,
with all of our tools. Docs for the .NET Alloc tool
as well are updated. Yeah, we'll point you to that documentation and you can
get some more of those details. If you have any questions of course, always reach out to us. >> Awesome. This is not the
end of our profiling series, so what are we going to
talk about next time? >> Next time, Esteban is going to
cover the .NET performance tool. So really excited for that one; that is actually our newest tool, if I'm remembering correctly. So that one should be really fun. >> Great. Well, thanks
once again for coming, probably going to see you
in the near future. >> Absolutely. Pleasure as always,
Leslie. Thanks for having me. >> Likewise. Until next
time, happy coding. [MUSIC]