C# 11 and the Nine Uses of ref

Captions
Last time, then, we were talking about the C# concept of Span and also stackalloc. And as I promised, this week we're going to look at some of the underlying language features that really make that possible. And it all comes down to the keyword that you see in lots and lots of places in C#, which is the ‘ref’ keyword. And in fact, by my count, there are nine different usages of ref within C#. Other people may count slightly differently - I may be being a little bit too precise about the difference, depending on context, but we can certainly see that many places to put it. And so what I'm going to do is go through all of those situations where we use ref, right up to the most recent one, which was introduced in C# 11, and makes Spans work rather more easily. Some of them are quite simple, some of them are complicated, but let's take a look at those. And the first one to look at - the one that's been around since the very beginning of C# - is simply the idea of passing by reference. So if I put in a function here, call it ‘ByRef’. And then - the first use of ref - I can declare a ‘ref int x’. So that's telling me it's an integer x, but I'm passing it by reference. And what that means is - as we've seen in an earlier video, if you want to take a look at that - that if I simply do something to modify it, so let’s just put ‘x++’, that means that the modification happens not only inside the function, but also outside the function. So if outside the function I say ‘int a = 100;’ and then I just say ‘ByRef(a)’ - well, we get an error there, because now we can see our second use of ref: you have to repeat the keyword ‘ref’, really just as a safety check so we don't do this accidentally, so we're passing that in by reference. But now if I do a ‘Console.WriteLine(a);’ we will see that what we're getting is 101 coming out of that. So that was the first and second use of ref.
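As a rough sketch of those first two uses, assuming C# 9 or later so that top-level statements are available, the ByRef example from the talk could be written like this:

```csharp
using System;

int a = 100;
ByRef(ref a);            // use 2: the caller has to repeat 'ref'
Console.WriteLine(a);    // prints 101

// Use 1: 'ref int x' makes x an alias for the caller's variable,
// so the increment is visible outside the function.
static void ByRef(ref int x)
{
    x++;
}
```

Dropping the ‘ref’ at the call site gives the compile error mentioned above, which is the safety check doing its job.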
Now something else that we looked at in an earlier video, and again for the details, take a look there. But it's a more recent introduction into C#, and that's the idea of returning by ref. So if we introduce a class in here, and we'll just call this ‘SomeClass’. What I'm going to put inside here is a private array of int called ‘ai’ that just equals ‘{ 1, 2, 3 }’. That will do it. But then I'm going to have a public method that's going to return a ‘ref int’. So this is returning by reference. We'll just call it ‘Get’. We'll have ‘int idx’ on there for the index. And as it's prompting, I'm going to say ‘return’ and then ‘ref’ again. So what we've got here is the third and the fourth use of ref: declaring a ref return type and then again, as a safety check so that we don't do it accidentally, actually returning by ref. But what that means is that what is returned is not going to be a copy of whichever integer in the array we've gone for, it's actually going to be a reference to it, which can therefore be modified. So back in our program, we'll just say ‘SomeClass sc = new SomeClass()’. And then here we've got the fifth use of ref. I'm going to say ‘ref int b =’ and then the sixth use, I've got to put another of the safety checks in there. But I can say ‘sc.Get’ and let's get number one out of that. Okay. And so now ‘b’ is a reference to element number one in that array. And we can demonstrate that, because if I were to then say ‘b = 5000’ and then have a look at ‘sc.Get(1)’ again, we'll see that although we've modified it through ‘b’, that has actually modified element number one in the original array. And so when we run that one, we can see we're getting the 5000. So ‘b’ is just a reference to element one, in this case, in that array. Having a look at our seventh use of ref, then, we can see it's actually something very similar.
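Uses three to six can be put together in one small program. This is only a sketch of what was typed in the talk, again assuming top-level statements; the names SomeClass, ai, Get, idx, sc and b come from the talk.

```csharp
using System;

var sc = new SomeClass();

// Uses 5 and 6: a ref local, initialised with 'ref' on the right-hand side.
ref int b = ref sc.Get(1);
b = 5000;                       // writes through to the array element

Console.WriteLine(sc.Get(1));   // prints 5000

class SomeClass
{
    private int[] ai = { 1, 2, 3 };

    // Use 3: the return type is 'ref int'.
    public ref int Get(int idx)
    {
        // Use 4: 'return ref' hands back the element itself, not a copy.
        return ref ai[idx];
    }
}
```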
And this is one, I'm sure, people would dispute whether it's really a separate usage, because what I could also say here is, if I just declared an ‘int c = 10;’, say, I could then say ‘ref int d = ref c;’. And so what that would be doing is saying that ‘d’ is just an alternative name for ‘c’. So once again, the two are bound together. So if I say something like ‘d = 20’ and then print out ‘c’, once again we can see we're getting the 20, because assigning 20 into ‘d’ is the same as assigning 20 into ‘c’. And you can see that that form there and that form there are really very, very similar. Certainly, this ‘ref’ and this ‘ref’ are doing exactly the same thing. This one and this one are slightly different, but I'm sure many people would argue they're really the same thing. Doesn't matter too much. But having got to that point, we can now actually get a much clearer understanding of what this kind of declaration with the keyword ‘ref’ is. So this sort of usage, as I say, is really just a safety check, to stop you making mistakes, but we've declared here a ‘ref int’, we've declared here a ‘ref int’ and we've declared here a ‘ref int’. Obviously we're just using int as one type; we could use others as well. But what this actually is - what something like ‘ref int d’ is - in the terminology of .NET, is what is known as a managed pointer. A little bit odd that the keyword is ‘ref’ when it's a pointer, but that's just the terminology we have. Now, I think we're all familiar with the idea of a managed reference in C#, because that's really what we have all over the place. So a managed reference is something like this, when we say ‘new SomeClass()’, because SomeClass is a class - a reference type - so it exists on the heap. So that creates an object on the heap and then ‘sc’ is a variable on the stack, referencing the thing on the heap.
And that's what's known as a managed reference - managed because it's garbage collected, but a reference to something on the heap. Whereas here, when we say ‘ref int b’, that is what is technically known as a managed pointer. And there are similarities and differences between them. When you have a managed reference, it can only refer to something that exists on the heap. So this ‘new SomeClass’ exists on the heap, and that ‘sc’ is referring to something on the heap. But when we have a managed pointer, it can refer to things either on the heap or on the stack. So here our managed pointer ‘d’ is referring to ‘c’ which, as we can see, exists on the stack. Whereas here our managed pointer ‘b’ is referring to an element in the array, and clearly that must exist on the heap because it's an array. So there's no question about that. So a managed pointer can refer to either the heap or the stack, a managed reference only to the heap. On the other hand, a managed pointer like ‘d’ can only itself exist on the stack. So although it can refer to things either on the heap or the stack, it itself can only exist on the stack, which you can see it's doing there. But if I were to try, for example, as a field of SomeClass here, to have something like a ‘private ref int x;’ then we're getting an error there. And what the error tells us, which is kind of the punchline of this whole video, is ‘a ref field can only be declared in a ref struct’. So we'll get onto what that means, but what you can't do is have that inside the class, because the class is going to exist on the heap, and these managed pointers can only exist on the stack. Whereas, as we know, a managed reference can exist on either the heap or the stack. So here, ‘ai’ is a managed reference to an array, held inside a class, so that reference exists on the heap. Whereas here, this thing ‘sc’ is a managed reference, but the reference itself exists on the stack.
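To make the stack-versus-heap point concrete, here is a small sketch with one managed pointer aimed at a stack variable and another aimed at an array element on the heap. The names numbers, middle and Holder are made up for illustration; c and d follow the talk.

```csharp
using System;

// A managed pointer to something on the stack (use 7).
int c = 10;
ref int d = ref c;      // d is another name for c
d = 20;
Console.WriteLine(c);   // prints 20

// A managed pointer to something on the heap: an array element.
int[] numbers = { 1, 2, 3 };
ref int middle = ref numbers[1];
middle = 5000;
Console.WriteLine(numbers[1]);   // prints 5000

class Holder
{
    // Uncommenting the next line is a compile error, because a class
    // lives on the heap and a managed pointer may not:
    // private ref int x;   // "A ref field can only be declared in a ref struct"
}
```

In both cases the ref local itself (d, middle) is a stack variable; only the thing it points at differs.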
So we've got different degrees of flexibility. With a managed reference, the variable can exist anywhere, but the thing it refers to must be on the heap. With a managed pointer, the managed pointer variable can only exist on the stack, but it can refer to things on the stack or on the heap. Now, why do we have this rule about managed pointers? Well, as you might guess, it's all to do with garbage collection. If you consider this managed pointer ‘b’, which is referring to this element inside the array, it's got to be able to keep that alive. So when the garbage collector does its work, this array has to be kept alive by this thing that is its managed pointer. But the problem is, it's not referring to the beginning of the array, it's referring to the middle of it. And so that really complicates matters in terms of garbage collection. And although it might have been possible to have garbage collection even with managed pointers existing on the heap, it would have been much more complicated and therefore much slower. So that's one of the reasons they've made this restriction. The other reason is to do with multithreading. Because when you have multithreading, the stack is entirely localised to one particular thread. So only the thread that owns it can access it, whereas the heap is available to all threads. So once again, there would be risks of concurrency issues, which are vastly simplified by saying these things can only exist on the stack. So that's what we've got there. And that brings us on to our next use of ‘ref’, which is the idea of a ‘ref struct’. So let's also add to this application something we'll just call ‘MyStruct’. And we'll just change that to ‘struct’. And then when we talk about memory management, as you've seen in earlier videos, we say that a reference type - so a class - exists on the heap. And we say that a value type - a struct - exists on the stack.
But that's not really a very good way of saying it. Although we often have structs existing on the stack, they can also exist on the heap. So we sometimes say something like a value type tends to exist on the stack, or likes to exist on the stack. But it can exist on the heap as well. Simple enough to do that. If I take my MyStruct, I can embed that inside my class. And now, because the class exists on the heap, anything inside it exists on the heap. So there, very simply, we've got a struct existing on the heap. And indeed, exactly the same sort of thing also happens if you do something like boxing. So here, if I say ‘MyStruct ms = new MyStruct();’ that is declaring it on the stack. But if I do something like ‘object ms’, that's boxing. And so that creates a box - which is a reference type - and copies the struct inside it, and the box itself exists on the heap. So that's another way we can manage that. Also, if you have interfaces, so if we were to give our MyStruct a nice, simple interface, like IDisposable, and implement that, then again, if we were to say IDisposable here, then that's another example of boxing and therefore getting MyStruct to exist on the heap. So in general, structs can exist at either location. But then what came in round about C# 7 (version 7.2) was the idea of a ‘ref struct’. And so this is our eighth use of the keyword ‘ref’. And this one is a little bit of an odd one, because a ref struct is now a struct that can only exist on the stack; it's not allowed to exist on the heap at all. So having put that ‘ref’ on there, if for example we look in here, we're now not allowed to have MyStruct inside the class, because that means it would be existing on the heap. And also we can see, in the program, we couldn't have that boxed as an object, because that would put it on the heap. And indeed, we can't have it boxed as an interface.
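The three ways of getting an ordinary struct onto the heap can be sketched as follows. The Number field and the Holder class are hypothetical additions, there only so the copies can be observed; MyStruct and IDisposable follow the talk.

```csharp
using System;

MyStruct onStack = new MyStruct { Number = 1 };   // a local: lives on the stack

object boxed = onStack;               // boxing: a copy is placed in a heap box
IDisposable viaInterface = onStack;   // converting to an interface boxes too
var holder = new Holder();            // the struct field lives inside a heap object

onStack.Number = 2;                   // changes only the stack copy

Console.WriteLine(((MyStruct)boxed).Number);   // prints 1 (the boxed copy is separate)
Console.WriteLine(holder.Embedded.Number);     // prints 0 (default value)
viaInterface.Dispose();                        // prints Disposing

struct MyStruct : IDisposable
{
    public int Number;
    public void Dispose() => Console.WriteLine("Disposing");
}

class Holder
{
    public MyStruct Embedded;
}
```

Changing the declaration to ‘ref struct MyStruct’ would, under the C# 11 rules described in the talk, turn the interface on the declaration, the two boxing lines and the field inside Holder into compile errors.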
And in fact, even more strict than that, when you've got one of these ref structs, it's not allowed interfaces at all. Okay, so you're just not allowed to put an interface on there, because the only way you could have it accessed through an interface would be on the heap, and that would break the rule that it's only allowed on the stack. So the only way now we can have a MyStruct is by declaring it as an object on the stack. So we've got quite a lot of restrictive rules there. Just to go through them, the rules are: a ref struct can't be an element inside an array - obviously not, because arrays exist on the heap. Can't be a field inside a class - we've just seen, because classes exist on the heap. Can't implement any interfaces, because then it would have to be boxed and exist on the heap. Can't be boxed in general. Can't be used as a generic type argument. Can't be captured by a lambda expression or a local function. Can't be used in an async method. And can't be used in an iterator. So lots and lots of restrictions on what we can and cannot do with ref structs. But in those situations where we can use them, they can be very powerful, as we'll see in a moment. Now, one of those restrictions is actually really nasty, this idea that you can't have interfaces. As we saw, we're just not allowed to put IDisposable on that. And that's kind of nasty because ref structs may need to do some tidying up after themselves. They may need to be able to close a file or something like that. So what we can do is have a ‘Dispose’ on a ref struct, even though it doesn't have the interface. So let's just put in here, so we can see what's going on, a ‘Console.WriteLine("Disposing");’. And then in our program, if I just put this in a ‘using’ statement or in a ‘using’ block, we can see it's happy with that, even though we haven't put the interface in there, because we do have the method.
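A minimal sketch of that disposal pattern, assuming C# 8 or later (where ‘using’ accepts a ref struct with a suitable Dispose method); the "Working" line is only there to show the ordering:

```csharp
using System;

using (var ms = new MyStruct())
{
    Console.WriteLine("Working");
}
// Output: "Working", then "Disposing" as the using block ends.

ref struct MyStruct
{
    // No IDisposable here: 'using' only needs a public, parameterless
    // Dispose method to be present on the ref struct.
    public void Dispose()
    {
        Console.WriteLine("Disposing");
    }
}
```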
If we didn't have the method there at all, then we get an error over here because it's looking for that Dispose method, but as long as it's got it, it doesn't need the interface. And when we run this, we'll still see that we get the ‘Disposing’ call. So a little bit of a cheat there, because we're not allowed interfaces, but because disposal is so important, we're allowed to do that. Another thing you'll commonly see when we're dealing with ref structs, nothing particularly special to them, is that you can also have a ‘readonly ref struct’. So that basically means that once this ref struct has been created, it can't be modified. So if I were to put in something like a ‘private int x;’ in there, you can see we're getting an error because we're inside a readonly struct. And so all of the fields themselves have to be declared as readonly, and that makes it happy. But that's nothing particularly to do with ref structs - that's true of any struct. But we'll see it's used quite often, very usefully, alongside the idea of a ref struct. But that brings us on to the new use of ref in C# 11, which is the idea of a ref field. So you remember we mentioned earlier on, you're not allowed to have a ref field inside a class, because a class exists on the heap, and these managed pointers that we've got here can only exist on the stack. And that's equally true of a regular struct. So if this were just a struct, we wouldn't be able to say something like ‘public ref int Value;’ on here, because although a struct may exist on the stack, it's still possible, as we've just seen, for it to live on the heap. But that's the thing about the ‘ref’ keyword here. Because it's a ref struct, it can only exist on the stack. And therefore we're allowed our new ninth use of ‘ref’, to have this ref field. And so that can now, just like any other ref we declared, like the ref that we had here, for example, be made a reference to some other variable.
And that really gives us everything we need to see for our Span. So if I just do something like this, if I just say ‘Span<int> x;’ and look at the definition of that with F12, we can see there, we've got the main thing. So we've got our ‘readonly ref struct’ of Span. And then inside there, we've got our ‘readonly ref T’ - in our case, int. And that's our reference that we were talking about last time. And then we've also got the length, so the number of elements that we're looking at there. But that is the use of ref that's allowed now with this ninth use of it that we get in C# 11. Now that's a bit odd, because Spans were available as of C# version 7.2. We can see how that works, actually, if I just close that down, and let's just wind this back and set that to .NET version 6. Just do a quick rebuild, so that it's got all of that. And you'll see obviously that it doesn't allow that ‘ref’ in there, but if we now take a look at how a Span is defined, we can see something slightly different. It's still this ‘readonly ref struct’, but rather than the ‘ref T’, we now have this generic ‘ByReference<T>’. So before the language feature for ‘ref’ fields came in - in C# 11 - there was this generic that did the same sort of thing. But rather sneakily, that ByReference was itself declared as internal in the system library. So that could be used here on Span, which is also part of the system library, but you couldn't use it in your own code. So you couldn't do this sort of thing for yourself. But now in C# 11, it's been made an actual language feature. So if we just take that back, then we can see that we are allowed to have this idea of the reference field as long as it's inside a ref struct, and so we guarantee it lives on the stack. What can we then do with that? So the obvious sort of thing you might want to do is a ‘Console.WriteLine(ms.Value)’.
The problem there is, though, if I run that, we actually get a runtime error saying that we've got a null reference exception. And that is perfectly reasonable, because we've declared this thing as a ref, but we haven't given it an initial thing to refer to. Now that's a bit odd, because in most cases that would have been caught by the compiler anyway. But because this is an int - which technically can't be null, because it's a value type - we're in a bit of an odd situation. And so that's why we’re getting a runtime error. We can do a safety check for that. So in our program, you might think, well, let's just do it in a straightforward way. We can say ‘if (ms.Value != null)’, but that causes a problem, because - you can see it's only a warning - what it's telling us is that because Value, being an int, can never be null, that's always going to return true, and it's not much help. And that's the slightly odd problem we've got, in that C# doesn't really quite understand what's going on. But we've got a way around it. What we can do is use this class ‘Unsafe’. It's called ‘Unsafe’, but this isn't actually unsafe. This is not unsafe code. This is still perfectly regular managed code. But I need to get hold of a namespace for that, so that's in System.Runtime.CompilerServices. And then we can call on there a method called ‘IsNullRef’. And so we pass that in there, and we have to pass that in by ref. That now checks to see whether that ‘ms.Value’ actually refers to something. So we'd actually like to put a ‘not’ on there. And then we'll do an ‘else’ and then ‘Console.WriteLine("Null");’. So what we'll now see, when we run that, is that we've had a safe runtime check to make sure it's not null. But obviously, you don't want it to be null, you want it to refer to something.
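The null check just described might look like the sketch below. It assumes C# 11 on .NET 7 or later, since ref fields need both. The compiler may warn that Value is never ref-assigned, which is exactly the situation being demonstrated.

```csharp
using System;
using System.Runtime.CompilerServices;

MyStruct ms = new MyStruct();   // Value does not refer to anything yet

// Reading ms.Value here would throw a NullReferenceException, so test the
// reference itself first. Note the 'ref' on the argument.
if (!Unsafe.IsNullRef(ref ms.Value))
{
    Console.WriteLine(ms.Value);
}
else
{
    Console.WriteLine("Null");  // this branch runs
}

ref struct MyStruct
{
    public ref int Value;       // use 9: a ref field inside a ref struct
}
```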
So what we can do is on our struct, we could put in a constructor that takes again a ‘ref int val’ and then we'll just say ‘Value =’ and watch out here: you don't want to say ‘= val’, you want to say ‘= ref val’. I'll come back to that in a moment, but you're probably getting the hang of this and can see why that is. So then we do that. And what that then means is, back in our program, I could do something like this, I could declare just a regular int. So this one happens to exist on the stack, we'll call this ‘e’, give it a value of 100. And then I pass ‘e’ in there - again, got to pass by ref, because we've got the ref on the parameter. But now that will mean that ‘e’ is the actual bit of memory being used, but being referred to by ‘ms.Value’. And so now when we run that, we see we get the 100 out there. But again, we can see that that is a reference, because if I were to say ‘e = 27;’ and then run all of that code again, we've changed the value of ‘e’, but in doing so we also changed the value of ‘ms.Value’, because it's just a reference in there. Equally, if I were to say something like ‘ms.Value = 5’, and then print out the value of ‘e’, then it would just be happening in reverse: we’d see that ‘e’ has now taken on the value 5, because the two are the same thing. But there's something else we can do there as well, because if I were to say ‘int f = 66;’, obviously I could assign ‘f’ in there. And let's actually print out both of these. And as that stands, when we run that, we can see that both ‘e’ and ‘f’ are 66. Because all that's happened when we assign ‘f’ into ‘Value’ is that ‘Value’ is a reference to ‘e’, and so we get 66 into ‘e’. But there's one other thing we can do here. And this is again not really a new use of ref, it's something we've seen before.
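Pulling the constructor and the aliasing together, a sketch of this part of the demo (C# 11 and .NET 7 or later assumed; the sequence of assignments is condensed from the talk):

```csharp
using System;

int e = 100;
MyStruct ms = new MyStruct(ref e);   // ms.Value now refers to e
Console.WriteLine(ms.Value);         // prints 100

e = 27;                              // change e ...
Console.WriteLine(ms.Value);         // ... and ms.Value shows it: prints 27

ms.Value = 5;                        // write through the reference ...
Console.WriteLine(e);                // ... and e changes: prints 5

int f = 66;
ms.Value = f;                        // plain assignment copies f's value into e
Console.WriteLine(e);                // prints 66
Console.WriteLine(f);                // prints 66

ref struct MyStruct
{
    public ref int Value;

    public MyStruct(ref int val)
    {
        Value = ref val;   // '= ref val' binds the field to the caller's variable;
                           // '= val' would try to write through a null reference
    }
}
```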
But if, rather than assigning ‘f’ into ‘Value’, we assign ‘ref f’ into ‘Value’, we're not writing the value of ‘f’ into ‘e’, we are changing ‘Value’ so that it now refers to ‘f’. So now if we do that, we can see that ‘e’ is unchanged, that's at 100, and that ‘f’ is still 66. And indeed, if we print out ‘Value’, that is also now going to be 66. So if you want to change the value in whatever is being referred to, you just say ‘ms.Value = f;’ and that will just write the value in there. But if you want to change what it's referring to, put the word ‘ref’ on there, and that will change what it refers to. Now we can actually take a little bit of control over that, because if we go back to MyStruct, we talked about the fact we can put ‘readonly’ on the struct itself. But we can also put ‘readonly’ on the ref. And there's actually a couple of ways we can do that. If I say ‘readonly ref’ and then take a look around, we'll see that in the program that means that this second line has now become illegal. So if you say ‘readonly ref’, that means once it's been initialised - you're always allowed initialisation with any type of readonly value - we're not allowed to change the thing that it refers to. So we're allowed to change the actual value in there, but we can't make it refer to something different. On the other hand, instead of putting that ‘readonly’ before the ref keyword, we can put ‘readonly’ after the ref keyword. And now if we look at the code in the program, we can see that it's swapped over. Now we are allowed to change the reference, but we're not allowed to change the value that it refers to. So in this case, the two different uses of ‘readonly’ are positional and do different things. And indeed, we can have both of them if we want.
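The difference between assigning a value and re-pointing the reference can be sketched like this. It is an illustration rather than the exact code from the video: f is declared before ms, and the final write of 7 is added so the effect of the re-pointing is visible. The three readonly placements are listed as comments because only one declaration of Value can be active at a time.

```csharp
using System;

int e = 100;
int f = 66;
MyStruct ms = new MyStruct(ref e);

ms.Value = f;            // value assignment: writes 66 into e
Console.WriteLine(e);    // prints 66

e = 100;
ms.Value = ref f;        // ref assignment: Value now refers to f, e is untouched
ms.Value = 7;            // so this writes into f
Console.WriteLine(e);    // prints 100
Console.WriteLine(f);    // prints 7

ref struct MyStruct
{
    // Alternative declarations and what they forbid outside the constructor:
    //   public readonly ref int Value;           // 'ms.Value = ref f' is an error
    //   public ref readonly int Value;           // 'ms.Value = 7' is an error
    //   public readonly ref readonly int Value;  // both are errors
    public ref int Value;

    public MyStruct(ref int val)
    {
        Value = ref val;
    }
}
```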
So we can have a ‘readonly ref readonly’ for which, once it's been initialised, you can neither make it refer to something else nor change the value of what it refers to. And so here we can see both of those are now invalid. And this reminds me - and may remind any of you with any C++ experience - of the sort of thing we used to have in C++. I've just got a quick C++ program here to show you, with pointers - which, remember, is what we're dealing with here, even though in C# we're dealing with managed pointers. One could declare an ‘int*’, call it ‘pi’ and just set it to ‘nullptr’. But what you could do in C++ is declare a ‘const int*’, which is to say that although the pointer value can change, you can't modify the thing - the int - that's being pointed to. Or you could say ‘int* const pi’, which meant you could change the int but couldn't make the pointer point to a different int. And again, you could have both of them. So you can have a const pointer to a const int. And we're doing exactly the same thing, now, in C#, by the fact you can have ‘readonly ref readonly’ and make both of those types of modification illegal. So although that is quite complicated stuff, I hope it gives you some understanding of what's going on deep down, and particularly what's happening with the Span that we looked at last time, because a Span is a readonly ref struct. So ‘readonly’ - can’t be modified once it's been created - and then ‘ref’ means it can only exist on the stack. And once it can only exist on the stack, that means it can have a ref field, because ref fields can only exist on the stack. And that means it's got this thing we call a managed pointer, which can point to either stack memory or heap memory and give us much faster access to it than if we always had to have things on the heap. So rather in-depth there.
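To tie the pieces back to Span, here is a deliberately tiny Span-like type. It is not the real System.Span<T> source, only a sketch of the same shape: a readonly ref struct holding a readonly ref field and a length. TinySpan, data and view are made-up names, and the constructor assumes a non-empty array.

```csharp
using System;
using System.Runtime.CompilerServices;

int[] data = { 1, 2, 3 };
var view = new TinySpan<int>(data);

view[1] = 5000;                   // writes straight into the array
Console.WriteLine(data[1]);       // prints 5000
Console.WriteLine(view.Length);   // prints 3

readonly ref struct TinySpan<T>
{
    private readonly ref T _reference;   // managed pointer to the first element
    private readonly int _length;        // number of elements covered

    public TinySpan(T[] array)
    {
        _reference = ref array[0];       // assumption: array has at least one element
        _length = array.Length;
    }

    public int Length => _length;

    public ref T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_length)
                throw new IndexOutOfRangeException();

            // Step from the first element to the requested one.
            return ref Unsafe.Add(ref _reference, index);
        }
    }
}
```

Because the field is ‘readonly ref T’ rather than ‘readonly ref readonly T’, the view can never be re-pointed after construction, but the elements it covers can still be written through it.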
Hope it didn't go too deep for you, but I think it's useful to know. But if you enjoyed that, do click like, do subscribe and I'll see you next time, perhaps with something a little bit simpler.
Info
Channel: Coding Tutorials
Views: 1,945
Keywords: C#, .NET 7, ref, ref fields, managed pointers, captioned
Id: ZA2b0N53e_g
Length: 26min 43sec (1603 seconds)
Published: Fri Jan 27 2023