But, what is Virtual Memory?

Captions
Hi everyone, Nikola here. In this video we will talk about virtual memory. I will walk you through the problems, how virtual memory solves them, and finally how it's implemented.

We will look at three problems that virtual memory solves: not enough memory, memory fragmentation, and security.

In the old days, computer memory was expensive and computers didn't have as much RAM as today. For instance, most CPUs could support up to 4 GB of RAM, but it was common for computers to have only 1 GB of RAM or even less. Why could CPUs support only up to 4 GB? That's because registers could store 32 bits and could address up to 2 to the power of 32 bytes of memory, which is 4 GB, and this is what every program could address in theory. Modern CPUs, on the other hand, have a 64-bit address space, but we will use 32 bits for the examples in this video because the numbers are easier to work with.

Our computer has only 2 GB of RAM in this example, which is a 31-bit space. So what happens when a program tries to access the first few bytes? That's not a problem. But what happens when the program tries to use more than 2 GB of memory? The program will crash, because there is no memory to access. This is why programs used to crash in the 1950s or so.

Now let's look at the second problem. Let's say that we have installed 4 GB of RAM and we have three programs: a video player that runs a movie and needs 1 GB of RAM, a video game that needs 2 GB of RAM, and photo editing software that also needs 2 GB of RAM. If we run programs one and two, they fit in memory: they use 3 GB, leaving 1 GB free. If we close the video player, the memory usage is now 2 GB and there's also 2 GB of free space. We should be able to run the photo editing software, but we can't, because the free memory is split into two separate chunks even though there's enough space in total. This is called memory fragmentation.

Why can't we split our program across the memory, you may ask? We could, but writing such programs would be very hard. If we have a global array of elements, we wouldn't be able to access any
element using indexes if it was split across two memory chunks, for instance. We would have to implement some kind of indirection, and as you'll see later, this is what virtual memory is.

Let's look at the third problem. We've said that each program can access any 32-bit address, so what happens if two programs try to access the same memory? For example, say that both programs write some value to address 64: a video game that stores the player's current health, and a music player that writes the remaining song duration. When the song finishes, the music player writes zero to this address, and if this is also the player's health, you would be dead and it would be game over. That would be very annoying, because these two programs are corrupting each other's state.

So, to summarize: if all programs have access to the same 32-bit memory space, they can crash if they access invalid memory addresses because the computer has less than 4 GB of RAM, they can run out of space when multiple programs are running due to memory fragmentation, or they can corrupt each other's data.

The key problem here is that programs have access to the same memory space. If we can give each program its own memory space, then we might be able to solve these problems. The memory space assigned to each program is called virtual memory, and every program has its own virtual memory that doesn't overlap with other programs. But this means that we have to map each program's virtual memory space to RAM space.

We will refer to RAM as physical memory and to RAM addresses as physical addresses. We use the terms physical memory and physical address instead of RAM and RAM address because RAM is not the only physical memory device that the CPU can access. It can also access memory or registers of other devices that are mapped to the same address space. When a computer boots, the RAM installed on it becomes available to the OS starting from a certain physical memory address. The OS also reserves part of the RAM
for itself, while the remaining part is used for programs. But I'm diverging a bit from the main topic. RAM is the most important device for understanding how virtual memory works, so when I say physical address or physical memory in this video, you should think of RAM.

Let's see how virtual memory solves these problems. Without virtual memory, each program can access a 32-bit address space which is directly mapped to RAM. With virtual memory, each program still has a 32-bit address space, but now we also have a map in the middle. This map knows how to convert a virtual address to a physical address. If a program wants to access address zero, for example, we look at the map and find that it corresponds to, say, address 5. Any virtual address can be mapped to any physical address.

But what happens if a program tries to access more data than what can fit in RAM? The data can be stored somewhere else, and the map can tell us: hey, this data is not in memory, it's on the hard disk, for example. The operating system will find the oldest data, in this case address zero, move it to disk, and load the accessed address 3. It will also update the mapping. The program can now read address 3.

The mapping allows us to use a disk to give us additional memory when needed. This additional memory is called swap memory. When the data is not available in RAM and the OS has to go and read it from the disk, we call this a page fault. We will talk about page faults later.

This is not all that great, though, because reading data from disk into RAM and updating the mapping is very slow. SSDs may have up to a thousand times higher latency than RAM to read the first byte. But it's still better than crashing in most cases. This is why having more RAM can massively improve the performance of your PC if it spends a lot of time swapping data between RAM and disk.

Now let's see how virtual memory solves the fragmentation problem. Remember our previous example, where we closed the video player and wanted to start the photo editing software while keeping the
video game running. We were left with 2 GB of free RAM space, but the space was split into two 1 GB chunks. We couldn't run our program because it wouldn't fit in a contiguous address space. With virtual memory, we can map parts of the program into each of the available chunks. From the program's perspective nothing has changed, and it still assumes that the memory is contiguous.

Now let's talk about the security problem. How does virtual memory keep programs secure? Let's go back to the example with a video game and a music player. Both programs have their own virtual address space and a map. When the video game accesses the player's health at address 64, it is mapped, for example, to physical address 10, and when the music player writes the remaining song duration, it writes it to, say, physical address 4. Even though each program tries to write to address 64, they're mapped to different physical locations.

But complete isolation is also not great, because programs wouldn't be able to share any data. Fortunately, this is easy to fix by having parts of each program mapped to the same physical space. For example, each program may want to read the same shared library, such as the Win32 API or libc, or they could use a shared memory space in RAM to exchange data without going through some other device, like a disk or a network interface.

Now let's have a look at how virtual memory is implemented. What happens when a program accesses memory? Let's say that a program wants to load data from address 64 into a register R1. When the CPU executes this instruction, first it has to find the mapping from the virtual address 64 to a physical address, and then read the data from the physical address. This map is called a page table, and each mapping is called a page table entry.

In our example, we have one entry for every virtual address. Actually, CPUs work with words, where a word is the size of a CPU register. In our example, a word is 32 bits or four bytes, so technically the page table has one entry for every word in
the virtual address space.

So how much space do we need to store the mapping for each word? There are 2 to the power of 32 addresses, each corresponding to a single byte, which means that there are 2 to the 30 words. This is about 1 billion entries. A single entry stores a physical address, which is also 32 bits, so the size of a page table would be 4 GB, and there is one page table for every program. This is clearly a lot, so what can we do?

We can divide the memory into chunks, which we call pages (hence the name page table), and map pages of memory instead of mapping each individual word. For example, virtual addresses 0 to 4,095 are mapped to some physical addresses, say 16,384 to 20,479. This means that we don't need a page table entry for each word anymore. Instead, we need one entry for every 4 KB of data, which is 1,024 words. The page size doesn't have to be 4 KB, but this is typical. My Linux box uses 4 KB pages, but it's possible to use other page sizes too.

With 4 KB pages, each page table has about 1 million entries. The size of each entry is still 4 bytes, so each table needs 4 MB, which is manageable. There is a tradeoff here, though: instead of moving a single word out of memory at a time, we now have to move 4 KB of data. But this works quite well in practice, because nearby memory locations are often accessed at the same time.

So how do we map pages? Previously we mapped each individual word, and the translation was just a lookup in the page table. Now the page table tells us that a range of virtual addresses is mapped to a range of physical addresses. Page zero covers the range 0 to 4,095, and so on. We have the same for the physical space.

Let's say that virtual page one is mapped to physical page two. What is the physical address for the virtual address 4200, for example? The answer is 8,296, because each individual address in the virtual page is mapped to the physical address using the same offset. Address 4200 is
offset by 104 from the start of virtual page one, which starts at 4,096, so we know that it's going to be offset by the same number of bytes in the physical page, which starts at 8,192.

Let's see how address translation works at the bit level. Let's say that we have 32 bits of virtual address space, 30 bits of physical addresses, which is 1 GB of RAM, and the page size is 4 KB, which takes 12 bits. To translate a virtual address to a physical address, we keep the offset: the last 12 bits of both the virtual and the physical address are the same. To map the remaining 20 bits of the virtual address, the CPU asks the page table and gets back the remaining 18 bits of the physical address. The part of the virtual address without the offset is called the virtual page number, and the part of the physical address without the offset is called the physical page number. Page tables map virtual page numbers to physical page numbers.

Let's look at an example. Say that we want to translate the virtual address 0x12345678. The last 12 bits remain unchanged, which is 0x678, so we just copy this to the physical address. Then we look up the virtual page number 0x12345 in the page table, which returns the physical page number 0x04321. To get the full physical address, we concatenate the physical page number with the offset.

Now let's talk about page faults. We've already mentioned what happens when the data is not in RAM and the program tries to access it: the page table entry will say that the data is, for example, on the hard disk. But how does this actually work? When a CPU tries to read some virtual address, it looks up the mapping in the page table. The page table says that the data is on disk, so the CPU doesn't know how to read it from RAM. Instead, the CPU raises an exception, and this exception is called a page fault.

The operating system handles the page fault exception by choosing which page to evict from RAM, which is usually the least recently used one. If the page is dirty, the OS writes it back to disk. We say that the page is dirty if the program has written something to it
after loading it from disk. If the program hasn't written anything to the page, then there is no need to save it to disk because its contents haven't changed, so we gain a little bit of performance here. After that, the OS loads the requested page from disk into RAM, updates the page table, and goes back to executing the same instruction that caused the page fault. This time the data is in RAM, so the CPU can access it.

Page faults are very, very slow, and when one happens the OS usually switches to executing another program in the meantime. Since disk I/O is very slow, modern CPU architectures have modules called DMAs, which stands for direct memory access. DMA can load data from disk to RAM directly while the CPU is doing something else.

Let's summarize what we've learned so far. Each program has its own virtual memory space. When we say physical memory, we think of RAM. Both virtual and physical memory spaces are split into 4 KB chunks called pages. The last 12 bits of physical and virtual addresses are called the offset, while the remaining bits are called the virtual and physical page numbers. This is assuming 4 KB pages. The page table maps a virtual page number to a physical page number, without the offset. There is one page table per program. A page fault is an exception that the CPU generates when it tries to access a page that is not in physical memory.

If you are enjoying this video so far, please consider subscribing and sharing this video with your friends. Let's keep going.

I hope you see why virtual memory is so useful, but isn't this indirection slow? For every memory access, we need to find the page translation in the page table, which is stored in RAM. So this requires one RAM access to read the translation, then the translation of the address itself, and then another RAM access to read the actual data. This looks very expensive just for a single memory access, and you're right, it is expensive. We need a way to find a physical address very quickly. What we could do is add a special hardware component in the CPU that can cache translations
from virtual to physical addresses. This component is called the translation lookaside buffer, or TLB for short. This cache is very small but very fast. When the CPU wants to translate a virtual address, it asks the TLB. If the translation is in the TLB, then it's very fast, usually less than a cycle. If not, then we need to load it from the page table into the TLB, which is slow.

Modern CPU architectures usually have two TLBs, one for instructions and one for data. TLBs are small, around 4,000 entries on modern architectures. This is why TLBs are constantly being updated. Fortunately, this works well in practice because of data locality.

What can we do further to improve the performance of TLBs? Well, we can add more hardware. CPUs usually have two levels of TLB caches, which reduces the need to access RAM. Another common practice is to have something similar to DMA modules to load pages from RAM to TLBs without having to go back to the OS.

Let's see an example. The CPU wants to translate the virtual address 0x12345678, same as before. The page offset is the same, so we just copy the last 12 bits to the physical address. The virtual page number is 0x12345, which is translated by looking for the mapping in the TLB. The TLB is empty, so the page is loaded from RAM. We locate the virtual page 0x12345 in the program's page table, fill in the physical page number in the TLB, and we can now complete the translation. The next time the program tries to access a virtual address with the same page number, the translation will be loaded directly from the TLB.

What happens if the TLB is full and the virtual page number is not there? We remove the entry that is least recently used and load the corresponding page in the same way as before. If a program wants to access a page that is on disk, it's still loaded into the TLB, and the CPU generates a page fault.

Actually, the piece of hardware that is responsible for address translation and for generating page faults is called the MMU, which stands for memory management unit. The memory management unit is usually on
the CPU board and is programmed by the OS.

This looks great, but how much memory do we need to run, say, 50 programs? We need 4 MB of memory for each program's page table, so this would be 200 MB. Modern computers run many programs at the same time. For example, I have 595 programs running right now in the background on my Linux box, which would need more than 2 GB of RAM just for page tables, even if most of these programs didn't require much memory for themselves. That's a lot of wasted RAM.

The hard part is that we can't swap the page tables to disk, because the CPU needs page tables to translate virtual to physical addresses. If the page table is not in RAM, we can't find it. But we are already swapping arbitrary memory locations, so can we do the same for page tables?

To solve this problem, we can introduce another level of page table entries for each program. Let's go back to our example with 4 KB pages. The last 12 bits are used as an offset, and there are 2 to the 20 page table entries, so around 1 million. If we organize them into 4 KB chunks, then we need 1,024 chunks. These chunks are the second-level page tables. To swap them out to disk, we need a mechanism to track their location in RAM or on disk. We can solve this by introducing another level of page table entries, which we call the first level. The first level provides translation from a virtual address to a page table in the second level, and the second level provides the final translation to a physical address.

Page tables can have more than two levels. Adding more levels increases complexity, and we need more memory accesses. Linux uses five-level page tables to overcome the limit of 64 TB of physical RAM, which some vendors provide today for servers.

Let's see an example where we translate a virtual address using multi-level page tables. We have 32 bits of virtual addresses and 30 bits of physical addresses. We are going to translate the same virtual address as before. The first-level page table entries are always stored in RAM,
and we have some second-level page table entries. They're in RAM as well, but some of them can be on disk.

With 4 KB pages, the last 12 bits are used as an offset, which is the same as before, so we just copy them to the physical address. Now let's see how to translate the virtual page number. We will divide it into two parts, each 10 bits long. We will use the first 10 bits to look into the first-level page table entries. This tells us that we need to look further into the second-level page table entries that are stored at the address 00 01 0. We get the second-level page table entries from memory and use the other 10 bits to find the exact entry, which gives us the final translation.

While we always need to keep the first-level page table in RAM, the second-level page table entries can be swapped out to disk to save memory. This is especially useful for programs that don't use much RAM and don't need to address the full virtual address space.

This is all I have prepared about virtual memory. If you've enjoyed this video, please hit the like button, subscribe, and share it with your friends. I'll see you in the next one.
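
For readers who want to try the address arithmetic from the captions, here is a minimal Python sketch of the translation steps. It is only an illustration under stated assumptions, not how an MMU or an OS implements this: page tables are plain dictionaries, a missing entry stands in for "the page is on disk", and the names `split_address`, `translate`, `translate_two_level`, and `PageFault` are made up for this example. It assumes 32-bit virtual addresses and 4 KB pages, and it reuses the numbers from the examples in the video.

```python
PAGE_SIZE = 4096    # 4 KB pages, as in the video
OFFSET_BITS = 12    # log2(4096): the low 12 bits are the page offset
LEVEL_BITS = 10     # the 20-bit virtual page number is split into 10 + 10 bits


class PageFault(Exception):
    """Hypothetical stand-in for the CPU exception: no resident mapping."""


def split_address(vaddr):
    """Return (virtual page number, offset) for a virtual address."""
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)


def translate(page_table, vaddr):
    """Single-level translation: look up the page number, keep the offset."""
    vpn, offset = split_address(vaddr)
    ppn = page_table.get(vpn)
    if ppn is None:
        # In a real system the OS would now load the page and retry.
        raise PageFault(hex(vaddr))
    return (ppn << OFFSET_BITS) | offset


def translate_two_level(first_level, vaddr):
    """Two-level translation: the top 10 bits of the page number select a
    second-level table, the bottom 10 bits select the entry inside it."""
    vpn, offset = split_address(vaddr)
    top = vpn >> LEVEL_BITS
    bottom = vpn & ((1 << LEVEL_BITS) - 1)
    second_level = first_level.get(top)
    if second_level is None or bottom not in second_level:
        raise PageFault(hex(vaddr))
    return (second_level[bottom] << OFFSET_BITS) | offset


# Virtual page 1 mapped to physical page 2, as in the 4200 example.
print(translate({1: 2}, 4200))  # 8296

# The 0x12345678 example: page number 0x12345 maps to 0x04321.
print(hex(translate({0x12345: 0x04321}, 0x12345678)))  # 0x4321678

# Same address through two levels: 0x12345 splits into 0x48 and 0x345.
print(hex(translate_two_level({0x48: {0x345: 0x04321}}, 0x12345678)))  # 0x4321678

# Size of a flat page table: 2**20 entries of 4 bytes each.
print((2**32 // PAGE_SIZE) * 4)  # 4194304 bytes, which is 4 MB
```

Using dictionaries means that an absent second-level table costs nothing, which loosely mirrors the point of multi-level tables: a program that touches only a small part of its address space needs only a few second-level tables.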
Info
Channel: Tech With Nikola
Views: 229,458
Keywords: virtual memory, computer science, programming, operating system, memory, ram, swap, swap memory, address translation, translation lookaside buffer, TLB, MMU, memory management unit, memory address, computer architecture, hardware
Id: A9WLYbE0p-I
Length: 20min 11sec (1211 seconds)
Published: Sat Oct 21 2023