Azure Infrastructure State of the Union 2021

Reddit Comments

Looking forward to settling in with a nice cup of coffee for this one!

👍︎︎ 1 👤︎︎ u/dogmanky 📅︎︎ Jan 08 2021 🗫︎ replies

Captions

Hey everyone, welcome to my 2021 Azure Infrastructure State of the Union. My goal is to go over the fundamentals, the building blocks of the various infrastructure services that we might use directly, things like VMs and VM scale sets, but also on which the higher-level app services and data services are actually built. As I'm sure you can imagine, a lot of work goes into creating this, so a like, subscribe, comment and share would definitely be appreciated.

My goal is to go over a very wide range of topics at a fairly shallow level. If you're looking for deeper dives, go to my YouTube channel: I have an Azure Master Class, which is about 20 hours of content as a playlist, and I have deep dives into various topics. Those are places you can go to get more detail. Again, my goal here is to stay at a shallower level but give you the breadth of the different components, where we are in 2021, and how they fit together.

I think the best way to start thinking about these core services is the shift from on-premises to the cloud and the shift of responsibilities. The easiest way to start is to think about the layers involved. Fundamentally, you have a data center, and in that data center you have different things: network, storage, and compute, meaning servers. Then I'm probably running some kind of hypervisor; it could be Hyper-V, it could be ESX, or something else. On top of that I get my operating system, different types of runtime and middleware, until I get to what I actually care about, what provides the business value: the application and the data.

Now for these things on premises, and really, what is on premises? On
premises is essentially this: I have some facility that provides capacity. It provides capacity in terms of storage, network connectivity and compute, which I then serve up through services, and very commonly in an on-premises world that capacity is served up as VMs onto which I put stuff. While we're going to talk about a lot of different topics, fundamentally the cloud is really just capacity as well, spread over lots of different regions. But rather than just virtual machines, there's a whole set of different services exposed that I can leverage, so I can spend time on my app and not worry about all these other things.

In my on-premises environment I'm responsible for all of this, every single component. I pick my data centers; I pick locations close to my users or my customers. I'm responsible for the network infrastructure; the storage, which could be SANs or network-attached storage; my servers; my hypervisor. I do everything.

When we think about the cloud, this shifts. The first thing we think about is always infrastructure as a service, IaaS, and the line for any cloud service always starts here. In the cloud I'm never dealing with a hypervisor or a server or storage or network. I don't do those things. Instead, those elements are surfaced up to me as different types of service, and in the cloud they are always the responsibility of the cloud vendor.

So what are these various things in the cloud? What would be the equivalent service that's exposed to me? If I think about data centers, in the cloud these could be different regions I can use. You'll hear about availability zones, maybe availability sets, so some kind of physical
isolation or proximity. There are also things like proximity placement groups if I want to keep things close together. So there are types of resources exposed to me that map, in a way, to what I used to think of as data centers.

From a network perspective, the big thing is virtual networks, so VNets and subnets. From a security and isolation perspective we think about network security groups and application security groups. I'm going to go over all of these. We think about user-defined routing to control the actual routes, and peering to connect things together. There are gateways, there's ExpressRoute, there's VPN, and there are other services: network appliances, Azure Firewall, Azure Virtual WAN. These are different types of service exposed to me that I can consume for networking.

For storage, again, there's a huge number of services. There are disks: when I deal with virtual machines I think about managed disks, which come in different performance tiers. There are storage accounts that expose different types of service. There are managed database offerings: SQL, Postgres, MySQL, MariaDB, Cosmos DB and other capabilities. There's Azure NetApp Files.

Then when we get to compute and the hypervisor, there are different types of service. These often get lumped into things like virtual machines, virtual machine scale sets, and various types of worker nodes for Kubernetes or for App Service plans. So we're not dealing with the actual underlying network infrastructure or storage or servers; we get these different services exposed up to us.

When we think about responsibilities, no matter what, these layers are always Azure's responsibility. Azure is responsible for the health, and they provide service
level agreements about the availability of these services. In an IaaS world I'm therefore responsible for everything inside the OS and above. I can pick whether I want Windows or Linux. I can pick an image available from the marketplace or bring my own image. And although I'm responsible, there are still things in Azure to help me: there are anti-malware extensions, backup extensions, the ability to do replication, hooks for desired state configuration, and much more. Even patching: there are extensions around patching, and there's a whole automanage capability now that I can turn on that does a bunch of those things for me. But I'm responsible for turning those things on and saying I want them. I can pretty much do anything I want.

One thing you may be saying is: okay, I'm not responsible for any of these lower layers, but I might still care. Are they performing well? Is there a problem? For all of these aspects, yes, Azure is responsible, but what I can get out of them are metrics and different types of logs. There are various services exposed that let me use them: I could send them to a SIEM system through an event hub, I could store them in a storage account, or I could put them in Log Analytics and run more intelligent services on top to get meaningful insight into what they mean. There are other services that will give me recommendations based on them, to say make this bigger, or shrink this, and so on. So although I'm not responsible, I may still want to know. If you think about RACI (responsible, accountable, consulted, informed), I might still want to know about what's going
on, whether they're healthy, whether there's a problem. All of these health alerts I can surface up through metrics and logs. My focus for today is really these infrastructure services where I'm basically getting a VM in the cloud. The most basic way to think about IaaS is a VM in the cloud.

There are other layers, though. There's PaaS, platform as a service, and here the line moves: I'm responsible (green) for my app and my data, and Azure is now responsible for the other pieces. Many of these are still built on virtual machines; the cloud is not vaporware. But I don't have to manage them. I'm not thinking about patching or backing up or protection or firewalls. It's done for me. What I get is the ability to focus on my workload, my app, my data.

There are different types of PaaS, and again it's not a focus for this session, but we move up through layers. There's Azure Container Instances, and Azure Kubernetes Service, which is the orchestrator. There are App Service plans. Then there's serverless, which could be Functions or Logic Apps. Why is it called serverless? For App Service and for AKS, there are worker nodes that run your things, and what I pay for is essentially VMs. I pick the VM size, I run my work on them, and I pay at a per-VM level no matter how busy it is or isn't. With serverless I'm not paying for a VM; I'm paying for the cycles used to do the work for that function or that logic app. These are normally triggered by something: it could be an event, a schedule, or a manual trigger, or it could hook into something like Event Grid that's looking at other types of resource
and then calls one of these things. In some ways Azure Container Instances could almost be thought of as serverless, because again I'm not focusing on the size of a VM, just the size of the container instance and how long it runs. These are other types of service available, and ultimately if I can use one of them there are fewer things I'm responsible for and less work to think about, so it's more optimal for my company.

At the far end there's SaaS, software as a service. This is not Azure; software as a service would be things like Microsoft 365 and Dynamics 365. It delivers the complete business value; it provides the solution. PaaS provides me a platform on which I can run my own business application that provides the business value. SaaS is itself providing the business value: Exchange Online, SharePoint Online, Dynamics and so on.

So while we're focusing here on IaaS, as a company I generally try to get as far toward SaaS as I can. If there's a SaaS solution, fantastic, use it. If there's not, and I'm creating something new for my company, can I write it using PaaS? I'm not really in the business of wanting to manage operating systems and runtimes and middleware. If there's a platform out there that I can just deploy my app onto, that's a better fit for me. We're always thinking about what we can do further over on that side.

So those are the layers. That's how we think about the services and how we think about Azure. But a key point: there's nothing magical about Azure. It really comes down to capacity that's available in different regions and exposes various types of service. In the cloud I'm never directly accessing storage or compute or networking or data centers or hypervisors; instead, services are exposed to me that I can then consume. That's the point.

So what exactly is Azure?
How do I use Azure? What are these clouds? I think we start by looking at these layers, going up. I talked about there being capacity in Azure. The way this starts out: there's the Azure cloud, and we deploy to a region. A region is defined as a two-millisecond latency envelope. Inside that region there are most likely multiple data centers, and those data centers are connected. There are regional network gateway pairs that connect them, and when I deploy to a region my workload gets put onto one or more of the data centers that make up the region. I drew the regional network gateways because there is a massive Microsoft global network that they connect to, to reach other regions and to reach edge sites, which in turn connect to ISPs, to the internet, and maybe to your location for private connectivity.

There are a lot of these regions. If I jump over to the Microsoft geography page, we can see all of the regions, and if we zoom in a little we can see they're all over the world. There are many regions in the United States, in Canada, in Europe, in specific countries in Europe, in South Africa, Brazil, China, India, Australia, New Zealand, Korea, Japan, you name it. There are Azure regions in all of these countries.

We talk about Azure as a cloud. I drew the idea that there's a particular region in this greater Azure cloud, and for 99.99% of us that's what we're going to use: the Azure commercial cloud. There are some other clouds. These are Azure clouds, but they're tied to certain sovereign locations. For example,
there's a China cloud, a German cloud and a government cloud, and there's even a top secret government cloud that they wrote a blog about, so I'm not sure the words "top secret" mean what they think they mean. We can see these clouds if I jump over super quick and open up the portal. I'm going to use the Azure portal for a lot of this today. Obviously in a real production environment we'd want to use infrastructure as code; if you want to find out more, I have a whole master class lesson on that. I'm going to open up Cloud Shell, which gives me easy PowerShell or Bash access so I can use the Azure PowerShell module or the az CLI. Down here in the bottom right, if I run Get-AzEnvironment, you can see the four clouds. It doesn't show top secret, but we can see the regular Azure cloud, which is the commercial cloud we're going to use, and then the China cloud, the US Gov cloud and the German cloud.

Those are sovereign clouds. To use the China cloud, for example, you have to be in China, in a Chinese company. Germany is all about the data sovereignty requirements for doing business in Germany, so it's managed by a German company. China is operated by a Chinese partner. US Government is restricted to government entities or their partners. For the most part we're going to be using the commercial cloud, but if you were in one of those geographies, those sovereign clouds are available, and if you deal with the government you may get access to the Gov cloud as well.

Back to regions. What you saw when I showed all those regions is that there are normally at least two of them in any geopolitical
boundary. Brazil was the exception, but you can see they are building out a second region in Brazil. The idea is that I want resiliency, replicating to another region, which is generally hundreds of miles away, while keeping my data in the same geopolitical boundary or the same country for data sovereignty reasons. So in all of these locations there are generally at least two regions. Some services, like Azure Storage and Azure Key Vault, will use the paired region for a replica of the data, and you can go and look these pairings up. We pair the regions behind the scenes, so if I turn on geo-redundant storage it replicates to the paired region. For many other services you can pick where you want to replicate to.

Going back to what I drew a second ago, a region is made up of one or more data centers, and for as many regions as possible Microsoft is introducing the concept of availability zones. I can think of an availability zone as having independent cooling, power and communications. This gives me resiliency from a data-center-level problem. I can certainly think about having another region over here and another over there, connected, and if I deploy my services to both and something bad happens to one big area, like a natural disaster, my services are also in the other region hundreds of miles away, which shouldn't have been impacted. But hundreds of miles limits how I can replicate data; it would have to be asynchronous so as not to impact performance. So ideally there are constructs at a more local level to give me resilience from different types of failure. Availability zones are that idea: I'll see three availability zones in my subscription. Now the key point here
is that what I see in my subscription is an AZ1, AZ2 and AZ3, but that's not written on the buildings. There is no physical AZ1, AZ2, AZ3. There might be lots of buildings in the region, and the mapping really happens at a subscription level. Let's say there are four data centers within a particular region. Remember, the whole point of AZs is independent cooling, power and communications, so a problem that impacts one of those data centers is not going to impact the others. In subscription one I would always see an AZ1, AZ2 and AZ3, and they might map so that AZ1 is this data center, AZ2 is that one and AZ3 is that one. Then I have subscription two, which also sees AZ1, AZ2 and AZ3, but there's no correlation. Its AZ1 could be one location, AZ2 another, and AZ3 may happen to be the same one as in subscription one, but there's no consistency between subscriptions. The point of availability zones is isolation from the other availability zones; there's no correlation between different subscriptions.

The goal is that if I deploy services, say a VM in each of AZ1, AZ2 and AZ3 for subscription one, I know that a data-center-level problem, let's say data center two goes up in flames or there's a comms problem, is not going to impact AZ1 and AZ3. It gives me the ability to limit the blast radius to a single availability zone, and I definitely want to use that.

More and more regions are getting availability zones. If we go back to the geography page and scroll down, we can see "availability zones presence", and it shows for each region whether it supports availability zones. You're basically always going to see three; there are never more than three
exposed to a subscription. If a region doesn't support them, it tells me; for example, for North Central US it says no, we don't support AZs, and the closest region that does is Central US. So that's where I can go and check whether I can use availability zones. Again, that's something you want to use if you can, and when I use availability zones for my service, I get a 99.99% SLA. That's an important point about these constructs.

There's another construct that's getting less and less focus, and that's availability sets. I drew the region and the data centers in the region, but if we exploded one of those data centers out for a second, it's fundamentally racks and racks of servers. Yes, a data center is a unit of failure, but a rack and a server are units of failure too. Each rack has its own power and its own network switches, so I can think of racks as fault domains. So there's this concept of fault domain zero, fault domain one, fault domain two, and I might want to spread my workloads over different racks. That was the idea of availability sets. I create an availability set, I put things in it, and Azure round-robins between the three fault domains. I can pick the number, but three is the max for VMs and regular availability sets, and it round-robins my workloads over the different fault domains. Within an availability set there's also something called update domains, and there are generally between 5 and 20.
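To make the round-robin idea concrete, here is a small Python sketch. It is a toy model under stated assumptions, not Azure's actual placement logic: the function names `place_vms` and `vms_per_fault_domain`, the VM names, and the defaults of three fault domains and five update domains are all made up for illustration.

```python
from collections import defaultdict


def place_vms(vm_names, fault_domains=3, update_domains=5):
    """Toy model: assign each VM a (fault domain, update domain) pair
    by cycling through the domains in creation order."""
    placement = {}
    for index, name in enumerate(vm_names):
        # Round-robin: the Nth VM lands in domain N modulo the domain count.
        placement[name] = (index % fault_domains, index % update_domains)
    return placement


def vms_per_fault_domain(placement):
    """Group VM names by the fault domain (rack) they were assigned to."""
    groups = defaultdict(list)
    for name, (fault_domain, _update_domain) in placement.items():
        groups[fault_domain].append(name)
    return dict(groups)


# Hypothetical availability set holding six VMs of the same workload.
vms = [f"web-{i}" for i in range(6)]
placement = place_vms(vms)

for name, (fault_domain, update_domain) in placement.items():
    print(f"{name}: fault domain {fault_domain}, update domain {update_domain}")
```

Because the assignment only looks at creation order and never at what a VM runs, every fault domain in this model ends up with an even share of whatever was put in the set.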
When there's maintenance performed, it will only do one update domain at a time. So there are fault domains, which are really racks, and then there are update domains, which are subsets within those racks. I might have, say, three fault domains but five update domains. When Azure updates roll out on the nodes themselves, with five update domains it would only pause a fifth of my VMs at a time; with 20, it's only five percent at a time. That's how it rolls out its updates.

So availability sets are good; they give me a blast radius at rack level. But availability zones are better, because they give me a blast radius of an entire data center, so we're seeing availability sets really de-emphasized. In fact, what's very common now is the idea of fault domain equals one, and that doesn't mean one fault domain; it means use as many as you can. If you use virtual machine scale sets, which is the ability to automatically create lots of VMs from a base image and configuration (we're going to cover them), and I say fault domain equals one, it will try to spread over as many racks as it can, more than three, within any particular data center, i.e. availability zone. So this whole concept of availability sets and a maximum of three is being pushed back as availability zones become more prevalent, and now we think: let's spread out over as many racks as possible. Again, we're always trying to isolate the risk and impact of some independent failure.

There is also the concept I mentioned of proximity placement groups. Availability zones and availability sets are about isolation and keeping things apart, but sometimes I want to keep things together. So I create this proximity placement group, and then the first workload I put in that proximity
placement group will pin it to a certain part of a physical location, to try to keep everything that's then added to that proximity placement group as close together as possible. When I create a proximity placement group, originally it doesn't exist anywhere. Then I create the first resource and put it in, and maybe I picked AZ1, so now within AZ1 it will carve out a portion of the facility for that proximity placement group. Everything else I add to it will be put within that more constrained latency envelope. Obviously you have to be careful, because the first resource controls where it's put. So generally, if I'm going to have a mix of different sizes and advanced types of workload, put the most advanced, biggest VM first, like an N-series or an M-series, because that makes sure the group gets created in a data center that supports that type of workload. Then the more standard ones will be fine. And I can mix these things: I can have a PPG within an AZ. AZs and availability sets are a case of using one or the other, unless I cheat. I'm not going to go into that in this video, but I can cheat if I use proximity placement groups.

Never mix workloads in an availability set. An availability set is not looking at what's running; it's just blindly round-robin distributing VMs between those three racks. If I mixed my domain controllers, my databases and my IIS servers in the same availability set, through sheer bad luck it might place them DC, SQL, app server, DC, SQL, app server, and I'd end up with all of the same workload on the same rack. So I create an availability set per unique workload: one for my domain controllers, one for this SQL cluster, one for this IIS app, one for a different IIS app, to make sure each is always spread out. That's really the goal.

So those are the key constructs. No matter what I'm doing, VMs, Kubernetes, app services, it doesn't matter: these key isolation constructs are
going to get used: the idea of regions, ideally availability zones if they're available in the region, and then these fault domains. I get to select those things.

So now let's start talking about the resources we actually want to use. When I create things, I create them in a subscription, and that subscription has to have people with various rights and permissions to use those things. So before I start creating anything in Azure, the first thing we have to have is an identity provider, and the identity provider in Azure is always going to be Azure AD. I can create users and groups directly in Azure AD. Most of the time, if you have an existing Active Directory, you will populate it through synchronization, using something called Azure AD Connect, which populates the user objects, groups and machines into Azure AD. Azure AD is the identity provider for Azure services, for Microsoft 365, for Dynamics 365 and for many other services.

There are different ways I can authenticate to Azure AD. Cloud authentication is the best and the most secure. That means I replicate a hash of the password hash to Azure AD, so I can authenticate directly against Azure AD. Then there are things like conditional access and MFA to give me strong authentication with Azure AD, which I then use to get access to other resources. So the Azure cloud trusts Azure AD as its identity provider, and each subscription trusts a particular Azure AD tenant: yours. But so do many other clouds. There are the Microsoft ones, Microsoft 365 and Dynamics 365.
They also use Azure AD as their identity provider. But there can be third-party clouds too, general SaaS solutions out there. The whole point is that I don't want different identities for my users. I want one identity that I can use in the cloud, for Microsoft services and for other services, and all these other SaaS services can also trust Azure AD. So now I have this one identity, synchronized up to the cloud, that can be used for all these other things to get authorized to consume those services. There are literally thousands of third-party SaaS services built into Azure AD that I can just turn on and use. If there are objects that have to be created in the other cloud, there's this thing called SCIM, which I think stands for System for Cross-domain Identity Management, and it can create the objects there for me as well. Azure AD is going to be the identity provider for Azure and many other things, but we're focused on Azure. So I have to have Azure AD. If you're already using Office 365, you've got Azure AD already; someone has set it up for you. This is where the accounts live that I'm going to use to get permissions and access to the resources I'm going to create.

I said we get a subscription, but there are actually layers in between. There's my Azure AD at the top, and that's for my tenant. There are many different tenants out there; I have my own. Under it I can have a hierarchy called management groups. There's always going to be a root that sits directly under the Azure AD tenant, and beneath it I can have a complete hierarchy of management groups to meet my various business requirements. What are those requirements? Why do I have these management groups? When I use the
cloud, there's a big shift in process. If we go all the way back to the idea of on-prem: if I was a user and I wanted to consume some of that capacity, say I wanted a VM, I would probably have to make a request, maybe an email, maybe a ticket, to an admin. Let's say they've got glasses and they've got hair, so it's definitely not me. The admin would look at that request, check that it meets requirements, check that my project has enough capacity allocated to it, and then go and do the provisioning.

That doesn't work in the cloud. In the cloud we don't have this admin sitting in between. A huge part of the cloud is the whole idea of self-service, and it's not just me manually going to a portal or manually running a template. A huge thing we're going to see is DevOps, with CI/CD pipelines that automatically provision the resources and provision our app. They might do it daily, and it's constantly ongoing, so there's no way I can have this admin in the middle of the process. But that admin served an important purpose: they were our governance checkpoint. They were the guardrails for our company. They checked that we were meeting our policy requirements, they set the right permissions, and they checked that we were spending the right amount of money, so they were checking budgets. Even if I move to this self-service cloud world, I still need those things.

So the reason we have these constructs is that I still need to be able to do policy, I still need role-based access control to give different people different permissions at different scopes or on different sets of objects, and I still need to be able to do budgets. There are constructs in Azure for each of these, and as you've probably guessed already, I can apply these
at management group levels and they get inherited down. Anything I set here is inherited, so I could set it at a fairly high level and it gets inherited down to all of the child management groups. Then within a management group, what we ultimately create is a subscription. We can apply policy and RBAC at all these different management group levels, and we can also apply them at the subscription level.

Within that subscription, when I want to create resources, we create something called a resource group, and again, as you've probably guessed, I can set these things at the resource group as well. It's finally within the resource group that I actually create my resources. So I create this VM and another VM, a storage account, and a website. (I don't know what that drawing is; it's a very bad planet.) The reason we group things in resource groups is that they have a common life cycle: they're created together, they run together, and ultimately they're going to get de-provisioned together. They're maybe part of the same application, so most likely the same people want the same permissions (the RBAC), they'll have similar policies, and maybe there's a certain budget for that project.

Within a subscription I can have lots of resource groups. A resource group is not a boundary of communication. I could absolutely have resources over here, maybe networks in this resource group, that the VMs in another resource group use and connect to. So the whole point is that we have this hierarchy. We have the management groups, used for governance and organization, where I can apply policies, RBAC and budgets. I can also do that at the subscription and resource group. And the reality is that even at a per-resource level I can apply policy and RBAC. Generally we don't; it's too convoluted. For us as humans, the resource group is as low as we'll go.
sometimes certain automations they might do rbac or policy at an individual resource level we typically will not so we have all these different constructs and again the point is rbac is a set of permissions actions i can perform at a certain scope management group subscription resource group or even individual resource and under any particular management group i can have multiple subscriptions so fundamentally our building block is going to be when we create stuff we create it inside a subscription that's really how we're going to get started so the governance is a key part now there are many other elements to governance i'm not going to go into detail again in the master class there's a whole video on governance because i have use of tags i have naming there are many other elements that i have to really think about one other thing you can kind of do is if i think about well policy and role-based access control and creating resources and creating resource groups very commonly you'll kind of see these things bundled up together into something called a blueprint a blueprint is something i can stamp down on a subscription to maybe lay down an initial configuration so a blueprint contains policies and rbac also i can define resource groups and arm template deployments an arm template is just a way to create resources in a declarative form so it will create hey these vms and these storage accounts so there's other ways i can actually create things and blueprints bring these things together as a way to stamp down an initial configuration and you can have different locking i can make it read only i can make it so you can change it but you can't delete it or hey it's an initial state but go nuts delete it if you want to i'm just putting down an initial state to help you so to create anything these are the constructs i need i need identity i need somewhere to create things what are the actual resources i can create so if i think
back to kind of all these different kind of layers we often think about what a virtual machine and really a virtual machine is just this unit of compute with other configuration so if we think we have a server so there is a physical server in those racks in those data centers now that physical server has attributes it has resources this physical server has certain amounts of cpu it has certain amounts of memory it has certain network connectivity it has certain amounts of storage locally and it has certain amounts of remote connectivity it can support a certain amount of throughput in azure they probably separate out the storage and the regular network communications to different network devices so it has a certain amount of throughput it can support for storage it might have specialized it might have gpus so there might be kind of these special configurations where hey maybe it's got gpus it's got special types of maybe nvme local storage maybe it's got rdma network infiniband connectors and many other different things but essentially we have servers with different configurations different skus maybe more memory to cpu maybe more throughput and then i'm going to create a vm and fundamentally my vm is taking a certain portion of those things my vm is going to have itself well it's going to have a certain amount of virtual cpus it's going to have a certain amount of memory it's going to have a certain amount of network connectivity it's going to have a certain amount of storage in terms of maybe number of disks maybe in terms of iops in terms of throughput remember iops and throughput are not the same thing iops is the number of input output operations i can perform per second and then based on the size of the operation that's the throughput so if i was doing an operation of 1k i did a thousand of them well that's a thousand k that'll be my throughput but if the operation was 8k it's still a thousand operations but now the throughput is 8000k so they are kind of 
these different numbers and also there's a certain amount of local storage i'm going to consume and what we end up with is lots of different vm series and sizes i cannot just say hey give me a vm with three virtual cpus and 2.2 gigabytes of memory and 12 iops it's t-shirts small medium large but just like t-shirts there's sort of regular fit there's athletic fit there's loose fit because my different workloads will have different skews in terms of what they care about maybe it's more memory-centric or cpu-centric or storage-centric or network-centric maybe they need to be really really big and so what we have if we look at the virtual machine sizes what we can see here is well just like i said there's different types so there's ones that are just general purpose it's the fairly balanced you can see here fairly balanced ratio of cpu to memory then we have ones that are more compute optimized so it's a higher cpu to memory kind of ratio we have ones that are memory optimized so a higher memory to kind of cpu ratio we have the storage optimized so high throughput high iops we have ones that have gpus these nvidia cuda cards and others ones that have rdma network adapters so there's all these different types actually available to us and we basically select based on what we need hey my workload maybe it's a database well my database is actually super memory focused so we'd look at well what are the memory optimized virtual machines and then within the memory optimized there are different types d e m there are constrained versions now you'll notice sometimes you'll see kind of this s variant in fact most of the time you'll see this there's a ds and just a d the s means it's allowed to use the premium storage um types of disks that are available the premium ssd the ultra disks you'll also see sometimes there's a version so you can see there's a v2 here there's kind of this v4 there's a v3 so if you think about it over time hardware changes hardware gets more powerful but i want
to make sure there's consistency if i create a machine today and i create one tomorrow and so these different versions are really focused around this azure compute unit so the way we measure the cpu in azure is there's this acu score and it's all relative to the a1 kind of the first azure virtual machine which has a score of 100 and then we can see the different generations so here the v2 of the d series is higher than the v1 over here we can see the v3 of the d series is lower which makes no sense until you notice the vcpu to core ratio is two to one so this introduced hyper-threading so each thread this virtual cpu is slightly lower performance because now i get two of them per core so that's why the number goes down but the reality is now i have more of them to actually work with so this is how i can actually track what is the power of the actual cpus i'm getting i look at the azure compute unit but also when you actually look at the virtual machines themselves it tells you the type of processors they're running so it tells you the exact hey intel xeon platinum 827 blah blah blah blah blah so we can go and get all of the details of what that actual cpu is and then for any of these i'll just pick one there are different sizes so within the series i.e the ev4 there's an e2 an e4 an e8 e16 20 32 48 and as you can see the resources scale linearly the more cpus the more memory the more temp storage the more number of data disks the more temp storage throughput i have the max number of nics the network performance all scale linearly as i go up in size of the virtual machine because if you think about it logically i'm getting a certain portion of the physical node so you have to kind of scale linearly if i'm using up more cpu well say now i'm using 50 percent of that box's overall resources so i get 50 percent of its memory network storage as well so it's going to scale linearly now they do do some mixing on the back end to maximize the overall usage but by and large what
they're doing is they have different stamps of clusters of identical hardware that are more memory skewed or cpu skewed or have the higher throughput for storage or network have the nvidia gpus etc so i go and pick a certain series based on my workload so i have to understand what are the requirements of my workload is it more memory or cpu or storage or network do i need gpus do i need rdma for high performance computing and then how much do i need that will drive the series and size that i ultimately select and provision i can change it i can shut down the vm change its size and then start it again so it's not like in the real world where i buy a server and if i make a mistake i'm stuck i can very easily change the size of my virtual machine and then just start it again so i do have some flexibility there it's not like the huge painful mistake now you may have noticed that we had those kind of constrained um virtual cpu skus the point of those is maybe i need those massive amounts of memory and along with that comes a certain number of cpus because again it has to linearly scale even the memory optimized skus where i get more memory per cpu ratio if i go to a really high amount of memory i still get quite a lot of cpus maybe i don't want those cpus and you might think why why would i not want them imagine i'm paying licensing on a per cpu basis a database product for example and i just need the memory i need a massive amount of memory but it really doesn't need many cpus and if i only need four cpus but it gives me 16 i have to pay for 16 licenses of the software so the point of the cpu constrained versions if we go and look here is what they actually do is now you're still paying for the full size of the vm but essentially you can see here it's hiding them so the m8-4ms for example i'll only see four virtual cpus everything else is the same as the m8ms but it's hiding the other cpus so i'd only have to pay for four cpus of the licensing so if we looked at what is the
m8 ms so the m8 ms normally has eight virtual cpus so it's hiding a portion of them so i don't have to pay for eight virtual cpus i only have to pay for four or i could do the m8-2ms and just pay for two virtual cpus so that's really the point of those now remember these virtual machines everything here um is multi-tenant when i get my virtual machine that's my virtual machine well another person from a different tenant their virtual machine might be on that same box now there's huge security there's isolation of the hypervisor the network so they can't see each other even though it's on the same box if you need your own box there's really two approaches to this for some of them actually the size of the virtual machine takes up the whole box so if i take one of these vms that is the size of the box there's only one vm on that box i still want it to be a vm not bare metal because being a vm gives me that ability to use all the various features the snapshots the types of deer school the extensions the arm deployment i still want to be a vm i don't want to be bare metal the hypervisor gives me that abstraction from the physical hardware so the mobility but no one else is on it there's also something called dedicated host so if i'm maybe running a certain type of workload that i'm not allowed for maybe regulatory reasons to be on the same box as someone else it can only be my stuff on the box then there's this dedicated host so dedicated host i basically pay for the host and i can buy multiple dedicated hosts and i can put them in something called a host group so this is all dedicated and then i can put them into a host group then when i create my vm i can say hey create my vm in this host group and it will automatically place it now when i buy a dedicated host it's of a certain type so it's a d series or an e series or something else and i can only put d's on the d series dedicated host or e's on an e series dedicated host i can create them with different sizes basically i can fill it
up up to the capacity of the host so if i don't want to share i can't share the physical host because of regulatory reasons there are dedicated uh options where i get my own host it's still azure everything else is the same it's just it's not multi-tenant in terms of who gets put on that box it's only my stuff will get put on that box and again the host groups make it easier for me to deploy i don't have to think about well put this vm on this host this one here the host group kind of abstracts that away for me i can use that with things like virtual machine scale sets as well the other kind of special so again we have dedicated we have isolated because if the vm takes up the whole size it's an isolated normally you just you pay for the resource i.e it's not over provisioned i get two virtual cpus if i'm running at one percent or 100 percent i'm paying the same thing there is also the b series so that's a b and the b stands for burstable so with the b series what i actually get is i get a certain amount of the number of virtual cpus allocated let's say that's 20 percent so i get this provisioned hey you can consume 20 percent now if what i'm actually using is less i start accruing credit so just like your cell phone plan that you can roll over if i'm using less whatever that less is uh i start to accrue a certain amount of credit and what it lets me do the reason it's called burstable is if i suddenly get a busy requirement i can burst above that if i have some kind of peak and when i burst up i obviously start using my credit and then i can start accruing it again so that's the burstable and if we go and look at that we can actually see if we go and look at the general purpose b series burstable it talks about the idea that hey i can pick how many virtual cpus i want i get a base performance if we pick the b1 i get base performance of 10 percent i can burst to a hundred percent it does give you some credits initially and then you can accrue it's saying hey you can bank six per hour
maximum you can bank is 144. so i could burst up and use more so the b series is nice in that they're cheaper and it still lets me do that kind of bursting on occasion so that's another option there and notice those isolated sizes i talked about earlier on so these are the ones that are just simply they're so big it takes up the entire box so if i create one of these isolated virtual machines i'm not sharing with anyone else it's because i'm taking up the whole box now one of the things often you want to try and work out is well what is this actually going to cost me so there is a pricing calculator and what i can do in this calculator just close this down is i can actually go in and say hey i want to work out the price for a virtual machine and then i could say well hey is it windows is it linux you'll notice the linux is cheaper why is that well because with the windows it includes the windows server license now if you had something like azure hybrid benefit i.e you have the license already i can tell it that you can see here the price for the windows dropped if i turn it back to license included the total sum over here on the right goes back up so the windows license has a value when i say azure hybrid benefit it drops down but i can go and work out well what size do i need obviously the bigger it is the more i'm going to pay and the whole point of the cloud is on premises they run 24 7 in the cloud they run for when you need it so on premises we often think about for our workloads we just have these big virtual machines that run the work and it's running 24 7.
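
The b series numbers quoted above (base performance of 10 percent, six credits banked per hour, a maximum of 144) fit a simple model where one credit is one vcpu at 100 percent for one minute. The sketch below is a simplified hourly simulation under that assumption; the function name and parameters are made up for illustration, and real credit accounting is finer grained than one step per hour.

```python
def simulate_credits(usage_by_hour, base=0.10, vcpus=1, start=0.0, max_banked=144.0):
    """Return the banked credits after each hour (simplified, hypothetical model).

    usage_by_hour: average cpu utilisation for each hour, from 0.0 to 1.0.
    One credit is assumed to be one vcpu at 100 percent for one minute, so
    running below the base accrues credit and running above it spends credit.
    """
    credits = start
    history = []
    for usage in usage_by_hour:
        delta = (base - usage) * 60 * vcpus  # credits gained (+) or spent (-) this hour
        credits = min(max_banked, max(0.0, credits + delta))
        history.append(round(credits, 2))
    return history


# A fully idle hour banks 60 minutes * 10 percent = 6 credits.
print(simulate_credits([0.0, 0.0, 0.0]))
# A day of idle time fills the bank; one hour at 100 percent then spends 54 credits.
print(simulate_credits([0.0] * 24 + [1.0])[-2:])
```

Under this model the cap of 144 is simply 24 idle hours at 6 credits per hour, so the bank holds roughly a day of saved-up headroom.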
even if the requirements on it are variable it's just always there because we've got the physical box why wouldn't we in the cloud we pay per second now it's not practical to keep making it um sort of vertical scale i don't want to think about adding cpus removing cpus that's not practical so in the cloud we think about scaling out and in not vertical so horizontal so i'll have more smaller ones that i can kind of start and stop so if i have kind of a busy time then sure we have all of them running but if i get quieter well then we'll stop those ones delete them if i get busier again well it can reprovision them or maybe even add more they're smaller units it's costing me the same at the peak time as this one big box to have maybe eight small ones but when it's quieter i can delete them and therefore save money so we don't do things the same way as we think about with kind of on-prem we want to scale kind of out and in rather than up and down up and down is not practical for most things that does require the app to support multi instances and we need some kind of triggers to create and delete the instances so we stop paying for them but just remember in the cloud absolutely we want to be scaling we pay per second i don't want resources sitting there that are bigger than they need to be or more than i need to have we want to scale the things because we pay based on what we're actually using that's one of the huge opportunities for the cloud to actually save money is because it is consumption based we pay for what we're consuming so i go through these calculators it's super important to really think about well maybe i need four of them however not all four would be running all of the time maybe some of them i'd have to break this out into separate sort of settings but maybe some of them would run for two hours all the time but then some of them would only run for i don't know half the time so i'll actually go through and work these things out and i can change it
over time the beautiful thing about again consumption based is if i'm off on my numbers if i'm auto scaling it will create and delete them based on the actual requirements not based on some hard number i've actually put in there are things like reserved instances which is where hey i know absolutely i'm always going to have if i think about those variations even in my most quiet time i know the minimum i'm ever going to have is four let's say and then maybe there's some others that will kind of change over time well with reserved instances i can kind of do this one year or three years and it's not just virtual machines i can do it for other types of resource as well i can say hey look i'm going to buy four let's just say for example vms for a year or three years and you get a big discount because it helps azure in terms of their capacity planning so you get this big discount the key point here is though if you ever drop these down to two you'd still be paying for four it's like doing a room reservation at a hotel and you get a huge discount if you book it for a month if you don't stay in that hotel room for two nights out of the month you're still paying for those two nights and so the reserved instances the point is hey i know i have this base level workload that i'm always going to have running and i know i'm going to have it well i can do these reserved instances and it's going to give you a big discount so if i know hey i've got this certain amount i'm always going to use i can say hey i want to do a one-year reserved so you get a 32 percent discount or a three year reserved i get a 57 percent discount so when you start combining things like the reserved instances and the azure hybrid benefit things can start to get really really cheap but again the key point here is you'd have to balance this um if you make it too big and then you're not using them you're kind of wasting money however because of the size of the discount it may still be cheaper to say minimum of four even
if sometimes i drop to three if it's not for that long i still might be saving money overall so there's a little bit of kind of look at the math to work out what the right number is when i do those reserved instances on all the different types of resource available in azure and obviously as you saw there's windows and linux there's a huge number of linux distributions supported microsoft actually is a big contributor now to kind of that open source community a lot of the azure components are built in or easily available there's a huge number of them available in the azure kind of marketplace when i create a virtual machine so if i actually jump over and if we go to actually create a vm so i'm going to look at my virtual machines and i'll do add vm what we'll actually see is that i mean there's images built in so there's ones around ubuntu red hat suse centos debian oracle linux then of course windows server windows client so i can have things like windows virtual desktop all of these available or i can just kind of bring my own then you'll notice this gen one most of the things in azure are gen 1 i.e bios based but there is a gen 2 for the uefi and you'll see also when i create those virtual machines here i can pick hey well this only gives me availability set so i'm picking west us if i pick west us two well they support availability zones whereas west us doesn't notice i can pick availability zone or i can pick availability set but i can't pick both so i can pick one or the other and again we see one two and three a different subscription remember we see something completely different so when i create a vm hey i picked the region and then based on the region if it supports availability zones i could deploy it to availability zone or availability set or none of them and i should have kind of said availability sets gives you kind of a 99.95 sla so we talked about those slas earlier on i kind of drew out the idea that hey if i use those availability zones you
get that 99.99 if i use those availability sets again sla of 99.95 so it's a lower sla which really makes sense because the blast radius so availability sets okay it's now all within one data center so it's harder to give you as high an sla different buildings it's easier to give that resilience and obviously both of those are multiple i need multiple virtual machines to be able to spread over racks spread over availability zones there are slas for single instance virtual machines so if i have a certain workload where it is just one of them and it's just sitting on kind of this single box um the sla will actually vary depending on the types of disks it's using so if i have just a single vm if it's using premium storage so let's say premium ssd so it's really the storage that drives that kind of how quickly things can get spun up and resolved well then i get a 99.9 so a three nines sla if i use a standard ssd then you get a 99.5 sla if you use a standard hard disk drive you get a 95 sla and it's all documented you can go to the microsoft documentation and it walks through uh the details of those slas as you can see here it talks about hey look availability zones well great you get 99.99 availability sets 99.95 using a premium or ultra disk 99.9 standard ssd 99.5 a standard hard disk drive 95 so it kind of goes through um those different options for you so just uh jumped over there for a second but that's kind of the basics around hey uh when i create a vm i've got different sizes based on the size it impacts the cost and the different capabilities that i actually have available to me so that's really the compute side of things now beyond the compute we talked about the scale so i can scale out in with dedicated hosts so a vm fundamentally if we come back to what is the virtual machine the virtual machine really is just a set of hardware configuration but what is its state well the state is its storage so i think about well yeah there's
this virtual machine but what really makes up its long-term kind of identity is its storage so if i think about a vm yes it has a certain amount of resource but it's also hey we have an os disk and we may optionally have various types of data disks so when we think about saving costs in azure if i de-provision a virtual machine it means hey this vm is no longer provisioned and running on a host but it doesn't delete its disks it doesn't delete that definition the metadata so this vm is this resource it doesn't delete that metadata so i can just start it again it would re-provision it on another host it would connect its os and data disks to it i'm exactly back where i was but i've stopped paying for the compute charge so if i de-provision here i stop paying for the dollars for the compute but i would still be paying for the dollars for the actual disks those i don't want to delete or i'd lose its state so i always have kind of those options there in terms of how i handle my various resources when you're playing around and you're learning these things make sure you shut down virtual machines when you're not using them if i'm not using it overnight shut it down now when i say shut it down you have to shut it down from the azure perspective if you just log into the vm and do shutdown it doesn't de-provision it from a host it's still provisioned on that host i'm still paying for it i have to de-provision i have to take it down from the portal um through powershell through cli then it will actually shut it down de-provision it remove it from the host it's not there anymore and i stop paying for the compute again i'm still going to pay for the disk so once again when i finish playing with it completely make sure you delete everything delete the vm delete the disks and the best way to make sure you don't miss anything is remember these ideas the resource groups so when i'm working on a certain project maybe i'm doing a certain lab create a resource group for the
lab hey lab one there's the vm there's its disks there's ip addresses there's its network interface cards when i'm done delete the resource group it will delete the vm the disks everything so i'm not leaving things around that i'm paying money for so it's kind of a super important point so we have the virtual machine certain size has certain resources that's this fundamental building block the virtual machine is yes i could just use it on its own i could create a number of them over different az's availability sets proximity placement groups if i want them close together but if i'm doing this scaling thing that's a fairly painful thing to try and do myself i could write something i could use automation i could use functions i could trigger based on hey metrics if it's not this busy over a certain duration stop some or create them but i really think about the virtual machine this vm as a building block so yes this virtual machine is great but then i can think about things like virtual machine scale sets a virtual machine scale set is simply a configuration of the virtual machine kind of a base image and then also we have kind of attributes around scale so if a virtual machine has a configuration and an image that is running a vmss kind of points to a certain vm image and configuration i can do scale which could be based on kind of a schedule it could be based on some kind of metric i.e cpu or queue depth above or below some value it could be manual and it will automatically create and delete these based on those things so that virtual machine is a building block that gets used by richer things and then virtual machine scale sets are used by things like aks azure kubernetes service for its workers aks for its workers well it actually points to a virtual machine scale set then it adds the various kubernetes kubelet etc that it needs to do its job and run the pods these things all build together app service plans use virtual machines to run their apps on top of
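
The metric-based scaling described for scale sets comes down to a rule of the form "above this threshold add instances, below that threshold remove them, within limits". The sketch below shows that decision only; the function, thresholds, step and limits are hypothetical values chosen for illustration and are not defaults of any Azure service.

```python
def desired_instances(current, avg_cpu, *, scale_out_above=70.0, scale_in_below=30.0,
                      minimum=2, maximum=10, step=1):
    """Pick the next instance count from an average cpu metric (illustrative rule only)."""
    if avg_cpu > scale_out_above:
        target = current + step   # busy: scale out
    elif avg_cpu < scale_in_below:
        target = current - step   # quiet: scale in and stop paying for an instance
    else:
        target = current          # inside the comfortable band: leave it alone
    # Never go below the floor or above the ceiling.
    return max(minimum, min(maximum, target))


print(desired_instances(4, 85.0))   # busy, one more instance
print(desired_instances(4, 10.0))   # quiet, one fewer instance
print(desired_instances(2, 10.0))   # already at the minimum, stays put
```

Keeping a gap between the two thresholds avoids flapping, where an instance is added and then immediately removed because the average dropped just below the scale-out value.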
when you start to understand vms and the sizes and those concepts it's not just for vms many of the other things actually build on top of all these constructs so again things like app services they use virtual machines i pick a vm size and series so understanding these constructs is important windows virtual desktop hey it sits on top of virtual machines many many services leverage those virtual machines underneath even the paas the database services often you'll see i'm picking certain vm sizes because things don't just run in vaporware they're really running on a vm behind the scenes okay so that's probably spent way more time than i should have done just on the virtual machine size but it's important to understand what a vm is it's a certain set of resources with various ratios that you pick based on your workload and then its state is in these disks but where are those disks what are those disks so i drew this picture of kind of this vm on a node so let's focus back on that for a second so if i really think again about that node that node has all those resources and i did say hey it has a certain amount of local storage and then i create my virtual machine now i don't want to put my os disk for the most part um on here because remember we don't really consider any physical piece of hardware as super resilient it could fail so this isn't really considered durable it should be fine but if i'd stopped my virtual machine and then started it again i.e de-provisioned stopped paying and started it it would probably get provisioned on a different host so anything on this disk would be lost if the node crashed i'm going to lose anything on the disk and so what we have is we have azure storage and azure storage at minimum is something called lrs locally redundant storage so anything i write on there there are always three copies of my data so i don't have to think about mirroring my disk or anything like that there's always three copies within that storage cluster now in the past we
would have to think about things like well we create a storage account and then we create a page blob and a page blob gives us very good random access to the pages that make up that binary large object but there was a certain maximum number of operations per storage account and the disk wasn't really a real thing it wasn't a first-class azure resource and so microsoft essentially abstracted away the storage account it's still there what we focus on is a thing called a managed disk and so we actually now create a managed disk and the first one we're going to have is the operating system so be it windows or linux we're going to map to a disk now for pretty much all vms there are a few that don't have it um it's like the dv4 the dsv4 don't have a temporary drive but most do they also get a portion of this local disk and that's kind of this temp area so it gets that mapped to as well so for windows this temp drive is by default d for windows the os disk would be default c and in linux this is kind of the /mnt the temporary area so we have this durable disk for the operating system so even if something happened to this node if we de-provisioned it the disk is still there again there's three copies of everything within that cluster i can restart it or if this dies it gets reprovisioned it can just reconnect it but i would have lost everything on the temp drive the temp drive there's even a data loss warning file on that disk to say hey don't put your data on here um i'm not durable you're going to lose stuff if you put it on here and i can also add additional disks i can add one or more data disks as well which again will be e f whatever i want to do remember you never have to do mirroring or raid each of these is three copies of the data i might stripe because maybe i want more iops or throughput for a particular volume i'm never gonna do redundancy it's pointless there's already three copies we do redundancy with raid because an individual physical disk can fail and we'd lose the
content well think of each of these disks essentially there's three copies of the disk you're not going to lose the disk so you don't have to waste space doing mirroring or striping with parity it's not going to get me anything so we don't do that so i can add additional disks now technically i could still use the old unmanaged disks but there's really not a reason to i guess if i was doing the unmanaged regular hard disk drive i only pay for the data i write but i lose all the benefits of the snapshotting the images so you're going to focus on managed disks and what i'm going to have is there's different types of these managed disks now i guess i should say i said the os is always on this durable storage there may be times i have virtual machines that are stateless they're tin soldiers i don't care about them so there are scenarios where i can actually create the os on this ephemeral they call it ephemeral temporary non-durable cache or temp area there's a certain amount of cache allocated per vm for caching then there's a certain amount of temporary storage so it is possible if i have completely stateless i don't care if i lost that vm it's just doing a certain workflow but it has no state that's important i can actually say hey create the os on this temporary area it would provision faster it would have a lower latency because it's local storage so you can create ephemeral storage if you want a lot of times with virtual machine scale sets that might be something i need because those virtual machine scale set instances have no state i just want to provision them as quickly as possible and i save money because i'm not paying for this managed disk anymore so you can have ephemeral storage if you want to it's an option but most of the time if it is a state you care about i have these managed disks and there are different types of managed disk i can think about well there's standard hard disk drive which is really just kind of test dev and there's standard ssd and
there's premium ssd and then there's ultra disk ultra wonderful super duper for all of these top three i can think about the capacity the iops and the throughput scale linearly i.e as it gets bigger i get more iops and i get more throughput it's this consistent line across all of them if we actually go and look at the types of disks we have available to us here ignore the fact this is linux it's just this particular page i picked but here we can see hey look standard hard disk drives you can see you get a certain performance now the iops are low on the bottom end and notice it says up to so both standard hard disk drive and standard ssd it says up to so that's the limit but it's not provisioned it's not guaranteed i will get that many but what you can see is as the disk gets bigger over here for example you start to get higher throughput and higher iops if i look at the premium ssd it's a much better kind of picture well one it now says provisioned i.e i am guaranteed with premium to get that number of iops i am guaranteed with premium to get that throughput whereas with the standard offerings it's a maximum but i'm not guaranteed to get it that's why really with premium and above you consider those for production purposes so you'll see here well they kind of go up the bigger the disk the more iops the more provisioned throughput now one of the nice things we actually get with the smaller premium disks you can see here is we do actually get a bursting capability so what we can have here is hey there's a certain amount of guaranteed iops and throughput for those disks but for limited amounts of time it can actually burst up to a higher iops a higher throughput for that limited time window and you start off with a bucket of that higher iops and throughput and then you can use it up and then i can accrue it again if i'm using less than kind of that provisioned amount so that bursting is becoming more and more popular as a means of saying hey i need this high iops and
throughput but for a fairly small time don't make me buy this massive disk when i really only need it for these burst scenarios maybe a log on storm at the start of the day or something else it's only for the smaller disks because the idea is that bigger premium disks they already have high numbers you shouldn't need to burst beyond those now the other nice thing i can actually do is for the premium disks yes it's linear based on the capacity i get more iops and more throughput but imagine i'm running premium and i actually need a higher sustained iops and throughput for a certain amount of time but then i want to drop back down again now that would normally mean i'd have to make the disk bigger and then i've got to somehow work out how to shrink it which can be a fairly painful exercise so what i can actually do with premium ssds you can separate the performance and the capacity it's pretty easy to see so if i jump over for a second i'm actually just going to go and look at my subscription and if i look at my disks now the disk has to be either detached from a vm or the vm deprovisioned so i'll find one where it's detached so i've got premium ssd and notice i have size and performance so this is my size 32 gigabytes and i get a certain amount of provisioned iops and throughput what i could do is i could change it to a different performance tier this will not resize the disk but it will essentially give me the performance of the bigger disk once i've done the resize i would then run so it's not resizing it so i don't have to mess around in the future when i'm done with that higher performance i can then just shrink it back down to its regular performance and i'm done so for those premium disks i can change the performance without having to resize it so that's kind of the big deal now with those performance tiers hey there's some window and i could automate this i could script it maybe i do have some bigger workload running nightly a batch file i want to up its
performance for an hour and then shrink it back down and maybe it's a bigger disk so that burst is not available or the burst is just not enough i can actually completely change the performance characteristics of the disk without having to actually resize and then try and shrink it again so when i create my vms i can pick the type of disk i want for the os and i can pick different types for the data disks again if i want to use kind of these premiums or above i have to do the s variant for the virtual machine so the ds the es they're the only ones that can use the premium or above types of disk ultra is special so with ultra you can think about it as having three different dials for kind of the capacity the iops and the throughput and i can change them dynamically whenever i want so there's no deprovision so i can tweak and i pay for what each of the dials is kind of set to so once i hit a certain capacity i think maybe it's a terabyte we can check then i can pick whatever iops or throughput up to like 160 000 iops so it's massive so if we go back and actually look if we look at the ultra disk numbers we can see the performance so here we go so once i hit here you go one terabyte in size here i can now set these maximum numbers so my size can continue up to like 64 terabytes but even at one terabyte i could have 160 000 iops or 2000 megabytes per second so i can tweak those numbers kind of individually and it shows me where these are available and how i can actually use those things and so the pricing is based actually on those so if we look at billing there's a certain reservation so if i create an ultra disk and don't connect it to anything it's kind of like a penalty for actually doing that and for the vm so there's vms that are capable of connecting to ultra disks so if i don't actually connect it to an ultra disk um there's a penalty for that i pay this reservation fee but if we look at the actual pricing details we can see hey
premium it's just this fairly standard sizing based on the size of the disk you pay more money the bigger the disk but when i get to ultra well i pay for the different configurations you have kind of the base iops kind of throughput things but then i pay based on the provisioned iops the disk capacity the provisioned throughput and then again this virtual cpu reservation charge if you don't actually connect it um to an ultra disk if ultra disk compatibility is set on the vm so i independently can tweak those dials based on what i actually need so again depending on the size of the disk i can go up to these various numbers and i separately pay for whatever dials i set based on the size the capacity the iops and the throughput and there are some rules about how often i can kind of change those things and you can kind of read the documentation but fundamentally they're separate dials that i can change so ultra disk is super useful i can go up to those massive numbers a single disk 160 000 iops um now when i think about the storage there's a number of aspects to this firstly normally there's caching so on the os by default it's going to be read write caching which is normally what you want for each data disk i can set do i want just maybe read caching do i want read write do i want nothing like on a database it's very common maybe you wouldn't have any caching active directory domain controller database i wouldn't have caching on that remember i don't put data on the os disk create a data disk put data on the data disk never put data on the temporary drive it's temporary you're going to lose it you're going to have a bad day so the custom caching is set per disk i can hot add and remove these things so disks i can hot add i can hot remove i just posted a video last week about linux storage and i show you there a managed disk i can just add and remove it anytime it's super easy to kind of do that the other thing i can actually do is i can share disks so if i think
about this virtual machine it's connected to this data drive there are some maybe types of database where i actually want a shared disk so when i look at these disks you'll actually see there's a max shares property on them so if we go back and look uh i think it's on this one so look here if i look at the documentation we can see hey look premium ssd ranges depending on the size of the disk there's a max shares limit so if i had kind of a p30 i could have a max shares of up to five for ultra disk it goes from one to five and what this enables me to do is exactly as the name suggests i could have multiple virtual machines connect to the same disk so i could think about hey i've got this vm this vm i've got this data disk i won't do it for the os they can both connect to the same disk now these would have to be in the same for example availability zone if it supports availability zones i can't mount cross zone and it supports things like scsi persistent reservations so i can actually use that as a shared disk in my windows or my linux to maybe put database files if i need that so i can actually have shared disks within the environment again depending on the size of the disk that really impacts how many connections i can actually have i made a big deal about iops and throughput and how the disks have limits so i have to understand what is my requirement on iops and throughput to make sure i pick the right disk sizes remember when we drew the vm the vm also has limits on iops and throughput so i need to make sure when i'm working out my sizing of my complete solution it's no good attaching disks that have this massive amount of iops and throughput if the vm's too small to use it so make sure when i'm planning out remember that the vm also has iops and throughput limits and a maximum number of data disks so plan out what do i need in terms of iops and throughput capacity and then make sure i'm picking a vm whose limits also match or exceed what the
storage is going to provide that i need for my application to actually function so don't add a ton of storage and 160 000 iops ultra disks to a vm that's got a limit of 3000 iops again the logs would kind of show you you're hitting remember those metrics and logs we talked about way at the start that are exposed up to us those metrics and logs i'd be able to see that hey i'm being throttled at the vm or the disk i'll see i'm hitting a limit so i can troubleshoot the issue but we don't want to get into that make sure you plan out you understand what the requirements are and then size everything size the virtual machine and size the storage accordingly now there are other types of storage service yes managed disks are the key one when i think about a virtual machine but remember inside that virtual machine i have an application running see if i think for a second let's just erase that for a second so yes this is a virtual machine but what i care about is there's an application and that app has different storage requirements now if that app storage requirement is local block storage then yes i attach a data disk to it it uses that but we can also think about well there's many other types of storage service in azure remember that storage account i drew well when i think about the storage account i've got to come over here if this is a storage account that i create well it supports things like blob and i can do block blob page blob append blob we have things like tables so those key value pairs we have first in first out queues we have files which today is really kind of smb but nfs is kind of there in preview as well so these will be connecting over the network so my app hey could absolutely connect and use those various types of services that are available there there's also obviously things like database services so again i can think about things like the sql based there's maybe postgres there's a managed offering there's mysql there's mariadb and
obviously i can install anything i want in a vm but these are managed offerings they're just available so i'm not worried about updating databases or the security they're provided for me there are things like redis cache there's all these other things available that my app can use and when i think about the storage account there's different performance tiers for blob and files that can get like a premium experience there's things like azure netapp files so azure netapp files are netapp appliances where i create an account a capacity pool a volume and that volume gets exposed to a virtual network that i can then consume there's things like azure file sync to replicate file shares from on-prem to an azure file share in a storage account and then back to other file shares so there's a huge array of different solutions available way beyond the scope of kind of this overview but don't limit yourself the whole point is there's a lot of different capabilities understand what you're trying to achieve don't reinvent the wheel if there's something out there there's maybe a paas solution i can leverage use it okay so we've gone a long time we've covered vms we've covered kind of the base storage blocks the last part is networking so we drew at the start actually i talked about a region somewhere and i said hey it's connected to this big microsoft backbone and we have this region is a two millisecond latency envelope so if we go now and focus on the networking side and i drew the idea there was a virtual network if i think about the constructs that we have available so i have a subscription and then in my subscription i can use one or more regions and those two boundaries are what i can use to actually create a virtual network a virtual network lives within a particular region in a particular subscription it cannot span regions it cannot span subscriptions so if i had five subscriptions in one region i'd have to have five virtual networks if i had five subscriptions each
of them using stuff in two regions i'd have ten virtual networks and a virtual network is essentially defined as ipv4 cidr ranges all right sets of ip addresses and optionally i can have ipv6 as well i don't have to but i can and then that virtual network is broken up into subnets etc and each subnet is a portion that i specify of these ip ranges so it's a portion of the ipv4 address space if i added an ipv6 address range to the virtual network it can optionally have an ipv6 i always have to have ipv4 i can be dual stack but i can't be ipv6 only i have to have ipv4 i can have multiple ipv4 ranges and that's very common you're going to use the rfc 1918 the 10.x the 172.16 the 192.168 ranges we don't have to if i have my own ip range i can bring it to azure but even if you have your own network that's ip routable if you bring that address space to azure they won't be internet routable uh internet addressable to have something that's internet addressable you have to use the ip addresses azure can allocate to you so i break up into subnets each of these has an ip space so all of these are kind of these private ips so when i create a resource let's say i create for example a virtual machine it gets an ip address it uses dhcp i never statically configure the ip address there's one scenario if i have multiple nics multiple ip configurations per nic but generally you don't it's always going to get the ip via dhcp and the dns configuration so on the virtual network i specify the ip and i can configure what are the dns configurations this could be azure so it uses the azure dns or it can be custom where i can give it kind of the ip addresses of my dns servers which could be like maybe my domain controllers or something else and those get passed by dhcp into the vms now obviously i'm drawing the vm in a subnet it's obviously not in the subnet um the vm has a nic attached to it the nic attaches to a subnet but fundamentally i'm placing the vm inside a subnet vms can have multiple
nics so i could have a vm that actually has nics in multiple subnets like some type of virtual appliance but it can never have nics in multiple virtual networks i'm always bound to a single virtual network and once again the vm has certain numbers in terms of the number of nics and the amount of throughput it can actually have and that's defined when i look at the vm size it will show me what my kind of network performance can actually be now an important point here this is all layer three so ip and the traffic that's really supported is like tcp um udp yeah there's icmp i can ping things those are obviously layer four when you start talking about other layer four things it may work it may not but certainly a lot of the internal azure constructs like network security groups and load balancers they work on protocol they work on tcp udp if you try and send something else through it it's probably not going to work or there's no guarantee it's going to work i can't do things like vlans vlan is layer 2 this is layer 3.
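The address-space rules described above (a virtual network is a set of IPv4 CIDR ranges, each subnet is a portion of that space, and a virtual network cannot span subscriptions or regions) can be sketched with Python's standard `ipaddress` module. This is only an illustration, not anything Azure-specific: the 10.0.0.0/16 address space, the subnet names, and both helper functions are made-up assumptions.

```python
import ipaddress

# Hypothetical address space for one virtual network (an RFC 1918 range).
vnet_space = ipaddress.ip_network("10.0.0.0/16")

# Carve the space into /24 subnets and give the first three illustrative names.
subnet_names = ["frontend", "backend", "dmz"]
subnets = dict(zip(subnet_names, vnet_space.subnets(new_prefix=24)))

# The three private ranges defined by RFC 1918.
RFC1918_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(network):
    """Return True if the network sits entirely inside an RFC 1918 range."""
    return any(network.subnet_of(private) for private in RFC1918_RANGES)


def min_virtual_networks(subscriptions, regions_per_subscription):
    """A virtual network cannot span subscriptions or regions, so at least
    one is needed per (subscription, region) pair that is in use."""
    return subscriptions * regions_per_subscription


if __name__ == "__main__":
    for name, subnet in subnets.items():
        print(name, subnet, "inside vnet:", subnet.subnet_of(vnet_space))
    print("private address space:", is_rfc1918(vnet_space))
    print("vnets for 5 subscriptions x 2 regions:", min_virtual_networks(5, 2))
```

The last call mirrors the example from the talk: five subscriptions each using two regions need ten virtual networks.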
now a virtual network is by default kind of a unit of isolation when i create resources inside here these are all these private ip addresses now if they try and talk to the internet they can there's no special subnet there's no special thing i have to do if i do nothing else i can absolutely go outbound to the internet and get a response depending on the configuration like if i was behind a load balancer which had a public ip it would use the load balancer for the snat to get to the public ip address i'm talking to if i don't azure will just automatically assign something or there's a nat gateway i can add a managed service that the outbound internet traffic would go through so that i could control the ip address or ip prefix again that's a more complex thing there's other videos in the master class on that but by default i don't have to do anything else they can all outbound get to the internet now if i want to be able to offer services to the internet well private ip won't work what i need is a public ip now i can do an instance level public ip so i could say hey this vm or specifically this ip configuration on the nic of the vm you have kind of a public ip one that it doesn't know about but will essentially get redirected to its nic and then things from the internet could talk to it and go to the vm we don't want to do that what we would generally do if we want to offer things to the internet is we have the idea of kind of the azure load balancer the azure load balancer could have a front-end public ip configuration then it would have a back end set and its back end set could point to multiple vms for scale and resilience purposes so that's kind of a front end configuration would be public and then a back end to balance things so if i actually want to get things from the internet um i can do that with kind of a layer four the azure load balancer or i can do an internal private azure load balancer there's also a layer seven the azure app gateway
that adds things like a web application firewall waf and it can do things like cookie based affinity it can do more things because it understands things like http there are different offerings to actually get things in from the internet what about controlling it then so okay great by default anything in the virtual network can talk to anything else in the virtual network they can be in different subnets but they can all talk to each other they can all go outbound to the internet maybe i don't want that i want to be able to actually lock those things down and i guess i should have pointed out i said this was kind of tcp udp icmp one of the things i can't do is there is no multicast there is no broadcast i can't do those things in an azure virtual network i want to control the flow of traffic how do i do that so by default they all have these kind of complete flows between each other there's different ways to control that flow now one of them is network security groups and that's probably the main built-in kind of way to control the flow of traffic so if i think once again i kind of have that virtual network and we have these different subnets a network security group is really just a number of configurations around kind of the source ip destination ip the source and destination ports protocol and then action and i can really think about the action as this kind of allow or deny and then i apply those to the various subnets now i can apply it to a nic they're actually enforced at the nic level but it's hard to manage so really think about creating these nsgs and then i apply it to subnets so it would control the flow of traffic and again there's more detail on these in the master class not going to go into that here but what i could essentially do is i could say hey this dmz is allowed to have inbound from the internet it's not allowed on the others i could say these are allowed to talk but these are not allowed to talk and these are not allowed to talk to the
internet there are things like app security groups where instead of having to use ip addresses i can have tags on the network adapters so that's kind of an easier way so i don't have to be bound by ip ranges there's also things like service tags so various azure services are available via a range of public ip addresses that's very hard for me to track service tags equate to the public ip addresses of those services there's a tag like azure storage in south central so if i want to let it talk to those i could use this special tag for storage in south central and it gives me a more direct path to actually get to it from my virtual network so i think of nsgs as a great way to actually control the flow of traffic there are other things i can do there's things like azure firewall there are appliances that can do those things but nsgs are kind of that built-in micro segmentation at the kind of virtual network level a very common thing you're going to face remember the subscription the region is a boundary it's like well i've got multiple subscriptions or i've got multiple regions but i want them to talk so what we can do is if i have multiple virtual networks i'm just going to draw it out here we can do something called peering and this is on the azure backbone it could be cross-region or in the same region and it now enables these things to talk there is a charge for ingress and egress so you would factor that in it is not transitive so i can think about kind of this spoke a and this spoke b and maybe this is my hub these can talk because they have a peering relationship these cannot talk if i want to enable that i would either have to add a peering relationship between them directly so if i have a lot of them i'll end up with a big mesh or i can set up like azure firewall or a network virtual appliance in the hub and then we can use things called user defined routes routing tables and i can basically say if i
had an appliance let's just call it nva1 and it has an ip address i could say hey in this virtual network if you want to talk to ip address range a your next hop is nva1 so now when it wants to talk over here it knows to send it to this appliance that would then forward it on and this one would have a user defined route that would say hey if you want to talk to ip address range b go to nva1 it's not like a traditional network this doesn't have to have an ip address in that subnet or that network i can actually have a next hop to things in different networks that's fine so i can enable this connectivity if i have my on-prem and i want to connect it the most simple way people start with is i can actually use a site-to-site vpn so i can kind of think about over the internet i have a managed gateway set of appliances here and i have a site-to-site vpn policy based or route based and i'm connecting this ip space to that ip space if i've done this peering then it could be these ip spaces as well another option would be express route so i talked before about this massive microsoft backbone network that connects all the different regions and i said there were kind of these edge these pop points of presence these peering meet me locations so the other thing i can actually do is something called express route and with express route what i'm actually doing is i connect my network to one of these edge locations and then they kind of cross connect me to the microsoft backbone so i've connected my network to the microsoft backbone that's all i've done then over that express route i can add something like private peering that actually now connects me to particular virtual networks there's also microsoft peering that lets me advertise via bgp other types of microsoft service i go into detail on that in the master class that i would then access via this connection as well once i've done this again remember now these peers if i turn on something called allow
gateway transit and use remote gateways these spokes can actually now get to these locations um via that hub network's connectivity so i get that capability kind of just thrown in for me there are various costs of these i'm paying kind of the provider i'm paying for the gateway i'm paying for the express route circuit this can get kind of complicated to manage so one of the things that is actually available it's kind of the same color twice um you might hear this thing called azure virtual wan and if i do azure virtual wan you end up with this managed hub that you don't really have any access to but they then facilitate those peering connections they enable that transitivity they enable the express route and the site-to-site to talk to each other and then this kind of disappears from you it's a managed network but it's used by the azure virtual wan to facilitate that so if you're maybe starting out you could use this service to enable that connectivity there are other services like uh azure express route global reach so if i had multiple locations connected to express route they can actually talk to each other directly over the microsoft backbone um that's something else you can do um something else i want to cover it's obviously a very long video already i guess a key point is when you're deploying and thinking about your services always think about kind of those blast radiuses so really you always want to be thinking about deploying to at least two regions yes within a region i think high availability i want to deploy across availability zones but i want to deploy it to at least two regions doesn't have to be the paired regions there are benefits to that in terms of how microsoft roll out updates they'll never do the same update to the paired regions but always have at least two deployments or if the workload isn't um maybe critical depends on what my recovery time objective is at least be able to redeploy my workloads maybe run a ci cd pipeline template maybe it's
just my data has to be backed up or replicated i mean for virtual machines again we drew the idea that we have this disk there is no concept here yes there are three copies but there's no replication of a managed disk to another region if i want to replicate to the other region in the app i would replicate to an app running in a vm in another region or there is azure site recovery that does a vm level replication to another az or another region so that's how i could replicate at a vm level but from a networking perspective if i end up now with maybe using multiple regions which is kind of the recommendation so i have multiple regions with my workload deployed maybe it's active active maybe it's active passive if a database is the state maybe i'm doing kind of a database managed async replication but what i don't want is for the person using my service to have to think about these two instances of my service so fundamentally i'll probably have like a public endpoint and a dns name at each of these i want a single place that this person can talk to if the workload is http or https i can use azure front door so what that is going to do is use anycast so over all those different points of presence make it available so the user can talk to that and it will then distribute it to my possible workloads but again that's only http https if it's something else i can use traffic manager which is dns based so traffic manager will give me another name and then it can resolve based on performance or routing or whatever in preview there's an azure global load balancer which is a layer 4 that can basically redirect to other public-facing load balancers that's in preview right now but definitely think about as i roll out services to multiple regions i need a way to have the traffic distributed and split across them so that's kind of a very high level state of the union where we are at the start of 2021 so many things again this is a summarization very broad view the master class is 20
hours i do a weekly update so check that out to stay up to date and microsoft has this cloud adoption framework that talks about how to set up a lot of these things as landing zones the networks the connectivity i'd recommend going and looking at those for kind of a good head start but i hope this was useful um and i hope to see you on another video soon take care
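The multi-region point from the closing section (one name for the user, with traffic sent to whichever regional deployment should serve it) can be sketched as a small priority-based selection, the kind of active/passive decision a DNS-based distributor might make. This is a simplified model rather than how Azure Traffic Manager or Front Door are implemented; the `Endpoint` class, the region names, the host names, and the health flags are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    """One regional deployment of the same service (all values illustrative)."""
    dns_name: str
    region: str
    priority: int   # lower number = preferred
    healthy: bool   # result of some health probe, assumed to exist elsewhere


def pick_endpoint(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Return the healthy endpoint with the lowest priority number,
    or None when every region is down."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda e: e.priority)


if __name__ == "__main__":
    deployments = [
        Endpoint("app-scus.example.com", "southcentralus", priority=1, healthy=False),
        Endpoint("app-eus.example.com", "eastus", priority=2, healthy=True),
    ]
    chosen = pick_endpoint(deployments)
    print("resolve to:", chosen.dns_name if chosen else "no healthy endpoint")
```

With the primary region marked unhealthy, the sketch falls back to the second deployment, which is the behavior an active/passive setup relies on.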

Info

Channel: John Savill's Technical Training

Views: 7,209

Keywords: azure, azure cloud, azure overview, azure overview 2021, azure infrastructure overview, beginners guide to Azure

Id: acOzfw3z0Rc

Length: 122min 31sec (7351 seconds)

Published: Tue Jan 05 2021