Introduction to AWS Services

Video Statistics and Information

Captions

Hi all, this is Chetan. Welcome to this lecture. In this lecture we are going to talk about the basic AWS services. If you are not familiar with AWS and don't know much about its services, this lecture is for you. Before starting, I want you to understand a few things about AWS. First, the AWS global data centers. Whenever we use an AWS service, we typically deploy it in some AWS geographical area, and that geographical area is called an AWS region. Across the world there are different AWS regions available to us: in the U.S. there are seven regions, in India there is one region, in Europe there are a couple of regions, and in all there are 20 regions to date, with 5 more regions coming soon. When we deploy services, we can choose which region to deploy in. Every region is further comprised of typically two or more data centers, for high availability of AWS services, and those data centers are called availability zones. We will learn more about them shortly. Also part of the AWS global data centers are what are called edge locations. You can think of edge locations as caching devices that sit in 100-plus cities across the world. Your content, like the media, videos and pictures you watch on, say, Facebook or YouTube, gets cached at the nearest edge location and is delivered to the user from there, which improves performance by lowering network latency. Overall, AWS has 130-plus services. If you have heard of EC2 or S3, these are different AWS services, and we are going to learn more about them in this lecture.
Okay, so as I said, a region is one geographic area. The blue area you see here is an AWS region, and every region consists of typically two or more availability zones (AZs) for high availability of your application. When you design your architecture, you will typically keep your machines in different AZs, so that if one AZ goes down for some reason, you still have a machine running in another AZ and your application stays highly available. We'll talk more about this in the EC2 sessions, which are a different course, but for now you just need to know these basics. I hope you're now familiar with regions and availability zones, so let's move ahead to AWS services. Before that, a quick overview of how services, regions and AZs map to each other. First, the AWS account is the top-level entity: once you have an AWS account, you can deploy your infrastructure in any AWS region. As I said, there are 20 regions as of now, and every region comprises two or more AZs, which is what is shown here. Different AWS services have different scope: account level, region level or AZ level. For example, the billing service works at the account level, which means at the end of the month you get one AWS bill to pay. IAM, which is Identity and Access Management, also works at the account level: you can create as many users as you want, and these users can work across all AWS regions, AZs and services, because IAM works globally. There are more such services that we'll talk about shortly. Then some other services, like S3 and DynamoDB, work at the region level: when you create an S3 bucket, you select the region in which to create it, and similarly for DynamoDB tables.
And then there are further services like EC2, which is a VM, RDS databases, and Elastic Block Store (EBS), which is a disk; all of these work at the AZ level. The scope of these services is the AZ, which means one EC2 instance cannot be in two AZs at the same time. It would be in AZ-1 or AZ-2 or AZ-3, depending on where we launch that machine, and the same goes for the databases and the disks. We will see more services, but from this I want you to understand that different AWS services work at different levels, and this is the scope hierarchy: the AWS account is the top-level entity, under it we have AWS regions, and then we have AZs within a given region. Now let's move to AWS services. There are so many of them: as I said, there are 130-plus AWS services, and we can broadly group them into categories such as compute or analytics, like this. In compute there are EC2, Auto Scaling, Lambda, Load Balancers and the Container Service. Likewise, for data analytics there is EMR, which is a Hadoop service, as well as Kinesis and Athena. Rather than talking about these services category by category, I would like to take an example, so that you can see how each one fits into an architecture, which will probably help you recall which service is used for what. Similarly, there are other categories like storage services and database services, then there are network-related services and management services, and further you have application services and development services as well. This still does not cover all AWS services, but we have listed the widely used and popular ones. Okay, with this, what I want to do next is build one application on paper, and then we will see how to create the same architecture using AWS services.
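The account, region and AZ scopes described above can be written down as a small lookup table. This is only an illustrative sketch in plain Python: the `Scope` enum and the helper names are made up for this example, and the scope assigned to each service simply follows what the lecture says.

```python
from enum import Enum


class Scope(Enum):
    ACCOUNT = "account"
    REGION = "region"
    AZ = "availability zone"


# Scope of each service as stated in the lecture (not an official AWS list).
SERVICE_SCOPE = {
    "Billing": Scope.ACCOUNT,
    "IAM": Scope.ACCOUNT,
    "S3": Scope.REGION,
    "DynamoDB": Scope.REGION,
    "EC2": Scope.AZ,
    "RDS": Scope.AZ,
    "EBS": Scope.AZ,
}


def must_choose(service):
    """Return the placement choices you make when creating a resource."""
    scope = SERVICE_SCOPE[service]
    if scope is Scope.ACCOUNT:
        return []  # account-level services are not tied to a location
    if scope is Scope.REGION:
        return ["region"]
    return ["region", "availability zone"]


def services_at(scope):
    """List the services that work at the given scope, sorted by name."""
    return sorted(name for name, s in SERVICE_SCOPE.items() if s is scope)
```

For example, `must_choose("S3")` asks only for a region, while `must_choose("EC2")` asks for both a region and an availability zone.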
So now, to understand where different AWS services fit into an architecture, we want to build a simple social media application, maybe a mini version of Facebook or Instagram, and then we will see how to design the same architecture using different AWS services. Okay, so our application is, for example, fb.com, and our users will access it using this name. If you want to deploy this application in your on-premises data center, the first thing you will need is a private network. Every company has its private network, and we would also require something like this to make the application secure, of course. The next thing you would require is a web server. To start with, suppose we are a startup: we will probably write a small amount of code, maybe in PHP, and run it in some kind of application server or web server. It should work for maybe 100 users or fewer, and it works fine. Initially our users will access this application using an IP address, so maybe this VM has a public IP and users access it. What happens over time is that you want to extend your application and add some business logic, some UI stuff, login functionality and more. That's where you need a web server as well as an application server, so that all the front-end stuff is taken care of by the web server and all the business logic by the application server. Suppose it's a Facebook kind of application: you connect with different people, so adding that data and everything else is taken care of by the application server. And of course, if you want to extend it further, you need some kind of database, like a relational database, MySQL or even Oracle, whatever you prefer. Right? If you have this kind of application, it works well. It's called a three-tier architecture, and your users are accessing the application using an IP address. Right?
So this works well. Now suppose the app is doing really well, your website is doing really well, there is more traction from users, and at some point your web servers or application servers become a bottleneck. Maybe they are not able to handle the increased load on your application. So what's the solution? Typically we will scale. That scaling can be vertical, which means you increase the capacity of these machines, or it can be horizontal. Typically in a three-tier architecture you will see web servers and application servers scaled horizontally, which means you bring in more web servers and more application servers, like I have shown here. Okay, that's fine: now I have multiple web servers and multiple application servers. But multiple web servers means multiple IP addresses, and now is the time when we need an intelligent entity that can distribute load across these web servers. That's where we bring in a load balancer. If you have heard of load balancers like HAProxy and Nginx, they do something like this: a user sends a request to the load balancer, and it evenly distributes the requests across the back-end servers. Now that we have load balancers and your application is really catching on, you typically don't want your application to be accessed using an IP address. You want people to access it with a domain name, say fb.com, and that's where you need a DNS service, where you can map your domain name to, probably, the load balancer's IP address. Right? Okay, so far so good. Your application is three-tier and it is working well. Now it catches on further and you have a lot of data: the number of friends is growing, the number of connections is growing, the number of posts is growing, and that's where your relational database cannot really serve this kind of data storage.
You cannot do that with relational databases alone. For this you need a scalable database, and for connection information and the like it makes more sense to go for a NoSQL database. So what will you do? Bring in a NoSQL database like MongoDB or Cassandra, whichever you want. So some part of the data is stored in relational databases and the rest is stored in non-relational, or NoSQL, databases. But your relational databases could still be a performance bottleneck. Maybe there are read-heavy operations happening on this database, and for that you will typically bring in one more component, which is called a database cache. So you bring in a database cache engine like Redis or Memcached, where you can query the frequently accessed data, so that your application servers don't hit the database for it and those requests are served from the cache engine. Okay, so this is a fairly better architecture than where we started. Next, as you know, Facebook might be getting millions of pictures and videos uploaded daily. The disks attached to the VMs are not really capable of being extended on the fly. They have size limitations, and that's why media and pictures are typically never stored on these web servers or application servers. For this you need some unlimited kind of storage, which is an external storage, and it does not necessarily have to be block storage like your disk. It can be file storage, like a shared filesystem, or some external storage like Google Drive, if you are aware of that. So you need some external storage where you store this information, and using external storage solves your storage capacity problem. That's fine, so far so good! Next, when you upload videos or photos, you need some kind of content filter, because maybe a video you are uploading has objectionable content, or some pictures contain nudity.
So you need a content filter that can do this on the fly, and only then should those pictures and videos actually be stored here in the external storage. So we bring in one more component there. Right? Okay, that's fine. Now, you also know that Facebook shows a lot of ads, and it is continuously watching what you are doing while you are on the Facebook page: what kind of products you like, what kind of posts you like. Based on that it gives you suggestions and friend requests, and shows you a lot of ads, right? This is called clickstream analysis. Every click is captured somewhere and analyzed in real time, so you need some kind of clickstream analysis engine there, right? Let us take another example, Twitter. What tweets are trending? What's the mood of the people? All of this is done using clickstream analysis. On Facebook you have something like this too. Now, all the data this clickstream analysis engine captures has to be stored somewhere, so you need external storage, like this storage, for it. Further, you want to take this data and do some data operations: maybe you need to run some aggregations, you need to sort your data, and you need to find some meaning in that data. That's where you need some kind of Hadoop platform, which can perform the computation on distributed systems. Right? So you need a Hadoop platform, and over time you would also require a data warehouse. Why? Because Facebook does a lot of data analytics, right? Maybe at the end of the year they want to find out which kinds of users are accessing Facebook more. What is their age? Which region do they come from? How is a particular feature of Facebook being used, so that they can concentrate more on those kinds of features? What is trending?
All this information is extracted by storing the data in some kind of data warehouse engine and then doing business intelligence on top of it. So you need a business intelligence tool that can query this data and analyze it, and then reports are generated from which Facebook can take decisions, like: next year maybe this is our strategy, or we will focus on this area or that area. So you can drive business decisions based on the results of the analytics. Okay! This is more on the back-end side, which the end user does not really see, but it is happening there. Okay, so far so good, we have extended our architecture. Next, all these photos and videos can be served directly over the internet. Consider this storage to be like Google Drive: you can stream your videos and view pictures directly from it. Users might come from a web browser and look at whatever was posted. Suppose you have posted a video: they can watch that video here. But nowadays users also come from mobile devices and will watch your videos on a mobile phone, and in that case you need the same videos but probably in a different format, because a mobile device might play a different video format. For this we will typically need some kind of video converter in between, so that whenever a user uploads a video, it is immediately converted into a mobile-friendly format. All right? So you need some computing power here as well. Okay, so we will introduce that as a video converter here. Next, as I said, all these photos and videos are typically served from the external storage, but you know that whenever a video goes viral, millions of users watch it.
Now, if that video is fetched from this location every time, this might become a bottleneck, or you may pay a price, because your data is flowing out to the internet and there is a lot of data usage for your videos. To solve this problem, you need something called a CDN, a Content Delivery Network, which caches these videos and pictures on the caching devices nearest to where the user is accessing them from. Right? So when all the users in that geography want to watch the same video, it is served from the nearby cache and not from the original storage, so the user experiences low latency and gets a better experience. Applications like Instagram, Facebook or YouTube largely rely on content delivery networks to serve their content. Okay, so far so good! We have extended the architecture further. Now, you know Facebook also sends you mobile notifications, right? There is a new friend request, or there are likes on your post. For this we need some kind of notification service, right? Maybe you get an SMS or a mobile push notification, so you need that service. It also sends you emails for various activities. You can disable that, but there is an option to opt in to email as well, right? Further, you can also chat with your friends, and for this a queue is typically used. If you have heard of message queues like RabbitMQ, JMS queues or IBM MQ, these are all queue services that provide a first-in, first-out kind of data structure, so for chatting you may require some kind of queue service as well. Okay, so if we consider all these services, it's a bare-minimum social media application. I'm sure there must be many more components, but we are sticking to these for now. And finally, once you deploy this architecture, you want to monitor it continuously: How are my VMs doing? How are my databases doing? How is my storage doing? How much storage is there?
For all this, you need some kind of monitoring service and dashboards, like a production dashboard, where you can monitor the health of your application. Okay, so overall this will be your architecture, and this would probably be deployed on-premises. Now let's see how we would do the same thing on AWS. First, the private network you see here is called a VPC, Virtual Private Cloud, in the AWS world. It is not exactly the way it is shown here, because some of these services sit outside the VPC and I cannot accommodate that in the diagram, but consider the VPC as one private, isolated network that AWS gives you. You then have to manage a public network for the web servers and load balancer and a private network for the databases. That is a separate discussion, but the VPC is the network service. Now, all these VMs that we are talking about are nothing but EC2 machines, right? And the disk that we attach is called EBS, Elastic Block Store, and these disks have a maximum size limit. So EC2 and EBS give you the VMs on which you will typically deploy your applications, whether web servers or app servers. Further, you can enable auto scaling for EC2, which means that if the load on these EC2 instances increases, they can scale out horizontally and automatically, and if the load decreases, they can scale back in. Maybe from 2 machines they go to 10 machines, and from 10 they come back to 2, depending on the load. You can configure that using the auto scaling feature of AWS EC2. For relational databases there is a service called RDS, and for NoSQL databases there is a service called DynamoDB. For DB caches there is a service called ElastiCache, and it comes with Redis and Memcached engines.
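To make the database cache idea concrete, here is a minimal sketch of the read path usually called cache-aside. It uses plain Python dictionaries as stand-ins for the relational database and for a Redis or Memcached engine, so it makes no real ElastiCache or RDS calls; the `CachedStore` class and the `user:1` key are invented for illustration.

```python
class CachedStore:
    """Cache-aside reads: look in the cache first, fall back to the database."""

    def __init__(self, database):
        self.database = database  # stand-in for the relational database
        self.cache = {}           # stand-in for a Redis/Memcached engine
        self.db_reads = 0         # counts how often the database is hit

    def get(self, key):
        if key in self.cache:
            # Cache hit: the database is not touched at all.
            return self.cache[key]
        # Cache miss: read from the database and remember the result.
        self.db_reads += 1
        value = self.database[key]
        self.cache[key] = value
        return value


store = CachedStore({"user:1": "Chetan"})
first = store.get("user:1")   # miss, goes to the database
second = store.get("user:1")  # hit, served from the cache
```

After the two reads, `store.db_reads` is 1: only the first request reached the database, which is the effect the lecture describes for read-heavy workloads. A real cache would also need expiry and invalidation when the underlying data changes, which this sketch leaves out.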
Okay, further, as you see there is a load balancer, and in Amazon there is a service called ELB, Elastic Load Balancing, which can distribute the incoming traffic across multiple back-end EC2 machines like this. If you want your domain name to map to your load balancer, you need a DNS service, which is called Route 53. Okay, great! Now let's talk about the other components. For external storage there is Amazon's S3 service, Simple Storage Service, right? It is unlimited storage: you can just go on dumping data into it, it is accessible directly over the internet, and there is no limit on how much data you can store in your S3 buckets. You also need a content filter, and there is a service called Rekognition that can detect objects in images, so you can filter content out before you upload it to, say, S3 buckets. Okay, now as I said, you need some kind of service where your videos get converted from one format to another, like MP4 to some mobile-friendly format. One option is to run some EC2 machines that continuously watch your S3 buckets for new videos; as a new video comes in, they download it, convert it and put it into another bucket. That's one option, but there is a better one: the Lambda service. Lambda is Amazon's serverless service, where you just write code. In that code you specify how to, say, convert a video, and you can have this Lambda function execute whenever there is a new upload into your S3 bucket. So a new video comes in, Lambda gets triggered, it converts your video, and maybe you have put in logic that stores the converted video in another S3 bucket. So here there are no servers to manage! Everything is taken care of by Lambda functions, and they scale automatically. Okay, so we have Lambda there. Now let's talk about this clickstream analysis.
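As a rough sketch of the Lambda idea above, the function below has the shape of a Python Lambda handler that receives an S3 upload notification and works out where the converted video should go. The bucket names, the `convert_to_mobile` helper and the output naming scheme are all hypothetical, and the actual video conversion and S3 calls are left out so the example runs without any AWS access.

```python
from urllib.parse import unquote_plus

OUTPUT_BUCKET = "example-mobile-videos"  # hypothetical destination bucket


def convert_to_mobile(bucket, key):
    """Placeholder for the real conversion step; only derives the new key."""
    base = key.rsplit(".", 1)[0]
    return f"{base}-mobile.mp4"


def handler(event, context):
    """Entry point that Lambda would call for each S3 upload notification."""
    results = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        new_key = convert_to_mobile(bucket, key)
        results.append({
            "source": f"{bucket}/{key}",
            "target": f"{OUTPUT_BUCKET}/{new_key}",
        })
    return results


# A trimmed-down example of the notification payload, for local testing.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "example-uploads"},
                "object": {"key": "videos/my+trip.mov"}}}
    ]
}
output = handler(sample_event, None)
```

Because the handler only depends on the event it is given, it can be tried locally with a sample payload like the one above before being wired to a real bucket.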
For clickstream analysis there is a service called Kinesis, which can capture your clickstream data. You can then analyze that data, you can store it in S3, and you can do much more with whatever data you capture. Right? For the Spark or Hadoop platform there is a service called EMR. EMR handles operations like aggregation and sorting, and you can run distributed jobs such as Spark jobs and Flink jobs, all in this managed Hadoop cluster. You also need to do ETL operations on your DynamoDB tables. Maybe you want to work out who all your friends are, who your friends' friends are and what activities they are doing, and you want to continuously push new posts to your wall. All this is done in real time using clickstream analysis. Then, at the end of the year, maybe you want all this data to be extracted, converted into a different format and cataloged, and then processed further using EMR. For these extract, transform and load (ETL) operations you need the Glue service, right? And finally, all the data that you process, or whatever data you have, can be stored in the data warehouse service, which in Amazon is Redshift. Redshift is a data warehousing service that can store petabyte-scale data, and you can perform analysis on that data. To perform this analysis and see the results you need a BI tool. There are various BI tools in the market, but in Amazon you will use Amazon QuickSight. You can also use Athena, which is a SQL query interface: you can pull data from S3, run a SQL operation on it, and view the results in QuickSight. You can build graphs and charts and get insights into your data, and based on that you will take business decisions. So QuickSight is the BI service from Amazon. Okay, so far so good!
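The kind of aggregation and sorting mentioned above can be illustrated on a tiny scale. The sketch below counts "like" events per topic in a handful of made-up click records using only the Python standard library; in the architecture from the lecture the same idea would run over data captured by Kinesis and processed on EMR, which this example does not attempt to reproduce.

```python
from collections import Counter

# Made-up click events; the field names are assumptions for this sketch.
clicks = [
    {"user": "u1", "action": "like", "topic": "travel"},
    {"user": "u2", "action": "like", "topic": "travel"},
    {"user": "u3", "action": "like", "topic": "food"},
    {"user": "u1", "action": "view", "topic": "sports"},
    {"user": "u2", "action": "like", "topic": "food"},
    {"user": "u3", "action": "like", "topic": "travel"},
    {"user": "u1", "action": "like", "topic": "sports"},
]


def top_topics(events, n=2):
    """Aggregate likes per topic and return the n most liked topics."""
    counts = Counter(e["topic"] for e in events if e["action"] == "like")
    return counts.most_common(n)


trending = top_topics(clicks)
```

Here `trending` lists travel first and food second, the sort of result that could then feed suggestions or ads.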
We introduced a lot of AWS services here; now let's move to this side. As I said, there is a content delivery network that can cache your static content, and for this Amazon has the CloudFront service. CloudFront stores, or caches, your data in edge locations. As I said, these edge locations are in 100-plus cities across the world, and when you use CloudFront, your data from S3, or wherever you store it, gets cached in the edge location nearest to where the user is coming from, and is then served from that edge location for all the users in that geography. Okay, so that's CloudFront. Now let's talk about this side. As I said, you need to send messages and mobile push notifications, and in Amazon there is a service called SNS, Simple Notification Service, for that. If you want to send emails, including bulk emails, there is SES, Simple Email Service. For the message queues behind a chat application, Amazon has built its own queue service, which is called SQS, Simple Queue Service. And finally, to monitor all this infrastructure (How are my EC2 instances doing? What is the CPU utilization of EC2? How is the database doing?), there is a service called CloudWatch, which can monitor all of this in real time. You can even set alarms: if the average CPU utilization goes beyond, say, a certain percentage, send an email or alert to the administrator, or take some action such as auto scaling. All this can be done using CloudWatch alarms. Okay, so I think we have completely replaced what we built on-premises with AWS services, and I hope you got some idea about these basic AWS services. Next we want to see some more AWS services, so let's look at some application services.
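The alarm behaviour described above, "if average CPU goes beyond a threshold, take some action", can be sketched as a small function. This is a simplified model written for this transcript, not the CloudWatch API: the `alarm_state` function and its parameters are hypothetical, although the three state names match the ones CloudWatch alarms use.

```python
def alarm_state(datapoints, threshold, periods):
    """Evaluate a simple threshold alarm over the most recent datapoints.

    Returns "ALARM" when each of the last `periods` values is above the
    threshold, "OK" otherwise, and "INSUFFICIENT_DATA" when there are
    fewer than `periods` values to look at.
    """
    recent = datapoints[-periods:]
    if len(recent) < periods:
        return "INSUFFICIENT_DATA"
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"


# Average CPU utilization (percent) per period for one group of machines.
cpu_history = [40, 85, 90, 95]
state = alarm_state(cpu_history, threshold=80, periods=3)
```

With the sample history the last three periods are all above 80 percent, so the state is "ALARM"; in a real setup that state change is what would trigger the email or the auto scaling action.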
As you know, Facebook, Twitter, other web services and even Amazon itself expose their services through API calls so that third-party applications can integrate with them, and for that they need a REST API service where they can expose all their APIs. In Amazon you can have a managed API gateway, API Gateway, which takes care of scaling, throttling and everything else, so you just write the code and definitions for your APIs and deploy them in API Gateway. Also, as mobile usage increases alongside your web users, you need to manage your users' identities. When you develop an application, your users must sign up to it, right? That means you need to manage your user pools, their access and everything else, and for that you need a user management service. In AWS that service is called Cognito. Right? So these are more application services that we can use here. Now let's move ahead and talk about the security services in this architecture. As you know, there is one primary service for managing all access in your AWS account: all your AWS users, what access they have, what services they can use. Even when one AWS service, say EC2, wants to upload data to S3, EC2 needs permissions to do that. All of this access, authentication and authorization is managed using Amazon's IAM service, Identity and Access Management. It's one of the most important services for securing your AWS account as well as your services. Next, you can also encrypt your data that is stored at various storage locations. EBS is block storage, like a disk attached to EC2, and you can encrypt that data. Data that is stored in S3, EMR, Redshift, queue messages, databases, caches: all of this you can encrypt using Amazon's KMS, Key Management Service. So it manages all the encryption keys for you.
You don't need to have your own secure location where you store your keys and do the encryption. Further, as you know, this application will probably be accessed over HTTPS, which is an SSL-enabled connection, because obviously if users are doing transactions, or don't want to lose their important information, you would secure that communication, and for this you need digital certificates, right? You deploy that certificate either on load balancers or on CloudFront so that your communication is secure. For this Amazon has a service called ACM, AWS Certificate Manager. Okay, next, as you know, we can also have application firewalls. The application firewall service is called WAF, Web Application Firewall, and it takes care of attacks. It can prevent things like cross-site scripting and SQL injection, and even DDoS attacks; WAF can protect your application from these attacks. You will typically deploy it on CloudFront, on load balancers or in front of the API gateways that we saw on the earlier slide, so that you are safe. There are various other ways to secure a VPC, such as public and private subnets, which we will see in the detailed VPC session in the Networking in AWS lectures, but here we are talking about application-level firewalls, so that's WAF. And if you're going for some kind of compliance, for example PCI DSS or HIPAA compliance, your machines need to be patched properly and should be free from known vulnerabilities, or CVEs, right? For that there is a service called Amazon Inspector. What does it do? It puts an agent inside your machines, scans them for known vulnerabilities, and then gives you reports saying: out of all these machines, we found these vulnerabilities on these ones, go and fix them. So Inspector can give you insights into what's inside your machines.
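Going back to the IAM point made earlier, that EC2 needs permission before it can upload data to S3, the sketch below builds the kind of JSON policy document IAM uses for such a permission. The bucket name `example-uploads` and the helper function are placeholders for illustration; attaching the policy to a role for the EC2 instance is a separate step not shown here.

```python
import json


def s3_upload_policy(bucket_name):
    """Build a policy document that allows uploading objects to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # Objects inside the bucket, not the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


policy = s3_upload_policy("example-uploads")  # hypothetical bucket name
policy_json = json.dumps(policy, indent=2)
```

The policy grants a single action on a single bucket, which reflects the usual advice to give each service only the access it actually needs.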
Okay, so these are the primarily used security services. There are more, but I think we will restrict our discussion to these services for now. Next, we want to see some development and DevOps services. As you see, this architecture has a lot of AWS services, and all of them are connected. If you want to deploy everything by hand, manually, I would say it will take maybe a couple of days, even without making any errors; if errors have to be detected and fixed manually as well, it will probably take two or three days. But AWS gives you the ability to code your infrastructure, which is called infrastructure as code. For that you have a service like CloudFormation. What does it do? It takes a template from you, in JSON or YAML format, and it creates this infrastructure from scratch for you, within maybe 30 minutes depending on the size; typically I have seen it create all these resources in 30 minutes at most. It's a very powerful service that can provision your infrastructure from scratch. Right? Now, this CloudFormation template will be written by some DevOps people, and at the same time you have your developers and QA, where developers are writing code for your product and maybe QA is writing test cases and automation test cases. Everybody needs some kind of code repository, like a Git repository, and for that AWS has the CodeCommit service, where they can check in their code. Even the CloudFormation template is nothing but JSON or YAML code, so your DevOps people will write it as a template, and the CloudFormation service will take that template and create this infrastructure. Once you have this infrastructure up, you need your actual product to be built, and for that you need the CodeBuild service.
So AWS CodeBuild will take the source code in whichever language you have written it, Java or whatever, and build it using some kind of build tool like Ant or Maven. While building it will also run some unit tests, and finally it will produce some artifacts. Artifacts are your EXEs or binaries, basically your application executables. So CodeBuild will build that and test it, and then you have to deploy it. That means you have to put the EXEs and binaries it produces onto the EC2 machines where your application is actually running. So you need a deployment step, and for this you have the CodeDeploy service. All right? If you know about DevOps, you have heard the terms CI and CD, so this is your CI pipeline, a continuous integration pipeline, or you can say a continuous delivery pipeline. If you want this automated, so that developers write the code and check it in, and it automatically gets built, automatically tested and automatically deployed to the corresponding application servers running in EC2, then you can use the CodePipeline service. Right? So you can build your complete CI platform here using these three services. If you want to further integrate all of this with project management tools, maybe JIRA or some bug tracking tool, to see the speed of your development and so on, there is a service called CodeStar, which integrates very well with Atlassian JIRA and other tools. So you have complete SDLC control if you use these development and DevOps services. Okay, so I think it is now clear where these development and deployment services are used. If you have come this far, you now know about most of the core AWS services for compute, analytics, storage, security, application and deployment. Thank you.
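As a small illustration of the infrastructure-as-code idea from the CloudFormation part of the lecture, the sketch below assembles a minimal template as a Python dictionary and serializes it to JSON. The logical name `MediaBucket` and the description are made up, and the template declares just one S3 bucket, far less than the full architecture discussed here; it is meant only to show what "a template in JSON format" looks like.

```python
import json

# A deliberately tiny template: a single S3 bucket with default settings.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Sketch: one bucket for uploaded media",
    "Resources": {
        "MediaBucket": {  # hypothetical logical name
            "Type": "AWS::S3::Bucket",
        }
    },
}

# This JSON text is what would be checked into a repository and handed
# to CloudFormation to create the resources.
template_body = json.dumps(template, indent=2)
```

Because the template is plain text, it can live in the same code repository as the application and go through the same review and pipeline steps.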

Info

Channel: AWS with Chetan

Views: 2,202,333

Keywords: AWS services overview, aws architecture, aws training, introduction, high level overview, aws beginner, getting started, bird eye view, start, highlevel, cloud practitioner, aws certification, associate certification, cloud, what is aws, how aws works, how aws services work, compute, storage, network, databases, devops, cloudformation, security, architecture

Id: Z3SYDTMP3ME

Length: 38min 54sec (2334 seconds)

Published: Sun Jun 09 2019