Introduction to AWS Services

Captions
Hi all, this is Chetan. Welcome to this lecture, in which we are going to talk about the basic AWS services. If you are not familiar with AWS and don't know much about its services, this lecture is for you. Before we start, I want you to understand a few things about AWS.
First, AWS global data centers. Whenever we use an AWS service, we typically deploy it in some geographic area, and that geographic area is called an AWS region. There are different AWS regions available across the world: in the U.S. there are seven regions, in India there is one region, in Europe there are a couple of regions, and in all there are 20 regions at the moment, with 5 more regions coming soon. When we deploy services, we can choose which regions to deploy in. Every region is in turn made up of typically two or more data centers, for high availability of AWS services, and those data centers are called availability zones. We will learn more about them shortly.
AWS global data centers also include something called edge locations. You can think of edge locations as caching devices spread across 100-plus cities around the world. Content such as the videos and pictures you watch on Facebook or YouTube gets cached at the nearest edge location and is delivered to the user from there, which improves performance by lowering network latency. Overall, AWS has 130-plus services. If you have heard of EC2 or S3, these are different AWS services, and we are going to learn more about them in this lecture.
As I said, a region is one geographic area. The blue area you see here is an AWS region, and every region consists of typically two or more availability zones (AZs) for high availability of your application. When you design your architecture, you will typically keep your machines in different AZs, so that if one AZ goes down for some reason, you still have a machine running in another AZ and your application stays highly available. We'll talk more about this in the EC2 sessions, which are a different course; for now you just need to know about these things.
I hope you're now familiar with regions and availability zones, so let's move ahead. I want to talk about AWS services, but before that, here is a quick overview of how services, regions and AZs map to each other. First, an AWS account is a top-level entity: once you have an AWS account, you can deploy your infrastructure in any of the AWS regions. As I said, there are 20 regions as of now, and every region comprises two or more AZs; that's what is shown here.
Different AWS services have different scopes: account level, region level or AZ level. For example, the billing service works at the account level, which means that at the end of the month you get one AWS bill to pay. IAM, which is Identity and Access Management, also works at the account level: you can create as many users as you want, and all these users can have access to all AWS regions, AZs and services, because IAM works globally. There are more such services that we'll talk about shortly. Other services, like S3 and DynamoDB, work at the region level: when you create an S3 bucket you select the region in which to create it, and the same goes for DynamoDB tables.
Then there are further services like EC2, which is a VM, RDS databases, and Elastic Block Store (EBS), which is a disk. All of these work at the AZ level. That means one EC2 instance cannot be in two AZs at the same time; it will be in AZ-1 or AZ-2 or AZ-3, depending on where we launch that machine, and the same goes for the databases and the disks. We will see more services, but from this I want you to understand that different AWS services work at different levels, and this is the scope: the AWS account is a top-level entity, under which we have AWS regions, and within a given region we have AZs.
Now let's move to AWS services. As I said, there are 130-plus AWS services, and we can broadly categorize them into groups such as compute and analytics. In compute there are EC2, Auto Scaling, Lambda, load balancers and the container service. For data analytics there are, say, EMR, which is a Hadoop service, Kinesis and Athena. Similarly there are other categories: storage services, database services, network-related services, management services, and application and development services as well. This still does not cover all AWS services, but we have listed the widely used and popular ones.
Rather than talking about these services one by one in this fashion, I would like to take an example, so that you can map how each one fits into an architecture; that will probably help you recall which service is used for what. So what I want to do next is build one application, and we will see how to create the same architecture using AWS services.
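The account, region and AZ scoping described above can be pictured as a small lookup. This is only a sketch: the scope labels follow the grouping given in the lecture, while the region names, AZ names and the two helper functions are made up for illustration and are not part of any AWS API.

```python
# Scope of each service as grouped in the lecture (illustrative, not an AWS API).
SERVICE_SCOPE = {
    "Billing": "account",
    "IAM": "account",
    "S3": "region",
    "DynamoDB": "region",
    "EC2": "az",
    "RDS": "az",
    "EBS": "az",
}

# Hypothetical account layout: region and AZ names are invented for this sketch.
ACCOUNT = {
    "region-a": ["az-1", "az-2", "az-3"],
    "region-b": ["az-1", "az-2"],
}


def placement_choices(service):
    """Return what has to be chosen when creating a resource of this service."""
    scope = SERVICE_SCOPE[service]
    if scope == "account":
        return []
    if scope == "region":
        return ["region"]
    return ["region", "availability zone"]


def valid_placement(service, region=None, zone=None):
    """Check a requested placement against the hypothetical ACCOUNT layout."""
    needed = placement_choices(service)
    if "region" in needed and region not in ACCOUNT:
        return False
    if "availability zone" in needed and zone not in ACCOUNT.get(region, []):
        return False
    return True
```

In this model an account-level service needs no placement at all, a region-level service needs a region, and an AZ-level resource such as an EC2 instance must name exactly one AZ inside one region.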
To understand where different AWS services fit into an architecture, we want to build a simple social media application, maybe a mini version of Facebook or Instagram, and then we will see how to design the same architecture using different AWS services. Say our application is fb.com; our users will access it using this name.
If you want to deploy this application in your on-premises data center, the first thing you will need is a private network. Every company has its private network, and we would require something like this too, to make the application secure. The next thing you would require is a web server. Suppose we are a startup: we will probably write a small amount of code, maybe in PHP, and run it on some kind of application server or web server. It should work for maybe 100 users or fewer, and it works fine. Initially our users will access this application using an IP address, so maybe this VM has a public IP that users connect to.
Over time you want to extend your application and add some business logic, some UI, login functionality and more. That's where you need a web server as well as an application server, so that all the front-end work is taken care of by the web server and all the business logic is taken care of by the application server. Suppose it's a Facebook kind of application: you connect with different people, and adding that data is handled by the application server. To extend it further you need some kind of database, a relational database like MySQL or Oracle, whatever you prefer. An application built this way works well; it's called a three-tier architecture, and your users are still accessing it using an IP address.
This works well. Now suppose the app is doing really well, there is more traction from users, and at some point your web servers or application servers become a bottleneck; maybe they are not able to handle the increased load on your application. What's the solution? Typically we scale. That can be vertical scaling, which means you increase the capacity of these machines, or horizontal scaling. In a three-tier architecture you will typically see web servers and application servers scaled horizontally, which means you bring in more web servers and more application servers, like I have shown here.
That's fine: now I have multiple web servers and multiple application servers. But multiple web servers mean multiple IP addresses, and this is where we need an intelligent entity that can distribute load across these web servers. That's where we bring in the load balancer. If you have heard of load balancers like HAProxy and Nginx, they do something like this: a user sends a request to the load balancer, and it evenly distributes requests to the back-end servers.
Now we have load balancers too, and your application is really catching on. Typically you don't want your application to be accessed using an IP address; you want people to access it with a domain name, say fb.com. That's where you need a DNS service, where you can map your domain name to, probably, the load balancer's IP address. So far so good: your application is three-tier and it is working well.
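The "evenly distributes" behaviour of the load balancer described above can be sketched with a simple round-robin rotation. This is a minimal illustration, assuming round-robin as the distribution strategy; the class name and the back-end IP addresses are hypothetical, and real load balancers such as HAProxy or Nginx offer other strategies and health checks as well.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next back-end server in turn."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the backend list forever, in order.
        self._backends = cycle(list(backends))

    def route(self, request):
        """Return the (backend, request) pair chosen for this request."""
        backend = next(self._backends)
        return backend, request


# Hypothetical private IPs of three web servers behind the balancer.
lb = RoundRobinBalancer(["10.0.1.10", "10.0.1.11", "10.0.1.12"])
targets = [lb.route(f"GET /feed?user={i}")[0] for i in range(6)]
```

With three back ends and six requests, each server receives exactly two requests, which is the even spread the lecture refers to.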
Now it catches on further and you have a lot of data: the number of friends is growing, the number of connections is growing, the number of posts is growing. That's where your relational database cannot really serve this kind of data storage. For this you need a scalable database, and for connection information and the like it makes sense to go for a NoSQL database instead. So you bring in a NoSQL database like MongoDB or Cassandra, whichever you want. Some part of the data is stored in relational databases and the rest is stored in non-relational, or NoSQL, databases.
Your relational databases could still be a performance bottleneck; maybe there are read-heavy operations happening on them. For that you typically bring in one more component, a database cache. You bring in a cache engine like Redis or Memcached, which holds the frequently accessed data, so that your application servers don't hit the database for it and those requests are served from the cache instead. This is a fairly better architecture than where we started.
Next, as you know, Facebook might be getting millions of pictures and videos uploaded daily. The disks attached to the VMs are not really capable of growing on the fly; they have size limitations, which is why media such as pictures is typically never stored on the web servers or application servers. For this you need some practically unlimited storage, and that is external storage. It does not necessarily have to be block storage like your disk. It can be file storage like a shared filesystem, or some external storage like Google Drive, if you are aware of that. So you need external storage where you store this information.
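The database cache mentioned above is often used in a "cache-aside" style: look in the cache first and go to the database only on a miss. The sketch below uses a plain dictionary in place of Redis or Memcached and a dictionary lookup in place of a real database query; the class name and sample data are assumptions made for illustration.

```python
class CacheAside:
    """Serves reads from an in-memory cache, falling back to the database."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # function that performs the slow read
        self._cache = {}                   # stands in for Redis or Memcached
        self.db_reads = 0                  # counts how often the database is hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]        # cache hit: no database access
        self.db_reads += 1
        value = self._load_from_db(key)    # cache miss: read from the database
        self._cache[key] = value           # remember it for the next request
        return value


# Hypothetical user profiles standing in for a relational table.
PROFILES = {"u1": {"name": "Asha"}, "u2": {"name": "Ravi"}}
profiles = CacheAside(PROFILES.get)
first = profiles.get("u1")
second = profiles.get("u1")
```

The second read of the same key never reaches the database, which is how a cache relieves a read-heavy relational database. A real deployment would also need expiry and invalidation, which this sketch leaves out.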
Using external storage solves your storage capacity problem. So far so good. Next, when users upload videos or photos, you need some kind of content filter: maybe a video has objectionable content, or some pictures contain nudity. You need a content filter that can check this on the fly, and only then should those pictures and videos be stored in the external storage. So we bring in one more component there.
You also know that Facebook shows a lot of ads. It is continuously watching what you do while you are on the Facebook page, what kind of products you like, what kind of posts you like, and based on that it gives you suggestions, friend recommendations and a lot of ads. This is called clickstream analysis: every click is captured somewhere and analyzed in real time, so you need some kind of clickstream analysis engine. Take another example, Twitter: which tweets are trending, and what is the mood of the people? All of this is done using clickstream analysis, and Facebook has something like it too.
All the data that this clickstream analysis engine captures has to be stored somewhere, so you need external storage like this for it as well. You then want to take this data and run some data operations on it: maybe aggregations, sorting, and finding some meaning in the data. That's where you need some kind of Hadoop platform, which can perform the computation on distributed systems. Over time you would also require a data warehouse.
Why? Because Facebook does a lot of data analytics. Maybe at the end of the year they want to find out which kinds of users access Facebook more, what their ages are, which regions they come from, and how a particular feature of Facebook is being used, so that they can concentrate more on those kinds of features. What is trending? All this information is extracted by storing the data in some kind of data warehouse engine and then doing business intelligence on top of it. So you need a business intelligence tool that can query and analyze this data and generate reports, from which Facebook can take decisions such as "next year this is our strategy" or "we will focus on this area or that area". You can drive business decisions based on the results of this analytics. This is more on the back-end side, which the end user does not really see, but it is happening there. So far so good: we have extended our architecture.
Next, all these photos and videos can be served directly over the internet. Consider this storage to be like Google Drive: you can stream your videos and view pictures directly from it. Users might come from a web browser and watch whatever has been posted; if you have posted a video, they can watch it here. But nowadays users also come from mobile devices and watch your videos on a mobile phone, and in that case you need the same videos in a probably different format, because a mobile device might play a different video format. For this we typically need some kind of video converter in between, so that whenever a user uploads a video it is immediately converted into a mobile-friendly format. So you need some computing power here as well.
We will introduce that here as a video converter. Next, as I said, all these photos and videos are typically served from the external storage. But whenever a video goes viral, millions of users watch it. If the video is fetched from this location every time, this might become a bottleneck, or you may pay a price because a lot of data is flowing out to the internet for your videos. To solve this problem you need something called a CDN, a Content Delivery Network, which caches these videos and pictures on the caching devices nearest to where the users are accessing them. When all the users in that geography want to watch the same video, it is served from the cache rather than from the original storage, so users see lower latency and get a better experience. Applications like Instagram, Facebook or YouTube largely serve their content through content delivery networks. So far so good: we have extended the architecture further.
You know that Facebook also sends you mobile notifications: there is a new friend request, or there are likes on your post. For this we need some kind of notification service; maybe you get an SMS or a mobile push notification. It also sends you emails for various activities. You can disable that, but there is the option to opt in to email as well, so you need an email service.
Further, you can also chat with your friends, and for this typically a queue is used. If you have heard of message queues like RabbitMQ, JMS queues or IBM MQ, these are all queue services that provide a first-in, first-out kind of data structure. So for chat you would probably require some kind of queue service as well.
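The first-in, first-out behaviour of a message queue mentioned above can be shown with a few lines of Python. This is only an in-process sketch using `collections.deque`; the `ChatQueue` class and the sample messages are hypothetical, and a real queue service adds durability, delivery guarantees and network access on top of this idea.

```python
from collections import deque


class ChatQueue:
    """A minimal in-memory FIFO queue for chat messages."""

    def __init__(self):
        self._messages = deque()

    def send(self, sender, text):
        # New messages join the back of the queue.
        self._messages.append((sender, text))

    def receive(self):
        # The oldest message leaves from the front; None means the queue is empty.
        return self._messages.popleft() if self._messages else None


queue = ChatQueue()
queue.send("asha", "Hi!")
queue.send("ravi", "Hello, how are you?")
oldest = queue.receive()
```

Messages come out in the order they went in, so the recipient sees the conversation in the order it was written.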
If we consider all these services, this is a bare-minimum social media application. I'm sure there must be many more components, but we are sticking to these for now. Finally, once you deploy this architecture you want to monitor it continuously: how are my VMs doing? How are my databases doing? How is my storage doing, and how much storage is left? For all this you need a monitoring service and dashboards, like a production dashboard where you can monitor the health of your application. Overall this will be your architecture, and this would probably be deployed on-premises.
Now let's see how we would do the same thing on AWS. First, the private network you see here is called a VPC, Virtual Private Cloud, in the AWS world. It is not exactly the way it is shown here, because some of these services sit outside the VPC and I cannot accommodate that in the diagram, but consider the VPC as one private, isolated network that AWS gives you. You would then have to manage a public network for the web servers and load balancer and a private network for the databases. That is a separate discussion; for now, the VPC is a network service.
All the VMs that we are talking about are nothing but EC2 machines, and the disk that we attach is called EBS, Elastic Block Store; these disks have a maximum size limit. So EC2 and EBS give you the VMs on which you will typically deploy your applications, whether web servers or app servers.
Further, you can have auto scaling enabled for EC2. That means if the load on these EC2 instances increases they can scale out horizontally and automatically, and if the load decreases they can scale back in. For example, from 2 machines they can go to 10 machines, and from 10 they can come back to 2, depending on the load; you configure that using the auto scaling feature of AWS EC2. For relational databases there is a service called RDS, and for NoSQL databases there is a service called DynamoDB. For DB caches there is a service called ElastiCache, which comes with Redis and Memcached engines. As you see, there is a load balancer, and in Amazon there is a service called ELB, Elastic Load Balancing, which can distribute the incoming traffic across multiple back-end EC2 machines like this. If you want your domain name mapped to your load balancer, you need a DNS service, which is called Route 53.
Now let's talk about the other pieces. For external storage there is Amazon's S3 service, Simple Storage Service. It is unlimited storage: you can just go on dumping data, it is accessible over the internet directly, and there is no limit on how much data you can store in your S3 buckets. You also need a content filter, and there is a service called Rekognition that can detect objects in images, so you can filter them out before you upload to, say, S3 buckets.
As I said, you also need a service that converts your videos from one format to another, like MP4 to some mobile-friendly format. One option is to run some EC2 machines that continuously watch your S3 buckets for new videos; as a new video comes in, they download it, convert it and put it back into another bucket. But there is a better option for this: the Lambda service.
Lambda is a serverless service from Amazon where you just write code; in that code you specify how to, say, convert a video, and you can have this Lambda function execute whenever there is a new upload into your S3 bucket. A new video comes in, Lambda gets triggered, it converts your video, and maybe you have added logic to put the converted video into another S3 bucket. Here there are no servers to manage. Everything is taken care of by Lambda functions, and they scale automatically. So we have Lambda there.
Now let's talk about clickstream analysis. For this there is a service called Kinesis, which can capture your clickstream data. You can then analyze that data, store it in S3, and do much more with whatever you capture. For the Spark or Hadoop platform there is a service called EMR. EMR handles operations like aggregation and sorting, and you can run distributed jobs such as Spark jobs and Flink jobs, all in this managed Hadoop cluster.
You also need to run ETL jobs on data from your DynamoDB tables. Maybe you want to work out who all your friends are, who the friends of friends are, and what activities they are doing, because you want to continuously push new posts to your wall. All this is done in real time using clickstream analysis. At the end of the year, maybe you want all this data to be extracted, converted into a different format and cataloged, and then processed further using EMR. For these extract, transform and load (ETL) operations you need the Glue service. Finally, whatever data you process or hold can be stored in the data warehouse service, which in Amazon is Redshift. Redshift is a data warehousing service that can store petabyte-scale data, and you can run your analysis on that data.
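Coming back to the Lambda video-conversion flow described above, a handler for an S3 upload event might look roughly like the sketch below. It relies on the standard shape of S3 event notifications (`Records`, then `s3.bucket.name` and `s3.object.key`, with the key URL-encoded). The output bucket name, the `-mobile.mp4` naming and the `plan_conversions` helper are assumptions made for illustration, and the actual transcoding and upload are deliberately left out.

```python
from urllib.parse import unquote_plus

# Hypothetical destination bucket for the converted, mobile-friendly videos.
OUTPUT_BUCKET = "example-mobile-videos"


def plan_conversions(event):
    """Turn an S3 upload event into a list of source/target conversion jobs."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        stem = key.rsplit(".", 1)[0]  # drop the file extension, if there is one
        jobs.append({
            "source": f"s3://{bucket}/{key}",
            "target": f"s3://{OUTPUT_BUCKET}/{stem}-mobile.mp4",
        })
    return jobs


def lambda_handler(event, context):
    """Entry point Lambda would call for each upload notification."""
    jobs = plan_conversions(event)
    # The real transcoding and the upload of the result are omitted in this sketch.
    return {"planned": len(jobs), "jobs": jobs}
```

Keeping the planning logic in a separate function makes it easy to check locally with a hand-written event before wiring the function to a real bucket.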
To perform this analysis and see the results you need BI tools. There are various BI tools in the market, but in Amazon you would use Amazon QuickSight. You can also use Athena, which is a SQL query interface: you can pull data from S3, run a SQL query on it, and view the results in QuickSight. You can build graphs and charts and get insights into your data, and based on that you take business decisions. So QuickSight is a BI service from Amazon. So far so good: we have introduced a lot of AWS services here.
Now let's move to this side. As I said, there is a content delivery network that can cache your static content, and for this Amazon has the CloudFront service. CloudFront caches your data in edge locations. As I said, these edge locations are spread across 100-plus cities around the world. When you use CloudFront, your data from S3, or wherever you store it, gets cached in the edge location nearest to where the user is coming from, and is then served from that edge location for all the users in that geography. That's the CloudFront service.
On this side, as I said, you need to send messages and mobile push notifications; in Amazon there is a service called SNS, Simple Notification Service, for that. If you want to send emails, including bulk emails, there is SES, Simple Email Service. For the messaging queues behind the chat application, Amazon has built its own queue service, called SQS, Simple Queue Service. Finally, you want to monitor all this infrastructure: how are my EC2 instances doing? What is the CPU utilization of EC2? How is the database doing? All of this can be monitored in real time using a service called CloudWatch.
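The kind of check a monitoring service performs on a metric such as CPU utilization can be pictured roughly as follows. This is a simplified sketch, not CloudWatch's actual evaluation logic: the function name, the default of three samples and the sample values are assumptions, and only the three state names mirror the alarm states CloudWatch uses.

```python
from statistics import mean


def alarm_state(cpu_samples, threshold_percent, min_samples=3):
    """Classify recent CPU readings against a threshold on their average."""
    if len(cpu_samples) < min_samples:
        # Too few datapoints to judge either way.
        return "INSUFFICIENT_DATA"
    if mean(cpu_samples) > threshold_percent:
        return "ALARM"
    return "OK"


# Hypothetical CPU utilization readings (percent) from one instance.
busy = alarm_state([85, 90, 95], threshold_percent=80)
calm = alarm_state([40, 50, 60], threshold_percent=80)
```

An "ALARM" result is the point at which a notification could be sent or a scaling action triggered.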
You can even set alarms: if the average CPU utilization goes beyond a certain percentage, send an email or alert to the administrator, or take some action such as auto scaling. All this can be done using CloudWatch alarms. So I think we have completely replaced what we did on-premises with AWS services, and I hope you got some idea about all these basic AWS services.
Next, let's see some more AWS services, starting with application services. As you know, Facebook, Twitter, other web services and even Amazon itself expose their services through API calls, so that third-party applications can integrate with them. For that they need a REST API service where they can expose all their APIs. In Amazon you can have a managed API Gateway, which takes care of scaling, throttling and so on; you just write the code and the definitions for your APIs and deploy them in API Gateway. Also, with mobile usage increasing, you need to manage the identities of your mobile and web users. When you develop an application, your users must sign up to it, which means you need to manage your user pools, their access and everything else. For that you need a user management service, and in AWS that service is called Cognito. These are the application services we can use here.
Now let's move ahead and talk about the security services in this architecture. There is one primary service for managing all access in AWS: all your AWS users, what access they have and what services they can use. Even when one AWS service, like EC2, wants to upload data to S3, EC2 needs permissions to do that. All this access, authentication and authorization is managed using Amazon's IAM service, Identity and Access Management.
It's one of the most important services for securing your AWS account and the services in it. Next, you can also encrypt your data stored at various storage locations. EBS, for example, is block storage, like a disk attached to EC2, and you can encrypt that data. Data stored in S3, EMR, Redshift, queue messages, databases and caches can all be encrypted using Amazon's KMS, Key Management Service. It manages all the encryption keys for you, so you don't need your own secure location to store your keys and do the encryption.
Further, this application will probably be accessed over HTTPS, which is an SSL-enabled connection. If users are doing transactions, or don't want to lose their important information, you would obviously secure that communication, and for this you need digital certificates. You deploy the certificate either on load balancers or on CloudFront so that your communication is secure. For this Amazon has a service called ACM, AWS Certificate Manager.
Next, we can also have application firewalls, called WAF, Web Application Firewall. WAF takes care of attacks: it can protect your application from cross-site scripting, SQL injection and even DDoS attacks. You will typically deploy it on CloudFront, on load balancers, or in front of the API gateways that we saw on the earlier slide, so that you are safe. There are various other ways to secure a VPC, with public and private subnets, which we will see in the detailed VPC session in the AWS networking lectures. Here we are talking about application-level firewalls, and that's WAF.
If you're going for some kind of compliance, for example PCI DSS or HIPAA, your machines need to be patched properly and free from known vulnerabilities, or CVEs. For that there is a service called Amazon Inspector. What does it do? It puts an agent inside your machines and scans them for known vulnerabilities, and then it gives you reports saying, out of all these machines, we found these vulnerabilities on these ones; go and fix them. So Inspector can give insights into what's inside our machines. These are the primarily used security services. There are more, but I think we will restrict our discussion to these for now.
Next, we want to see some development and DevOps services. As you can see, this architecture has a lot of AWS services, and they are all connected. If you deploy everything by hand, detecting errors and fixing them as you go, it will probably take two or three days. But AWS gives you the ability to code your infrastructure; that's called infrastructure as code. For this you can use a service like CloudFormation. What does it do? It takes a template from you, in JSON or YAML format, and creates this infrastructure from scratch for you, within maybe 30 minutes depending on its size; typically I have seen it create all these resources in 30 minutes at most. It's a very powerful service that can provision your infrastructure from scratch.
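To make the idea of a JSON template concrete, here is a small sketch that builds a very small CloudFormation-style template in Python and serializes it. The `AWSTemplateFormatVersion`, `Resources` and `AWS::S3::Bucket` names are standard template elements; the `build_template` helper, the `MediaBucket` logical ID and the description text are made up for illustration, and a template for the full architecture would of course be far larger.

```python
import json


def build_template(bucket_logical_id="MediaBucket"):
    """Return a minimal template describing one S3 bucket for uploaded media."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Minimal sketch: one S3 bucket for uploaded media",
        "Resources": {
            # The logical ID is a name chosen for this sketch.
            bucket_logical_id: {"Type": "AWS::S3::Bucket"},
        },
    }


# The JSON text is what would be handed to CloudFormation as a template.
template_json = json.dumps(build_template(), indent=2)
```

Because the template is plain text, it can be reviewed and versioned like any other code, which is what makes the infrastructure-as-code approach practical.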
This CloudFormation template will be written by your DevOps people. At the same time you have developers and QA: developers are writing code for your product, and QA engineers are writing test cases and automated tests. Everybody needs some kind of code repository, like a Git repository, and for that AWS has the CodeCommit service, where they can check in their code. Even the CloudFormation template is nothing but JSON or YAML code, so your DevOps people write it as a template, and the CloudFormation service takes that template and creates this infrastructure.
Once you have this infrastructure up, your actual product needs to be built, and for that you need the CodeBuild service. CodeBuild takes the source code, in whichever language you have written it, Java or whatever, and builds it using some kind of build tool like Ant or Maven. While building it also runs some unit tests, and finally it produces artifacts. Artifacts are your EXEs or binaries, basically your application executables. So CodeBuild builds and tests, and then you have to deploy. That means you have to put these EXEs and binaries on the EC2 machines where your application is actually running. For this deployment you have the CodeDeploy service.
If you know about DevOps, you have heard the terms CI and CD. This is your CI pipeline, a continuous integration pipeline, or you can call it a continuous delivery pipeline. If you want to have it automated, so that developers write the code and check it in, and it automatically gets built, automatically tested and automatically deployed to the corresponding application servers running on EC2, then you can use the CodePipeline service.
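The check-in, build, test and deploy sequence described above can be sketched as an ordered list of stages that stops at the first failure. This is a toy model of a pipeline, not the CodePipeline API: the `run_pipeline` function, the stage names and the always-passing lambdas are placeholders invented for illustration.

```python
def run_pipeline(stages):
    """Run stages in order and stop at the first one that fails."""
    completed = []
    for name, step in stages:
        if not step():
            return {"status": "failed", "failed_stage": name, "completed": completed}
        completed.append(name)
    return {"status": "succeeded", "failed_stage": None, "completed": completed}


# Each step is a placeholder returning True for success or False for failure.
stages = [
    ("source", lambda: True),   # code checked in to the repository
    ("build", lambda: True),    # compile and package the artifacts
    ("test", lambda: True),     # run the unit tests
    ("deploy", lambda: True),   # copy the artifacts to the application servers
]
result = run_pipeline(stages)
```

A failing test stage would stop the run before deployment, which is the safety property an automated pipeline is meant to provide.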
So you can completely build your CI platform here using these three services. If you want to further integrate all this with project management tools, like JIRA or some bug tracking tool, to see the speed of your development and so on, there is a service called CodeStar, which integrates very well with Atlassian JIRA and other tools. So you have complete SDLC control if you use these development and DevOps services. I think it is clear now where these development and deployment services are used. If you have come this far, you now know most of the AWS core services for compute, analytics, storage, security, application and deployment. Thank you.
Info
Channel: AWS with Chetan
Views: 2,202,333
Keywords: AWS services overview, aws architecture, aws training, introduction, high level overview, aws beginner, getting started, bird eye view, start, highlevel, cloud practitioner, aws certification, associate certification, cloud, what is aws, how aws works, how aws services work, compute, storage, network, databases, devops, cloudformation, security, architecture
Id: Z3SYDTMP3ME
Length: 38min 54sec (2334 seconds)
Published: Sun Jun 09 2019