AWS re:Invent 2018: [REPEAT 1] Deep Dive on Amazon Elastic File System (Amazon EFS) (STG301-R1)

Captions

Hello everyone, and welcome to our deep dive into Amazon Elastic File System. We'd like to thank you all for joining us today. My name is Duncan Lennox, I'm the director of product management for Amazon EFS, and I'm joined by my colleague Vince Carrion, a senior product manager on the EFS team. We're also really fortunate to have a builder with us today, a customer from T-Mobile, who's going to talk a little bit about how they're using EFS later in the session.

We've got a lot of great material to cover with you in the next hour. We'd like to start out with a little bit of an overview of AWS storage, to give you a sense of the options that we have available and how we think about storage at AWS. Then I'm going to turn it over to Vince to give you an introduction to Amazon Elastic File System, and then we're going to dive deep into EFS and share with you a lot more of the details about its capabilities. Then we'll talk a little bit about customer use cases and how our customers are using EFS today, wrap it up with some best practices and some guidelines for getting started with EFS, and then take your questions at the end. So let's get started.

Let's dive into the AWS storage overview. AWS has been innovating with storage rapidly over the last decade, and we are really excited that Gartner and their clients have continued to recognize our leadership on the public cloud storage Magic Quadrant. We're the vendor that is the highest on both axes, and we've been there since day one. Dealing with storage at scale with our customers has allowed us to gather a lot of feedback about storage and storage needs. At AWS, 90 to 95 percent of our roadmap is driven by the feedback we hear from you, our customers, and it's allowed us to build a storage portfolio that's broad and deep.

In the almost 13 years now that we've been working with customers, helping them move to the AWS cloud, we've learned a lot about adoption patterns, and we see three core adoption patterns out there with our
customers. The first one is rehosting, and this is where you're taking an existing application that might be running in virtual machines and moving it directly into the AWS cloud. That's how we helped Lionsgate, for example, take their SharePoint and SAP applications and move them into the cloud. Another pattern we see is replatforming. That's where you're taking an existing application that you have, moving it into the AWS cloud, and taking advantage of bits and pieces of AWS services, including Amazon Elastic File System. That's how we helped the BBC take their Red Button application and move it from on premises into the cloud. The third pattern that we see is re-architecting, and this is where you're modifying an existing application as you move it into the cloud, and of course building new applications for the cloud and leveraging a storage service like S3 through our APIs.

Getting all this feedback from customers and working with them has let us build out the options that we have for you. What we want to be able to provide to you is more choice, so you can select the right storage solution for each of your use cases, whether that's a block storage service like Amazon EBS, which you might be using in rehosting, or file storage, of course, like Amazon EFS for replatforming, and then re-architecting with something like Amazon S3 or Amazon S3 Glacier.

So why does file storage matter? Really, when we're talking about file storage in this context we mean network file storage. The vast majority of data that you have today is unstructured file data, and you have a lot of applications that depend on files and file systems. Having services available from AWS that provide file system services natively allows you to take those applications that you have today and move them to the cloud without needing to rebuild core components that are sometimes going to be costly to re-architect, or you may not even be able to have that as an option
depending on where the components come from. So we're providing services that enable you to take those existing applications and your existing IT environments and move them into the cloud.

So let's talk about fully managed cloud file systems. Amazon EFS, of course, is what we're going to do a deep dive into today, but just yesterday at re:Invent we announced two new fully managed file service options: Amazon FSx for Windows File Server and Amazon FSx for Lustre. Amazon EFS is a general-purpose, cloud-native file system designed for Linux workloads; it's an NFS-based service. Amazon FSx for Windows File Server is designed for your Windows workloads; it delivers Windows-native capability in a fully managed service. And Amazon FSx for Lustre is designed for your compute-intensive applications; it's based on the popular open source parallel file system Lustre, but it's fully managed. Today we're going to focus on Amazon Elastic File System, and I'd like to turn it over to Vince for our deep dive.

Good afternoon. My name is Vince Carrion, I'm one of the product managers on the Amazon EFS team. I have a lot of insights and a lot of stuff to go through, so let's start with an introduction to Amazon EFS. A couple of years ago we launched Amazon EFS. Before EFS, if you wanted a shared file system in the AWS cloud, you had to do it yourself. Some of you may be familiar with an architecture similar to this. Doing it yourself could be pretty burdensome, and definitely a lot to manage: first you had to manage the file server, you had to manage the storage volumes, and then you also had to worry about replicating that data across Availability Zones to ensure high availability.

So we launched EFS, and we had a couple of things in mind when we did it. First, we wanted EFS to be as simple as possible. Amazon EFS is a fully managed service. It provides a simple interface for you to create and configure your file systems, and manages the file storage
infrastructure for you, and with just a few clicks you can create a petabyte-scale file system. Elasticity was the second core tenet that we wanted for EFS. EFS automatically and instantly scales your file system storage capacity up and down as you add or remove files; there's no need to provision storage. The last tenet is that we wanted EFS to be scalable. As I just mentioned, with just a few clicks you can create a petabyte-scale file system where throughput scales as well. We'll talk about each of these in a second. On top of all that, we wanted EFS to be highly available and highly durable.

So now that you know a little bit about our tenets and how and why we built EFS, let's go into some of the features of the service. But before we do that, I want to talk a little bit about the service in the past year. As you can see, since re:Invent 2017 it's been a pretty busy year for the EFS team. We've launched more features and launched EFS into more and more regions; as you can see, we expanded to the Northern California, Seoul, Tokyo, and Singapore regions. We launched encryption of data in transit, we launched a new feature called Provisioned Throughput, we earned more and more certifications, and we provided more ways for you to access your file system. I've mentioned a number of these, and we'll get into them in just a moment.

So you're here to learn a little bit more about Amazon EFS, and we'll talk about each of these, starting with elasticity and scalability. Again, you can grow and shrink your file system on demand. In fact, there's no way to provision storage: you simply add or remove files and EFS manages it for you. You can grow your file systems up to petabytes in size. There's no provisioning, as I just mentioned, and in the default Bursting Throughput mode your performance scales as the file system grows; we'll talk about that in a second. Lastly, in terms of scalability, you can also have thousands of instances with
concurrent access to a single file system.

Amazon EFS is designed to be highly available and highly durable. We're designed for 11 nines of durability, and the distributed architecture of Amazon EFS provides availability and durability protection from AZ outages, system failures, or network connection errors. We'll talk about the distributed architecture a little bit in our best practices, towards the end of the session. Again, you can mount your file system across multiple Availability Zones, we offer strong consistency for concurrent access, and your data is replicated within and across multiple Availability Zones, adding more layers of availability and durability.

Another benefit of Amazon EFS is the ability to access your file system from a number of different locations. First, you can access Amazon EFS from EC2 instances in your VPC. You can also access the file system from your on-premises servers via AWS Direct Connect. In October we announced that you can now access your EFS file systems from on-premises servers using an AWS VPN connection, as well as from EC2 instances using intra-region VPC peering. Just this week we announced that you can access your EFS file systems from EC2 instances in other AWS regions using inter-region VPC peering, or via the newly launched AWS Transit Gateway. And in fact just yesterday we announced that you can now access your EFS file system across accounts using the new shared VPC. With this, we've increased the number of file systems that you're able to create per account: as you can see there, we increased that to a thousand file systems per account, a ten-times increase, and we've also added a new limit of four hundred mount targets per VPC.

Amazon EFS has two different performance modes. You have to pick the performance mode when you create the file system, so let's talk about both of these. The default is the General Purpose mode, and this is what we recommend for the majority of customer
applications, and as you test your application with EFS this is where we recommend that you start. The other performance mode is our Max I/O performance mode, and this is really recommended for scale-out workloads.

Let's compare and contrast the two performance modes. Let's start with what each is for. The General Purpose performance mode is designed for just that, general-purpose file workloads, as well as latency-sensitive applications, whereas the Max I/O performance mode is really best for large, scale-out, data-heavy applications. Both performance modes come with their own advantages as well as their trade-offs. The General Purpose performance mode offers the lowest latency per file operation, but it comes with a limit of 7,000 I/O operations per second. With the Max I/O performance mode you have a virtually unlimited ability to scale throughput and IOPS, but it comes with slightly higher latencies per file operation compared to the General Purpose performance mode. As for when to use each: as I mentioned before, we recommend that customers start with the General Purpose performance mode and test their applications, but if you know that you have a large scale-out type of workload, we really encourage you to look into the Max I/O performance mode. Again, you pick this when you create the file system.

We also offer two different throughput modes. The first is the Bursting Throughput mode, and this is what we recommend for the majority of workloads. In Bursting Throughput mode, the throughput that you're able to drive with the file system depends on, and scales up with, the size of the file system itself. You're able to drive 50 megabytes a second per terabyte of data stored at baseline, but then burst to double that, or 100 megabytes a second per terabyte of data stored. The second throughput mode, which we launched in August of this year, is Provisioned Throughput. Customers told us they want the ability to drive higher
levels of throughput relative to their file system size, so we launched this feature, where you can set the throughput independent of the file system size.

Again, let's compare and contrast the two options. What is each for? The Bursting Throughput mode is our default option; it's good for workloads with variable throughput demands, whereas the Provisioned Throughput mode is really best when you know you need a certain level of throughput for your application. Both, again, come with their own advantages as well as trade-offs. One of our tenets is simplicity, and Bursting Throughput mode is simple: your throughput scales as the size of the file system grows. The downside is that it's a fixed ratio. With the Provisioned Throughput mode, the advantage is that you're able to define the throughput independently, but it comes with a little additional complexity in terms of pricing: it adds a pricing dimension, and we'll get to that in a second. So when do you use each? Again, we recommend the Bursting Throughput mode in general for most workloads. However, if you know that you have a higher throughput-to-storage ratio, we recommend the Provisioned Throughput mode. You can switch between the two throughput modes, and later on, during our best practices, we'll talk about when that might be something you would want to consider.

So again, with the Provisioned Throughput mode you provision throughput independent of the size of the file system itself. You can increase it as often as you'd like, and you can switch between the two modes or decrease it about once every 24 hours. Going back to our tenets, we wanted things to be as simple as possible, so when we launched Amazon EFS about two and a half years ago we only offered the Bursting Throughput mode. But we got a lot of customer feedback telling us that they wanted provisioned throughput, and we launched that,
again, in August of this year.

Now, customers have also told us that while they love the simplicity of Amazon EFS, it can be quite expensive to store large amounts of data for long periods of time. So we took that feedback, and we looked at all the various ways that we can reduce costs while still providing the features that our customers have come to expect. We're very excited to pre-announce this week Amazon EFS Infrequent Access, a new storage class that reduces the cost by up to 85%. With Amazon EFS Infrequent Access there are no changes necessary to your existing applications. Amazon EFS offers a single file system namespace and will transparently serve data from either the Standard storage class or the Infrequent Access storage class to your applications. You get cost savings of up to 85% for data that is not accessed as frequently, and we offer automatic lifecycle management to simplify things.

Let's talk a little bit about how that actually works. When we launch Amazon EFS Infrequent Access, you simply create your file system and enable lifecycle management. With lifecycle management enabled, any files that are not accessed for 30 days are automatically moved to the Infrequent Access storage class. There's nothing for you to manage, and you can save, like I said, up to 85%. We'll get to the pricing in a few slides.

Let's switch gears a little bit and talk about security and compliance. At AWS, security is always a top concern, so we offer a number of different mechanisms for you to control and secure your file systems. First, you can control network traffic using Amazon VPC security groups as well as network ACLs. You can control file and directory access using standard POSIX permissions. You can use AWS IAM for administrative access. You can encrypt your data at rest using keys managed in AWS KMS, as well as encrypt data in transit using industry-standard TLS 1.2. And Amazon EFS has become
HIPAA eligible, GDPR compliant, and PCI DSS, SOC, and ISO compliant.

We've spent a few minutes on the features of Amazon EFS; let's talk about a new integration that we were very excited to announce this week as well. Some of you may have seen this week the announcement of AWS DataSync. It combines the speed and reliability of network acceleration software with the cost-effectiveness of open source tools. It offers fast data transfer: you can transfer data from on premises to AWS at up to 10 gigabits a second. It's very easy to use, via the console, the CLI, or the SDK. It's secure and reliable: it supports encryption at rest as well as in transit. And it's cost effective, at only four cents per gigabyte.

Talking about costs, let's move on to some of the economics of Amazon EFS. There's no minimum commitment and no upfront fees. There's no need to provision storage in advance; in fact, there's no way to provision storage in advance with EFS. And there are no other fees and charges. With the default Bursting Throughput mode, pricing is simple: there's only a single pricing dimension, and you only pay for the storage that you actually store on EFS. With Bursting Throughput, this price includes 50 megabytes a second per terabyte of data stored, as we previously discussed. Provisioned Throughput adds a second pricing dimension. The first is the storage price, like we previously talked about, and the second is the throughput price: you pay separately for the throughput that you provision, and in fact you only pay for the provisioned throughput amount that's above the throughput included in the storage price. And now with EFS Infrequent Access, the data in the Standard class is charged the same as we previously talked about. Infrequently accessed data, however, is stored in the IA storage class, and you can see the price there: it's 4.5 cents per gigabyte. Separately, you're also charged for access; that'll be one cent per
gigabyte for access.

Let's put it back together with the slide that I talked about previously. Obviously there's a lot of operational burden when it comes to managing this yourself: there's patching, there's managing traffic, there's over-procuring. Let's talk about what it would actually cost. If you did this on AWS, you'd first have to pay for Amazon EC2 instance costs. You'd have to worry about EBS volume costs, and you would likely have to over-provision to ensure that you had the capacity available to you. Then you'd also have to worry about inter-AZ data transfer costs, again for high availability. Additionally, with infrequent access, you would have to manage the tiering of data between one storage option and another. As you can see here, if you compare and contrast the two, looking at a 500 gigabyte file system, Amazon EFS is about 90% less expensive than the do-it-yourself option. And again, this is a fully managed option: there's no patching, nothing that you need to worry about for operating the file system.

Amazon EFS is now available in ten regions, and we are excited to announce that we are coming soon to the London region as well as to the GovCloud region, so stay tuned there. With that, I'd like to pass it back to Duncan to talk a little bit more about how our customers are using the service, as well as some of their use cases. Thank you.

Thanks, Vince. Now that we've had an opportunity to dive deep into some of the capabilities of EFS, let's talk a little bit about how people are using it today. When we designed EFS, we designed it to be a general-purpose file system for the broadest array of workloads: everything from your scale-out workloads that need massively parallel I/O and high levels of throughput, all the way down to single-threaded, latency-sensitive applications, and everything in between. Our customers are using EFS today in a very wide
set of applications, which is a testament to what we've been able to do with our broad set of use cases. I'm not going to go through all of them, but I'll just pick a few to highlight. We talked earlier about the BBC moving their Red Button application. They actually gave a chalk talk session on it earlier in the week, and we have a detailed case study on their use of EFS on our website, so I'd encourage you to check that out. We've also worked with enterprise application vendors to run their applications on Amazon EFS. Atlassian actually runs their own implementation of their Jira application using Amazon EFS, and we've worked with message broker application vendors like IBM and TIBCO to get those running on EFS as well. And companies like Celgene have taken their high performance computing workloads and moved those into the AWS cloud using Amazon EFS. But the best thing is always to hear directly from a customer, so we're delighted to have a builder here with us today, and I'm going to turn it over to Amreth from T-Mobile to hear about how T-Mobile is using Amazon EFS.

Thank you, Duncan. Thank you, Vince. It's an incredible honor to be here to talk about our use cases at T-Mobile. So, who likes to be on the world's fastest 4G LTE, on the best network, and at a lower cost? [inaudible] Thanks. That's what T-Mobile is doing. As T-Mobile transforms the wireless industry, it has itself been through a lot of transformation over the last few years. We have not only changed the wireless landscape; through this transformation we have also transformed a lot of the applications we run in our data center, and we have moved a lot of things to AWS. AWS has played a key role in this transformation. It has enabled us to transform in a much faster fashion, and EFS specifically has been part of our architecture and has certainly helped us to move fast. My name is Amreth Chandrasehar, I am from T-Mobile, and I'm also part of the
cloud center of excellence team. As I told you, we are not only transforming the wireless industry, we are also innovating at a very fast pace. We use a lot of open source technology, and we would love to contribute back to it as well. Over the last few months and couple of years we have open sourced a lot of our projects, and some of them are shown here: Jazz, a serverless platform which uses Lambda functions at AWS; T-Vault, which is built on top of HashiCorp Vault for securely accessing your keys, and which we have integrated with our Mesos and Kubernetes clusters at T-Mobile at a pretty large scale; and PacBot, a security compliance tool that teams have been using at T-Mobile at large scale to know their compliance status. You can even write your own rules, it's again open sourced, and we have seen incredible traction from people who are on AWS. We are also working on blockchain with NEXT Directory, where we have partnered with multiple other industry folks to build blockchain capability. Along with that, we are building capabilities on our Kubernetes platform, which we call Conductor, and which we will be trying to open source in the coming months.

So, the most common question: why EFS? We have been running pretty large-scale container orchestration systems at T-Mobile, with both Mesos and Kubernetes. We've been running more than six thousand containers, with more than two thousand services running live, and we have been taking in transactions at close to 100,000 requests per second, and we've been doing that over a lot of our NPI launches. EFS is a fully managed service. Being fully managed, we don't have to worry about launching NFS file systems on AWS, or about scaling or pre-provisioning them, and we don't have to worry about patching or any of that. It is highly scalable, as we just heard. It's available in two performance modes, General Purpose and Max I/O, and because we are a large enterprise company, we have a wide variety of use cases. It's also available in two throughput modes, Bursting and Provisioned. I'll be talking about these use cases, with a deep dive into the architecture of how we built those systems with EFS. We take security very seriously at T-Mobile, and we choose managed services on AWS which are compliant; EFS is PCI compliant as well, which matters because we deal with a large amount of customer data and payment data. And of course it is much less expensive: it was about 70% less expensive, and now it is 90% less expensive, as Vince has just talked about.

So what are the use cases? There are various architecture patterns: Kubernetes, CI/CD, Artifactory, even TIBCO. The one thing common to all these architecture patterns is that we use EFS. EFS is central to all of them, and I will be talking about each of these use cases now.

First, Kubernetes. Over this re:Invent we have seen a lot of sessions on Fargate and EKS, and customers are building large-scale Kubernetes environments on AWS. Over 51 percent of Kubernetes clusters are running on AWS, and we are one of those who have built pretty large-scale clusters on AWS. We use EFS persistent volumes to mount EFS to the pods directly, and we let our own customers, the application teams, use EFS directly and mount it to their pods. They can provision their own EFS, or we can manually create file systems for them and they can mount them. So even if the pods die, or pods move from one node to another, you don't have to worry about the persistence of data, because it's stored in EFS and you can directly access it. It is scalable across hundreds of nodes, and from what we have seen it's about 70% more cost effective than doing it ourselves.

The next use case is CI/CD. Since we are a large-scale
enterprise with around 2,000 services, and we do continuous integration and continuous delivery using Jenkins and other pipelines, what we found is that it's a lot of work, and we have to maintain large systems, if we run Jenkins slaves on EC2 instances. We already have massive Mesos and Kubernetes clusters that we have built, so we started provisioning the Jenkins slaves on these clusters, and we have been running build jobs and everything else on them. What we found was that every time you do a build, it has to download all its dependencies, like Maven dependencies or npm dependencies, every single time, and that takes a lot of time for us to build those artifacts, move them to Artifactory, and deploy. We have reduced our build time by 10 to 50 percent, depending on the length of the pipeline, just because we are caching those dependencies on EFS. Every time a new Jenkins slave spawns, it has EFS mounted on its pod, so all the frequently downloaded artifacts are already available on the pod. Every time you do a build job the artifacts are already there, so you don't have to download them again and again. You're saving a lot of network traffic and throughput, and you don't have to worry about your build job taking a lot of time. I'm pretty sure nobody wants their build job to take half an hour or an hour to run, right? So that's exactly what we've been doing. All the pods are short-lived; we don't have these pods running for days together. Some pods run for a few seconds, some take minutes, and some of these slaves take even a day because of some backup jobs. We have been on this journey over the last year or so.

Another use case is Artifactory. Since we do CI/CD, and we need a centralized repository to store our Maven artifacts or even our Docker
images, we use Artifactory heavily. We wanted it to be highly available, and it has to be disaster recovery proof: even if one of the regions goes down, or even if Artifactory goes down, we should be able to keep serving. Because of the nature of containerized applications, every time you run or spawn a new container it has to download its dependencies, and it has to use Artifactory to download them. So we leverage the Max I/O and Provisioned Throughput modes for Artifactory. We were initially trying the default configuration for Artifactory, and what we found was that we were not able to get the maximum performance out of EFS, so it is very important for us to choose the right modes. Last August the EFS team released the Provisioned Throughput mode, and we made use of it and moved some of our artifacts to EFS. We store more than 50,000 artifacts, with 200 terabytes of data.

Another use case, where we initially started two and a half years ago, was standing up TIBCO on AWS. It's a very simple use case. All TIBCO EMS needs is a single place where it can keep a config file. The TIBCO EMS master makes the read and write operations on it, and the slave just reads from this config file. This is the use case that got us started using and exploring EFS in the first place, and we were able to stand up TIBCO EMS on AWS because of EFS.

Having talked through all these use cases, I'm pretty sure there are a lot of developers and architects and other folks who would love to hear how we implemented these systems. The first architecture pattern I'd like to talk about is applications running on Kubernetes using EFS. This architecture diagram shows that we have Kubernetes masters and nodes. The nodes are in an Auto Scaling group, and each of those EC2 instances is running Docker and the Kubernetes agents,
and these connect back to the Kubernetes masters. We also have EFS that has been provisioned, and each EFS file system is highly available; that is to say, each EFS file system you see in the diagram is built across three Availability Zones. We mount these EFS file systems directly to the pods. It's up to the application teams, because application teams can provision EFS automatically, in a self-service way, through the EFS provisioner. So the application teams manage their own EFS and use it in their solutions, and since EFS is scalable, there's no need for us to pre-provision or worry about scaling or patching or any of this; it's already available. Teams use EFS for a variety of use cases. Caching, which I talked about, could be one of them. Or teams may have legacy systems that have been transformed into containers, and these containers need persistent storage, and they use EFS for that. If you're not on Kubernetes and you are on Mesos, you can pre-provision EFS and directly mount it into the Docker containers. We have a number of applications where we use it this way. I'm from the platform team at T-Mobile, so we run the Kubernetes platform and provide a self-service way for application teams to provision their own EFS file systems and mount them directly. There is no need for any interaction with us whatsoever, because EFS is very simple to use.

The second use case is CI/CD. A very typical day: a developer comes in in the morning and wants to write code. He writes code and checks it in, and it goes to Bitbucket. Once it goes to Bitbucket, a webhook triggers a Jenkins job, and Jenkins requests the Kubernetes or the Mesos master to create a new slave. That slave gets provisioned in either Kubernetes or Mesos, and each of
them, when it's provisioned, as I talked about earlier, mounts these EFS file systems. That's how all the build jobs take less time, because everything has already been cached. Now it goes through the iteration: it has to do a Maven build, it has to do a Docker build, it then pushes to Artifactory, and once it pushes to Artifactory it is available to deploy as a pod itself. And guess what, it's on the same cluster. We are not building a new cluster just because you have to run CI/CD; we are using the extra space that is available in the same system, so we are saving a lot of cost because it's running on the same cluster. So it runs on the same cluster, and then you're deploying it. Once you deploy your services, Kubernetes or Mesos is going to say, okay, this particular pod or container is now available for use; it has passed the health checks. Now a user, and it could be an internal or external user, if the user is accessing the system through the browser or through an API call using Postman or anything, will be going via the ingress, and then it accesses the same nodes where both the Jenkins slaves and the pod that is serving the application are running. So this is one such architecture pattern.

The third architecture pattern is how to build a highly available and disaster-recovery-ready system on Artifactory. This is our architecture; this is exactly how we use it today and how we built it. You can see we have two EC2 instances running Artifactory, and both have EFS mounted on them, and all the artifacts and everything have their persistent storage on EFS. We have RDS, and we use EBS, and even the load balancer to load balance traffic between the master and slave clusters of Artifactory. We have a Mission Control DR application that replicates the state from one region to another region. Even if Artifactory goes down, or even if the entire region of
AWS goes down, we will still not have any downtime, and we can provide 99.99% availability for our internal customers at T-Mobile to make use of Artifactory.

The fourth system is TIBCO EMS. This particular architecture is taken from one of the architecture patterns at AWS itself. We didn't want to reinvent the wheel of how to build another system, so I took exactly that and put it up here. It can be found in the Quick Start architecture for TIBCO EMS. What you see here is a primary and a secondary EMS server, and it's pretty straightforward: you have an EFS file system that has been mounted across these two EC2 instances, and it has config files that are stored on EFS. The primary master keeps writing those configurations and the metadata to EFS. Then, whenever there's a failover event, or the primary master goes down, or you are doing a patching activity on your primary, whatever you do, when the primary goes down the secondary TIBCO EMS takes over and becomes the primary node. So this is one of the simple architecture patterns we did, but this was started two and a half years ago, just when EFS was announced. I was initially planning to launch my own NFS cluster for one such application, but once I found out about EFS it was very straightforward to use from day one. So thank you, everyone, and thanks for this opportunity. Over to Vince.

Well, thank you very much, Amreth. I want to leave you folks with some best practices for using and testing with Amazon EFS. In general, parallelize as much as possible. We recommend that you use multiple threads, write to multiple directories in parallel, and increase the I/O size as much as possible; doing so will greatly improve the performance that you find on Amazon EFS. Here, let's look at throughput. In this example we're using a single instance to write a 1 MB or 16 MB block to an EFS file system, and you can see that when we add either
more threads or increase the block size of the writes, we're able to maximize the throughput from the single instance. In this case it's a single C5 instance able to drive 250 megabytes a second to an EFS file system just by changing those two attributes. And just to note, the limit for single-instance throughput to a file system is in fact 250 megabytes a second. Similarly with IOPS, you can see that by increasing the number of threads, as well as by writing to multiple directories to avoid inode contention, you're able to dramatically improve the I/O per second and performance to the file system.

Now, with both of these in mind, again, parallelize as much as possible. Think about the tools that you're using or how you're writing your applications. Not all file transfer applications are created equal. You can see that by using a multi-threaded application such as fpsync or GNU parallel, you'll be able to drive much higher levels of throughput as compared to single-threaded applications. Additionally, as we mentioned earlier, the new AWS DataSync is a quick and easy way to help you transfer your data from on-premises to the AWS cloud.

I talked earlier about the two different throughput modes, and one best practice, because you're able to switch between the two, is to use the throughput mode that best makes sense for the job that you're doing. Think about creating a brand new file system where you're looking to ingest a large amount of data into the file system. If you choose the bursting throughput mode, again, your file system throughput scales with the size of the file system itself; however, with no data in it, we are able to drive up to 100 megabytes a second using burst credits. You can see that if you're looking to ingest 1 terabyte of data, it's going to take about three hours. However, if you're able to parallelize as well as use the provisioned throughput mode, you're able to dramatically
reduce that. So it's a 10 times improvement there, by writing from four instances in parallel, and you're able to achieve one gigabyte a second of throughput. Just a side note: the default limit for the provisioned throughput amount is 1 gigabyte a second. If you require more, please just reach out to us and we're happy to have a conversation.

Other best practices: again, monitor your file system. You can use Amazon CloudWatch to monitor the various metrics. I talked earlier about the General Purpose performance mode, and you can see there that the PercentIOLimit metric is specific to that; it's a metric to help you see how close you are to getting to that 7,000 I/O per second limit. Additionally, we talked about the bursting throughput mode, and you can see there's a BurstCreditBalance metric that you can also monitor via Amazon CloudWatch. In fact, this past week the Amazon CloudWatch team just launched the new automatic dashboards. Automatic dashboards provide you with an aggregated view of the health and performance of your various AWS resources. This is the default EFS view; with automatic dashboards you get the insights you need really at a quick glance. If you require more in-depth or custom metrics, of course you can use the metric math feature with Amazon CloudWatch. In fact, if you visit the EFS performance tutorial, we give a number of links and a number of suggestions for custom EFS-specific calculations to simplify monitoring as well as alarming.

And then, to summarize a number of best practices: start with the General Purpose performance mode; again, this is the one that you have to select at file system creation time. Start with the bursting throughput mode. Use a Linux kernel 4.3 or greater. Utilize the EFS mount helper; really, this is a wrapper that's executed to help you mount the file system with NFSv4.1 as well as with the default recommended mount options. And, as I talked about, use a large I/O size and parallelize as much as possible:
multiple threads, multiple instances, writes to multiple directories to avoid inode contention. And then monitor those metrics.

So, to close out: you can quickly create a scalable, high-performing file system; visit the AWS website. You could also take a look at some of our tutorials. The examples that I showed previously in the best practices section, you could actually walk through those yourself, so visit the AWS website and you can take a look at a lot of those. And with that, I'd love to take any questions that you might have. We're very fortunate to have Amreth here, so we're happy to take any questions about EFS or any questions about how T-Mobile is using the file system.

Yes, sir? Coming soon, coming soon. Yes, it's a very important feature and something that customers have asked us for quite frequently, so we're happy to pre-announce it. You may have seen it yesterday in Andy's keynote, but it is coming soon.

I'm sorry, sir? Yes. So your performance scales as your file system size grows in the bursting throughput mode. You're able to drive up to three gigabytes a second or one gigabyte a second, depending on the region, but again it depends on the size of your file system itself. Whereas with the provisioned throughput mode, you just set the provisioned amount to what you would like, and again you just have the two different pricing dimensions.

Yes, sir? Sure. So the question was about whether the same things we've just talked about with Amazon EFS apply to our new Windows file system. Amazon FSx for Windows File Server is a completely different service offering. It's designed for a different set of use cases, and it does have its own performance features and its own pricing as well, so it is separate from Amazon EFS. Lustre as well: Amazon FSx for Lustre is also a separate service, each one tuned for slightly different performance workloads, and each has its own set of features
and its own separate pricing.

Yep. So if I understand your question, you had an issue where you were slowly running out of burst credits, and someone recommended that you just add some padding data to your file system so that you're able to achieve the throughput that you're looking for, correct? Like I previously mentioned, when we launched EFS we wanted to make it as simple as possible, so we only had that single bursting throughput mode. What we in fact recommended for customers to do, if you knew that you needed a certain throughput amount, was to do something similar to what you just did, which is just to ensure that your file system was of the appropriate size. There are a number of different ways you could do that; dd was one that obviously worked in your use case. But with provisioned throughput you no longer need to do that. So absolutely, you can just use provisioned throughput. You can switch at any time; you don't even need to create a new file system. Does that answer your question, sir? Great.

So the question was about accessing an EFS file system from a Windows box. It is an NFS file system, so it's available from an NFSv4 client. What we found in talking to a lot of customers over the last year, because we get asked about Windows a lot, was that the needs of our Windows customers were often quite unique and distinct, and that's why we developed and announced yesterday FSx for Windows File Server. So right now with Amazon EFS we continue to be very focused on NFS and primarily Linux workloads.

Well, Amazon EFS itself does not offer a caching mechanism today, but we are integrated with AWS Direct Connect for high-speed access into AWS, and we just announced a few weeks ago that we're now supporting AWS VPN for access from on-prem into EFS as well. I think there's a question here.

Okay. You know, that's a very common question. We take customer feedback very seriously; as
we mentioned, we launched a number of new features this year. We typically don't publicly comment on our roadmap, but if you'd like we could take this offline and have a more in-depth discussion, perhaps in a different venue. Are there any questions for Amreth on his use of EFS and how T-Mobile has been able to achieve success? Are there any other questions in general?

Yes, sir? I'm sorry, can you say that one more time, sir? Well, you know, again, that's something that gets into really talking about our roadmap going forward, which is something we typically don't do in a public session. Certainly, though, we appreciate hearing that feedback; that is something that we hear from folks. We do have, as Vince talked about, the new AWS DataSync service for moving and ingesting data into EFS directly, and that supports both S3 and EFS natively.

Yes, sir? So the lifecycle management feature is part of the new EFS Infrequent Access storage class. We've announced that it is coming soon, so it's not yet available for people to use, but we hope to get it into customers' hands very, very soon. Lifecycle management is for newly created file systems; once it becomes available you can turn it on, and then we'll automatically move infrequently accessed data after 30 days to the IA storage class.

Yes, sir? Yeah, absolutely. In general, like I mentioned in the best practices, you'll typically get better performance if you increase that I/O size. One of the key benefits of Amazon EFS is that we durably store your data across multiple Availability Zones. We support strong consistency, so that when you close a file, that data is immediately accessible on any subsequent open. Due to that, there's a small latency overhead associated with each file operation. So in general, if you're able to increase the I/O size, you'll be able to get better performance because that
latency overhead is amortized over a larger set of data. I totally understand. As I said, in general, if you're able to increase the file size you'll get better performance, but maybe we could take this offline. I'm happy to share a little bit more and learn a little bit more about your specific use case.

Yes, sir? Well, the use of rattus with EFS is something we'd love to learn more about, along with your specific use case. We're always looking for feedback like that, so we'd love to have a conversation at the end of the session about that with you and learn more.

Yes, another question, sir? So again, we'd be getting into things in terms of the roadmap, which we typically don't talk about in a public forum. Right now we're very excited to get Infrequent Access out the door and into customers' hands, and we'll be listening intently to all the feedback we get about where customers would like us to go from there.

Yes, sir? Well, EFS is designed for where you want to have a file system, a live file system, available. It's designed to be a drop-in replacement for applications and use cases today that expect to have access to a file system and conduct file system operations, so that would be the primary use case for EFS today. And there are a number of considerations. S3 is a fantastic service, obviously, and one of the things that we like to recommend is to use the right tool for the workload or the job that you're doing, while taking into consideration the subtle differences between the various services. For example, one thing is that S3 is eventually consistent whereas EFS offers strong consistency. For some customers that doesn't matter; for other customers and other workloads, it's "I have to have one or the other." So again, it's an "it depends" sort of question. We don't have a standard list that says, hey, pick A versus pick B, but we can talk offline. I'd love to learn a little bit more about your particular use case, and then maybe I
can help you with that journey. Cool.

Yes, sir? We use both. So, in the case of disaster recovery and all those cases, we use EFS only for those. That's the reason why we have the higher throughput mode provisioned, which was announced last August. If you go with the default mode, that's going to be even higher latency, and it's not going to work for us, so that's why we pre-provisioned.

Okay. Well, thank you very much, folks. I know you guys have a number of different sessions that you could choose from, so we're very honored that you spent your time with us. As always, thank you again, and please fill out the survey. I hope you guys enjoy the rest of your conference. Thank you very much, folks.
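
The ingest example from the best-practices part of the talk can be checked with a little arithmetic. The sketch below is not from the session; it is a minimal Python illustration that assumes decimal units (1 TB = 1,000,000 MB) and sustained rates, and it reuses the figures quoted by the speaker: roughly 100 MB/s on burst credits for an empty file system, 250 MB/s per instance, and a 1 GB/s default provisioned throughput limit. The function and constant names are hypothetical.

```python
# Figures quoted in the talk; decimal units and sustained rates are assumptions.
BURST_MB_S = 100          # empty file system running on burst credits
PER_INSTANCE_MB_S = 250   # single-instance throughput limit
PROVISIONED_MB_S = 1000   # default provisioned throughput limit (1 GB/s)
DATA_MB = 1_000_000       # 1 TB to ingest


def aggregate_throughput(instances: int, per_instance: float, fs_limit: float) -> float:
    """Combined rate is capped by both the client fleet and the file system."""
    return min(instances * per_instance, fs_limit)


def ingest_seconds(data_mb: float, throughput_mb_s: float) -> float:
    """Seconds needed to move data_mb at a sustained throughput_mb_s."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    return data_mb / throughput_mb_s


# One writer against an empty file system in bursting mode.
burst_time = ingest_seconds(DATA_MB, BURST_MB_S)

# Four writers in parallel against a file system with provisioned throughput.
parallel_rate = aggregate_throughput(4, PER_INSTANCE_MB_S, PROVISIONED_MB_S)
parallel_time = ingest_seconds(DATA_MB, parallel_rate)

if __name__ == "__main__":
    print(f"bursting, one writer: {burst_time / 3600:.1f} h")
    print(f"provisioned, four writers: {parallel_time / 60:.1f} min")
    print(f"speedup: {burst_time / parallel_time:.0f}x")
```

Under these assumptions the bursting case comes out just under three hours and the four-instance provisioned case at roughly 17 minutes, which lines up with the tenfold improvement described in the talk. Real transfers would also depend on how long burst credits last and on I/O size and thread count.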

Info

Channel: Amazon Web Services

Views: 5,437

Rating: 4.8095236 out of 5

Keywords: re:Invent 2018, Amazon, AWS re:Invent, Storage, STG301-R1, Amazon Elastic, File System, Amazon EFS

Id: 4FQvJ2q6_oA

Length: 55min 39sec (3339 seconds)

Published: Fri Nov 30 2018