AWS re:Invent 2017: Deconstructing SaaS: A Deep Dive into Building Multi-tenant Solu (ARC407)

Captions

Thanks, everybody, for showing up today at the end of what I hope has been a very productive week for all of you. I know this is probably the last, or one of the last, sessions you'll be attending, so I very much appreciate you taking the time to be here. As the slide says, my name is Tod Golding. I'm a Partner Solutions Architect focused on the SaaS specialty.

Last year I did a few sessions at re:Invent, and in those sessions we focused on a broad set of SaaS principles and objectives, looked at the pros and cons of different partitioning schemes, and tried to give everybody a broad view of the SaaS landscape. For some folks I think that was a really good experience, but we also listened to the feedback, and this year we decided to come back and ask: could we do something more targeted? Could we look at a specific SaaS implementation? Instead of discussing the fundamental principles, show me a SaaS solution that's been built. Dig into a single solution, and yes, that solution will be only one example of how you'd build one, but I think I might learn more from seeing the end-to-end solution and seeing all the pieces connected to one another.

That is exactly what we're going to do here today, and I hope that matches what you think you signed up for. This will be a reference solution that we've actually built, that's out there, and that you'll be able to go download at the end of this. We'll try to be as deep as we can today, but my hope is that this will give you enough of the framework that you can then go download the solution (there's a 40-page PDF, a bunch of source code, and a whole environment you can install and stand up on your own) and use the knowledge that comes out of this session to dive into that example. Also,
just to set the stage a little more: this is a 400-level session, so the assumption is that you're coming in with some orientation, some knowledge of SaaS. Things like data partitioning and tenant isolation are hopefully not new concepts to you. We're also going to dig a little bit into the code level. We certainly aren't going to crack open the IDE, we won't have time for that, but we are going to have slides with some code on them, and we're going to walk through some of the sequence of interactions between the application services to give you a sense of what this is all about. My hope is that when you're done with this, even if the example doesn't directly map to your existing solution, it will be a foundation that lets you take some of these ideas, figure out how to lift them, see how they work with the different AWS services, and leverage them to the best of your ability in whatever solution you're building on your own.

So, enough caveats. Let's dig in. Before we can dig into the implementation, we need some kind of blueprint for what we're really going to build, what the moving parts are, and a landscape for our discussion. I'm going to start (thank you, clicker) with the obvious: anything we do in SaaS is going to be a multi-tenant solution. The particular multi-tenant solution we're going to look at here is what we'd call a pooled solution in the world of SaaS, which means all the infrastructure resources are shared by all the tenants in this environment. So when we look at storage, when we look at compute, when we look at all the moving parts of this, the idea is that all the tenant data is shared in that environment. DynamoDB, for example: when we look into DynamoDB, the items in the DynamoDB tables are going to be all the tenants' data commingled. So a lot of the policies, the approach, and the strategy we'll take
in here will be driven by the fact that we're a pooled environment. If this were siloed or one of the other flavors, the dimensions might look a little bit different.

The first step we're going to look at is the onboarding piece. For most people this is just: you go in, you sign up, you register. But if any of you have poked at SaaS a lot, you'll find that onboarding ends up being one of the most intensive parts of a SaaS application. Think about everything that has to get provisioned when somebody signs up as a tenant, and all the underlying configuration: the setup of the tenant, the setup of their identity, the setup of their policies. That's all part of the onboarding experience, and we'll spend a fair amount of time (not a huge chunk, but a fair amount) looking at that to get a sense for what it means to automate all of it inside a SaaS solution on AWS.

Once we've dealt with onboarding and have all the bits of that, we'll look at authentication. This probably should say authentication and authorization. We're not going to look at just what it means to authenticate a user; I think most people are familiar with identity providers and what it means to authenticate. When I look at this, it's really more about what SaaS does to authentication, because to me you can't just take vanilla, out-of-the-box identity and plug it in for SaaS. You have all kinds of additional considerations overlaid on top of it. In fact, I did an entire talk on this earlier this week, and for those of you who were there, I hope I warned you that lots of this will overlap with that talk, because identity is such a fundamental part of SaaS. You might be surprised that we spend as much time on identity as we do in this talk, but identity weaves its way through all the layers of our architecture. It affects the way we implement the app services, it affects the way we implement the partitioning, and you'll see its
story all over in here. So yes, I didn't want this to be an identity talk, but it's really hard to talk about SaaS without weaving that into the discussion.

Finally, we'll look at what it means to be a developer and write the actual application services that are part of this. In this case we're going to have an application that's decomposed into a series of microservices, and at this level we really want to ask: what does it mean as a SaaS developer to build a microservice that's multi-tenant aware, that thinks about all the things it has to think about? But also, what do we do to make this a good developer experience? For the person writing these services, how do we shield them from the details of multi-tenancy so that they're as efficient as they can possibly be? So we'll look at that bit.

Then, obviously, we're going to look at storage partitioning as part of this as well. How are we representing data? I said this is pooled, so in this case, how are we representing pooled multi-tenant data? Following microservices best practices, each of these services (not shown so well visually in this diagram) encapsulates access to whatever storage is on the other side of it. So we'll look at that whole experience, and at how it's affected by identity, and so on.

The next bit we'll look at, and spend a fair amount of time on, is another area that gets overlooked: tenant isolation. I was doing another session earlier this week, on Tuesday, in a room full of SaaS developers, and I asked how many of them were relying on authentication as their entire isolation scheme. Almost three-quarters of the room put their hands up. That's the definition for a lot of people: I challenged you at the front door, I made sure you were who you said you were, and I know your identity well enough that I'll just be sure to code it so that
nothing gets crossed. That's not enough in a SaaS environment. Your customers and your business, really the lifeblood of a SaaS business, demand that you keep really firm boundaries between your tenants. You can imagine what would happen in an environment where one of your customers accidentally got access to another tenant's resources or data. That can be a fatal blow for a business. So even though it doesn't come up in a lot of conversations, we spend a lot of our time in the SaaS universe talking about what strategies we can use, beyond authentication or connected to authentication, to enforce this isolation between tenants.

The last bits here on the slide are more about operations: metrics, analysis, management, billing. I would love to have a bigger chunk of time to talk about those in great depth as well, but in an hour-long session we can't really dig into all the operational bits. I want to be sure I say that those are every bit as important as the application bits of this equation, and I would definitely go look at the other talks we have. Another member of my team, Judah Bernstein, did a whole talk on tenant health and tenant metrics earlier this week, and I did a whole talk on metrics and those bits. So don't think those don't carry a lot of weight; we just don't have time to dig into them today. We'll touch a little bit on billing, we'll touch a little bit on metering, and that's about it.

OK, so what's the stack we built all this with? It's really hard, when you're building a sample and reference application, to find a set of tools that's somehow going to resonate with everybody when the audience is so large, but we've tried to pick a set of tools that would have a reasonable chance of matching some of what people are doing. We leverage AngularJS as our client. Already, since it's AngularJS, you can
imagine Ember and React and all these other tools showing up over there, so we'll probably have to continue to introduce new clients to give people an idea of what they look like. The client and the Angular bits of this are all just deployed in an S3 bucket, following general AWS best practices for deploying a web app.

The next piece in the puzzle is the API Gateway. There's a specific role the API Gateway plays in this implementation, and we're going to dig into the code for that, but I also want to say that, in general, if you're building a SaaS solution you should use some kind of managed gateway as part of your API. It doesn't have to be AWS's gateway. But think about what SaaS has to do: you often have really serious throttling needs. You'll have tenants at different tiers, so a basic tenant should be throttled differently than an advanced tenant. Another thing is that you'll generally find yourself exposing some kind of public API, and you get to rely on the gateway for issuing keys and doing all the heavy lifting of exposing a public API. In this specific example, we leverage something called a custom authorizer with the API Gateway to add another layer of security at the API entry point, so we'll dig into what that looks like.

Then, on the back end, we basically have a bunch of Node.js Express services. We built a very simple order management system. It's not meant to be a robust, go-ship-it order management product; it's meant to include just enough services that you can get a sense of what it means to implement a system in this model. So you'll see order management and product management services, but then a whole bunch of other services that are more about how you get a tenant on board, how you register them, and how you deal with the identity bits of this. And you'll see
that a lot of these are using DynamoDB in a multi-tenant fashion. These are all deployed in an ECS cluster, so it's a very classic microservices model.

This last bit I wasn't sure where to snap onto the diagram in a way that was useful, but we are definitely using Cognito as our end-to-end identity solution, and we're using many parts of Cognito in this solution. We're also building another solution, a mirror of this one, that leverages Okta as the identity provider, just because we want to demonstrate that this doesn't have to be a Cognito solution. You get certain nuances out of Cognito and certain nuances out of Okta, and we definitely want to show variety in that respect. But today there'll be lots of discussion of how Cognito was used.

OK, so you go download this. You go to the page where you get the Quick Start, and you run the provisioning to set up the whole environment to run your app. What is the underlying infrastructure that gets provisioned as part of this experience? What you'll find is that, like most of the Quick Starts on AWS, we're following a traditional best-practices, high-availability, scalable architectural pattern. So you'll see we've got a VPC, and of course it's a multi-AZ VPC, so we get the high durability of multi-AZ. At the front, like we said, S3, API Gateway, and the Lambda function that is our custom authorizer are the entry points to our environment. We obviously have to have public and private subnets, which is just part of conforming to best practices, so we've got a NAT gateway in our public subnet, and then that's load balanced into the private subnet that hosts all of our actual application services. Finally, there are our app services, all running in an ECS cluster, auto scaled, and so on. Probably any of you who've been working with AWS for a while are seeing that this
looks very much like the blueprint of any HA architecture you'd see on top of AWS.

Now we can start shifting into onboarding, and we want to look at two views of onboarding. I thought about leaving this slide out, because it's still not code yet, which is where we really want to get to, but I think it illustrates some important points. This is the onboarding experience you'll see if you just go run the app. Like any other onboarding process, I put in my name and my other bits, and I tell it a little bit about my SaaS configuration: which tier I am, things of that nature. I get the success message saying, hooray, you're going to get an email. The system emails me and says, here's your temporary password, go log in. I go to the system, I log in, and the system says, oh, you're a new user, go change your password. It's something we've all done a million times.

The reason I put this here is the code behind it. When you go download the sample, you're not going to see a bunch of code doing all this work, because Cognito is doing most of this work for you. I wanted to point out that sending the email, the configuration of whether I'm MFA or not MFA, what my password looks like, whether I'm going to be challenged to supply a new password: all of that is orchestrated by Cognito, and it's one of the upsides of that. By the way, the other providers have their own ways of orchestrating it, but the good news is it's not really a part of the solution you have to build much of yourself.

Now let's go look at what the services under the hood of that experience are doing, because, if you remember, at the outset I said there's a whole lot to this onboarding process. There's a lot you've got to do here, and there's a lot of heavy lifting you may not be considering. Let's talk about it. So I push that register button, and it says, success, you've registered.
What's that doing under the hood? Well, the first step, obviously, is that it's going to hit this tenant registration service. I have a REST entry point with a reg on it, and this tenant registration service is going to be the orchestrator of all the moving parts we have to create as part of onboarding. There are three major legs to that orchestration. First of all, I have to create the user and the identity footprint of my app. So yes, you signed up as a user, and I've got to get you in as a user, but I also have to provide all the mechanics that are necessary for all the users of the system, who are going to need identities themselves. The other piece I'm going to have to provision is the tenant itself. The tenant is an entirely separate construct, and we have to configure and set up the tenant. The last piece of that puzzle will be the actual billing interface.

So let's walk those three legs, starting with the user and the identities. Tenant registration calls our user management service, and that user management service is going to create all the things needed to both create the user and create the identity profile. The first thing it does is go out to Cognito and say, I'm going to create all the pieces that are needed for Cognito to support the identity profile I want. There are lots of pieces to that. The first thing I want to point out is that we're using Cognito in this example as an OpenID Connect provider. OpenID Connect is implemented by tons of providers, but there's some real goodness in the OpenID Connect implementation that lets us take the traditional notion of identity and bind to it the notion of tenant identity. What we really need out of identity isn't just who you are; we always have to know who you are in the context of the tenant you're associated with. That's the extra piece we're going to leverage out of OpenID Connect to make that happen. So when we
finally get under the hood of this, you'll see what it's provisioning. The first thing it provisions is a user pool. For those of you who haven't used Cognito, a user pool is just a grouping of users, and the good part of a user pool is that it lets you configure policies for each one of those pools: am I MFA, what's my password policy, all those things. What we've decided here is that we're provisioning a user pool per tenant, so that you can configure those policies on a tenant-by-tenant basis. The other piece we have to provision is an identity pool. Cognito has a federated identity model, so you have to bind the user pool to a federated identity. Then, of course, we have to actually provision the user. This is the user who just signed up, who is the tenant administration user in your system.

The last bit is the custom claims, and this is all related to OpenID Connect. OpenID Connect has a standard set of claims, and then it has the ability to configure custom claims. Those custom claims are where we're going to put the tenant attributes. That seems like not such an important thing, but it's actually very important as we go through the rest of this implementation. So in there is the tenant ID, the role, the company, the plan you signed up for. All those bits are in those custom claims, and they become a first-class concept in your identity solution.

Now, most people would think that's enough: you're ready, you've got your identity bits. But there's another piece of this equation. We said we want isolation, and the way we're going to get isolation is through IAM policies. So as part of provisioning this user, this tenant, we also have to provision all of the policies that are going to be needed for all the types of users. While this looks like a provision-a-user kind of experience, remember it's really provisioning a tenant's identity profile, and that means provisioning policies for every
single type of user you're going to have in the system. Once we're done with that, we've got that part done, and it's successfully created. The other piece we have to think about is that we're going to create a mapping between the user and the user pool. That's the only job of the user management storage: to be able to say which user pool a user is in. Everything else about the user and their profile is stored in Cognito, not in the data that's managed by user management.

Finally, the tenant identity is created. Out here we're going to create one entry in the DynamoDB table that is our tenant. It's going to have the tenant ID, the plan, and whether they're active or inactive, and you can imagine all kinds of additional configuration and policy data related to a tenant here. The important concept to notice is that once this is all in place, we're still only going to have one tenant. As new users are added for that tenant, no new tenant is added; they're all just bound to the existing tenant you created in this process.

I just wanted to show you this quickly, because I said these custom attributes, these custom claims, are important. This is a snapshot of a screen straight out of Cognito. It's probably hard to read, but it shows you the custom attributes that were created by the end of the process to convey tenant ID, role, etc.

The last step in that trifecta, which we couldn't fit on that screen, is the billing bit. There are going to be three caveats in this talk where I tell you about code that is not in the download you're going to get, because these are all areas we're still adding, but I think they're very important for you to think about as you're building. Billing is one of them. It's not in the download, but I want to show you what it looks like. The last leg in this tenant registration is that we're going to go out and tell some queue, hey, provision a tenant. Now, why did we do that in a queue when everything
else wasn't in a queue? The truth is it all could have been in a queue, with general, good asynchronous messaging. But there's a very specific reason I'm relying on and showing a queue here. When you've built all this provisioning, and somebody has gone to all the trouble to sign up and become a tenant of your system, you don't want to lose them at the billing step. I want this to be a very fault-tolerant mechanism, so I put it in the queue and say, yes, I've asked for an account to be created, but if that account doesn't get created for another hour, it's not going to affect the tenant's overall ability to get in and start using the system. So I put that in the queue, and then I have a service that essentially sits there, pulls the message out of the queue, and goes to the billing system and says, provision that experience. Here's where we can say, if that billing system somehow isn't ready now, we can have some kind of retry logic or fallback logic and really make this as robust and fault tolerant as it possibly could be. For most of you, billing systems are often an external third party, and even if it's internal, it's some other system that's often outside of your control. Anytime I have an integration with a third-party system or an outside source, I'm going to do everything I can to give myself fault tolerance on that boundary.

OK, so we've set up our tenants and done everything to onboard them. Now we're ready to come in and sign in as a tenant. What does that look like? Well, it's pretty straightforward: you hit the web app. What's not straightforward is this. In a lot of diagrams, the web app would redirect to the identity provider, and you'd immediately let the identity provider handle the auth process. But we made a choice, and I think it's a bit of a controversial choice in our implementation of the sample, to say we're going to use a user pool per
tenant. As part of using a user pool per tenant, before you can auth against Cognito, you've got to know which user pool the user is in. All I have when you log in is your user ID and a password. I don't know which user pool to auth you against, and I'm not going to go look through all of them. The downside of putting this auth manager in the cycle is that I've now introduced a point of failure and a point of scale that could be a challenge in my architecture. It's a conscious choice, but we may reconsider whether that's the best way to go, or whether we should let the identity provider directly be that unit of scale, which would mean a different model for how we represent users in Cognito.

Either way, the auth manager says, I need to figure out which user pool you're in. So I hit the user manager and look up your user pool (if you remember, in the earlier slides we showed that mapping being created), and then I go out to Cognito and ask, is that user pool out there? It validates that it's out there. Obviously, if it's not out there, we're going to come back and indicate that you didn't successfully log in. But assuming it's found, I'm going to auth you against Cognito, and, based on OpenID Connect, I'm going to get back a JSON Web Token, a JWT. If you remember, earlier I obsessed over the fact that we were using OpenID Connect and these claims. This JWT has in it, as one token, both my user identity and my SaaS identity, and that's now going to be accessible to all of my services, my partitioning, and everything else as it flows through the system. If you learn nothing else from that diagram, focus on that token in the top right, because that's the outcome we really want.

Now, application services. I'm going to write an app service. I need a product management service that's going to say, go get me all the products out of your catalog, and I hit that product service. So my question, when I made this slide, was like,
what's different for me as a developer? When I wrote these services and we built them into this sample, what was different? Well, the interesting thing is that when we wrote the services, the whole goal was for people to write these services as if they almost didn't know they were multi-tenant. I'd like them not to be adding all kinds of code into their solutions to figure out how to resolve a tenant or figure out what policy needs to be applied. I wanted to make that as straightforward as possible. So I'm going to do little framework-y things here to take common concepts that are about multi-tenancy and hide them away from the developer, so that mostly they don't have to care about that stuff.

Let's look at what that means with an example. Here I get a request in. You'll see that I get the GET for my products, and in the header of my request is an authorization header. Nothing special about this: it has a bearer token, and that bearer token is the JWT I've been obsessing over. It's encoded, but it's in there, and it's passed in the context of every call. Service-to-service interactions through my app services are all going to use this same mechanism as a way of conveying the context of tenant and user as we go through them, with one of the key goals being that I don't have to leave the service or go call something else to figure out which tenant you're part of. It's all in the header; it's all in the token.

So now I get that, and as a developer I just want to say, go get all the items for this particular tenant out of the database. But I don't know who you are as a tenant, and I don't know what security credentials to call with. That second part is something some people don't think of. They think, I just know the tenant, and that's all I need to know. No: I still want a scoped set of security credentials, based on our isolation goals, that will scope your access to that data and make sure that, no matter what
you do, you aren't able to see a piece of data you're not supposed to see. As an example, let's say that, as a developer, I just manually put in some tenant ID, whatever tenant ID I want. Without any credentials in the picture, I can go see anybody's data. But if I add the credentials to that equation, then no matter what you put in the tenant ID, if you're not allowed to see that data, you're not going to be allowed to get to it. So those are the two pieces of this puzzle.

What we did in this solution, which is what every developer probably does, is say, let's make some kind of library that takes care of managing all the token-related things we need to do. The first one was that we just need a tenant ID. How do I get a tenant ID? I want developers to be able to say, I've got a request; hey, token manager, go figure out what my tenant ID is. So you call this helper function on the token manager, it cracks open the request, decodes it, pulls out the tenant ID, and returns it. That becomes the standard way for everybody to get their tenant ID. The truth is that if I have a data access layer, I could embed all of this in the data access layer in such a way that the service developer wouldn't even need to make this call; it could all be done in the data access layer. That's not how the implementation works today, but it's probably something I'd like to do to make it better.

The other piece I said you had to have is these credentials, and we're going to obsess a lot over these credentials and how you get them. But on the surface here, you'll see that I go to the token manager again, give it the request, and say, give me a set of credentials. Remember, I just have a token. I want a secret key and an access key that are going to say, contextually, this is what you need to get to this piece of data. So I'll make that call, and back will
come the credentials, and those will be the credentials I use when I go to the database and say, give me the items out of my catalog. So that's the underlying implementation. I think you can see how this would be minimally invasive to the developer of the product management service but still deliver some good multi-tenant value.

Now I wanted to peek briefly into what's in that get-credentials-from-token bit. There are a couple of phases to this, and we're going to see code multiple times along the way. This first one is more about how we got those credentials back, without showing how the mapping is actually done under the hood. You'll see here that getCredentialsFromToken essentially drops in, looks at the request, pulls the authorization header out of the request, parses out the token, and decodes the token. Then there are a couple of calls made here. These are internal calls that we'll go into in a little more detail, but their goal is essentially to go out to Cognito and say, I have a token, and I need the appropriate credentials that go with it. Those credentials come back. For those of you who are familiar with Node, this uses a callback function: you see updateCredentials passed in at the top, and that's the callback. At the end of this function, you'll see that I get the results from making this mapping and I call the callback function, sending the results back. I wish we had a little longer to dig into this one, but you get the basic idea, which is that this is isolated in one spot, doing that mapping for you, and should be reused by everybody.

The other piece of security we want to add to this is the custom authorizer I talked about earlier. Right now, what we looked at was inside the service: what were we doing there to apply security?
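
To make the token manager helpers described above more concrete, here is a minimal JavaScript sketch. It is not the sample's actual code. The `custom:tenant_id` claim name, the `getBearerToken` and `decodeClaims` helpers, and the `exchangeTokenForCredentials` parameter (a stand-in for the calls out to the identity provider) are all assumptions made for illustration. The sketch also decodes the token without verifying its signature, which a real service would have to do before trusting any claim.

```javascript
// Hypothetical token manager sketch; not the sample's actual code.

// Read the bearer token from the request's Authorization header.
function getBearerToken(req) {
  const header = (req.headers && req.headers.authorization) || "";
  const [scheme, token] = header.split(" ");
  if (scheme !== "Bearer" || !token) {
    throw new Error("Missing bearer token");
  }
  return token;
}

// Decode the JWT payload. This does NOT verify the signature;
// a real service must verify it before trusting any claim.
function decodeClaims(token) {
  const parts = token.split(".");
  if (parts.length !== 3) {
    throw new Error("Malformed JWT");
  }
  // Node's "base64" decoder also accepts the URL-safe alphabet used by JWTs.
  return JSON.parse(Buffer.from(parts[1], "base64").toString("utf8"));
}

// Standard way for a service to resolve its tenant ID from a request.
// "custom:tenant_id" is an assumed claim name.
function getTenantId(req) {
  return decodeClaims(getBearerToken(req))["custom:tenant_id"];
}

// Map the token to tenant-scoped credentials and pass them to a callback.
// exchangeTokenForCredentials(token, claims, done) is a hypothetical stand-in
// for the identity provider calls; done(err, credentials) reports the result.
function getCredentialsFromToken(req, exchangeTokenForCredentials, updateCredentials) {
  let token;
  let claims;
  try {
    token = getBearerToken(req);
    claims = decodeClaims(token);
  } catch (err) {
    updateCredentials(null); // no usable token, so no credentials
    return;
  }
  exchangeTokenForCredentials(token, claims, (err, credentials) => {
    updateCredentials(err ? null : credentials);
  });
}
```

In this shape, a service would call `getTenantId(req)` to build its query and hand the credentials delivered to the callback to its database client, so the tenant ID and the scoped credentials both come from the same token.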
the custom authorizer, I'm wandering out to the outer edge and saying, what am I going to do at the API Gateway as another layer of security in my system? It's not that if I get security at one layer I'm done. I would love to put security at as many layers as I can so that I have more confidence my security isn't going to fall down on me. So what this custom authorizer lets me do is say, I've got this token coming in, it's got all this goodness in it, I know who you are as a tenant, I know who you are as a user, and I can connect this Lambda call to your API Gateway entry points and say, for these REST entry points I'm going to inspect what's in there, determine which methods you're allowed to get to and which you're not, and control access to your services so you never even get to a service if you're not supposed to be there. If we look inside this, this is the Lambda function that we implemented. It's a snippet of a much bigger chunk of code, but you can conceptually see what's going on here. The first thing we do is pull the JWT out of the incoming context from the API Gateway, and then we construct a policy. A policy is the mechanism we need to convey back to the API Gateway which methods are allowed to come through and which are not. So we construct that, and now we inspect the contents of the JWT, and you'll see a line here where we check whether this JWT is for an admin. If it is an admin, we set the policy to all methods allowed; you can access anything you want. If you're not, we constrain it: you can only do GETs, and you can do one POST, which is on this users path. Then you'll see we do a context.succeed, and it goes back, and that policy is then used by the API Gateway to control access. To me, if you didn't even have access,
say you didn't do all the bits under the hood with the token and the mapping that I talked about earlier, you could at least include this at the outer edge as a nice way to say, let me at least inspect the token enough that they can't get through the API unless they have the right privileges. Now data partitioning is kind of interesting, because when I put this together I thought, data partitioning, there's going to be a lot to say here. But data partitioning ends up being not so complicated, and the interesting parts of it are more about how we control access to the data than how we actually partition it. If you look at what we actually implemented in this solution, in a pooled model with DynamoDB and a shared table, we're essentially going to use the partition key of DynamoDB for the tenant ID. So now when we access data, we first have to fill in the tenant ID as the partition key, and then your range key will become whatever your prior top-level key would have been, so in this case product ID. Now we've also implemented, and this is caveat number two, this S3 bucket. We're in the process of adding this to the solution; it's not in the version you'll get, but you'll see it soon. We're adding an S3 bucket with partitioning to show you what it looks like, partly because it's nice to add product images to our catalog, but also because we want to show how object tags can be used as a mechanism to partition data in S3. But the partitioning itself is not nearly as important as the fact that you should probably put helpers between you and each of your storage units that hide away the details of how you're partitioning the data. The best example I can give here: let's say we're using this pooled model for this DynamoDB table on the left, and tomorrow we said, you know what, based on feedback from customers or some operational profile we're seeing, we
really need that to be siloed data now. It can't be pooled; we need every tenant to have their own table. Well, in that model I would prefer that the app code itself doesn't have to go account for the fact that it's silo versus pool. I want that to be hidden away by whatever framework or tool or library sits between me and the table itself. Now this is the code that is the second step in acquiring the data. We talked about how you get the credentials for the code, and that's the first line of this. But now that we have the credentials that come back from this callback function, what do we actually do to call DynamoDB to get the data for the tenant in that context? The first thing we have to do is populate a set of search parameters. Pretty straightforward: what table we're getting it out of, and we're looking for a value where tenant ID equals a specific value, based on the tenant ID, which is context I already got outside the scope of this function. And then the line that seems so innocuous, but is the whole reason this slide exists, is this construction of the DynamoDB helper. The DynamoDB helper just gets constructed and we use it to call the query. You'd say, why is that line relevant? Well, that line actually ended up being a huge source of challenge for us when we built the sample solution, because you'll notice the second parameter it takes when I construct it is the credentials that come in. We initially started by leveraging all kinds of third-party helpers that are out there for DynamoDB, and what we found is that a lot of those solutions rely on a credentials configuration that leverages the general AWS ways of globally configuring your API config and all those other bits, and then they just pull it in as the context for the credentials. Even when you try to override them with some specific set of credentials, you don't successfully override them, so you end
up with a broader scope than you ever want, or you end up with one notion of credentials. So we actually had to build our own DynamoDB helper wrapper around the API with the express goal of saying, on a call-by-call basis, you're reacquiring and reapplying the credentials for every single call, and that makes sure you have the scoped access to the data that you need. I'm hoping there's ultimately a better answer to this, or a third-party tool that lets us do what we want so that it fits more with the normal way of working, but that's why all that code exists if you go look at it. So I haven't really answered the question I posed at the beginning, though, which is: we had identity, we have these IAM policies, but how are those IAM policies controlling my access to data? So far nothing you've seen in the code would have stopped you from putting in a different tenant ID and seeing somebody else's data. Where this gets applied is in the way that you use IAM policies; that's part one of the problem. So let's look under the hood at the IAM policies that were provisioned. If you remember, in one of the early slides I provisioned all those IAM policies. I've pulled one out here, and this is just one example of the many IAM policies that get provisioned for a tenant. I also wanted to convey the idea that obviously these policies are per-role policies. This happens to be a read-only policy on the order table that is used for a tenant, but if I had an admin or a system admin or a different role, the privileges would potentially be different. On this one it's read-only, and you'll see the actions are constrained to read operations only, no write operations. And then the real caveat is down in that condition at the bottom. In that condition you'll see that I use LeadingKeys, and you'll see a specific value in the leading key. It's not so obvious, but that's a tenant ID, and what that says is whoever's running in
the context of this IAM policy will only be able to access items that have that leading key value, and that's where the IAM policy enforces and controls that cross-tenant access. Now the last piece of this isolation and identity puzzle, which is being used across all of this, is a piece I left out at the beginning, because it's a nuance that often gets lost in the implementation of this particular solution. If you think about it, at the beginning we configured IAM policies and we configured Cognito users, but there's nothing when you auth that suddenly binds that user to an IAM role. Those IAM policies just exist out there on their own. There's nothing in the policy that references the user, nothing in the user that references the policy, and yet somehow under the hood when we do this the two get bound together. So let's look at how that works. By the way, I didn't write this part of the solution; another member of the team, Judah Bernstein, worked on it, and in fact I came back to him with this exact question: I'm not seeing where that mapping is happening. If you look at the user pool, when you configured it you configured these custom attributes, and one of the attributes you configured was role. Then when you configured the federated identity, you gave it the user pool and the application ID. But the real secret sauce here is these role matching rules. The role matching rules allow you to configure a mapping from an attribute to an actual set of policies, and those role matching rules, which are enforced and applied by Cognito itself, are the thing that makes this binding that is not so obvious. Now if we look at the code that underlies this, if we were to go back to the provisioning code from the onboarding process and look at where we're creating all those policies, this is a snippet of the code you would have seen. You see here that I'm defining each of
the roles, and you'll see this sort of expression here where you say the role equals some value. In this first one at the top it looks like it's a system user, so if the role is a system user, here's the particular role I want it to bind to. If the role is a support-only user, then it binds to this particular role, and there's obviously a whole collection of these. If you open the Cognito console and go look in there, you can actually see these matching rules. So to me this is something that Cognito can do, because it can bind to IAM in a way that's kind of special, that lets you short-circuit the process a little bit. The last piece of this is the actual code that you run at runtime that forces this mapping to happen. If you remember, at the beginning of this we looked at some code that said get credentials from token, and I said there are a couple of functions in there that translate your ID token into the appropriate security credentials. Under the hood of one of those calls is this particular call, GetCredentialsForIdentity. It takes first of all the federated identity pool ID, and it takes your token, but inside your token is your role, so the role that's needed to make this mapping is already in there. Then when we get to this Cognito API call here, which is GetCredentialsForIdentity, and we pass in these parameters, Cognito will look inside the token, see what role you are, evaluate those role matching rules, find the appropriate policies, and construct a new token for you that is now scoped by that role's set of policies. And just because I think it's kind of convoluted, I tried to have one last slide that connects all the dots on this. So your service under the hood at some point says GetCredentialsForIdentity, passing that ID token. At the top you can
see the ID token, the cracked-open JSON version of that token, which has the role that you're interested in. Then Cognito will go to the matching, it returns the scoped credentials, and eventually you have a secret key and an access key. Now if you were not using Cognito here, you could still map the role to a policy, but now you'd use the AssumeRole API that's out there. So now it becomes your job to do the assume role, and you'll eventually get back an STS token that is the equivalent of the credentials that we get at the end of this process. So again, Cognito has some nice bells and whistles here that make this a little easier, but it's totally achievable outside the context of Cognito as well. Now everything we've talked about so far has been in the context of the server: how does the server implement storage, how does the server implement services, because I think that's where most people are focused in terms of how to build all this. But there is a multi-tenant universe that we have to think about for the client. I don't want to spend a ton of time on the client, but I think it would be wrong to leave the client out, because it plays a role: how does it deal with identity, how does it map it into the implementation? I also think it's very specific to individual applications and even technology stacks. .NET, Java, Angular are probably going to approach this slightly differently. But you do have to think about your app, and specifically this app has one application that is addressing the needs of system users, tenant users, and all the different variations. There are six or seven variations of users, I think, and each one has a slightly different experience. We did that on purpose so that you could see the IAM policies applied, but also see different flows in the application go on and off based on your privileges. So we have an admin experience where
somebody can go out and view all the tenants in the system. They can deactivate and activate tenants, controlling whether a tenant is allowed into the system or not. They can create new system users with different sorts of views, so you can have a system user who sees all tenant data, or somebody who's read-only and is maybe more of a support role. And then we have a tenant admin. A tenant admin has a very different experience, because a tenant admin is allowed to create other tenant users, and they're allowed to see metrics and configure policies. And then we just have a basic tenant user, somebody who just consumes the application, and you can imagine even different roles at that level. So what does that look like under the covers? How are we applying security and controlling the paths through the application? Well, if you're familiar with AngularJS at all, there's no magic here. We essentially want to say, when we authenticate, yes, we're interested in that token for all our authorization on the server side, but on the client side we still want to pull that token out and pull the data out of it for client-side use, for the pieces that we need in the context of the client. So here you'll see this thing called $rootScope. The root scope in AngularJS, just think of it as a global context that I can reference throughout my application. You'll see this token that comes back, response.data.token there in that second line; that's my bearer token. I have to decode that token, so I use this little module called a JWT helper that decodes it for me, and once it's decoded I can just poke into it and pull out the attributes that are interesting to me on the client. So I pull out the name, because I want to display your name on the screen somewhere, and then more importantly I pull out your role, and when
I pull out your role, I'm going to leverage your role on the app side to control all the paths that you can get to. I don't want to dig into this one too much because it's so specific to Angular, but what we did here is say, there's a set of policies that determine on the client which paths you can navigate to and which screens you can see in which contexts. So we created this centralized function off the root scope called "is link enabled". If you look at this function, it takes a view location, and that view location is essentially the path you're trying to navigate to: are you trying to get to tenants or users or orders or products? Then it says, I'm going to look at your role that I got from the token and evaluate whether or not you're allowed to see that path in the application. So here you'll see, if you wander down a little bit, if the view location equals /tenants, I'm going to call a little helper function (because I had to squeeze all this on the screen) to see if your role is compliant with a system user. If it is I'll return true, and if it's not I won't, and that's how I'll control whether or not that link is there. So this "is link enabled" ends up being referenced in our HTML. If you get all the way out to the HTML side of this and say, I've got this link that lets you get to something on the screen, now I'm going to use these Angular notations. ng-if is just an Angular notation for invoking the Angular framework to evaluate this if clause, and this if is going to call the "is link enabled" function that I showed you just a second ago to say, I'm interested in the /users view location; am I allowed, based on who I currently am, to get to that location? If I am, I'll use this "is link active" to be able to
deactivate or activate that link. I'm less worried about the HTML bits and the Angular bits of this. What I'm really worried about, and maybe what I could have just presented, is the flow at the bottom here. Really I just want to say, whatever technology you're using, extract the role from the token that comes back, define some process, hopefully a centralized one, for evaluating somebody's access privileges, and then use the lightest-weight mechanism you can in your HTML to enable and disable that access. The most important thing here is that whatever I do on the client is just the client's notion of security. It has nothing to do with the server side of security. So if somebody implements something in the client and enables a path that they weren't supposed to enable by mistake, when that token finally gets to the server side, I don't care about any of that. I'm still going to do all the things we talked about. I'm first going to look at the API Gateway level and, with the custom authorizer, see if you're allowed through, and then even when you get all the way into my service and you try to access some resource, I'm going to use privileges to make sure that you can't get to a resource you shouldn't. So this is what I'm saying: isolation wanders across this whole experience, even when you don't intend for it to. Third caveat: metering is not currently in the sample solution. This is the last of the caveats. Metering is being added to the solution right now, so we ultimately want to see metering and billing both added, along with the S3 partitioning. Those are the three that are under development but not released yet. But I still feel like you can't build a SaaS system and not talk about metering, because metering is at the core of billing, but it's also at the core of a lot of other analytics potentially. So I'm showing a sample here of the metering
solution we're looking at right now. You'll see our functions come in, our services are accessed through the API Gateway like we said, and we're instrumenting them with a module here. It could be an agent, it could be anything that's appropriate for your technology. Those metering modules are responsible for surfacing the metrics of our system. You'll notice they consult this tenant manager in the middle, and that tenant manager has configuration. That's because metering can be tenant-specific. I can have specific SLAs for each tenant, I can have configuration that will affect your metering experience, so these metering agents have to consult that configuration to determine what to do when they're metering your data. After that, if I've got some way to surface the metrics, it's just going to go through some mechanism that aggregates that data. So here I'm going to go to Firehose to aggregate it, collect it, and put it into Redshift, and then I'm going to have some billing aggregation piece that will pull out the metrics that are most relevant to billing and send them to the billing system. It could be pull or push, but in this case I think it will mostly be push: we'll push those metrics to the billing system, and then billing will be able to generate the bill. I mostly don't want people to overlook the need for this as part of their implementation. Now, we wanted to be able to say, we've got this system admin experience and this tenant admin experience; what's the difference between them? So we introduced a kind of cheesy little dashboard, not very visually appealing necessarily, but it shows you the service health. If the services are all running, you'll see a little green check box here, and obviously if the microservices go down you'll see a red X. It was included here to say
this is a system admin's view of health that's across all the tenants. Then we have these metrics down below, which are just how many products, how many orders, things of that nature. That was meant to be both the system admin and tenant admin view, and it was a way to show you what a different experience potentially looks like, especially on the client, as we apply different roles. Nothing exotic about it, just something I wanted to call out because it's an example of something I think you should include in your app. Now, what are the takeaways from all of this? What are the bits I'd like you to leave with? I hope you realize, after you've seen all this discussion of roles and isolation: we were supposed to be here to talk generally about architecture for SaaS, and we had all this discussion of identity and the way identity gets bound and applied as we access data. But that is what multi-tenancy is all about. Multi-tenancy is: how do I get tenant context, and how do I appropriately apply it as I access data, as I implement my services, and how do I provision an experience at the beginning of all this that supports all of that? It's more fundamental than a lot of people think. Hopefully also, by looking at IAM here and seeing how we use policies to control scope, especially in a pooled model, it looked like a powerful construct to you. I personally love using this in my multi-tenant solutions, because I feel like no matter what a developer does, no matter what somebody on the UI does, when you finally get down to accessing that resource I'm going to have some confidence that somebody can't cross a boundary. So I'm going to invest the time and energy in building those IAM policies. I will say, as you go service to service, the way that you partition is different, and because the way you partition is different, the way you build
those IAM policies is also different. Silo versus pool is different, so you have to think about that as you come up with a scheme for building your policies. A general goal, and I've already beaten this one to death, but hopefully it's very clear: we want to keep all these details as far away from the developers as we can. We just want to continue to write the app, and we don't want to feel like, because we're in the multi-tenant universe, suddenly writing an app got way harder. No, we want to solve these problems, solve them right, and then mostly have them baked into our infrastructure and our approach in a way that we can just charge forward and feel good that the system will support this. Then we'll focus more of our energy on scale and how we deal with multi-tenant load and things of that nature. Those will always be huge operational challenges, and part of another talk entirely. This next one we didn't really hit on a lot, but if you noticed, we had a lot of discussion of system users, who can see all the tenant data, and then we talked about tenant users and what it means to be a tenant in the system, and even within those there are different flavors. A lot of people don't even think about a system user when they come in. They don't think, how should we provision and manage the scope and access of our own people who have to touch the system? Everything becomes about the tenant, but these system users are more dangerous and have more potential to do something you don't want them to. They should all be sharing a common approach to how you're enforcing it. I want system users to have a whole set of IAM policies of their own. Even though we didn't get to hit on metering here, I'd feel wrong not saying that these are key fundamental concepts for SaaS, and if you make them an afterthought, they often
end up being a lot less valuable than they could have been, and your billing ends up being a lot less interesting than it could have been. I love to see SaaS organizations struggling with: what is the right thing to bill on, how can we meter for that, how do we get a better view of what tenants are doing in our system? I'd like that to be an early discussion, and I don't want you to put that off until the application features are finally getting rich. And then finally, I want you to work multi-tenant. We had this very early diagram where we showed onboarding, app services, storage. I don't want you to work that in silos. The best solutions to me are ones where somebody says, we're building a brand-new solution, what's your first sprint? Somebody logs on, somebody writes a product to the catalog, and then you get the product back out, and that's it. That's a great first sprint. Do you know how many pieces of this you have to get right to make that work? All those policies have to work. Can I verify what they can get to and can't get to? How do I resolve the tenant ID in the service? If I just do that as my first slice, then I can start adding all the depth that's needed. And by the way, DevOps isn't in this discussion, but I'd make the same comment about DevOps along that path as well. Finally, resources. I promised a link to where this is. You can see it's SaaS Identity and Isolation with Amazon Cognito. It says identity and isolation; I sometimes wonder if we should have rebranded it, because it's truly a full reference SaaS implementation with microservices, but obviously as you see from the talk, identity and isolation are a big part of that discussion. There's also the source repo, and if you do nothing else you can just go get the PDF out of the repo; there's a nice big document there to cover that. And then the last one is just
general SaaS content. There's a whole bunch of SaaS content out there if you go look at this last link, around all of these topics with a lot more depth, especially for the areas we couldn't get into in much detail today. And that's it. Thank you so much. [Applause]
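
The custom authorizer logic described in the talk (pull the JWT from the request, check the role, allow every method for admins, otherwise allow only GETs plus one POST on the users path) can be sketched roughly as below. This is a minimal Python illustration, not the reference solution's Node.js code. The role names, the `custom:role` claim name, and the `user` resource path are assumptions made for the example, and signature verification of the token is deliberately left out.

```python
import base64
import json

# Hypothetical role names; the real solution defines its own set of roles.
ADMIN_ROLES = {"SystemAdmin", "TenantAdmin"}


def decode_claims(token: str) -> dict:
    """Return the JWT payload as a dict.

    NOTE: this does NOT verify the signature. A real authorizer must
    validate the token before trusting any claim in it.
    """
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def build_policy(principal_id: str, method_arn: str, role: str) -> dict:
    """Build the response a Lambda authorizer returns to API Gateway."""
    # method_arn: arn:aws:execute-api:<region>:<account>:<apiId>/<stage>/<VERB>/<path>
    api_arn, stage = method_arn.split("/")[:2]
    base = f"{api_arn}/{stage}"
    if role in ADMIN_ROLES:
        resources = [f"{base}/*/*"]  # every verb on every path
    else:
        # Assumed restriction: all GETs, plus POST on one (hypothetical) path.
        resources = [f"{base}/GET/*", f"{base}/POST/user"]
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": "Allow",
                    "Resource": resources,
                }
            ],
        },
    }


def handler(event, context):
    """Entry point for a TOKEN-type authorizer event."""
    token = event["authorizationToken"]
    if token.startswith("Bearer "):
        token = token[len("Bearer "):]
    claims = decode_claims(token)
    role = claims.get("custom:role", "")  # assumed claim name
    return build_policy(claims.get("sub", "unknown"), event["methodArn"], role)
```

Because the policy is computed at the gateway, a caller whose role does not allow a method never reaches the service at all, which is the extra layer the talk describes on top of the IAM-scoped credentials used inside each service.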

Info

Channel: Amazon Web Services

Views: 20,744

Keywords: AWS re:Invent 2017, Amazon, Architecture, ARC407, AI, Connect, Security

Id: kmVUbngCyOw

Length: 56min 31sec (3391 seconds)

Published: Fri Dec 01 2017