Multi-tenancy OAuth with Spring Security 5.2

Good afternoon. Welcome, and thanks everybody for coming. I've never been to a conference that has popsicles. That's never happened, and it's funny to look out in the audience and see people with popsicles at a technical conference. It's a pleasure to be with each one of you. I'm really excited to see the interest in the multi-tenancy story here at SpringOne, and to see your particular interest in how Spring Security plays into that and, more specifically, what the OAuth story looks like. This was a major theme for us in Spring Security 5.2: ensuring that our multi-tenancy story was more sensible than it has been in the past.

My name is Josh Cummings. I'm on the Spring Security team. I've had the pleasure of working with Rob Winch and Joe Grandja for a couple of years now. It was shocking to me the other day to realize that it will be two years in just a couple of months. This year we welcomed two new people to our team, Ria Stein and Filip Hanik, and I personally am so stoked with what we've been able to do in this release, with so many individuals who are passionate about application security dedicated to a common cause. It's been really very cool, and I am honored to have the opportunity to talk to you today specifically about what we added in Spring Security 5.2 for multi-tenancy.

I have a brief announcement, which is that the vast majority of what I'm doing today is GA. It came out in Spring Security 5.2. There will be plenty of demo code that I'll be writing, so use your best judgment when you decide what to copy and paste and what patterns to use inside of your application. Depending on questions, we may get into some things that may eventually be rolled into Spring Security but are not GA yet, so again, just use your best judgment, and thank you for considering your own use cases as you decide what to apply.

So let's get started. What I've got here is a simple application that I
would like to make multi-tenant. The story of turning something like it into a multi-tenant application isn't super common, or hopefully it's not super common, because that's a rather challenging goal. That's not the goal today, but we are going to start with a single-tenant application just to be able to lay out the kinds of things that are necessary to consider when choosing to make an application multi-tenant. We're going to talk about the security implications both of authenticating and of data modeling, so we will get in touch with our data side just a little bit, talk about different ways to model tenancy underneath, and look at some of the security implications of those as well.

This is a simple application. If I click my little selector page there, then I can log in as one of our team members and I get back to the application. It's a pretty simple, fully functioning OAuth application. It's called SML because on Saturday afternoon I realized that if I renamed it from Secure Mail to SML, then you could send your mail via "snail mail". Because I have six children and a lovely wife, it is my obligation to make my children laugh and my wife roll her eyes as many times as humanly possible, so you can see how it was incumbent on me to make that change and fulfill my primary life goal.

So now I'm logged in and I can see messages from Joe and me to Rob. This is a simple application where employees can message each other back and forth. I can also go over to a different domain. These are just in my /etc/hosts file right there, all just routing to localhost, and I see the selector page again. This selector page feels a little odd, though, right? Selector pages are pretty normal when it comes to authentication, especially with OAuth. It's not uncommon to see "Log in with Facebook", "Log in with Google", "Log in with LinkedIn" and whatnot. It's a little odd to see it here, though, because the intent of the user is clear based on the hostname. I
know that he wants to log in to the number two endpoint. So there's a little bit of work for us to do here in order to make this application a little cleaner, and we'll get into that towards the end of the demo. But if I click the number two endpoint then, oh, I get a bug. Something's not working, and we're here to fix it. We get a 401 right here. That's not great. Something's happening as the client is trying to talk to the resource server. So these are a couple of things that we're going to try to fix inside the application.

A couple of things that you've observed already: there are four actors inside of this application. First we have this client application, which is the UI, and it coordinates with two resource servers. The first is a messaging application. That's the thing that gets the messages, shows them to you, and allows you to send messages to each other. The second is a user service, where we can pull up the user's name, an email address, and whatever else we need about that user. If you saw my presentation last year, you'll notice that this is basically the same application that we left off with, with an air of multi-tenancy on top of it.

So let's see what we can do. Let's take a look at the humble message repository. This is a Spring Data repository, and it has all of the capabilities that a natural Spring Data repository has because it extends CrudRepository. So I can do findAll, findById, save, and all the other things that are possible, including this custom query that I've written, which is the inbox query: find all messages that are written to a particular individual.

This is a great place to start when we're thinking about multi-tenancy, because it reminds us of something else that we're sometimes reminded of when we think about multi-tenancy, and that is multi-user. We don't really think about multi-user applications as such, because I'll bet that if you inventoried in your mind all
the applications you've ever built, you could probably think of very few that are actually not multi-user. Generally speaking, most non-trivial applications have user login and logout, and we need to be able to isolate user data inside various tables or collections based on who's logged in. It's a really very familiar idea, and I'm sure that if I took a poll by raising hands of who has ever written a query that has "where user ID equals something", everybody would raise their hand. You wouldn't find the same result, though, with multi-tenancy. If I said, "Raise your hand if you've ever written a query with where tenant ID equals this", we wouldn't see as many hands go up. There are two reasons for that. The first reason is that not all applications are multi-tenant, and the second reason is, well, let's find out.

It's very natural for us to think about multi-tenancy as another column in the database. One might wake up one morning thinking about multi-tenancy and say, "Tenants are users, too. If I have a user ID column, it's a very natural idea for me to simply add a tenant ID column as well to all of my tables, and now my application can be multi-tenant."

One of the initial benefits that we see from doing something like this is that it's familiar. It's just like what we've done with users forever. It also has a couple of other benefits that maybe we're considering in the beginning. From an infrastructural standpoint, it's about the same order of magnitude of difficulty, or of simplicity, as single tenancy. If I have twenty-five tables and I add a tenant ID column to all of them, and maybe a couple of other tables to describe tenants, I have the same order of tables as I had before. It can also be easy from an analytics perspective and from a migration perspective, because you only have one schema to manage, and I can simply take the tenant ID column off of any one of my queries and now I can perform company-wide or application-wide analytics on
my data source. So for a few different reasons we might go down this road, but there are some important security implications that we want to think about. Let's just see what happens if we do this. I've added this method, which Spring Data makes super duper easy, which is awesome, and then I will make the change here as well. Now, in order to fix the next compiler error here, I need to add the tenant, so let's do that. One thing I'd like you to do is put your scaling hat on. I want you to think not just about performance, but about how many places I am touching in order to describe multi-tenancy in my application. Just keep a count in your head. We've added it here, and now we've added it here.

I need a way to establish a tenant. This is no different from what you've done with multi-user applications forever: we need to figure out who the user in context is, and we need to figure out how to propagate that user down to subsystems, whether they be databases or other microservices or whatever. We have to do the same thing with the tenant. So I'm going to make a snap decision and have it be a request header. This is a REST API, so it's totally fine. They're already sending me one header, the bearer token. We'll have them send the tenant header as well, and it'll all be dandy.

Because I've made the choice to do multi-tenancy by discriminator, I'll need to add the property here, otherwise that query won't have the column to add. In addition to setting it here inside the constructor, I'll go ahead and add a getter as well. And then the last thing I've got to do (hold your breath, this is a little bit of a boring part of the demo) is a manual edit here in order to add the notion of tenancy to my data. But this is still illustrative, even though it's not very cool to
watch, to see the many different places that I'm having to edit in order to describe multi-tenancy at the column level. The first time that a colleague showed me that little multi-click in IntelliJ, I nearly cried. It's so cool. I'm on Linux, so I don't know if it's different for everybody. It's just Alt+Shift+click, and then you can click different places. There's also Alt+Shift+Page Down, I think. Let's see, is it... okay, I'll find out. There's another one where you can just go down the line and then type, and it's really cool.

Okay, cool. So now all of our data is multi-tenant, and we have at least one multi-tenant-aware request. If you start putting on your security hats, you're going to start seeing a couple of problems. Maybe your spidey sense is telling you that not all is well. But I'm going to go ahead and log in as Rob again. This time I'm going to use a little script that I have, and you can check it out here. It's just getting a password token from Keycloak, nothing super jazzy, but these little scripts will help us cover more ground. I have another one that queries the inbox endpoint right here and, depending on how I execute it, it'll include a tenant ID, otherwise it won't.

So I got a token. Now let's go ahead and run the query and, great, I was able to get Rob's information. That's just fine. I presented the bearer token that I got from Keycloak. This application already uses OAuth to protect itself, so that's just fine. Let's go ahead and log in as me. If I query it, I get my inbox, which is great. However, if I supply a different tenant ID then, gasp, I get another tenant's information. This is kind of unfortunate. Tenant leakage is one of the scary problems in multi-tenancy. When we're thinking about multi-tenancy, one of the first things that we want to make sure we consider is tenant isolation. We want to be careful about having paper-thin walls inside of apartment buildings, hotels,
and inside of our distributed systems. So we'll try to strengthen this. The main problem right now is that we're trusting too many people with something that is more easily derived from one source of authority. Suppose I ask a bunch of services to just supply me the tenant ID: "You got it securely, that's totally fine, just pass it along to me. I'm beyond the security boundary now. You figured it out, just pass it downstream to me." If we do that, then we've asked N clients to do something correctly. When you think about that from a security standpoint, the only thing that has to happen for a security breach is for one person to do something that they weren't supposed to do, or for one person to not do something that they were supposed to do. I personally don't really want to rely on a whole bunch of our clients to correctly supply me the header. Even if they got it securely, I don't want them to have to remember to do that. So that's a problem. Let's fix it here in just a moment.

Let's carry this little thought experiment all the way through, though, because there are other security problems to consider as well. The first problem to consider is that if I forget the tenant ID, or if I misapply it, then suddenly either it's not in the where clause and we leak data all over the place, or we get the wrong tenant's information, and it's due to this paper-thin wall that we have right now.

There are other problems, though, too. For example, there's this one. This seems like a totally legitimate method call. In fact, it's right there in the CrudRepository for us to call. Autocomplete gives it to you, and everything will work just fine. It's very deceiving as well, because since it's a primary key we might say, "Oh, it's a primary key, there's no way for data to leak." Actually, this is a problem. It's a problem because of insecure direct object references. When it comes to access
to data, getting the result back without any other authorization in place is implied authorization to access that data. So, for example, without changing this, what I am stating implicitly (and we get scared of implicit security statements, right? We should be declarative with our security) is that if you have access to the message ID, then you may see the value. We'll just come back to this. And so we have to do it here, too. We're going to have to supply our tenant ID to ensure that any time we query by ID, the result belongs to that tenant as well. So data can leak in other ways as well, and it's not just due to there being a many-to-one or one-to-many relationship from a single tenant to other resources. So let's go ahead and add that. Sweet.

findAll is another problem. Certainly this is not the intent of the user. I don't want to query all of the data across all of the tenants. I only want the data from my tenant. So we're going to have to do the same kind of exercise one more time. These are really deceiving, right? Because you might say, "Oh, but look, it's really easy, because the IDE makes it so easy for me to do." But just because it's easy doesn't mean that it's the right choice.

Okay, so now this is better. This is what we originally intended to create. The tenant is inside of every one of our queries inside this one controller, across this one entity and this one repository. Hopefully this makes you sad, right? We don't want this to percolate through all of our controllers, all of our business logic, all of our stuff everywhere. We might ask ourselves why this is happening, and I would like you to just put a pin in that: why is this happening, and how can we change it? Can we push all of this logic out to other aspects of our platform, and to what degree? What might enable us to do that, and what might prevent us from doing that?
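
As an editorial illustration of the point just made, here is a minimal sketch of what "the tenant inside every query" looks like. It is not the demo code: the talk uses a Spring Data repository with derived query methods, while this stand-in uses a plain in-memory list so that it runs without any framework. The class name `TenantScopedMessages`, the `Message` fields, and the method names are all assumptions chosen to mirror Spring Data naming.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical in-memory stand-in for the talk's Spring Data repository.
class TenantScopedMessages {

    // A message row that carries the discriminator column (tenantId).
    static final class Message {
        final long id;
        final String tenantId;
        final String to;
        final String text;

        Message(long id, String tenantId, String to, String text) {
            this.id = id;
            this.tenantId = tenantId;
            this.to = to;
            this.text = text;
        }
    }

    private final List<Message> rows = new ArrayList<>();

    void save(Message message) {
        rows.add(message);
    }

    // Knowing the id alone is not enough: the row must also belong to the tenant.
    Optional<Message> findByIdAndTenantId(long id, String tenantId) {
        return rows.stream()
                .filter(m -> m.id == id && m.tenantId.equals(tenantId))
                .findFirst();
    }

    // Replaces an unscoped findAll: only the given tenant's rows come back.
    List<Message> findAllByTenantId(String tenantId) {
        return rows.stream()
                .filter(m -> m.tenantId.equals(tenantId))
                .collect(Collectors.toList());
    }

    // The inbox query, scoped by tenant as well as by recipient.
    List<Message> findByToAndTenantId(String to, String tenantId) {
        return rows.stream()
                .filter(m -> m.to.equals(to) && m.tenantId.equals(tenantId))
                .collect(Collectors.toList());
    }
}
```

Every method needs the extra parameter, and nothing stops a caller from adding a new method that forgets it, which is the auditing burden the talk goes on to describe.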
What's the level of friction of having to do that with each one of these models? There's still a problem, though, and that is that findAll is still part of the autocomplete. I can still legitimately call findAll and still cause a problem, right? What I'm trying to highlight here is that, whether we're using Spring Data or other kinds of things, adding a column is difficult to audit. It's difficult to know for sure before you release: do we have a tenant ID column in every single query, or is there any way that we could possibly be leaking data? It's a hard thing to do. Spring Data makes it look nice because it's just method names and we can see it all happen on one page, but this is a problem that would exist in any application where we choose to include the tenant in every query, whether we're using Spring Data or something else.

Let's try for a minute to fix some of this. Let's fix the tenant ID problem first, because that's an easy first win. I've got a couple of little helper classes to help me go just a little bit faster, and Spring Security supports OAuth 2 based authorization... Yeah? Is it too dark?

Yes, that's a great question. Let's talk about that for a minute. A user ID does not necessarily imply a tenant ID, and let's see why. The question was: does a user ID imply the tenant ID? That's a really good question, and the answer is maybe. That's definitely one of the concerns with multi-tenancy by column, because maybe it does and maybe it doesn't in a particular circumstance. The question I would ask you, and those who are considering this initial question, is: does my user belong to more than one tenant? Whether they do or they don't, it depends. That doesn't mean no, obviously, but it does mean that things like this become harder to audit, right? Because we might argue right
here, nope, not that one. We might argue here, though, that this is a totally legitimate query for us to use. The user has already been authorized through OAuth 2, right? So we know that the user ID hasn't been supplied incorrectly by the client. The user has given a grant to this client to query on behalf of the user. However, depending on our business rules, a user might belong to multiple clients and/or multiple tenants, and thus we need to include the tenant ID again. So that's a great question.

We're defending this application with Spring Security, and so inside the SecurityContextHolder is a SecurityContext, and that SecurityContext has an authentication object called AbstractOAuth2TokenAuthenticationToken. There are two implementations of this: one is for opaque tokens and one is for JWTs. We added opaque token support in 5.2, which I'm not demoing today, but because I want this to work for both, I'll go ahead and use the abstract class. If I go and get the tenant ID attribute from the token, then I have a much more secure way to get the tenant ID. This is a payload that's already being sent to the application, and not only is it being sent, the request actually won't work if this header isn't sent. We're piggybacking off of the bearer token capabilities that the application already has. If I include a tenant ID attribute inside the already verified authentication, verified by Spring Security, then I have confidence that this came from the authorization server. It wasn't a client that just got it right because they read the documentation really well. It came from the authorization server, it's signed, it was verified with a public key, and it's solid. So I can go ahead and remove this, and still be sad, by the way, because we can do much better than this, but we'll go ahead and complete this thought experiment. Great. So we're able to get it from the SecurityContextHolder, right?
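
As an editorial illustration, the idea of trusting only the verified token for the tenant can be sketched as below. This is not the speaker's helper class. In a Spring Security application the attribute map would come from `AbstractOAuth2TokenAuthenticationToken#getTokenAttributes()` on the authentication held in the `SecurityContextHolder`; here a plain `Map` stands in for it so the sketch runs without the framework. The class name `TenantFromToken` and the claim name `tenant_id` are assumptions, since the talk does not name them.

```java
import java.util.Map;

// Hypothetical helper: derive the tenant from attributes of an already verified token.
class TenantFromToken {

    // Assumed claim name; the talk only says "the tenant ID attribute".
    static final String TENANT_CLAIM = "tenant_id";

    // verifiedAttributes stands in for the attribute map of an authentication
    // whose token has already been validated (signature, issuer, expiry).
    static String tenantId(Map<String, Object> verifiedAttributes) {
        Object value = verifiedAttributes.get(TENANT_CLAIM);
        if (!(value instanceof String) || ((String) value).isEmpty()) {
            // Fail closed: no tenant in the token means no tenant-scoped query.
            throw new IllegalStateException("No tenant in verified token");
        }
        return (String) value;
    }
}
```

The design choice is that no request header or parameter supplied by the client takes part in the decision, so a client cannot select another tenant simply by sending a different value.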
Essentially this is just a convenience class to get through some of the boilerplate that we need, so that we don't have to write it several times throughout our application, but effectively we're going and getting the tenant ID from Spring Security. Our next goal, then, is going to be to try to get rid of these right here and see what kind of machinery is involved in order to achieve that.

Let's use Spring Data and see what we can do by having a custom base repository. In order to call a special method on our domain object, I'll also need a little interface. A base repository is something inside Spring Data that allows you to decorate whatever Spring Data would use by default as an implementation for your interfaces. By default, Spring Data will select its simple repository class as the thing that implements my message repository, and I'm going to change that. I'll change it to this tenant base repository.

My idea is this. I don't want people to have to remember not to call findAll: "Please, oh please, don't call findAll, because that means that we'll search the entire database." I don't want to do that. I also don't want to have a custom interface. That's nice and that's fine, and we could do that, right? We don't have to use CrudRepository. But I would like everybody to have all the coolness that comes from Spring Data. So I need a way to intercept everything, and this is the way that we're going to show. I'm going to write all of the delegate methods really fast (look at how many), and we'll leave a few of the best ones for last. I did skip over something: I have a bunch of boilerplate here that's there to go and get the tenant in the different ways that I need to in order to supply it inside the query. So let's go ahead and add something for the save method, and then you'll notice that these methods that I've left for our implementation are the ones that are inside the
controller. Great. So what we're going to need to do is something like this (don't mind the extra stuff here). What we're doing is calling my special method below, which simply calls the setTenant method on our entity. We know our entity is of type TenantAware, so we go ahead and set it, and saves will work fine.

Now, saves are pretty easy. findAll is a little bit more difficult, and we need to do something different, because we have no context from the user. We have no initial object from the user either, so we need to create something that we can send down. Happily, Spring Data supports finding all by example, where we can add the properties that we need into an example object and pass it in. All that's inside this example object is a tenant ID, and it says, "Please show me anything that has this tenant ID." Next, if a person does send us an example, then we need to augment that with the tenant.

findById is kind of an interesting one, because we need it to somehow be "find by ID and tenant ID". Once again we can use the example-based support from Spring Data, and I've got another special method here that I can call. What it does is something really ugly. If you look down here, if I'm given just the ID, then inside of this attempt at adding the tenant ID to queries I'm going to need to do some ugly reflection in order to instantiate an example, set the ID, and then set the tenant thereafter. So it looks pretty ugly, looks pretty bad, right? However, this is going to work, and we can safely call all of these methods.

What I'm illustrating here is that it doesn't matter where we put this. We chose to use Spring Data here, and so we have a base repository. Maybe we could have tried an AOP aspect, stood in front of something, and tried to
catch all of the different circumstances there, or maybe we could have a specialized operations type, or a specialized entity manager, or something like that. The fact is that querying is complicated, and trying to capture all of the different places is rather challenging wherever we put this code. Believe me, I tried. When I was researching this, I tried many, many different ways to cleanly describe the column somewhere.

Okay, but now that we've done this, we're going to live with our architectural decision. Actually, we're going to blow this away here in just a second, but we'll use this base repository and tell Spring Data, "Please use this when you're implementing any Spring Data interfaces, any repository interfaces." I think that might be all we need.

So now let's go ahead and change these guys. Now we can safely call findAll again, yay, and now we can safely call findById. And actually we can use the query-by-example support from Spring Data if we want to, because it will safely add that tenant for us, and we can just do a findAll by example. Okay, we good? There we go. Then we can just query by example, and we can get rid of all this stuff. It's fun to get rid of code.

So we can declare a minor victory. A minor victory, because we did successfully push the tenant out of this controller, but it was at the cost of quite a bit of complexity, and even this didn't address all of our concerns. We still can't safely use @Query annotations or custom methods. We still need to remember to add the tenant to all of those. And it's still a complex auditing problem. If you've ever listened to any presentation about different models for multi-tenancy, you've heard people say that if you choose discrimination by column, you are also choosing complex software governance inside your application in order to ensure
that you're not leaking tenant data, and we see an example of that right here. Let's make sure that we didn't break anything. I'll go ahead and log in as me, and we'll not include the tenant ID, and we'll see that I did break something, right? I was just telling Ria that I've been tracking the number of mistakes that I make when I demo, and it continues to go down, but it's never zero. Okay, great. So now we'll go get our token, and we can see my data again. So it's working even though I am not supplying the tenant ID inside my controller. It gets supplied by this base repository, which is basically a query interceptor.

So we can achieve a certain level of success with multi-tenancy by discriminator. It's at the cost of complex software machinery: either (a) the intense audit necessary to prove that we haven't forgotten to include the tenant ID anywhere that we should, or (b) platform-level code that tries to do this for us. There's a reason, in my personal opinion, why Hibernate still doesn't have discriminator multi-tenancy support: it's because it's hard.

There are other reasons to maybe double-check our thinking if we're going to go down this route. Consider some other security implications. The first major security implication, which we've just been talking about and kind of belaboring, is that it's hard to make sure that you don't accidentally leak tenant data everywhere. There are other security concerns, though. Imagine the day when your customer says, "May I please access and administer my own data set?" This won't work for that, or it's going to require quite a level of complicated infrastructure to allow a client who is not an employee of yours to see and administer their own data set. If they come to you one day and say, "I would like you to please encrypt all of my data with a key that I bring to you," that is something that's pretty difficult using this model.
Multi-tenancy by discriminator is nice in the beginning because it feels very natural, but it is sort of a trap. Well, not a trap, but it has a bunch of problems underneath that we don't consider until we get down the road and are much more mature. So let's try another approach. Let's think about why this is happening, and I'd like to completely change the conversation for a moment, back over to seeing if we can fix this bug. Remember the bug right here? That was the beginning part of our conversation. So let's go ahead and see what we can do with that.

What I'd like to do is create something called a tenant authentication manager resolver. AuthenticationManagerResolver is a new API in Spring Security 5.2. It's something that allows you to determine, per request, how you would like to authenticate the request. The way we'll do it here is by claim. We're going to pull out the tenant ID again, but this time we're going to pull it out directly from the request. So we have to pull it out and do a quick parsing of it using Nimbus, the library that Spring Security depends on for its JWT support, and then we'll determine on the fly how we're going to authenticate this token.

In order to do this (excuse me, let's actually illustrate that really quick, okay, we can go like this), I'm going to need to know the issuer in order to construct one of these. So let's go ahead and state what it is. I'm going to just have a little map here. Of course, we could use configuration properties, or we could make a separate database call to a tenant system that would tell us the information that we need about our tenant. We have a couple of tenants in here. So now, in order to return my authentication manager, what I'm going to do is take the tenant that
we've taken from the claim set here and try looking it up in our tenant configuration map. This is actually going to achieve a few things for us: not only a single point of configuration, but it's going to provide us some nice security as well. What we ultimately need to do with this URI is call JwtDecoders.fromIssuerLocation, and then we're going to construct our JwtAuthenticationProvider. This is the thing that Boot does for you already, but since we're in a multi-tenant state we're going to just do it ourselves. And here we need to throw some sort of exception, like user not found or tenant not found. Then we can just go ahead and convert that into an AuthenticationManager. Because we only have one provider, there's no reason to use a heavy ProviderManager from Spring Security. We can just convert our authentication provider into an authentication manager.

This achieves a couple of things for us. The first is that you'll notice that this is all single tenant. We didn't have to have a multi-tenant-aware JwtDecoder. You can imagine that if we had pushed the notion of tenancy down to the JwtDecoder, then its components and siblings would need to know about it as well: the validator, the (what's the other one called?) claim set mapper. All of those would also need to know about the tenant, because they're all now multi-tenant down there. If we make our multi-tenancy decision early, then we don't have to make our multi-tenancy decision often. When we push it all the way down to the where clause, every single where clause needs to have it. That's kind of by definition. If I can make the decision earlier, then I can clean things up a little bit. So notice I'm making the multi-tenancy decision up here, and now everything here and thereafter inside the filter chain
is single tenant instead of being multi-tenant. The other thing that this gives us: let's say that we have an application that has 3,000 tenants, 4,000 tenants. Before I joined Spring Security, I worked for a company where we had 4,000 tenants, and the idea of using our resource server support for it, where on startup it would call each issuer endpoint in order to collect all the configuration information that it needed, was laughable. There was no way that it would ever actually start up. It would take forever, and guaranteed there would be one of them that was down or something like that, and it just wouldn't work. So this is also nice because it makes this lazy: only when a particular tenant is requested will we actually make that configuration call. So we'll go ahead and just have a little cache here, and we're good. We've got a little authentication manager resolver that is going to get the tenant, and based on that tenant it's going to figure out how we want to authenticate. There's one more thing that this accomplishes, and it's subtle, and that is that this piece of information is untrusted information. We haven't actually authenticated yet, right? We're actually producing the authentication manager so that it can be authenticated. That means that at this point in the chain we can't trust this value. It would be very tempting, for example, to not look up this information right here and instead just compute it. It's quite obvious that there's an easy computation based on the convention here, and so maybe instead of looking it up in a map I just say, hey, whatever the tenant ID is, let me prepend this value to it. That would be a security violation; that would be using untrusted information to make an authorization decision. The way that you fix those kinds of things, like how should we read something that we can't trust yet, how do you know whether or not the hostname is a legitimate lookup value
or the path, or the header, or however you may be describing tenancy? Well, you need to do it with a whitelist, and that's what we have here implicitly. Because we're using declarative security here, we get an implicit whitelist: only those things in this map are the ones that are going to be respected. If it can't be found in the map, then we're simply not going to allow the request to go through. So let's go ahead and add this to our configuration. We'll go ahead and autowire that, and we'll use our nifty new lambda DSL that I'm so stoked about, and say, instead of using JWT, use an authentication manager resolver. This authentication manager resolver, on every request, will decide at that time what we're going to do. And of course the decision will generally be a cached decision, but this is going to enable us to do something like, when we add a tenant, this can wake up and go ahead and create the necessary things for this microservice when that tenant gets added. So let's see how we are now. Oops. Great. So now we can log in; now we can talk to that service, because it's been made aware of the other tenant, and it's been made aware in a way that is important. Certainly we could have pushed this down a layer. We could have instantiated a really smart JWT decoder or something further down, at the point of decision. Like, where is the point of decision in a database query? It's in the where clause, right? Where's the point of decision here? It might be at different places in the stack. But if we can successfully push it up earlier in the filter chain, up earlier in our query, whatever it is that we're working with, then oftentimes the rest of it can remain single tenant, and it's nice that way. Okay, so let's try and apply this principle on the Spring Data side, and see if we can push it away
from our where clause and have something a little bit cleaner. Let's just do it here. So let's go ahead and take our tenant resolver and publish it. With this bean available, now I can instead make a decision to say something like, well, let's try a different form of multi-tenancy. Let's talk about tenancy by collection. I say collection because I'm using MongoDB underneath, but what I mean, if you're from the RDBMS world, is by table, where each table would be, not replicated, but duplicated, and you'd have the tenant ID at the beginning of the table name. So we'll have one_message, two_message, and let's see how well that turns out for us. So here inside of Spring Data we can go ahead and use a SpEL expression to describe the table name, and because I've published my tenant resolver as a bean, then whatever we're logged in as when this is queried, it's going to specify the name of the table as one_message, two_message, and so forth. That means that, with a little bit of boilerplate here (just close your eyes) to make sure that there's something inside the security context when we do this, if I say my tenant is currently one, then it's going to write to one_message. Now if I do two, then it'll do two_message, and if I do three, then it'll do three_message. And actually what I assert is that at this point we don't need a lot of this other stuff. If we take a look, now this is a totally legitimate thing for us to call. It's 100% safe. It's safe because the name of the table is going to change at query time based on who's logged in. This is nice for one reason, and by the way we're going to talk about trade-offs of this one as well, right? But this is nice for the particular reason that there's no way for the query to succeed if I forget the tenant. If somehow there's some kind of bug where we haven't
been able to determine whether we're sending the tenant or not, it doesn't matter. I mean, it does, but from a security perspective, if we forgot the tenant then the query will fail. There's no table called just blank underscore message; we don't have it, right? And so we're in a better spot because of that. So we don't need all of this extra protection. Let's go ahead and delete that, let's delete it here, and I think that's it. Oh, nope, I do have some other stuff. So multi-tenancy by collection definitely has its own limitations as well. If you can imagine having thousands of tenants like I had, and thousands of tables like we had, now you have millions of tables, right? That's a lot, and there aren't many database vendors out there who will tell you, yes, three million tables, that's just fine, we're good. Mongo, for example: their initial limit on their databases is 24,000, and you can bump that all the way up to 3 million. And 3 million may seem like a lot, but when you're thinking about it at an enterprise scale, suddenly that doesn't feel like enough anymore, right? I'm limited in the number of tenants I can have based on how many tables or collections I can have. So this is very clean from a software standpoint, but it does have some limitations that we need to consider before we apply it. Also, it does have some security limitations, the same kind of security limitations that we stated earlier. It may still not be very easy to give your client direct access to just his tables. You'd prefer to give him access to just his database or just his schema, right? As opposed to trying to make sure that he doesn't actually move over and get into other people's tenant data while being able to query his own database or see his own data. But let's go in and see; hopefully it works. Yeah, great. I'm going to log in with Josh number two, and we see Josh number two's data. Awesome. Now, because of those limitations on collections or database tables, we may want to instead do by
schema. Schema is kind of where we start to get closer to more of a happy medium, right? Because we still want our software to be very clean, and we'd like our infrastructure to be easy to manage. As soon as we move away from a table that has a single tenant column over to separate collections, separate schemas, and separate database servers, now analytics is harder, now migration is harder. We've made a trade-off. But one of the things that we increasingly get as we move along this spectrum is increased security, better tenant isolation, and better guarantees that we are protecting tenants' data from one another. So let's make one more change. Let's go ahead and change the MongoDB factory, and we'll go ahead and override the getDb method. Now, depending on the vendors that you're familiar with, a database here refers to a JPA schema, and so it's the same kind of thing. I'm not actually standing up a brand new database server, say. All right, this is all using the same connection pool, actually, which is another important performance consideration to keep in mind. So we can go ahead and resolve the tenant and give it (the syntax doesn't really matter) a different database name. This is essentially what we have now: completely separate databases on the same database server for every tenant. All we need to do is expose this, and Spring Data makes this super easy to do. [Applause] Then we'll just give it a simple database name, like so. Let's see. And then if we restart the application, now everything's going to be in separate databases. Pushing this decision up above our query, moving it up from the where to the from, or from the from to the connection, allows everything else downward from that to be single tenant, which can simplify our code, create stronger security walls, and make it simpler to guarantee that our data, our
customers' data remains safe. So one more time, let's make sure everything works, and then let's do something really quick here. I added this last minute, so hopefully this works; if it doesn't, then it's fine. So we can look, and we can see the different databases, right? And it's created them for us. All we had to do was describe the strategy by which we were going to name the database, and now we have these separately, and we can view the messages in here, in the one_message table. And you might ask, why would you leave the collection with a prefix and have the database still have the same prefix as well? And maybe do, maybe don't, but I think that it's important to consider another implication. If we have, say, a freemium model in our application, just a random idea, and we have a bunch of people sign up for free trials, shall we stand up a separate database for each one of them when maybe they'll go away in just a couple of weeks? It may be very costly for us to simply always, no matter what, for every tenant that gets signed up, spin up a new database, right? And so it may be valuable to still have the tenant as a prefix in the collection and the tenant as a prefix on the database, to allow us to have some kind of migration story as tenants sign up, or as they get bigger and as they evolve. The last one, which I'm not going to demo here, is database server. Once we get to a database, then we can finally do what our clients are asking us to do, which is: I want to have access to my own data; I want to know that if you give some other customer access to the data, they can't get to my data accidentally; and I want to know that your software can't accidentally leak tenant data all over the place. So we've got that at this point. However, we have problems with noisy neighbors. This is all in the same connection pool, for example. So the client has a connection pool, and it has threads, and they're all calling out to
all of these databases, and that's a problem. It's a big problem. But we can't just simply cut over to, hey, every single person has their own database server, because that would mean 3,000 threads that are all connected to databases, and if they all have connection pools, then that's like 9,000 threads, and it's completely untenable. So we need to consider our modeling when we consider the names of our collections, the names of our databases, and how we decide who gets their own database server, who is on their own database, who is on their own collection, and we need to model that in a way that enables us to do that. Okay, cool. With the dramatically few minutes that I have left, let's go ahead and look at one more application of this principle of moving multi-tenancy as far up the chain as possible, making the decision early but not often, and that is here in our inbox. So remember that entry point: if we log out, we see this, and this is weird, right? Because we don't want to show a picker page in this case. The user's intent is obvious based on the hostname; we should just take them over to tenant number two. And what we might do is say, well, I know right where to go in Spring Security to do this. I will do it at the point of decision, where I need to know the information, and I will create a tenant-aware authentication entry point, something that can ask who the tenant is, figure it out based on that, and redirect to a different endpoint or whatever. And that's just fine, but it's pushing that information further down the stack, and the further I push this down, the more of its collaborators also need to know. We saw that with the database, right? When we pushed it all the way down to the where clause, everybody needed to know about it. So we have the same kind of problem here. Let's try and push it further up instead. What I'm going to do is go ahead and create our own filter chain. This may seem super weird, right? Like, we've always just used the
Spring Security filter chain. Can we create our own filter chain? Of course you can. And the way that we're going to do it is the same way that we did it in our other example. We'll go ahead and create a cache of tenants, something that will hold on to our calculation of the ServerHttpSecurity object and the ensuing web filter. And then let's go ahead and grab the host and look up the client registration for that host. What I'm going to do right here, just to simplify things, is assume that our tenant maps directly to a client registration. It doesn't have to be a one-to-one relationship; we just have to have a correlation. If we can successfully correlate these two, then we can get a lot of multi-tenancy for free with Spring Security and OAuth. So we've looked up our client registration; let's go ahead and compute the filter. Okay, if we don't find it in the map, then it's time to create something. So let's do so. We'll create a ServerHttpSecurity object, and then we'll just use what we did before, and then we'll add what we were intending to add initially. Instead of adding it over where we were going to, we'll add it here, because I know the tenant right here. With the tenant in context, I'm not going to have to have this be multi-tenant; I can simply have it be my single-tenant instance, and it'll work just fine. We apply the same principle with logout. Here we can supply the redirect URI that we want. We're following a naming convention here, so yours might be different, and we might have to have further configuration lookups in order to do this, but right now we're following a convention, and it worked well for us, because we didn't need to create multi-tenant or tenant-aware versions of each one of these, right? We simply supplied the value because we have the tenant
in context right now. And then we create our filter chain. Exception handling, great. So now we have a filter chain that will get defined on the fly for each tenant that comes in, and we'll cache this value so it's only done once per tenant. So we can go ahead and do the appropriate casting necessary, and then we'll cache them and return. Okay, cool. So it may look complicated initially, but when you think about it for a moment, this is exactly the same thing that you would put in your Spring Security application anyway. It's the thing that you use to configure the filter chain. All we're doing now is configuring them on a tenant-by-tenant basis. As the tenant comes in on the request, we look it up and we say, this is the filter chain that you should use. And because we've done it at this point, it's all single tenant here. We don't have to have crazy multi-tenant versions of each one of Spring Security's interfaces. So our last bit is to tie these two together. Let's see. This is one of those where I probably should have just... I think as soon as we get past this whole part, there's something to talk about. So, okay, cool. We've got our two calls here; they get our client registration and our web filter, and we want to pass this information down to the rest of the filter chain. So we'll go ahead and say, when both of these are ready, extract them from the tuple, and then, given both of those, you can go ahead and just filter as you would before. We're going to pass it along the rest of the filter chain, and then we're also going to send this client registration down in case other filters need it, like so. Then if we didn't find a tenant, again the same kind of principle, right? What do we do? Well, let's go ahead and just send it down the rest of the filter chain, which will then allow us to finish up. So, lots of code, forgive me for that. Someone afterwards who is more familiar with Reactor will
come up and say, if you do this, it's like two lines of code. So please, if you know that, come tell me. But anyway, the reason for this extra machinery is that we do want to make sure that the client registration, which we have chosen as our representation for our tenant, is available downstream. That's all there is to this extra boilerplate: to make sure that's available. The main point here is that we push this up the chain so that we can make a request-time decision based on the tenant that we found. What is this filter chain for, then? It's for if we didn't find a tenant, right? So we'll go ahead and check for this client registration, and if it's not there, then we know that we can simply deny the request. And notice how this is all single tenant as well. We don't have anything crazy that's analyzing the request or something like that, because at this point we know that there's no tenant information at all, which is a notable multi-tenancy case: no tenant, the tenant that is no tenant. Okay, so let's go ahead and refresh. Thank you for waiting for a couple of minutes. Let's see. Oh, well, yeah, you've got to register the bean when you create it. And this is a filter that's actually going to be before our filter chain, before the Spring Security filter chain. So we'll go ahead and instantiate it. [Applause] There we go. So just restart one more time. What's that? No, I don't need to; I just need the one. This is an order attribute that says the order in which it should be processed. Is that what you were calling out? Oh, I got you. Yeah, they are two different methods. This one's publishing a web filter, and this is publishing the downstream web filter chain, the thing that will get invoked if I can't find the tenant. Like, what shall I do? And in this case we simply want to deny, but in more sophisticated applications we may want to say, like, I don't know
who the tenant is, but here are the other things that we need to do. Great. So I'm able to see the page, but notice that when I log out, I go back to the appropriate tenant's login, and I can log in as somebody else. Oh, yes, sorry, I don't think I added any messages for you. I apologize; I will send you a message later on. And if we go to one, then we'll see that we just immediately get logged in. We don't see the picker page anymore, right? I know I made a couple of points at the same time, so I do want to be clear. In order to get rid of the picker page, we simply need an authentication entry point, right? That much is obvious. What we decided to do is a little bit of extra work to push our tenant representation up in the filter chain, so that the rest of our expressions about what each tenant needs to know are all single tenant, instead of having to create a multiplicity of implementations of Spring Security's APIs that are all tenant aware. Awesome. Let's see, what time does the session end? Is it in like two minutes, four minutes? Oh, darn it. Okay, well, I was really hoping to show you one more thing, but feel free to look at the repository. Let me show that to you so you know where it is. Spring Security... yes, that's actually another one that's muscle memory. You can go to Joe spring wine 2019, and you'll see a number of commits on there. This is where we got to. So we got pretty close, but this is the one that I was really hoping to show you guys. Very, very cool. And the reason that we went through all of this extra rigmarole with the authentication manager resolver and client registration repository is so that they can change, so that we can add tenants, right? And so what would happen... let's see, should we? I don't know. Well, I think I would prefer to take questions, if there are any, over trying to just update the Git thing, restart everything, and see if it works
right? And if you want to see it or talk about it, then we can, but let's just take questions now. Yeah. Thank you very much. Yeah, good question. I may ask a follow-up question for your third question. So your first question is, is framework support coming for this? It's currently under discussion. There are certain places where you probably noticed that there's a lot of boilerplate that a framework can take away, right? A couple of spots that are very interesting are: how do we make it simple to describe a multi-tenant ServerHttpSecurity or HttpSecurity object, so that you don't need to have that extra map and cache or whatever that we are doing, right? So something like that may come in the future. Other spots are a more first-class representation of the client registration as a tenant, since that does seem to be a relatively common pattern. So there are places that we're definitely looking at, and I appreciate your observation. The second question is, can you use scopes? Now, are you referring to OAuth scopes? Is that what you mean? Yeah, so to a degree, yes. What I would advise folks to do is, to the degree that you can, make the decision about multi-tenancy early: early in the request chain, early in your query, early in wherever it is that you're in the context of. And so I typically recommend a dedicated attribute to describe the tenant. I haven't seen too many folks, and that doesn't mean that it's not there, but I haven't seen too many folks use the tenant as a scope, since scopes are typically more often used as an authorization statement as opposed to a logical separation of data. So, I mean... then can you tell me your third question one more time? I guess I still don't understand the question, then. Can you maybe ask me afterwards? Yeah, okay. Other questions? These are great questions, thank you. Yes. Can you say that one more time? Yes, yes, that's correct, that's where I put it. Yep, good question. Anyway, I was going to say a follow-up thing. Yeah, so currently
there are two spots inside Spring Security 5.2 that are using the authentication manager resolver. One is resource server, specifically because that was our initial point of research where we began to look into this API and its utility. And the second is in a generic filter that came out in Spring Security called AuthenticationFilter, which I invite you to go take a look at. I anticipate that AuthenticationFilter will eventually replace our other specific filters, like username/password filters and those kinds of things. That's what's happened on the reactive side, and it'll probably happen on the servlet side as well. That also uses the authentication manager resolver. So, great, thank you very much, and feel free to come up with other questions. [Applause]
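The resolver pattern described in the talk (look up the untrusted tenant identifier in a fixed map rather than computing anything from it, reject the request if it is missing, and build the per-tenant authenticator lazily and cache it) can be sketched roughly as follows. This is a minimal plain-Java illustration, not the code from the demo and not Spring Security's API: `TenantAuthenticator`, `TenantAuthenticatorResolver`, and the issuer URLs are hypothetical names standing in for whatever authentication manager a real configuration would build.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a single-tenant authentication manager. */
interface TenantAuthenticator {
    String issuer();
}

/** Resolves a single-tenant authenticator from an untrusted tenant id. */
final class TenantAuthenticatorResolver {
    // The configured map doubles as the whitelist: only known tenants resolve.
    private final Map<String, String> issuersByTenant;
    // Authenticators are built on first use, so startup never contacts every issuer.
    private final Map<String, TenantAuthenticator> cache = new ConcurrentHashMap<>();
    private final AtomicInteger builds = new AtomicInteger();

    TenantAuthenticatorResolver(Map<String, String> issuersByTenant) {
        this.issuersByTenant = Map.copyOf(issuersByTenant);
    }

    TenantAuthenticator resolve(String untrustedTenantId) {
        // Look the value up; never derive an issuer URL from unauthenticated input.
        String issuer = untrustedTenantId == null ? null : issuersByTenant.get(untrustedTenantId);
        if (issuer == null) {
            throw new IllegalArgumentException("unknown tenant");
        }
        return cache.computeIfAbsent(untrustedTenantId, id -> build(issuer));
    }

    // In a real application this is where the expensive issuer discovery would happen.
    private TenantAuthenticator build(String issuer) {
        builds.incrementAndGet();
        return () -> issuer;
    }

    /** Number of authenticators built so far; exposed only to show the laziness. */
    int buildCount() {
        return builds.get();
    }
}
```

Everything returned by `resolve` is single tenant; the only multi-tenant decision is the map lookup at the top, which is the "early, not often" idea from the talk.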
Info
Channel: SpringDeveloper
Views: 18,599
Keywords: Web Development (Interest), spring, pivotal, Web Application (Industry) Web Application Framework (Software Genre), Java (Programming Language), Spring Framework, Software Developer (Project Role), Java (Software), Weblogic, IBM WebSphere Application Server (Software), IBM WebSphere (Software), WildFly (Software), JBoss (Venture Funded Company), cloud foundry, spring boot, spring cloud
Id: ke13w8nab-k
Length: 71min 22sec (4282 seconds)
Published: Wed Oct 16 2019