Bootiful Cassandra Live Coding with Josh Long

Captions
Oh hi, hello. Hi. Hey, what's going on? And Josh, this is... everyone will enjoy it, trust me. It's all right. So, all right, we've got a lot to talk about today. We're going to dig into some code, we're going to dig into Cassandra data modeling. Josh and I have been talking about this for a while, and I think we have a pretty cool setup today, because we're going to walk you through it beginning to end. The live coding is going to cover a lot of the stuff that you would normally want to know when you start using Spring with Cassandra. There are just some things you want to learn, and we're going to have some fun with it too. It's not just going to be "here's how you do it"; it's going to be "why would you do this?" I think that's kind of important, because just copying and pasting is not the way to write code. You want to understand it, and there's a lot of reasoning behind some of the code that you write for Cassandra. But let's face it, we want to have really highly scaling, efficient code that doesn't break. Josh, I think that's kind of the name of the game today. What do you say? Yeah, and also the name of the game a decade ago, because as I'm sitting here listening to you, I'm just realizing that when Netflix originally talked about the cloud native architecture, talking about microservices built on Spring talking to Cassandra, that's what Adrian Cockcroft and the team were talking about back in 2009, 2000... Yeah, that's right. And so there's a reason it's a good combination, right? Yeah, it's a great combination. And it's interesting, because a lot of folks say that Cassandra was 10 years too early. Maybe not. I mean, thank god it was there, or else we wouldn't have Netflix and, you know, Tiger Blood and all the other things that we have. But I think what Cassandra enabled then is more of what people need now every day. Like, a good example
isn't... we were talking about this before, but in retail, Target has been pretty vocal about how, during the pandemic, they run their curbside pickup off of Cassandra. As soon as the pandemic hit, they had to do curbside pickup every day, and at a ridiculous scale. And this is what I love, this quote. They said, "We basically had a Cyber Monday every day." Oh, there's more than one Monday per year? Yeah, I like to think that. And by the way, for those of you who are tuning in from other parts of the planet, I don't know if Target's outside the country. Come to think of it, I haven't really seen it outside the country, but here in the States it's everywhere. It is everywhere. It's a huge way to buy stuff. It's a retail store. Yeah, it's a huge chain in the U.S. But, you know, there are huge chains everywhere, and they probably have Cassandra in their stack somewhere. Yeah, and Spring. And Spring, and a lot of smart engineers to glue it all together. Yeah, exactly. Cool. By the way, shall we get started on this? Yeah. Well, I think the first thing we need to do here is... so we're going to use Astra, DataStax Astra, which is Cassandra as a service, to get things going. So why don't we get our database set up, and then while that's spinning up we can talk a little bit about what it is and some data modeling topics, and then we're going to just jump right into some code. How's that work for everybody? Sounds good to me. I need to set up an account, is that the idea? I need to set up a database? Yeah. Okay, that's my screen. Okay, let's get that. And if you have an account on there, sign in. You can use Google single sign-on. So here we go. Here we are, now that we've got in there. Do
you want to just terminate it? Oh yeah, just get rid of it. Let's start from scratch here. So, while this is terminating: this is the dashboard for Astra. This is pretty simple. It shows you all of the things that are happening with your database. Because Astra is serverless, you pay for what you use, and this gives you basically a snapshot of everything that is happening on your database: the writes, the reads, how much storage you're using, and data transfer. Those are all the components you pay for. So in a second here, when this terminates, we're going to go ahead and create a new database. I think we could probably start now. Well, we'll see. This is what happens when we do live coding: we've got to do the setup first, right? But the process of creating a database, if you've ever used Cassandra before or you haven't used Cassandra before, this is so... Uh, we can't hear Josh, can we? I think he's muted. Okay, I can't hear him. There you go, click that button. Can you hear me now? Oh, okay. Yeah, man, I was talking, like, okay, is Josh ever gonna say... No, I was speaking. I was like, oh, what's going on? Well, okay, good thing you texted me to say, "I'm talking and you're not listening." Yeah. So, Astra. Tell me about Astra. First of all, we're logged in to Astra, magically. I've just terminated my instance. Astra: hosted Cassandra, that's the thing? That's right, yeah. Before Astra, there were different ways you could run Cassandra. Probably the most popular up to this point was maybe running a container, so your docker pull. Before that, it was download a tarball. That sounds great. So I can still do those things, I guess, but what is using Astra buying me? This isn't the vaccine, right? This is totally different. That's AstraZeneca, right? It's very different, yes. No, not the vaccine, but it could
be for your bad database experiences. Oh, do you see what I did there? See what I did? Yeah, I know. Well, it's meant to give any person, especially developers, the ability to use Cassandra without having to run it yourself. And I think in 2021 we should expect that. Yeah, I'm all about not running things that I can't charge for, basically. If I can't make money off of running Cassandra, if that's not my core competency, I don't want to be doing it. I'd much rather just pay somebody else to do that. That's not the thing I am in business to do, in the same way that I'm not in the business of administering email servers. It's just not the thing I care about. I'm going to delegate that to some other provider if at all possible. Whenever possible, yes. And I don't do puppet shows at my house; I rent Netflix. Right, yes. So let's go and create a database then. Let's show how easy this really is. Okay, what should we call it? I like "bootiful". Let's stick with it, it works. We're going to create a domain model having to do with customers and orders, so can we just use the keyspace crm? Yeah, I think that works. And then you can pick whatever cloud. We have Google Cloud and Amazon Web Services in here for now, with Azure coming soon. When you pick a provider, you also pick which region you want to be in. This is just for the basic serverless; as you get deeper into more enterprise workloads there are different options. But right now, if you want to run Amazon East, there you go. Well, as you and I are in the States, Amazon East is going to be proximal, but of course, relative to the world, you might argue that EMEA is fairly central, right? Roughly midway between APAC and North America, or the Americas. So, okay. Ready, steady, go. Do it. Get ready for liftoff. Oh, that's intimidating. I have to pay? That's what that's telling
me, that I'm gonna... Yeah, so you selected the pay-as-you-go plan. Each person gets $25 of free usage per month, and then there's a charge. You don't have to put in a credit card to get it going. It's pretty good for a toy database: if you want to use it all the time for various little things, it's well under $25. It's a pretty generous plan in that regard. Cool, I'm done. Thank you for that. So it says "pending". Okay, yeah, so let's switch over to some slides. What it's doing right now is setting up the infrastructure in the back end, and it takes a few minutes to do this. You'll get a notification by email, and it'll also show up on the screen. This is reactive, of course. But what I can do is... Cedric, once you switch over to my share, let me know when you're there, and I'll walk through a little bit about what's going on. You need to take over the screen sharing, Patrick, please. You need to share yourself. Oh, I do need to share my screen. See, it's a good thing Cedric's here, because we're a bunch of noobs. All right, I'm going to share now. Is that better? Yeah. Okay, can you see us all right? But more importantly, does everyone else see it? Yeah, it's coming. Let's get rolling. All right. So I just want to make sure: big shout-out to Josh, thanks a lot for showing up today. If you need to get hold of him, here are all your coordinates. Go find Josh. I work on the Spring team. I've been there for 10-plus years, and I have a SpringSource email and all that stuff from the old days if I needed it, but that one's nice and fixed and it stays steady. My direct messages are open, so should you have questions, when you inevitably have questions, don't hesitate to reach out to me. I'm happy to help, happy to engage, happy to answer questions. I don't doubt that I'm going to create more confusion than not, so
again, don't feel constrained by our brief amount of time in this session. If you have anything you want to talk about, just hit me up. And my direct messages are wide open as well on Twitter, so there you go. And yeah, thank you very much for being here today. I think this is pretty magical. Okay, so let's talk about what we're doing today. As I mentioned, Astra is really Cassandra as a service; let's just keep it at that. It's there for probably the thing that you don't want to do with Cassandra, which is run it. It's just operations. Running a distributed database, like you said, Josh, that's not your core competency then. Right. We were talking about pain earlier, right? I was telling you that I stubbed my toe and cried a little tear of agony, and it recalls the pain of running your own distributed database. Oh yes. Nothing I'm interested in doing. Yeah, it's a competency that's not unattainable, but is that really what my business is? Right, this is the stepping-on-a-Lego of software. You can do it, but why? We could also create our own data centers, but why? And then there are also things that you probably don't even want to think about, like security, making sure your data is secure. And then finally, we have a lot of really cool features to allow you to do application development, and we'll touch on those a bit later on in this live stream. It's meant to be a Cassandra for developers, and that's why we're here today. Go ahead. I just said, "Okay, ooh." Oh wait, hold on. So this is what it's made up of. Interesting fact, fun fact: Astra really is Cassandra running on Kubernetes. We dogfood a lot of what we already talk about with Cassandra and Kubernetes in Astra. But because we're using Kubernetes, we get that portability. We can run it in any cloud, and so
that's kind of a key part of this. We also include some really important dev tools using the Stargate project, which I think is a really cool name for a project, don't you, Josh? Yeah. It gives you things like REST and GraphQL, but you can also use more traditional Cassandra tools like a CQL console. On the Spring team, we're not great with project names. I mean, don't get me wrong, they work. It's just that they're very right on the nose. If you want to create a web application that does some sort of MVC web framework stuff, you can use Spring MVC. If you want to do batch processing, you can use Spring Batch. If you want to do enterprise application integration, you can use Spring Integration. If you want to do security, you can use Spring Security. You see what I'm doing here? You see where I'm going? If you want to create boots, then you can use Spring Boot. Wait, that doesn't work at all. Well, okay, that's the exception. But basically we're not great with names. I really appreciate what you've done here with the theme. Can you see the theme? There's a theme. Yeah, it's space. Space is a good thing. So the other thing that I think is really critical is that running Cassandra can be a pretty expensive proposition when you start thinking about building out a huge cluster that you need to support your Cyber Monday every day. But we now have a serverless version of Cassandra running in Astra that essentially allows you to use what you need when you need it, and then not pay for it later. That means if you're doing a lot of writes, you pay for that right now. When you stop doing the writes, you just pay for the storage. That's a pretty radical shift in the way that people run Cassandra, and it's really super cool. It lowers your cost by a lot, so you can get the same wonderful Cassandra experience at at least half the cost. So that's just something to think about, Josh. Yeah, so what is that? Like, what sorcery is that?
How is that even a thing? Serverless, did you just say? Yeah. Well, we've gotten to this point because cloud native technologies are about separating compute, storage, and network. We've componentized the way that Cassandra works in our Kubernetes cluster in the back end, so we can charge separately for compute, storage, and network, and when you're not using compute, you shouldn't pay for it. Okay, that checks out. Yeah, and if you're running a standard Cassandra cluster, usually you provision, say, 10 machines in your cluster that are continuously running 24/7. You don't need to do that anymore. Not anymore. This is super cool. So let's talk a little bit about... Josh, this is where you and I were really grooving, and this is going to be really helpful as we get into the code: just understanding the Cassandra data modeling methodology, if you will indulge me. I cannot wait to hear what you're about to say, and I can't wait to explore this later on. And we can do that in the database that I've just got an email confirming has been created. All right, I'm going to dig into this real quick, and then let's do it. All right, so let's talk a little bit about Cassandra data modeling. It is a very important thing to understand at some level. Granted, today you have to understand less than you did, say, two, three, or four years ago, but there's a mode of thinking whenever you do Cassandra data modeling, and it's really thinking about your application first and denormalizing data sets to be ready for application query time. So you start with your application, you build your models around that application, like "what queries do I need to support?", and then your data will follow. So what would that look like in practice? Take something like a to-do list. This is how you pull out those queries: I need to list every single task, I need to create a new task, mark a task as completed or
uncompleted. These are all actions that will require some sort of data model to support them. Today we're going to do a little demo where we're going to use a database like the one you were talking about, CRM and ordering products. Let's think about how we would design this application and what we need to support it. And then really, from there, taking that model, we can go through this mapping: okay, we're going to have this application workflow, we can map it, and then we can start building the actual physical data model. There's a lot more to dig into here if you want to do that, and there are plenty of videos out there, but I'm just going over the high notes, and when Josh and I get into this it's going to show up. But this is what that physical data model looks like, and Josh, it looks a lot like SQL, doesn't it? Remarkably so, yeah. [unclear] But yeah, you were like, "Oh, this is SQL." No, this is CQL, the Cassandra Query Language. Awkward. Super awkward. The thing that's a little different here, and I think you'll notice it right away, is that on the typing there are no size components. That's by design; it's pretty flexible on size. Now, a boolean... would you say that's web... I'm going to say web scale? Okay, yeah. It kind of makes me cringe when I say web scale, but sure. Can we just assume they all default to web scale? Size equals web scale. Yeah, so your boolean could be huge. It's like, that's the biggest boolean I've ever seen. So we're going to walk through building... well, I don't need this part right here. Whoops. We're going to actually go through this process right now, so I'm going to stop my screen share in a second here. We're going to walk through how to connect to Astra through Spring. So let me stop the screen share, and I'm going to hand it
back over to you, and let's just do this, Josh. All right, it's my turn. Okay, you can see my screen, hopefully? Okay, good. So I've got a database here, and we need to build an app. You just explained that we're going to build database views for our app use cases, so it's an application database, right? Is that a fair way to think about it? So I need an app. We won't get very far without the app, so let's build something. We're going to go to start.spring.io. This is my second favorite place on the internet. Anybody who knows me knows that my first favorite place on the internet, obviously, is production. I love production. You should love production. You should go as early and often as possible. Bring the kids, bring the family, the weather is amazing. It's the happiest place on earth. It is better than Disneyland. But if you haven't been to production, you can begin your journey here at start.spring.io. What we're going to do is build a brand new application using Cassandra. I'm going to use the reactive Cassandra support, and I'm going to use Lombok to make Java just a little bit less tedious. By the way, Java 16 is now supported. Java 16 just dropped a couple of days ago, and it's now supported on the Spring Initializr. Let me, ever so briefly, run sdk list java. It's always good to check the stuff on the day of your presentation. I think that's smart. You should do it live. Yeah, always. Okay, so sdk install java. That'll be fine, don't worry about it. Yeah, no, it'll be great. No fear, we're doing it live. Exactly. Worst case, I'll go back to 11 and it'll be fine. Okay, so we've got a new application. I'm going to use Cassandra, I'm going to use Lombok, and maybe I'll just add Reactive Web so that we have something up and running. Very good, I'm happy with that. Let's go ahead and hit Generate. That'll give me a zip file that I can then... oh hey, you
want default? I do too. Good stuff. So let's see, java -version. Hey, take your sweet time, why don't you. Wow, that was weird. So here we go, there's our crm. I'm going to open this up in my IDE. Don't be alarmed by this little command. This little alias, uao, is a little script I have on my machine. It just means "unzip and open": it literally unzips the zip file and opens it in my IDE, which in this case is IntelliJ. Anything will work just fine, mind you. It doesn't matter all that much what you're using, so long as it supports Java 16 (they just released Java 16) and Maven and/or Gradle. So we're going to use Cassandra. There's default configuration in Spring Boot that draws from Spring Data, and that support talks to Cassandra, so had I an instance of Cassandra running on my local machine, we'd be all set. But of course we need to do better, right? We want to connect to Astra. Now, as you just started to explain, there's a wee bit more involved in running Cassandra on Astra: a little bit more scale, a little bit more security. There's just more. It's a little extra. It's extra, that's right. Yeah, you used that yesterday. Yes, I'm going to make that work. So we need to make sure that we connect to it, and in order to do that we need to use the spiffy new Astra SDK starter. Cedric was nice enough to help me figure out how to do this. I git cloned the code, built it, and installed it on my local machine. Will this be in the repositories at some point? Yes. For now it's on my repo, but we expect it to be on Maven Central pretty soon. You're just a better user of Astra with this SDK. Astra exposes the regular Cassandra CQL, the one you were just shown, but Astra also exposes new APIs, REST APIs and GraphQL APIs, which is pretty neat for people using JavaScript, for instance. And having an SDK is like: I do the wrapping and read
the specification of the API for you, and you now have a fluent API to interact with Astra. So now that I've got that on my local machine, it's going to do a lot of the... there's a lot of work involved in connecting, at first blush. So, "hello Astra". I guess I need to connect, right? And that's not going to do something useful unless I have some credentials or some sort of authentication. Is that fair? So where do I go now? Yes, you need to create a token. Go back to the database and find the tab named Settings, and there, see, you can now create a token. So I want you to create a token with credentials like database... I mean, no fear again... admin user. Yeah, admin user is totally fine. Yes, you created the token, and now you download that CSV, or you make sure that you copy the values somewhere, because they won't show up again. It's quite secure, so please save that token somewhere. Neat. This should be a fairly common practice if you're using any kind of cloud database: you need to be able to store the credentials and create tokens to do this. So this is just a good general practice. Careful where you put it: don't put it in your Git. I'm going to terminate this later. But it's expecting certain properties in a certain place. In the Spring Boot world you can start by putting your properties in application.properties, and there are all sorts of other properties to which Spring Boot will respond, and your IDE will no doubt know about some of them. If you're using anything like Visual Studio Code or IntelliJ or Eclipse, they all know about the property sets that are available. But what properties are supported by the Astra SDK starter? Because you imported the Spring Boot starter, you should have autocompletion available. So you should go to application.properties and start typing "astra." and you will see
that you will need to provide the database ID, the cloud region, your token, and the keyspace. Database ID, where do I get that from? Okay, so for the database ID you would go back to your dashboard. Yeah, there you go. The database ID is the cluster ID in this case. Good. And the Astra application token? Correct. That's this bit right here, with the "AstraCS" something, right? Okay. And we want the Astra cloud region. Yeah, where are we pointing? It's right there. Okay. And that's it? That's it, walk away? Yes, and you can put the keyspace as well, eventually. Okay. Now keep in mind, these are properties in Spring Boot. I can plug them in through application.properties, but as Patrick just advised, you shouldn't check this into your Git repository. So keep in mind, Spring Boot will do just fine if you store this outside of the jar, or if you put it in an environment variable. For example, you could have an environment variable that looked like this: in your shell script, before you run the program, if you said export ASTRA_KEYSPACE, that would be the same as providing it as a property in the property file. And, you know, if you check in the code... don't do that, but if you check in the configuration here and you have a default local cluster or whatever, and you want to override that at runtime, you can just use environment variables, and those environment variables take precedence over whatever is inside the jar when you run the program. So you can have production credentials that override default built-in credentials, for example. Yeah. And as a matter of fact, the SDK itself looks for system variables and environment variables first, and only then, if they are not provided, do we look at application.properties. Ah, sweet nectar. So that makes perfect sense.
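The precedence rule described here (an environment variable wins over the value in the checked-in properties file) can be sketched in plain Java. This is only an illustration of the lookup order, not how Spring Boot or the Astra SDK implement it internally; the `resolve` and `toEnvName` helpers, the `astra.keyspace` key, and the `local_dev` default are hypothetical names chosen for the example.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

// Illustrative sketch only: hypothetical helpers, not Spring Boot or Astra SDK code.
final class ConfigPrecedence {

    // Maps a property key such as "astra.keyspace" to an environment-variable
    // style name such as "ASTRA_KEYSPACE" (dots become underscores, dashes are
    // dropped, everything is upper-cased).
    static String toEnvName(String propertyKey) {
        return propertyKey.replace('.', '_').replace("-", "").toUpperCase(Locale.ROOT);
    }

    // Looks in the environment first and falls back to the properties file.
    static String resolve(String propertyKey, Map<String, String> env, Properties fileProps) {
        String fromEnv = env.get(toEnvName(propertyKey));
        if (fromEnv != null && !fromEnv.isEmpty()) {
            return fromEnv;
        }
        return fileProps.getProperty(propertyKey);
    }

    public static void main(String[] args) {
        Properties fileProps = new Properties();
        fileProps.setProperty("astra.keyspace", "local_dev"); // default checked in with the code

        // No override present: the file value is used.
        System.out.println(resolve("astra.keyspace", Map.of(), fileProps));

        // Override present: the environment value takes precedence.
        System.out.println(resolve("astra.keyspace", Map.of("ASTRA_KEYSPACE", "crm"), fileProps));
    }
}
```

Passing the environment in as a `Map` rather than calling `System.getenv()` directly keeps the sketch easy to run without touching the real shell environment.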
So both the Spring Boot default and the Astra default is to prefer external configuration. Okay, so it feels like I've got something here. What's next? Can I just start it and run? Let's just see if that works. Yeah, it should. Do you have the Astra client bean? Yep. Should be okay. We'll see some magic happen. "Java 16 is not supported." Oh no. Wait for it. Don't worry, we got this. I knew it was going to come and get you. All right, just a quick configuration change, nothing to see here. Oh, come on. Do I not have... how do I get... well, okay, sure. Apply. Okay, will that run? Will it work? These are the questions. It's indexing JDK 16. What could go wrong? Hey, it's compiling. Come on. So close. I mean, it's fine, I can just do 15. Everybody's happy. It'll down-convert. Yeah, Java has that neat trick where you can run despite a week's worth of time and changes, quite unlike the Node.js ecosystem. Okay, what is that? It's sorcery. So maybe reimport, module settings... oh, oh, 15. Retreat. It's bad television. This is live coding. Yeah, nobody else is seeing this, it's fine. No, sorry. Good, I'm using Java 15.
Don't do that, right? You should be using Java 11 or the current supported version of Java, which is 16, which I'm sure will work if I just spend a second on it later, but certainly not now, not in front of all of you while you're waiting patiently. So I've just restarted the application, and look, a fair amount of time. Look at that, six seconds. What's going on? It looks like it's connecting to a lot of stuff. What's it doing there? Indeed. So, you know, Astra is a cloud service, and to connect to Astra you need to open a two-way TLS communication. To do that you need strong authentication; you need a certificate. This is the only way to be sure that you are strongly authenticated. So you need to download a kind of zip locally to enable that connection. The SDK will do all of that for you, and it will connect to Cassandra for you. Now you are ready to go: you have the CQL session open and connected to all your nodes on the Astra side. But indeed, as you stated, when you start a stateful app you need to connect to the database, and it takes a few seconds. Okay, so behind the scenes that makes perfect sense, right? It's a hosted cloud database. In theory I'm talking to something in Virginia, in the United States, which is 3,000 miles away, and it's doing a lot of stuff, so that's fine. It's downloaded something called a bundle. That's the token you're talking about? That's the certificate and all that stuff that I need? Yeah, it's all the goodies that need to be there to make a secure connection. What's really cool, which Cedric has set up here, is that it just does all that behind the scenes. That takes a little time because it has to download it, right? You could do this more manually: you could download it yourself and put in the parameters. So if you wanted to make this faster for boot time, you could just do those steps manually. This is a very hands-off type of client. I
like hands-off. So okay, now we can focus on data modeling and data, right? That's what I care about. Let's build an application. Okay, so here's what I'm imagining, Patrick and Cedric. I'm imagining a nice clean UML diagram that looks something like this. You just tell me how close I am to the finished product. Just stop me when I'm ready for production. Tell me when I'm done. Okay, so I've got a Customer object, a nice clean object with a key, a primary key like in my SQL database, and I've got a String name, and I'm going to have a one-to-many relationship with my orders. Now again, just stop me when I'm done with this description, with this code, and I can just go ahead and hit deploy. Tell me when that's that moment. I think there are a couple of things in here. So yes, you're doing a one-to-many relationship, so one customer has lots of orders, right? Yes. I think one of the things we need to stop for a second to talk about is the ID number. One of the things that's really important around a distributed system like Cassandra is that we rarely use integers for ID numbers. We use UUIDs. Okay, so you're saying that's not going to work? It'll work, but it's hard to generate a sequential integer in a distributed system, especially at large scale. If you had a thousand nodes, how are you going to synchronize that sequence? Yeah, it's very expensive. So let's just stop here for a second and call this out: this is how we learn to scale, right? This is a moment where we think about what we're building and why we're building it. When we use a UUID, we're embracing the idea that we're going to be using a distributed system, and those norms mean we just stop doing coordination, because it's really super expensive. So let's just keep going, but I think we're going to find
more of those. Okay, but otherwise, if I just change this to UUID, I can have my foreign keys and that works just the same as before, is that true? I mean, what we're going to do is denormalize and map those into a customer table, but keep going, you're doing good. Okay, so it's going to be a denormalized thing, something like CustomerOrders. Is that more the speed? Okay, good. So private UUID id. Does that seem legit? Yep. Okay, customerId, because we're going to want to have a UUID for customers, right? Right. And then we're going to have a UUID for an order, orderId, and then maybe the name of the customer, something like that, no? Yeah, exactly. So really what we've just done here is set ourselves up for a one-to-many relationship in a denormalized table. This is correct from a Spring standpoint, so now we just need to create the data model itself. Why don't we just do that? Okay, so I'll go back to here and go to the console. CQL Console, yep. I got this. So when you're in the CQL console, this is the command line for your database, and one of the first commands you're going to have to run is to switch to that schema that you created, or that database, or what we call a keyspace. If you recall, that was called crm, so let's just do a "use crm". Okay, so now we're in there. Now we can create a table. Okay, so I'm going to go ahead and just write regular SQL code. You just stop me when I'm doing it correctly, okay? Stop me when I'm done. "...if not exists", yep, "...by customer", right? Since we're creating a thing that's not two separate tables, it's one table, I need to create that table if it doesn't exist, and it's going to have a customer ID that's going to be a UUID. Does that check out? Okay, looking good. Can I have an order ID, which is a UUID? Yep. A customer name, which is going to be a text field. And it's going to have a key, but this is... okay, so this is a prime... yeah
primary key uh and the what's the primary key patrick what's what am i doing here okay so the most important thing you need to know about the cassandra data modeling is the primary key and what it means so what we're going to do is we're going to we're going to create a one-to-many relationship here the the one is the customer right so there's one customer multiple orders right the way we describe that in a primary key is that customer id is the first thing in our primary key which we call a partition key yeah and a partition key what that does is it allows us to this is what happens in a distributed system and a lot of them do this but the partition key is what's used to actually locate the data in a large cluster so it's kind of like the address of where that is okay so cassandra is a distributed hashmap more or less this is the key to that hashmap so so the key would be the customer id customer id and then the specific address of the particular records record would be the order id right and we call that a clustering column and so when you when you use a partition key with a clustering column that is a one-to-many relationship ah so what if i want to have multiple like what if i want to have a composite uh like partition key like yeah so if you want to have a customer id plus something else as a partition key you just put parentheses around it okay like that yeah so you're basically you're protecting that partition key with parentheses now you can leave it like that that's perfectly fine just it's very explicit at this point but this is the like a left join we're just skipping straight to the left join state but as you know whenever you've done a left join in like a sql database you get back for every row in the n part of the left join you get back redundant copies of the data in the the one record so the customer name for example is going to be am i storing the customer name multiple times like what am i going to do there i don't want to have
to store it for every single record of my orders well all right so if you wanted to just keep that one record i know where you're going with this it's like it seems like now what i'm doing is i'm just repeating myself over and over again um first of all the thing that you should know is that um the compression algorithms in cassandra are pretty efficient and it won't be that much of a waste of space if it's repeated but in order to be the most efficient using something like a static here so customer name text and then at the end put static what that means is that you're only going to have one of those per table or per partition key i'm sorry okay anything else what what what goes at this at the end of this um you can now in this case we're just using uuid so order isn't as important but if you wanted to control the order of your table um let's say the order ids were a sequential number or maybe they were a timestamp um you can use this uh there's a clause you can put in there to change the order so um yeah with clustering or clustering order by is what it's here you go order by and what that means is that you can say hey this particular clustering column instead of being normal collating order like ascending make it descending now with uuid makes no sense because it's pretty much a random number right but um again like if you're using time series data really loves this like you can reverse time order things in your table that sort of thing um so you can yeah you could just say descending okay like that yeah and when when you do this you're what you're doing is you're because you're building the query ahead of time for your application you're adding an order by in that query initially so whenever you do a query you get the correct order can i ship it yeah oh that's so satisfying what just happened okay so so let's uh you can yeah let's just do a select on the table or you can do it yeah you do describe on the table okay i've got a
table need okay so select all from and that'll just give you a nice blank thing so coloring the red means it's a partition key the teal the shark color is means it's a clustering column and then uh because it's white it's blinding light bold that means that it's a static field and these are also uh if you look at it red blue and white the french color you know the french color [Laughter] see so good so let's insert some data here real quick okay do you want to do that in your app or do you want to do it manually always in my app it's where i start it's right so let's do it there yeah all right makes more sense so okay i think we need to we need to talk to that database we've done the hard part i think we've got the model now i think we need to take our java code and map it to that right and so uh we don't want to just want to take i'm going to use spring data spring that is an umbrella project that has a number of different integrations with a number of different database technologies none so nice as cassandra of course right cassandra definitely the best you know thing to say at this point right the best it's i don't know there's obviously there's context but for so many different use cases cassandra's a great choice we want to use the integration with cassandra uh and uh so i need to map my object to because we expect to have a lot of customer orders so we need some web skill right yeah exactly scale is very important you know you don't know about scale well you do what's that who else knows about scale everybody loves scale right it's like if you think like who uses cassandra if you have a cell phone on your hand it's probably backed by cassandra right now is it an apple phone yes um and just recently like huawei did a big presentation at apache con about how pretty much all of their their back-end systems for huawei are running cassandra i mean i think between apple and huawei there's only a couple of cell phone providers left just well that does that does 
account for literally billions yeah and they're using a lot of spring yeah exactly both organizations uh okay so now what i've got these ids i need to map this to a primary key column right so there's these nice annotations in spring data for mapping and i can use a i want to say that this is a column called the customer id uh and what else is this what is this ordinal business um that just points out like the position in the primary key so like i said the position when you look at go if you were looking at our data model and says primary key right it's the position the first position in the primary key is always a partition key okay so if we have an ordinal of zero that you're you're basically saying yeah this is the partition key cool and the type what did i just do what what's uh partitioned yeah okay so got that and we then we could change this to order id and this doesn't seem like it would be yep and then we're going to change that to clustering can do all right good so that's the that's it that's the whole thing i guess i need to create a repository now uh and to support that i'm going to create um a bean here so i'll say or an interface rather i'm going to create a repository customer repository extends reactive uh cassandra repository now that's a lot of words sorry i said that's a lot of words uh when we talk about uh spring and we talk about reactive programming reactive programming is a way uh to address three main concerns um that you typically face when you start building cloud native services uh ease of composition robustness in the face of topology changes service outages and so on and resource efficiency so composition efficiency and reliability or robustness right and reactive programming does that by giving us a programming model that allows us to write code in such a way that we are never ever sitting on the thread waiting for the next byte to arrive we can ask for bytes from our data source from our our socket from whatever we're talking to
but we don't ever just sit there waiting waiting waiting we move we jump off the thread relinquish the thread return it to the thread pool let the runtime do things with that thread uh in the background and as soon as that request that we've asked for arrives then we get put back on our thread we get to process it and as soon as we're done we jump off the thread again right so we're never ever just idle we're never just sitting here you know waiting for something to happen and this allows us to keep our cpus maximally utilized right we want to keep our services hopping always uh you know close to 100 percent utilization um so reactive programming requires that you rethink the way you write your uh your code it's not going to be just you know the regular collections or arrays right these are different data types but the benefit of that is that it's just one kind of thing so no matter what the shape of your data whether it's uh you know two records that arrive immediately or if it's a thousand records that arrive uh over a slow trickle over the next year or if it's a billion records that all arrive in the next second it doesn't matter you still use a reactive stream type called a publisher so um i need to map this repository to these reactive apis and this interface has support for those types in publisher uh flux mono these are all reactive types mono and flux are specializations of publisher publisher is kind of like completable future except that you can have more than one value so instead of getting a callback for the availability of just one value you can get a callback for every value that comes after that as well be it an unlimited amount of data or just five records or anything in between right or you know zero one two five a trillion infinite etc so we need to also describe in spring data this is the entity this is the primary key type and that primary key in our case is just a primary key class so primary key class and of course it's going to have these
two things right there we go so i'm going to put those in there that's the basic repository and now i can use that to write some data to the database so i'm going to go ahead and initialize our database uh here in this application runner this is a spring bean it's a callback interface that will be run when the application starts up so here i'm going to just initialize some data but before i do that i want to delete everything so i have a clean workspace so to speak i'm going to delete everything but notice that what i've done here is if i call delete everything if i say to delete everything uh it returns not void but mono a void a wrapper an asynchronous wrapper around void in order for me to actually see this do anything quite like the java 8 streams api you need to invoke a terminal function as they say so i need to say dot subscribe the trouble there friends is that subscribe itself uh is asynchronous right or it can be you don't you don't have any guarantee that this line will succeed and finish or whatever uh before the hello gets printed so you might actually see hello being printed and then this the results of that subscription so [Music] then we're going to write some data to the database um and what i'm going to do is i'm going to create just a reactive stream of names uh and we're just going to create some sample data here so private final string patrick equals patrick uh we want um uh josh that's josh we're gonna go for who else is uh neat uh we've got uh let's go for some spring team engineers she's a legend we've got uh uh she's another legend so you know some some spring people and some data people and me i miss big people kind of so we're gonna go ahead and uh put these people in a list here or a stream all right an in-memory string stream literal uh and i'm going to say for each one of those names i want to take each one of those names and write it to the net to the database and i'm going to write a bunch of orders i'm going to synthesize some order 
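Editor's sketch: the comparison above between a reactive pipeline and the Java 8 streams API (nothing happens until a terminal operation, or in the reactive case a subscribe, is invoked) can be illustrated with the JDK alone. This is only an analogy under stated assumptions: it uses java.util.stream rather than the Reactor types from the session, and the class name LazyPipelineDemo and its members are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyPipelineDemo {

    // Records which elements the pipeline actually touched (hypothetical helper).
    static final List<String> touched = new ArrayList<>();

    // Describes the work but does not run it: there is no terminal operation yet.
    static Stream<String> describeWork(List<String> names) {
        return names.stream()
                .peek(touched::add)        // side effect, so we can observe execution
                .map(String::toUpperCase);
    }

    public static void main(String[] args) {
        Stream<String> pipeline = describeWork(List.of("patrick", "josh"));
        // Nothing has run yet, much like a Mono that nobody subscribed to.
        System.out.println("touched before terminal operation: " + touched.size());

        // The terminal operation is what finally pulls data through the pipeline.
        List<String> result = pipeline.collect(Collectors.toList());
        System.out.println("touched after terminal operation: " + touched.size());
        System.out.println(result);
    }
}
```

One difference worth keeping in mind: a java.util.stream terminal operation runs on the calling thread and returns when it is done, whereas subscribing to a reactive type may return before the work has finished, which is the ordering concern raised in the transcript.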
so i'll say add orders for and i'll just pass in the repository there and i'll pass in this name and so we'll go ahead and create that method there and what i'm going to do is here is i'm going to actually just i'm going to add some i'm going to make up some data you know i don't have real data i'm not cool like that but i'm going to go ahead and synthesize some um so i'm going to create a method that returns you know 0 to n orders for one customer um so i'll do this i'll say var customer id equals uuid dot random uuid right and i'm gonna create a list and i'll say a new arraylist of customer order okay and uh for var i equals zero i is less than some comment some random number there okay whatever uh and um then we'll just do list dot add new customer orders uh passing in the customer id there for the first thing passing in another one for the random you know for the order id and then the name right so i'm actually just creating a list of orders adding it to the list uh and then i want to actually just write the whole thing out so i'll say return repository dot save all passing in that list of records right so that's going to create a reactive stream of the results of that that have been saved so um the result is that this operation i'll start with a bunch of string names and i'll get back a stream of persisted customer orders okay and um that's those are the writes so now i've got two different things i've got the delete right i've got the writes and then what happens after the writes well i think i want to just see that everything is working so i'll get all the data back i'll say okay repository dot find all and you get these free methods that are supported by that repository so let's just see what that looks like um oh okay good so now i've got these three things one has to succeed the other delete first then write then get all the results and if it happens in any other sequence it's indeterminate and things get weird the next thing you know uh you know
your schrodinger's database and it's bad so i'm going to make sure i use these operators delete then asynchronously and potentially you know a different thread then write then get all the data back and then finally as those results trickle back in i'm gonna print them out right um and you know i could the the subscription there there's an overloaded version that allows me to operate on the data that gets sent but what i what i want to do is i just want to print out every single customer order that gets written here in the um that it gets returned in the all stream there so i'm going to say system out and this is going to be the repository and i'll just print out the result there so co dot to string let's run this i think that'll give me something hopefully i'm waiting for it just click it yes i mean now that you're running like java 1516 yeah 15.5 upward yeah that's all right there's the data so that worked right we've got the data you can see it's like you were saying earlier it's denormalized right so i've got multiple customers but i'm not actually i'm not storing multiple customer names i'm just it's the database is giving me that because it's smart it knows what i want but that's what you're yeah you're storing it and then what you're saying please store this for me the database is like i i got this yeah so i got multiple customer ids that are all the same for a given name over here but different order ids so that's that's cool so i was able to read the data using the repository this the next thing about spring data repositories is that you can actually provide your own finder methods these are custom queries based on your your particular use case so suppose i wanted to create a finder method to just find data by its customer id you know i could do customer orders find by customer id uh and then pass in uuid customer id right and that would actually synthesize the name of the query it'll derive the query from the name of the method and i can just use that
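Editor's sketch: the requirement stated above (delete first, then write, then read everything back, with each step possibly running on a different thread) can be illustrated with the JDK's CompletableFuture as a stand-in for the reactive operators used in the session. This is not the Reactor or Spring Data API: the in-memory list, the class name OrderedStepsDemo, and its method names are hypothetical, and unlike a Mono a CompletableFuture starts work eagerly. What the sketch does show is the chaining idea: each step is composed onto the completion of the previous one instead of being fired off independently.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

public class OrderedStepsDemo {

    // Hypothetical in-memory stand-in for the table used in the demo.
    static final List<String> table = new CopyOnWriteArrayList<>();
    // Records the order in which the asynchronous steps actually ran.
    static final List<String> log = new CopyOnWriteArrayList<>();

    static CompletableFuture<Void> deleteAll() {
        return CompletableFuture.runAsync(() -> {
            table.clear();
            log.add("delete");
        });
    }

    static CompletableFuture<Void> writeAll(List<String> names) {
        return CompletableFuture.runAsync(() -> {
            table.addAll(names);
            log.add("write");
        });
    }

    static CompletableFuture<List<String>> findAll() {
        return CompletableFuture.supplyAsync(() -> {
            log.add("read");
            return List.copyOf(table);
        });
    }

    // Each step is only started once the previous step has completed.
    static CompletableFuture<List<String>> resetWriteRead(List<String> names) {
        return deleteAll()
                .thenCompose(ignored -> writeAll(names))
                .thenCompose(ignored -> findAll());
    }

    public static void main(String[] args) {
        // join() blocks only here, at the edge of the program, to wait for the result.
        List<String> rows = resetWriteRead(List.of("patrick", "josh")).join();
        System.out.println(log + " -> " + rows);
    }
}
```

Calling the three methods one after another without composing them would start all three tasks right away, and the read could then observe the table before the delete or the write had run, which is the indeterminate ordering described in the transcript.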
let's try that instead here um instead i'll say i want to use one of these ids that we've just uh persisted here so i'm going to go um i'm going to say writes dot take to find the first one and then i'm going to get the co i'm going to map it to the customer id and what does that give me that gives me a stream of uuids that has one result in it so i'm going to say given that one result i then want to find that object by its customer id i'm going to actually then ask the database using the repository so uh then i'll say um flat map uuid repository dot find by customer id passing in uuid and this gives me now i should it's even though it's a stream that is potentially unbounded because we've only taken one result it should just be the first result that comes back it doesn't matter which one so we'll actually get just by id there and in this case i want to print it out as well say okay by id do on next [Music] we'll say all here's by id okay good stuff let's run this again okay i love seeing that recompiling this is the hardest part is the compilation the recompile is the hardest part i think there's a song in this there is and it should be okay so by id there we go i only searched for the first id that you know that i just any random id in the stream and you can see all the other ones are there right but i have just one by id here and that that's the result there so that's good i can use the repositories to get easy access to my database but there's actually a lower level api that we can use here called a reactive cql template so if you don't like the high level repository and you have some particular use case that is better served by low level access to the apis you can use reactive cql template okay and in order to build this instance you need a reactive session factory so reactive session factory and we can just inject this dependency this bean we're going to inject it here template and we can use that now so what is that going to do well that's
going to give us a lower level api that we can use because i'm just pulling data out of the database we already did the the we reset the data we're deleting deleting the whole database that's not slow by the way that's not fast by the way i'm just doing it for our demo um then i'm going to write some data and then i'm going to get all of the data just to show that i can i'm gonna get data by a particular id but then if neither of those approaches works i'm gonna now do a i'm gonna use the reactive cql template here uh template dot query and i can just actually issue low level queries i can say select all from orders by customer is that right and i can actually provide a row mapper right and this row mapper is going to return oh i see what you're doing here yeah okay get a little more into it instead of just auto magic right i can actually control how i map and to what i map the results so i get i'm given a low level row and you know i can in this case i'm just mapping it to the mapped object but you could map it to a dto or a view or something like that so row dot get uuid it's called customer id well you know this might be a good point to bring this up to josh you know one of the reasons we use uuids is this asynchronous type coding that we're doing here right it not only is important for distributed systems but like when we need order we use that partition key with clustering column because of like any of one of these uh rows of data could be stored in any server in the cluster um there's a lot of like first of all that's a superpower that can make your queries go a lot faster because it can run in parallel but it's also like this order dependency like you're asking for a specific customer id and then all the orders with it that will be in order but if you ask for a group of customer ids there's no order guarantee in that right so let me rerun this again now i've got a cql query just directly so i've got three different ways of pulling data out of the
database that i've then written and by the way this the one thing i didn't really show here is when when we wrote to the database uh there is all three nodes tried for the query failed what does that mean select all from orders by customers orders by customer is that not valid syntax customer um what was the error all nodes failed exception failed to interrupt compactions something said something happened there let's retry yeah let's try that i again it's my wi-fi neat yeah it's just my wi-fi or something okay so we've got now three different ways of pulling the data out using spring right and that's fine we didn't talk about writes a lot i mean we saw that i wrote some data but what happens if i want to do an update one of the nice things that i uh i'll bet people would like to know about is that you can actually create a a a version field right so private there's an integer i guess version and this will how does that work it's called something called optimistic locking what is that well optimistic locking okay so then this is um this is a one of those things that whenever you have an order like you have a guarantee that you need to put into a distributed database there are mechanisms to allow you to do this we have this thing called a lightweight transaction which essentially this is like whenever you need to check something like do a read before write so check and set a lightweight transaction also sets up an exclusive operation in a cassandra cluster using paxos so for instance if you need to make sure that you're inserting you're the only process um that is inserting data into a you know for this record like you don't want to have two josh's in there just one there must be only one um yeah it's enough yeah that's plenty um but it lightweight transactions what they do is they create a paxos lock to make sure that it's exclusively writing that record so if any other process comes along later or in like a microsecond later it says i want to write a josh record too
it'll get blocked with an update it just makes sure that that record exists before it updates it cool uh cassandra has this great feature that it doesn't check it just it if you say update it'll just update it it won't look to see if it even exists or not and if it doesn't it'll insert it for you that's yolo mode it's called upsert but yolo works too yeah okay um groovy okay so we've got that there's a lot of cool things we can do here now this is all sort of in the spring ecosystem but again there are so many other views or accesses uh for the data so here i'm just using this stuff but let's suppose that uh we wanted to actually talk to some of the other views of the data so let me create another listener here application ready event okay uh astra client spring data and here i can actually inject the auto configured astra client which gives me a whole bunch of superpowers that go above and beyond what the spring data integration has so here i can uh i can see astra client gives me access to the rest api so i can in theory just talk to the database through rest which is very convenient right i don't have to use spring data if you're not using spring and again don't know why that would be but if you're not using the spring you want to talk to the rest api there's this that's cool you have the devops api which allows me to administer my account my astra account i can actually you know find databases delete them create new key spaces do all that kind of stuff that administrative stuff uh from the command line from the uh from the code dynamically in response to whatever i want um there's a document api so if i want to treat my cassandra document database i suppose that's what this is for is that the idea yeah so um because we're using stargate to be a data service api gateway um stargate has the ability to create document models a document model data model on top of cassandra so if you're gonna do um if you're going to do like something like traditionally what 
document models do which is just store a blob of json as a document um we allow for that inside the api this is a pretty radical change for cassandra where now we're this is has um yeah you're checking out all kinds of stuff here either stargate um whenever you go into uh into document mode you're no longer in data modeling mode you know you're just storing a json blob and you're using it that way um i think there's a lot of databases i will will not name that do something similar so um it's a very it's a very different way of doing cassandra and probably a whole talk in itself right there oh yeah absolutely and it's open source so i encourage people to check it out this is useful you can use this it's not exclusive to astra is that true like i can use this yeah um it's actually a part of the k8ssandra project which is uh is cassandra on kubernetes there are other databases that will be supported with it as well so yeah it's not exclusive completely open source and we love you in the community and there's actually graphql support and there is right now uh spring projects experimental here there's actually experimental work that we're doing around yes you guessed it graphql graphql because it's the cool kid now right and i can also do i think i saw something in here didn't i where would i have seen that uh not in here yeah so you go to connect okay oh i see oh you see go to graphql uh yes and start broadening this page there is a graphql playground for you there this one look at that so i can actually interact with my because of stargate i can now interact with my astra data in terms of the graphql uh apis that is so cool that is really cool and of course graphql is great because you can federate it so now not only have i accessed not only have you solved the access problem here but you've also solved the federation across multiple discrete services problem right just in one fell swoop that's so cool yeah that's you know i the thing i i have to appreciate about
graphql is it it allows you to be a little more expressive than rest but gives you some of the same modes yeah i can see why it's there there are some really cool things happening in graphql um and i'm happy that it's supported in in you know out in stargate and going forward it's just another way of getting data out of cassandra but man it's it's got a huge potential huge amazing really really really cool that you've got this um what if i want to create a api what if i'm not using graphql but i do want some sort of schema for my api is there already something so yeah you are in connect so now go to uh either document api or rest api so just the the tab above or below graphql yes okay now you can click on the swagger ui now so just an employer here you go and now you can use any no all the endpoints in the web just providing the the app token you generated before and you can play with that but as we told you the sdk will wrap up all that calls for you that is so cool so i i've got so many different ways to connect to my database from low level java code all the way up to like you know just give me a swagger endpoint and i'll generate a client automatically plug in a token i'm off to the races that is yeah yeah it's really uh it's it is meant to turn cassandra into more developer friendly database and like if you're writing spring code you can write front-end code at this point and interact with cassandra but um it's just a gateway it's just a gateway but i mean that's i think you you touched on this early on is my business is not running a database so why don't we just use a data service well your business is technically yeah i know for data stacks right yes it kind of is you know like i can't say that with a straight face but you're you know you the royal you is probably not [Laughter] all other people out there listening you're probably not running a database company knocked wood yeah and i i don't recommend doing it it's hard yeah look it look at him just look 
at him he's 20 27. that's he's it's just completely ruined yeah you know what i i do it for the people [Laughter] yeah and it's a good time so we do have a pretty good question i would like just to enter once again regarding stargate you know we had that question okay stargate is a gateway but is it competing with existing api management systems you know like apigee uh software ag layer 7 all uh kong all these api management systems well stargate is simply exposing apis on top of data but we do not provide the features from kong like monitoring throttling stuff like that so both are complementary not it's not one against the other oh that said if you are looking for an api gateway so this is where we get into some trouble because this is a data gateway it's going to give you apis to talk to your database your cassandra database uh there's also network gateways which is all about the routing of bytes on the wire but there's also api gateways which deal with things like uh you know incepting a token for security purposes doing resource servers doing retries routing all that kind of stuff for that you should definitely check out spring cloud gateway which is like kong or apigee it's 80 percent of apigee it's it's kong you know it's 100 percent of that basically uh and it's uh built on the reactive api so it scales very well you know it scales up to you know uh the same extent as the code i just wrote well that's um we've just built like this whole story around scale right using spring cassandra your scale's not gonna be the problem it's not the problem anymore no fine there's other problems man think about those yeah understanding your customers and building the right product let's make that the problem you've got you the the datastax team have a whole team of people that have worked on making cassandra easy to operationalize and it's already notorious for being infinitely scalable right it's that's the the thing that stands out about cassandra it's always been just scales never
been the problem with cassandra right that's not the issue running it wasn't fun but you took care of that with astra so now what's the complaint yeah i well i'm sure someone will say maybe potentially it's having to do data modeling and maybe if you just want to use a document model and not have to do data modeling you could do that sure yeah so that's true well i think we're at a point where we should wrap things up yep all right um cedric you want to share my screen or maybe it already is i can't tell uh i learn more who's sharing i think it's i am sure so it is great all right i'm going for it um i just want to wrap things up thanks everyone for being here josh long spring god thank you very much for being here um thank you remember josh at joshlong.com at starbuxman patrick so great to talk to you and cedric uh so great to talk to you uh patrick actually did an episode the poor schlub he did an episode um of the podcast and so did cedric for completely different reasons at different times completely unbeknownst uh to each other i think um join me there for i every week i talk to somebody who's smarter than i am low bar granted but it's a it's a it's always good conversation there is a really good conversation so just a reminder if you look on the side uh astra you can try it for free all the time 25 dollars a month i mean that that's pretty useful that gives you 30 million reads 5 million writes 40 gigs of storage yeah that's a lot um like we're a little crazy about this but all right um but also on the open source side we really encourage you to get involved in the apache cassandra project uh we have k8ssandra project if you were interested in running cassandra on kubernetes the stargate project which i think we really showed off a bit today um another open source project we'd love for you to get involved um again with datastax developers uh youtube channel if you wanna hit subscribe there's always good content in here i'm sure we're gonna get josh back because i think
we've found a couple more things to talk about start.spring.io for all your spring boot uh starters and then um spring.io/video where you'll find ample videos on you know cassandra and data and all sorts of stuff all things spring we have more subscribers on the spring you know spring.io/video takes you to our youtube channel which in turn has more subscribers than the official java channel for example it's very very very active very very popular so join me there you know join us all there that is fantastic all right everyone thank you very much for joining us and i think we're done here one more thing oh one more thing i just remembered today we just announced this that the call for papers for spring one our big tentpole conference is open so if you want to talk about spring and cassandra we would love to hear from you submit away please i know what cedric and i are doing after this oh yeah i hope so yeah i might have a sdk and a starter to propose maybe it'll be a great time good reference used by carson okay accept it all right cool all right well now that everyone's gone going off and filling out their cfp i think we can say goodbye thank you very much all right thanks everyone and as always don't forget to click that subscribe button and ring that bell to get notifications for all of our future [Music]
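Editor's sketch: the data model discussed earlier in the session (customer id as partition key, order id as clustering column, customer name as a static column, descending clustering order) follows the speakers' own picture of Cassandra as "a distributed hashmap more or less". The JDK-only model below is a minimal sketch of that picture, not Cassandra or Spring Data code. The class name PartitionModelDemo and its members are hypothetical, everything lives in one process, and java.util.UUID's natural ordering is used for sorting, which is not claimed to match Cassandra's own uuid ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.UUID;

public class PartitionModelDemo {

    // One partition: everything stored under a single customer id.
    static final class Partition {
        // Plays the role of the static column: one value per partition, not per order.
        String customerName;
        // Plays the role of the clustering column, kept sorted in descending order.
        final TreeSet<UUID> orderIds = new TreeSet<>(Comparator.reverseOrder());
    }

    // Partition key -> partition. In a real cluster the key decides which nodes hold the data.
    static final Map<UUID, Partition> table = new HashMap<>();

    static void insert(UUID customerId, UUID orderId, String customerName) {
        Partition partition = table.computeIfAbsent(customerId, id -> new Partition());
        partition.customerName = customerName; // overwrites the single shared value
        partition.orderIds.add(orderId);
    }

    // Reading one partition returns its orders already sorted by the clustering column.
    static List<UUID> orderIdsFor(UUID customerId) {
        Partition partition = table.get(customerId);
        if (partition == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(partition.orderIds);
    }

    static String customerNameFor(UUID customerId) {
        Partition partition = table.get(customerId);
        if (partition == null) {
            return null;
        }
        return partition.customerName;
    }

    public static void main(String[] args) {
        UUID customerId = UUID.randomUUID(); // generated locally, no shared sequence needed
        insert(customerId, UUID.randomUUID(), "josh");
        insert(customerId, UUID.randomUUID(), "josh");
        System.out.println(customerNameFor(customerId) + " has "
                + orderIdsFor(customerId).size() + " orders");
    }
}
```

The one-to-many relationship is the nesting itself: one map entry per customer, many order ids inside it. Asking for a single customer id returns that customer's orders in clustering order, while nothing in the outer map promises any order across different customers, which matches the point made about ordering guarantees in the session.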
Info
Channel: DataStax Developers
Views: 1,304
Rating: 5 out of 5
Id: vNwCM-YKQN4
Length: 78min 40sec (4720 seconds)
Published: Thu Mar 18 2021