InstaBlinks: Understanding Cassandra 4.0 New Features

Video Statistics and Information

Captions

[Music]

Hey everyone, this is Tim here at Instaclustr. Today I'm joined by Instaclustr co-founder and CTO Ben Bromhead. Ben, thanks for joining me.

G'day, thanks for having me, mate.

Very exciting to have you to talk about something that's been making a lot of noise in the market at the moment, and that is the new release of Cassandra 4.0. So Ben, being the Cassandra master of Instaclustr, I wanted to get you on to get your insight into this new update and to get your thoughts on what this means for the broader community.

Yeah, no, I'm super pumped about this one. Actually, technically we're just in beta at the moment, so, you know, don't quite go rush it out into production yet. But speaking of production, I think this is shaping up to be one of the most battle-tested, rock-solid, stable dot-zero releases that we've seen from the community since the project kicked off, right? So super excited about that. There's also been a ton of improvements that have gone into it behind the scenes as well, and I'm more than happy to cover that off. But I think, for me, what I'm most excited about is just seeing the process the community went through to get this release out. It has been a few years between drinks in terms of major versions, and I think a few people sort of prematurely reported the death of the Cassandra project. But really what it was is the community was taking a step back and saying, hey, what's important to us as operators, people that have to run this stuff in production? And that's properties like stability and correctness, those kinds of core tenets as well, right? So I think 4.0 is really shaping up to be the culmination of those efforts and the new process that we've put everything through. So that's kind of
what I'm most excited about: seeing that process in action and seeing the results that it gets.

Yeah, okay, awesome. Because I think for me, and a lot of practitioners out there today, when you think about Cassandra's core tenets and the core benefits that it brings, it's all about high scalability and resilience, right? So I guess when you're looking at what 4.0 and the future offers the technology, what is making this the most stable version?

Yeah, so there's been some very concrete things that the community has done around that, and there's plenty of blogs out there that'll cover it in super detail, but I'll go into it a little bit. And that is a focus on improving the ability to record workloads as they happen on the cluster and replay them, which is absolutely fundamental to being able to find bugs that are very hard to reproduce. It's absolutely fundamental to testing against known edge cases and known previous bugs as well, so being able to replay those workloads is super, super important. Plus it has the added benefit of just making it way easier for you as a user of Cassandra to test workloads, test improvements, test changes to configuration, right? So we're really improving one of those core tenets of repeatability when it comes to testing and software development. The other one, and this is where we start to get a little bit more esoteric and a little bit more into the computer science space: there's been a ton of work that's happened on property-based testing, particularly the introduction of the QuickTheories library. So what that allows, and we're going right down into the weeds here, but the benefit
of understanding how this all works is that you have a lot more assurance in how Cassandra works, right? And so what QuickTheories does is it allows you to define, on a test level, an input space and just say, hey, we're expecting numbers between one and ten in this particular field. It'll go through and automatically generate a bunch of edge cases and make it very easy to replay those, right? So think about it as almost like intelligent fuzzing, if anyone's familiar with that in the testing space. Speaking of fuzzing, there's also been a ton more work putting that into play for Cassandra as well: a bunch of fuzzing, a bunch of property-based testing, and also a bunch of fault injection, right? So deliberately causing faults within Cassandra itself and then seeing how it behaves and heals as a distributed system. Having said all this, there's been a ton of effort that's gone into this, but it's software. We'll still see bugs, we'll still find those and fix those as quickly as we can. But in terms of raising the bar, I think the community has done an amazing job with this.

Okay, awesome. And for the uninitiated, fuzzing, can you...?

Yeah, it sounds a bit odd, but honestly it really comes down to throwing a bunch of garbage at it and seeing what happens, right? So throwing a bunch of garbage, being able to record where you got up to and how you generated that garbage, and making sure it's replayable. It's not quite as simple as I make it sound, but it's really good for finding those edge cases and some of those things that the developers might not have thought about but could happen out in the wild. So it just makes for a much more robust system.

Are there any key kind of
advances that you'd highlight that have been talked about for this latest release with regards to scaling?

Yeah, definitely. So there's been a number of core changes in the project that have paid off in a few key areas. There's been a real further push to adopt the Netty performance networking library throughout the rest of Cassandra as a project. It's been used for some areas, but it's now being used a lot more broadly. And because it is a high-performance networking library, it has brought in some added benefits, like zero copy streaming, right? So that's where, when you're adding a new node to the cluster, the existing nodes start to provide data back to the node that's joining. That new node will join and say, hey, I need some data to serve some requests, and the other nodes will be like, all right, here's the data. That takes time, and it takes resources for the existing nodes to read that from disk, copy it into memory, and then send it over the network, right? And of course, if your cluster's under load, and it generally is when people want to scale up, that adds even more load to the cluster and can sometimes put it into a dangerous situation. So with zero copy streaming, what that means is we short-circuit or remove one of those steps around having to copy it into memory effectively twice, right? So a copy still does occur, but it's getting out of the critical path in that way. That's my very high-level layman's description of zero copy streaming; there's a little bit more that goes into it. But what it does is it means that when you go to scale up and you go to stream, those streams are going to be a heck of a lot quicker, with less load on the existing
infrastructure. It also means that things like repairs can be a little bit quicker as well, streaming that information about. And we're also seeing reductions in client request latency, not super related to zero copy streaming but still related to the improvements that the Netty library has brought the project. So not just some great improvements on the correctness side and the testing side; all our end users should see some nice little performance boosts as well, which is always a good thing.

Another complexity with regards to running Cassandra in production, when we're talking to existing users especially, is around observability and monitoring of the different metrics within Cassandra that you should care about in a production scenario. Just through a little bit of research, I've heard a few whispers that 4.0 is going to bring a richer tool set out of the box. Can you maybe share a little bit of insight around that area?

Yeah, definitely. So what I would say is it's the initial seedling of a new approach to getting information, operational information in particular, out of Cassandra and into end users' hands, right? And one of those improvements is virtual tables. For those that are familiar with relational databases, a number of them expose virtual tables. It just looks like a normal Cassandra table, but instead of having data it'll have things like request latency and garbage collection times. So it makes it a lot easier to integrate that information into the drivers, and the drivers can start to do stuff with that. In fact, a few years ago there was a really interesting paper around someone who did that in a bit of a working proof of concept,
and by exposing things like how full the JVM heap is, so how much memory Cassandra is using on the Java heap at the moment, to the end client, what the client could do is then go, hey, that replica has quite a full heap, I'm actually going to route my request to this other replica that's less likely to have a garbage collection pause at this time, right? So you can start to see how, by bringing in virtual tables and exposing that information, the clients and the drivers can be a little bit smarter about what they do with things. Plus you can also start to build observability tools in with that as well.

Yeah, and so it sounds like that also ties in with the overarching theme of stability of this new release, right?

Definitely. And I think not only is it about stability; I would probably argue it's a lot more enterprise-ready, right? Especially when you look at that audit logging and that traffic replay, they're all big box-ticking items if you have SOX or PCI or GDPR kind of compliance requirements.

Outside of these areas of scalability and resilience and the richer tool set out of the box with this latest release, are there any other things you think are important to highlight?

Yeah, I mean, there's certainly going to be some additional things that will pay off in later versions. So I certainly think the support for Java 11, which comes with ZGC, the Z Garbage Collector ("zee" or "zed", depending on how you want to say it), I think that'll be really exciting. There's been some interesting blog posts already around how the beta is performing with the different garbage collectors that are enabled in Java 11.
So I think that'll be exciting to see as well. And I think anyone that's had to run Cassandra in production knows that fighting the JVM and fighting the garbage collector is something of a black hole, or a black box, in terms of understanding what's going on. And so making that simpler, enabling other options, better-performing options, is just going to be a huge win for people.

Yeah, absolutely. So Ben, I guess going back to the start of Instaclustr, if you can cast your mind back to 2012 or 2011 when it was but a thought between you and Adam and a couple of other guys, what was it about Apache Cassandra that you thought you could make a business out of?

What first really drew us to it was that it solved some interesting problems around how you work with data that is spread across different geographies, right? And how do you do that in an easy-to-reason-about way, how do you do that in an effective way? And I think what we saw was that the Cassandra model, that true active-active deployment model, where every node is a primary and can service any request, that real Dynamo model, was something that was very attractive to us from a simplicity perspective. And then there's the fact that it treated geographical awareness and replication as a first-class citizen. So it meant that you as a developer had to think about how your data is getting replicated, but not only how it's getting replicated but what you do in the case where one of those replicas is down. You have to think about that at the query level, and it sounds like it's a little bit of overhead that you've got to pay up front. A lot of databases kind of abstract
that and say, oh, you don't need to think about that, but in reality you end up having a much simpler operational experience when you do think about that up front as a developer. And we first started using Cassandra actually working on a completely different problem than what Instaclustr was, a completely different startup idea. We needed to use Cassandra, but we had to invest an inordinate amount of time getting that up and running, getting it ready, getting it into a production-usable state, and that was just a huge time killer for a very little startup like us. But we had to make the investment, and we did, and we quickly saw that, hey, everything that we did may be more applicable to a broader market, right? We very quickly pivoted in that direction and started providing capability around that. So that was how we got started, and those were the things that drew us to Cassandra in the first place. And it wasn't just those properties; it was seeing how attractive those properties were to other companies that were solving similar problems, most of them at a far greater scale than we were, so the Netflixes and Apples and those kinds of things. Seeing that it could solve those challenges was really exciting.

Awesome, man. Ben, well, look, thanks very much for your time. Enjoy the sunny Christchurch weather you're in, and remember, everyone, to always be clustering.
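
Editor's sketch: the property-based testing idea described in the captions (declare an input space such as numbers between one and ten, generate edge cases automatically, and make runs replayable) can be illustrated roughly as follows. This is a minimal hand-rolled sketch in Python using only the standard library. It is not the QuickTheories API, which is a Java library, and the `encode` and `decode` functions are hypothetical stand-ins for real code under test.

```python
import random

LOW, HIGH = 1, 10  # the declared input space: integers from 1 to 10


def encode(value):
    """Hypothetical function under test: pack a value in [1, 10] into 4 bits."""
    return (value - LOW) & 0b1111


def decode(bits):
    """Hypothetical inverse of encode."""
    return bits + LOW


def generate_inputs(seed, count=50):
    """Return boundary values first, then pseudo-random values from a seeded generator."""
    rng = random.Random(seed)
    edge_cases = [LOW, LOW + 1, HIGH - 1, HIGH]
    return edge_cases + [rng.randint(LOW, HIGH) for _ in range(count)]


def check_round_trip(seed):
    """Return the first input that breaks decode(encode(x)) == x, or None if all pass."""
    for value in generate_inputs(seed):
        if decode(encode(value)) != value:
            return value
    return None


if __name__ == "__main__":
    seed = 42  # recording the seed is what makes a failing run replayable
    failure = check_round_trip(seed)
    print(f"seed={seed} first failing input: {failure}")
```

Because the generator is seeded, rerunning with the same seed produces the same inputs, which is the replay property the speaker emphasizes for both property-based testing and fuzzing.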

Info

Channel: Instaclustr

Views: 15,942

Rating: 5 out of 5

Keywords: Cassandra 4.0, cassandra, database as a service, dbaas, open source, opensource database, open source database, apache cassandra

Id: UpiHIyxAD-Y

Length: 16min 24sec (984 seconds)

Published: Mon Sep 28 2020