Introduction to NoSQL Databases

[Music]

Hello and welcome, everyone, to today's workshop. We are doing an introduction to NoSQL databases. I'm Ryan Welford, and I'm joined by Alexander Volochnev. Hello Alex, how are you?

Hi Ryan, doing pretty well. It's Volochnev, but you are doing better every time.

Don't worry, I'll get it one of these days. If everyone can hear us and see us okay, why don't you give us a thumbs up in the chat so we know that you are hearing us.

Yes, send the thumbs up in the YouTube chat and we'll know. Oh yeah, it looks like we are live, loud and clear. Very good.

All right, let's get started with a little bit of housekeeping. As you can see, we have our live stream: we are on YouTube and Twitch. We mainly focus on the YouTube chat, with Twitch as kind of our backup in case the stream on YouTube fails. We are active in the chat and we have people answering questions, so if you have any questions, feel free to ask them. We also have our Discord channel, the Fellowship of the Rings on Discord. You can see the URL there, and we can drop that in chat as well. That is our long-term community where you can ask questions, because the YouTube chat goes away after the workshop, so we can't really address those questions once it is over. If you have longer-form questions or want to engage with us after the workshop, Discord is the place to go.

Today we are going over NoSQL databases, and we're going to use DataStax Astra to illustrate a lot of those concepts. DataStax Astra is a managed Cassandra in the cloud. Cassandra is a NoSQL database and a very useful tool. It is completely free; it's no cost to
you, and we'll get into more details on that shortly. We are also going to do some interactive portions with Menti. We're going to do a couple of quizzes: the first is a get-to-know-you quiz, and then we will have a quiz at the end to test your knowledge, where you have a chance to win some swag.

Yeah, and we ship worldwide. People often ask us, do you send it to my place? Yes, we do. So stay, learn, participate, and try to win. Only the top three places win, so you have to be quick and give the right answers, and to do that you have to follow the workshop carefully. If you do, you definitely have a chance.

So if you want to get another tab open or your phone out and go to menti.com to prepare for that, the first part will be up pretty soon.

Moving on, we have some badges that we give out for attending our workshops. As an example, we have some badges up here. These are ways for you to show off the skills that you're learning, and you can use them to your advantage on your social media profiles. We will assign homework for this workshop, and after you complete that homework we will assign you a badge.

Yep. Some of you are doing a great job attending our workshops and learning new skills. We run these workshops weekly, by the way, so we really want to highlight this desire to learn something new and be a better engineer, developer, administrator, or whatever your job is. That's the participation certificate: you get it if you first follow a workshop and second do the practice steps on your own. You don't need to install anything; we have prepared everything for you in the cloud. Then you get this achievement unlocked and can share it on your LinkedIn page. A little bit of time to brag, but, well, that's a good reason.
Definitely. All right, let's go on to our first Menti. Pull out your phones, go to menti.com (let me switch this over) and enter the code 8255 3590. I'm going to drop this in the chat as well, and we will get this first quiz going.

Actually, eight two five five, three five nine zero. Oops, I made a typo, I'm sorry, I will delete this message.

Once you are in Menti, you can give us a thumbs up in there to let us know that you are in. There's also a QR code, if that's easier to use with your phone.

Somewhere here you may have a QR code, right.

Looks like we have lots of people in here, 70 so far. We'll give it a little bit of time to make sure that as many people as possible can participate, but we don't want to take too much time; we have a lot of content to go through.

Yeah, those are introduction questions, not a real competition yet, so we can start with the questions already and you will join. The code is at the top of the screen: go to menti.com and use the code specified at the top of the screen.

Yes, we will move on to the first... well, did Menti get stuck?

It got stuck, it happens. Just reload it; it doesn't really matter. Sorry, Menti is misbehaving, but it will be good in a moment. All right, looks better already.

So, our first question: which role describes you the best? Student, back-end developer, front-end developer, and the other options that I forget: data architect, operator. Awesome, lots of students, good to see you here, and a lot of back-end developers. Well, welcome everyone. Hopefully we can address a lot of the questions you have about NoSQL databases in this talk.

All right: are you using NoSQL databases today, and if so, what kind?

And by the way, please use menti.com and answer there, not in the YouTube
chat. Of course we appreciate any feedback and any opinion, but the right place to answer questions is menti.com with this code. It's anonymous: you don't have to put in your mobile phone number, email, or credit card number, you just go there anonymously and use it. And participating in the YouTube chat will not bring you the prize.

Definitely. And Menti saves these answers so that we can reference them later and know how to improve future workshops; YouTube chat doesn't really let us quantify that. So, a lot of newcomers to NoSQL, no surprise there. And "maybe, I have no idea", lots of those, which is quite funny. Awesome, it's good to see you.

I would note that people who are already using NoSQL databases are usually using document-oriented databases, and that often leads to the conclusion that NoSQL means document-oriented databases. This conclusion is wrong, and today I'm going to prove it.

All right, which programming languages are you familiar with? You can pick multiple, don't box yourself in. Okay: Python, Java, C, JavaScript.

Java, Python, C. Looks like C# is somewhere at the bottom of this list. And Go, well, Go is a great language, but a very special one, not for every use case.

"Cedric, we will need your guidance throughout the workshop, please." Well, we have here Ryan, me, and Cedric; I believe the three of us can bring you through the workshop, so you have no reason to worry about that.

All right, have you attended one of our workshops before?

This one is really interesting. Okay, that's amazing, new people. We have a lot of new people, which means we have something very special for you: we run these workshops weekly, sometimes even twice per week, on different topics regarding databases and software development. Soon we launch a free episode series on event streaming, which is an amazing technology for building scalable applications.
What that means: very close to me, somewhere below me, are the like and subscribe buttons, and those are very important buttons to hit if you like this workshop. And if you don't like this one, don't hit them. No pushing. All right, let's go.

Last question: how did you hear about this workshop? I'll make this a really quick one. We get a lot of the same answers; usually it's Eventbrite and Instagram, but we always want to know if there are up-and-coming places that we should be pushing these invites out to. Actually, I think Eventbrite is usually the highest, and this time it's Instagram.

Yeah, the leaders are pretty clear: Instagram and Eventbrite. But you know what I like the most? "Through a friend." That's the most important channel to us: people are recommending it, and that's great.

Definitely. All right, cool, thank you everyone for answering those questions. Keep Menti up on your phones or in another tab; we will come back to it at the end of the workshop for the game and the quiz where you can win some swag.

Moving on. As I said, we're going to be using Astra DB for our hands-on portions. You can get your instance of Astra DB at this URL, which I will drop into the chat as well. We are also going to be working out of a GitHub repository, and I will drop that link in as well.

I'm doing that already.

Oh, okay, awesome. So, as I said, Astra is a managed Cassandra instance in the cloud. It allows us to very easily spin up a Cassandra database that we can use to illustrate a lot of these topics. It is a hundred percent free to you, and no credit card is needed to sign up at all.

Yep. You can use it up to low production workloads, and beyond that you will be asked to pay, but that only happens if you have millions and millions of writes and reads per month, and as
long as you use it for educational purposes, it will not happen.

Yeah, very, very little. So let's quickly go over the agenda for today. We're going to go over the definitions and objectives of NoSQL, then talk about tabular databases, document databases, key-value databases, and graph databases. These are the four main flavors of NoSQL databases. Then, as I said, we will do our quiz and prizes at the very end.

So, definitions and objectives of NoSQL. To get started, we're actually going to go right into our first hands-on segment. If you go to the repository that we've linked in the chat, as well as the Astra instance, we're going to go through this first part and walk through the setup of the Astra instance. It's very easy to get started.

Here I am at the Astra website. You can sign in with email and password, or with your GitHub or Google account. I'm going to click Google because that's what I signed up with. Again, it's completely free: no credit card is needed to sign up. As Alex said, you can get started with small workloads and even medium workloads, and if you get to the point where you're exceeding the $25 credit that we provide, you will be throttled or asked to pay. But no credit card is needed to sign up, and you get a lot: it's about 40 million reads and writes that you can do with just the free tier.

When you make your account, you will most likely be prompted to create a database right away. If not, you can click the Create Database button on the left side of the Astra dashboard. We're going to go ahead and create our database now. For the database name we're going to use nosql_db, and for the keyspace name we're going to use nosql1. Now, you can
technically use whatever names you would like, but a lot of our hands-on content expects this database name and keyspace, and it will be a lot easier if you just use these values. If you do want to change your keyspace, you can; just keep in mind that you will have to take that into account later.

As for provider and region, you can select any provider and any region. We recommend choosing the one closest to you; I'm going to choose AWS and US East. Then you click Create Database, and you'll see in your dashboard that the database is pending while it takes a moment to spin up. It usually doesn't take very long, but we probably have a lot of people hitting it all at once, so it might take a few minutes. That's okay, because we can go back and talk about databases as a whole while this is spinning up.

I want to take a moment to make sure that everyone is on track. If you're having trouble or need any clarification on how to set it up, let us know. The instructions are in the GitHub repository, and they walk you through it pretty well.

Okay, Cedric says that things need to be bigger. Let me make this window bigger, there we go, and I'll make this one bigger too.

And to vent cass hate swallow (sorry if I butchered your name): free of cost? It is, yes. We provide a $25 credit that renews every month, which gives you around 40 million reads and writes, and you can start as many databases as you want. Once you exceed that initial credit, you may be asked to provide a credit card to continue using it at non-throttled speeds, but it is free to start, for sure.

Rahul asked me to repeat, so really quickly: if you have just created your account, it might prompt you to create a database; if not, you can
click Create Database. It'll bring you to this page, where you fill out the database name and keyspace name. For this workshop we are going with the database name nosql_db and the keyspace name nosql1, so you enter those in those fields. Then you select your provider and region (the one closest to you is probably best) and click Create Database. That's it, it's very easy.

We do have quite a bit to get through, so please keep the questions coming if you need help, but I'm going to move on so that we can make some progress today.

So, we are going to talk about databases. What are databases? Well, it's some software to let you save stuff and retrieve it later with queries, right? All right, I guess that concludes our workshop today, thanks everyone for joining. No, obviously we're going to go a little bit deeper than that. I think Cedric put that joke in here and I love it. Thank you, Cedric.

So, what are databases? You can think of a database as layers of a few different things. First we have the interface: how we're communicating with the database. What language are we using: is it SQL, is it CQL, is it a different type of language? What's the format of that interaction? Then we have the execution layer: how the database parses that request, retrieves the data, and sends it back out. And then we have the physical storage: how it stores that data on disk (some use text files, some use binary formats) and how that affects indexing.

Now, relational databases are very established and powerful. We have this comparison between OLTP and OLAP that Alex is going to talk about in more depth, and how traditional databases sit in that relationship.
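
As a rough illustration of the three layers described above (interface, execution, storage), here is a minimal Python sketch. It is a toy, not how Cassandra or any real database is built: the class names and the made-up SET/GET command language are assumptions introduced purely for illustration.

```python
# Toy illustration only: a three-layer "database" mirroring the layers
# described in the talk. All names here are hypothetical.


class Storage:
    """Storage layer: decides how data is physically kept (here, a dict in memory)."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)


class Executor:
    """Execution layer: turns a parsed request into storage operations."""

    def __init__(self, storage):
        self._storage = storage

    def run(self, command, key, value=None):
        if command == "SET":
            self._storage.put(key, value)
            return "OK"
        if command == "GET":
            return self._storage.get(key)
        raise ValueError(f"unknown command: {command}")


class Interface:
    """Interface layer: the language clients speak (a made-up 'SET k v' / 'GET k')."""

    def __init__(self, executor):
        self._executor = executor

    def query(self, text):
        parts = text.split(maxsplit=2)
        if len(parts) < 2:
            raise ValueError("expected 'GET key' or 'SET key value'")
        command, key = parts[0].upper(), parts[1]
        value = parts[2] if len(parts) == 3 else None
        return self._executor.run(command, key, value)


db = Interface(Executor(Storage()))
db.query("SET country FR")
print(db.query("GET country"))
```

Swapping the dict for files on disk would change only the storage layer, and swapping the command language would change only the interface layer, which is roughly the separation the speaker is pointing at.
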
You see, from my point of view the most important question in engineering, in software development, is "why?". At any moment in time, if you are able to answer the question "why?", you at least understand what's going on, and if you can't, then you may be in deep trouble. So for the NoSQL databases we are talking about today, the most important thing for me is why they came to exist at all. For dozens of years people were using SQL databases and had a lot of fun with them. They're amazing, actually; everything was completely fine and there was no real need for others. And then, boom, NoSQL, and now NoSQL is a big thing. Read through job openings: really a lot of them ask you to know at least one NoSQL technology. Why? What changed? I believe a few main changes led to the need for something different.

First, the requirement for performance. What was the slowest part of your application some years, or maybe dozens of years, ago? If it's a distributed application, something on the network like a web application, it was latency. It doesn't matter how fast your database performs if your internet connection is slow, because the bottleneck was the connectivity. So the first one is the need for speed: we are getting faster network connections. My mobile phone is doing, I don't know, over 100 Mbps with ease, and actually much more with modern ways to connect, something unimaginable before.

The second one: we get more and more data. According to some reports and research, by the year 2025, or closer to 2030, we are going to have over 200 zettabytes of digital data. 200 zettabytes, ladies and gentlemen, that's really a lot. Yes, of course most of that is videos and pictures, things you usually don't store in your database, but there is a lot of database data as well.

And then the third one: we have to be able to answer more and more
complex questions. We need to be able to identify customers who bought something two years ago from the automotive section of our goods, or whether, for example, male customers bought something from beauty or, I don't know, books, three years ago; it doesn't matter what exactly. Search queries are now much harder than they were before. Before, relational databases were enough, and at some point (one click please, Ryan) that changed due to a different set of requirements.

The first requirement, as I said, is the need for speed. It means faster, bigger, more data, right now. My customers can't wait, or they will go to another shop or another service. The need for speed goes to the OLTP side of things: online transactional processing. Transactional means many operations: quick, iterative writes and reads. Usually you have no time to wait; you need to get the answer right now. Usually those are simple queries, and the queries don't change often. If your system, I don't know, sells iPhones, there is quite a low chance that tomorrow you will need to make very sophisticated queries over, say, spaceships. So queries don't change too often, and that's OLTP.

On the other side of things (next click, please) is online analytical processing, OLAP. Here queries can wait. Usually the people working with analytics are your employees or your colleagues, and they're getting paid for what they do, so they can wait; they aren't customers who will just switch to the next shop or the next service on the internet. Answers can wait, but queries are usually much more complex: a lot of aggregations, a lot of search conditions, a lot of very complex sets and limitations. And queries tend to change, also known as ad hoc: you have to be able to ask for whatever you want and still get the answer. Maybe some time later, but you
will get the answer.

Then there are traditional databases, and they are somewhere in between. They are able to process OLTP queries, not as fast as we want, but they can. They are able to scale and store gigabytes of data, maybe terabytes. And they are quite capable of answering complex questions: you can make many, many joins and filter across multiple tables at once, and you will be good. But are they perfect at that? The answer is no, on both sides. As you know: jack of all trades, master of none. Well, there are some exceptions, like Jack Fryer, he is master of all trades, but today we speak about relational databases. Traditional relational databases are a jack of all trades, master of none: they underperform on the OLTP side and they underperform on the OLAP side. So for quick, massive production workloads we go to OLTP, fast, fast, fast, billions of queries per second, like Facebook, Netflix, Instagram, and so on, and that is what brought a lot of new kinds of databases into this world. And OLAP, analytical, goes to data lakes, where you usually can wait but may ask very complex questions. Thanks, Ryan.

Yeah, no problem. So, as Alex said, relational databases have been around for a long time. They did very well for the time they were in and continue to do well in certain workloads, but they sit in the middle of these constraints: they're a jack of all trades, master of none. If you needed to increase your capability in some of these areas, you'd have to scale up, which involves adding more CPU and more RAM to go faster, and at a certain point you just run out of money and can't scale up anymore. At that point, instead of scaling up, you have to start scaling out: adding new machines, new servers, and it becomes
distributed. While a distributed system is possible with relational databases, with sharding and whatnot, it's kind of a weak point; they're not really designed to do that kind of thing. So, as the slide says, at some point a single machine doesn't fit your needs and you need to scale out rather than up. That's where the NoSQL databases come in. They fill this need to handle increased throughput, the three V's: volume, velocity, and variety. With volume, there's too much data coming into the database, so we need to be able to handle a lot more of it. Velocity refers to how many requests there are, how often the database is being hit. And variety means being able to support new data structures and request protocols. That's what the NoSQL databases were designed to address.

All right, the CAP theorem. This is a theorem for distributed systems, and CAP stands for consistency, availability, and partition tolerance. Consistency: in a distributed system we're going to have data replicated throughout the system, with specific data stored on multiple nodes, and we need to make sure that it's consistent, so that if we read the data from one node it will be the same as if we read it from another node. Availability: making sure that our uptime stays high, so that if we have any failures we'll still be able to retrieve the data when we need it. The CAP theorem basically states that you can focus on two of these things but not all three. You can have consistency and availability, that's CA; you can have consistency and partition tolerance, CP; and AP is availability and partition tolerance. We can only focus on two. And the main differences in these
distributed NoSQL databases, and there are lots of different flavors of NoSQL databases: the main difference comes down to the decision that they've made in this theorem. Some pick AP, like Cassandra, which is what Astra is based on. Some pick CP, like MongoDB. People say that MongoDB is a document-based database and Cassandra is a tabular database, but the base decision that separates them is what you're going to focus on in this theorem.

So, we talked about different flavors. Obviously there are a lot of NoSQL databases; this is a very dense list. Go ahead.

Yeah, I want to step in at this moment. When you speak about NoSQL, there is a very common mistake that so many developers make: for them there is SQL and NoSQL. But the issue is that NoSQL as a single thing simply doesn't exist. There is no NoSQL, the workshop is over, we are done. Well, not really, but it's more complex. NoSQL includes a lot of very different and often quite opposite databases, and they can be similar or different along many parameters. For example, some of those databases are proprietary: companies write them without disclosing the source code, and you have to pay to use them. Some of them are open source. From another point of view, some of them are cloud hosted and you can go and use them directly, some you host on your own, and some are mixed: you can use a managed solution or install and maintain it yourself. But most of all you have to think about the goals of the different databases and the problems they solve best. We will talk about it later in more detail, but if you are focused on the relations between your entities, you should use graph databases. And if you are focused on very quick execution of your queries, and data relations are not so important, you just
want to be able to save and retrieve data within a few milliseconds or even microseconds, then you need a different kind of database, for example a key-value database. That's a very important point: there is no NoSQL as a single thing; these databases are very different. It's like when people say "Europe, Europe, Europe", but people who live in Europe know that Europe can be very different, with very different people, traditions, behaviors, approaches, and so on. So with NoSQL, think about which part of NoSQL you mean: is it document, or key-value, or something else? That's what Ryan will tell you in a moment.

Yeah, so we can boil down the variety of NoSQL databases into four main types. First is column-oriented, or tabular, which is pretty close to the relational database style: we're still working with tables, but these tables are distributed. Then we have document databases, key-value databases, and graph databases. These are the four big buckets that you can put most databases in, and I think Alex touched on those.

Let's talk a little bit about Astra DB. As I said, Astra is based on Cassandra, which is a tabular, column-oriented database. But on top of the Cassandra instance that you've spun up, we've developed something open source that we call Stargate (stargate.io). This is a proxy, or data gateway, that provides multiple APIs and ways to interact with the data. It lets us explore tabular, document, key-value, and graph-style database structures using just Astra. It's very cool, and these APIs are super useful in production as well.

So let's talk a little bit about tabular databases. A tabular database stores data in tables and uses a schema; in Cassandra we call it a keyspace. In a distributed system this data is distributed across nodes. As you can see in this diagram, we have this
cluster, this ring, with many nodes around it. We have all this data from this table, and it is spread around based on the value of the partition key, which in this case is the text in red: the country code. Because this data is distributed across many different servers, many different nodes, certain practices are important. For example, using SELECT * (select all) is a very bad idea, because it will iterate through the entire cluster and be very slow. I should clarify: I mean using SELECT * without a WHERE clause at all, just saying "select everything from this table". It's going to iterate through every single node in your cluster, which could be thousands of nodes, to find all the data that it needs. Instead, you always want to provide the partition key in the WHERE clause so that the database knows exactly where to look for that data. In this case, as I said, the partition key is the country code in red, and the database uses it to spread the data around the cluster.

So: data is stored on separate nodes; you can sort that data on disk as well; you want to request things with a partition key so that you're not iterating through every single node; and we use a denormalized data structure, so no joins are needed.

That is a very high-level overview, but we can dive deeper if we look at how Apache Cassandra works. As we've said, Apache Cassandra is a NoSQL distributed database. Each node is a single instance of Cassandra, capable of a lot of throughput and capacity. We take a bunch of nodes and connect them all in a cluster, and they communicate with each other through a protocol that we call gossip. Every node can do what every other node can, which allows any node to be the receiver of a request; it will pass that request on to the
nodes that actually have that data, then get that data and pass it back to the client. These data centers, or clusters, we call rings, so we have these rings of nodes.

Here's the table that we had as an example. We have all these nodes in this ring, we have this table, and we've defined one of the columns, the country, as our partition key. That is what tells Cassandra how to distribute this data around the ring: we group the rows based on that partition key. As you can see, USA is together on one node and France is together on another. Some nodes, such as the one at the bottom left, hold both Australia and India. A node is not going to be the owner of only a single partition; you can have many partitions per node. The important thing is that the data of one partition is always stored together.

I see a question: Sucreet asks, what if one node goes down? We're going to get into that right now.

That is a good question, a very good question.

And timely, too. So, we do replication, and we have a replication factor. With a replication factor of one, every node has its data, and that's the only copy, the only replica, of that data. Our USA data lives on this node, and that's the only place it lives. With a replication factor of two, there are two nodes that have that data, so our USA data lives on two different nodes. And with a replication factor of three, which is the standard and recommended replication factor for Cassandra, all that data lives on three different nodes. So if one node goes down... well, I'll get to what happens when it goes down, but the important thing to
know is that this data doesn't just live on one node it lives on three nodes for a replication factor of three which is the recommended value you kind of have diminishing returns if you go much higher than that so what happens when you have data coming in let's say we have this data packet of these rows that can come into any node so in this case node 17 is going to be what we call the coordinator node it's going to receive that request that coordinator node is going to identify where that data lives based on its partition key and basically that's a hashed value and it's going to say hey purple basically in this diagram owns this data but it also lives on these other nodes as well right because we have a replication factor of three so it's going to communicate with those three nodes it's going to distribute that data to those three nodes and then we have our data in those nodes if a node goes down to your question then that data is going to come in it's going to be distributed to the two nodes that are not down and a hint is going to be stored on the coordinator node waiting for that node to come back up or be replaced or whatever and once it's back up it completes that transaction and sends the data where it needs to go so this allows cassandra to like we said focus on availability it's always going to be available because we have three copies of the data if one node goes down we have two other copies we have protocols for consistency making sure that all those nodes are consistent and have the right data and it allows us to be always on and function for the use cases that require that another benefit of cassandra being a distributed database is we're also able to be geographically distributed right we can have many different data centers many
different clusters or rings work together and be part of the same larger cluster and the replication can be established both on a high level geographic level and also within each individual ring as well and we are also able to utilize hybrid cloud and multi-cloud so there's no vendor lock-in you can use google cloud aws azure any other cloud service or a combination of any of them and you can also include on-prem installations as well and they can all work together as part of the same cluster and this allows us to prevent vendor lock-in and get your data let me go through some of these animations as close to the end user as possible right if we're geographically distributed we want to make sure that our data lives as close to the end user as possible so that the latency is as low as possible okay so we have a few use cases i'll go very quickly through these scalability is a strong suit we're able to handle heavy writes and heavy reads that allows us to be in areas like event streaming internet of things time series log analytics that kind of thing availability with our replication factor we can be always on that's really important for banking and inventory management use cases being globally distributed and also being able to be compliant with regulations like gdpr so that you can control where the data is stored is very important for again banking and customer information retail tracking logistics and being cloud native being able to utilize any cloud provider and any combination of them being multi-cloud and hybrid cloud is one of the big strong suits of cassandra all right so i realize i did go quick we do have a lot to get through please if there are any questions i know i see a lot of helpers in the chat answering
questions thank you very much but we are going to move on to our second hands-on so hopefully you still have the tabs and windows open for the repository and your astra dashboard if not the links are up here and we can drop those again if you need them going to move over here and we're going to do excuse me step two we're going to talk about tabular databases and do some hands-on things so and just a second yes i see there are really a lot of questions about the coordinator node so what is the coordinator node and how it works and how it knows which node to ask and so on so take a look today's workshop is called intro to nosql databases and apache cassandra and astra is an amazing one but only one of many so we gave you a little bit of an overview on this one but we have to talk about others they aren't even competitors they are just taking care of their particular set of tasks and goals to reach so if you want to learn more about apache cassandra which really deserves your attention which is in high demand all over the world you name a company apple uses cassandra heavily and develops it netflix same huawei same instagram same and so on so you name a very well known company and there is a big fat chance they're using cassandra now about the coordinator don't make a mistake cassandra is a masterless system no primary nodes no secondary nodes no write replicas no read replicas every replica is able to answer your request now again at the moment any of the nodes in a cassandra cluster gets your query it becomes a request coordinator so it will coordinate the operations in the decentralized system to make sure it answers your question as soon as your request is answered that's it it's done so what is a coordinator node in cassandra it's the node currently answering your question if you will ask the same question
with another node it will become the coordinator so don't think of it like the master nodes or write replicas of mongo or mysql it's a decentralized database every node is a coordinator node i hope i explained that yeah thanks alex and like alex said we did kind of a slight dive into cassandra specifically but this workshop is more about nosql as a whole and we're going to touch on some other things if you are interested in learning more specifically about cassandra we do have a crash course of introduction to cassandra and i can drop that link in the chat as well if you're interested in diving a little bit deeper it'll cover some of the stuff that we just talked about and more you can take a look at that crash course and it'll walk you through more details on cassandra yep do you know when we launch the next intro to cassandra workshop i don't know somewhere yeah definitely i think maybe in two or three weeks two weeks yep yes you know what i will throw a link to the youtube chat and discord to help you get into our next intro to cassandra workshop because it deserves a dedicated two hours workshop or maybe even much more than two hours yeah cedric says next week next week yeah on the twitter come next week like and subscribe and come next week yep and if you can't come next week the crash course goes over lots of the same content too so all right but yeah live is good too we can answer all your questions all right so we're on to our second hands-on portion we're going to work with the tabular data structures we're gonna make some tables and stuff like that so if you're following along in the github repo we're going to go to our cql console so first i'm going to make sure that i have my nosql db database selected and i apologize this window is a little bit small and up at the top there's going to be a tab for no c
or for cql console it'll automatically log you in and the first thing i'm going to do is uh describe key spaces i should probably just copy this stuff i'm just going to copy this um so i'm going to describe key spaces that's basically just going to show me that i have my my no sql 1 key space that i created i want to make sure that that's there i'm going to go ahead and say use no sql1 and that just lets me not have to define it for all of my upcoming queries um so i'm going to create this table so i'm going to just go ahead and copy this in and i'll kind of describe what we're doing here if any of you have used sql whether it's mysql or postgres or anything like that this will look very familiar to you cql is a subset of sql it was designed to be very easy to pick up for those who who are familiar with sql uh but we're you know very simply you know we're creating a table uh called videos we're defining our columns the data types that those columns are and then we have our primary key defined as our video id uh and so if we describe if we visualize the the structure you know we can see we've created this table and it has all of our our columns and their data types all right so let's uh let's go ahead and add a bunch of data to this table just going to copy this over this is just kind of like filler data and i'm going to read that data back now i said earlier that you should never use select all from a table i'm doing it here because we know the table is very small but you should never do this in production because if once your your tables get big enough where you have thousands of nodes or your databases get big enough where you have thousands of nodes potentially uh this is a very bad idea you want to always provide the the partition key as in the where clause and our second example will do that so but here is our table and it's kind of it's malformed because my window's small but you can see our columns video id url email you can see all the data that's in there 
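The advice above (always give the partition key in the where clause, avoid select all on large tables) can be sketched with a toy model. This is a minimal illustration in Python, not how Cassandra is implemented: the four node names, the `ToyCluster` class, and the modulo-of-a-hash placement are all assumptions made for the example (real clusters use a partitioner with token ranges and replication). It only shows that a lookup by partition key contacts one node, while a full read contacts every node.

```python
import hashlib

# Hypothetical 4-node ring; names are made up for this sketch.
NODES = ["node-a", "node-b", "node-c", "node-d"]


def owner(partition_key: str) -> str:
    """Pick the node that owns a partition by hashing its key.

    Simplification: hash modulo node count instead of token ranges.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]


class ToyCluster:
    """Stores rows of a 'videos' table, partitioned by video id."""

    def __init__(self):
        self.storage = {node: {} for node in NODES}

    def insert(self, video_id: str, row: dict) -> None:
        # The row lands on whichever node owns its partition key.
        self.storage[owner(video_id)][video_id] = row

    def select_by_key(self, video_id: str):
        # With the partition key we know exactly which node to ask.
        node = owner(video_id)
        return self.storage[node].get(video_id), [node]

    def select_all(self):
        # Without a key, every node has to be asked.
        rows = []
        for node in NODES:
            rows.extend(self.storage[node].values())
        return rows, list(NODES)


cluster = ToyCluster()
cluster.insert("video-1", {"title": "intro to nosql"})
cluster.insert("video-2", {"title": "cassandra basics"})
cluster.insert("video-3", {"title": "document databases"})

row, contacted_for_key = cluster.select_by_key("video-2")
all_rows, contacted_for_scan = cluster.select_all()
print(row, "nodes contacted:", len(contacted_for_key))
print(len(all_rows), "rows, nodes contacted:", len(contacted_for_scan))
```

With four nodes the difference is small, but the number of nodes contacted by the full read grows with the cluster while the keyed lookup stays at one.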
so we can see that our data is there and it's you know we're inserting again if you're familiar with uh sql or relational databases the insert will look very familiar we're inserting into uh the table we're defining the columns and then our values as as well so the the better way to to query this table is to provide um a where clause with our partition key so you know we defined our our partition key up here in our primary key this is what this part was um we're partitioning by video id and so in this query we're saying where our partition key our video id equals you know this value so if i execute that you'll see we'll get that single that single row all right so we're going to create a new table and this is going to show us a little bit more specifically how the partitioning works so i have you know this this new table we're going to create table if not exists and we have users by city and this is kind of a a that we use you know we're we're creating the table we're storing the data the user data and we're partitioning by city and you can see that in the primary key you know we have our columns our data types and in the primary key we're defining a few columns this column right here in the in the parentheses this is our partition key this is where we define what we're partitioning by and this is what cassandra will use to distribute that data around the cluster and these other columns are clustering columns and this allows cassandra to group these this data together and also order on disk um this this data so if we wanted to you know order so in this case we have with clustering order by last name ascending and email ascending uh we're clustering this data on on last name and and telling cassandra to store on disk based on last name ascending and then email ascending so when we read so when we read back this data it will already be in the order that we expect so i'm going to go ahead and insert a bunch of data we're doing the exact same thing insert into this 
table defining the columns and the values and i'm going to select all from users by city where city equals paris right so we're providing that partition key in the where clause as you can see it brings back the relevant data city paris we have cedric with an email looks like the first name and last name were backwards that's okay but we're clustering by last name ascending so we start with c and then we go on to j because it's in that order and it's in that order on disk as well and then falling back to email if needed and so if we also try listing values we do this select where we're selecting from this table where last name equals gelardi which is not a partition key you can see we have an error so why do we have an error it's because we didn't provide the partition key in the where clause and it doesn't want us to do things the wrong way but we can see why this is the case so we're gonna do this command we're gonna turn tracing on and we're gonna do this original query right and this kind of shows us what cassandra is doing under the hood when that request comes in right it's coordinating with the nodes it's determining where that data is stored and it is sending back the relevant data and if we do the same so we turn tracing on if we do the same with this second command and we're adding allow filtering which will allow the command to actually go through rather than just throw an error we can see what it does and [Music] it's a little bit difficult to show with such a small table in this case but if we had a larger data set in a larger cluster we would see that the amount of time that it takes to process this request rather than providing the primary key or the partition key in the where clause takes a lot longer and in
production uh when you potentially have you know thousands of nodes and lots of data to sift through um you want to always use the most efficient process so this allow filtering this is an important point to make this allow filtering lets us execute this command like basically force it to execute but you never want to use this in actual production because uh it's it's going to give you problems a lot of problems later on um performance wise um you know sometimes you might get used to using this early on when you're just building a an app and uh and you're just working with like test data and it makes it uh you know behave a lot like a relational database which is what you might be used to um and you know you just kind of get used to it it doing this way working this way but if you leave that on in in actual production it will most likely bite you in the butt later uh because you're going to get lots of data and it's going to slow you down and it's all going to come crashing down so in short in your queries always provide the partition key uh in the where clause and only use allow filtering if you're debugging something all right so as you can see you know nosql tabular databases are you know very similar to relational databases in how they store data and how you interact with the data um the main differences come with it being distributed around several servers uh and in the case of um of this or of cassandra you know we use a denormalized data structure which means that we're gonna you know have duplicates of data uh where we you know we we define like in this case we have users by city right we're gonna we're gonna store the user information by city but we might have another table where we're you know storing users by something else some other partition you know we're going to duplicate that data but it's going to allow us to keep our queries small and fast all right making sure that um any questions or issues are handled yep so there are a lot of questions about 
allow filtering and i have a very good and short explanation so now take a look again cassandra is a huge topic come next week we will talk about it and we will be happy to answer all the questions now i have to answer briefly cassandra is big data ready apple i mean the company apple you may have heard of them is using cassandra to handle hundreds of petabytes of data digital data and like text data not movies that means they have really a lot of that what does that mean there is no server on earth to handle this size of data that means you need to distribute your data over multiple servers and that means you need to know which server to ask for your data and here comes the story when you give all the parts of a partition key your driver in your application or the coordinator node doesn't matter will know which server to ask you ask for data of customers living in a particular region say texas okay so you need data of customers from texas then you give your query with a partition key for texas and the coordinator node or driver knows which server to ask it goes to the server gets your data brings it back it works very quickly but when you don't specify all the parts of a partition key it's a simple example it's exactly like when you order a pizza without specifying your house number and if you order a pizza without specifying your house number the delivery person will have to reach every apartment or every house on your street and ask have you ordered pizza ma'am have you ordered pizza sir have you ordered a pizza literally can you imagine how tiring okay i live on a small street on my street it will take approximately half an hour if you live in a big city it may take days and the pizza will be cold and already quite bad you would not like to eat that to get your pizza on time you give your full address working with cassandra you always give the partition key for your data because it's the address of this data so the coordinator node will be able to retrieve your data
within one millisecond not hours okay so that's a simple explanation of allow filtering when you try to execute a query without giving the whole partition key without the address cassandra will tell you hey are you fine man maybe i should call a doctor well it doesn't but it should i think i will not execute this query it's kind of stupid but you can force it it's still a database not a human being so as it's software you can force it and then you say allow filtering and cassandra says oh my god you did it again no please no but she will go and search for your data asking literally every server in your data center and if you have three servers it may be fine if you have three hundred it's already quite a bad idea you can imagine that's it yeah yep allow filtering is just basically an override an override that you should never use in production but that we used here for illustration purposes yep what do you need to know about allow filtering don't use allow filtering that's enough all right so that brings us to the end of that hands-on i should say and we're going to move on best explanation ever yes you are very good at explaining things okay we're going to talk about document databases all right so we talked about column-oriented tabular databases which is what cassandra is we're going to talk about document databases now this slide makes me laugh all the time document dbs are all about structured objects nested structures usually it's json it can be other formats but i think json is the vast majority and we're going to group these documents into collections documents of the same nature are grouped together as collections and these documents have keys and you can request based on the key and then you can also request based on other fields in the document and the use cases are mainly for reads i think alex touched on this
earlier um you know if you have not a lot of writes um but you you have a lot of read so you have this you know this packet of data that you want to store and then you're just reading it back a lot of times um that's kind of very uh um very much what document oriented databases are designed for and some of the examples are and i know i'm kind of in the way of um one of them elastic mongodb couch base these are kind of the main document db's you've probably heard of um so you know in our case we're going to show you know how document db's work and we're going to use astra and astras based on cassandra cassandra's a tabular database so how uh are we doing that if we're not a document if we're not using a document database uh well we're going to do it through document shredding so we're basically taking this data structure this structured object and we're you know dividing it up into its component parts and storing that data in a tabular way that upon request we can um kind of rebuild that object and send it back um so this is an example of of how we're doing that and we'll just a really you know quick overview of what this is so you can see you know we have um uh you know our value a is stored in p0 and our you know value b which is a nested object is stored in p1 um and so on and so forth and so this is kind of how we we just take this this um this object and through an algorithm we shred it and and turn it into a tabular format um and it's you know we have ways to to handle arrays uh as well um and this kind of illustrates illustrates that as well so you know we have our our nested array that's stored in p1 uh if we had you know another value in that array we would have another row where p0 is c p1 is the array at value one instead of value zero or position one i should say instead of position zero and then we would have you know in p2 we would have our e and then our value for that so that's kind of how cassandra and how stargate specifically handles these these 
documents um and and allows it to be stored on a tabular database so let's go ahead and go right back uh to our hands-on and we're going to work with um some documents in astra yep all right so um our first point here is we can actually insert uh data into cassandra you know just normally uh we can insert into videos using json and we can give it this object um and then we can retrieve uh the data as well using json so cassandra has like some native support for json objects so as you can see i requested the title url and tags from videos and so we're getting all of these these objects that we've that we threw in here earlier um and so we do have some native support for for json objects um so we're going to um continue on through this we're going to uh create a application token real quick um so you can click on this link um to view the documentation on how to create the application token um i'm going to real quick do that for myself i'm going to go to organization settings token management i'm going to generate a new token um i'm going to just go ahead and do database administrator i'm going to go ahead and copy that so this is you know and the documentation will walk through this this is your token generation you can download a csv because after you go away from this page you cannot see this token anymore but there's some copy buttons here so i'm just going to go ahead and copy that button all right so once you have your token you can go back to your database the nosql db and up at the top we're going to go to the connect tab and we're going to scroll down um to where you see swagger ui right so we're gonna go ahead and go here all right to find where this is create a namespace yeah make sure this is the right one all right so i'm going to go ahead and paste my token that i copied into the header and then we're going to use this payload let's see this payload name namespace 1 replicas three um let's see here we go all right let me go ahead and execute that and as 
you can see we have our response and you can see that it is a 401 wait what did we get did i get an error alex yep i'm a little confused did this give me an error code 401 for this give me a second i'm switching to your context i was answering questions that's okay so what's the question so i'm using the swagger ui and i'm giving it this ah you want to create a keyspace yes yeah so i suggest just skipping the create keyspace no i mean namespace yep i suggest simply skipping this step and proceeding to the next one using the namespace you created already when creating the database it looks like security settings have changed and now you need another kind of administrative access to create this one so your error is most probably 401 correct yes yeah so pretty much expected i just suggest to skip this step okay all right then i guess we are going to get all namespaces okay all right sorry about that i guess they changed some stuff on us all right so we're going to get all namespaces here go ahead and click try it out i'm going to populate oh see i copied when i shouldn't have copied i have to make a new token oops for those who are wondering i'm just generating a new token i copied something else and it was not in my clipboard okay so we're going to put my token in there and we're going to pull all of our namespaces so as you can see the expected output here this is what we have we see our nosql1 keyspace or namespace that we created previously and we can see all of our namespaces right there so that is all working correctly which is good thank goodness all right so we're going to create a new empty collection in a namespace so this is the second one in the list create a new empty collection in a namespace let me go ahead and click try it out we're going to give it our token for the namespace id we're going to use namespace one and i'm not going
to copy i'm going to type it this time we're going to use name column one all right it's very simple and we should get a 201 hopefully hopefully i'm getting a 500. no operational server no namespace namespace one you must create it first but you should use a namespace which exists already okay so did you get the list of existing uh namespaces i did yes yeah so i'm gonna need something what exists already i'm just gonna go ahead and use uh no sql one yeah which i work which i have created already yeah it should and just repeat that so i've seen some questions what is swagger swagger is just a very it's a very simple thing it's just a tool a web browser a web based tool you can use to execute http calls so basically that's it all right so we've created this uh this collection in our nosql uh namespace keyspace um and so we're going to create a new document uh as well so let's create a new document that's uh then the fourth one down i'm gonna go ahead and try it out i'm gonna provide it the token uh namespace id no sql one collection id we're using column one which is what we defined in the collection before and this time it's gonna be this time you have to copy i do want to type it i'm not going to type that now that's going to take too way too long uh okay so i'm going to copy this i have my i have my stuff here just type sanity check um all right so i'm just gonna paste that in here um so this is uh you know we're creating a new document this is all the data that's included in the document and we're going to execute that okay an answer must be two zero one yes yep and so as expected we have our document id it sends us back uh the document id um and then we can find all documents of a collection let's try that [Music] uh search documents that's what it's called am i blind there it is and there is a question from praful goyal can documents and collections be created from the command line too in general answer is yes it depends on which document database you work but 
usually you have a command line tool or for http based apis you can use whatever httpie or curl like normal tools to execute http calls all right so in this one we're going to give our token define the namespace as nosql1 collection is column1 and we're going to leave everything else blank we're going to execute that and this should give us all the documents in the collection so we have our collection here and the data that we provided it in the previous example so as you can see the documents are being stored so we're going to retrieve a document based on its id so we need to get the document id which i believe is this guy i'm going to copy that so we're going to get a document in our next thing so very specific this one is seventh down i think go ahead and try it out i have my document id copied so i'm going to paste that in here go ahead and grab my token again give it my token namespace is nosql1 collection is column1 and then we're going to execute this and we're expecting the document very similar to what we had last time because there's only one document in here but this is that document we can see all of our data in there all right so let's search a document based on a where clause so we're going to go to search documents in a collection again all of my stuff is still populated so for where we're going to use see if i can type this quickly email equals c london at sample.com i'm pretty sure i missed the quotes there it is and we're going to go ahead and execute this and so we do expect to get the same document but we are providing this where clause so we're looking for where the email equals c london at sample.com and there it is so that's how we can search documents based on certain data within the document all right we'll open up to questions make sure people are following along just fine cedric says he's watching me good oh yeah cedric is watching us always all right yeah so let us know if you have any questions or having any
issues following along with that i know it's jumping around a lot it's basically just illustrating putting data into the database being able to retrieve it different ways of retrieving it yeah so let's go ahead and move on and at this point i'm actually going to hand it over to alex yes i think so key value give me a moment to switch yes it's going to be all about key value things and we have not so much time left hey i'm ready all right you can switch okay so before we proceed i will ask you a question how many database engines do you think are offered by the database services of amazon web services you may have heard of this company it's the first and most popular cloud provider known as aws or amazon web services how many database engines does it offer you to choose from just give me a number in the youtube chat i will give you a second to think but don't think for too long because we have to proceed okay just give me the number seven okay yeah cedric it's not fair sorry cedric the correct answer is 25 none of these answers was even close 7 4 3 10 3 8 4 6 3 it's all wrong the overall count is about 25 well it's a little bit of cheating because it includes different versions but if you exclude versions it's still around 20.
so 20 engines just suggested by one single company why do i need all of them as said if you need to put a nail into a wall which tool would you use ryan which tool would you choose to put a nail into the wall to i don't know hang your coat on it my fist probably oh yeah maybe you can find something better no i'd use a hammer of course yeah of course and you see in your toolbox most probably a set of different tools and that's exactly the same with databases so now we talked a lot about document databases what's the essence of this part of the workshop what should you really learn what should you really remember document databases are the best when you have no idea which kind of data you will work with from its properties point of view and when relations aren't so important to you so for example i know that one object in my database will have properties like name username email but another user might have some in general email username account i don't know t-shirt size can you fit that into normal relational databases yes you can by adding a new column into your table every time and it may well work in the beginning but if those properties keep changing and new ones are added then document databases will behave much better because they don't care about the particular set of properties you are adding they are schemaless and that's a place for the next big mistake we talked a lot about mistakes today very often people consider nosql as schemaless it's wrong cassandra has tables cassandra is schemaful not schemaless key value databases like redis don't have any schemas at all they cannot be schemaless as they don't support schemas at all in general it's like it's not light it's not dark it's not black it's not white it's just absence of color so nosql is not schemaless nosql can be very different that's the main idea i
want to bring to you today. Now let's move on to key-value databases. What's the main thing about key-value databases? Hold on, Alex, real quick: I don't know that your screen share is... My screen share, oh, okay, that makes sense. It's good I was talking about the same slide all the time, thank you. Give me a second. My bad, I was too engaged in this part of the talk. Now I'm sharing, and now it's hopefully good on the screen. So here we go; thank you, Miss Ted, but it was just a couple of seconds, you caught me at the very beginning. Yes, VS is right, I'd better have a picture of a hammer. Yeah, indeed. So, key-value databases: when do you have to think about them, when are they a good fit, and how do you have to use them? First, they are good at performance: they are quick to use and they work very quickly. Second, they are good only at working with very simple data: you have one key, you have one value. Imagine a library. In a library every book has an identifier, something like an ISBN number, and every book has a place. You come to the library, you give the name or the number, you get the book, simple as that. It's not like in the library you store books, books, books and then you also store a structure; it usually doesn't work like this. So you have a long set of keys and matching values, and that's simple, unstructured data. That means your data can be scaled very easily; it's an easy-to-scale approach with key-value databases. That's why they are used very often for high-scale, high-performance cases, and also for simple cases of data items not related to each other. What's the most widely used use case for key-value databases? A distributed cache, or a non-distributed cache, or just a cache, doesn't matter. Some of those databases are distributed, some maybe not so much, but the idea is still the same: it's a cache. Again, you get the ID of something, you get the value of something.
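The cache use case described above can be sketched in a few lines. This is only an illustration, assuming a plain in-process Python dictionary stands in for a real key-value store such as Redis; the names `KeyValueCache` and `cached_lookup` are hypothetical and do not come from the workshop materials.

```python
from typing import Callable, Dict, Optional


class KeyValueCache:
    """Toy in-memory key-value store: one key maps to exactly one value."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        # Store or overwrite the single value kept under this key.
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        # Return the value for the key, or None when the key is unknown.
        return self._data.get(key)

    def delete(self, key: str) -> None:
        # Remove the key if present; a missing key is not an error.
        self._data.pop(key, None)


def cached_lookup(cache: KeyValueCache, key: str,
                  load: Callable[[str], str]) -> str:
    """Read through the cache: call the slow loader only on a miss."""
    value = cache.get(key)
    if value is None:
        value = load(key)
        cache.put(key, value)
    return value
```

The only operations are put, get and delete by key, which mirrors the library picture: you hand over an identifier and get exactly one item back, with no relations between entries.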
It works very easily, it works very quickly, and it's easy to scale. So yes, you may have distributed caches, you may have multiple instances of, for example, Redis. You can use Cassandra as a key-value database. One of the most well-known key-value databases is DynamoDB, running on the recently mentioned Amazon Web Services, and so on. So it's everything that has a simple structure and usually a very limited set of operations: you can store something, retrieve something, delete something, update something. There are no relations; it's not like one key manages multiple values. That's not supported; it's not the purpose of this database. It is also used very often to store user sessions; if we speak about basic applications, that's a very frequent use case. And that's it. I don't want to talk too much about key-value databases because it's a very simple approach. I see a question from Gautam: what about MongoDB? Nothing about MongoDB; it was discussed in the previous part of the talk. Now we speak about key-value databases, and MongoDB is a document-oriented database. There is a trick: can you use it as a key-value database? Yes, you can, because basically you can have a document with one single value. But it may not be the best idea, because Mongo has limited write capacity, and with a distributed cache you very often want very quick write performance as well. So, okay, you can use it as a key-value store, but it's in general not the best idea here. So, 'MongoDB or Cassandra DB?' Ruthie, I hope I pronounce your name right. If someone tells you 'this database is the best and you have to use it all the time', take a hammer from Ryan (he has one, we know), show this hammer to the person saying that, and say: that's a wrong answer. Simple answers are very often wrong, because there is no silver bullet in software development. If you want to be a good engineer (and as I'm having Ryan's camera at the moment, you should follow my words and be a good engineer), you have to think
of the purpose. For some purposes Mongo will be the best and for some purposes Cassandra will be the best. What is wrong? There is nothing wrong in using MongoDB and nothing wrong in using Cassandra, but it's wrong to use Cassandra when you would need the other, and the opposite, because they are very different. So you have to learn the basics of both, and then you will be able to decide. So the question 'which is better, Mongo or Cassandra?' is wrong by definition. Never ask it again; at least don't ask me, because I may be rude at this point already. So let's go on, and as we have discussed the basics of key-value, let's go to hands-on number four, key-value databases. We are going to work on exercise number four, and notice I'm using this repository: github.com/datastaxdevs/workshop-introduction-to-nosql. Now I have to find step four. Boom, perfect. In this case, to demonstrate how the key-value idea works, we will do something interesting. As you already know (you see, you're learning a lot of things today), Cassandra is not a key-value database, but we are going to use Cassandra as a key-value store with the help of Stargate, mentioned today by Ryan. Now, I don't want to go too deep into Stargate, just the idea of key-value, a simple idea: putting values and getting them. So first I should go to my Astra. I already have an Astra account (by the way, I'm curious about the result; I bet I've created one, but okay), and I have this nosql db, as you have or will have already, and I have a Connect page here. As this exercise states, I will use the GraphQL API. Now, GraphQL is a very special story. You may access data in multiple different ways; what I will use today is a graph query language to work with my data. It's a very powerful tool that lets you retrieve data and work with it through GraphQL queries. So what do I need to do first? I have to find the GraphQL API on my dashboard, that's it, and I need to get an application token, create a
new one here. So I get to this new page, the Astra security settings, tokens, and I will generate a token. In this case I will use something powerful like Database Administrator, just because I'm going to destroy it in a second. And that's my token, so that's what I will need, and to store it I will download the CSV. Okay, so that was pretty easy. Now, having that, I have to go and launch the GraphQL Playground. Usually you use GraphQL from your application code: you develop something, you write code, and you use GraphQL from there. I don't want to write an application today, I'm not in the mood, so instead of writing an application I will use a simple web-based tool called GraphQL Playground. It's exactly like Swagger, which we used before, but for GraphQL. It's integrated into Astra, so I don't have to install anything; I'm simply using this link to go to GraphQL Playground. Your link will be different, please notice that; it's unique per person. And now I have to put in this setting, x-cassandra-token. Oh, I have to move it a little bit, give me a moment. Yep, so it was hidden behind Ryan; I've pulled it up a little bit. On the bottom-left side you will have HTTP Headers, and x-cassandra-token asks to be populated. Let's do this: I'm putting my Cassandra token into it, and it is done. So now I can start executing queries using the generated token, and I can use GraphQL to retrieve some data. So now I can hide it back, and here we go. On the left side of the screen you have a place to put your requests, here you have the button to execute them, and on the right side you have the output. In this case, keyspaces. GraphQL is great when you want to get only some parts of the data, how to say it better, so you can work with just those parts. Okay, you'd better just use it and see how it works. In this case I query for keyspaces, but I don't want to get all the information about the keyspace; I just want to work with the name, and
therefore when I (oh yeah, we always have to specify the key, sorry), so when I execute it like this, I am getting the names, not the whole metadata. What comes next? We got information about our keyspaces; I'm going to create a new keyspace called nosql free, and I really doubt this will work, but let's take a look. I believe you're going to have to use nosql1 again. I will have to, yeah; it was recently changed. Yes, I'm getting an exception, so there are some restrictions on keyspace creation. Okay, no worries, I will simply skip this step and work with a keyspace I already have. This next one looks pretty sophisticated, because in this case I'm going to create a table. Yes, it's key-value, but keys and values are still stored in tables. The name may vary from database to database, you name it a collection or a table or whatever, but it's still the idea of keys and values. So here I have to use, what was my keyspace name, test. In your case it will be something like nosql1; in my case it's test. The table name is just key value, and then partition keys, so something that will be working as the key. It will have the name key, just a simple one; I could name it id, I could name it whatever I prefer, but I will keep it as key, and its type will be just text. And every key has an associated value, so in my case the value's name will be value, I'm not so creative, and its type will again be text. So it will be key-value, text to text, like the hash map most probably familiar to you. Cool, so I'm going to execute this one, and yes, I got my data: kv, true, which means it has been created. Perfect. kv in this case is just the ID of this mutation; because I may run multiple mutations at once, we need a way to tell them apart. Well, what goes next? Yeah, Cedric makes a joke about REST, but you know what, for the simple cases REST will be much better than GraphQL. Again, what's better, REST or GraphQL? Nothing is better; it
all depends on your situation and on your use case. So okay, let's move on. Next, I want to populate this table, so I want to insert some values into it, and I'm going to execute two mutations; I told you already that we may have multiple mutations at once. Now let's take a look. 'Cannot query field insert key on type Mutation'. Okay, so key k1, insert2kv, that's the name of the mutation, and it's two mutations we execute. GraphQL explains how to put in the values. So what are we using here? Key, key one; value, something. No, I want to have something different. You know what, we have Ryan, so I will have Ryan here as a host, and my value will be, oops, guest, because I'm a guest speaker today, and my name is Alex. That looks good enough to me, so I'm executing this one. Okay, and something went wrong: validation error, field insert key value in type Mutation is undefined at insert key value. That is interesting. Oh, because you didn't navigate to the table. Correct, okay, thank you. Yeah, change the tab to graphql and pick the URL. Thank you, I missed the step, it's my bad. And the URL is going to be not graphql-schema but, oops, graphql and my keyspace. And my keyspace is test, correct? Yes, but for everyone else it's probably going to be nosql1. Yeah, so for me it's test and for you it's nosql1. So thank you, Ryan, for catching my mistake. Well, it's a live demo after all, it happens. And you see that our data was stored. Now, how do we retrieve it? Very easily. I don't want to execute one more mutation; okay, let's do this one, if the instruction asks me for that I can do it, and we have stored some more data. Ah, that's a graphql tab prepared for me already, so I could have just switched to the next tab. Okay, and I think for the mutations we are done. And that's an interesting point: in our case, as we use Cassandra as a key-value store here, which is a pretty good fit for that, especially
using Stargate, as we discussed already, we can simply retrieve the same values using CQL. So what I will do now is go to Astra, and here in the tab I have the CQL Console, which you used already or are going to use, and I can ask for my keyspaces. I don't think you're showing the Astra window at the moment... oh, it just popped in, it's just quite delayed, a little bit slow. Oh, I made a typo. Now we see it, okay, we are good. So I can ask to describe tables, and I can ask for my values here: select all from key value, and those are the records I've recently created. You have seen them already, and they are here. That's basically it for key-value databases, and I want to switch to the next one. Feel free to play with them again; Astra covers millions and millions of queries per month for you, so don't be afraid to overuse it, and even if you do, you didn't put in your credit card number, so you won't be billed for that. The workshop is almost over, so we are working on the last step of the workshop; it's two hours long and we have 13 minutes left, so I will go very quickly now, because you still have to play a game and participate in the quiz. So the last kind of NoSQL database we work with today is the graph database. That's an interesting point: what do we call databases like Oracle or PostgreSQL or MySQL? We call them relational databases. And are relations really first-class citizens in those databases? No. There are entities and their properties, you have tables and rows, and those are first-class citizens, but relations aren't; there is no entity like a relation. If you need to have a relation, you are basically very limited from the relational database point of view. You have one entity and another entity, and you might have a relation between them: a foreign key, one-to-many, or many-to-many with an intermediate table. But you cannot query for relations; you can query only for entities. Why do they call them relational after all? Well, those
times, maybe 50 years ago, when they first appeared, it was relational enough. But now we want more; now we are very focused on relations, and that brings graph databases to us. Graph databases represent data as graphs, as you may see on the right side, and in those, relations are first-class citizens. You can describe a graph as a collection of vertices and edges, and it's about highly connected sets of data. Typically you can imagine Facebook, for example, or any other kind of social network, where people comment on each other, where people make friendships and other relationships, where people, I don't know, like movies. Based on this graph you can ask: give me the names of all my friends, or their friends, who liked the movie Terminator and didn't like the movie, I have no idea, let's say Matrix. That's about retrieving data based on its relations. So in graph databases, I say it again, relations are the most important things; it's all about relations. An address belongs to a country, a customer resides at an address, an order belongs to a customer, a customer rates a product, and a product has a tag. Then, walking through those vertices and edges, you can get very interesting information. Graph databases are a new thing, so most probably you have never even heard of them, or definitely never used them; that's very fine. What I suggest is to really follow the practice for this one. I guess we will have nearly zero time to work on that, but I will try to walk through all the relations between those entities and see how it works. What are they usually used for? They are helpful for discovering relations between different objects, relations you know already or maybe never heard about. Their queries are based on filters and attributes for both nodes and edges, as edges may have their own properties, let's say not exactly the same as node properties, but the idea is the same, and we can traverse by following those
edges. What are other use cases for this one? Social networks, when you need to work on this kind of data; personalization and recommendation; machine learning; similar-behavior calculation and this kind of thing; fraud detection (we have a great demo on fraud detection with graph databases based on DataStax Enterprise Graph); healthcare; pathfinding; and so on and so forth. Graph databases really care about relationships, even more than relational ones. Are they scalable? Yes, they are scalable. What's the downside then? They work fast as long as your data is limited in size. When you have huge sets of data, graph databases and operations over billions of nodes and edges may not be the fastest. So you don't want to use your graph database as your OLTP database. You know why? Because it's not an OLTP database; by its requirements it goes much closer to OLAP databases. And the last exercise I suggest we do the following way: first we run the quiz right now, because our time is almost over and I want everyone to be able to participate in the quiz. What do you think, Ryan? I think that's probably a good idea. And then you can stay with me after the main part of the workshop is done, and I will show you hands-on number five, graph databases. I want to show it in full, not just ask you to do it on your own, because for this exercise we have a requirement, which is Docker. If you don't have Docker, install it; but as we have a lot of students, you may not have it, so I want to show it, so you will at least be able to see how it works even if you cannot run it on your own. Okay, Ryan, now it's your turn. Yes, so we're going to go ahead and do the quiz first, and then we'll get back to the exercise if you all have time to stick around. For this quiz, hopefully you still have your phone or your tab up with menti.com. I will go ahead and put up the code again. So this is
our quiz to win some swag; the top three winners get swag. It's important to know that you should use your phone or your tab and pay attention to it when the questions come up, because there is a delay on the stream, and the amount of time it takes to answer the question matters. Speed matters, so make sure you're paying attention to the phone. So let's allow people to get in. I'll post the code in the chat again, and when you're in, give us a thumbs up so that we know that you're here. I've got lots of people, awesome. All right, I think we're good, so let's get started: it's quiz time. I'll wait for a little bit longer because I know a lot of people were in here and I think it might have reset. Give us a thumbs up again if you're in; don't take too long, though. Awesome. Okay, so I'm going to go ahead and get started. Remember, pay attention to your phone or your web browser, because speed does matter. All right, first question: in a distributed system you can have at the same time: consistency, availability and partition tolerance; CA or CP or AP; a working and a non-working system; or don't worry, be happy. Yeah, you need to have paid attention to what Ryan was talking about; I hope you did, and if you did, you have chances to win. Yes, so: CA or CP or AP. Now this one was a little bit tricky; it's 'or' because you can have two out of consistency, availability and partition tolerance, and that's what CA, CP and AP represent. Don't forget: work can be done quickly, or cheaply, or in good quality, and you can have only two of those at the same time. A job cannot be done quick, cheap and in high quality; you have to decide. All right, so our leaderboard right now: we have Not Ryan, Victor and RC as our top three. It's very close, though. Some of these names are difficult to pronounce; sometimes it happens. All right, question two; remember, answer fast to get more points. What is Astra? Is it a local in-memory version of Cassandra, is it
Cassandra as a service in the cloud, is it a development tool, or is it a gateway to the stars? All right, it is Cassandra as a service in the cloud, correct; most people got that right. And yes, you can use Astra as a development tool, but in general it's Cassandra as a service in the cloud. Yeah, it is useful as a development tool, though. All right, so we have, I guess it's supposed to be Not Ryan (I guess that's probably a good thing if you're not me) still in the lead, Victor is in second, and Abhi is taking third place. And one point again: to participate in the quiz and to answer questions, please use menti.com, not the YouTube chat. Yep, and go ahead and paste that again; the code is at the top of the screen as well. All right, question number three: what is Stargate? We talked about this as well. It's a new Mario game (that'd be good), a TV show, a data gateway to give you REST and GraphQL APIs on Cassandra, or a gateway to the stars. Well, in some way, yeah; I mean, I think most of these are correct. No, it is, yes, a data gateway to give you REST and GraphQL APIs on Cassandra, and as a bonus point it is also a TV show, a very good one as well. All right, let's see where our leaderboard is. Not Ryan is still in the lead at the top with the fastest answer, Pratik takes second place and Victor is bumped to third. All right, question four. Do you maybe have some music? You know, I used to have music and people complained about the music; I think sometimes it's too loud. Okay. Your data set is highly connected and needs a lot of joins; what do you use? A column-oriented database, a relational database, a graph database, a document database, a key-value database, or an aspirin? I think I need an aspirin after reading all those choices; the word 'database' means nothing now. All right, oh, it's kind of spread around. A graph database, yes: highly connected, needs a lot of joins, that lends itself towards what Alex was talking about:
relationships are very important in a graph database, and so that is the best option. All right, let's see where our leaderboard's at. Not Ryan retains the lead, Abhi is in second place now, and Pratik gets bumped to third. And Hindu was the fastest one getting into the top ten, which is already a lot. Yeah, we have a lot of people, almost 200 people participating, so it's a big honor to be in the top 10. Yep, definitely. All right, question five of seven; remember, answering fast gets you more points. You need a distributed cache to store simple values; what do you use? I'm not going to read these options again, they're all the same except for the last one, 'another chance'. Well, maybe I should read them. No, I think it's fine, they're all on the screen. All right, key-value database, that's correct, and most of you got that correct, awesome. Two people want another chance; well, maybe tomorrow. All right, let's see what we have. Oh, looks like Not Ryan dropped out; that's okay, sorry about that. Pratik is the fastest and is also in the lead, Hater (that's mean) is in second place, and Hai Wu is in third. Very well. It's anyone's game, though; it can change very quickly. One mistake will cost you a thousand points. All right: you need to store and retrieve a JSON document from an ID; what can you use? Column, relational, graph, document, key-value, or another chance? All right, yes, the document database is kind of a gimme; we are storing and retrieving JSON documents, after all. Most of you got that correct. Let's see where we are on the leaderboard. Pratik is still in the lead, Hater is in second, Hai Wu in third: no change in the top three. We have a few that bumped up; the one named 825, the room code, had the fastest answer. That's funny, good name. All right, and we have our final question, so it's your last chance. You need a scalable database with heavy reads
and heavy writes; which one do you use? Column, graph, document, key-value, or another chance? This one might be a little bit tougher. Yeah, but this one is actually pretty easy if you know the databases; most of our attendees are beginners, though. Hey, Cedric. Still, a majority of them got the correct answer: a column-oriented database is the best to use for heavy reads and heavy writes. I guess I did touch on that when I talked about Cassandra use cases. All right, let's see where we're at, and the top three people, get ready to take screenshots, because you will need the screenshot to claim your prize; some people have forgotten. We have Pratik in first, congratulations; in second we have 82553590; and Yo Rudy in third. Congratulations, everyone. We will drop a link in the chat shortly on how to redeem and claim your prizes. Congratulations, and thank you for joining us in that quiz. All right, we also have a quick survey. Again, we do want those of you who want to stick around for the graph hands-on part to continue to stick around, but we do have a quick survey, and we would love your feedback on how to improve these workshops for the next session. We always want to improve ourselves and make sure that we address shortcomings, so please let us know what you think; it really does help us. 'Good and informative', thank you. 'Very nice one', thank you, really appreciate that; we put a lot of effort into these. 'Time management': it's always hard to get through everything that we want to in two hours. We could fit into that time, but then we would not be able to answer questions properly, and that's what matters to us. 'It was great.' 'Need to increase time periods.' 'Not everyone would like music by Element'. Yeah, I get comments both ways, though: some people don't like
the music, some people do like the music, and sometimes it's too loud; it's hard to control the volume sometimes, so we'll see what we can do. I think it does help the Menti to have some music, but we'll have to figure that out. 'Mostly good', appreciate that. 'Very good time management', yeah. Someone said maybe a longer workshop. It's always tough to do a longer workshop; we've tried some, and it's tough to keep people around even for two hours, let alone any more than that. 'Love your workshop', thank you. All right, well, with that, thank you very much. I'm going to just go over a few final things before we go back to the graph hands-on, for those of you who cannot stick around, so we can get you out. We do have more resources that you can check out: at datastax.com/dev there's some hands-on learning; we have community.datastax.com, where you can ask questions in a Stack Overflow type of forum; again, we do have our Discord, which is for continuing questions around Cassandra and the workshops that we do; and you can follow us on Twitter at @DataStaxDevs. The materials (I think that's the wrong link) on GitHub will stay there, and you can go through them on your own time if you would like to go through it again. So we do have our homework. If you want to complete hands-on number five, that is actually what we're going to go through after all of this; we're going to go back to hands-on number five, which has to do with graph databases, and we're going to use Docker and all that kind of stuff. So that's one of the things, and then there is the 'try it out', so datastax.com try it out, or go to datastax.com/dev, and you can complete that short course as well. And (let me go back) we will give badges to those of you who complete the homework and submit it; I believe it's to the GitHub repo, there's a section
for submitting homework, and that should be described in the repo as well. Yes, yes. I want to throw a link to GitHub again, so I do it even three times so no one will miss it. There are a dozen questions on how to get the participation certificate, so I'll tell you. There are two certificates and two things to cover. The first one is the participation certificate, a simple one, a simple verifiable badge. You have to complete the homework, and the homework is explained, again, in the link I'm throwing into the YouTube chat right now. 'Can I only do Try It Out for the badge?' Not really; for this one you need to do at least the first four steps. That's important. Some of you may not have the ability to install and run Docker to do part five; that's fine, do step five if you can, but to get the participation certificate, to get this badge, you need to complete the first four steps for sure. The link again is in there, a link in the description of the stream, and the stream stays available. And there is a second one, which is much cooler. You see, the participation certificate is not a big thing: you just need to spend some time and that's it. The participation certificate doesn't really prove your knowledge; it proves that you did something, but it doesn't confirm how deep you are into it. And now there is a very different game. Thanks to DataStax: DataStax sponsors free education and certification for Apache Cassandra, for administrators and developers, and you can go through the course and pass a real, internationally recognized certification exam absolutely for free, without paying a cent. How to do that? How do we hand out vouchers for the certification now? Yeah, I'll get those in here. So I'm just putting in the instructions for those of you who won the Menti: you can submit your screenshot to jack.fire datastax.com, and I am now dropping the link for the voucher. Yep, so to learn and to prepare you have to
go to academy.datastax.com. It's free, you don't pay for it. So, academy.datastax.com, link in the chat, and the voucher link is pasted already. Then you choose a path on the Academy, developer or administrator, you go through this path, and then you pass the exam, because we give you enough information to pass the exam, and you get a real certificate, recognizable and verifiable and everything. You will need to show your passport and so on and so forth, so that's a real international certification, and you don't pay for it, but you have to put time into it. We don't want to give those real certificates to everyone; it would just ruin their value. So you really have to learn, you really have to put your hands on, and you have to remember a lot. On the good side, you get certified, you get knowledge, and you really improve your chances of getting a better job, because Apache Cassandra is in high demand and this demand will only grow: people want to handle more and more data within milliseconds. Right, so moving on real quick: we have weekly workshops, so datastax.com/workshops, and also the YouTube channel that you are currently on; think about subscribing. We do workshops every single week, sometimes multiple workshops a week, and we handle all sorts of different topics, so we would love to see you come back again. And just a final plug for our Discord: we have over 10,000 people right now, and we're very active in there answering questions and helping out as much as we can, and it's a really good community. So thank you very much; for those of you who cannot stick around for the final hands-on, we are very glad that you were able to join us, and we hope that you learned something and can take this knowledge and apply it in your careers and lives. Thank you very much for joining us. And for the future: we run workshops two times per week, same content on Wednesday and Thursday, for different time zones. So if this time
zone doesn't fit you very well, if it's too late and you have to quit, then just switch to the first-day workshop: same content, different speakers, still live, it's not a recording, so we are answering your questions. And that's the story. Now I think I can proceed with the graph, yes? What do you think? Yep, let's do it. So those of you who can stick around, please do so, and we'll go through that final hands-on, where we're going to do some graph stuff, which is part of the homework, so it's a little head start. Yeah, so that's my screen, I guess. Yes, perfect. Graph databases are all about relations. And we did great work, actually; yes, sometimes I have to praise ourselves. We did a great job trying to make this NoSQL workshop available for you to run only in your browser, so you don't have to install anything. But no one is perfect: we didn't find a good solution for graph databases. So how do we work with graph databases? I will show you, following this exercise number five. To do so I will need Docker and Docker Compose; if you have them, you can do it as well, or if you can install them, that will work. As it's explained on my side, I'm going to use the terminal a lot. First of all I'm going to switch to this project. First I will have to create a Docker network, create graph; that's just preparation. This one has no relation to graph databases; I'm just installing required things. Then you have to clone this repository from Git; actually, I did that already to save some of our time, so all the files are here. Then, as the next step, I have to execute the command docker-compose up. It will take some time for you because you will have to pull the Docker images, and I already did, to save some of your time. Time is priceless, time is very important. So in my case the pull is done already and it is only creating these new containers, and they are running. Very good. So now I need to wait a little bit
to let them start, to give enough time for things to start. I can take a look with Docker Compose at the status of the applications we have right now: they are up. And if I want to watch whether it's ready to be asked already: docker compose logs. In this case I need DSE, because the setup consists of two containers and we will need both of them running, but it takes longer for DSE to start.

Which graph database are we using right now? We use DataStax Graph. DataStax Graph is a part of DataStax Enterprise, but it doesn't really matter, because the idea and approach will be the same, and we are using the same language to work with graph databases. Okay, so it looks like DSE is almost ready. Gremlin is initialized. Gremlin is a language you use to work with graphs, not only in DataStax Graph but very often in other graph databases too. So it looks like: DSE startup complete. As soon as you see the words DSE startup complete, it means you are good to go.

Okay, so now, as it's explained here, I go to localhost 9091. Oops, I want to keep this page open. And localhost 9091, and here we have DataStax Studio. DataStax Studio is a tool like Jupyter or Apache Zeppelin; if you have used those before, the idea is the same. If you didn't, I have to explain: it's a way to work with data using so-called notebooks. You can write some text, execute some queries, write some text again. It's very convenient for working with data when you are researching something or educating yourself. Notebooks like Jupyter I use a lot in machine learning, for example. In this case I have pre-installed a notebook for you, or better to say not me but Cedrick. Hi Cedrick, thank you. And we are going to do this workshop, Introduction to NoSQL, graph databases. So I'm simply opening this one; it takes some seconds to start.

Yeah, or you know what, I can show it on my own, actually. Yes. So you ask about the homework. You know what, I love that. If you want to do
practice, it means that you want to learn more, you want to be better, and that's amazing. So, homework. Now focus. Don't ask questions in the chat; I'm watching the chat. Just watch what I'm doing. And what I'm doing is scrolling to the very top of this document. That's our repository; the link was pasted in the chat approximately 1000 times already, and it's also specified in the description of this YouTube video. And a little bit below, where is it, where's the link to the homework? Who deleted my link? Who deleted my link? Come on, guys, that's not fair.

Okay, so let me show you how you submit your homework. When you're done with those exercises and you're done with the screenshots for those exercises, you go to the Issues page. You see two homeworks already accepted and badges issued for the homework. You push the button New issue. Java, if you would listen to me right now, you will get the answer. You find Homework Assignment here and push the button Get started, and it asks you to deliver information. So I put my name. I put my name again, sorry, but this time it has to be full, so Alex Volochnev, type home. I put my email, blah blah blah. I put my LinkedIn profile, so it will be verifiable via LinkedIn. And I attach screenshots of the work I did: I just copy-paste screenshots here, like that, put in a screenshot, and it will upload the image. Done. It will look like that; don't be surprised, it's fine. And finally I push the button Submit new issue, and boom, I have my new issue. You see it has the labels homework and wait for review, it has an assignee who will review this job, and it has some screenshots. And once it is checked and verified, I will issue a badge, this participation certificate, for you. Okay, good. And one last time, how to do the homework: just read and do all the practice steps. Okay, I don't want to get back to that.

Finally we can proceed: graph databases. Those are very interesting. So, working with a notebook. I hope there are no more questions. Yeah, no worries. I also have to check the
connections, because one single notebook may work with different databases. What matters for me now is to check that this connection uses port 9042 and host dse. That's it; as long as it is, it's fine for me. So I get back to the notebook. That's my notebook, exactly as it's explained here. And finally, this notebook complains: this notebook uses graph studio tutorial, but it doesn't exist in the connection default localhost. Yes, I want to create this graph. It asks me some questions again; which answers to give is explained here: studio tutorial graph, Core, factor one, strategy NetworkTopologyStrategy. Boom, boom, boom, create. So the studio tutorial graph is successfully created, and now we can work with it.

So, first things first: if you have an error on top of the screen, then you need to configure your connection. Check it as it's explained in the README.md. Only when it matches what is specified in the README and DataStax Studio still doesn't work, then go to Discord and call us, and we will come and help. Then, second thing. Okay, so that's what I saw.

Now, how do you work with notebooks? If you never used notebooks before, it may be unfamiliar. There are two kinds of cells. One is text, so you just read through them. Okay: setup. Got your connection and graph? Great. Before writing data we need to define our schema. Yes, a schema exists: a user can rate a product, but a user can't rate, I don't know, another user. A user can add a friend but cannot rate someone. Well, in most of the use cases, let's say. So yes, we need to define a graph schema: define the vertices, edges and indexes of the graph. It's all explained here, and I will make it bigger. Then there is a second kind of cell, called Gremlin. It means they will be executed as Gremlin code, and Gremlin is a language for working with graph databases. Here, notice, are some statements that may be unfamiliar to you, but that's completely fine. Let's just read through them: they are defining the schema. Well,
with you, what we will use: vertex label god, if not exists, partition by name, with a property age, create. So this statement will explain to the graph database, DataStax Graph, how to work with this information. Then we create some more vertex labels: demigods, humans, monsters, locations and titans. So I'm basing this on Greek mythology, as you could guess. And now an interesting point: it's not enough to read through that. Maybe it's enough for you, but not for the database. Now we push the button Real Time. Notice the button here, Real Time. I push this button and those statements will be executed, and boom, I have success.

I see a good question: are Jupyter notebooks usable? I've done Scala and Python notebooks. So Jupyter notebooks are great and they work, and yes, you can work with graph databases from them. But in this case we have DataStax Studio notebooks. They are different, so you cannot just put a DataStax Studio notebook into Jupyter.

Okay, success. So now it means I have some vertex labels, but you know what, it's not enough. I have gods, demigods, humans, but what's the problem? At this moment they have no relations between them. You understand, we have entities but no relations between them, and relations are very important. So in the next cell I create edges: edge label father, from demigod to god. What does that mean? A demigod may have a god as a father, of course. And the same edge, well, not the same but still an edge label father, from god to titan, so a titan may be a father to a god, and so on and so forth. Here, for example, we create an edge label battled, from demigod to monster, because some demigods were fighting monsters, and it has a property time, because it happened at some point in time, so it's going to be timestamped. And we are going to create all of them now. I execute this cell as I executed the previous one, and something went wrong: vertex label demigod does not exist in graph. Oh. Okay, maybe there is a typo. On which line? Okay, I guess it will be the same: vertex label demigod does not exist in graph.
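The schema cells described above can be modelled in a few lines of plain Python. This is only a sketch of the idea, not the DataStax Graph or Gremlin API: the Schema class and its method names are made up for illustration. It shows why the order of execution matters: an edge label names the vertex labels it connects, so defining it before those vertex labels exist fails, much like the error seen in the notebook.

```python
class Schema:
    """Toy stand-in for a graph schema: vertex labels plus edge labels."""

    def __init__(self):
        self.vertex_labels = {}  # label -> {"partition_by": str, "properties": tuple}
        self.edge_labels = {}    # (name, from_label, to_label) -> property names

    def vertex_label(self, name, partition_by="name", properties=()):
        # "if not exists" semantics: defining a label twice is a no-op
        self.vertex_labels.setdefault(
            name, {"partition_by": partition_by, "properties": tuple(properties)}
        )

    def edge_label(self, name, from_label, to_label, properties=()):
        # An edge label may only connect vertex labels that already exist
        for label in (from_label, to_label):
            if label not in self.vertex_labels:
                raise ValueError(f"vertex label {label} does not exist in graph")
        self.edge_labels.setdefault((name, from_label, to_label), tuple(properties))


schema = Schema()

# Running the edge cell before the vertex cell reproduces the failure
try:
    schema.edge_label("father", "demigod", "god")
    error = None
except ValueError as exc:
    error = str(exc)
print(error)

# Vertex labels first (the session mentions an age property on god)
schema.vertex_label("god", properties=("age",))
for label in ("demigod", "human", "monster", "location", "titan"):
    schema.vertex_label(label)

# Now the edge labels can be created
schema.edge_label("father", "demigod", "god")
schema.edge_label("father", "god", "titan")
schema.edge_label("battled", "demigod", "monster", properties=("time",))
print(sorted(schema.vertex_labels))
```

In this toy model, re-running the vertex cell is harmless because of the "if not exists" behaviour, which is why simply executing the first cell again is a reasonable first thing to try.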
That is really weird.

You need to create the graph. Have you created the graph?

Yeah, yeah, yeah, so the graph is created, studio tutorial graph, and the vertex labels are loaded. So at this point I don't understand what's going on. Could not create edge label, demigod, father, to god: vertex label demigod does not exist.

Okay, did you re-execute the first part, where you create them?

If not... okay, yeah, so they must exist now. One element returned, success. Yeah, that's pretty clear. Okay, I bet I executed the first cell. What just happened here? But okay, as long as it's success, we are good.

And now, finally, we define indexes. Those indexes will help us to retrieve information based on those relations. I don't want to get too deep into that; I want it to happen pretty quickly. Okay, success. And finally, as we have the schema for the graph created, now we need to load the data into this graph. The idea is very simple for us, as it's all prepared: you just execute it with Real Time. But we can take a look. First we add a vertex titan, give it the name Saturn, with the age of 10,000.
Then we add a vertex location, and we add vertex god, like Jupiter and Neptune, and so on. I don't want to stop for too long, but we may see some interesting relations. So first we add vertices: Saturn, sky, Jupiter, Hercules and so on and so forth. And finally we add edges. So we see that for Neptune we add an edge lives, where does he live, from Neptune to sea. Now, that's very interesting: what's the core difference between graph databases and relational databases? Relational databases kind of specify relations too, we discussed that. But take a look: we add an edge, a relation, from Neptune to sea. It's called lives, because Neptune lives in the sea, and it has a property reason. Why? He just loves waves. So we assign a property to a relation, which is kind of cool. And we also add brother edges, from Neptune to Jupiter, because they are brothers, from Neptune to Pluto again, and so on and so forth. There will be some battles, there will be some other relations, and so on.

I'm executing. Execution error. Okay, what now? What now? Variable brother brother is unknown. Okay, the previous one is executed, hopefully, this time. So let me try to call this again. Okay, it looks like we have to research this one deeper. What's on line 22? It was working just this morning. So: variable brother brother is unknown, line 22.
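The loading step can be pictured with a small in-memory model. The sketch below is plain Python, not Gremlin; the Graph class and its add_vertex and add_edge helpers are assumptions made for illustration. It shows the point made above: an edge is a record of its own that carries properties, such as the reason on the lives edge.

```python
class Graph:
    """Minimal property graph: labelled vertices and labelled, directed edges."""

    def __init__(self):
        self.vertices = {}  # name -> {"label": str, **properties}
        self.edges = []     # dicts with "label", "from", "to" and "properties"

    def add_vertex(self, label, name, **properties):
        self.vertices[name] = {"label": label, **properties}

    def add_edge(self, label, from_name, to_name, **properties):
        # Both endpoints must be loaded before they can be connected
        for name in (from_name, to_name):
            if name not in self.vertices:
                raise KeyError(f"unknown vertex: {name}")
        self.edges.append(
            {"label": label, "from": from_name, "to": to_name, "properties": properties}
        )


graph = Graph()

# Vertices first
graph.add_vertex("titan", "saturn", age=10000)
graph.add_vertex("location", "sea")
for god in ("jupiter", "neptune", "pluto"):
    graph.add_vertex("god", god)

# Then edges; the "lives" edge carries a property of its own
graph.add_edge("lives", "neptune", "sea", reason="loves waves")
graph.add_edge("brother", "neptune", "jupiter")
graph.add_edge("brother", "neptune", "pluto")

lives = [e for e in graph.edges if e["label"] == "lives"]
print(lives[0]["properties"]["reason"])
```

In a relational model the same fact would usually need a join table with an extra column; here the property simply sits on the relation.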
Frankly speaking, I have no idea; it should be operational. Okay, now we have success. Okay, maybe some background processes had not finished executing then.

Yeah, Saswad says it's around midnight here. As I said, we run every workshop twice, so for future workshops I recommend taking the workshop on the first day, which runs in a different time zone and is more convenient. Regarding the homework: you'd better submit the homework within a week, but if you submit it later and call me, I will still be able to review it and issue a badge for you. Okay, we don't want to make it too hard for you.

So let's run the next one. Let's take a look at our entire data set. Let's look for vertices that have labels like those. I'm not interested in the location, for example; I want to ask only for demigods. That's how Gremlin looks when you want to retrieve data: I'm asking for vertices, just a shortened, simplified view, given a label demigod. Now I execute that. Okay, let's try to fetch some more; it looks like it's not really initialized. Okay, now we have some. So that's how it looks in rows: we have some IDs and we have some labels, and the labels, you know, we have here. So we have, for example, Pluto as a god.

And then finally we get to the essence: graph view, my favorite part. Those tables, I'm used to tables; I've used tables for dozens of years. Let's take a look at the graph view. I execute this query, so I ask for vertices with labels, but I ask to show them not as a table, like I did right here, you see, here I selected it as a table, and not as JSON, but as a graph. I see something already very different. So here, I will not repeat this, you see we preloaded information into this one, and we can find multiple relations between them. We see, for example, that mighty Jupiter lives in the skies, and mighty Neptune lives in the sea, and Pluto prefers Tartarus, and there is a monster Cerberus, which was battled by the demigod Hercules, which is a
who has a father, mighty Jupiter. And also Hercules fought the monster Hydra, and his mother is the human Alcmene. And also, well, he was fighting with someone all the time. Yeah. So it's going to be pretty easy.

And now we work on the neighborhood visualization. We want to find the adjacent vertices. The query may look like this: first I ask g, g here means Gremlin, because I work with the Gremlin language; then V, for vertices that have the label god with the name Pluto; and I want to show both edges, for brother, lives and pet. And how would the result look for me? Pluto has brother Neptune, and Pluto has brother Jupiter; he lives in Tartarus and has a pet, Cerberus. I will try to remove, for example, the lives part and re-execute. You've seen how it works anyway. Okay, so now this edge is excluded, because I didn't ask for it. And by the way, notice these nice labels down here: label god and label monster. So that's basically how it works.

I will scroll down. We can work with paths and subgraphs. For example, in this case we are looking for paths and subgraphs for Hercules' mother and father. So basically this tree, or this graph, is a subgraph of the main one, showing all the relations of our users, if we consider them as users. And we see all the relations of Hercules because of this repeat command. So we basically go through vertices with label demigod that have the name Hercules, then repeat for out edges mother and father, and create a subgraph. And that's a subgraph indeed. I can make it a little bit bigger, but I believe you see it already. That's my subgraph, and that's how I retrieve it.

Okay, what else do I want to show? Our time is over and we have to quit. I wanted to show something, but that doesn't matter for you now. Charts are also secondary. So basically I think we are done for today. Feel free to play with this exercise as much as possible, because it gives you a new ability to work with data, which maybe
you never had before. Graph databases are getting more and more popular. And I think basically we are done with today's workshop.

Awesome, right. Well, thank you, everyone, for sticking around with us and going through that final hands-on. I've seen a few questions coming in about the homework and how much time you have. As Alex said, usually a week from now is your time frame for turning it in. But we really appreciate you joining us and sticking around, and hopefully you all learned some things. For those of you for whom it's very late: as I said, we do have another time that we do these workshops at, and it's Thursday. I guess it's my midnight, but it's your morning, right?

Yes, the second run of this workshop will be in the morning, European time, tomorrow, and therefore it will be daytime for India. I see people asking about India. Question: do we have to submit hands-on five, or all? You have to submit homework on the first four, and on five if you can. On the first four, because those you can do in Astra. The last one is not supported by Astra, so you have to do it on your own. If you can't, okay, bad luck for you, but you can still submit your homework without number five. Okay, good. I will add the information about the homework for this workshop to the GitHub repo. It was there; I have no idea who deleted it. But anyway, feel free to reach me on LinkedIn and I will be happy to answer your questions. And again, feel free to reach us on Discord. We will be... well, not just me: this workshop was done by a big team of people, by Cedrick Lunven, Ryan Welford, I did a little bit, and some other people answering questions. Thank you so much for joining us today.

Yes, thank you, everyone.

Ryan, we all did great today. Would you please play us some music at the end?

I will play an outro.

Oh, that one, yeah.

Let's see if I can grab that real quick. Yeah, that's true, that's true, but I do have to... And as always, don't forget to click that subscribe button and ring that bell to get
notifications for all of our future upcoming workshops.

Imagine a being gifted with powers from the goddess of Cassandra, who grew those powers until she could multiply at will, move with limitless speed, and unmask hidden knowledge. With those powers she was able to fully understand the connectedness of the world. What she saw was a world in need of understanding. From that day forward she sought to bestow her powers on all who came into contact with her, empowering them to achieve wondrous feats.
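The two traversals from the hands-on, the neighborhood of Pluto and the parent subgraph of Hercules, can be sketched in plain Python. This is a toy model over a list of edge triples, not Gremlin; the function names both_edges and ancestry_subgraph are hypothetical, and the edge list only contains relations mentioned in the session, plus one assumption that is marked in the code.

```python
# (from, label, to) triples; only relations mentioned in the session,
# plus jupiter -> saturn, which is assumed here from the mythology
EDGES = [
    ("neptune", "brother", "pluto"),
    ("pluto", "brother", "jupiter"),
    ("pluto", "lives", "tartarus"),
    ("pluto", "pet", "cerberus"),
    ("hercules", "father", "jupiter"),
    ("hercules", "mother", "alcmene"),
    ("hercules", "battled", "cerberus"),
    ("hercules", "battled", "hydra"),
    ("jupiter", "father", "saturn"),
]


def both_edges(vertex, labels):
    """Edges touching `vertex` in either direction, filtered by label."""
    return [e for e in EDGES if vertex in (e[0], e[2]) and e[1] in labels]


def ancestry_subgraph(start, labels=("mother", "father")):
    """Follow outgoing parent edges repeatedly and collect every edge seen."""
    seen, frontier, subgraph = {start}, [start], []
    while frontier:
        current = frontier.pop()
        for edge in EDGES:
            if edge[0] == current and edge[1] in labels:
                subgraph.append(edge)
                if edge[2] not in seen:
                    seen.add(edge[2])
                    frontier.append(edge[2])
    return subgraph


neighborhood = both_edges("pluto", {"brother", "lives", "pet"})
without_lives = both_edges("pluto", {"brother", "pet"})
parents = ancestry_subgraph("hercules")

print(len(neighborhood), len(without_lives))
print(sorted(parents))
```

Dropping a label from the filter removes the matching edge from the result, which mirrors what happened when the lives part was removed from the query. The loop in ancestry_subgraph plays the role of the repeat step: it keeps following mother and father edges until no new vertex is reached.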
Info
Channel: DataStax Developers
Views: 4,878
Rating: 5 out of 5
Id: K0hEwBbowKM
Length: 163min 15sec (9795 seconds)
Published: Wed May 05 2021