Ceph Intro & Architectural Overview

Captions
Hello, hey. So before we start, I made a slide about me, because I figured you might wonder who I am. I'm the VP of community for Inktank, and Inktank is the principal sponsor of the Ceph project. Being VP of community basically means that my job is to make sure that as we build a company around this amazing technology and its community, we don't screw it up. That's my job, and this is how you can reach me; I'll have it up again at the end.

So we're talking about the cloud today. When you look at most cloud stacks, you're looking at compute, network, and storage as the three main pieces, and Ceph is a storage technology, so that's where we live. We call it "the future of storage" just because we think it is. But here's the past of storage. Initially, data storage was something between a human being and a rock, or maybe a chisel, back when information was first being generated and stored. Eventually we moved on to a human, ink, and paper. Then something happened, and now you have a human, a computer, and some sort of digital media. That's the history of storage on one slide. Notice we have this human–computer–tape paradigm going on, which can really be expressed as technology sitting between you and your data. That's where Ceph lives: it's the part of the technology stack that sits between you and your data.

Everybody's seen a graph like this; this is what's happening to our data. We're not getting less data as a society or as a species, we're getting more data. Scarily more data. Lots and lots and lots of data. In the old days, when people wanted to scale out data storage, they'd do something like this: attach a bunch of disks to a computer, so multiple people can access multiple disks through a single computer. It usually ends up looking a little messier than that, because things scale, as we say. So people figured out how to scale up, which means they take that computer and replace it with a really big, expensive computer that costs a lot of money. This is called scaling up, and it's broken for obvious reasons. So people have now figured out how to scale out. Scaling out means that instead of taking what you have today and making it bigger, you take what you have today and make it broader: you use even more of it and you parallelize work, which is a whole new approach.

As people began to figure this out, they started to build appliances that do this. All these computers and all these disks in this scale-out storage architecture get put into a box — lots of computers and lots of disks in one box — and this is what we call a storage appliance. These are actual things: you buy them from people, you write checks for them, they're hardware, you put them on a forklift, get them delivered to your data center, bolt them into a rack, and plug a cable in. It's an actual thing. And here's what's inside that thing. If you buy a storage appliance, you're looking at proprietary hardware.
It's not a Dell or HP box; it's proprietary hardware. On top of that there's a layer of software that's proprietary, and on top of that, generally, there's support and maintenance. That's a storage appliance. Just for a bit of trivia: one of the largest storage companies is EMC, and I looked at their 10-K. They earned about 5.2 billion dollars on support and maintenance — that's about thirty-four percent of their 2012 revenue — they put about 1.1 billion into R&D on their proprietary software, and they have 1.6 million square feet of manufacturing space to build this proprietary hardware. This is really big business.

So what has started to happen, right when this model was at its most successful, I think, is that the cloud has started raining bits down on our heads. We're here today because there's an inflection point, and something has to change. This is the approach that we favor: standard hardware — everyday hardware you already know and are already using — with open source software that's free on top of that, and then enterprise subscriptions on top of that if you need them. They're optional; there's actually a tag that says "optional" up there. We think this is a much more sustainable model for storage, and that's how Ceph got built: we saw a need for this exact thing.

Looking at Ceph, there are a couple of design considerations that were made. I'll start with the philosophical ones and then go on to the more technical ones. First, we wanted it to be open source, because we think that's the best way to spread new technology and the best way to get it into the hands of the people who can use it most quickly. It was also designed to be community focused. A lot of people make open-source technology that's not community focused — they'll say "open source" and people will infer "community focused," but they really just mean open source. Community focused means that anybody can decide what new features Ceph should have, anybody can fix a bug, anybody can update the documentation, and because all of us are a whole lot smarter than some of us, we end up with a better product. That's what we wanted for Ceph.

We also wanted it to be scalable — as I said, not just a bigger version of something, but more of something. If you have a convention in town and you need to house 20,000 people, you don't just try to find one big hotel, you find a lot of hotels, and you need to be able to deal with that. Part of being scalable is having no single point of failure. None — not "most of it has no single point of failure but there's some controller node on top that is one" — there really is no single point of failure. The third piece is that we wanted it to be software based. We're not a hardware company; Ceph is not about building hardware, it's about building software. And finally, we wanted it to be self managing, because when you have something this big you can't be jumping up and down every time a hard drive fails; it needs to deal with outages in an appropriate way. So we took all these design considerations, and eight years and 20,000 commits later, we ended up with Ceph.
That's actually a graph of commits, and it's kind of terrifying, because a commit is a count of how many times somebody has added something to Ceph, and it's ramping up. So we end up with Ceph, which is — this might be a little difficult to see, because our marketing diagrams are not high contrast; I have a technical diagram later that you can read from the back of the room — but this is essentially what Ceph is. Underneath, it's a storage cluster, an object store, and on top of the object store there's an object gateway, the Ceph Block Device, and the Ceph File System. If you want to speak to the storage cluster using objects, you go through the object gateway; if you want to speak using virtual disks, you go through the block device; if you want files and directories, you go through the file system. But all of it is stored in the same storage cluster, and there are three interfaces. That's the big-picture view of Ceph.

The technical overview looks a little more like this, and you can read it from the back of the room. I made both of these diagrams; one of them adheres to our design system and one of them does not — can you guess which? This is what all the pieces look like, with the technical names for the same things as on the last slide, and I'm going to go through each of them.

I'll start with RADOS. RADOS is the Reliable Autonomic Distributed Object Store, and it's what's underneath everything in Ceph. Here's roughly how it works. Say I have five disks — these could be spinners, they could be solid-state, they could even be RAID groups if you wanted. I put five file systems on top of those five disks; today Btrfs, XFS, and ext4 are the file systems Ceph supports. On top of those file systems you put OSDs, the object storage daemons. This is a software layer, and what it does is take each of those disks and make it part of the Ceph storage cluster. It's a simple piece of software: when you configure it, you point it at a path and it turns that path into a storage location for the storage cluster. And of course, when you interact with the storage cluster, you're interacting with the entire cluster as a logical unit, not as an individual series of hosts — you talk to the cluster as one thing.

You'll notice that in the image there were two types of cluster members: the blue thing with the red bar, and the M. The blue thing with the red bar is the OSD, the object storage daemon, the software responsible for providing access to the data. Generally you'll have tens to ten thousands of these in a cluster. You want roughly one per disk — really one per path, so if you have a RAID group underneath, you want one per RAID group, but generally one per disk. The OSD is responsible for serving stored objects to clients: if a client requests an object, the OSD is what actually passes that object back. It's also responsible for peering with other OSDs for replication and recovery tasks, so when nodes go down or come up, the OSDs work in a peer-to-peer fashion to rebalance the data and recover. That's what the OSDs do.

The monitors are the second type of required cluster member. Their only job is to maintain cluster membership and state: they know who's in, who's out, who's up, and who's down at any given point in time. Generally you want an odd number of them, and a small number, because they vote on whether a host is in or out, up or down, using Paxos. If you have a hundred of them, that's a really unnecessary number of participants who have to agree on something, so you want a small and generally odd number of monitors. Another important distinction is that monitors don't serve stored data to clients; they're not part of the data path. They're just there to make sure everybody understands the state of the cluster.
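As a rough illustration of that split between the data path and the cluster map, here's a minimal sketch using the python-rados binding. It only asks the cluster about itself; it never touches object data. The config path is an assumption about a typical client setup, and the mon_command call may vary slightly between Ceph releases.

```python
import json
import rados  # python-rados, the Python binding for librados

# Assumes a standard client setup: /etc/ceph/ceph.conf and a valid keyring.
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

# Cluster-wide usage, reported for the cluster as one logical unit.
stats = cluster.get_cluster_stats()
print("kB used:", stats['kb_used'], "objects:", stats['num_objects'])

# Asking the monitors a question (here: OSD up/in counts). The monitors answer
# questions about cluster state, but they never serve object data themselves.
ret, out, errs = cluster.mon_command(
    json.dumps({"prefix": "osd stat", "format": "json"}), b'')
print(json.loads(out))

cluster.shutdown()
```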
So that's RADOS underneath, and everything else in Ceph is built on top of it. RADOS is a really flexible, usable object store that we use as an application platform for building all this useful stuff.

The first useful thing is librados, which, as the name suggests, is the library for accessing RADOS. The way it works is: say I have an application and it wants to talk to my storage cluster. I link that application with librados, and that gives my application the intelligence to speak to the storage cluster. It's that simple. And it's not web services or REST-based or anything like that — it's a native protocol, a socket — so it's really, really fast. librados provides direct access to RADOS for applications, and if you want some of the advanced functionality of RADOS, like access to its internal dictionaries or to object classes, you need to use librados; that's how you get at that stuff.
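To make that concrete, here's a minimal sketch of what "linking with librados" looks like from Python, using the python-rados binding. The pool name 'data' and the config path are placeholders, and the pool has to exist before you write to it.

```python
import rados  # python-rados wraps librados

# Connect to the cluster described in ceph.conf (path assumed).
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

# Open an I/O context on an existing pool -- 'data' is just an example name.
ioctx = cluster.open_ioctx('data')

# Write an object and read it back. The client talks straight to the OSDs;
# there is no gateway or proxy in between.
ioctx.write_full('greeting', b'hello ceph')
print(ioctx.read('greeting'))          # b'hello ceph'
print(ioctx.stat('greeting'))          # (size, mtime)

ioctx.close()
cluster.shutdown()
```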
The next thing built on top of librados is RADOS GW, the RADOS Gateway. The RADOS Gateway sits between your application and RADOS, similarly to librados: your application contacts the gateway and it essentially acts as an intermediary. The trick is that it speaks that native socket protocol out of the southbound side, but on the northbound side it speaks REST — and when I say REST, that's generic; specifically it speaks S3 and it speaks Swift today. So the RADOS Gateway is an object storage proxy. It's a really thin layer, and all it does is translate REST into direct RADOS calls. It also supports the buckets and accounting you'd need to use it as a replacement for Swift or S3, it integrates with Keystone, and it supports accounting for billing and that sort of thing. That's the RADOS Gateway.

The next thing is RBD, which is probably the most interesting for people who are standing up cloud stacks. This is the RADOS Block Device, our block storage interface. The way it works is that if I have little bits of a volume spread throughout my cluster in four-megabyte chunks, RBD will link into the virtualization container — QEMU/KVM today, not Xen yet, but Xen is coming up soon — and present that as a disk to a virtual machine. Or, to say it the other way: if I have a volume that I want to store inside RADOS, RBD takes that volume, stripes it up into a whole bunch of chunks, and distributes it throughout the cluster. When I want to use it, I can either link with librbd from within a virtualization container, or use something else like Samba — we have Samba and Ganesha plugins in the works as well — and that gives you access to those volumes.

You can do something really interesting with this. Since you're not storing the image on the same physical hard drive where your hypervisor lives — since you no longer have your compute and your storage on the same physical node, because you're distributing your storage — you can do things like take a virtual machine and move it from one container to another. You can suspend it on this hypervisor and bring it up on that other hypervisor without having to copy the image across, and RBD actually supports live migration; your hypervisor needs to support it, though.

Another way to get access to RBD volumes is the kernel module, which has been in the mainline Linux kernel for a while now. You can essentially map an RBD volume to a Linux device: you run "rbd map" and it gives you a device in /dev that you can mkfs on, mount, and treat just like a normal disk — but it's actually distributed throughout the cluster, so you get parallelism on your reads, you get the redundancy of knowing you have replicas underneath, and it's a very robust way to store disks.

So the RADOS Block Device stores disk images inside RADOS, and it decouples the VM from the host, which is really powerful because it gives you the ability to distribute load among a pile of different hypervisors. Generally, images are striped across the entire cluster — really across the pool that you choose, because there's this concept of pools in Ceph. You can do really interesting things like snapshots of these disk images, and copy-on-write clones, which I'll get into a little later. It's had mainline Linux kernel support since 2.6.39, it has support in QEMU/KVM, and we have native Xen support coming soon. It's also been integrated into all the cloud stacks we can think of; CloudStack most notably works pretty well with the RADOS Block Device for storing volumes.
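Here's a small sketch of the librbd path using the python-rbd binding: create an image inside a pool and treat it like a disk. The pool name and image size are placeholders; mapping the same image through the kernel module would be done with the rbd command-line tool instead.

```python
import rados
import rbd  # python-rbd wraps librbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')        # 'rbd' is the conventional pool name

# Create a 1 GiB image; RADOS stripes it across the cluster in small chunks.
rbd.RBD().create(ioctx, 'demo-disk', 1 * 1024**3)

# Open the image and do block-style I/O at arbitrary offsets.
with rbd.Image(ioctx, 'demo-disk') as image:
    image.write(b'boot sector goes here', 0)
    print(image.read(0, 21))             # b'boot sector goes here'
    print(image.size())                  # 1073741824

ioctx.close()
cluster.shutdown()
```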
The final piece of the Ceph system is CephFS, and CephFS is a distributed file system. You'll notice there's a new type of cluster member here, the metadata server. The way it works is that when you mount the file system stored in Ceph, you first talk to the metadata server for all the POSIX semantics: it stores permissions, last-modified times, what directory a file is in and what else is in that directory, the recursive accounting, all of that. So there's a separate round trip — you go to the metadata server for the metadata, and then you get your data directly from the OSDs. The metadata server is not part of the data path any more than the monitor is; all the data always comes from the OSDs. But the metadata server lets the management of all that POSIX metadata happen in a separate unit that can also scale out the way we need it to. Its job is to manage the metadata, including the directory hierarchy and file metadata, and it actually stores that metadata inside RADOS — it wouldn't make sense to have it sitting on the local hard drive of some metadata server — which means that if you lose a metadata server, you can bring another one up and recover the state. And it's only required if you're using the shared file system; if you're not, you don't need any metadata servers at all.

So let me pause for a second for clarifications before I go into my next section about what makes Ceph a little different from the other things you might hear about. Any questions before I move on? Yes — one second, there's a microphone on the way. Question: did I understand correctly that Ceph supports block-level storage like iSCSI does? Yes — as a matter of fact, there's an iSCSI target framework that I think we just contributed some patches to, so iSCSI support for RBD is coming. But yes, it is block device support, although it's generally user space at the moment: there's librbd, which you link with to get access to those volumes, or there's the kernel module. The iSCSI work is a community initiative at the moment, but it's on its way.

Okay, I'll continue. I'd like to talk about what makes Ceph a little bit unique — some of the things that make it different from whatever else is out there. The first is how it places its data, and therefore how it finds its data later. If you have a whole bunch of computers and a whole bunch of disks, and you want to write or read an object from this cluster, you have to know where to connect. A lot of people solve this by having a controller that you connect to, which then sends you to the right host on the back end, but that's an extra step and it doesn't scale. So you need to know which OSD to connect to for your data.

I have a metaphor for this, and I call it "how long did it take you to find your keys this morning?" Every time I get home, I'm supposed to take my keys out of my pocket and put them in a little dish on my counter, so they're always in the same place and I always know where they are. And I never do it, because I suck, so I'm always looking around for my keys. So: how long did it take you to find your keys this morning? There are two ways to do data placement. The first is that you talk to a centralized metadata server somewhere and say, "I'm looking for this object, where is it?" and it says, "that one's on the fourth box from the top; connect over there and get it out of this pool." Somebody keeps track of where every object is. This is what I call "dear diary, today I put my keys on the kitchen counter" — imagine pulling out your phone every time you put down your keys and writing down where you put them. That's essentially what a lot of storage systems do. The other way is that you look at your cluster and split up the namespace: these objects go here, those objects go there. You break it up like you'd break up the World Book Encyclopedia on a shelf — A, B, C, where Q is tiny, Z is tiny, and M is huge and all torn up — and when you have an object that starts with F, you know which box it's on. This is what I call "I always put my keys on the hook by the door," and it's the system I use for my keys at home. But it doesn't really work when your house is infinitely big and always changing, which is what a storage cluster is. Imagine if every time you came home your house were infinitely large and different — where would you put your keys? You'd probably keep them in your pocket; I wouldn't put them down anywhere.
The way Ceph does this is totally different. We call it CRUSH — we're really fond of our acronyms — which stands for Controlled Replication Under Scalable Hashing, and it's an algorithm. This is basically how it works: if I have a bunch of bits I want to store in the cluster, the very first thing I do is hash the object names to split them into a bunch of placement groups, so I figure out which placement group each object belongs in — just splitting them into high-level groups based on their names. Then I call the CRUSH algorithm on each of those placement groups, passing it the cluster state and a rule set, and based on that, the algorithm calculates where the data lives in the cluster. The calculation happens on the client: you don't need to talk to anybody or ask anybody to figure out where the data is — you can calculate it from the cluster state and the CRUSH rules you obtained from the monitor. And it gives you a statistically even distribution of your data.

So CRUSH is the algorithm Ceph uses to place data in the cluster. It's pseudo-random, which means it looks random but really isn't. It's a fast calculation — there's no lookup and essentially no overhead. It's repeatable and deterministic: given the same inputs, it always gives you the same result, which is pretty much what pseudo-random means. It provides a statistically uniform distribution — not every storage node will have exactly the same amount of stuff on it, but it's statistically uniform. And it's stable, which means that when something in the input to the algorithm changes, the output changes as little as possible — so when something happens in your cluster, recovery has to move as little data as possible.

For example, when a client wants to connect to the server that contains an object, it calls CRUSH, and CRUSH says, "okay, it's there and there" (it should actually be pointing to both of the green ones on the slide, but who's paying attention). It tells you where your data is and which host it's on; you're not connecting to any centralized host.

Diving a little deeper, this is how it really works. Say I have an object named "foo" in a pool named "bar". The first thing that happens is that I hash "foo" onto the number of placement groups and figure out that, based on its name, foo belongs in placement group 23. Then I look at the pool "bar" and find that bar is pool number 3. So foo in bar ends up being placement group 3.23. Then I pass 3.23 into the CRUSH algorithm and it tells me the target OSDs — and all of this is a calculation that happens on the client.
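To show the shape of that idea — and this is not the real CRUSH algorithm, which descends a weighted hierarchy of hosts and racks described by the CRUSH map and rules — here's a toy Python sketch: hash the name into a placement group, then derive the target OSDs deterministically from the placement group and the cluster map, with no lookup table anywhere. Every name and number in it is made up for illustration.

```python
import hashlib

def toy_placement(obj_name, pool_id, pg_num, osds, replicas=2):
    """Toy illustration of Ceph-style placement; NOT the real CRUSH algorithm."""
    # Step 1: hash the object name into one of pg_num placement groups.
    pg = int(hashlib.md5(obj_name.encode()).hexdigest(), 16) % pg_num
    pgid = f"{pool_id}.{pg}"          # e.g. pool "bar" = 3, PG 23 -> "3.23"

    # Step 2: derive target OSDs from (pgid, cluster map) alone.  Same inputs,
    # same answer, on every client and every OSD -- nobody keeps a directory of
    # object locations, and removing one OSD changes the answer for only a
    # small fraction of placement groups.
    ranked = sorted(
        osds,
        key=lambda osd: hashlib.md5(f"{pgid}:{osd}".encode()).hexdigest(),
    )
    return pgid, ranked[:replicas]

# Hypothetical cluster map with five OSDs:
print(toy_placement("foo", pool_id=3, pg_num=64, osds=[0, 1, 2, 3, 4]))
```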
So let's say I have this cluster and I lose a node — I have ten nodes and I've lost one. What happens? The OSDs are constantly paying attention and working in a peer-to-peer fashion, so they'll notice that this OSD is down, and they'll get a new cluster map from the monitors, who have voted and agreed that the node is actually out and not just momentarily down — up and down are temporary, in and out are permanent — and the cluster decides, okay, this node is out. Then each of these OSDs goes, "wait a second, I have red, and based on the new calculation with the new cluster map, red belongs over there," and the data moves from node to node, peer to peer, to rebalance. Now, if I have a hundred nodes and I lose one node, CRUSH is going to move one hundredth of the data — it's one over n of the data that gets moved, which is a pretty small amount, especially considering that it's moving from many hosts to many hosts. So recovery can be very effective with Ceph. Then, of course, the client, having received the new cluster map, calls the CRUSH algorithm to recalculate the location of the data, and it ends up being told to go where the data is now. It's this really strange thing where clients use this magic algorithm to figure out where the data is, and the servers use the same algorithm to move the data to where it should be. It's like you're shooting baskets and you're really terrible, but no matter where you throw the ball, the hoop moves and the ball goes through the hoop. That's kind of how it works.

The second thing that makes Ceph really unique is the way it deals with layering and cloning on its block devices. Remember the image from before, where we have our blocks spread across the cluster and librbd is assembling them into a disk for a virtual machine? It never looks like that in practice — it always looks more like this: you have dozens and dozens, hundreds and hundreds of these VMs. So the question that occurs to anybody who runs a ton of VMs is: how do you spin these VMs up instantly, efficiently, and cost-effectively? One of our answers is what we call RBD layering. The idea is that if I have a block volume of 144 units of storage — I'm not saying whether those units are blocks or kilobytes or whatever, just 144 units — with RBD I can instantly copy that entire disk image as many times as I need, and the copies take up no additional space until I start writing to them. So now I have five copies of this disk taking up only the space of the original. When clients begin to write data, they write it to their copy; when they read data, they read it from their copy unless it doesn't have it, in which case the read falls through to the master. It's a basic layered block device, and it's super powerful, because I can spin up a thousand VMs based on my standard OS image and they won't take up additional space until they start writing to their copies. What it doesn't do is keep track of how those copies are diverging and then go back, notice blocks that ended up identical again, and deduplicate them — it's a one-time fork of the disk image, and that's important to understand.
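Here's what that layering looks like through the python-rbd binding, as a sketch: snapshot a base image, protect the snapshot, and stamp out copy-on-write clones. The image and pool names are made up, and depending on the Ceph release the base image may need to be created as a format-2 image with the layering feature enabled, as shown.

```python
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')
r = rbd.RBD()

# A "gold" base image; layering needs the newer (format 2) image format.
r.create(ioctx, 'ubuntu-base', 10 * 1024**3,
         old_format=False, features=rbd.RBD_FEATURE_LAYERING)

with rbd.Image(ioctx, 'ubuntu-base') as base:
    base.create_snap('gold')      # point-in-time snapshot of the base image
    base.protect_snap('gold')     # clones may only hang off protected snapshots

# Each clone is a one-time, copy-on-write fork: it costs no extra space until
# the VM behind it starts writing, and unwritten reads fall through to the parent.
for i in range(5):
    r.clone(ioctx, 'ubuntu-base', 'gold', ioctx, f'vm-{i:04d}',
            features=rbd.RBD_FEATURE_LAYERING)

ioctx.close()
cluster.shutdown()
```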
The third thing worth mentioning that makes Ceph really different from a lot of what's out there is how it deals with metadata on the distributed file system part of Ceph. We look at metadata all the time — anybody who works in Linux spends half the day looking at file system metadata — and there's tons of it; keeping track of where everything is and who owns it is tough. You'll recall that here we have the new cluster member, the metadata server, but there's something strange in this image: there are three of them. These metadata servers scale out just like everything else in the Ceph architecture. So how do you have a single, authoritative tree with multiple metadata servers? What we've built is called dynamic subtree partitioning. What it means is that when you have only one metadata server, that metadata server is responsible for the entire tree. When the second metadata server comes online, the metadata servers determine what roughly half of the load is and hand that half to the second metadata server — and this happens pretty quickly, because the metadata itself is stored inside RADOS, so there's no big state transfer; it can be a relatively instant handoff. When your third metadata server comes online, the load roughly breaks up into thirds, then fourths, and so on. You can even go all the way down and have a metadata server that handles just a single file: if you have a hotspot where everybody is trying to access one file at the same time, you can have a metadata server that only handles requests for metadata on that file. It all happens dynamically, which is why we call it dynamic subtree partitioning. It changes all the time — metadata servers are constantly handing off control of different parts of the subtree to other metadata servers — and that's how we believe we get around the metadata scaling problem inside Ceph.

I'm going to wrap up with some resources about Ceph. I'm a big fan of easy-to-remember URLs. If you want to download Ceph, ceph.com/get is what you need to get the latest stuff. If you want to deploy a test cluster, the best way is to follow our quick start guide at ceph.com/qsg (there's also a link to it from ceph.com/get). It's about a five-minute guide you can use if you already have Ubuntu set up, and you can get a Ceph cluster running pretty quickly. The third thing is getting Ceph deployed on the AWS free tier using Juju: one of our community managers, Patrick, put up a really helpful blog post that's a ten-minute guide to standing up a Ceph cluster on the AWS free tier using Ubuntu's Juju deployment tool, and it's pretty cool. Then there's ceph.com/docs for all the rest of the documentation. There's a member of our team named John Wilkins — if you ever see him, give him a hug; he spends his life looking at header files and documenting options, which is a very special kind of work, and we're all very happy that he does it. His docs are at ceph.com/docs.

Because we're an open source project, it's not just ours, it's all of ours, and we want everybody to be involved. We have mailing lists, as most open source projects do, at ceph.com/lists, and they're a good place to ask questions: there's a users list for any kind of question and a devel list for more engineering-type conversations. There's an IRC channel at ceph.com/irc, and if you need help, ceph.com/help — we actually have shifts on IRC where you know that a Ceph engineer is going to be there.
If you go there you can see the schedule, so if you have trouble getting Ceph up and running, you can show up on IRC during one of those times and somebody from Inktank, or from another company that's part of the Ceph ecosystem, will help you — it's really good to know about. Then of course there's the bug tracker, which is where bugs get filed, and if you want to start writing docs, you can contribute to the docs, and that's definitely something we need.

One final thing before I go: we just released Ceph Cuttlefish. We name all of our releases after cephalopods, because that's how we do it — there was Argonaut, Bobtail, Cuttlefish; the next one is Dumpling; for E we're thinking we might go with Emperor; and F is definitely going to be Firefly. Cuttlefish is the best Ceph ever, as every one of our releases is the best one ever, as typically happens. What's new in this release is the new ceph-deploy provisioning tool — a Python-based, easily scriptable, easily automated provisioning tool for getting Ceph up and running — and there are new Chef cookbooks for deploying Ceph as well. We have fully tested packages for RHEL via EPEL now in Cuttlefish. There's also an administrative API for RGW, for user and authentication management, which was something we needed to build for our Keystone integration for OpenStack but is available to everybody. We have pool quotas, and we have a handy "ceph df" command — when you have 10,000 nodes it's sometimes difficult to figure out how much space you have remaining, so it's nice to have that kind of command. And we now have incremental snapshots on RBD, so you can not only take a snapshot of a block device, you can take a snapshot and then another snapshot, get just the difference, and apply it to another image somewhere else if you want. So that's Ceph Cuttlefish, and again, you can get it at ceph.com/get. There's my contact information again, and I'm available for questions if you have any.

Question: is any compression supported — deflate or something like that — for storing the data? Compression — not at this time. It could be covered by the underlying file system, just like I think some encryption can be, so if you run the Ceph object storage daemons on top of a file system that has compression, then it will work, but there's nothing in the OSDs themselves that does compression at this point.

Question: how is it integrated with OpenStack? Did you mean CloudStack? I'm just kidding — it integrates with OpenStack. I actually had a slide about that which I took out. On the object storage side it speaks the Swift API and integrates with Keystone, so you've got authentication handled on that side and you have API compatibility. On the block storage side, it integrates with Glance for image storage, and you can actually make a volume that's a copy of an image — so you can take your Ubuntu image and make a volume that's a layered clone of it. It also integrates with Cinder — thank you, yes, Cinder — which is essentially a block device abstraction layer that lets you plug in a bunch of underlying block storage systems, and Ceph is one of those. And then, of course, through the hypervisor we integrate with Nova.
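As a quick illustration of that API compatibility on the object side, here's a sketch of talking to the RADOS Gateway with a stock S3 client (boto3 in this sketch). The endpoint URL and the access/secret keys are placeholders that come from whatever RGW user you've created, and nothing about the client is Ceph-specific — that's the point.

```python
import boto3

# Placeholders: point a generic S3 client at the RADOS Gateway endpoint and
# use the credentials of an RGW user you created. No Ceph-specific SDK needed.
s3 = boto3.client(
    's3',
    endpoint_url='http://rgw.example.com:7480',
    aws_access_key_id='RGW_ACCESS_KEY',
    aws_secret_access_key='RGW_SECRET_KEY',
)

s3.create_bucket(Bucket='demo-bucket')
s3.put_object(Bucket='demo-bucket', Key='hello.txt', Body=b'hello via radosgw')
print(s3.get_object(Bucket='demo-bucket', Key='hello.txt')['Body'].read())
```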
Thanks. Question: going into a little more detail on that — with Cinder, do you support shared volumes across multiple Nova compute nodes or Nova guest instances, the same volume being shared among multiple guest instances? With RBD we generally suggest that you mount it from one place. It's doable, but with a block device I wouldn't want to mount one image and run two virtual machines off of it. A couple more: do you support snapshots with the Cinder driver? Snapshots are supported in Ceph; I don't know if we've wired them up through Cinder yet, but you can always do snapshots in Ceph via the command line. And last question, on the RHEL client support: is that FUSE, the file system in user space, or do you have it in the kernel yet? I believe on RHEL it's the server-side components that we have in EPEL. I don't think the kernel in RHEL is new enough to support our modules yet, but RBD can be done through user space as well, and I believe the FUSE support is there — I just don't think the kernel modules are, because the kernel version isn't new enough.

Question: can you possibly have a split-brain condition with multiple metadata servers? You can't have a split-brain condition with the monitors, which is why we suggest having an odd number of them: you end up with two on one side and one on the other, and the side that has two is going to be the canonical side, so the monitors decide whether or not a node is really down. Oh — you weren't talking about monitors, you're talking about the MDS. Split brain with the MDS — I don't know; that's a good question for Sage, and I'll have to ask him. It sounds like they shouldn't, because if each one has control of only a part of the tree, then that's all it needs to be aware of, and if one goes down it's unequivocal that that part of the tree is no longer under anybody's control until someone else assumes it. But somebody owns the root of that tree, so is it really a split brain? Does the side that controls the root get to be the one that's right? That's a great question; I should ask Sage when I get back, and I'm happy to follow up if you want.

Question: Ross, I was able to find videos of you giving this presentation online — is the presentation itself going to be on SlideShare? Yeah, I've given it a few times, and there are videos out there; this intro-to-Ceph talk is one we give a lot, because it's information a lot of people need, so you can find it in a couple of different formats online if you need it. But these exact slides, with all the latest Cuttlefish material, are going to be up.

Cool. All right, thank you so much. [Applause]
Info
Channel: Sniper Network
Views: 89,722
Rating: 4.8989897 out of 5
Keywords: sdn, software defined networking, nexus, cisco, APIC, aci, iso, switch, managemnet, cloud, Management, Software, Technology, System, Data, Training, Security, Information, Systems, Design, Computer, Solutions, Business, Computer Security (Software Genre), OpenDaylight Project, openstack, Ceph, cloudstack, tutotal, Architecture (Industry), introduction, inktank, ceph storage, red hat, open source, mutli cloud, hybrid cloud, public cloud, private cloud
Id: 7I9uxoEhUdY
Length: 37min 10sec (2230 seconds)
Published: Sun Jun 28 2015