NetApp Enterprise Scale Hyperconverged Infrastructure with Gabriel Chapman

Captions
My name's Gabriel Chapman and I am the Senior Manager of NetApp HCI Go-to-Market. Basically I'm the HCI whisperer inside NetApp: I lead a team of solutions architects and experts globally who are dedicated to bringing the message and story of NetApp's entry into the hyper-converged infrastructure market to the masses. This is probably my fourth or fifth time presenting here at Tech Field Day. It's a great event - I've always loved what they've been doing, even back before I became one of the red-lightsaber wielders and could no longer come to Tech Field Day as a participant, I always wanted to. I've always had a special place in my heart for people who really want to understand the technology, dive in at a deeper level, and get past the marketing fluff. That said, there may be one or two marketing slides in here, because the marketing people sent me here, but I'll try to glide past those as much as possible.

We'll start with this one - I found it on Twitter today: bacon is our god, because bacon is real. Let's start the day off a little light. I'm actually quite perplexed that there's no bacon here given my Twitter handle, but there you go.

Let's talk about why we're here. Stephen was talking about how HCI is really popular - it's interesting that it's almost like the year of HCI, ten years before or after the year of VDI, whenever that year actually ends up becoming a reality. The reality is we've gotten to the point where there are so many participants in this marketplace - roughly thirty-eight companies working in the hyper-converged infrastructure space. There are some big dogs in there; all the major OEMs have gotten in at some point. We've seen acquisitions: recently SpringPath was acquired by Cisco, we saw SimpliVity acquired by HPE, Nutanix went and IPO'd, and we've seen the fourth permutation of an HCI product come out of Dell EMC. It is a very real ecosystem and a very real space. Five years ago, when I first entered this space, it was a hard sell. I had to go and talk to people, I had to do a lot of education - I flew and saw 500 or 600 customers in an 18-month period and really had to push to evangelize this space. Today a lot of people take it for granted that hyper-converged infrastructure is a solution they can leverage inside the data center.

Some of the reasons why: if we go back and look at how infrastructure, and how we provision resources, has changed over the last few years, I look at this curve - enterprises are at the turning point. We started out consolidating workloads, we started virtualizing workloads, and we got to the point where we want to automate as much as possible and actually drive a policy-driven data center, a policy-driven infrastructure. When I look at what Amazon does, what Google does, what Azure does: if I'm an IT administrator or an IT organization, how do I stop my internal customers from going to them, swiping a credit card, and provisioning a workload? They can do that in 15 minutes and have it up and running, meanwhile I may still be trying to figure out what the base IP address for that machine is, what the naming convention is, or whether the request has even made it into the ticketing system by then.
So there's all this pressure from the external world, where these as-a-service offerings, these cloud-like offerings, and the agility they provide are what I compete with internally. It's unfortunate, but it's the truth. There is no way to leapfrog from traditional legacy infrastructure to cloud overnight - I'm not going to take Exchange and containerize it - but there are other solutions I can go out and leverage. So I have to evaluate my infrastructure, my workflows, and my applications, and see which ones fit on which side of the puzzle, because eventually we all start to move toward that model. This is the space that hyper-converged infrastructure competes in: technically on-premises - and we used the word correctly for once, just kidding.

As a customer we all have lots of consumption choices, and I think I've presented some of this before, but the reality is that if I'm a company with technology that I leverage on a day-to-day basis, I have roughly four different modes of consumption. I can consume resources as a service: how many of you have used Google Docs? How many people here have used Salesforce? These are technologies where you basically pay per drink, pay per use, and that's one way people consume resources. At the startup companies I've worked for - and I've worked for several - we didn't have huge IT departments. I didn't provision a huge pile of infrastructure; I had the one guy who imaged laptops and set up a few things on the network, and everything else was a service that I used and leveraged, because it didn't make sense for me to spin up a huge amount of infrastructure. On the flip side, there are organizations with PhDs on site, with data scientists, true legitimate tech ninjas who know more about kernel file systems and everything in that realm than I will ever know in the rest of my life, and who can build versus buy - and they do. If I look at what Twitter does, what Facebook does, those types of companies - the hyperscalers, the large service providers, the large systems integrators - more often than not, based on the type of workflows or solutions they're providing, they will build a solution. Twitter has a 500-petabyte Hadoop cluster; they're not going to put that on a storage array. They're going to buy a $400 server that has no shroud on it, combine all of those together, and if one breaks they just throw it away, because it's a fungible resource - they don't care. But the vast majority of us sit someplace in the middle. We have purpose-built solutions that are very rich and robust and integrate with the workflows, solutions, and technologies we have, because that's the way we've done things, and then we look at converged infrastructure or hyper-converged infrastructure because it takes a lot of the complexity out of how we provision resources.

If I look at it from the NetApp world and that viewpoint, I have these two parallel operating systems and a concept called the Data Fabric. For us the Data Fabric is global data portability, mobility, visibility, and security. I have vision into my data and where it sits.
I understand who's accessing it, I understand how long it needs to be retained, I understand who needs to have access to it and whether I need to secure it, archive it, pull it from the cloud back on-premises, or put it in cold storage for the long term. There's a concept here of data being currency and how we leverage it inside the organization. Within that construct I have the two parallel operating system paths, plus tools and solutions that integrate across those different consumption continuums - whether I want to run a fully functional version of ONTAP in the cloud and leverage it as a service, or leverage the Fueled by SolidFire program. A lot of these technologies are built on the long-standing 25-year history that NetApp has with ONTAP, and the second path is part of the SolidFire acquisition that happened back in 2016. How we integrate with these different, disparate consumption models is just a means to an end: integrating with a larger portfolio of technologies. There is portability, there is visibility across all of these consumption spectrums, not lock-in to individual silos for each and every one.

If I go back to my days when I first started getting involved in hyper-converged infrastructure, the story I would tell you was: hey, I'm here to take away the complexity of all these disparate technologies and products - because what are they at the end of the day? They're just a bunch of commodity x86 resources running Linux. Well heck, that's something I should be able to virtualize. I should be able to virtualize all those constructs, put them in a single form factor, provision them to you, put a slick wrapper around them, and there you go: infrastructure in a box. Let's start stacking them up and go from there. That was really the first generation of the messaging - Gen 1 HCI was consolidation and simplification. I could come to you, the VM administrator, and say: you need storage? Right-click, create storage, you're done. You don't have to design for RAID; I can abstract all the complexity away from those things and make it really simple, because you guys are now the new kingmakers. The reality is most organizations are a little more sophisticated than that, and the simpler we make things, the less attractive they are to a sophisticated environment. So I can go after the low-hanging fruit - the DHCP server, the DNS server, the WINS server, the print servers, and such. Am I going to put my production Oracle OLTP database on this baby? It just depends.

If we look at first-generation hyper-converged infrastructure solutions and the architectures they leveraged, they made design choices that helped them get to market quickly and meet the needs of a certain segment of customers. If we go back a couple of years, where did they try to fit in? They tried to fit into VDI. It was a very linear, scalable type of solution; it met the need, and I could put everything in increments of every hundred desktops and scale out. But if I started to put workloads three, four, and five on there, performance would suffer, because there was no way for me to segment performance. In many respects it was a hybrid type of file system: I was using flash to do some things quickly, to do caching, but I was still relying on spinning disk for the vast majority of my storage requirements.
And if I ever defeated the cache I had to go to spinning disk, and maybe there were only four or five spindles in the box - and that's not a lot of spindle count for performance. So we didn't see a huge number of customers running highly performant workloads that required sub-one-millisecond read/write latency on those systems. Going further, I look at the flexibility of a solution. Any time I take a one-size-fits-all approach I'm going to make a trade-off or accept a caveat; I always have to placate the lowest common denominator. So maybe I wasn't able to scale resources the way I normally would in an organization. How many people traditionally scale compute, storage, and memory all at the same time, every time? That's not normally how we reserve or leverage resources. Many times a project comes in: I need 5 terabytes of storage, but I may also need another 25 sockets of compute. If I'm bound to a one-size-fits-all model I just scale all the resources at once, and I may end up imbalanced - too much compute, too much memory, or too much storage - with no way to leverage it outside of that initial silo I've deployed as an HCI product. That leads to workload consolidation: if I can't control performance and I can't flexibly scale my resources, how can I put my prod, test, dev, and QA all on the same platform? What do I do instead? I go out and build a silo for one, and another, and another, and now I've lost the vision of the large shared pool of resources we wanted to leverage - we're back to the same way of computing, just consuming it in a different form factor, a different metric.

That's some of what we found as we talked to customers, analysts, and partners. What we're bringing to market is a little different. Some may say it's not, quote-unquote, HCI - and I can have that discussion all day long about what is and isn't HCI - but ultimately it's about an outcome, and there are different values you can engrain into a solution based on those outcomes. For us, because of the technology and architecture we leverage, we have three metrics of differentiation compared to the marketplace. First, I have the ability to guarantee performance: we're leveraging the SolidFire storage technology under the covers, which is a very mature storage platform. We're on version 10 of our software, version 4 of our hardware, we've got six-plus years of production environments in the ecosystem, and some of the most demanding data center workloads on the planet - in the service provider, cloud, and enterprise space - leverage this technology on a daily basis. We've simply recomposed it, repackaged it, into a form factor that fits hyper-converged infrastructure. Second, we have the ability to flexibly scale our resources: if I want to scale storage, I scale storage; if I want to scale compute, I scale compute. I'm not bound by a one-size-fits-all approach, so I don't have to scale all of those resources at once - and if you really look at the workloads that exist in the data center, there are real drawbacks to forcing yourself into a one-size-fits-all approach. Third, we have the concept of automated infrastructure.
I want to leverage APIs, I want to leverage automation, I want to leverage simplicity. At the end of the day, hyper-converged infrastructure is a lot of times about the simplicity of provisioning infrastructure - the simplicity of the day-zero implementation, but also the day-one-through-day-768 operational state. One additional point we bring to market that's a little different from some competitors in the space: it integrates with the portfolio of products NetApp already has on the market. That's our integration with the Data Fabric - the ability to migrate data between a FAS and an HCI system and a StorageGRID and an AltaVault and ONTAP Cloud, all of these disparate products, and to have that global data mobility, visibility into that data, and the ability to secure it. We'll talk a little bit about that as well.

Under the covers, what is it at the end of the day? I've made this analogy probably a thousand times: HCI is kind of like an EpiPen. It's an auto-injector. I have $4 of medicine in a seven-dollar delivery mechanism, and I sell it to you for $300 - because if I'm somebody who's allergic to bees or fire ants or whatever, I don't want to have to run to my car, take out a syringe, stick it in a bottle, pull out the medicine, and inject myself; I'm probably going to pass out before I can do that. Sometimes the packaging is worth the additional cost. What we have here is storage technology based on a very robust, mature, enterprise-class, cloud-scale storage platform, baked into a hypervisor that holds roughly 82% of the market and that most people use and are very familiar with. So you have a very mature hypervisor and a very mature storage system - what do I do now? I build a thing called the NetApp Deployment Engine. It's a packager: it goes out and packages these resources together. It will provision a vCenter instance to control and manage the system, I have plugins that integrate with vCenter, and I can drive it all right from there. Then I start to bring a few more pieces to the table: data services integrations around the traditional data reduction technologies, high availability, self-healing, inline deduplication across the entire cluster - those are part and parcel of the storage system that sits underneath it, which, like I said, is very mature. Then we have integration points with the Data Fabric - whether it's SnapMirror or SnapCenter technologies, or how we integrate with a common S3 output - all the other technologies we've integrated with and worked with over the years are part and parcel of the solution as well. And then there's the third-party ecosystem: companies like Veeam, Commvault, Datos IO, and others that the virtualization space has come to rely on. Hey, you can use us and SnapMirror for backup and DR or replication if you want, but why reinvent the wheel and force somebody into a different motion if that's what they've already invested in? If you've invested in Veeam as your backup and replication strategy, we'll integrate with it and provide some additional value. It's kind of like why we don't include switches: because switches are religion in most organizations.
Nobody's going to switch from one switch vendor to another just because you brought in a really slick box that does some fancy stuff. It's the traditional layer-8 issue - the politics of most organizations. We could do that; we could also bring a software-defined networking construct to work with it, but we need to make sure those technologies are mature enough to facilitate what we actually want to do in terms of the endgame. So that's the high level of what it is. Am I turning the lights on and off? All right, sorry.

Let's peel back a layer. It actually looks like this - fancy bezel, make sure you get your pictures - but in reality it's a standard 2U, four-node form factor. So: 2U, four nodes, and the nodes can be either storage or compute. I have a storage node that provides storage services and a compute node that provides compute services. What you do not see here is a controller virtual machine sitting on top of either of these systems - we're not taking resources away just to turn the box on, and we'll talk more about that. At the end of the day you have the ability to combine compute and storage in the same 2U form factor and scale those resources in 1U, half-width increments.

Let's dive into these aspects a little deeper and then go further into the architecture itself. For us, guaranteed performance leverages a lot of the technology SolidFire pioneered with guaranteed quality of service: the ability to consolidate mixed workloads, deliver predictable performance, and provide granular control at the larger aggregate but ultimately down at the virtual machine layer as well. How do we do that? We dynamically allocate, manage, and guarantee performance to individual volumes as you provision them, by setting the min, max, and burst performance characteristics of those volumes. So I can say: workload A, you get a hundred IOPS for your minimum, five hundred IOPS for your max, and a thousand IOPS for your burst - because we know workloads get chatty, they get bursty. We see organizations and implementations where, hey, I just set the volume up and let it go, and I don't know what's going to happen to it. So here we see a workload whose minimum isn't set very high, and what happens when it starts to compete for resources with other volumes inside the solution? Its performance goes down, because its minimum isn't set well. But if I really do need to punch up the performance and guarantee its predictability, I can change that minimum from one level to another, and now I have guaranteed workload predictability in terms of the performance being delivered. I'm no longer competing for resources, because I've guaranteed the minimum - I've set the floor, I've created a walled-garden effect around that particular workload.

In real practice it looks more like this. How many times have you tried to run a bunch of disparate workloads in your environment, and something happens - somebody like myself, who basically doesn't know how to do anything with a database, runs a table query that scans the entire planet, and the next thing you know that workload has brought everything else down to its knees? And it's the middle of the day, and they're trying to run payroll, and the next thing you know nobody's getting paid and everybody wants to come cut my head off and put it on a pike.
This is not Game of Thrones, but it's close. We see this more often than not. As somebody who spent a good part of his career as an end user managing storage and virtualization: we always like to say it's the network; the network guy says it's never the network, it's your application and the storage layer that aren't performing, because somebody else is stepping on top of you. Here's the thing about HCI: the simplest part to manage is the storage, because it's usually just right-click, create volume - but it's the most complex, sophisticated part to actually implement as a foundational layer. Look at almost all the technical deep-dive material around hyper-converged infrastructure: it's about the storage subsystem that operates and manages the system, because that's the part that's hardest to deal with. We already had a great file system to do this. So I can go in after the fact and say: look at all these workloads that are running - let's normalize performance around each and every one of them. Hey, VMware base virtual machines, all that low-hanging fruit, we're going to give you this swim lane to play in; we'll give the database this one to play in; and VDI - we know that at some point in the day a bunch of people will come back from lunch and it's time to play online poker, so they're all going to log on at the same time and drive up utilization, and instead of crushing everybody else who's actually trying to get work done, we're going to give you the performance you need and segment those.

So when we look at workload consolidation and the ability to guarantee performance, I can really start to play Tetris with my infrastructure, because I no longer have to guess whether I have enough performance. I don't have to do weird math around spindle counts, I don't have to worry about pinning volumes into specific SSD tiers, I don't have to rely on some algorithm that's going to predict something. I just say: you get this performance, you get this performance, and you get this performance - and that's it, I'm done. If you need more, I hit a button and the performance is there, immediately. I have three questions to answer - how big, how fast, who accesses it - and once I've answered those, I've done everything I need to do for storage. Now I can go work on something else; I can go containerize Exchange if I want.

So you want the administrator to hard-set those values? I do. That's great for limited scale, but as you scale, what data can you grab about the workloads so this can start to happen automatically? There are a couple of pieces of information and a couple of ways I can do that - let me move one slide further. I can do it at the aggregate: I can build my VMFS datastore, my big bucket, put virtual machines in there, and turn on something like Storage I/O Control and assign shares, because that's what organizations do. I can also leverage storage policy-based management, so I provision from a programmatic approach: I can say, hey, base operating system disks in my organization, if I'm delivering vVols, all get 50 IOPS.
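To make those min/max/burst knobs concrete, here is a minimal sketch of what a provisioning call can look like against the SolidFire/Element JSON-RPC API that sits underneath the platform. The cluster address, credentials, API version, account ID, and volume size are placeholders rather than values from the talk, so treat it as an illustration to check against the API reference, not a copy-paste recipe.

```python
# Minimal sketch (not from the talk): creating a volume with a QoS floor,
# ceiling, and burst against the SolidFire/Element JSON-RPC API.
# MVIP, credentials, API version, accountID, and sizes are placeholders.
import requests

MVIP = "https://192.0.2.10/json-rpc/10.0"   # cluster management VIP (placeholder)
AUTH = ("admin", "password")                # cluster admin credentials (placeholder)

payload = {
    "method": "CreateVolume",
    "params": {
        "name": "sql-prod-01",
        "accountID": 1,                     # tenant/account that owns the volume
        "totalSize": 1_000_000_000_000,     # 1 TB, in bytes
        "enable512e": True,                 # 512-byte emulation for ESXi datastores
        "qos": {                            # the min/max/burst knobs described above
            "minIOPS": 100,
            "maxIOPS": 500,
            "burstIOPS": 1000,
        },
    },
    "id": 1,
}

# verify=False only for a lab sketch; use proper certificates in practice.
resp = requests.post(MVIP, json=payload, auth=AUTH, verify=False)
print(resp.json()["result"])
```

The three qos fields are the floor, ceiling, and burst discussed above; everything else is ordinary volume plumbing.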
Do they need more than that? If they do, I create a different policy for it. I don't have an innate ability to read the mind of your infrastructure, but I do have tools on the back end that can help. We have a product called OCI that can scan your environment, tell you everything you've ever wanted to know about the storage infrastructure you have, and give you 100% actionable intelligence - so I can go back and place data in the right spot, find out if data is stranded, find out if something's been over-provisioned, find the thing that's been given too many IOPS and tune it back down. But that's not inherent in the file system of the HCI solution itself; it's an add-on.

I mean, that's where the value would lie. We're adding petabytes of storage every few months, the workloads keep increasing - I work for a service provider, so this is the life we live. If we go and manually set it, that's only good until there's a problem, and then I have to go manually set it again. If you can provide some dynamic understanding - "this is a SQL database workload, that's what we've seen inside your environment and inside other people's environments" - that would be pretty compelling. So: I have six years of history across pretty much every service provider in the industry that I can draw on, based on how they've tagged their workloads. If I put a tag on my virtual machine that says it's a SQL Server or whatever, okay, this is what the I/O profile probably looks like. I have a thing called Active IQ, a cloud-based discovery and management platform that gives you analytics and telemetry into the clusters you're running. I also have a 100% API-driven ecosystem inside my own platform that will let you pull any piece of data you want - if you want to integrate with something like Grafana and do it locally, if the Active IQ instance doesn't meet your needs. And if you want to write some kind of charge-back that says, hey, this particular set of volumes keeps peaking past 500 IOPS repeatedly, maybe that customer needs to go to the next tier of storage and I need to offer them that uplift - for me that uplift is hitting a button, and now they're at 1,000 IOPS. I'm not migrating data, I'm not doing any data movement of any kind; I'm instantly providing that additional storage performance without any cost on my back end. Yeah, and you're providing a way for us to monetize it. Which is obviously great - monetization of storage within the service provider arena is what we've called Fueled by SolidFire, and it's an area where we've been very successful. I could give you the long list of service providers who leverage our storage on a daily basis; there are about 300,000 customers using SolidFire every day who have no idea they're using it, because we've been selling storage as a service through service provider partners across the globe. That also applies to our world here in hyperconvergence. We take the values service providers are looking for - multi-tenancy, security, scalability, and performance - because no service provider wants to write an SLA check.
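The "uplift" workflow described above - spot a volume that keeps pushing its ceiling and move it to the next tier with a single call, no data migration - might look roughly like the following sketch. The method names (ListVolumeStatsByVolume, ModifyVolume) follow the Element API, but the stats field and threshold used here are assumptions; in practice the telemetry would more likely come from Active IQ, OCI, or Grafana.

```python
# Hedged sketch of the "tier uplift" idea: find volumes that keep running at or
# above a ceiling and raise their QoS with one ModifyVolume call (no data moves).
# Method names follow the Element API; the stats field and threshold used here
# are assumptions -- real telemetry would more likely come from Active IQ or OCI.
import requests

MVIP = "https://192.0.2.10/json-rpc/10.0"   # placeholder
AUTH = ("admin", "password")                # placeholder

def element(method, params=None):
    """Tiny JSON-RPC helper for the Element API (lab sketch only)."""
    body = {"method": method, "params": params or {}, "id": 1}
    r = requests.post(MVIP, json=body, auth=AUTH, verify=False)
    r.raise_for_status()
    return r.json()["result"]

CEILING = 500  # the "keeps peaking past 500 IOPS" case from the talk

for vol in element("ListVolumeStatsByVolume").get("volumeStats", []):
    if vol.get("actualIOPS", 0) >= CEILING:          # assumed stats field
        element("ModifyVolume", {
            "volumeID": vol["volumeID"],
            "qos": {"minIOPS": 500, "maxIOPS": 1000, "burstIOPS": 2000},
        })
        print(f"Uplifted volume {vol['volumeID']} to the 1,000 IOPS tier")
```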
If I break something and I don't meet the needs of the customer - and heck, I don't even know what they want to put on our systems; I just sold this guy 50 VMs or whatever it is - how do I sell them storage tiers that account for that, and how do I control it? In many instances I had four different storage arrays on the back end; I put them on one storage array, and if it wasn't performant enough I had to do a migration event - maybe some downtime, a bunch of complexity, and I was making a guess at it. For us it's a simple API call, or a slider, that makes that performance go up or down - and we get to apply those principles to our hyper-converged solution as well. I can do this at the volume level - the big VMFS datastore, since we're talking about VMware here initially - but I can also bring it down more granularly to the individual virtual volume or virtual disk layer. Leveraging storage policy-based management, going in there and identifying the characteristics, that policy-driven automation down to the individual VM level provides a lot of value as well. Maybe I have 20 different standardized workload templates I want to leverage; if I do a little bit of work up front it saves me a whole lot of pain in the end, and then I know I'm not going to compete for resources - and I have all the management tools I already have in place, or the VMware tools that give me analytics, as well as our own management platform that sits outside.

Going a little further: optimizing for scale. Flexibility and scalability on your terms is kind of the way of saying that one-size-fits-all is not so great. It can be, in certain instances. I bought an all-in-one printer in 2007 - a Brother color laser, about 2,700 bucks - and it was awesome, because I got rid of my flatbed scanner, my fax machine, and my basic printer, and I got color, all at the same time, for $2,700. When it broke this year, what did I do? I went to Fry's and bought one for 99 bucks. What did I pay for? I paid for the ink - the hardware wasn't worth anything. At the end of the day, the software-based technology we put inside this thing is the ink. But then again, if I took that 99-dollar printer into my office with 200 people and tried to run it as the print and scan service for all those people, it would have burned out in 15 minutes. How many of us have had that discussion with our CTO or CFO or CIO at some point: "Why can't I just go down to Best Buy and buy a terabyte hard drive and put that in the data center?" I've had that discussion at least a dozen times. Because the minute more than two people hit that hard drive, it's going to melt. There's a difference between something that can do a thing and something that can actually do it at scale.

So we have these different concepts about how we interact with solutions. We want to optimize and protect your original investments, because it's great that I have the new shiny on the floor, but I'm not going to throw away everything else I've spent money on that's still depreciating tomorrow. It's like the Borg: essentially you will be assimilated, but it doesn't happen overnight. I also have the ability, like I said, to scale compute and storage independently.
That provides a lot of value in certain instances, especially depending on the size and granularity of the scale involved. Then there's a concept called the HCI tax: leveraging overhead just to turn the lights on, and not always getting exactly what you've paid for. That's not always the case per se, but there are instances where, depending on what you're provisioning, these different layers can make the TCO discussion with your CFO a real challenge.

Digging in a little more: we started by saying that under the covers this is an all-flash storage array that we've integrated with VMware and packaged for people who want a consumption dynamic that's smaller in scale and has a lot of simplicity baked in - that's essentially what HCI is. I don't want to get hung up on definitions that require direct-attached storage in every single node, because there are benefits to that, but there are also trade-offs. I might be able to get to a very small unit of measurement, but if I'm doing VDI and I need 10,000 desktops, and the amount of compute I need is so disparate from the amount of storage I'm provisioning, I could have 200 terabytes of storage sitting there completely stranded because the other solutions don't let you leverage it. I have an open storage model: not only does it provide storage for our hyper-converged infrastructure solution, it's also iSCSI storage to the back end, and I can still manage it that way if I want. So why not connect my Docker or my OpenStack or KVM or Hyper-V? I'm not going to restrict you from leveraging the storage that sits in the box - you've paid for it, so why not use it? If I'm not using it for all the VMware instances on that system, use it for something else too. That's flexibility, that's investment protection, that's getting value out of what you're buying. It's flash, and technically it's still expensive; ultimately I want the most bang for my buck, I want to get to 80 and 90% utilization rates, and I don't want a bunch of storage sitting out there that I can't use for anything unless I virtualize it and put it on that platform.

In terms of a solution that has utility across multiple disparate models, we go back to that set of consumption spectrums: this is the same operating system across the SolidFire all-flash array, across FlexPod SF - the Cisco-based FlexPod built on SolidFire technology - and our hyper-converged infrastructure. It is the same OS, the same technology underneath. You can replicate between all three, and you'll be able to integrate these solutions with each other at some point down the road. It provides flexibility: if you want to consume it this way, that way, or the other, it's still there. There's no code fork here - this is software-defined - and you can implement it in other areas without binding it to one specific consumption metric or modality (I may have just made up a word there).

So for us, start small. We don't have a 1U or a 2U solution; it's a 4U solution, because there are certain minimums we require. It's four storage nodes, because (a) we like high availability and (b) we like N+1 redundancy. You can start with two compute nodes.
I can scale to 32 compute nodes and 40 storage nodes; at that maximum it's about 1,300 cores, two and a half terabytes of memory, two petabytes of storage, and three million IOPS. That's usually enough to run Crysis; you can definitely run Quake on it - gamer jokes, groan as needed. Grow on demand, non-disruptive upgrades - a word I always struggle with - are kind of the key here. We want portability, and we want to make sure it's a living ecosystem of infrastructure, so the forklift upgrade goes away. If I go back to the original SolidFire customers, the very first SolidFire unit we ever sold: I can take version 10 of that software and run it on the very first box we ever shipped, as long as the customer is on support - and we'd probably give it to them anyway for having been a customer for six years. That's what we're trying to get to: we want to make sure the underlying systems are still supported. We may change out the compute all the time - every two and a half years or so, when the new processor generation comes out, I can change the compute out if I want. Maybe the storage still has three or four more years of life in it; I don't have to get rid of it, it's still compatible, and I don't have to worry about EVC or anything like that. It's still the storage construct I can leverage for as long as I want, or I can switch it out as well.

So we have different sizes of these constructs. You start with nodes that are available individually for scale: I have three different types of storage nodes and three different types of compute nodes, and I can mix and match them within reason. They don't all have to sit in the same 2U chassis - I could have two storage nodes and two compute nodes in a 2U chassis, I could do all four storage nodes, I could do all four compute nodes; it's up to you how you want to consume those resources. I have these t-shirt sizes of implementation: small, medium, and large for compute, and small, medium, and large for storage. So it's anywhere from 16 cores to 36 cores and 384 to 768 gigs of memory - we're using 10/25 gig for the network inside, so there's a little bit of padding in there - and on the storage layer it's anywhere from five and a half terabytes to forty-four terabytes of effective capacity and 50,000 to 100,000 IOPS per node. So if I have four of them I can have 400,000 IOPS and whatever 44 times 4 is - public school math and I don't agree, but you know. Like I said, you start with four of the same type, and then I can scale those resources one at a time. I don't have to scale four at a time, I don't have to put in an entire block of resources at once. Once I meet the minimums of four storage and two compute, if I want to scale out my compute I keep doing so: I'm scaling my memory and CPU resources, I'm attaching a license based on whatever consumes the license - a socket or cores - and I continue to do that. Then maybe a new project comes along and I need more storage: I put in another storage node, and I'm going to scale the storage as well as the performance of that storage, based on the type of node I put in the system - because we don't always scale everything all at once. That one-size-fits-all approach has its benefits in certain very linear workloads.
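A quick back-of-the-envelope sketch of how those per-node figures aggregate: since the cluster presents one pool, IOPS and effective capacity simply sum across whatever mix of storage nodes you add. The node numbers below are just the marketing figures quoted above (a large node at 100,000 IOPS and 44 TB effective), not a sizing tool.

```python
# Back-of-the-envelope sketch: the cluster presents one pool, so IOPS and
# effective capacity simply sum across the storage nodes you add.
# Per-node numbers are the marketing figures quoted above, nothing more.
large_node = {"iops": 100_000, "effective_tb": 44}
nodes = [large_node] * 4                     # four "large" storage nodes

pool_iops = sum(n["iops"] for n in nodes)            # 4 x 100,000 = 400,000 IOPS
pool_tb = sum(n["effective_tb"] for n in nodes)      # 4 x 44 = 176 TB effective
print(f"Pool: {pool_iops:,} IOPS, {pool_tb} TB effective capacity")

# Mixing node sizes works the same way: add a node, and the pool grows by that
# node's capacity and performance.
```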
But if it's nonlinear, if it has any level of sophistication - or maybe, heck, I didn't have enough budget to buy the big block, maybe that's my challenge - for me, I can sell you a compute node to give you the horsepower you need to run those extra virtual machines, because maybe you already have enough storage, and I can sell it to you for half the price of a combined node. The people who cut the checks like that.

Is every storage node the same, across the tiers of storage nodes? No - but it's all essentially one cluster. If I go back to the SolidFire world: if I had four SF2405s and two SF19210s, it just presented as one pool of storage, and all the performance of those nodes was there. The four SF2405s would give you 50,000 each, so you're looking at 200,000; the two SF19210s would give you 100,000 each; so you're looking at roughly a 400,000 IOPS pool and some amount of capacity pool. Those are the knobs I get to twist: how much performance do I want to consume, how much capacity do I want to consume. I don't have to design or architect for high availability, I don't have to design or architect for RAID, because it's a RAID-less architecture - shared-nothing, scale-out. If something like a drive fails, the rebuild time to bring it back to a highly available state is about seven minutes; if a node were to fail for whatever reason, I can have it highly available again in less than an hour. We designed for failure, because failure happens. I use the analogy of a bad Christmas: I used to have to carry a pager at Christmas time, and invariably I would get a call on Christmas morning because something broke, and I'd have to get out of my pajamas, drive in, and miss the disappointment on my children's faces because they did not like what I bought them for Christmas. Now I get to sit at home in my slippers, and before I've even gotten out of my pajamas and started the day, the data is already highly available again - and I get to see how disappointed my children are.

Can I buy the chassis empty? The chassis will come with either one compute node or one storage node; we're not going to provide just the empty chassis. You could buy it in such a way that you have a chassis with empty slots - you don't have to fill it with nodes - but we're going to ship it with at least one storage node or one compute node. I guess that leads toward my question of why you'd need more than one storage node in a chassis - is it just because you're starting small and you wanted that four at a minimum? No - if I go back and look at this, I don't need to have a storage node in every chassis; we're just showing an illustration that you can put them anywhere you want. I could put a chassis with all four storage nodes and a chassis with all four compute nodes; they don't have to intermix. So does every chassis have the disks? No, every chassis doesn't have disks - here, I'll go back - only storage nodes have disks; compute nodes get blanks. It'll look like disks, but there are none. It's kind of hard to see with the glare in here, but a compute node is essentially a diskless node that gets implanted: I slide it in and it has its connectivity for networking and storage. So the disks are in the nodes, they're not just there on the front? Right - the nodes themselves are stateless.
So if you sold me four compute nodes and then I decided... there aren't any disks in there, so no, actually it wouldn't work? So if you had four compute nodes and you wanted to take one of those nodes out and put a storage node in, you would basically put the storage sled in and add six disks. It's always six - there are always six drives associated with each storage node; a compute node has no drives. So a compute node gets blanks - is that a dedicated path to that node, or a SAS path? The back end is all top-of-rack switching for interconnectivity; we don't actually interconnect internally. OK, the drives in the front, though - how do they connect? It's the standard backplane that's inside there. What is that - a SAS backplane or a PCIe backplane? I do not know. OK, so like the 6100-model chassis... they don't, but ultimately everything inside this piece is redundant. When I add six drives, do I have to reboot the chassis? The chassis doesn't have any controller logic in it. OK, so I can add six drives, pull nodes, reconfigure nodes, and move things around, and never have to reboot the chassis itself? No - the chassis is basically just a vessel, essentially an empty box that's simply waiting to absorb resources. A new compute node comes in, I slide it in and fire it up; if I get a new storage node, I slide that in, put the disks in, and fire it up. The chassis itself is, like I said - yes, it's the Nintendo 64: it's waiting for the cartridge to go inside before it does something. I did not say Atari 2600, lifeblood of my youth - Pitfall, man, I never did finish that one.

When I look at some of the architectural choices the first-generation solutions made, they built really cool systems. If I go back to 2009, when most people started working on this stuff - think about how much technology has changed since then. If I were going to build a brand-new HCI solution from the ground up today and I didn't have any tech debt, I'd probably build it all as containers. That said, we can only work with what we have in our viewpoint, and a lot of the first-generation systems decided to take a controller virtual machine and virtualize all the data services and all of the control plane inside it. So I have a CVM, and it uses resources: it could use up to four or eight cores of CPU per node, it could use up to 120 gigs of RAM in some instances; depending on the system it could be hardware-assisted as well, with a specialty card - it just depends on what approach they took in architecting and designing those first solutions. But they all use overhead. And sometimes, if I turn on specific data services - because maybe my file system is pretty nascent and not quite ready for primetime, and I have trouble scaling past four or six nodes because I don't know how to manage metadata - then I'm bound by certain restrictions: I don't turn on global dedupe, I don't turn on global thin provisioning or compression, maybe I use erasure coding but only for a certain sub-segment. And that injects complexity.
For example, if I take a one-size-fits-all approach where I have to scale all my resources, storage and compute, every time I want to add another node, what am I going to do? In many cases I'm going to increase the cost of my licensing, whether I needed storage or compute. If I can't, at a fine-grained level and in a very simple way, scale my storage resources independently of my compute resources, what happens when I'm running any kind of core-based licensing? Look at what SQL Server did: SQL said, hey, we're going to charge you by the core. Great - I just wanted five terabytes of storage, but I had to put two CPUs in there because of this one-size-fits-all approach; I'm going to get out of balance really quickly in terms of my licensing. The hidden cost in a lot of HCI is the additional licensing cost for controller CPUs, whether you need them or not. If I'm building a VDI environment and every node I scale with has ten terabytes of storage in it, then realistically, for five thousand virtual machines doing VDI, I may only need 20 terabytes of storage, period - but I might need 20 nodes' worth of horsepower and compute. If I have to bring another five terabytes of storage with each one of those, I have a bunch of storage sitting there that I don't actively use; it's a wasted resource. And because the storage model is closed, I can't connect an Exchange farm or a SQL farm from outside with some other compute - I'm locked into their model of consumption. By having an open storage model, (a) my storage capacity and performance aren't stranded, (b) I can scale storage and compute independently of each other, so I don't get locked into a lockstep unit of measurement, and (c) I don't pay a licensing penalty for things I don't need - I don't have to keep building silos.

Getting to the other area we thought was very important for hyper-converged as we move toward the next generation of it: automate all the things. There's that little meme - it looks like a fourth-grader drew it - but I want to automate and streamline management, and I want to deploy rapidly, because part of the value of HCI is getting it up and running in an hour; if you don't do that, that's table stakes. I want a comprehensive set of APIs where everything is code I can leverage, because not all of us want to sit in front of a GUI for the rest of our lives.

In terms of that provisioning model: you plug a USB stick into the back of the box, it copies a breadcrumb file onto it, it has the first IP. I plug in my laptop and hit go: accept the EULA, give me my username and password, what are the IP addresses for storage, what are the IP addresses for virtual machine networking - and at that point I'm off to the races; 45 minutes later I have a functioning system. What are we actually doing under the covers? I take these nodes and run a discovery process; I target the ones I want to be storage and provision the Element OS operating system; I target the ones I want to be compute and put the ESXi hypervisor on them; I build a vCenter instance and inject my plugins to manage it; and then I spin up a management node - the one virtual machine we use, which is basically a data collector for alerting, management, and phone-home - and integrate with our Active IQ cloud-based management platform that does telemetry. I package all of that together. That's making the EpiPen for our customers who want to use HCI: taking all of that complexity, turning roughly four hundred different steps and inputs into about thirty, and making it simple to provision.
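For a sense of scale of that "thirty inputs" claim, here is a purely illustrative sketch of the kind of information the deployment engine asks for and the sequence it then runs. The field names and structure are hypothetical - this is not the actual NDE input schema - but the steps mirror the ones described above.

```python
# Purely illustrative sketch of the "~30 inputs" idea. Field names are
# hypothetical -- this is NOT the actual NDE input schema -- but the sequence
# mirrors the steps described above.
deployment_inputs = {
    "eula_accepted": True,
    "admin_user": "admin",
    "admin_password": "********",
    "mgmt_network":    {"subnet": "10.0.10.0/24", "gateway": "10.0.10.1", "vlan": 10},
    "storage_network": {"subnet": "10.0.20.0/24", "gateway": "10.0.20.1", "vlan": 20},
    "vmotion_network": {"subnet": "10.0.30.0/24", "gateway": "10.0.30.1", "vlan": 30},
    "vcenter": {"deploy_new": True, "fqdn": "vcenter.example.local"},
    "nodes": {"storage": 4, "compute": 2},
}

for step in (
    "discover nodes",
    "image storage nodes with Element OS",
    "image compute nodes with ESXi",
    "deploy vCenter and register plug-ins",
    "deploy the management-node (collector) VM",
    "register with Active IQ",
):
    print("->", step)   # the engine drives these steps from the inputs above
```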
I could do it all by hand if I wanted to, but that's not quite as easy as putting in some IP addresses, going to grab a cup of coffee, and coming back. Unless I've read the book The Joy of Menial Tasks and really thought it was a New York Times bestseller, I would rather not do that. The monitoring - that's one virtual machine? Yeah, it's one VM, a lightweight VM that does all of our collector work, and that's a carryover from the SolidFire days: I had a monitoring-node VM that gave me the ability to collect data locally and push it out. If I want to get crazy cool and do a bunch of stuff in Grafana, I can do that as well - go talk to my friend Aaron Patten and he'll show you all the cool new hotness he's done in there; we have some customers that do that - or I can leverage our cloud to do it, the Active IQ cloud, which is based on MongoDB, a super scalable platform. And then obviously there's direct hypervisor integration - why would I reinvent the wheel to manage VMware? That doesn't mean I can't; it doesn't mean that at some point down the road we won't build our own management platform and make this into a cloud operating system where things are containerized, packaged, and deployed across an Amazon or an Azure Stack instance or something else of that nature. The reality is, as we launch and go to market, 82% of customers run VMware and there are 500,000 deployments - it's a decent-sized market to go after. If I went after the KVM market, I'd be limiting myself. But I understand why diversity of hypervisor support is important, because maybe the market is skating away from VMware for certain types of customers. Going back to that open storage model: you can still run KVM against us, you can still run Hyper-V against us, you can run Docker, whatever it is. We're just not going to have the same level of integration in terms of that turnkey provisioning experience, but we can still support them; they will still run on our platform and we won't stop you from doing it. Like I said: more utility out of the platform you're using, instead of locking you into one consumption metric.

In terms of API integration: when we built SolidFire, the first three iterations, the first three software releases, we didn't have a GUI - it was just API calls. Our customers were service providers; they never see a GUI and didn't care what it looked like. They just wanted to know: what are my API calls to provision block storage as a service, how do I consume it, and how do I protect one workload from another? Then we started to get enterprise customers - people like myself, who have a hard time coding the words "hello world" into Notepad; I need a picture to work things, and I usually misspell the "hello" part too.
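As a flavor of those raw, GUI-less calls, here is a hedged sketch of the "protect one workload" piece: a single JSON-RPC request that takes a point-in-time snapshot of a volume. The method and parameter names (CreateSnapshot, volumeID, retention) follow the Element API's conventions but should be verified against the cluster's API reference; the endpoint, credentials, volume ID, and retention format are placeholders.

```python
# Hedged sketch of one of those raw, GUI-less calls: a point-in-time snapshot
# of a volume via a single JSON-RPC request. Method and parameter names follow
# the Element API's documented conventions; endpoint, credentials, volume ID,
# and the retention format are placeholders to verify against the API reference.
import requests

MVIP = "https://192.0.2.10/json-rpc/10.0"   # placeholder
AUTH = ("admin", "password")                # placeholder

snapshot = {
    "method": "CreateSnapshot",
    "params": {
        "volumeID": 42,                     # volume to protect (placeholder)
        "name": "sql-prod-01-nightly",
        "retention": "72:00:00",            # keep for 72 hours (HH:mm:ss, assumed)
    },
    "id": 1,
}

result = requests.post(MVIP, json=snapshot, auth=AUTH, verify=False).json()
print(result["result"])
```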
But I had integrations with toolsets - whether it was Puppet or Chef, Ansible, or the native Docker plug-in capability that's in there too. If I want to do things with PowerShell and be Josh Atwell for a day, I can do that, and I have the ability to integrate through software development kits for Perl, Python, and Java. I want to give you access to a lot of different tools, because not everybody does things the same way, and I want you to be able to automate all of those things as well.

When I look at some solutions, lots of people say they have an API - and they do. The reality, though, is that sometimes those APIs are bolt-ons, and doing something as simple as provisioning a volume, granting it some specific characteristics, and setting snapshots can be eight steps with dependency chains and 190 lines of code - because gosh, that's what I want to sit and do, and make sure I don't screw it up every time I redo it for one volume - or I can go into a UI, move a slider, and enter four inputs, or it's twelve lines of code against an API. When I focused specifically on SolidFire, as part of our Office of the CTO, we would go in and say: I don't necessarily want to talk to your storage people, because they won't always get this, but your developers will, your cloud architects will, the application designers and developers will. When they want to initiate storage inside whatever application or workflow they're working on, if they can copy and paste this and make it repeatable and very easy, they will do it, because then they don't have to bother anyone else. That's how we get programmatic in our nature: for everything we do inside the UI, if I hit a checkbox, it will expose the call and response; if I want to make that repeatable, I copy and paste it and use it over and over again. It's simple - it's one step and twelve lines, plus the little brackets.

The Data Fabric is obviously a big piece inside the NetApp world - how we integrate with the other pieces of the portfolio. We want to harness the power of the cloud, build the next-generation data center, modernize your IT, modernize your infrastructure. How does the HCI product do that? Essentially, it's Data Fabric ready. How many people are familiar with SnapMirror? We do SnapMirror into the product, so I can SnapMirror from the HCI product to a FAS. I can integrate with things like StorageGRID for object storage, I can integrate with OCI, our analytics piece, I can do replication and data protection - but I can also run a virtualized version of ONTAP on top of the solution. As part of the product, it will come with a certain amount of storage capacity that you can dedicate to running ONTAP Select as a virtual machine, providing very rich, robust file services with 25 years of history behind them. I'd challenge anyone to do a better implementation of NFS than NetApp - it's one place where we really have the market cornered in terms of robustness and solution integration - so why would I not offer that as part of the solution when it comes to hyper-converged infrastructure? If I look at the ecosystem today, file services are an afterthought at best. In terms of targeting workloads, we have an area of focus that we're going after.
In terms of targeting of workloads, we have an area of focus that we're going after. I don't want to play in the $25,000 remote office branch office space; it's just not a place I'm interested in, personally or as an organization. We saw that, based on the platform, based on what we've built, based on the value we bring to the market, and based on the fact that it's natively all-flash, we're not going to be able to play there and hit a price point where anybody makes any money. So let's go after areas where people are not playing as well, or need to, because when I look at the larger ecosystem of customers out there, I see a lot of enterprise customers that have been on traditional infrastructure for a long time, that are looking at specific areas of growth and would like to consume resources the way APIs provision them, but don't necessarily think it's robust enough. So we're going after internal private cloud, we're going after large-scale workload consolidation, I'm looking at web 2.0 infrastructure, I'm looking at next-generation data center workloads, Couchbase for example.

To give you an example: the traditional HCI strategy is to land and expand. I get the first workload, and great, I'm going to get a whole bunch of others, but then customers quickly realize that they can't protect those workloads from each other, and they back off, and it becomes another silo sitting in the corner for just VDI or whatever else. For us, because we can protect those workloads and we can segment them, customers can start to play Tetris with their infrastructure: I can continue to add different, disparate workloads one at a time until I get to a level where I'm comfortable, I can leverage different pieces of automation and integration into those workloads and solutions as well, and I'm not bound by a small silo of resources that I manage. I can start to really look at my infrastructure and the disparity of the workloads that I use, and actually run them in a larger, broader space with a unit of consumption that's much smaller than I'm normally accustomed to: five and a half terabytes of storage, 16 cores, 384 gigs of memory. I can start that small and I can go bigger. So really what we're looking at here in particular is going after the broader piece of the puzzle that we think HCI should be able to address; if you look at the marketing messaging, that's where people believe it needs to go, but it may not necessarily be there yet. So that's pretty much the story here. I just want to touch on the last three pieces: we're looking at guaranteed performance, flexibility and scale, and automated infrastructure, backed by integration with a portfolio of products that spans pretty much every use case when it comes to storage, on-premises, from the core to the edge to the cloud. So there you go, thank you.

Can we dive deeper into this a little bit here? Do you still have the slide with the picture of the back of these things? What's the connectivity between the various nodes?

It is bring your own switch, and 10/25 gig; the back here is SFP28 on every node, so you have a little flexibility. Storage nodes have two 10/25s, two 1 gigs for management that you don't necessarily have to use, and one out-of-band management port. The compute nodes have two ports for virtual machine traffic, vMotion, etcetera, two ports for storage traffic at 10/25, two for management, and one for out-of-band. Now, you don't have to use all of those; I could combine everything and have two cables come out of the back of this box, but I do have customers that want actual physical segmentation between virtual machine traffic and storage traffic, so the option is there.
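Coming back to the "protect workloads from each other" point above: per-volume QoS is what makes that segmentation work, and it can be changed after the fact. A hedged sketch of dialing QoS on existing volumes through the same JSON-RPC interface; the endpoint, credentials, volume IDs, and IOPS values are illustrative assumptions.

```python
# Sketch: capping a noisy workload and guaranteeing a floor for a critical one
# by setting per-volume QoS via ModifyVolume. All values are hypothetical.
import requests

ENDPOINT = "https://10.0.0.10/json-rpc/10.0"   # assumed cluster management VIP
AUTH = ("admin", "secret")

def set_volume_qos(volume_id, min_iops, max_iops, burst_iops):
    """Apply min/max/burst IOPS limits to one volume."""
    body = {
        "method": "ModifyVolume",
        "params": {
            "volumeID": volume_id,
            "qos": {"minIOPS": min_iops, "maxIOPS": max_iops, "burstIOPS": burst_iops},
        },
        "id": 1,
    }
    requests.post(ENDPOINT, json=body, auth=AUTH, verify=False).raise_for_status()

set_volume_qos(101, min_iops=500,  max_iops=2000,  burst_iops=3000)   # dev/test volume
set_volume_qos(102, min_iops=5000, max_iops=15000, burst_iops=20000)  # SQL data volume
```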
So I'm going to go to your top-of-rack, you know, 10 gig, 50 gig, 25 gig, 100 gig switch, and do my switching there.

In terms of our performance, the way we handle data is that everything coming in is broken into 4K chunks, hashed, compressed, deduped, etcetera, coalesced into a one-megabyte chunk, captured on an NVRAM card, replicated between one node and another, and committed to disk. All of that happens within sub one millisecond. In that traditional 4K 70/30 workload, that's where we are in that 40,000 or 50,000 IOPS per node discussion when it comes to performance. We can do more; we just tend to be conservative in terms of what we market.

And you're just kind of separating out the iSCSI traffic, is that on a VLAN or something?

Yes, it can be completely segmented out on its own VLAN, it can go entirely to a set of physical switches that you've dedicated to iSCSI traffic, or it can be converged networking.

And I missed it, did you have a speeds-and-feeds slide, like CPU options, RAM options, storage options?

Yes: sixteen to thirty-six cores, 384 to 768 gigs of RAM, various ports; it's all on the website. On the storage side we're looking at that five and a half to forty-four terabytes of effective capacity per half-width 1U node. There are different sized disks that go inside, and the disks basically denote the capacity inside the system. We will be supporting larger-sized disks at some point in the future, but then again you get a really heavy storage node inside a box and now you're looking at imbalance: if I had a hundred terabytes in a single storage node, then I have to go buy, what, a hundred compute nodes to take advantage of it.

Yes sir? Can you add regular SolidFire nodes in the same cluster?

I was just going to ask that. Not at 1.0, but quickly to follow, yes. Like I said, that slide I showed you had the three different platforms, and they're intermixable. So if I already have SolidFire in the field and I want to augment my HCI, I could add a SolidFire storage node.

It's the same operating system?

Yeah, they have to be on the same OS, obviously at the same level, but the reality is it's the same technology in a different form factor of consumption. So in theory you could start with just compute nodes because you're doing a compute refresh, and then when you get to a point where you need to expand the storage capabilities of your existing SolidFire, you could just start adding storage nodes. At that point I would have to check with the sales powers that be on whether that's something we would sell, but yes, you could obviously buy the compute nodes themselves. In how things are quoted there are certain realities around how we do that; I'm not sure we would sell just compute nodes just yet. Obviously after the fact, after the initial first purchase, I'm sure we would.

But the inverse would still work, in the sense that since this is in effect a SolidFire storage array, if I had existing compute clusters, going back to your licensing discussion about SQL, if I already had a separate SQL cluster, I could use this and share it out to there?

Yes. At the very first purchase we're probably going to require you to buy at least four storage nodes and two compute nodes; after that it's up to you how you want to consume those resources. If you want to connect a Superdome, if they still exist, go for it.
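For rough orientation, here is a back-of-the-envelope sizing sketch built only from the figures quoted above (5.5 to 44 TB effective capacity per storage node, roughly 50,000 IOPS per node, a minimum of four storage nodes). It is illustrative arithmetic, not an official sizing tool; real effective capacity depends on deduplication and compression rates for the workload.

```python
# Back-of-the-envelope sizing from the figures quoted in the talk.
# Not an official sizer; dedupe/compression effectiveness varies by workload.
EFFECTIVE_TB_PER_NODE = (5.5, 44.0)   # smallest and largest storage node quoted
IOPS_PER_NODE = 50_000                # conservative marketed figure

def cluster_envelope(storage_nodes):
    """Return (min TB, max TB, approx IOPS) for a given storage node count."""
    lo = storage_nodes * EFFECTIVE_TB_PER_NODE[0]
    hi = storage_nodes * EFFECTIVE_TB_PER_NODE[1]
    return lo, hi, storage_nodes * IOPS_PER_NODE

for n in (4, 8, 16):                  # 4 is the minimum viable storage node count
    lo, hi, iops = cluster_envelope(n)
    print(f"{n:>2} storage nodes: {lo:.0f}-{hi:.0f} TB effective, ~{iops:,} IOPS")
```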
And are you recommending at least a minimum of two chassis?

The minimum two chassis is because we need six nodes minimum: I need four storage nodes and two compute nodes to be the minimum viable product. Realistically, how many people build a two-node ESX cluster? But you know, I see it more often than you might think, so it's there. I mean, if I didn't want high availability and I didn't care what happened to the system, yeah, I could build a two-node SolidFire and a two-node compute cluster, but then if something went bad you'd lose all your data. Maybe there are some people who like to use these as skeet targets, I don't know.

Yeah, but if you're replicating it to another site, then this just kind of becomes a ROBO node.

It could be, right, or I can actually sell you something else that's a little more suitable for that particular market, at a price point those customers like, which is ONTAP Select on bare metal. Portfolio company; not one size fits all.

When you say bring your own switches, are you planning down the road to integrate with, say, spine-leaf infrastructure?

There will be a point, yeah, there will be a point where we get a little more sophisticated on networking, where we start to invest more heavily in whether it's Cisco or VMware in terms of the networking that they're doing, or even things like OpenContrail, etcetera. It really depends on the way that market goes and who the winners and losers are.

Can IOPS performance be over-provisioned?

Yes, and in fact in many instances, like the last release we did, we actually got something like a hundred and thirty percent increase in single-volume performance over the previous release simply through a code update. So I tend to be very conservative in terms of what I can provision, and I'd say there aren't hard and fast rules: hey, I can still over-provision memory in a VMware instance, I can over-provision CPU to my heart's desire; it's just a matter of when the metal meets the road.
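On that over-provisioning question, the Element API exposes cluster capacity counters that let you watch how far ahead of the physical metal you are. A hedged sketch using GetClusterCapacity; the endpoint and credentials are placeholders, and the result field names should be checked against the Element API version in use.

```python
# Sketch: checking how over-provisioned the cluster is via GetClusterCapacity.
# Endpoint/credentials are placeholders; verify field names for your API version.
import requests

ENDPOINT = "https://10.0.0.10/json-rpc/10.0"
AUTH = ("admin", "secret")

resp = requests.post(
    ENDPOINT,
    json={"method": "GetClusterCapacity", "params": {}, "id": 1},
    auth=AUTH,
    verify=False,
)
resp.raise_for_status()
cap = resp.json()["result"]["clusterCapacity"]

provisioned = cap["provisionedSpace"]   # bytes promised to volumes
used = cap["usedSpace"]                 # bytes actually consumed on disk
ratio = provisioned / max(used, 1)
print(f"provisioned/used ratio: {ratio:.1f}x")
```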
Info
Channel: Tech Field Day
Views: 7,669
Rating: 4.609756 out of 5
Keywords: Tech Field Day, TFD, Tech Field Day Extra, TFDx, VMworld, VMWU17, VMworld US 2017, NetApp, Gabriel Chapman, HCI, hyperconverged, cloud, on-premises infrasturcture, SolidFire
Id: uXI1GO2jKug
Length: 62min 44sec (3764 seconds)
Published: Thu Aug 31 2017