DC_THURS on Operational Analytics w/ Boris Jabes (Census)

Welcome, everyone, to another episode of DC Thursday. I'm Pete Soderling, the founder of Data Council and the Data Community Fund, and we're excited that you join us almost every Thursday as we chat with a leader of a company, open source project, or tool in the data ecosystem. Today we're very fortunate to have Boris Jabes, who is from Census. And Boris, I think I just said your name wrong even though we discussed it. So it's Jabes?

Boris Jabes.

We aim to please here at DC Thursday, even if it means repeating things on the air and showing a healthy dose of humility; that's in line with our values. So, Boris, thank you for joining us, and for your patience with the name. We're really happy that you're here today to talk to us.

I'm very happy to be here, Pete. It's great to do this.

So, Boris, you have a pretty interesting background. You started a company before, Meldium, which you sold to LogMeIn, and before that you were a senior PM at Microsoft. I want to keep those things in context, because usually, as we traverse your experience and your career, we find that these things stack up to point you to where you are today. But I wanted to start off and ask you: you went through Y Combinator, and I was curious to hear a little bit more about that experience. It was a company before Census, I believe Meldium, that you took through YC. Is that right?

That's right.

Awesome. Tell us a little bit about that experience and what it was like as an engineer going through a startup accelerator like that.

Yeah, it almost feels like a different time in the Valley. This was in 2013. Y Combinator, for people who know, is now a much larger organization. I think there are roughly 400 companies that go through YC a year now, maybe 500, and we were a
batch of 40 companies going through YC at the time. Overall, it's a really amazing experience, is my short answer, for a couple of reasons. One is that, as with any great organization, you're around a great cohort of people. So it's not just that Paul Graham was an amazing mentor for startups and could really help you think through your ideas and how to build something big; you're also around unbelievably ambitious founders. I think there are two things I got from YC. The first, which is in the name, Y Combinator, is that it's not going to make your company. It's not going to incubate it, to use that term. It really is about accelerating you on the path that you're on, and they do that mostly by helping you focus on the things that matter, which, the short answer is, means focusing on growth and on the derivative rather than everything else. That's a very useful thing when you're a first-time founder, because it's easy either to not know what to do at all, or to try to adopt all the things you were doing as an employee of a company, where you know how to build software but you don't realize what changes you have to make when you're out in the open building. You have to make a lot more sacrifices and focus on a lot fewer things, because you don't have the bandwidth to do everything.

The other thing I got from Y Combinator, which I think is underappreciated and maybe doesn't get talked about as much, is that it's really an ambition boost. Paul Graham especially was really good at taking any idea that people had and seeing it in a bigger way than any of the founders did. So, you know, Meldium was an identity management product, right? It was a single sign-on product. For people who know Okta, it was kind of like that, but much more focused on
passwords, and on helping teams and companies use SaaS products seamlessly, so you didn't have to deal with password management as a team. Nowadays people could think of it as something like a cross between 1Password and Okta; think of the marriage of those two things. It was a little bit ahead of its time. We had some ambitions: we thought this was a great idea and needed to exist in the world. But he took it to, like, "Oh, you will have every password in the universe, right? This is amazing." So he helps you think of the maximal version of your product, your company, and that's very empowering, because you have to be able to struggle through really tough times as you build your company. So I think those are probably the things I remember the most about YC. And then, like I said, your cohort is just excellent. You make amazing friends who are all just brilliant and ambitious.

And you've spoken before, I think, about how your foray into the auth and identity world, that part of infrastructure, somehow prepared you for what you call the flip side of the coin, thinking about the data world. What's the relationship between the two, and how did you get from one to the other?

Yeah, I think a lot of founders start with some pain they've experienced themselves, right? So the backstory here, the best way to think about it, is that when we started our first company, we wanted to use all the SaaS products that were coming out back in 2012.
You know, whether that's Google Apps, GitHub; nowadays there are thousands, right? But it was new. You really wanted to be able to use these great apps, and there was this chaos that came out of that, because you had to manage employee passwords in every single system. It was really frustrating. So we built a seamless tool to manage team logins and have all the control and comfort of not having to worry about access management. And as a user, the upside is I could use any piece of software I wanted to; I was not limited by the set of software that my IT team gave me.

The analogy I give for this is that when SaaS emerged, there were two pieces of the computer operating system that were forgotten. When you have a computer and you install apps, going back to the 80s and 90s, you would log into your computer and not into Photoshop, and you would have a file system on your computer and not in Photoshop. Those two things were missing in SaaS. So Meldium was an attempt to create a central, federated authentication system so that every SaaS app could avoid having to build that, and I would say Okta is evidence of how that grew into a massive market and a massive business. And Census, when I say the flip side of the coin, is trying to do the same thing for the rest of your data. Instead of employee identity, it's your customer data, which in a way is yours as a company, but you want it federated to every possible SaaS app that you're using. In order to use the maximum number of SaaS apps in the world, you want to have a great federation system for that. So that's how Census was born, as the flip side of that coin, and a lot of what I learned building Meldium applies here, even though authentication and IT are very different from data and data tooling.

So there's this notion of federation
plus consumption by third-party systems that seems to be a guiding principle for you.

Yeah, that was very much at the root of the idea.

Well, I think it's going to be an interesting conversation, because you're playing in what many consider to be a new category of the data ecosystem. Essentially, what we're talking about today is this area to the right of the data warehouse. When I met you and Sean, your co-founder, early on several years ago, and Sean was telling me about this tool that was piping data from a database or a data warehouse back into business tools for salespeople, it wasn't even obvious that you were a piece of data infrastructure per se, or that you were thinking of it like that. I want to get your thoughts on that, because it seems like it wasn't obvious even to you and the team that this could become a data infrastructure company. How were you thinking of it at the time?

That's very true, and you have the unique advantage of having met Census very early in its life. Census started in 2018, and the core first version of the product is still the product, which was this bridge between the data warehouse and business applications, Salesforce being the very first one, unsurprisingly. You're absolutely right that when we built that and were out looking for our first couple of customers, we knew this was useful, because we had felt this pain. In particular, after we sold Meldium to this larger public company called LogMeIn, the sales and marketing teams were completely separated and disjointed from our team, and we had all this amazing product data. So we knew there was a disconnect there; we needed a bridge built. The question is
who's going to drive that, right? Who's going to install it and manage it? Even though it's physically connected to the data warehouse, it wasn't obvious back in 2018 that data organizations would want to touch this, because historically this was not in their purview. Historically, I think a data team owns the data infrastructure, but not these business application scenarios. So some of the early users of Census were on the edge of what you would call the data team; they were basically business operations teams. If you were to zoom in, though, you would realize these are effectively, at the very least, analysts. They just don't use the data title. And the more users and fans we got back in 2018 and 2019, the more it became plausible, then possible, then obvious that data teams were the best custodians of this, because it sits on the data infrastructure and it actually magnifies the output of a data team, which is a very attractive thing.

Well, on that note, why don't you tell us about what Census does, just so that the audience has the benefit of understanding where you're coming from? I think there's a lot to unpack in what you said, and I don't want our shared knowledge or previous conversations to leave everyone else in the dust. I think what you mentioned was something to the effect that teams previously thought that data stopped at the dashboard, right? The dashboard was the last place that data was consumed. It sounds to me like you're talking about something that might go above and beyond that, so tell us more.

Yeah, so let's explain what Census the product is for the people who don't have the context you do. Census is an automated data pipeline product, right? It connects
to a data warehouse and syndicates data out. It publishes data from a data warehouse into all the interesting business applications that companies use, whether that's sales tools like Salesforce and HubSpot, support tools like Zendesk and Intercom, or finance tools like NetSuite. So it's about taking data, models, and insights that live in a data warehouse and pushing them out seamlessly on a regular cadence, so that the state that lives in your warehouse can be syndicated, federated out to applications. That's what Census does. It was born, like I said, as this bridge: hey, I want to get product data, product analytics especially, into my sales tool so that my sales team can target the right leads, or better understand the users, or drive upsells, etc. That was one of the very earliest scenarios.

Implied in that are the use cases that Census enables, which were completely new at the time. Like I said, in 2018 I don't think almost any data team was doing anything like this. This was going far beyond a dashboard into action. Sales teams at first, like I said, but eventually marketing, finance, etc., are powered by Census. And what I mean by action is that you're not showing them information in a dashboard. You're not even embedding analytics, which was well-trodden ground, I think, for Tableau and tools like Looker already back then. We were saying: what if you could generate information in a third-party system, and potentially change the behavior of the end user?

I'll give you the simplest of examples. A lot of our customers are really product-led companies. They have lots of freemium users; they have lots of users, period. It's a lot of work to sift through that to figure out who a salesperson should actually engage with. We understand that, and there's a lot of amazing analytics to understand who high-value users are. You
can put that on a dashboard, but that doesn't change the fact that a salesperson has to wake up in the morning and decide: who am I calling today? Who am I talking to? Our customers use Census to take the scoring and insights they have on the users that are most likely to convert and send that directly into the CRM as leads or objects or contacts or accounts, such that the salesperson will wake up and their task list will already be generated, and they can just start making the calls. That's a very different relationship between a data team and their stakeholders than a dashboard, because now it's actually my work list; it's just generated from this.

Yeah, that's fascinating. So in that use case, does Census have somewhat of a brain to be able to do the lead scoring as well? To what extent does Census have business logic, versus the business logic remaining in the end tool, the CRM in this example?

Yeah, so I would say logic can live on either end of, let's call it, the bridge, right? Census is this bridge between the warehouse and, let's say, your sales tool. Some folks choose to bring well-modeled data into the CRM and let the CRM decide. So you might say, I want to bring in really precise information about how many users are in this organization, in this company. Take Figma, right? They have millions and millions of users and all these organizations. You could say, hey, let's sync exactly how many active users there are in every account, and then you can have a rule in Salesforce that says if an account has more than X active users, it should be routed to this sales team, where there's a round-robin assignment, and they will take the work. So there are definitely companies that do it that way. And then there are companies where, I'll just say that the programming model for this
in Salesforce can be frustrating for some people, and SQL and the warehouse are an amazing execution environment. So a lot of our customers actually do it the other way, where they'll create those rules as just another layer in their modeling stack that can live in Census. Anything your warehouse can execute, we can execute through Census, and they'll create those rules there. By rules I just mean yet another SQL transform, and then they just send that across the wire. So some people treat their destination apps as slightly dumber terminals over the data that is in the warehouse, and some of them maintain a lot of logic in the end system but still want the correct source data to come in. And the only true arbiter of correct information is the data team and the data warehouse.

And going back to the early days, what were some of the validation points that you discovered when you spoke to early customers? How did you know that you were onto something back in 2018, when it wasn't as clear, perhaps, as it is today?

Yeah, so like I said, a lot of startups, I think, are born this way. We built our first version because we had experienced this pain, and we thought: OK, what kind of companies would resemble Meldium? Meldium was product-led, freemium, and had suddenly been assigned external sales and marketing. So I was like, who might look like that? And we're very much builders. Some people will go out with a PowerPoint and walk around town and see who would bite, and I really respect that. We just spent a month building our first version of this thing, and we took that around to some friends, where we said, well, you theoretically should look like this. But it was much more of an exploration. I think the first
aha moment was in 2018. Figma was just about to go on a tear, and they had just hired their initial sales team. They had this massive self-service adoption already, but now they wanted to expand. One of the things I've noticed in a lot of companies, and this was, like I said, the first aha moment, is that no one wants to end up in what I call the Evernote trap, where you have a great freemium, self-service product but you never figure out scaling revenue all the way up to enterprise. Figma knew this, so around their Series B they were starting to invest in enterprise sales, but they wanted to create that funnel that goes from free to pro to enterprise. When they saw what we had built, they said, this is correct, this is the mindset that we have, because Salesforce is at the end of this, and the data warehouse they were using, Redshift, was where all their interesting product data lived. They knew they wanted this to be powered off of that flow, because most people were self-service, so the last thing you want to do is call people who have just paid with their credit card. Why would you want to do that? So that was the first confirmation that we were doing the right thing. But of course, one customer is, you know, enough to tide you over for a bit. The first 10 customers that we got were very much in this vein, and they increasingly confirmed that, hey, there's a need here. And then I started noticing changes in behavior that really told me things were going to get interesting.

So I just wanted to clarify one thing, because it sounded like you wanted to sell to other customers that had the mentality that Meldium had, meaning product-led, bottoms-up, and that was just because that was your view of the world as a founder. You believed in
that go-to-market motion, and you wanted to build some tool for other companies who believed in a similar bottoms-up motion. Is that fair?

Yeah, I might even go slightly stronger. Look, let's rewind. Every company since the dawn of companies has had customers and some form of customer data, whether that's a physical Rolodex in a salesperson's hand, or cue cards, or a database. Salesforce has been the standard for the last 10-plus years. Not that it's 100% of the market, but a huge percentage of companies have their customer data there, and the reason they do is that it's a great tool for tracking the work of salespeople. But my premise, which again was informed by Meldium, is that if software is the thing you're selling and growth is driven from the software, i.e., bottom-up, product-led, which seems obvious to people in the Valley, I think, but not necessarily, then increasingly the locus of interesting information is not going to be your sales team; it's going to be the product. So to me it was: who would already feel that way, and could agree and confirm that they would want the warehouse as the central brain rather than the CRM as the central brain? That's where working with these kinds of companies became the right move. But not everyone was on that; people are at different stages of that journey, obviously.

Interesting. And how did your early experiences with these customers uncover reactions that you've mentioned previously felt akin to folks stepping into the DevOps movement? You felt like there was something really foundational about this model that was a sea change with customers. Talk to us about that awareness.

Yeah, and you brought this up at the beginning, you know,
when we started, our first couple of users were not even what you would traditionally call a data team; it involved a lot more of these business operations teams. So what led us to where we are today? I think there are a couple of things, and this is where you can fit in the final puzzle piece of my professional life, because I spent the first half of it in developer tools. I worked on Visual Studio at Microsoft, and I love developer tools, so my mind was always attuned to that as well. What we saw was this. Our first couple of users started using Census as this ETL product, and they actually started referring to it as reverse ETL way back in, I think, 2018 or 2019. It's cute terminology, because people are so used to tools like Fivetran that seamlessly bring your data into the warehouse, and Census was seamlessly taking the data out of the warehouse. So they were like, hey, let's call that reverse ETL.

At first, it's augmenting information that is in their destination tools, let's say a Salesforce or a Zendesk. But as they started depending on it more and more, and pushing more and more attributes, metrics, data objects, etc., into these tools, the value goes up. They magnified the value of what was in their data warehouse, but they also took on more and more responsibility. That was the signal that you have to manage this carefully. Historically, when you think of the way people do app integration, people are somewhat haphazard about it. They just connect a thing to a thing, and then connect another thing to another thing, and it's great: hey, when someone fills out our web form here, put them into Mailchimp. It's very easy, very practical. But you start to lose track of things like: what is the state of the world, and
is the data correct, and how do we recover from errors, and all these things. For our customers, these were such crucial systems. The entire sales process of Figma was being powered by Census, literally the entirety of it, so this is becoming pretty critical. Implicitly, you have to start thinking about your data the way people who make products think about products, i.e., versioning it, testing it, deploying it, etc. This was new to them. In fact, it still is new to them, because they're not engineers by, let's call it, upbringing. We've taught this for decades now in the software engineering world, first when we were making software on disks, and then, as we made web software, we developed newer processes and newer tools in the 2000s and 2010s, you know, DevOps. That goes all the way from versioning your code, which is obvious and has been in the industry for decades, to having a pager and managing a pager rotation. None of this really existed in the data world or in the business operations world. So the larger mission for Census is not just to help people connect the data out and create this federation of data out of your data warehouse, but to help you do that with confidence. The way to do that is to adopt what I call DevOps practices and adapt them to data; folks are now calling that the DataOps ecosystem.

So, just to make sure I understand: what you're saying is that the stakes are much higher, because we're getting data in an automated fashion from the data warehouse, which used to be the end of the ETL and isn't anymore. We're getting it back into Salesforce and using it to automatically drive lead-scoring lists, or automated outreach, or salesperson outreach, or whatever. And because the stakes are so
high on this data that's being reimported, just for example, back into Salesforce, the customer has to trust the data. There have to be quality assurances, there have to be SLAs, and you have to know if the data gets stale. So there's this whole other end of the orchestration that's required in order for the business teams to really trust the automated data that's flowing back into their business systems.

Exactly. You've been watching the data world for a long time too, and I would say we catalyzed something that was on its way to happening. I think Census is a catalyst for investing in this kind of education, these kinds of tools, and these kinds of processes. Nowadays everyone's talking about how you test your data. One of the things that's great about dbt, for example, is that you can version your data models, but it's nice that you can also test them right there. Three years ago, I think that was kind of neat but not necessary, because if your dashboard was wrong, what's the worst that happens? You might get an angry meeting at the end of the quarter, and then you fix the dashboard with the revenue, and OK, we're fine. But for our customers, I'll give you a great example. One of our earliest customers was Clearbit; this is way, way back. Their first integration, although they integrate with a lot of different things, was to their email marketing tool. Before Census, they had lots of different systems writing data into the email marketing tool, and there were lots of duplicates. I'm talking, like, they would send an email and sometimes the person would get it seven times. In the warehouse, the data analyst who used Census was able to deduplicate, build the correct, let's call it, master list of
users with the correct set of attributes, and basically overwrote the email marketing tool completely from scratch and seamlessly switched over to the correct data set. From that point forward, the emails were great. However, and, you know, what's the line from Spider-Man, with great power comes great responsibility: what was net new to her, which had never been the case for her before, is that if you make a change to the data model now, it's like you're playing with live ammo, because automated emails are going out. If you screw up the data model, just accidentally, millions of people might get an email within hours. That creates fear. I felt really bad, in a way, for our user, but this was actually a sign that we were doing something very important, because empowerment means you have more responsibility, which means you are going to be more afraid to screw up. Then your tools have to help you and give you confidence. So at that point we started investing heavily in validating your data before it goes out and catching all sorts of failures before they could happen. People at first will see Census as a really easy way to push a metric from your data warehouse into, take your pick, Zendesk or Salesforce or whatever, but the real value is that it helps you do that while giving you the confidence that you're not going to screw up. That's what you realize after using it for a little bit. And that also means you're working on something important, because I ultimately think products are about empowering people, and you know you've empowered people when they're more afraid.

Yeah, if you realize your tool has that much leverage and power because of where it's sitting in the value chain, that's definitely indicative of something
good.

Yeah.

Well, I think we've kind of danced around it, but I wanted to give you a chance to sound off on your views on operational analytics. We've basically talked all around this concept, but that's a term that you've preferred to use in the past, I know. If there's a banner that goes over this whole conversation so far, it's really the fact that businesses need to start to think about data in terms of the operational power it can give to their business, versus it just being a function that stops at a BI tool in a dashboard somewhere.

Yeah, I think it's a really nice name, because we work with analysts, we work with analytics teams, we work with data teams. And the whole point of Census, if you haven't already bought into it, is that everything you do in a dashboard, all the data modeling you do, all the interesting insight you generate, whether that's through ML, data science, you name it, would be more powerful if it was pushed into systems where people are taking action, i.e., operationalized: put into the way the business is run rather than the way the business is informed, which is what a dashboard does. Don't get me wrong, BI was a significant step forward in the last 20 years; now a lot of companies have great data when they make strategic decisions. But to me that's not really data-driven, right? That's not data-powered; that's just informed by data, which is great. The Census difference is to operationalize those analytics, the output of your data team, rather than even just the raw data. That's why we use this term operational analytics, even though it's very long. I guess we should probably get an 18 in there somewhere. But I think it really summarizes the shift. And the interesting thing is it's not just that you should think about how to use your data, like
the fact that you should push it out to more than a dashboard, more than a BI tool, which is the first hump. If we can help you realize that there's more you can do with your analytics, that's great. And then, like I said, that becomes a catalyst for investing in your data team and in the core, which I think is actually the real job to be done for data teams: building a foundation so that the entire company can depend on the data organization. That's new. As a BI team, like you said, you're a sink, right? You're the final stage, not just as a piece of software, from the data warehouse into the BI tool, but as a team as well. You're like a leaf node. With Census and this broader operational analytics approach, you're no longer that. Now you have dependents at a software level, so you have to reorganize your team, and you also have to reorganize your logic, your data.

Well, let's drill into that a little bit more, because I know there are some significant challenges in philosophy, values, and just mentality if a data team were to expand its purview and start to be responsible for this kind of data that flows back into business systems. How do you think the culture and mentality of a team have to change? How should they think of themselves in this new era? What does this mean for a data team?

Yeah, I think this is a deep vein, and I think we're in the midst of a really fun shift and transformation here anyway. So I've got very strong opinions that will probably not all pan out to be true, but I think they're worth putting into the world. I think you have to make two kinds of shifts as you expand the footprint of your data team. The first is actually technical. If all you build is reports, I think there's a tendency
to just take data that's just, you know, hey, you'll drop some data into the warehouse, you'll run a report and generate the visual. And we already know that if you're going to build one report, you can just do that. If you're going to build 10, maybe you build one tiny abstraction. If you're going to build a hundred or a thousand, obviously you have to build some reusable models. That's the first shift, and I think we're still at the beginning of that if you really look broadly at the whole world. I think the nerds have already adopted best practices around here; dbt has really exploded as a great tool for creating central models that you can version. But it's really the very infancy of that, and I think there's going to be a lot of onus on the technical side, on the logical, code side, of what those models should look like such that they can be reused by reports but also reused by these destination SaaS applications. And if I were to reframe for you what Census was trying to accomplish, and still is always trying to accomplish, it is to turn your data warehouse, which you own, into the central brain and the central data store for you and your company, and make every third-party SaaS app like a cache on that data. So you've got to come up with the right data models for that, and we have a lot of patterns that we help customers build here so that they can do this well. So that's one transformation; it's almost at a pure technical level.

The technical side, yeah.

And then there's the organizational side, and I think there are changes within and changes outside of your data team. Within, we're already starting to see this, but a lot of people might say, "Well, the data team is just the data science team," and I think that's too small-minded. You really have at least three types of people, or three layers, within your
data organization. You've got the core data engineering team, which I think will always exist, owning the low-level infrastructure, setting up the Snowflake, building certain kinds of pipelines. I think over time there's going to be a lot more purchasing rather than building in that environment, but there is always going to be some amount of software to be built. Then you have this analytics engineering layer of a team that is focused on SQL, focused on the data models, but is about building reusable things. And then you've got what I'll call the tail of it, which is the data scientists and analysts who are finding insights using that data, doing interesting things. I think not every company has decided that the data team is all of this combined, and the reason I know this is that at least one set of those people are for sure engineers, and it is unusual for an engineer to report to a finance team, but most BI teams report to a finance team. So there's a within-organization change: you should create one leader around all of those pieces, even if the data analysts are going to be loaned out to business teams, which is totally reasonable. But I think you should have one center of excellence for this. And then from outside, as the CEO, you want to think about where this sits so that it can provide maximal value to the business. The whole point of operational analytics is that you don't just serve the finance team and the Wall Street side with dashboards and revenue reports. You serve the sales team, the support team, the finance team, the product team, the marketing team. There are so many teams that you serve through these centralized data models. So you actually have to reformat where the organization sits and who it formally has as
stakeholders. And in my mind, I actually think it should be a foundational piece of the company. I actually think the data team should be like a platform team for the whole company. I don't know anyone who does that yet, just to be clear. And I think for that to happen you're going to need something that is akin to a chief data officer, or at least something that reports pretty closely to the CTO, almost certainly. And you're going to need to shift from IT thinking, which I think is where data has historically been, which is reactive to requests from various stakeholders, towards product thinking. And that's, I think, a really big shift.

Yeah, for sure. It seems like that could be a conversation, IT thinking versus product thinking as an overlay on a data team. I think that's a new concept that could definitely be explored in a much deeper conversation, because I think there's a lot there.

Yeah. I think for most people who listen, who are in data or have been around data, it's very classic to get questions like, "Hey, can you fix this report? Can you build me a report?" It's probably the most common thing: "Can you answer this question for me?" And we talk a lot about how to make that self-service, which helps a little bit, but of course that also moves the burden of education to the edge nodes. But when I say product thinking, I actually mean that you don't just acquiesce to every request. You take that all in and decide on a cadence; whether that's weekly, every other week, or every month doesn't matter. You create a cadence of shipping, and you take all that feedback in the same way that a product team like the Census team gets customer requests 24/7. And the law of software is you cannot ship everything, (a) because you don't have bandwidth and (b) because it doesn't always make sense, and you want a great product experience. So the data team has to
kind of take that on. And I actually think product managers should exist in every data organization, for example, whose job it is to manage the product that is the data: communicating with stakeholders, reducing it down to what we're going to ship, creating a cadence around that, and then sharing that with the whole company.

Yeah, it's like the switch from the project manager, who basically manages a project plan, to a product manager, who essentially manages a product roadmap. So it's a much more strategic way of thinking about your data team's deliverables against product management guidelines, versus just the project management that is typical in the IT-based world, as you mentioned.

Yeah, I think that's really spot-on. When you're doing a "hey, we've got to roll out new access points for the Wi-Fi," that's very much a project. But here, you're right, it's a roadmap of what's going to change in the tools and the products. And by the way, it's not just that. I think a really good rule of thumb for people is to go back to our earlier conversation about DevOps and DataOps: we devised those processes for a reason, because they are really well optimized for shipping software, shipping product, at high velocity with high confidence. You want to have versioning of your code. Why? So that you can rewind, so that you can collaborate, so that you can branch. These are all the reasons we version our code, and that applies to high-velocity data shipping as well. You want to be able to have staging environments so that you can test stuff out as you go out to production; you want to do that with data as well. You want to be able to have automated testing before data or code is deployed, and again, there are certain tools emerging to do that,
and then you want to deploy, the same way you go to production in a software application. And that's actually a great way to think about Census: Census is a deployment tool, because it deploys your data into all of the third-party systems that depend on it. So we're there to have your back and prevent you from deploying bad data, which would be disastrous. And I think all of this is just brand new processes that people are putting in place.

Yeah. So as you mentioned before, your vision of the world is to see the data warehouse be the central brain of the organization, and all of the other third-party systems essentially have caches of that data. Is that the best way you can think about explaining your biggest vision for this area?

I think we landed on a really good nerdy definition of it, so I like it. But yeah, I think the data warehouse is one of the greatest pieces of technology that's happened in the last, let's say, five to ten years: infinite scale of storage and compute. It's just a much better platform for storing your customer data and the core data that you care about as a company. I think it's important for companies to own their data. I think you should think about that as something that you manage and build out as a company; I think it's your core value. And I think you should be able to use any SaaS product to drive any business operation. Some of our customers have 300 SaaS applications that they use. So it's not just that I want those applications to be a cache; it's that I want to empower companies to be able to build any operation, any automation, using any SaaS product, and the way to do that is to have a product like Census there to federate that data out for you. And then for me, ultimately, products are about people and about making them, like I said,
the best versions of themselves. And so the goal: if we can turn data practitioners into some of the most powerful people in a company, some of the most leveraged people in a company, I will consider that a job well done. And we're nowhere near that today.

Yeah, got it. Well, as we wrap up, I wanted to ask you: what are your insights on this broader space? What are you excited about happening in the near future in this space? And in relation to that, what are some of the key insights that you've picked up as you've come this far and been involved in the curation of this new category of tooling?

Yeah, so I think there's a lot that's exciting, and I'll give you one example. I'm excited by anything that moves the quality and capabilities of the core, which is, let's call it, the warehouse and database technology itself. The more you use Census, the more you operationalize your analytics, the more you realize that you're magnifying the value of your data. It's great. But bit by bit you will realize that there are scenarios in which you want lower and lower latency, and to me this is a really exciting development in the data warehouse and, let's call it, the grand merger of data lakes, data warehouses, and streaming, all these various pieces that are coming together. Incremental materialized views especially are my favorite thing right now; they are getting our customers closer to being able to deliver fresh data, correct data, at lower latency. And the more you can do that, the more you can turn on scenarios that are powering the business. So take a classic thing that a lot of our customers are doing. Companies like Notion and Figma are doing this every day in Census: they have scoring systems that run through Census, right?
So, SQL scoring systems that determine who the most active users are, how large of a team they are, all these kinds of signals that they aggregate into a score that says, "You are worthy, we should talk to you for a sales conversation." By the way, some of our customers use that in a different way, which is, "I want to provide you different service on the support side," so they'll sync that data with a support tool to be able to prioritize tickets. It's a really cool way of using the same piece of information. So you have this scoring system, and then you push that, you create leads in your CRM, and that generates work and a call. Well, sometimes the value of making that phone call within 24 hours versus within one hour versus within 15 minutes could be real. And so the more our data technology can get us to running the entirety of the data pipeline, including the transformation layer, the ingestion layer, and the testing layer, all in such a way that it can get you the answer and push it out in less time, that is super empowering. So I think that's probably one of my favorite areas. Companies like Materialize, I think, are really cool for getting people closer and closer to that world view of, hey, you just build models, they're incrementally materialized, it happens faster and faster. Someday we'll get to sub-second. We're not there today, but we'll get there, and I think that's really exciting.

Yeah. Well, we haven't even talked about how Census might play in the streaming data world, but obviously I think that could be very interesting going forward. Streaming obviously has to live up to its end of the bargain and become something that the world can really operationalize and take hold of without building infinitely complex systems, as streaming systems sometimes are.

Totally, totally. Well, I'll give you, actually,
I'll give you, yeah, we can't get into that fully, but you asked for an insight, something I've seen. One of the interesting pitfalls or mistakes I've seen people make is that, even way outside of core engineering, people are really addicted to the idea of events and activity streams. Marketers love that too. And what I found is people over-index on the value, the cool things you get from event processing, and not enough on the downsides. What I mean by that is it's absolutely necessary in high-scale systems. On the engineering side, of course you're going to need at some point something like a stream processing system, like a Kafka or whatever, to manage messaging at scale and event processing at scale. But believe it or not, you see the same behavior, without the same words, on sales and marketing teams, where they say, "Hey, every time someone hits the website, I just want the event": they came to the website, or they swiped a credit card, or they did a thing. You can sync events, you can sync state, but at Census I'm constantly trying to push people to realize that with the right tooling, which we've built, syncing state is in many cases better for you than syncing the events. And I have found that for business teams, because they're not versed in what you might call the computer science of event-based thinking versus state-based thinking, it's a lesson that takes them a while to learn. The reason is that at first it seems great: "Hey, every time a user is added to an org, I just want the event 'user added to org.'" But you may not know how to aggregate the data in your marketing tool to realize that people get deleted from orgs, and you actually have to maintain that logic at every layer of the system, versus putting that in your warehouse, putting that in the central brain, and having
a tool like Census just sync the ultimate value. So, hey, there are now 12 users according to the aggregation map that we have, and Census will make that true everywhere. We'll say it's 12 everywhere, don't worry about it. And the correctness that you get from that is so much more valuable, in my opinion, than the immediacy you get from "user got added." This is something that people don't really see. So that's a kind of insight; I've seen people struggle with it and they don't know why, because it's somewhat difficult. It's distributed-systems kind of thinking you have to have.

Yeah, that's a very interesting insight, and that makes perfect sense. Well, Boris, it's been great to have you share with us today. Thanks for taking time to share all your learnings and experiences in the data ecosystem. I'm curious, how can folks get a hold of you if they want to be in touch?

Yeah, I'd love to talk about this with anyone. I'll give them my email, it's boris at getcensus.com, and I'm sure we'll put that in the show notes somewhere. People, feel free to cold email me; I'd love to talk about these things anytime. You can find me on Twitter, it's my full name, Boris Jabes. And you can go to getcensus.com if you want to reach out to anyone else on our team. You're going to find people on our team hanging out in all the data communities. But yeah, I just love talking about this.

Excellent.

Thanks for having me.

Well, I'm sure people will take you up on that and chase you down. So, Boris Jabes, thank you for joining us.

Thank you.

And to the rest of our listeners, thanks as well for tuning in. We'll have another episode next week, May 27th. I'll be talking with Julien Le Dem, the co-founder and CTO of Datakin, on the new OpenLineage project that Datakin has created. So don't miss that, as I think that'll be a great chat as well. Don't
forget to subscribe to the YouTube channel and hit the bell icon so that you get notifications. And thanks again for joining.
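
Editor's sketch: the contrast drawn in the conversation between syncing events and syncing state can be illustrated with a few lines of Python. This is only a toy model under stated assumptions, not how Census works internally; the class names, event names, and the `aggregate_in_warehouse` function are all hypothetical. One downstream tool counts "user added to org" events and was never taught about removals, while the other simply receives the count computed in one central place.

```python
from dataclasses import dataclass


@dataclass
class EventFedTool:
    """Hypothetical downstream tool that is fed raw events."""
    user_count: int = 0

    def on_event(self, event: dict) -> None:
        # This tool only knows about additions; removals are silently ignored,
        # which is the kind of per-tool logic gap described in the conversation.
        if event["type"] == "user_added_to_org":
            self.user_count += 1


@dataclass
class StateFedTool:
    """Hypothetical downstream tool that is fed the already-aggregated value."""
    user_count: int = 0

    def sync(self, state: dict) -> None:
        # No business logic here: just accept the value computed centrally.
        self.user_count = state["user_count"]


def aggregate_in_warehouse(events: list[dict]) -> dict:
    """Stand-in for a central warehouse model that handles adds and removes."""
    members = set()
    for event in events:
        if event["type"] == "user_added_to_org":
            members.add(event["user"])
        elif event["type"] == "user_removed_from_org":
            members.discard(event["user"])
    return {"user_count": len(members)}


events = [
    {"type": "user_added_to_org", "user": "ana"},
    {"type": "user_added_to_org", "user": "ben"},
    {"type": "user_removed_from_org", "user": "ana"},
    {"type": "user_added_to_org", "user": "cy"},
]

event_tool = EventFedTool()
for event in events:
    event_tool.on_event(event)

state_tool = StateFedTool()
state_tool.sync(aggregate_in_warehouse(events))

print(event_tool.user_count)  # 3: the removal was never accounted for
print(state_tool.user_count)  # 2: matches the centrally computed state
```

The design point is that the add/remove logic lives in exactly one function; every tool fed by state agrees with it, whereas each event-fed tool would need its own correct copy of that logic.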
Info
Channel: Data Council
Views: 343
Rating: 5 out of 5
Keywords: data engineering, data pipelines, SaaS, operational analytics
Id: 7bdHtSMBU7A
Length: 55min 5sec (3305 seconds)
Published: Thu May 20 2021