Data Science Product Case Study (LinkedIn InMail, Facebook Chat)

Captions
Let's say we're working on a new feature for LinkedIn chat. We want to implement a green dot to show an active user, but given engineering constraints we can't A/B test it before release. So how would you analyze the effectiveness of this feature? What are your first thoughts on this question? I mean, right now I'm just trying to brainstorm all of the different things that come to mind initially. Okay. The one part of the question I'm picking up on, in terms of the background, is "we can't A/B test it before release." So I'm making a mental note that not A/B testing it does not mean releasing it fully, because we could definitely look at something like a phased release. Okay. And we'd probably have a set of specific key indicators that we would want to monitor as part of that phased release, to make sure that we're not risking anything. So from a framework point of view, in terms of indicators that would help us assess the effectiveness of the feature, I would think of opportunity-related indicators and risk-related indicators. Okay, cool. So that's a start, but one thing I would want to circle back to before digging deeper is clarifying what effectiveness would mean: whether there is any further indication I would get from the interviewer, or whether they're happy for me to make hypotheses about how I understand effectiveness, or even how I would recommend defining effectiveness in this specific use case. Yeah, so let's say we want you to come up with your first stab at what you think effectiveness is. Okay. So first of all, I think it would be good to quickly clarify what LinkedIn chat is, just because multiple users have different experiences
on the platform um so my kind of understanding of linkedin chat is basically the kind of direct messaging part of the application where basically users can talk to each other uh privately so just to make sure that we're clear on that basically yep so let's say it's the uh current kind of you know pop-up sidebar um you can uh effectively uh there's its own messages tab you know and then there's also a pop-up site as well right um yeah so basically uh once i have clarified the so one one thing that i would want to think about here it's like what's the point of having um like that feature in the first place basically linkedin chat so why do we have it and what are we trying to maximize by having it um in the first place so basically what are the use cases and what would be the metrics that we would think about in order to have an understanding of those use cases and how we are um evaluating that basically the platform is catering for these different use cases um so basically just like here uh brainstorming a couple of use cases so basically um yeah so linkedin it's a social network obviously we have a lot of different features in terms of users being able to talk to each other um and linkedin chat would be one of these features um so yeah thinking specifically about what are the kind of elements that this feature fosters it would be definitely increasing communication between different members of the network yeah so basically in terms of members of the network we could have like different types of members different types of users we could have businesses or we could have candidates which are basically typically the two um usual sides of membership in this specific platform so that's just something to bear in mind so it could be private like basically like uh a member to member message um or a member to company message okay um so for like basically the candidate side i would just call it like member to member or like basically any follower or it could be like a 
business to member which would be basically b2m so m2m and b2m so that's one kind of initial use case based on the types of users that we have on the platform um now if i choose to specifically focus on um like the m2m for the sake of this exercise basically here i think linkedin chat would be being able to communicate with your specific connections outside any kind of public display that would be available through other features i.e commenting on a post or leaving um like a specific recommendation um or even tagging that specific user in another post okay um i guess if we dive back into it then let's say what is the benefit then of the direct message uh member to member or member to business versus the other interactions that you can have on linkedin as a social network so i think it's the ability to communicate directly and um like create a better communication channel that would not necessarily be so basically i think like the words that come to my mind are things around like intimacy um close um connections basically like building a relationship outside just like a mere connection so probably consolidating the um connection level okay uh additional thought here is like let's say a um email a connection or like a let's say message that would have originally um you know been like okay let's move this to email can now exist on linkedin correct uh correct so actually yeah that's a very good point because basically the use case would also not only be defined but what would be um like the alternatives available on the platform itself but also catering for some specific interactions that would happen outside the platform so that would definitely be um yeah one way of thinking about it okay so if we had to now so we've kind of scoped this out right we have all the different use cases we have an idea of what the benefits are can we now kind of drill in even deeper into figuring out what metrics would be good for us to track to just understand how messaging is effective 
and then also understand what kind of metrics we could track to understand if the green dot on the member profile would be effective as well? Yeah, so first I would start with understanding what health looks like right now for LinkedIn chat: what are the metrics that we would want to use? The way I'm thinking about them is in two stages: one is engagement and two is retention. Okay. Engagement would be anything around using the actual chat, so that would be the number of users using the chat on a daily basis. I could also look at the frequency of usage. Yep, for frequency you could do number of messages sent per user, right? We want to know if it's spread out between different users or if there's just a couple of power users on it, just to get an idea of the distribution of usage of the chat feature. And then maybe even the percentage of monthly active users on LinkedIn that use the chat feature at least once, right? We just want to know how much it gets utilized by the entire population of LinkedIn, and continue to segment our user base. Yeah. An additional thing that I think would be good to have at this stage is whether we have any inner funnel within the feature itself, i.e. the number of users who start drafting a message as the first step and then go on to send it. Thinking ahead about the green dot, this is one thing we would want to monitor, because we're introducing an extra factor for the user to weigh as they go through that process. So I would probably introduce a metric around draft-to-send, or draft-to-completion; probably a draft-to-send percentage. Okay, so we could say the number of users that start a message divided by the number of users that end
up sending the message um yes or it actually could be at message level itself just because um like one user like the basically because the green dot because one user interacts with multiple users and that when they interact with this one user they could interact multiple times uh we would want to have that at that level of granularity because this is what we would be interested in yeah okay cool [Music] yeah this is this is a like a good bunch of different metrics okay and then lastly i think you mentioned retention too right uh yes correct so retention actually is an interesting one especially in relation to what you mentioned in terms of like email being probably the most obvious alternative to like using the linkedin chat um then basically we would definitely want to uh define retention as the number of users who basically use um this feature um on a regular basis so i think what like for for me retention it's always interesting to think about that x so user number of users who come back after use like after x um and what's the the period is it in terms of days months weeks etc so usually um the way i would personally define that is by looking at like a distribution of like what would be the inactivity streaks for that specific feature and then probably defining the x uh from that distribution uh relying on like basically the the most common um inactivity streaks after which the user uh com comes back basically okay uh an activity streak would be like a number of days that they were continuously on the platform uh no actually that would be in so the opposite gotcha okay so in between they came back yeah okay exactly so basically if you think about it it's like um you have a distribution of all of your users and then base actually it's a distribution of all the inactivity periods uh but obviously it's the the ones after which the user comes back and you just you're getting the kind of most common ones after which the user comes back gotcha okay so you want it to 
be uh left skewed then right so we want the amount of yeah yeah yeah out of days to be minimal right exactly correct yeah okay cool nice um that sounds good that sounds like a great uh thing to look into okay so now that we have these linkedin chat metrics on the broader level of messaging uh now how do we analyze the effectiveness of introducing this new feature uh on those metrics um so i think um one way of doing it is probably picking up some like key indicators like probably one key or two key metrics per area that we have uh predefined i.e engagement and retention and we would make sure so for example for engagement um i think it's important to um probably like one make sure that the kind of usage uh by the total population is kind of stable um i.e the number of users who are using the feature pre and post um introduction of the green dot has not fluctuated too much obviously controlling for things around like seasonality over time and all of these things i think we'll get to that specifically when we basically pick the type of analysis i.e pre-post so i'm not going to go into a lot of detail at this stage gotcha um the other one would be the free so because there are two dimensions right so you want to make sure that out of the total population users are using this feature but also because um of the nature of this edition which is the green dot you also want to make sure that at a user level the frequency has not changed much they're not being discouraged by the introduction of the green dot here like my little assumption being you know it could actually create some kind of social anxiety for example seeing that the other person would um would you know is actually online is right there so that could kind of make you think twice or maybe self-doubt what you're writing especially if it's in a professional context and that you would not want to send your message i mean obviously that's like the downside version the upside would be oh you can actually see that 
the other person is active, and therefore the likelihood of you getting a response from them is actually quite high, and that would encourage you to increase your frequency of talking to that person. So within engagement, to quickly recap: there is this idea of looking at the ratio of the number of monthly users who actually use the feature on a daily basis. And then I would overlay an extra dimension, which would be, at a user level, what is the frequency. That's an interesting one, because we would not necessarily want to look at, for example, the number of users that that user... or actually, we could stick to an average: the average number of users contacted on a daily basis. Okay, before and after the launch? Yeah, before and after. Here again, the assumption I'm making, based on the part of the question that says "we can't A/B test it before release," is that we're going to be releasing it, so really the only way we can measure and analyze this is by doing a pre-post analysis. Yeah, so it seems like the only way we can analyze this is pre and post, especially when we're trying to measure different levels of engagement, right? Is there any way that we can maybe cohort our existing user activity so that we could analyze it without just relying on pre and post data on the aggregate? Let's say, for example, we had users that never used the chat app before, and then after the launch they started using the green dot chat app. That is still kind of pre and post, but it shows the interaction of what would happen if we actually got them on the chat app. Or maybe we find out that people that send
their first chat suddenly go from monthly active users to weekly active users at a higher rate, right? So if we hypothesize that a green dot encourages you to send that first chat more than if you didn't have the green dot, then we could say, okay, that is pretty effective, because now we're pushing more people down this funnel of starting their first message, or something like that, right? [Music] Yeah. In fact, I'm thinking we probably would want to add this metric to our set of metrics, in the engagement section: the number of users who send their first chat out of the total monthly active population. I think you're quite right in mentioning that there are really two types of actions that we would want to have a close look at: one is sending your first message ever, and then there are all the other messages that come after that. Yeah. There's also an additional thing: if a user is not active, then they won't have a green dot on their profile, right? So potentially we could still look at the users that are sending messages to users that don't have green dots, and see how that maybe affects engagement or retention down the line. So we release this feature, and there's now a green dot on everyone who's active, but that doesn't mean you would only send messages to users that have green dots. If you're a daily active messenger, you're still going to send messages to people that are not active, just because you want to reach out to them. You're trying to get some sort of deal in place, or maybe you're a recruiter and you're reaching out to a candidate. So are there any comparisons we can make in that regard, in
terms of the feature change there? Well, actually, I don't necessarily think that, after exposing a user to the green dot's existence, we could consider them sending a message without a green dot as being very similar to them sending a message without ever having been exposed to the green dot, if that makes sense. Okay, could you illustrate an example maybe? So the thing is, if you think about it, before a user has seen a green dot at all, the only way they are used to sending messages is that they just send a message; they have no idea whether the user on the other end is actually active or not. Yeah, they would just send a message all the time. Now, post green dot, they can see that sometimes the user is active and other times the user is not active. So when the user is not active, that could still change their behavior, now that they know that the user is not active. Okay. So I just want to flag here that we cannot compare these two as being similar, given that the user has now been exposed to this concept of a green dot. That's it. Okay. I guess then could we try to figure out how that might change the metrics down the line? Let's say we launched this feature and only 10% of all users are active, right? And then maybe after a week we get enough data to know how many people are sending messages when users are active versus when users are not active, after they're exposed. So then it just becomes a function of, okay, now we know how many people are active at all times of the day, on each day of the week, on average; can we now calculate the total number of messages that are going to be sent? Because we have these fractional components, right? So if there's a 90% chance that you send a message to a user when they're active
and with the green dot, and then only a 50% chance that you send it to a user when they're not active, and users are only active 10% of the time, we kind of have our formula now for understanding how many messages are going to be sent, essentially, right? [Music] Yeah. So then what you want to make sure of, provided again that you've covered all the seasonality concerns etc., is that the changes we are seeing between sending a message while there is a green dot versus sending a message when there is no green dot, that the difference is compensated for, so that overall the average likelihood for the user is not dropping below what it used to be; it's at least similar, or even higher. Okay, gotcha. Yeah, so I think there are really two important things to make sure of in terms of the validity of the assessment. One is definitely seasonality, because we want to make sure that any changes are not due to some sort of seasonality that might affect the user's behavior and therefore lead us to draw incorrect conclusions. And the other one is sampling, and by sampling I mean making sure that the users who have been exposed to the green dot are, in terms of the distribution of all the different user properties, very similar to the ones who haven't. And by that I mean we need to make sure that all of our users have seen that green dot at least once, because you could have the feature released technically but never truly see it. So the feature would have been released to you, but you've never been exposed to it
like in a session kind of level if that makes sense okay so um yeah i mean what happens if you've signed on and then none of your connections have been online right yeah yeah and you continue messaging them uh would that be a almost like a control like group that we could use to then still see how many messages people are sending within the same time period yeah and that's why um i mentioned we could provided that overall the like that kind of how to say um like control group that just kind of erupted out of nowhere is not basically skewed in terms of like having some specific types of users like over over represented ie maybe like just inactive users or users from like a specific demographic or like from a specific kind of user bucket in terms of like number of connections etc yeah exactly right because it'd be pretty easy for if you only had 10 connections like obviously you know there's pretty high likelihood that no one might be on versus someone with like 400 connections um you know there's probably very little high likelihood that at least one person is active right and so i mean even then it's like we have to start thinking about the the product itself right like how does a green dot kind of feature in the product is it do they always uh go to the top of that like little messaging widget thing uh or do you have to like scroll down to see them like through your list uh is it recommended by like your top your most like closest connections with the most amount of mutual friends or is it the number of people that are active and then if you go into the messaging's tab um do you see is it then maybe that's more of a control scenario because then you only see the green dot if they're active for the last three conversations you've had and then there's that's like a more likely scenario in which users may not be active right because those are just your three last active uh conversations and they might not just be the active people that are there at the time uh on 
linkedin right so i think there's more uh ways now that we actually have in terms of comparing like users uh that uh have been exposed to the green dot uh and are messaging people versus users that uh or aren't exposed to the green dot yet or uh have been exposed and then are still messaging people potentially so now it's kind of like a weird third group almost we've like kind of brainstormed a lot of different ideas and stuff down uh is there any way we can like prioritize like the ones that probably matter the most um like if i was a product manager on this chat app uh and i was like hey let's like give me like three metrics i can show to like the executives to show that this green dot launch is like doing really well or maybe like not performing well we should scrap it um is there any sort of like graph or metric that you can come up with to demonstrate uh that sort of like direct effectiveness then well i think starting with just like an overtime view of the number of users who use the chat app um out of total monthly active users or weekly active users depending on which one makes most sense for this use case and like how long the feature has been basically out for but actually what i would even look at is like the daily active users just because when you are initially launching this you really want to keep like a close eye on um how this is like varying on like a daily basis and basically what i would highlight is like the day of the launch and see if there has been like any differences to that ratio basically who would know some sort of like seasonality and um when you know we're not kind of likely to see some kind of high fluctuations so i'd probably go with either weekly active users or monthly active users depending on um which one kind of has the least amount of fluctuation just to make sure that we're not um seeing any differences due to those like uh normal recurring fluctuations as opposed to the introduction of this new green button so i think this 
is a good initial visual to have, and also a way to track the basic health of the feature itself. It can easily highlight whether this introduction has been having a positive effect or a negative one. The good thing about this ratio is that it can really be a good way to understand whether there is any risk or opportunity related to that specific feature. So I would definitely start with that: an over-time view of that specific metric. Based on what we brainstormed, I really think the one regarding the number of users who send their first chat is also important, because it covers the case where this feature has an impact that we would not necessarily see from that initial overall ratio of users who are using the feature out of all active users: the proportion of users who send their first chat. Another thing that I would want the PM to be aware of, and to present to their execs, is the likelihood that we came up with of you messaging when seeing a green dot versus not. Yep. And making sure that, one, these two are compensating for each other and, two, that the average is very comparable to the pre period as well. Of the number of messages sent? Correct. Okay. So intuitively I think we'd also expect that if you have a green dot, then the conversation that you're creating would be longer, right? If I start a conversation with you and we both see that we're both active, we're probably going to stay on the platform and chat for a little bit longer, rather than compose one long message and then log off, right? And so
potentially there are ways we could see that: maybe total number of messages sent divided by the number of conversations would be a good metric to keep track of. Yeah, like an average. Okay, yeah, it could be. [Music] I'm just trying to think, though: do we want to introduce a session level? How would you define a conversation? Yeah, that's true. It'd probably be conversations per day, as a session, right? We're assuming that these conversations can stretch over multiple days when you're pinging each other, so I think that'd be a good distinction. And then I think one last thing that might be interesting is looking at how this affects different parts of LinkedIn's business, right? You can buy LinkedIn Premium as a consumer, as a member, just to message more people, and then there's also LinkedIn Recruiter, in which recruiters can go out and message candidates. These are all directly related to messaging, right? So is there any way that we can understand how maybe we're directly increasing revenue by attaching this green dot to it? So actually, how is it monetized? Yes, so there's LinkedIn InMail, right? You can message people that you're not connected with if you have Premium, and you get a certain number of messages per month. And then there's also Recruiter, so that you can search a bunch of candidates and potentially send out message blasts; actually, I'm not sure about that. Quick question: are we making the assumption that the green dot would be shown for anyone you're messaging, or only your connections? Because I'm thinking that with LinkedIn InMail you usually reach out to people who are not your connections, because if they are your
connections then you don't need to like email them right yeah you're right so um let's say uh yeah you wouldn't see the green dot then but potentially after you make that email request and you get connected then there's probably some sort of follow-up in terms of uh being able to if they're active or not right so sending that like second message essentially um many people also within sales right they like they connect with you they send my first message request and then they now you're connected and you can see when they're active at any time essentially and so uh that's also raising like kind of this like are they going to be targeted is there like targeted message like are they only going to message you when you're active now which actually could have like an annoying effect exactly right yeah like a little ding yeah it's like this thing is not working for me i'm being spammed twice as much exactly right and so it could definitely turn people off uh yeah okay cool i think we've gone through a good amount of these um i guess lastly i'd love to know just kind of like what you think about this interview question uh what do you think about um the problem at hand in terms of like the green dot messaging feature uh what kind of frameworks do you think you would use in the future if you're encountering this question again um and uh where do you think we could like improve with our like theorizing and brainstorming i guess um great um i think it was a good question in the sense that it's a bit of an unusual one because the kind of um like automatic thing that comes to my mind is like oh it's like about how you think about an experiment because a b testing is like a given usually yeah um so pre post is definitely like something that you would also do so i think in this in this sense it was quite um a constructive exercise to go through just to kind of go through this checklist of if it's a pre-post what are the things that i need to be thinking about um so there is this 
aspect to the question um and then in addition to that there is obviously the kind of more um like basic layer of you have a specific feature how do you think about any changes made to this feature uh what is again your other kind of um mental checklist to to go through in relation to that um and i think we kind of covered a fair bit in our brainstorming process i think initially like when we started kind of just uh building an understanding of you know what is the future we're talking about um what is the goal for having the future what are the different types of users and after the you know if you decide to kind of focus on one specific use case what would be the metrics you would go for specifically how can you bucket those different metrics in terms of like areas um so us thinking about engagement retention is a very good way of kind of going about it and within that um probably can't have brainstorming a bunch of these metrics i think one one probably step that i would have probably done differently at this stage is right after picking a list prioritizing like maybe what's the one metric or the two metrics that uh like right then and there like we would just decide that this is the metric that we would want to use uh i mean obviously we could always kind of circle back to that in light of like new elements of the discussion coming up as we did for example in terms of like the first chat users um yeah so definitely that's something that i would tweak at that point um i really yeah sorry but i think in terms of just feedback for you i really like the fact that you kind of um picked like you made some choices and basically just hypothesized about some like possible um outcomes of the actual test you know like what if actually the likelihood is like this and this then what are the things that we would want to kind of think about and i think just kind of building that kind of progression through thinking around the different phases of how the feature will be rolled 
out, and also later down the line, when wrapping up: if you think about this from a PM perspective or an exec perspective, that automatically puts you in the position of, okay, now that we've brainstormed, it's great, we know that we've covered all the different aspects of the problem, but if you are to make a decision about this, what are the things that you would ultimately look at? So that was also a good part. I definitely think that's necessary in every interview, right? Whenever you do anything in terms of analysis, or even a presentation or writing, you have your key points, and those are the things we have to circle back to, especially when we're trying to, one, prove our logical argument and, two, communicate our results. I don't think it gets emphasized enough, for sure, especially in this day and age when everyone's working on something very complicated all the time. I think everything needs to be broken down very simply, and that's part of the job as a data scientist, right? To translate that. And then part of the interview is just to do that within an hour. Yeah, that's very true.
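The "fractional components" arithmetic from the conversation above can be sketched in a few lines of Python. This is only an illustration of the 90 / 50 / 10 example the speakers give. The function names (blended_send_rate, holds_up) and the pre-launch baseline of 0.55 are hypothetical and do not come from the video, and a real analysis would still need the seasonality and exposure checks discussed in the transcript.

```python
def blended_send_rate(p_send_active: float,
                      p_send_inactive: float,
                      share_active: float) -> float:
    """Overall chance that a message gets sent, weighting the two cases
    (recipient shows a green dot / recipient does not) by how often each occurs."""
    for name, value in (("p_send_active", p_send_active),
                        ("p_send_inactive", p_send_inactive),
                        ("share_active", share_active)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return share_active * p_send_active + (1.0 - share_active) * p_send_inactive


def holds_up(post_rate: float, pre_rate: float, tolerance: float = 0.0) -> bool:
    """True if the post-launch blended rate is at least the pre-launch rate,
    allowing an optional tolerance for normal fluctuation."""
    return post_rate >= pre_rate - tolerance


# Numbers quoted in the conversation: 90% send chance when the recipient is
# active, 50% when not, and recipients active 10% of the time.
post_rate = blended_send_rate(0.9, 0.5, 0.10)

# Hypothetical pre-launch send likelihood, made up purely for illustration.
pre_rate = 0.55

print(f"blended post-launch send rate: {post_rate:.2f}")
print(f"at least as good as pre-launch: {holds_up(post_rate, pre_rate)}")
```

With these inputs the blended rate is 0.1 × 0.9 + 0.9 × 0.5 = 0.54, so whether the green dot "compensates" depends entirely on what the pre-launch likelihood was, which is the comparison the interviewee asks to see.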
Info
Channel: Data Science Jay
Views: 12,170
Keywords: data science interviews, facebook data science interview, linkedin data science interview, product analytics, product analyst, case study interview, technical case study, data science case study, data science interview questions, product manager interview questions, product analytics interview, google product analyst, microsoft analytics interview, data analytics interview, data science coaching, data science mock interview, data science project
Id: Wie80J99BV4
Length: 42min 6sec (2526 seconds)
Published: Wed Sep 23 2020