Applying Data Science Methods for Marketing Lift

Captions
Hello everyone, thank you so much for joining us today. My name is Selma Beach, and on behalf of the American Marketing Association I'd like to welcome you to today's webcast, sponsored by Cardinal Path and entitled Applying Data Science Methods for Marketing Lift. Before we get started, I just quickly wanted to cover a few housekeeping items. This session is being recorded, and a recording will be made available to you soon. We also encourage everyone to continue the conversation on Twitter, and you can do so by referencing hashtag data science. If you have any technical or content-related questions, please feel free to type them in the chat area that's located on the left-hand side of your screen, and we will address those at the end of the presentation. And so with that, I'm very happy to turn things over to Charlotte to get us started.

Good morning everyone, thank you so much for joining today's webinar on data science methods for digital marketing optimization. My name is, as Selma said, Charlotte Boren. I'm the manager here at Cardinal Path in the data science department, and I have quite a bit of experience in the field of web analytics. I'm going to be talking today about some of the range of applications that are available for digital marketers in the realm of predictive analytics. Also on the call with me today is Danica Law. Danica is a consultant in our digital intelligence practice on the data science side. She has a wealth of experience in statistical and machine learning models. Danica's experience includes working for Statistics Canada, which is the Canadian government's national statistical agency. When you registered for the webinar, we asked you what it was that you wanted to know about this topic area, and the responses we got were incredibly broad, as you can see just by this word cloud that you're looking at. Some people were looking for general information, other people were looking for some very, very specific data science questions to
be answered, and some people were just looking for new ideas about what data science methods could be used. So in this webinar, what we're going to do is give you a small sample of what are fundamentally very different data science approaches. What we're trying to show you is just the breadth of options that are available out there, and what we want you to take away is that data science has a huge range of applications. Hopefully you're going to get some new ideas for your own organization based on what you see here today. The topics we're going to touch on are forecasting, customer lifetime value, and text analytics. We also took a look at some of the questions that came through the registration form, and we're going to touch on some of those at the end of the webinar. This webinar is also being recorded, and everyone who registered will get an email with the link to the on-demand version, along with a bunch of different assets, including our data science playbook, which summarizes sixteen different data science methods that you could potentially use in your own organization. One of the reasons why we picked this as a topic is the sheer potential opportunity. The online space is great for a lot of different things, but one of them is generating volumes of data, and digital marketing is no exception. Because there is so much data being generated, this lends itself really well to data science techniques, which can be trained on and can execute on that same data. Some organizations consider and build data science programs as a continuation of their business intelligence departments, but one of the main differences is that while business intelligence is a little more backward-looking, and may be more of a reporting-based view, data science changes that to a more forward-looking, predictive view. One of the first things you're going to need to think about as you bring data science into your organization is staffing, and one of the big
challenges with this field is the large number of skill sets required to successfully execute a data science project. We've created our own data science Venn diagram. This works particularly well for our own organization here at Cardinal Path. If you google "data science Venn diagram" you're going to see a lot of different variants out there, some of which may fit your own organization a little better, but what you're generally going to see is these four skill sets that need to be represented. The first is the area of statistics. Second is the area of visualization and communications. Third, particularly if you're dealing with large data sets, we start thinking about databases and computer programming. And then you also need, ideally, domain expertise to understand what the results mean so that they're actually more applicable. To try to find someone with all four of these skills is really difficult, and that person is sometimes referred to as the data science unicorn, which is of course a rare and mythical and perhaps not even real type of animal. So one thing we do to get around this problem of trying to find someone with an incredibly broad skill set is we hire for single areas of data science expertise and then make sure that all the different core skills are covered within our data science department. Everyone on our data science team can actually play in more than one of these different data science areas, but we always have a really strong go-to resource for each key area. Our philosophy is that data science is a team sport, and that the team together can provide stronger support to a project than any one person alone can. When you first get started with data science, it helps to understand the overall workflow of a project, and one of the models we like to use is known as CRISP-DM, which stands for the Cross-Industry Standard Process for Data Mining. This works really well as a way of understanding how a data science project typically
executes, and the idea here is that each stage is potentially iterative. You can proceed along the path of delivering a data science project, but depending on what you find in the data or in your models, you may need to loop back to earlier steps to better refine your project. But the key point, and the main point of reference for any data science project, always needs to be the very first step here, and that is business requirements. If you execute a data science project without having the goals of the business in mind, you can sometimes develop really complex pieces of work that end up driving no value for your organization. So you should expect that data science projects are going to change as they execute; just always keep in mind what the final business requirements are.

Charlotte, on this slide, can you maybe give an example of when you would go back in a data science project? I mean, that might be a little counterintuitive, going backwards in the process.

Yeah, that's a really good point. One project we worked on a little earlier this year was for a Fortune 500 company. They had launched a really big consumer-based campaign targeting a large number of people, and what they were trying to figure out was who their audience was and who was responding really well to this campaign. So we said, well, we can model this out, we can make a response model, and in more technical data science terms that's called logistic regression, which was the method we were using. So we knew the business requirements, understanding the audience, and we also had the data set available. But before we went down the path of modeling, we took a deep dive into the data set, and that's when we found that we weren't actually seeing a whole lot of differences between some of the different demographic variables, which means that demographic variables, at least for this client, were not predictive. So if we hadn't done that initial data
exploration step, we could have proceeded all the way to modeling and then produced a model that provided no value to the client. So instead we went from business requirements to data evaluation and data exploration, and then we had to loop back to the business requirements and say, okay, either the requirements need to be slightly changed or we need to come up with a different approach that meets the requirements that we know.

And we have a poll up right now. The idea here is to find out a little bit more about what data science topics you're interested in getting insights on. We've got four potential areas here, very broad topics: customers, marketing channels, products, or price. I see the responses flowing in really quickly here. In our own customer base, we tend to see that the clients we're working with veer a little more toward the first two, customers and marketing channels. Let's close the poll to find out the audience results. Yeah, exact same thing, and it makes a lot of sense to me that we're seeing customers and channels as key points of interest, just because they usually are areas where you can really move the needle. So with that general introduction about data science, staffing, and the execution process, I'm going to hand it over to Danica, and she's going to go into a little more detail about some of the specific topics we have here, the first being forecasting.

Thanks, Charlotte. So what can we forecast? Well, really, you can forecast just about anything: any online or offline actions, like your online sales, your display, web downloads, just about anything that your website or your offline measurements are measuring. So the sky's the limit with what you can predict here. Of course, you're going to need some data to make those predictions. You can base your predictions on user attributes, previous time periods, and more. If the data is available, we can assess whether including it in the model is going to lead to more or less
accurate and useful predictions. But I want to start this off by asking: why do you want a forecast? What is the impact going to be? Are you going to be able to optimize anything as a result of this? For example, some probable uses: you can use your forecast to look at how decreases in revenue occur because of scaling back your advertising budget, or you can use it to reduce uncertainty, so that you know what to expect in your business over the next several weeks. Another use case would be using it to clearly and accurately communicate expected goals for your business. As I said, what you need is something for the algorithm to learn on; the model must be trained. This is why we need to collect lots of rich data before we begin our forecast. When doing this, it's a good idea to think about what influences a purchase, particularly if you're forecasting sales. Someone might first interact with a business via word of mouth, when one of their friends tells them about an interaction with the business. Maybe they see some promotional materials or online advertisements like paid media. There are also all of those offline things like TV ads, and we also have more vague things like seasonality and how these come into play. These are all things you should start considering when thinking about creating your forecast.

So Danica, these are all pretty straightforward and logical things, I think, that I would expect to see in a predictive model for forecasting. Are there any variables you can think of that are a little more counterintuitive, that maybe people on the call hadn't considered?

Yeah, sure. This is maybe a unique case I was dealing with. A particular client was going to be introducing a new website iteration, and so we expected a lot of the metrics to change drastically at that time, as different things changed and the whole flow of the website changed. So what we did was consider introducing a variable that was
sort of an on/off. The first part of it indicated, this is a time period when we are on the old website, and then it would switch to indicate, this is the time period of the new website. That way we can have two different baselines: what the measurement is during the pre period of the new website, and the post. That was a cool way of differentiating between those different measurement times. After we've considered these different influencers to put into our model, we have to start thinking about evaluating the best approach. There are two main approaches we might consider here, the statistical and the machine learning approach, and they lend themselves to better solutions for different problems. For example, machine learning tends to lead to more accurate but less interpretable models. What this means is it's going to get our predictions as close to spot-on as possible; however, we're not necessarily going to be able to know how much of a change is due to what. That's where statistics is better. It's usually better for more interpretable models, where we can see how much of a change is due to what; however, we may lose some accuracy in using that approach. There are other things on this slide, such as how long the model takes to run, which has the lowest error rate, or how difficult it is to configure, and these are all areas we need to consider when we're choosing between the statistical and the machine learning approach. Here at Cardinal Path, we consider these each time. In the Fortune 500 example, we found a methodology that met business needs. We chose a machine learning forecast so that we could have increased accuracy: a highly accurate forecast that would let them know what to expect in their business KPIs. What was important here is that we weren't bound by one approach; we were able to adapt our approach to meet their goal of increasing accuracy. At Cardinal Path, we take this
approach where we evaluate multiple models on all clients. Rather than work with a single methodology, we find the methodology that works best for your specific business needs. We can quickly iterate through a range of models, as seen on the right here, and find the winning model, the one that is best for your needs, be it more accurate or more interpretable. We also take an iterative approach here. What this means is that as we get new data, we can update our model, whether that's more historic data or more recent data. We can also add new variables, so if a new channel is introduced to the mix, we can introduce that data set to increase the accuracy of our model. In the Fortune 500 example, we did just that. We had originally forecast using only paid search spend, but eventually we were able to collect all of the different spends, so display spend, TV spend, email spend, and by using all of these in the model we were able to iterate and increase the accuracy. Now I'm going to dive a little deeper into that example I've already started talking about. This client spent millions of dollars each month on paid search advertising, so really knowing what to expect in that marketing channel would be a huge asset. They wanted to know how many calls, appointments, and closes to expect based on how much they were going to spend on all channels. By knowing whether their paid search spend would help them meet their call targets or not, they could change the budget and respond to what the forecast was telling them to expect. However, forecasting attempts in the past weren't getting the accuracy that they needed. They also weren't getting the level of detail needed: they were just forecasting all calls as one huge bucket. There wasn't any breakdown to differentiate between paid search calls, display calls, or any other call sources. With this in mind, we had two goals: improve the accuracy and break down the metrics by marketing channel. So
previously the organization had taken a statistical time series approach. Since, as discussed earlier, we know that machine learning can lead to more accurate models, we moved to that approach. We created many forecasts using many different algorithms and chose the one that led to the best performance. We also broke down each metric into the marketing channels, and in doing this we were able both to provide a more granular view and to increase the accuracy by 30 percent compared to the traditional statistics model. What we do now is update the forecast once a month, bringing in more recent data to keep the model fresh and current. We also go back and assess whether different models lead to more accurate approaches, so if we go back and test a different approach than the one we chose here and find it's more accurate now, we can change it on the fly. We deliver the forecast as an output of expected values so that they can gauge how much of each website action to expect through the year. Here is just a visualization of what we were looking at during the forecasting process. The red line is the fitted line, what I predicted, and the blue line is the actual line. You can see some deviation there. No forecast is going to be perfect, but we're really narrowing in on what the actual trend is, as compared to what the statistical time series approach was doing previously.

I really like this example. Can you go back a slide? Yeah, I really like this because this is an example from the auto industry, and it's for somebody looking for quote completes for the paid search marketing program. Every single client I work with has active paid search marketing programs, so it's important to everyone. And I thought an interesting counterpoint to this example would be one also from the auto sector. It was a client we worked with earlier this year, and we were giving them a couple of different proposals for different
forecasting techniques we could do. They were really interested in auto sales, and we said, sure, we can do auto sales, but here is something a little more novel. Auto sales has leading indicators, so we said, what if instead of predicting auto sales we predict the leading indicator of auto sales? So what we were effectively proposing was to do a forecast of a forecast.

Yeah, that's a really interesting example there, almost a forecast of a forecast. So here we're now going to talk a little bit about ongoing delivery, how we handle delivery of the output of a forecast. Outputs can be delivered in multiple formats, where we manually update them or we automatically schedule them, or, more interestingly, we can deliver them as part of a dashboarding project, in particular through a Tableau R integration. When delivered as part of a dashboard, we can more easily scale predictive models across sites, brands, and audiences. Dashboards are a great way to see the data and the forecasts all in one place. You can see how the forecast is expected to change based on your advertising budget and more quickly respond to what the forecast is showing you. For example, if the forecast says sales are going to decrease in November, a dashboard visualization allows you to see this more quickly. You'll see that drop in the plot, and maybe you'll be able to respond to it by increasing advertising in particular channels. With the Tableau R integration, we can quickly find a high-quality forecast using an almost endless range of models. R has tons of different modeling options, R being a statistical software, and it has a lot of options that aren't necessarily available in the built-in Tableau forecast, so we can really increase the accuracy here. Then, by having it connected to Tableau through a custom integration, all the visualizations and all the predictions are going to live in one spot. This allows for easier updates, and this saves
time, which can instead be used to gather insights from the forecast. So how are we going to get started here? First, you should take stock of what data inputs are currently available. Do you have data for previous time periods, so historical data? Do you have any user-level data? Think not only about what variables are available, but about which variables are going to have an impact on the metric that you're forecasting. In the Fortune 500 example we discussed, we knew that all spend, not just paid search spend, was likely to impact their metrics, and so we set out to get all spend incorporated into the forecast, all spend being display, paid search, and paid social, not just the paid search spend. It's a good idea to include all these possible data inputs, even if you think they may not be the most relevant, because during the model building process we can include and exclude data to find what leads to the most accurate predictions. We can test different models and figure out which one works best for the data. This is why including as much data as possible will help lead to a better forecast. Determine what you want to predict that's going to have the highest impact for your business. Sales can be a good starting point here, but this could be anything, for example store locator clicks, if that's something that your website is trying to optimize towards. This goes back to checking what aligns with your business requirements. Think about the why. That's what we obsessively think about here: why are we doing this project? It's very important that the business requirements always apply to what we're going to forecast, because these are the things that we're measuring, our key KPIs. Otherwise you may end up forecasting a metric that has little use in impacting the decisions of your business. Here at Cardinal Path, we frequently do stakeholder interviews to help define those business requirements for you. Getting set off on the right path for
these measurements is not only going to equip you for the best forecast possible, but it's going to give all your data projects, forecasting and otherwise, meaning. That's going to take us into our next topic, but before that we have a poll question. This poll asks: what is your biggest challenge when tackling data science? Staff, knowledge, technologies, or data volume and quality? If you're already tackling data science projects, you're probably already encountering some of these challenges, whereas if you're new to it, these are challenges you'll encounter all along the way. So I'm curious which ones are the biggest that people are seeing. Let's give it a few moments to get a few more poll answers in. I think we can stop the poll now and see what the results are. Interesting. So the main challenges here are knowledge, and then data volume and quality. Yeah, I encounter data volume and quality issues quite frequently, where we don't have enough data, or going back and checking the quality of the data leads to challenges where we maybe have to collect more data. So really interesting, thanks there.

So what is customer lifetime value? It's important to note that customer lifetime value isn't just what a customer has spent with you so far. That would just be a simple sum of the total revenue by customer. Instead, it's a predictive measure of how much a customer is going to spend with you in the future. Then we can slice it by demographic, behavioral information, and more to identify key patterns. By looking at predicted lifetime value rather than the amount each customer has spent so far, we can get a sense of how valuable certain customers will be, rather than just how valuable they are now. This is a really important distinction, because if we just look at how much someone has spent with us so far, we're going to undervalue new customers. They haven't had time to build up the revenue history that long-term customers have. So if we could predict how much
value a customer is going to have with just recent purchase information, we could use this information much earlier and take action accordingly. So why would you want to use customer lifetime value? You may have heard of the 80/20 rule. This says that 80% of your sales come from 20% of your customers. But this leads to a really important question: who are those 20% of customers, and can you bring in more of them to drive high amounts of revenue like this? By predicting a customer's lifetime value, we can discover who these 20% of customers are that are driving the bulk of our sales. Then we can look at their behavioral and demographic traits and use this to find more of them through targeted advertising, or we can use the information to personalize the experiences of our existing customers. We can also use lifetime value to help determine optimal customer acquisition costs. Often channel acquisition costs are determined by how much revenue your channel is bringing in, but what if we look at the customer level? That is, is the channel bringing in high lifetime value customers or low lifetime value customers? Then we could use this as part of determining optimal customer acquisition costs. For example, look at the customer path on the right. If we only look at the first purchase a channel brings in, paid search is heavily undervalued compared to display. Display is bringing in $100, but paid search is only bringing in $25. But it turns out paid search brought in a higher-value customer. When we look at the full customer lifetime value, display is bringing in one-time purchasers, but paid search is bringing in frequent purchasers who will buy with the company over and over again.

Yeah, I really like this example because it reminds me of a client we worked with a couple of years ago now, from the education space. What they were trying to do was optimize their customer acquisition costs, and they thought that by targeting affiliate and display they had a very low cost for customer
acquisition. But when we ended up looking at the lifetime value, what we found is that while display and affiliate definitely brought people in, and it was students who were enrolling, they didn't tend to last all the way through the educational program, like a four-year bachelor's degree. So display and affiliate brought in, yeah, low initial customer acquisition cost, but also low customer lifetime value. When you considered that full view, it actually turned out that the more expensive channels, or at least those that appeared to be more expensive if you were just looking at that first customer acquisition cost, ended up bringing in the better lifetime value. So in the end we were able to make a couple of different recommendations to this client. One was: it looks cheap on paper, but when you consider the bigger picture, you need to change your marketing outreach strategy to find new students and new customers. The other one came when we sliced into the lifetime value at a demographic level. We were slicing down into zip code, and we actually found that certain zip codes tended to produce higher lifetime value students. So we were able to give a couple of different strategic recommendations there that the client was able to incorporate to improve their customer acquisition costs and lifetime value.

Yeah, and I really like that example too, since it goes through some really surprising findings that helped change what the business is doing to actually drive more revenue and inform their marketing strategy. So thanks. So who would be a good candidate for using lifetime value? By knowing how valuable customers are and how they behave, we can start using this for targeted advertising. For example, say the results of our models show that females aged 18 to 24 living in suburban areas are driving more value for our business than other segments. We can set up our display advertising targeting to search for new customers like this. By taking into account a
customer's whole purchase potential rather than their first purchase, we can begin optimizing our advertising budget, like in the previous slide's examples. And finally, we can start setting up some personalization on our website to serve different experiences to different customers based on their lifetime value segments. So maybe you could set it so that low lifetime value customers are served differently priced products, or try some upselling in an effort to increase their customer lifetime value. By knowing who has low lifetime value, you know where to spend time trying to increase the purchase value of your customers. I'm going to walk through a couple of customer lifetime value examples and more concrete models, but first we're going to talk about evaluating data. Data evaluation is one of the first steps in a project, after defining business requirements of course. Data evaluation is when we explore the data, make sure we have all the data needed for the project, and check that it meets the modeling needs. The main takeaway I want to give here is: the richer the data, the richer the insights and the impact. At a minimum we can just use a transaction history, but this isn't going to lead to super rich insights. When we add in information like marketing channel info, demographic info, or behavioral info, that's when we really start to get this model shining, because what this does is layer in information about our clients: not just how frequently they purchase or when they purchase, but more information on where they came from, like in the example where we talked about paid search versus display, or demographic info, as when we talked about different zip codes, or behavioral info. Maybe on your website there are different cues, for example people who purchase at certain times of day or who purchase certain product mixes, and maybe these are all different indicators for lifetime value. Looking at them in combination is going to lead to a really rich
insight. Another thing that is important here is that we have good data integration. Data needs to be linked at the customer level, which means having a user ID. Without this, the data is going to be disconnected: we won't know how the behavioral information ties to the demographic information or the revenue, and we won't be able to make predictions. Now I'm going to walk through three examples of customer lifetime value models, starting with some that require less data than the last one we're going to go through. First off is a recency, frequency, monetary model. This requires very basic data, just a transaction history, and from this we derive recency, frequency, and monetary values. Recency is how recently a customer purchased, a short time or a long time ago. Frequency is how often they're purchasing, and monetary is just how much they're purchasing in sheer revenue. Since we're not putting demographic or behavioral data into this model, we don't get insights for demographics or behaviors. Instead, we're just going to get insights based on how often purchases happen. This can be used to classify existing customers and any future ones, so we know exactly where they are grouped in terms of customer lifetime value.

There's one interesting point on this slide that maybe the participants would have a question about. On the right, the title of the chart says "the probability that a customer is alive." I'm sure most people on the call probably think of all their customers as being alive. Maybe you can explain that one a little more?

Yeah, that's a really good point. So yeah, we probably think of all of our customers as being alive. What this is doing is borrowing terminology from an area of statistics called survival analysis. It's basically just saying that customers who are still customers of your business are alive, and once they leave your business, once they've churned, they're dead, which is maybe a little morbid
way of putting it but it's just borrowing from the terminology of the area of statistics that the models come from so it is a funny way of structuring our graphs for sure the next model also requires just the basic transaction history but we're going to layer in another dimension here and that is time this is a transition probability model and what we're going to do here is look at how customers change in their groupings their customer lifetime value groupings over time so for example what percent of inactive customers are going to become active sometime or what percent of active low value customers are later going to become high-value this again doesn't bring in as much information as the model I'm going to go through next but we can begin to optimize based on this alone finally we're getting to the types of models that I actually get really excited about we're working with much richer data and that means we're actually going to be able to take extreme action on the outputs of it so this model means we're taking into account not just the transaction history but at least one of and hopefully all of demographic behavioral and marketing information rather than only knowing who has the highest predicted lifetime value based on the recency of purchase frequency of purchase and revenue values we also dive into demographic segments like gender age education we dive into behavioral traits like those who visit websites often or infrequently and then we can look at these in combination with their marketing histories and we seek patterns in this information through clustering and segmenting we can identify unknown groups of high lifetime value and use these to drive optimizations by using exploratory data analysis techniques we uncover segments and combinations of behaviors and demographics that were previously unknown in driving high value for the business and this ends up giving us a more encompassing view of who our
customers are and which ones are high or low value then we can use this information to target advertising change personalization efforts and more how can you get started in doing a customer lifetime value analysis first collect the data that will be used to predict lifetime value again this can be user actions user attributes and more the more data that's put in the models the richer the actionable insights will be later the output of the model is the expected customer lifetime value depending on the input data source your customer lifetime value can represent individual customers or particular segments such as gender and age breakdowns finally and most importantly you're going to want to use the results use the results to lower new customer acquisition costs through optimizing your advertising budget spend more to acquire customers of high lifetime value and less for those with indicators for low lifetime value this can significantly improve your marketing return on investment when you know your high value traits you can set up flags in the data and identify when a customer moves from low to high value and then you can personalize the website experience and advertising based on these flags and more the more information you put in the more you're going to get out of this customer lifetime value model at Cardinal Path we want to make sure you not only get a high quality model but that you use the results of the model we help clients figure out how to act upon the model findings all the time so in sum customer lifetime value is a great way to learn more about your high value customers and start using this information great thank you Danica for this final section we thought we'd shift gears just a little bit we've been talking about different ways that you can use numbers in data science and in this section we're going to give you a few ways where you can use text in data science but first we have another polling question so for those of you who do have
active data science programs we're interested in knowing where you are spending most of your time so the options here are are you spending a lot of time just collecting your data is it in cleaning the data is it in data evaluation are you doing a lot on the refinement of your algorithms and your models or is it actually on the understanding of the outputs of your models and I have a small bet with Danica because I'm pretty sure I know which one's going to come in first just based on what we see here within our own organization but we'll see if I win that bet so looks like the responses have slowed if we can close the poll okay yes so I have barely won my bet yeah it's cleaning data if you are an active data science practitioner you know just how much time you can spend on this getting it into the right format removing junky data just merging data together just so much time can go into this particular step thank you for your responses so text analytics text analytics is exactly what you think it is it is using analytic techniques and sometimes the realm of natural language processing to understand your text-based data and it really doesn't matter where your data is coming from all we're looking for it to be is in text-based form so a few examples that you might have access to within your own organizations would be emails from customers web surveys it could be online comments it could be web forms text-based reviews social media is a huge source of potential material here but the ways that you can dig deeper into this different type of data would be through the methods we're going to discuss here we have topic modeling and then we have sentiment analysis to discuss so topic modeling is about understanding really large volumes of text to understand what the major concepts are that are going on within the text-based corpus so here's an example if you were trying to review all of the different tweets that happen during the u.s.
presidential debate let's say the first one and you tried to figure out what the major topics on Twitter were that would be very overwhelming just because there's so much content available to you but what topic modeling does is it provides a way of summarizing what the major topics were and it's much more sophisticated than a word cloud so that was something you saw in one of our earlier slides a word cloud is only going to tell you about frequency but in topic modeling the algorithm actually understands what words are related together so for example a really good topic model should be able to understand that Obamacare and the US health system are conceptually related and actually sit under a single common topic which would be the US healthcare industry and then the question is why why would you do topic modeling so really what it's meant to deal with is when you have large volumes of text and just can't sort through it on your own to make sense of what's going on so this is an example from a textbook by Auden art and what you see is underneath arts you've got a bunch of different words so you see new film show music etc so I think anyone on the call would automatically associate words like film and show and music with the topic of arts but the one that's interesting here is the first one new why did the topic model say new is associated with arts and none of these other subjects and the reason why is the topic model looked at the documents and it found that new is very frequently in close proximity to these other words that it associates with arts and this is an example of a pretty good topic model because it does that classification quite well topic modeling and sentiment analysis go really well hand-in-hand and working together the topic models are going to tell you what people are discussing and then the sentiment analysis is going to tell you if people are viewing different subjects in a positive or a negative light so here's an example of a
topic model that we did internally at Cardinal Path and we were analyzing web forum posts and this is where it gets super interesting the ways that you can apply topic modeling particularly on the visualization side are just fascinating what we did if you think of topic modeling topic modeling is about understanding how closely related different words are it's a measure of distance and a great way of measuring distance between different items is a network graph which is why you're seeing this cool spider graph on the right hand side and so what we can do is if we were to hover over these different nodes or get more information about the different blue and red elements we'd understand what other content topics are closely related to the one that we happen to have interest in so if you think of the applications you've got a really interesting way to understand related content for a user who would be interested in a specific area and then if you think about other ways that you can analyze topic modeling results you can think about ongoing reporting for how topics are changing over time or one thing that we really like to do is analyze how topics are changing in different geographic regions particularly for international tech brands and this is an example of sentiment analysis sentiment analysis in general I think it's really intuitive to most people you're just understanding are people saying positive things or are people saying negative things and in this example we're looking at individual tweets and so for this particular sample of tweets that we took you can see it's skewing a little bit to the right and on the right we have the more positive view so in this particular sample people are saying generally positive things and again where this gets interesting is not in the sum of everything that's going on it's when you start slicing into it in different dimensions the dimensions that you choose are always just going to be the ones that are most
of interest to your business but really good ones could be again geographic segmentation is a good one segmentation by brand segmentation by topic yeah another example I'm trying to go back to that slide Charlotte thanks so another example of sentiment analysis I've seen is one of Yelp restaurant reviews so what you can do there is look into Yelp restaurant reviews and sense whether there are positive or negative feelings about specific restaurants and so obviously words like awesome delicious indicate positive sentiment so there's some surprising things that can be found in analyzing restaurant reviews like the words people and I end up being positive possibly because the restaurant's trying to meet all of your needs or negative things like minutes maybe you waited 30 minutes for your food and by analyzing how people are feeling about a restaurant over time maybe you can go back to specific restaurants and use this as feedback for the staff and change how you're running your day-to-day business excellent so there's a lot of different ways that you can execute on topic modeling and sentiment analysis if you are an R or Python user there are already packages available for those tools that you can access to get up and running with these different models there are also a large number of vendors available in this space so I have an example here of something that was posted by Yuri cotton where he did an analysis of different natural language processing APIs and how they work in terms of entity extraction and so this is really important if you are working in a domain-specific space because with these types of models you really want to make sure they work for your particular domain sometimes the underlying models will need to be trained to work a little bit better for your industry so when you are considering your vendors here are the different things to keep in mind very first one number of languages supported making sure that all the languages
that you have access to are represented in the API or the vendor that you're trying to use the second is that area of domain expertise so if a sentiment analysis model has been trained on computer software content it's going to do well for other computer software vendors in the space but if it's a model that's never seen that topic before it may have trouble figuring out what's going on number three is what type of features are extracted and this is starting to get into a little bit about the additional services that could be available most vendors that are out there they don't just do one thing they actually have a bunch of different services and can give you a bunch of different information so take a look to see what is offered and what meets the needs of your business the best and one thing I noticed with a lot of the natural language processing vendors is some of them are providing the ability to integrate with other platforms I see Twitter integration on a lot of them so if there are specific integrations with either analytics or social media platforms that you need definitely take a look to make sure you find a vendor who can provide you with that and then the final one here is quite obviously cost these all come at very very different cost points so what do you need to get started in text analytics it's literally just text that's all you need the output can be incredibly varied so we had one example where we were showing you a visualization of how topics were related another area would be just a deeper understanding of what topics are being discussed by your customers and how those topics may be changing over time and then the final one is if you're thinking more about sentiment analysis the output here is about understanding the sentiment of your customers over time so the way to use this because social media is so text based this can do a great job at informing your social media campaigns you can track the sentiment of your customers over time see if
it's improving or if there's some problems and then one of the core uses is if you've got large volumes of text based data this is a great way to dive into it a little bit more effectively so thank you very much for your time okay wonderful thank you so much to both of you thank you Charlotte and Danica I think we have some time for some questions and have a couple for you already some great ones so I think I'm going to go ahead and get started and so the first question for you two we have an audience member that wants to know which courses are needed to train a data science team yeah sure I'll take that one so if we think back to the slide Charlotte went through earlier in the presentation it went through some key areas of data science so there was the math and stats skills so courses in understanding when to use what modeling approach to solve your problem is one area where you would want to take courses the other area is what I like to call hacking skills and this is knowing how to program so how to program the models and clean all of your data using programming languages and then there's the most important one domain expertise this is one which you can't really learn from courses it's practical hands-on knowledge about the field at hand and this can only be gained through experience with the type of data you're working with and the ways that it can be applied so I guess some courses in areas like programming math and statistics and then exposing yourself to the field that you're looking to apply data science to alright wonderful thank you next question for you what is the highest impact way of using data science in an e-commerce operation to drive sales that's a great question yeah and I think if I had to generalize it I think I saw different versions of this question in the registration form quite frequently it was really you know what's the quickest win unfortunately the quickest win is 100% dependent on the business that
you're working for so I would imagine that the quickest win for an e-commerce site if I was working with a large e-commerce client they typically have large marketing budgets and so the very first thing that I would usually zero in on is can we make that spend more efficient so we would be looking at anything related to forecasting is one angle if we want to get a bit more complex we'd be thinking about attribution modeling or media mix modeling but when you're trying to figure out what's going to be the most impactful for your organization it's usually about understanding either where you're spending the most money where you can maybe save some or where the greatest area of opportunity is and maybe that's an area that you already have a great deal of success in or maybe it's an area where you know you could have some success in but you need to dive in a little bit deeper just to understand what's going on there a little bit more all right thank you for that explanation all right so next question that I have for you two we have an audience participant that is inquiring about best practices technologies and tools to be used and there are some examples provided such as R Python Amazon S3 etc so if you can share your thoughts on that yeah that's a really general question there and a lot of varying technologies listed so I like to think of this in terms of whether you are starting out in data science or well-established so if you're starting out you may just want to start with things like R and Python to start scripting for smaller problems but eventually you're going to hit a point where the data you're working with can't be run on your local computer and you're going to want to move into the cloud and so at that point that's when you need to start looking at things like Amazon Web Services and the like finally when you get to even larger data where you need to do parallel processing now you need to start considering things like
Hadoop so sort of when you're starting out start with the basics and step up as you lose the capability to process your data on your computer alright those are some great tips and so I'm trying to get to some more questions here before we run out of time so another question for you two how do you translate the insights from a data science model into actionable insights and strategy that is a really important question because obviously we want to use the results of our models but it's going to vary based on the model you have so the output from the text analysis model is going to be really different than the output of a customer lifetime value model they're going to have very different outputs so the best way to know how to translate the output of your model into insights is to have access to the areas of your business where you can make those optimizations so if you have a website where you can make personalizations based upon your model output keep that in mind as a way of using your model if you have different marketing channels where you can change your budget do that and if you have different areas to inform your marketing strategy be it through the messaging you're using or who you're targeting just think of all the levers you can pull in your business and your model output should change one of those levers and allow you to change the results of what you're working with okay great and we have an audience member that commented that they're very new to data science and web analytics and so they wanted to know what techniques they should use to get started okay I think I have an angle for this one so when we think about web analytics and data science and sort of how you evolve along that path there's a way we think about it at Cardinal Path and it's literally a path the very first step is your implementation step it's your data layer step it's about making sure that the data you acquire is as clean and as accurate and as believable by the stakeholders of the
business as possible so that's literally the foundation to everything you do before you can dive into some of these more fun applications so step one get your foundation in order step two is what we call our business layer and that's when you start to activate the web analytics information or other analytics information that you have and that can be through things like segmentation or reporting or you know the beginning of different insights projects so things that inform your day-to-day operations and then the final step is what we call the strategic layer and that's really where the data science elements live and that's where we get into all the sorts of fun stuff that you're seeing here in these different slides but it is a very methodical progression between those different steps you can't skip the clean data collection step and expect to do really well in the strategic layer working on data science you need to have sort of those foundations in place and they also kind of correspond to how you can end up getting the most ROI out of your program right you don't want to go from zero to 60 because at 60 you're not going to be able to leverage those insights as well if you miss those intermediary steps great well we have a sort of follow-up question to that because we have an audience member that wants to know how his business can get started with data science when they don't even know what questions to necessarily ask so which method would you recommend to use that would benefit their business yeah I think I'm going to tie that back to my previous points about the path and then Danica also made a point about the same thing so think about sort of that methodical process of getting the data in order thinking about your business operations and then thinking about the strategic step can you repeat the end part of that question again was it about the digital techniques yes which method would benefit their business if they don't if
they're not sure which questions to ask how to get started using data science yeah for sure so if you're not really sure what area is going to move the needle for your business or where there's the most questions or interest or what's going to provide the most value we do see that when we start working with clients initially so usually what we do there would be a series of stakeholder interviews trying to understand the business just a little bit better or get into areas where maybe you know if you are working out of business areas that you don't currently have exposure into and once you start to understand some of the challenges that are going on internally at the business you can start to get a feel for maybe where data science can fit in to provide the most value all right wonderful next question for you do you still have to go through the same data evaluation steps for text based methods like you do for non-text methods yeah actually you would think that you could just throw text into a model because all you need is the text-based data but actually there is still that same data evaluation and data preparation step that you see for numerical based analysis so just a couple examples like if you're scraping web data you're probably going to have HTML or URL encoding which is going to give you junky results and then even if you're using non web data you still have to worry about things like do your text analytics methods handle pluralization or do they think that plurals are completely different words right because that's going to cause essentially duplicate content other things to think about stop words are a thing so the idea of stop words would be sort of non-useful words in a model so things like I me you and it just you know really short words that are really common but which would you know cause a bit of noise in the final output slang can be problematic and then URLs can also be problematic so those are a couple of different things
that you think about in terms of just preparing text based data wonderful I believe that's all the time that we have today Charlotte and Danica thank you so much for sharing your insight and taking some time to share all this wonderful content with us just as a reminder to everyone AMA members can view an archive of this webcast at any time you just have to visit AMA org backslash webcast we also do have a short survey at the end so please take a moment to complete the post-event survey and just let us know your feedback and your thoughts and I want to thank our generous sponsor Cardinal Path again thank you so much I also want to thank ReadyTalk who provided us with the web conferencing platform for our webcast today if you'd like to learn more about ReadyTalk and their services feel free to check them out at ReadyTalk com backslash ama and last but not least I want to thank our great audience thank you so much for your time and your participation we appreciate it and I hope everyone has a great rest of your day
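The text-preparation concerns raised in the final answer of the Q&A (HTML and URL-encoding residue, URLs, stop words, plurals) can be sketched in a few lines of Python. This is a minimal illustration and not the presenters' pipeline: the function names `clean_text` and `singularize`, the tiny stop-word set, and the naive strip-the-trailing-s rule are all assumptions made for the example.

```python
import html
import re
from urllib.parse import unquote

# Hypothetical, deliberately tiny stop-word list; real projects use a fuller one.
STOP_WORDS = {"i", "me", "you", "and", "it", "the", "a", "is", "to", "of"}


def singularize(token):
    """Naive plural handling (an assumption): drop one trailing 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def clean_text(raw):
    """Turn scraped text into a list of normalized tokens."""
    text = unquote(raw)                        # decode %21-style URL encoding
    text = html.unescape(text)                 # decode entities such as &amp;
    text = re.sub(r"<[^>]+>", " ", text)       # strip HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # strip URLs
    tokens = re.findall(r"[a-z']+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [singularize(t) for t in tokens]


sample = "<p>I loved the burgers &amp; salads%21 http://example.com/menu</p>"
print(clean_text(sample))
```

In practice the trailing-s rule would be replaced by a proper stemmer or lemmatizer, and slang would need domain-specific handling that this sketch does not attempt.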
Info
Channel: Cardinal Path
Views: 7,513
Rating: 4.8681316 out of 5
Keywords: Data Science, predictive analytics, forecasting, AMA, American Marketing Association
Id: 6auG0n5Iz6s
Length: 57min 5sec (3425 seconds)
Published: Fri Nov 04 2016