On How Machine Learning and Auction Theory Power Facebook Advertising

Okay, thanks, everybody, and apologies in advance. Chinmay and Joaquin both really wanted to be here. Chinmay was sick and his son was sick, and Joaquin had an emergency just hours ago, so I'm very much flying by the seat of my pants here. I'll do the best I can. Joaquin was going to talk about the machine learning side of Facebook, and Chinmay was going to talk about the auction side. As for my background: Joaquin is in the machine learning group, Chinmay is in the ads delivery product group, and I'm in the economic research group. So I'm not as intimately familiar with either of these areas, but on the auction side, at least, I work with Chinmay. Even if the presentation is a little rough, please ask questions if things are unclear, and I'll answer them as best I can.

As I said, the talk is broken down into two parts. First, the machine learning side: how do we predict the various things we care about on Facebook, the probabilities that users are going to do different things? Then, given those predictions as input, how do we actually decide what content to show a given user?

At a high level, we can think about Facebook as a personalized newspaper. All this really means is that we want individual content for each person, and each person cares about different stories in different ways. If I come to Facebook, there are two possible worlds I could be in: one where I see the stories on the left, and one where I see the stories on the right. Our question is which of these we should actually show a user when they come to Facebook. Maybe the story on the left isn't that interesting in itself, but it comes from a close tie, or something like that. So it's a really challenging problem, but what we want to show
people is the stories they actually care about, and not all the other noise. For a long time this wasn't really an issue, because we just showed all users everything in reverse chronological order, as I think Twitter had done as well. Eventually it just got to be too much. Some people complained when we switched to the ranked version, because you're not seeing everything, and there is a risk that we don't show you something you would have wanted to see. But the reality is that a given person has, on average, 1,500 stories they could potentially see in a given day. We've run experiments on this. Without ranking, people click, comment, and like significantly less often than they do with ranking; people miss important stories, or stories that we deem to be important; and people see content that we think is irrelevant. Maybe you don't care that someone you went to junior high with had carrot cake for lunch. So this is why ranking is important: we have to figure out which stories to actually show the user.

As I said, different users have different preferences. One person may only like updates from his or her close personal friends. Someone else may really have an interest in news stories. Someone else may actually value stories coming from weak ties and want diversity in their stories. Someone else really likes funny cat videos. Somebody else, I guess, likes clicking cows. (I didn't make these slides, as a preface.) I think what this is trying to get at is that people have some very specific preferences and interests. The ideal would be knowing these
sorts of things about individual users, but we don't have a way to get at preferences this specific. We don't ask individual users what sorts of things they care about. There are some ways for users to express their preferences, but we can't make these sorts of decisions for everybody; we just can't measure that.

What we can measure are the things we observe happening on Facebook. These are what are called feed metrics, just different things happening in feed. We see when somebody clicks through on a story to a news article, or clicks on a link to a video. We see when somebody likes a story, comments on a story, or shares a story. We see how long someone views a story and, if there's a video, how long they watched it. These are the observables we have, and we can think of them as some indication of the value the user is getting: they give some signal that they like something by liking it on Facebook.

So the machine learning part of this is predicting the probability that these different events occur. Simple CTR prediction is one example; we just have other events aside from clicks that we also want to predict.

If we have predictions of the probability of all these different events occurring, we then want to say what value a given user has for seeing a story. Facebook is very much about incremental improvement: getting something out there and revising it as we go. So what started out was somebody with, hopefully, some good intuition for the system deciding that a like is five times the value of a click, a comment is four times the value of a like, and hiding a story is very bad. So this is basically some
linear utility function, where somebody on this team has decided upon the coefficients. There's an interesting thing here. I'm mostly going to be talking about the left-hand column for the first part of the talk: we have relatively sophisticated models for coming up with these probabilities. The question is, when you multiply them by hand-picked coefficients, maybe there's room for improvement. I can talk offline, if you want, about some improvements on this; there's now a sort of revealed-preferences approach that tries to back out more meaningful values for the underlying events. But I'll be talking at a high level about the event predictions.

The example Joaquin gave here is timely, given that the new Star Wars is, I think, coming out soon. This is a possible story on Facebook. For those of you who are unfamiliar, one thing you can do is declare a relationship you have with another user. Here Luke has added this other user, Darth, as his father on Facebook. There may be some other user, say Yoda, who can potentially see this story, along with a number of other stories from other people Yoda is friends with. The question is: should we show Yoda this story?

I've already said that the way we're going to think about this is the linear utility function: the probability of a like times the value of a like. For now we've just chosen the value of a like, so we're really just trying to figure out the probability of a like. How do we actually do this? There are a few different types of features, at a high level, that mainly look at historical data. So the first
thing is the relationship between Yoda and Luke. There are previous stories Yoda has seen from Luke, say these three stories here, and maybe Yoda liked two of the three. That's our first signal.

The second is the relationship between Yoda and this type of story, a relationship story. Maybe if we look at the data on Yoda, he's liked 50% of the relationship stories that have come through. When he sees someone posting a photo, maybe he only likes 10% of those. Birthday posts, he sees a lot of those and doesn't care; he likes 0.1% of them. So the second piece is the relative rate at which Yoda likes this type of story.

The third high-level feature has nothing to do with Yoda; it has to do with the story. Historically, what have other people done when they saw this story? Maybe these five people around the border here had previously seen the story about Luke adding Darth as his father, and two of them liked it and three of them didn't.

So, at a very high level, these are the types of base features we think about when we're trying to predict these things. To recap: we want to show you the stories you care about and nothing else. The main way we get at that is this linear utility function with hand-picked coefficients, using features like these to predict the probabilities of the different events. And the metrics are the observables we actually have, which we use to try to get at value.

[Audience question, inaudible]

I'm sorry, the Xs here just mean "did not like." Yes, that's ambiguous. It's not the case that they liked it; they just saw it and did nothing. But it's a linear combination, so it's not just likes. I didn't show this other prediction, the probability that somebody hides the story, but
it's a similar sort of thing: we try to estimate the probability that this user hides the story based on how often he or she has hidden things in the past and how often others have hidden that particular story.

Any questions about that so far? Yes.

[Audience question, partly inaudible: would every user have some sort of interaction with everything on their feed?]

Would we like it to be the case that users are interacting with everything, commenting on one thing, liking the next? If you actually interact with stories, if you hide things you don't like and like things you do like, and if everything is working, you should see better content because of that. You're essentially supplying more data to the predictions. So hiding things you don't like will, hopefully, improve the stories you see.

[Audience question, partly inaudible: does this create some kind of selection problem, where we only see stories we like? Should we throw in a story that somebody doesn't like?]

That's a good question. I guess you're thinking of an extreme case where I have very homogeneous preferences, and it turns out I'm seeing essentially nothing but stories from Darth Vader, or from Luke. That's a good point. I'm not familiar enough with how the models are trained, but I imagine there's some exploration being done.

[Audience comment, partly inaudible, about more specific personalized features]

Of course, there's a deep set of possible features. What I showed is, at a very high level, some of the types of features that are used. One advantage of this system is that it's very easy to experiment with new features. That was one of Joaquin's main takeaways: the system that has been built
allows for very rapid experimentation. If you have an idea for a feature, you can get it into the system about as fast as you can think of it. So if you have additional features like that, you can throw them in and test them; it's relatively easy to do.

So far I've been talking about organic stories: what is my value for seeing some user-generated content? We can ask the same thing about ads. Joaquin was originally giving an example here of the ideal case of an ad, one that he saw himself. Joaquin used to have a hearing aid, and it just so happened that he saw an ad for a discreet, affordable hearing aid. He had essentially no prior knowledge of it, and it's not something he would have gone to a search engine and searched for. So even where you don't necessarily have intent, there is still potentially a lot of value in seeing ads on Facebook. This is the ideal. We would like all ads to be like this; of course, that doesn't always happen.

Another thing to note about ads is the scale we're dealing with. There are over two million advertisers in the system, and advertisers have multiple ads, so at any given point when a user arrives there are tens of millions of ads to choose from. We have a billion people coming to the site daily. This results in trillions of ads every day that we have to come up with predictions for. The point is that whatever we do has to scale and has to be fast. Furthermore, it has to be accurate, especially on the ads side of things. This is a pointer forward to the second half of the talk, but if you're familiar with sponsored search, it's the same issue: if we incorrectly predict the probability that this ad
is going to get clicked, we're incorrectly boosting it and potentially showing it over some other ad the user would have preferred to see.

This is part of the interface that the advertiser sees. The first point to note is that advertisers can bid for different events, different objectives. This particular advertiser is bidding for users liking its page; that's what it has value for. So there's some value the advertiser has for a like, and some probability that we think there will be a like given an impression. This is the same sort of utility function we were thinking about before, except that instead of the coefficients we chose, the advertiser is expressing its own value for these different events.

The other point is that ads are created all the time, so we have to be able to learn very quickly what the probabilities of these different events are.

[Audience question, inaudible]

Yes. So the question is: what is the probability that this advertiser will have this event occur in this particular context, say, here on the page? I'll say this is the piece I'm least familiar with, but there are a few different models that have been tried, starting with a very simple logistic regression. The main point is that things have to be fast and have to scale, so let's try the simplest things we can and see how they work. The high-level point is that we've iterated through a number of models. First is the logistic regression, which had weights for user IDs and weights for ad IDs, so for each individual user and each individual ad. Next is an iteration on that where there's just a weight on the click-through rate. So there's a lookup table
essentially, that says: what is the historical click-through rate for a given user ID, and what is the historical click-through rate for a given ad ID? It puts those into the logistic regression. These are very simple, very compact models.

We've tried some additional things, like boosted decision trees. The problem there is that, while you gain some accuracy, they cannot be trained online as quickly as some of these other models. Joaquin has a paper on this that I can refer anyone to who's interested, but I'll focus more on the econ side. The takeaway from Joaquin was that they really didn't have any theoretical basis for coming up with this model of boosted decision trees with a logistic regression run on top. It was just something they tried, because it was so easy to experiment with different models. So the main takeaway, again, is that the ability to try these things so quickly has made it very easy to explore different types of models. This is more about the practical elements of prediction than about having a good theoretical basis, or knowing in advance what would work.

One last practical aspect. I said there are a lot of ads that we have to make predictions for in a given day, really more than we can make predictions for in practice. So when a given user shows up, we don't predict the probability of a click for every single ad we could potentially show that user; there's a filtering process. Here, the x-axis shows the course of a week, and the orange line is the number of requests happening throughout that period. The cyclic effect you see is just the time of day: at night in
the US, we're probably just getting fewer requests than we otherwise are. Because it's so easy to add these different layers to the system, what was done here is to have a controller that dynamically adjusts the number of ads that are ranked based on the number of requests happening at that time. If fewer requests are happening in the middle of the night, we can consider making predictions for a larger number of ads per request. What you potentially lose by not doing this is that some of the ads you didn't bother predicting a click-through rate for would, if you had, have turned out to be good things to show. With this dynamic controller we're able to capture some of that lost value.

So Joaquin's main takeaway here, and I think I've said it enough, was about experimenting and having the frameworks in place so that you can very easily try new things and see whether they work in practice. We don't always have a good theoretical basis for the things we try, but it's so easy to try them that sometimes it has worked out pretty well.

That's the machine learning side in a nutshell. Are there any questions on that?

Okay, next. We have these predictions, probabilities of clicks and probabilities of likes. The question now is what we actually do with them to decide not just which organic stories to show, but how to fill out the entire page when a user comes to Facebook. One thing to note is that there are three different types of stories on Facebook. The first group is organic stories generated by users: your friend posting a photo or sharing a video. The second are ads,
submitted by advertisers. The third is what we call long-term value content: content generated by Facebook that we think has some positive value for the ecosystem when we show it to the user. It's kind of small on the slide, but a lot of the examples are recommendations: recommendations for friends, which we think will give you and the potential friend additional content in your feeds; recommendations for groups you might be interested in joining; and, if you're an advertiser as well as a user, recommendations about things you can do to improve your ad campaign, such as adding a conversion pixel.

So the problem we have takes all of this as input. A user shows up, and we have all of these candidate ads, these candidate long-term value stories, and the organic stories, which have already been ranked by the process I described previously. We want to come up with an allocation and payments. Now we're in Chinmay's section, and Chinmay calls the allocation a configuration, I think because it's not just ads; it's allocating the organic stories and this other content as well.

One thing to note is that we are constraining the space of possible allocations, or configurations. The organic stories are already ranked as input, so on the ads side we do not consider flipping organic stories around. We only reason about inserting ads, or long-term value stories, between the organic stories.

[Audience question, inaudible]

Not chronologically. Good question. They're ranked according to, you can think of it as, this linear utility function: the probability of a click times what we think the user's value of that click is. So there's basically some process
upstream of this ads allocation problem that decides on the ordering of the organic stories, based on this sort of utility function. It wouldn't have to be like this; we could decide all of these things at once. But this is a simplification for computational reasons, and also for organizational reasons. There are employees working on these things, and in practice it's nice to have modular areas that people can work on independently.

[Audience comment, largely inaudible, about merging the two problems and how engaging content affects whether users scroll down]

You're saying that we're probably not losing too much by making this simplification, I think. Yes, agreed. It can actually happen, but it would have to be a weird situation in which we'd think that showing the organic stories in a different order than they were originally ranked would be better.

Maybe this is a good point to bring up a difference from the classical model. One of the things I'd like to convey here is the differences between what we're doing and the assumptions that theory normally makes. One difference is this. When you think about the probability of a click, most models have a separability: there's an advertiser effect, some probability that this ad will be clicked on, and then there's a position effect that discounts it multiplicatively. We do have that in our model, but a difference, which actually complicates things, is that we've found different events have different discount rates. For example, the click discount rate is different from the
hiding-a-story discount rate. This flies in the face of the cascade model, in which there's some probability that you get to a story and it has your attention, and then, given that it has your attention, some probability that you click or do whatever the event is. What we've found is that we actually do see different decay rates for these different events.

What this breaks is greedy ranking. (Fortunately, there are a number of other reasons why we can't just greedily rank things anyway, so it doesn't matter.) It's not necessarily the case that we can greedily say this is the best story, this is the second-best story, this is the third-best story. Some story may prefer one slot and some other story may prefer another slot, if one of them has a high probability of getting clicked and the other has a high probability of some other event that decays in a different way. So that's maybe takeaway number one: this is an assumption that is typically made that we are not making, and it adds some complications to what we're doing.

Any other questions? Okay.

So this is our problem now: coming up with this configuration. How do we do it? We actually run a VCG auction. Chinmay says a VCG auction is a necessity here. I think what he's really saying is that GSP will not handle this problem. If you look at these different stories, for one, they're of different sizes, and for two, there are various additional constraints on the ordering of things we can show. For instance, we'll never show an ad, then a friend recommendation, and then an ad. If there's going to be a friend recommendation, we'll
keep these things distinct: friend recommendations, and then ads. So even if we wanted to, this doesn't really fit into the GSP slot model.

For those of you who are unfamiliar, there are two places we put these stories. One is the right-hand side here, which has some fixed length, though individual ads can have different sizes. The second is the feed. This is where we spend most of our time; it's a much larger portion of our revenue. The properties here, again: we have stories of different sizes, and this is essentially an infinitely long page. When a user loads the page, thinking about those inputs, we're plopping the organic stories on top, and then thinking about how to insert sponsored stories while considering how doing so displaces, and so decreases the value of, everything else that's potentially on the page. To make things even more complicated, we also have horizontally scrollable slots. So it's not just that we put ads within the feed from top to bottom; an individual story can also scroll left to right, and ads can potentially go in there too.

So, why VCG? I think this audience will be okay with this reasoning: it's value maximizing and incentive compatible. We put an asterisk by "incentive compatible" because it's really incentive compatible under some assumptions that may not hold in practice. That's a question we're actively thinking about: in reality, how incentive compatible are these auctions? From the engineering perspective, it's also useful because it's general. When we introduced horizontally scrolling ads, there was not any
question about what the allocation rule was going to be; we already knew what it was. So really we're pushing all of the hard work into coming up with an optimization that finds the optimal allocation.

Very quickly, this slide just describes VCG in our context. We have some set of agents and some set of configurations, which are just the possible allocations we could have. The right-hand side is the value a given agent has for some configuration, and the total value of a configuration is just the sum of all the individual values. With VCG, of course, we choose the value-maximizing configuration, and we charge each individual agent its externality: remove the agent, rerun the algorithm, and see how everybody else's value differs.

Now, the first question is how we actually come up with the value an ad has for a given configuration. I've touched on it a little with the utility function, value of a click times probability of a click, but there are other rules as well. Some of them just say that we don't want a given user to see too many ads, so there's some business logic that goes in and potentially changes the values of various configurations. In addition, this is thinking about a single auction in isolation, but of course there are cross-auction dependencies, such as budget constraints, that also play into what we treat as an individual bid.

Then, given these values, we have to solve this optimization problem. Over here we said incentive compatible had an asterisk, but we could put an asterisk next to value maximizing as well. We're trying to find the optimal allocation, but we're taking many shortcuts that are
not actually going to find it in practice; we're doing our best to find it.

[Audience question, inaudible]

Yes. A user comes and loads the feed. If there are some new stories, there's this upstream thing that decides on the ranking of the organic stories, and then we run VCG to decide where to put the sponsored stories and long-term value stories, both in feed and on the right-hand side. That's right, it's on an individual-query basis.

[Audience question, inaudible]

That's a good point. No, we don't do anything like that. Because we can't consider all the ads at once, there's this candidate set, and once we remove that ad, we don't try to get some new ad from the candidate set. Hopefully it wouldn't make that big a difference, because the only time it would matter is if that one ad was on the margin, where it actually would show up. But I don't think anyone has tried to say for sure whether it makes a difference.

So, we've said that advertisers can express values for different objectives. Some advertiser might care about page clicks; someone else might care about conversions on their website. The first thing we do is convert all of these values for different events into a common unit: some underlying value for, say, a given impression. There's a bid for each event and a probability of each event, and the value function, again, is just the dot product of the vector of per-event bids with these probabilities.

One other twist here relative to the classic sponsored search model is that we have these organic stories that we're inserting the ads among. We can say how the value has changed for a particular advertiser that has specified its value, but the utility for these organic stories is not in terms of
dollars, right? And so one thing that we have to do is actually come up with some monetary value for this utility that we've come up with for a given story. And similarly, another difference is that for a given sponsored story there's some probability that the user may hide that sponsored story, right? So even for sponsored stories, there's some advertiser value for that story being shown in this position, but there's also the user value for seeing this ad in this position. So if we think there's some probability that the user is going to hide this particular story, that's essentially a negative bid that is going into this sponsored story's bid. Okay, so that's what I just described here. So essentially what we're doing is, it's not just about the value that the advertisers have for these different stories appearing in different places; we're sort of placing bids on behalf of the user, or the ecosystem, for these different stories appearing in different places. Okay, so any questions on that? So that's sort of looking at this from a single-auction perspective, and now we'll address some of these aspects where maybe we can't look at a single auction in isolation. Yeah, that's a good question. So I can't tell you a number, but I can tell you sort of qualitatively how we do that, and that will come up in, I think, three slides. Good question. All right, so one of these cross-auction dependencies, of course, which is common in sponsored search as well, is budget optimization, right? So this is the interface that an advertiser has: down on the bottom here the advertiser is specifying some bid for clicks, and up top they're specifying some budget, some amount they're wanting to spend. So this advertiser is wanting to spend $700, starting June 9th and lasting about a week. So this actually is not unique to Facebook; sponsored search also has this problem. But we'll say
what we're doing here to handle this. So, right, if we think about this particular bid that the advertiser has expressed, if we entered that bid in all of these auctions, it may turn out that the advertiser spends its budget halfway through its campaign. All right, so this is time on the x-axis. You can think of each of these dots as just being an auction that was happening at that time, and you can even think about these as single-slot auctions or something like that, where this is just the price that the advertiser will pay if it's bidding more than that. So it could be that this advertiser is exhausting its budget halfway through the day. This is bad for a couple of reasons. One is just potentially expectations: this advertiser may be intending to have its budget spent throughout the entire period. But also, if the advertiser actually has the same value for all these different clicks, it would prefer not to be paying that high amount for a click up there when it could have paid less and received a click here. There are sort of two ways that this is handled. One is probabilistic pacing: we'll choose some multiplier and basically probabilistically have this ad going into the auction, trying to find the multiplier that spends the budget smoothly over the course of the period. We actually do something different, which is this multiplicative version: instead of probabilistic, we just scale the bid down so that the budget is spent in this time period. There are arguments for both of these. I don't think that one is just strictly dominating the other; it really depends on what assumptions you're making about the value that advertisers are having here, right? So let's say the advertiser is bidding for clicks but it actually cares about conversions, right? And it may be the case that for one reason or
another, maybe demographics are different, so for these higher bids it's more females versus males, or maybe there's a common-value element, and it could be that the probabilistic version would actually be better. So basically, this is an optimal thing to do under some assumptions about the underlying values that advertisers have for clicks. And to get to the point that was made about what is the actual value that we're thinking about for the organic side: this is sort of another cross-auction dependency, and I think the analog really works well. The analog is the budget pacing, essentially, where we have this multiplier that's scaling the bid up and down to try to get the budget spent smoothly over this period. Similarly, the intuition is that a user is going to have a more negative reaction to ads the more ads they're seeing. So basically we have a controller on the organic bid, and we're scaling it up and down based on how many ads the user has seen previously. So this is sort of a control for the quality that the advertiser is getting, and it's also sort of subsuming reserve prices, in a sense, because scaling this up higher means the ad is displacing more value when it actually is shown. [inaudible audience question] Oh, the right-hand side, yes. Yep. So basically you could think of it as: any given story, whether it's an ad or otherwise, has this organic component regardless of where it shows up, and yeah, we'll scale that up and down based on how many ads users see. Is the control loop roughly keeping the number of ads constant across users, or is this some extra loop that decides how many ads each user should see? Yeah, good question. So it's the former. Basically, just because some user is
more valuable to advertisers than others, we don't want that particular user to just be swamped with ads. So there's some amount of ads, essentially, that a given user should be able to see, and so we'll change this accordingly. So the one where a friend had liked, oh, I see what you're saying. Yeah, so that's sort of in this murky middle ground: is it an ad or is it a story? So that is an ad, but it comes with some, I don't remember what we call it internally, but it's some additional context, where basically we may think that you're more likely to like this thing or click on this thing because we're giving you some additional signal. Yes? Bid lowering and throttling are so different [inaudible] will actually decide? Yeah, that's a good question. I think that's a feature that perhaps we could implement. So really, I think the whole perspective that we're taking has nothing to do with trying to extract revenue; we're really trying to have this system that is providing as much value to advertisers and users as we can. And so the areas for improvement seem to be figuring out how we can better allow advertisers to easily express their preferences. So either they can express their preferences right now but it's just challenging to do so, or maybe the language doesn't even allow them to express that underlying value. So we do have now some incremental version towards that, where you do not have to be paced, so you don't need this multiplicative pacing to bring your bid down. As far as I know, I don't think we offer probabilistic pacing, but it at least allows advertisers to sort of do this on their own, whereas before, if they wanted to do this, they'd set an extremely high budget so that essentially your bid is not going to be discounted, and you're sort of making the advertiser jump through hoops to do
what they want to do. So the more of that we can remove from the system, making things easier for advertisers, the better off I think we'll be. So that's sort of the summary of what we're doing on the machine learning side of predicting these different events, and then how we're actually using that to decide what we're going to show, what these allocations and payments are going to be. I'm happy to chat with anyone about this. I think there are a lot of interesting problems: first, just understanding that last point about what this system is currently not capturing that advertisers care about, and how we can identify this from the data; and second, actually implementing some of these features and understanding what is not being captured right now. So thank you very much. Yeah, so I did not mean that we're considering some different mechanism, but rather it's more an empirical question, right? So there are a number of assumptions that would need to hold for incentive compatibility actually to hold. For instance, that we are actually predicting these different click-through rates and things like that correctly would be one of them. And so it's more about understanding, in the current system, even if you said that the underlying values are expressible, like there is some single value for a click and some value for a like and we can have these things separable, even in that case, does an advertiser have incentives to be changing their bids? So it's more an empirical question rather than coming up with some new mechanism, because really the solution, I think, is just providing the expressiveness that would allow the advertiser to express their values, and once you have that then you can just run VCG, right? So that's really, I think, what the challenge is. So when you talk about probabilistic smoothing, I believe that's for one ad; what happens
when you perform that for multiple ads in one auction? [inaudible] some kind of interactions [inaudible]. Yeah, so I suppose that is an issue. I should say that we're not doing probabilistic pacing; just to be clear, we're doing a multiplicative version. I don't know if anyone who's doing probabilistic pacing wants to comment on this, but I'm guessing that's just an issue that comes up if you're treating these as just independent probabilities, that maybe sometimes you get that draw. So the question was, so we're doing this multiplicative pacing, but the question was, with probabilistic pacing, the concern was that sometimes, say there are two ads that have a probability less than one, and sometimes it just so happens that neither of these ads ends up being eligible, just by the unfortunate luck of the draw. And so I think maybe you were asking if these are actually independent draws, or if maybe there's something joint that is done that's trying to avoid that. Yeah, so the answer to that is just that we're not doing probabilistic pacing, so maybe we'll have to think about that, because advertisers potentially do value this, I think. But at least you're not having this situation where nobody is participating, right? It's just that people are lowering their bids. Thank you very much.
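Editor's sketch (not from the talk): the two mechanisms described in the captions, the per-impression value as a dot product of per-event bids and predicted probabilities (with a likely hide acting as a negative bid), and multiplicative versus probabilistic budget pacing, can be illustrated roughly as below. This is a minimal sketch under stated assumptions, not Facebook's implementation: every function name, the `hide_cost` parameter, the single-slot "pay the posted price if your bid exceeds it" model, and the idea that all prices for the period are known in advance are hypothetical simplifications chosen only to make the ideas concrete.

```python
import random


def impression_value(bids, probs, hide_prob=0.0, hide_cost=0.0):
    """Dot product of per-event bids and predicted event probabilities.

    hide_prob * hide_cost is an assumed monetary stand-in for the user-side
    cost of a probable hide; it enters as a negative bid.
    """
    value = sum(bids[event] * probs.get(event, 0.0) for event in bids)
    return value - hide_prob * hide_cost


def spend(bid, prices, multiplier):
    """Total paid over single-slot auctions when the bid is scaled down.

    Assumed model: the ad wins an auction when its scaled bid exceeds the
    price, and then pays that price.
    """
    return sum(p for p in prices if multiplier * bid > p)


def multiplicative_multiplier(bid, prices, budget, iters=50):
    """Bisection for a bid multiplier in [0, 1] that keeps spend within budget.

    Spend is non-decreasing in the multiplier, so bisection is valid.
    """
    if spend(bid, prices, 1.0) <= budget:
        return 1.0  # no pacing needed
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if spend(bid, prices, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo


def probabilistic_spend(bid, prices, entry_prob, rng):
    """Probabilistic pacing: the ad enters each auction with probability
    entry_prob and bids its full, unscaled bid when it does enter."""
    return sum(p for p in prices if rng.random() < entry_prob and bid > p)


if __name__ == "__main__":
    prices = [1.0, 3.0, 2.0, 4.0, 1.5, 2.5]  # made-up auction prices over time
    m = multiplicative_multiplier(bid=5.0, prices=prices, budget=5.0)
    print("multiplier:", m, "spend:", spend(5.0, prices, m))
    print("probabilistic spend:",
          probabilistic_spend(5.0, prices, 0.4, random.Random(0)))
```

The contrast the talk draws shows up directly in this toy model: the multiplicative version wins only the cheapest auctions, which suits an advertiser who values every click equally, while the probabilistic version samples auctions at all price levels, which may be preferable if higher-priced auctions differ in ways the bid does not capture.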
Info
Channel: Simons Institute
Views: 14,663
Rating: 4.9405942 out of 5
Keywords: Simons Institute, UC Berkeley, computer science, theory of computing, Economics and Computation, Eric Sodomka
Id: 94s0yYECeR8
Length: 54min 20sec (3260 seconds)
Published: Fri Nov 20 2015