What Makes Them Click: Empirical Analysis of Consumer Demand for Search Advertising

Przemyslaw Jeziorski, Ilya R. Segal
2009 Social Science Research Network  
We study users' response to sponsored-search advertising using data from Microsoft's Live AdCenter distributed in the "Beyond Search" initiative. We estimate a structural model of utility-maximizing users, which quantifies "user experience" based on their "revealed preferences," and predicts user responses to counterfactual ad placements. In the model, each user chooses clicks sequentially to maximize his expected utility under incomplete information about the relevance of ads. We estimate the substitutability of ads in users' utility function, the fixed effects of different ads and positions, user uncertainty about ads' relevance, and user heterogeneity. We find substantial substitutability of ads, which generates large negative externalities: 50% more clicks would occur in a hypothetical world in which each ad faces no competition. As for counterfactual ad placements, our simulations indicate that CTR-optimal matching increases CTR by 15% while user-optimal matching increases user welfare by 25% (and neither coincides with assortative matching). Moreover, targeting ad placement to specific users could raise user welfare by 60%. Finally, user welfare could be raised by nearly 15% if users had full information about the relevance of ads to them.

* The authors are grateful to Microsoft Corp. for providing the data and computing facilities and hosting them during the summer of 2008. The second author also acknowledges the support of the Toulouse Network for Information Technology.

Over the past decade the Internet has become the dominant channel for consumer information about goods and services. A substantial fraction of this information is provided through Internet advertising. In 2007, Internet advertising revenues rose 26 percent to reach $21.2 billion, according to the Internet Advertising Revenue Report published by the Interactive Advertising Bureau and PricewaterhouseCoopers LLP (see footnote 1).

To gain understanding of the online advertising market, compare alternative market structures and designs, and examine their welfare effects, it is important to understand the behavior of consumers in this market. Our paper makes a step in this direction, focusing on "search advertising," i.e., "sponsored links" that accompany results produced in response to the consumers' search queries. Search advertising accounts for 41% of the total Internet advertising revenues.
It is viewed as the most effective kind of advertising because of its very precise targeting: a consumer's search string reveals a great deal about the products (s)he is likely to be interested in. This precise targeting makes it possible to display only the most relevant ads, which in turn induces consumers to click on the ads.

While the market for search advertising has recently received a lot of attention, not much is known about consumer behavior in the market. This paper makes a step towards remedying this problem. Existing papers on search advertising postulate very simple and restrictive models of user behavior. For example, Edelman, Ostrovsky, and Schwarz (2007) propose a model that assumes that the CTR (clickthrough rate) on a given ad in a given position is the product of ad-specific and position-specific effects and does not depend on which other ads are displayed in the other positions. (Henceforth we will refer to this model as the "EOS model.") Another model, the "cascade model," assumes that users consider the ads sequentially from top to bottom, deciding whether to click on the current ad and whether to continue clicking with ad-specific probabilities. These restrictive models have not been compared with actual user behavior. Also, as these models have not been derived from utility-maximizing behavior of users, they could not be used to evaluate user welfare.

Footnote 1: http://www.scribd.com/doc/4787183/Internet-advertising-revenue-report-for-2007

This paper offers the first empirical investigation of user response to sponsored-search advertising that is based on a structural model of utility-maximizing user behavior. One advantage of a structural model over reduced-form models is that once the model's parameters are estimated and its fit with the data is established, the model can be used to predict user behavior for all conceivable counterfactual ad impressions.
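
To make the contrast between the two existing models concrete, the following is a minimal Python sketch, not the authors' code. It assumes the simplest reading of each description: in the EOS model the CTR of a slot is the ad effect times the position effect, and in the cascade model the user reaches each slot only if he continued past every ad above it. The function names `eos_ctr` and `cascade_ctr`, the single continuation probability per ad, and all numbers are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def eos_ctr(ad_effects: Sequence[float], position_effects: Sequence[float]) -> List[float]:
    """Product-form CTRs: the CTR in slot k is ad_effects[k] * position_effects[k]."""
    return [a * p for a, p in zip(ad_effects, position_effects)]


def cascade_ctr(click_probs: Sequence[float], continue_probs: Sequence[float]) -> List[float]:
    """Top-to-bottom cascade: slot k is seen only if the user continued past slots 0..k-1.

    Assumption of this sketch: each ad has one click probability and one
    continuation probability, regardless of whether it was clicked.
    """
    ctrs = []
    reach = 1.0  # probability that the user gets as far as the current slot
    for click, cont in zip(click_probs, continue_probs):
        ctrs.append(reach * click)
        reach *= cont
    return ctrs


# Hypothetical numbers: the same ad sits in slot 2 (index 1) in both
# impressions, but the competitor in slot 1 differs.
position_effects = [1.0, 0.5]
eos_a = eos_ctr([0.6, 0.2], position_effects)
eos_b = eos_ctr([0.3, 0.2], position_effects)
print("EOS slot-2 CTR:", eos_a[1], eos_b[1])  # unaffected by the competitor

cas_a = cascade_ctr([0.6, 0.2], [0.4, 0.9])
cas_b = cascade_ctr([0.3, 0.2], [0.8, 0.9])
print("Cascade slot-2 CTR:", cas_a[1], cas_b[1])  # depends on the ad above
```

In this sketch the EOS CTR of an ad never depends on its competitors, while the cascade CTR depends on the ads shown above it but never on the ads shown below it.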
Another advantage of the model is that it quantifies the "user experience" on a sponsored-search impression as users' expected utility from the impression, and estimates this utility from the preferences of actual users revealed by their clicking behavior, rather than from the judgements of disinterested experts (as in Carterette and Bennett (2008); see footnote 2). Improving user experience is crucial for the survival and growth of an internet platform, and our model can be used as a guide toward that goal.

Our dataset offers a selection of advertising impressions and user clicking behavior on Microsoft's Live Search advertising engine. The data contains a random selection of search sessions between August 10 and November 1, 2007. In each session, the user entered a search string and was then shown "organic" search results accompanied by advertisements ("sponsored links"). An advertising "impression" is an ordered list of sponsored links. (The first sponsored link is displayed at the top of the page in a highlighted box, while the others are displayed in a column to the right of the organic search results.) For each advertising impression, our data describes the ads clicked by the user and the times at which the clicks occurred.

Our estimation strategy is based on the fact that searches on the same search string often generate different advertising impressions. We treat this variation in impressions as exogenous and uncorrelated with users' characteristics. Indeed, we have been assured that the impressions were not conditioned on the user's known characteristics or browsing history. We also make the crucial assumption that the characteristics of ads that determine users' values for them did not vary over our 3-month window. This assumption appears plausible for the four search strings we consider: "games," "weather," "white pages," and "sex."
(See footnote 3.) In fact, it is easy to convince oneself of the large random component in ad placement by searching for the same search string several times in a row. The ad placement results from several fast-changing factors, such as advertisers' varying bids and budgets, the advertising engine's estimate of the ad's relevance based on its historical clickthrough rate (CTR), and explicit experimentation by the engine. We believe that at least on our search strings, this randomness swamps any possible shifts in the ads' relevance.

Footnote 2: Dupret and Piwowarski (2008) quantify ad quality by calibrating a heuristic model of user behavior on real data. However, since their model is not based on utility maximization, it cannot be used to quantify user welfare.

Footnote 3: To understand the importance of this assumption, imagine that the preferences of users searching for "Paris Hilton" changed abruptly from looking for a hotel in the capital of France to looking for the infamous sex video, and that the advertising engine quickly responded to this preference change by changing the placement of ads. In this situation, our estimation strategy would be invalid: for example, it might wrongly find that putting an ad in the top position raises its CTR when in fact it may just be that the engine puts the most relevant ad at the top and there is no position effect for any given ad. Microsoft plans to release a dataset in which ad impressions are truly randomized and independent of ad characteristics, an initiative known as the "adCenter challenge": http://research.microsoft.com/workshops/ira2008/ira2008 talk.pdf Repeating our analysis on this dataset would eliminate any possible concerns about the endogeneity of impressions.

We begin by examining reduced-form evidence that contradicts the existing theoretical models and suggests some dimensions in which the models need to be enriched.
In particular, the EOS model is contradicted by the prevalence of externalities across ads: the CTR on a given ad in a given position depends on which ads are shown in other positions. For example, the CTR of Domain 1 in position 2 on the "white pages" search string is 18% if its competitor in position 1 is Domain 3 (which is not a good match for "white pages" because it offers yellow pages), but drops to 8% if the competitor is Domain 2 (which is a specialized advertising company; see footnote 4). This difference is statistically significant. The "cascade model" is contradicted by the observation that 46% of the users who click on ads do not click sequentially on positions (1, 2, ...), and 57% of the users who click more than once do not "cascade," i.e., click on a higher position after clicking on a lower position. Also, the data exhibits certain kinds of externalities that could not emerge in the cascade model: the CTR on a given ad in a given position depends on which ads are shown below it, and the CTR on a given ad in position 3, given the two ads shown in positions 1 and 2, still depends on the order in which the two ads are shown.

Next, we formulate and estimate a structural model of rational user behavior that nests the existing models. In our model, a user chooses his clicks sequentially under uncertainty about the relevance of ads to him. The model is related to the literature on consumer search (e.g., Hong and Shum (2006), Hortacsu and Syverson (2004)), the closest work being Kim, Albuquerque, and Bronnenberg (2009), which estimates online search for durable goods at Amazon.com. The latter paper assumes full satiation: a consumer gets utility from at most one purchase. Our model instead parameterizes the degree of substitutability (satiation) among ads with a parameter R in a "Constant Elasticity of Substitution" utility function. For R = 0, user utility is the sum of the utilities derived from the clicked ads, and so there are no externalities across ads, as in the EOS model.
At the other extreme, when R = ∞, user utility is the maximum of the values of the ads he clicks on, and so he derives utility from at most one ad, and the externalities are the most prominent

Footnote 4: The domain names are available in the dataset, but Microsoft does not allow us to publish them to protect advertiser privacy.
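
The two limiting cases of R described above can be illustrated with a short Python sketch. The exact functional form of the paper's utility function is not given in this excerpt, so the sketch assumes one standard CES-style aggregator, U = (sum of v_i^(1+R))^(1/(1+R)) over the clicked ads with nonnegative values v_i, chosen only because it reproduces the two stated limits (sum at R = 0, maximum as R goes to infinity). The function name `ces_utility` and the ad values are hypothetical.

```python
import math
from typing import Sequence


def ces_utility(values: Sequence[float], r: float) -> float:
    """Assumed CES-style aggregate of nonnegative ad values with satiation parameter r >= 0.

    r = 0 gives the sum of the values (no satiation);
    r = infinity gives the maximum value (full satiation).
    """
    vals = [v for v in values if v > 0]
    if not vals:
        return 0.0
    top = max(vals)
    if math.isinf(r):
        return top
    p = 1.0 + r
    # Factor out the largest value so that large r does not overflow.
    return top * sum((v / top) ** p for v in vals) ** (1.0 / p)


# Hypothetical values of two clicked ads.
clicked = [1.0, 2.0]
for r in (0.0, 1.0, 5.0, math.inf):
    total = ces_utility(clicked, r)
    # Extra utility from the weaker ad once the stronger ad is already clicked.
    gain = total - ces_utility([2.0], r)
    print(f"R={r}: utility={total:.4f}, marginal gain of second ad={gain:.4f}")
```

Under this assumed form, the marginal gain from clicking an additional ad shrinks as R grows, which is the mechanism behind the negative externalities across ads.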
doi:10.2139/ssrn.1417625 fatcat:2cljkbphh5eqzohvfog3zqrxi4