A Large-Scale Field Experiment to Evaluate the Effectiveness of Paid Search Advertising
(with Uri Gneezy, Lorenz Goette)
Companies spend billions of dollars online for paid links to branded search terms. Measuring the effectiveness of this marketing spending is hard. Blake, Nosko and Tadelis (2015) ran an experiment with eBay, showing that when the company suspended paid search, most of the traffic still ended up on its website. Can findings from one of the largest companies in the world be generalized? We conducted a similar experiment with Edmunds.com, arguably a more representative company, and found starkly different results. More than half of the paid traffic is lost when we shut off paid-links search. These results suggest money spent on search-engine marketing may be more effective than previously documented.
Limits of friendship networks in predicting epidemic risk
(with Massimo Franceschetti, Manuel Garcia-Herranz, Iyad Rahwan)
The spread of an infection on a real-world social network is determined by the interplay of two processes - the dynamics of the network, whose structure changes over time according to the encounters between individuals, and the dynamics on the network, whose nodes can infect each other after an encounter. Physical encounter is the most common vehicle for the spread of infectious diseases, but detailed information about said encounters is often unavailable because expensive, unpractical to collect or privacy sensitive. The present work asks whether the friendship ties between the individuals in a social network successfully predict who is at risk. Using a dataset from a popular online review service, we build a time-varying network that is a proxy of physical encounter between users and a static network based on their reported friendship. Through computer simulations, we compare infection processes on the resulting networks and show that, whereas distance on the friendship network is correlated to epidemic risk, friendship provides a poor identification of the individuals at risk if the infection is driven by physical encounter. Our analyses suggest that such limit is not due to the randomness of the infection process, but to the structural differences of the two networks. In addition, we argue that our results are not driven by the static nature of the friendship network as opposed to the time-varying nature of the encounter network, as a static version of the encounter network provides more accurate prediction of risk than the friendship network. In contrast to the macroscopic similarity between processes spreading on different networks, the differences in local connectivity determined by the two definitions of edges result in striking differences between the dynamics at a microscopic level, preventing the identification of the nodes at risk. Despite the limits highlighted, we show that periodical and relatively infrequent monitoring of the real infection on the encounter network allows to correct the predicted infection on the friendship network and to achieve satisfactory prediction accuracy. In addition, the friendship network contains valuable information to effectively contain epidemic outbreaks when a limited budget is available for immunization.
What do your friends tell about you? Prediction of individual political participation using social network information
(with Robert Bond, Massimo Franceschetti, James Fowler)
The availability of massive repositories of social interaction has opened new horizons in the prediction of human behavior, in which the social contacts, or friends, of an individual constitute data about the individual. We consider millions of people who use the social network Facebook and study how their connections and interaction help predict their future political participation. We consider both the aggregate demographics and past behavior of one's friend. We also compare different definition of social network, and we compare close friends versus randomly chosen friends.
Group buying with bundle discounts: computing efficient, stable and fair solutions
(with Massimo Franceschetti)
We model a market in which nonstrategic vendors sell items of different types and
offer bundles at discounted prices triggered by demand volumes. Each buyer acts strategically in order to maximize her utility, given by the difference between product valuation and price paid. Buyers report their valuations in terms of reserve prices on sets of items, and might be willing to pay prices different than the market price in order to subsidize other buyers and to trigger discounts. Differences between implemented prices and market prices can be interpreted in terms of transferable utility, a form of cooperation between buyers. The core of the market is in general empty, therefore we consider a notion of stability that looks at unilateral deviations. We show that efficient allocations - the ones maximizing the social welfare - can be stabilized by prices that enjoy desirable properties of rationality and fairness. These dictate that buyers pay higher prices only to subsidize buyers who contribute to the activation of the desired discounts, and that subsidies are balanced between buyers. Building on this existence result, and letting N, M and c be the numbers of buyers, vendors and product types, we propose a O(N^2+NM^c) algorithm that, given an efficient allocation, computes prices that are rational and fair and that stabilize the market. The algorithm consists in first determining transfers between groups of buyers with an equal product choice, and then using them to compute single buyers' prices. Our results show that if cooperation is allowed then social efficiency and stability can coexists in a market presenting subtle externalities, and determining the right amount of cooperation is computationally tractable.
An instance of distributed social computation: the multi-agent group formation problem
(with Massimo Franceschetti)
We consider a scenario in which agents in a network are rewarded if they collectively solve a specific group formation task from only local distributed interactions. We propose a simple distributed algorithm with a number of desirable features aimed at modeling distributed social computation: the algorithm is self-stabilizing, decisions are made using only local information, the exchanged messages are minimal and can be represented by a single bit, agents pursue local stability, and have no memory of the past. We characterize mathematically the tradeoff between the quality of a solution and the time needed toreach it, proving that a good solution is always found quickly while improving to the optimum might require excessive time. We also conducted laboratory experiments where a group of human subjects were rewarded if they solved the same group formation task. We observed a good fit between the humans' performance and our algorithm's predictions, showing that the global dynamics of agents with diverse strategies can be well described by simple, synthetic agents with a uniform strategy of interaction. This suggests that simple models of distributed computation can be constructed to predict the performance of real populations of humans solving computational problems over networks, addressing a question recently posed by Kearns (2012).
Words on the web: contagion of semantic expression in on-line social networks
(with James Fowler, Massimo Franceschetti)
Is semantic expression contagious? And if so, how does it spread online from person to person?
To address these questions we developed a model to detect and quantify contagion of semantic expression in massive social networks.
In this model we combine geographic aggregation and instrumental variables regression to measure
1) how an exogenous variable directly affects an individual's expression and 2) how this exogenous change in expression influences the expression of others to whom that individual is socially connected.
We apply our method to the content posted by millions of Facebook users over a period of one year, and use exogenous changes in local rainfall to estimate which semantic categories are contagious between social contacts.
This analysis not only allows us to see whether usage of words in the same semantic category spread -- it also allows us to estimate a signed relationship between different semantic categories, showing how an increase in the usage of one category alters the usage of another category in one's social contacts.
We can then use these results to estimate the total cumulative effect that a user has on all her friends. The results suggest that each post in one semantic category can cause friends to generate one to two additional posts in the same category, and it can also have positive or negative effects on friends' posts in other categories as well. The categories that survived corrections for multiple testing included emotional, sensory, and sexual words.
Detecting Emotional Contagion in Massive Social Networks
(with Yunkyu Sohn, Adam Kramer, Cameron Marlow, Massimo Franceschetti, Nicholas Christakis, James Fowler)
Happiness and other emotions spread between people in direct contact, but it is unclear whether massive online social networks also contribute to this spread. Here, we elaborate a novel method for measuring the contagion of emotional expression. With data from millions of Facebook users, we show that rainfall directly influences the emotional content of their status messages, and it also affects the status messages of friends in other cities who are not experiencing rainfall. For every one person affected directly, rainfall alters the emotional expression of about one to two other people, suggesting that online social networks may magnify the intensity of global emotional synchrony.
From posting to voting: the effect of political competition on online political engagement
(with Jaime Settle, Robert Bond, Chris Fariss, James Fowler, Jason Jones, Adam Kramer, Cameron Marlow)
How does living in a battleground state during a presidential election affect an individual’s political engagement? We utilize a unique collection of 113 million status updates on Facebook to answer this persistent question in the study of political behavior by comparing users’ political discussion during the 2008 presidential election in politically competitive versus uncompetitive states. "Battleground" state users are significantly more likely to discuss politics in the campaign season than are "blackout" state users, and posting a political status update—a form of day-to-day engagement with politics— affects a person’s self-reported voter turnout, mediating approximately 20% of the relationship between exposure to political competition and turnout. While our results can only be generalized to the tens of millions of Facebook users and not the entire American voting age population, this paper is one of the first efforts to use the massive quantity of data generated through online social media sites to explain the microfoundations of how people think, feel, and act on a daily basis in response to their political environment.
Stabilization over Markov feedback channels: the general case
(with Paolo Minero, Massimo Franceschetti)
The problem of mean-square stabilization of a discrete-time linear dynamical system over a Markov time-varying digital feedback channel is studied. In the scalar case, it is shown that the system can be stabilized if and only if a Markov jump linear system describing the evolution of the estimation error at the decoder is stable - videlicet if and only if the product of the unstable mode of the system and the spectral radius of a second moment matrix that depends only on the Markov feedback rate is less than one. This result generalizes several previous data rate theorems that appeared in the literature, quantifying the amount of instability that can be tolerated when the estimated state is received by the controller over a noise-free digital channel. In the vector case, a necessary condition for stabilizability is derived and a corresponding control scheme is presented, which is tight in some special cases and which strictly improves on a previous result on stability over Markov erasure channels.
Human matching behavior in social networks: an algorithmic perspective
(with Massimo Franceschetti, Mathew D. McCubbins, Ramamohan Paturi, Andrea Vattani)
We argue that algorithmic modeling is a powerful approach to understanding the collective dynamics of human behavior. We consider the task of pairing up individuals connected over a network, according to the following model: each individual is able to propose to match with and accept a proposal from a neighbor in the network; if a matched individual proposes to another neighbor or accepts another proposal, the current match will be broken; individuals can only observe whether their neighbors are currently matched but have no knowledge of the network topology or the status of other individuals; and all individuals have the common goal of maximizing the total number of matches. By examining the experimental data, we identify a behavioral principle called prudence, develop an algorithmic model, analyze its properties mathematically and by simulations, and validate the model with human subject experiments for various network sizes and topologies. Our results include i) a 1/2-approximate maximum matching is obtained in logarithmic time in the network size for bounded degree networks; ii) for any constant x, a (1-x)-approximate maximum matching is obtained in polynomial time, while obtaining a maximum matching can require an exponential time; and iii) convergence to a maximum matching is slower on preferential attachment networks than on small-world networks. These results allow us to predict that while humans can find a ''good quality'' matching quickly, they may be unable to find a maximum matching in feasible time. We show that the human subjects largely abide by prudence, and their collective behavior is closely tracked by the above predictions.
Query incentive networks with split contracts: robustness to individuals' selfishness
(with Manuel Cebrian, Andrea Vattani, Panagiotis Voulgaris)
The present work deals with the problem of information acquisition in a strategic networked environment. To study this problem, Kleinberg and Raghavan (FOCS 2005) introduced the model of query incentive networks, where the root of a binomial branching process wishes to retrieve an information (known by each node independently with probability 1/n) by investing as little as possible. The authors considered fixed-payment contracts in which every node strategically chooses an amount to offer its children (paid upon information retrieval) to convince them to seek the information in their subtrees. Kleinberg and Raghavan discovered that the investment needed at the root exhibits an unexpected threshold behavior that depends on the branching parameter b. For b > 2, the investment is linear in the expected distance to the closest information (logarithmic in n, the rarity of the information), while, for 1 < b < 2, it becomes exponential in the same distance (i.e., polynomial in n). Arcaute et al. (EC 2007) later observed the same threshold behavior for arbitrary Galton-Watson branching processes. The DARPA Network Challenge (retrieving the locations of ten balloons placed at undisclosed positions in the US) has recently brought practical attention to the problems of social mobilization and information acquisition in a networked environment. The MIT Media Laboratory team won the challenge by acting as the root of a query incentive network that unfolded all over the world. However, rather than adopting a fixed-payment strategy, the team implemented a different incentive scheme based on 1/2-split contracts. Under such incentive scheme, a node u who does not possess the information can recruit a friend v through a contract stipulating that if the information is found in the subtree rooted at v, then v has to give half of her own reward back to u.
Motivated by its empirical success, we present a comprehensive theoretical study of this scheme
in the game theoretical setting of query incentive networks. Our main result is that split contracts are robust (as opposed to fixed-payment contracts) to nodes' selfishness. Surprisingly, when nodes
determine the splits to offer their children based on the contracts received from their recruiters, the threshold behavior observed in the previous work vanishes, and an investment linear in the expected distance to the closest information is sufficient to retrieve the information in any arbitrary Galton-Watson process with b > 1. finally, while previous analyses considered the parameters of the branching process as constants, we are able to characterize the rate of the investment in terms of the branching process and the desired probability of success. This allows us to show improvements even in other special cases.