Research on Algorithms for the Internet
General information about RAIN
Spring 07-08
Time: 2-3pm Fridays; Room: Gates 498.
| Apr 11, 2008 | BAGT | |
| Apr 18, 2008 | Robbie Yan | Reputation Markets |
| Apr 25, 2008 | Cancelled for WWW | |
| May 2, 2008 | Bobji Mungamuru | Click Fraud |
| May 9, 2008 | Ying Xu | Oral Examination |
| May 16, 2008 | Esteban Arcaute | Network Formation Games |
| Mar 23, 2008 | Christina Aperjis | Reputation Systems |
| May 30, 2008 | Evimaria Terzi | Inferring Links and Initiators |
| June 6, 2008 | Yury Lifshits | Social Design |
Winter 07-08
Time: 2:15-3:15pm Fridays; Room: Gates 498.
| Jan 25, 2008 | Alon Altman | An Axiomatic Approach to Personalized Ranking Systems |
| Feb 1, 2008 | Ashish Goel | Open Problems |
| Feb 8, 2008 | Cancelled | |
| Feb 15, 2008 | Brian Babcock | Adchemy |
| Feb 22, 2008 | Rajat Bhattacharjee | Oral Examination |
| Feb 29, 2008 | Panagiotis Papadimitriou | Web Graph Similarity for Anomaly Detection |
| Mar 7, 2008 | Brad Null | Ratio Index for Budgeted Learning, with Applications |
| Mar 14, 2008 | Mukund Sundararajan | Optimal Marketing Strategies over Social Networks |
Autumn 07-08
Time: 5-6pm Mondays; Room: Gates 459.
| Oct 15, 2007 | Michael Schapira | Interdomain Routing and Games |
| Oct 22, 2007 | Esteban Arcaute | Decentralized Search of Random Graphs |
| Oct 29, 2007 | Orkut Buyukkokten | Social Network Revolution |
| Nov 5, 2007 | Aranyak Mehta | Random input models for Adwords |
| Nov 12, 2007 | Phil Long | Boosting the area under the ROC curve |
| Nov 19, 2007 | Thanks-giving | |
| Nov 26, 2007 | cancelled | |
| Dec 3, 2007 | Arash Asadpour | Ad auctions with reserved price |
Social Design
Yury Lifshits, Caltech, http://yury.name
Social Design is an art and science of setting and managing incentives in
web applications. So far, machine learning is the dominant approach to all
central algorithmic problems in the Web: search, advertising, recommendation
systems. Social design can become an important counterpart to ML, if we find
the ways to incentivise users to give us more and better input data and even
the answers themselves.
This talk is an outline of a new research field of Social Design. We start
with several stories including Internet press conference of president Putin,
reputation crash in largest Russian social network, and Yahoo! SearchMonkey
project. Then we attempt to extract, characterize and classify the commonly
used patterns in social design. Finaly, we set up the research agenda in the
area with focus on social aspects of semantic web and advertising
technologies.
Slides available at http://yury.name/talks.html
Bio: Yury Lifshits obtained his PhD degree from Steklov Institute of Mathematics
at St.Petersburg (2007). He is currently a postdoc at Caltech. He won two
gold medals at International Mathematical Olympiads, received the Best Paper
Award in Application Track of CSR'07 and the Yandex Teaching Award (2006)
for his course "Algorithms for Internet". Yury is a maintainer of "The
Homepage of Nearest Neighbors": http://simsearch.yury.name
Inferring Links and Initiators
Evimaria Terzi,IBM
Consider a 0-1 observation matrix M, where rows correspond to entities and columns correspond to signals; a value
of 1 (or 0) in cell (i; j) of M indicates that signal j has been observed (or not observed) in entity i. Given such a matrix
we study the problem of inferring the underlying directed links between entities (rows) and finding which entities act as initiators.
I will formally define this problem and propose an MCMC framework for estimating the links and the initiators given the matrix of observations M. I will also show how this framework can be extended to incorporate a temporal aspect; instead of considering a single observation matrix M we consider a sequence of observation matrices M1; : : : ;Mt over
time. Finally I will show that this problem is a generalization of several
problems studied in the field of social-network analysis.
Bio: Evimaria Terzi is a Research Staff Member at IBM Almaden since June 2007. She obtained her PhD from the University of Helsinki (Finland) in January 2007, MSc from Purdue
University (USA) in 2002
and BSc from the University of Thessaloniki (Greece) in 2000.
Reputation Mechanisms for Online Markets
Christina Aperjis
In online marketplaces potential buyers have to decide whether to buy an item and how much to pay for it without being able to observe it and without knowing whether the seller is trustworthy. Reputation mechanisms promote trust by giving information about past transactions. With the goal of incentivizing sellers to advertise truthfully, we study how the marketplace is affected by (i) the reputation mechanism, and (ii) the way buyers interpret the seller?s reputation score.
Two-Stage Myopic Dynamics in Network Formation Games
Esteban Arcaute
We consider two-stage myopic dynamics for network formation
games, where utility to a node is based on a function of
the distance to all other nodes. Since our network formation game
is a generalization of Myerson's announcement game, we use
pairwise Nash stability as the solution concept.
We prove that, although the price of anarchy of the static game is
unbounded (in the number of nodes in the network), our dynamics
converge to a subset of pairwise Nash stable networks with a constant
efficiency ratio. For some specific utility functions, we prove that
our dynamics converge to efficient networks.
This is joint work with Ramesh Johari and Shie Mannor.
Measuring and Modeling the Web
Ying Xu
The last couple of decades have witnessed a phenomenal growth in
the World Wide Web. The Web has become an ubiquitous channel
for information sharing and dissemination. This talk describes several
research contributions in an endeavor towards a better understanding
of the Web.
One of the basic questions about the Web is its size, i.e., how many
webpages there are on the Web. A closely related problem is the sizes
of search engine indexes, as search engine index defines an important
subset of the Web that is most easily accessible to users. In the last
ten years both the scientific literature and the popular press dealt at
length with methodologies and estimates for the sizes of the Web and
public search engines. In the first part of the talk, we propose
algorithms for measuring the Web size, including both empirical and
theoretical results.
The Web is so large that even estimating its size is a hard problem.
To better understand the Web, people study random web graph models
-- a stochastic process that generates random graphs that with high
probability resemble the Web graph. Random web graph models are
useful theoretical tools to provide insights to the evolution of the Web
graph, to predict future behaviors, and to generate synthetic data sets
for testing and simulation. In the second half of the talk, we briefly
present several results in analyzing web graph models and applying the
models to real world applications.
Should ad networks bother fighting click fraud?
Bobji Mungamuru
Suppose an ad network decides that a given click-through is invalid (or "fraudulent"). The implication is that the ad network will not charge the advertiser for that click-through. Therefore, arguably, the ad network is "taking money out of his own pocket" by marking clicks invalid. As such, should ad networks even bother fighting fraud? We analyze a simple economic model of the online advertising market, and conclude that the answer is "yes".
Effective reputation systems should incentivize users to acquire and reveal information about content quality. They should also counter distortions induced by strategic agents that benefit from manipulating reputations. Mechanisms have been proposed in the recent literature with the aim to provide such incentives. These mechanisms, which we will refer to as reputation markets, can be viewed as information markets designed to assess the quality of online content. In this paper we develop equilibrium models to study how incentives created by a reputation market should influence community behavior and the accuracy of assessments.
Optimal Marketing Strategies over Social Networks
Mukund Sundararajan
Abstract: We study revenue maximization in social networks. We identify a
family of strategies called 'influence and exploit' strategies that are easy
to implement, easy to optimize over and approximately optimal.
Based on a paper that will appear in WWW 08. Jointly with Jason Hartline
(Northwestern University) and Vahab Mirrokni (MSR Redmond).i
Ratio Index for Budgeted Learning, with Applications
Brad Null
In the budgeted learning problem, we are allowed to experiment on a set of
alternatives (given a fixed experimentation budget) with the goal of picking
a single alternative with the largest possible expected payoff.
Approximation algorithms for this problem were developed by Guha and
Munagala by rounding a linear program that couples the various alternatives
together. We present an index for this problem, which we call the ratio
index, which also guarantees a constant factor approximation. Index based
policies have the advantage that a single number (i.e. the index) can be
computed for each alternative irrespective of all other alternatives, and
the alternative with the highest index is experimented upon. This is
analogous to the famous Gittins index for the discounted multi-armed bandit
problem.
The ratio index has several interesting structural properties. First, it can
be computed in strongly polynomial time; the techniques we develop may be
useful for deriving strongly polynomial algorithms for other Markov Decision
Problems. Second, with the appropriate discount factor, the Gittins index
and the ratio index are constant factor approximations of each other, and
hence the Gittins index also gives a constant factor approximation to the
budgeted learning problem. Finally, the ratio index can be used to obtain a
"discount oblivious" solution to the multi-armed bandit problem, which in
turn gives constant factor approximations to a wide class of problems that
we call "list ranking problems".
Joint work with Rajat Bhattacharjee, Ashish Goel, and Sanjeev Khanna
Web Graph Similarity for Anomaly Detection
Panagiotis Papadimitriou
Web graphs are approximate snapshots of the web, created by search engines. Their creation is an error-prone procedure that relies on the availability of Internet nodes and the faultless operation of multiple software and hardware units. Checking the validity of a web graph requires a notion of graph similarity. Web graph similarity helps measure the amount and significance of changes in consecutive web graphs. These measurements validate how well search engines acquire content from the web. In this work we study five similarity schemes: three of them adapted from existing graph similarity measures and two adapted from well-known document and vector similarity methods. We compare and evaluate all five schemes using a sequence of web graphs for Yahoo! and study if the schemes can identify anomalies that may occur due to hardware or other problems.
I'll talk about a number of computational problems that arise in online
advertising, primarily from the perspective of the buyer of advertising.
Topics to be discussed include:
* When buying sponsored search links from search engines, how does one
manage a portfolio of millions of keywords with minimal human involvement?
Challenges include determining how much to bid for each keyword and what
ad text to display to attract the most customers.
* How can one learn what properties of a web site or advertisement will
lead to the greatest number of purchases or clicks? The problem is made
more difficult by the fact that different classes of users respond well to
different marketing messages, so it's best to use different content for
different user segments.
* Consider an auction where the items to be sold are not known in advance,
but rather arrive online, and buyers are willing to pay different amount
for each item (revealed as the items arrive). Each buyer has a maximum
limit on the total number of items that they will buy. How should the
seller assign items to buyers so as to maximize the seller's revenue?
An Axiomatic Approach to Personalized Ranking Systems
Alon Altman
Personalized ranking systems and trust systems are an essential tool for collaboration in a multi-agent environment. In these systems, trust relations between many agents are aggregated to produce a personalized trust rating of the agents. In this talk I introduce the first extensive axiomatic study of this setting, and explore a wide array of well-known and new personalized ranking systems. We adapt several axioms (basic criteria) from the literature on global ranking systems to the context of personalized ranking systems, and fully classify the set of systems that satisfy all of these axioms. We further show that all these axioms are necessary for this result.
Ad Auctions with Reserved Price
Arash Asadpour
I will talk about the Ad Auctions when the auctioneer enjoys a relevant ad
that she can show in one of the slots if she wants to. The setting is to
some extent similar to the Ad Auctions with reserved price.
I will explain a randomized auction which is truthful in expectation for
both the bidders and the auctioneer. Also our mechanism is individual
rational and budget balanced. I will discus the revenue of this auction and
also will talk about some other settings in which the technique used to
design the auction can be helpful.
This is a joint work with Kamal Jain (Microsoft Research) and David Abraham
(CMU).
Boosting the area under ROC curve
Phil Long, Google
We show that any weak ranker that can achieve an area under the ROC curve
slightly better than 1/2 (which can be achieved by random guessing) can be
efficiently boosted to achieve an area under the ROC curve arbitrarily close
to 1. This boosting can be performed even in the presence of independent
misclassification noise, given access to a noise-tolerant weak ranker.
(This is joint work with Rocco Servedio.)
Random input models for Adwords
Aranyak Mehta, Google
I will talk about a new result on Random Input models for the Adwords Problem: What are good algorithms for allocation when the query stream is modeled as a random process, rather than as a worst case input. How does the natural "highest-bidder-wins" greedy algorithm perform? This question leads to an interesting generalization of a classic online matching algorithm. I will also talk about a related result on extended bidding models for Ad auctions.
Who do you know: The Social Network Revolution
Orkut Buyukkokten, Google
Online social networks fundamentally change the way we get connected. The people we cross paths with have the biggest influence in our lives. Now it's easier to cross paths than ever as we are much closer and so much more connected. In this talk I will discuss the motivation behind the development of orkut.com, touch on the social and technical aspects of implementing and maintaining a system that has over 60 million users and reflect on the lessons we learned.
Deterministic Decentralized Search in Random Graphs
Esteban Arcaute
We study a general framework for decentralized search in
random graphs. Our main focus is on deterministic memoryless
search algorithms that use only local information to reach their
destination in a bounded number of steps in expectation. This class
includes (with small modifications) the search algorithms used in
Kleinberg's pioneering work on long-range percolation graphs and
hierarchical network models. We give a characterization of searchable
graphs in this model, and use this characterization to prove a
monotonicity property for searchability.
Joint work with Ning Chen, Ravi Kumar, David Liben-Nowell, Mohammad Mahdian, Hamid Nazerzadeh and Ying Xu.
Interdomain Routing and Games
Michael Schapira,
Hebrew University of Jerusalem ( http://www.cs.huji.ac.il/~mikesch/)
We present a game-theoretic model that captures many of the intricacies of
interdomain routing in today's Internet. In this model, the strategic agents are source-nodes located on a network, who aim to send traffic to a unique destination node . The model enables complex dynamic interactions between the agents (asynchronous, sequential, and based on partial-information).
We provide realistic conditions that guarantee that executing the Border gateway Protocol (BGP), the only protocol currently used for interdomain routing, is in the best interest of each of the players. Moreover, we show that even coalitions of players of any size cannot improve their routing outcomes by jointly deviating from BGP. Unlike the vast majority of works in mechanism design, these results do not require any monetary transfers (to or by the agents).
Based on joint work with Hagay Levin and Aviv Zohar.