RAIN meeting

RAIN Meeting Schedule

Research on Algorithms for the Internet

RAIN schedule, 04-05 Winter ~ 06-07 Autumn

Spring 07-08

Time: 2-3pm Fridays; Room: Gates 498.

Apr 11, 2008 BAGT

Apr 18, 2008 Robbie Yan Reputation Markets

Apr 25, 2008 Cancelled for WWW

May 2, 2008 Bobji Mungamuru Click Fraud

May 9, 2008 Ying Xu Oral Examination

May 16, 2008 Esteban Arcaute Network Formation Games

Mar 23, 2008 Christina Aperjis Reputation Systems

May 30, 2008 Evimaria Terzi Inferring Links and Initiators

June 6, 2008 Yury Lifshits Social Design

Winter 07-08

Time: 2:15-3:15pm Fridays; Room: Gates 498.

Jan 25, 2008 Alon Altman An Axiomatic Approach to Personalized Ranking Systems

Feb 1, 2008 Ashish Goel Open Problems

Feb 8, 2008 Cancelled

Feb 15, 2008 Brian Babcock Adchemy

Feb 22, 2008 Rajat Bhattacharjee Oral Examination

Feb 29, 2008 Panagiotis Papadimitriou Web Graph Similarity for Anomaly Detection

Mar 7, 2008 Brad Null Ratio Index for Budgeted Learning, with Applications

Mar 14, 2008 Mukund Sundararajan Optimal Marketing Strategies over Social Networks

Autumn 07-08

Time: 5-6pm Mondays; Room: Gates 459.

Oct 15, 2007 Michael Schapira Interdomain Routing and Games

Oct 22, 2007 Esteban Arcaute Decentralized Search of Random Graphs

Oct 29, 2007 Orkut Buyukkokten Social Network Revolution
Nov 5, 2007 Aranyak Mehta Random input models for Adwords

Nov 12, 2007 Phil Long Boosting the area under the ROC curve

Nov 19, 2007 Thanks-giving
Nov 26, 2007 cancelled
Dec 3, 2007 Arash Asadpour Ad auctions with reserved price

Social Design
Yury Lifshits, Caltech, http://yury.name

Social Design is an art and science of setting and managing incentives in web applications. So far, machine learning is the dominant approach to all central algorithmic problems in the Web: search, advertising, recommendation systems. Social design can become an important counterpart to ML, if we find the ways to incentivise users to give us more and better input data and even the answers themselves.
This talk is an outline of a new research field of Social Design. We start with several stories including Internet press conference of president Putin, reputation crash in largest Russian social network, and Yahoo! SearchMonkey project. Then we attempt to extract, characterize and classify the commonly used patterns in social design. Finaly, we set up the research agenda in the area with focus on social aspects of semantic web and advertising technologies.
Slides available at http://yury.name/talks.html
Bio: Yury Lifshits obtained his PhD degree from Steklov Institute of Mathematics at St.Petersburg (2007). He is currently a postdoc at Caltech. He won two gold medals at International Mathematical Olympiads, received the Best Paper Award in Application Track of CSR'07 and the Yandex Teaching Award (2006) for his course "Algorithms for Internet". Yury is a maintainer of "The Homepage of Nearest Neighbors": http://simsearch.yury.name

Inferring Links and Initiators
Evimaria Terzi,IBM

Consider a 0-1 observation matrix M, where rows correspond to entities and columns correspond to signals; a value of 1 (or 0) in cell (i; j) of M indicates that signal j has been observed (or not observed) in entity i. Given such a matrix we study the problem of inferring the underlying directed links between entities (rows) and finding which entities act as initiators. I will formally define this problem and propose an MCMC framework for estimating the links and the initiators given the matrix of observations M. I will also show how this framework can be extended to incorporate a temporal aspect; instead of considering a single observation matrix M we consider a sequence of observation matrices M1; : : : ;Mt over time. Finally I will show that this problem is a generalization of several problems studied in the field of social-network analysis.
Bio: Evimaria Terzi is a Research Staff Member at IBM Almaden since June 2007. She obtained her PhD from the University of Helsinki (Finland) in January 2007, MSc from Purdue University (USA) in 2002 and BSc from the University of Thessaloniki (Greece) in 2000.

Reputation Mechanisms for Online Markets
Christina Aperjis

In online marketplaces potential buyers have to decide whether to buy an item and how much to pay for it without being able to observe it and without knowing whether the seller is trustworthy. Reputation mechanisms promote trust by giving information about past transactions. With the goal of incentivizing sellers to advertise truthfully, we study how the marketplace is affected by (i) the reputation mechanism, and (ii) the way buyers interpret the seller?s reputation score.

Two-Stage Myopic Dynamics in Network Formation Games
Esteban Arcaute

We consider two-stage myopic dynamics for network formation games, where utility to a node is based on a function of the distance to all other nodes. Since our network formation game is a generalization of Myerson's announcement game, we use pairwise Nash stability as the solution concept.
We prove that, although the price of anarchy of the static game is unbounded (in the number of nodes in the network), our dynamics converge to a subset of pairwise Nash stable networks with a constant efficiency ratio. For some specific utility functions, we prove that our dynamics converge to efficient networks.
This is joint work with Ramesh Johari and Shie Mannor.

Measuring and Modeling the Web
Ying Xu

The last couple of decades have witnessed a phenomenal growth in the World Wide Web. The Web has become an ubiquitous channel for information sharing and dissemination. This talk describes several research contributions in an endeavor towards a better understanding of the Web.
One of the basic questions about the Web is its size, i.e., how many webpages there are on the Web. A closely related problem is the sizes of search engine indexes, as search engine index defines an important subset of the Web that is most easily accessible to users. In the last ten years both the scientific literature and the popular press dealt at length with methodologies and estimates for the sizes of the Web and public search engines. In the first part of the talk, we propose algorithms for measuring the Web size, including both empirical and theoretical results.
The Web is so large that even estimating its size is a hard problem. To better understand the Web, people study random web graph models -- a stochastic process that generates random graphs that with high probability resemble the Web graph. Random web graph models are useful theoretical tools to provide insights to the evolution of the Web graph, to predict future behaviors, and to generate synthetic data sets for testing and simulation. In the second half of the talk, we briefly present several results in analyzing web graph models and applying the models to real world applications.

Should ad networks bother fighting click fraud?
Bobji Mungamuru

Suppose an ad network decides that a given click-through is invalid (or "fraudulent"). The implication is that the ad network will not charge the advertiser for that click-through. Therefore, arguably, the ad network is "taking money out of his own pocket" by marking clicks invalid. As such, should ad networks even bother fighting fraud? We analyze a simple economic model of the online advertising market, and conclude that the answer is "yes".

Reputation systems
Robbie Yan

Effective reputation systems should incentivize users to acquire and reveal information about content quality. They should also counter distortions induced by strategic agents that benefit from manipulating reputations. Mechanisms have been proposed in the recent literature with the aim to provide such incentives. These mechanisms, which we will refer to as reputation markets, can be viewed as information markets designed to assess the quality of online content. In this paper we develop equilibrium models to study how incentives created by a reputation market should influence community behavior and the accuracy of assessments.

Optimal Marketing Strategies over Social Networks
Mukund Sundararajan

Abstract: We study revenue maximization in social networks. We identify a family of strategies called 'influence and exploit' strategies that are easy to implement, easy to optimize over and approximately optimal.
Based on a paper that will appear in WWW 08. Jointly with Jason Hartline (Northwestern University) and Vahab Mirrokni (MSR Redmond).i

Ratio Index for Budgeted Learning, with Applications
Brad Null

In the budgeted learning problem, we are allowed to experiment on a set of alternatives (given a fixed experimentation budget) with the goal of picking a single alternative with the largest possible expected payoff. Approximation algorithms for this problem were developed by Guha and Munagala by rounding a linear program that couples the various alternatives together. We present an index for this problem, which we call the ratio index, which also guarantees a constant factor approximation. Index based policies have the advantage that a single number (i.e. the index) can be computed for each alternative irrespective of all other alternatives, and the alternative with the highest index is experimented upon. This is analogous to the famous Gittins index for the discounted multi-armed bandit problem.
The ratio index has several interesting structural properties. First, it can be computed in strongly polynomial time; the techniques we develop may be useful for deriving strongly polynomial algorithms for other Markov Decision Problems. Second, with the appropriate discount factor, the Gittins index and the ratio index are constant factor approximations of each other, and hence the Gittins index also gives a constant factor approximation to the budgeted learning problem. Finally, the ratio index can be used to obtain a "discount oblivious" solution to the multi-armed bandit problem, which in turn gives constant factor approximations to a wide class of problems that we call "list ranking problems".
Joint work with Rajat Bhattacharjee, Ashish Goel, and Sanjeev Khanna

Web Graph Similarity for Anomaly Detection
Panagiotis Papadimitriou

Web graphs are approximate snapshots of the web, created by search engines. Their creation is an error-prone procedure that relies on the availability of Internet nodes and the faultless operation of multiple software and hardware units. Checking the validity of a web graph requires a notion of graph similarity. Web graph similarity helps measure the amount and significance of changes in consecutive web graphs. These measurements validate how well search engines acquire content from the web. In this work we study five similarity schemes: three of them adapted from existing graph similarity measures and two adapted from well-known document and vector similarity methods. We compare and evaluate all five schemes using a sequence of web graphs for Yahoo! and study if the schemes can identify anomalies that may occur due to hardware or other problems.

Adchemy
Brian Babcock,Adchemy

I'll talk about a number of computational problems that arise in online advertising, primarily from the perspective of the buyer of advertising.
Topics to be discussed include:
* When buying sponsored search links from search engines, how does one manage a portfolio of millions of keywords with minimal human involvement? Challenges include determining how much to bid for each keyword and what ad text to display to attract the most customers.
* How can one learn what properties of a web site or advertisement will lead to the greatest number of purchases or clicks? The problem is made more difficult by the fact that different classes of users respond well to different marketing messages, so it's best to use different content for different user segments.
* Consider an auction where the items to be sold are not known in advance, but rather arrive online, and buyers are willing to pay different amount for each item (revealed as the items arrive). Each buyer has a maximum limit on the total number of items that they will buy. How should the seller assign items to buyers so as to maximize the seller's revenue?

An Axiomatic Approach to Personalized Ranking Systems
Alon Altman

Personalized ranking systems and trust systems are an essential tool for collaboration in a multi-agent environment. In these systems, trust relations between many agents are aggregated to produce a personalized trust rating of the agents. In this talk I introduce the first extensive axiomatic study of this setting, and explore a wide array of well-known and new personalized ranking systems. We adapt several axioms (basic criteria) from the literature on global ranking systems to the context of personalized ranking systems, and fully classify the set of systems that satisfy all of these axioms. We further show that all these axioms are necessary for this result.

Ad Auctions with Reserved Price
Arash Asadpour

I will talk about the Ad Auctions when the auctioneer enjoys a relevant ad that she can show in one of the slots if she wants to. The setting is to some extent similar to the Ad Auctions with reserved price. I will explain a randomized auction which is truthful in expectation for both the bidders and the auctioneer. Also our mechanism is individual rational and budget balanced. I will discus the revenue of this auction and also will talk about some other settings in which the technique used to design the auction can be helpful.
This is a joint work with Kamal Jain (Microsoft Research) and David Abraham (CMU).

Boosting the area under ROC curve
Phil Long, Google

We show that any weak ranker that can achieve an area under the ROC curve slightly better than 1/2 (which can be achieved by random guessing) can be efficiently boosted to achieve an area under the ROC curve arbitrarily close to 1. This boosting can be performed even in the presence of independent misclassification noise, given access to a noise-tolerant weak ranker.
(This is joint work with Rocco Servedio.)

Random input models for Adwords
Aranyak Mehta, Google

I will talk about a new result on Random Input models for the Adwords Problem: What are good algorithms for allocation when the query stream is modeled as a random process, rather than as a worst case input. How does the natural "highest-bidder-wins" greedy algorithm perform? This question leads to an interesting generalization of a classic online matching algorithm. I will also talk about a related result on extended bidding models for Ad auctions.

Who do you know: The Social Network Revolution
Orkut Buyukkokten, Google

Online social networks fundamentally change the way we get connected. The people we cross paths with have the biggest influence in our lives. Now it's easier to cross paths than ever as we are much closer and so much more connected. In this talk I will discuss the motivation behind the development of orkut.com, touch on the social and technical aspects of implementing and maintaining a system that has over 60 million users and reflect on the lessons we learned.

Deterministic Decentralized Search in Random Graphs
Esteban Arcaute

We study a general framework for decentralized search in random graphs. Our main focus is on deterministic memoryless search algorithms that use only local information to reach their destination in a bounded number of steps in expectation. This class includes (with small modifications) the search algorithms used in Kleinberg's pioneering work on long-range percolation graphs and hierarchical network models. We give a characterization of searchable graphs in this model, and use this characterization to prove a monotonicity property for searchability.
Joint work with Ning Chen, Ravi Kumar, David Liben-Nowell, Mohammad Mahdian, Hamid Nazerzadeh and Ying Xu.

Interdomain Routing and Games
Michael Schapira, Hebrew University of Jerusalem ( http://www.cs.huji.ac.il/~mikesch/)

We present a game-theoretic model that captures many of the intricacies of interdomain routing in today's Internet. In this model, the strategic agents are source-nodes located on a network, who aim to send traffic to a unique destination node . The model enables complex dynamic interactions between the agents (asynchronous, sequential, and based on partial-information).
We provide realistic conditions that guarantee that executing the Border gateway Protocol (BGP), the only protocol currently used for interdomain routing, is in the best interest of each of the players. Moreover, we show that even coalitions of players of any size cannot improve their routing outcomes by jointly deviating from BGP. Unlike the vast majority of works in mechanism design, these results do not require any monetary transfers (to or by the agents).
Based on joint work with Hagay Levin and Aviv Zohar.

Apr 11, 2008	BAGT
Apr 18, 2008	Robbie Yan	Reputation Markets
Apr 25, 2008	Cancelled for WWW
May 2, 2008	Bobji Mungamuru	Click Fraud
May 9, 2008	Ying Xu	Oral Examination
May 16, 2008	Esteban Arcaute	Network Formation Games
Mar 23, 2008	Christina Aperjis	Reputation Systems
May 30, 2008	Evimaria Terzi	Inferring Links and Initiators
June 6, 2008	Yury Lifshits	Social Design

Jan 25, 2008	Alon Altman	An Axiomatic Approach to Personalized Ranking Systems
Feb 1, 2008	Ashish Goel	Open Problems
Feb 8, 2008	Cancelled
Feb 15, 2008	Brian Babcock	Adchemy
Feb 22, 2008	Rajat Bhattacharjee	Oral Examination
Feb 29, 2008	Panagiotis Papadimitriou	Web Graph Similarity for Anomaly Detection
Mar 7, 2008	Brad Null	Ratio Index for Budgeted Learning, with Applications
Mar 14, 2008	Mukund Sundararajan	Optimal Marketing Strategies over Social Networks

Oct 15, 2007	Michael Schapira	Interdomain Routing and Games
Oct 22, 2007	Esteban Arcaute	Decentralized Search of Random Graphs
Oct 29, 2007	Orkut Buyukkokten	Social Network Revolution
Nov 5, 2007	Aranyak Mehta	Random input models for Adwords
Nov 12, 2007	Phil Long	Boosting the area under the ROC curve
Nov 19, 2007	Thanks-giving
Nov 26, 2007	cancelled
Dec 3, 2007	Arash Asadpour	Ad auctions with reserved price