Fair Clustering Through Fairlets
2018
arXiv
arXiv:1802.05733v1
fatcat:p67prw2tyfhm5kn2gfsslzhe4y
Flavio Chierichetti was supported in part by the ERC Starting Grant DMAP 680153, by a Google Focused Research Award, and by the SIR Grant RBSI14Q743. ...##
On Discrete Preferences and Coordination
2013
arXiv
An active line of research has considered games played on networks in which payoffs depend on both a player's individual decision and also the decisions of her neighbors. Such games have been used to model issues including the formation of opinions and the adoption of technology. A basic question that has remained largely open in this area is to consider games where the strategies available to the players come from a fixed, discrete set, and where players may have different intrinsic

arXiv:1304.8125v1
fatcat:o6a44frwdbgljaygsxuxjq7giq
... among the possible strategies. It is natural to model the tension among these different preferences by positing a distance function on the strategy set that determines a notion of "similarity" among strategies; a player's payoff is determined by the distance from her chosen strategy to her preferred strategy and to the strategies chosen by her network neighbors. Even when there are only two strategies available, this framework already leads to natural open questions about a version of the classical Battle of the Sexes problem played on a graph. We develop a set of techniques for analyzing this class of games, which we refer to as discrete preference games. We parametrize the games by the relative extent to which a player takes into account the effect of her preferred strategy and the effect of her neighbors' strategies, allowing us to interpolate between network coordination games and unilateral decision-making. When these two effects are balanced, we show that the price of stability is equal to 1 for any discrete preference game in which the distance function on the strategies is a tree metric; as a special case, this includes the Battle of the Sexes on a graph. We also show that trees form the maximal family of metrics for which the price of stability is 1, and produce a collection of metrics on which the price of stability converges to a tight bound of 2.##
On Additive Approximate Submodularity
2020
arXiv
A real-valued set function is (additively) approximately submodular if it satisfies the submodularity conditions with an additive error. Approximate submodularity arises in many settings, especially in machine learning, where the function evaluation might not be exact. In this paper we study how close such approximately submodular functions are to truly submodular functions. We show that an approximately submodular function defined on a ground set of n elements is O(n^2) pointwise-close to a

arXiv:2010.02912v2
fatcat:ia6strhpdzhv7mp5v5njeob2ce
... modular function. This result also provides an algorithmic tool that can be used to adapt existing submodular optimization algorithms to approximately submodular functions. To complement, we show an Ω(√(n)) lower bound on the distance to submodularity. These results stand in contrast to the case of approximate modularity, where the distance to modularity is a constant, and approximate convexity, where the distance to convexity is logarithmic.##
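For orientation, one common way to formalize the additive notion mentioned in this abstract is sketched below. The symbols f, ε, A, B and the ground set [n] are assumptions chosen for illustration; the paper's exact definition may differ.

```latex
% Assumed formalization: f is \varepsilon-approximately submodular on the ground set [n] if
\[
  f(A) + f(B) \;\ge\; f(A \cup B) + f(A \cap B) - \varepsilon
  \qquad \text{for all } A, B \subseteq [n].
\]
% Setting \varepsilon = 0 recovers ordinary submodularity.
```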
Matroids, Matchings and Fairness

2021
Zenodo

doi:10.5281/zenodo.4697719
fatcat:opi75tam6jft5cw465rtyf2ave
Flavio Chierichetti was supported in part by the ERC Starting Grant DMAP 680153, by a Google Focused Research Award, by the "Dipartimenti di Eccellenza 2018-2022" grant awarded to the ... ., 2016], fair ranking Celis et al. [2018c], fair clustering [Chierichetti et al., 2017, Rösner and Schmidt, 2018], fair bandit algorithms [Dimitrakakis et al., 2017] and many others. ... Following previous work by Celis et al. [2018a,b,c], Chierichetti et al. [2017], Rösner and Schmidt [2018], we encode fairness by posing additional balance constraints on the solution. ...##
Discrete Choice, Permutations, and Reconstruction

2018
Zenodo

In this paper we study the well-known family of Random Utility Models, developed over 50 years ago to codify rational user behavior in choosing one item from a finite set of options. In this setting each user draws i.i.d. from some distribution a utility function mapping each item in the universe to a real-valued utility. The user is then offered a subset of the items, and selects the one of maximum utility. A Max-Dist oracle for this choice model takes any subset of items and returns the

doi:10.5281/zenodo.4697684
fatcat:2ttmjkmlcbdezfj5dunec75z5i
... ility (over the distribution of utility functions) that each will be selected. A discrete choice algorithm, given access to a Max-Dist oracle, must return a function that approximates the oracle. We show three primary results. First, we show that any algorithm exactly reproducing the oracle must make exponentially many queries. Second, we show an equivalent representation of the distribution over utility functions, based on permutations, and show that if this distribution has support size k, then it is possible to approximate the oracle using O(nk) queries. Finally, we consider settings in which the subset of items is always small. We give an algorithm that makes less than n^((1−ε/2)K) queries, each to sets of size at most (1−ε/2)K, in order to approximate the Max-Dist oracle on every set of size |T| ≤ K with statistical error at most ε. In contrast, we show that any algorithm that queries for subsets of size [Equation] must make maximal statistical error on some large sets.##
Voting with Limited Information and Many Alternatives
2011
arXiv
The traditional axiomatic approach to voting is motivated by the problem of reconciling differences in subjective preferences. In contrast, a dominant line of work in the theory of voting over the past 15 years has considered a different kind of scenario, also fundamental to voting, in which there is a genuinely "best" outcome that voters would agree on if they only had enough information. This type of scenario has its roots in the classical Condorcet Jury Theorem; it includes cases such as

arXiv:1110.1785v1
fatcat:svoz3hsw6zhxtk4up57ij3iixy
... rs in a criminal trial who all want to reach the correct verdict but disagree in their inferences from the available evidence, or a corporate board of directors who all want to improve the company's revenue, but who have different information that favors different options. This style of voting leads to a natural set of questions: each voter has a private signal that provides probabilistic information about which option is best, and a central question is whether a simple plurality voting system, which tabulates votes for different options, can cause the group decision to arrive at the correct option. We show that plurality voting is powerful enough to achieve this: there is a way for voters to map their signals into votes for options in such a way that --- with sufficiently many voters --- the correct option receives the greatest number of votes with high probability. We show further, however, that any process for achieving this is inherently expensive in the number of voters it requires: succeeding in identifying the correct option with probability at least 1 - η requires Ω(n^3 ϵ^-2η^-1) voters, where n is the number of options and ϵ is a distributional measure of the minimum difference between the options.##
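The plurality setting described in this abstract can be illustrated with a small simulation. This is only a sketch under a simplified, assumed signal model (the correct option is option 0, and each voter's signal names it with a fixed probability); the function name and parameters are hypothetical and do not reproduce the paper's construction or its bounds.

```python
import random
from collections import Counter

def plurality_winner(num_options, num_voters, signal_accuracy, rng):
    """Simulate voters who each vote for the option named by a noisy private signal.

    Assumption for illustration: option 0 is the correct option. Each signal
    points to it with probability `signal_accuracy` and otherwise to a
    uniformly random wrong option.
    """
    votes = Counter()
    for _ in range(num_voters):
        if rng.random() < signal_accuracy:
            votes[0] += 1
        else:
            votes[rng.randrange(1, num_options)] += 1
    # Plurality: the option with the most votes wins (ties go to the lowest index).
    return max(range(num_options), key=lambda option: votes[option])

rng = random.Random(0)
winner = plurality_winner(num_options=3, num_voters=2001, signal_accuracy=0.5, rng=rng)
```

With many voters and a signal that favors the correct option, the correct option collects the most votes with high probability; shrinking the gap between options in this toy model increases the number of voters needed.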
Asymptotic Behavior of Sequence Models

2019
Zenodo

In what situations is it supported over [0,

doi:10.5281/zenodo.4003697
fatcat:yk3b4k2ykzckvd35mfm3d56gkq
Flavio Chierichetti, Ravi Kumar, and Andrew Tomkins ...##
On the Power Laws of Language

2017
Zenodo

About eight decades ago, Zipf postulated that the word frequency distribution of languages is a power law, i.e., it is a straight line on a log-log plot. Over the years, this phenomenon has been documented and studied extensively. For many corpora, however, the empirical distribution barely resembles a power law: when plotted on a log-log scale, the distribution is concave and appears to be composed of two differently sloped straight lines joined by a smooth curve. A simple generative model is

doi:10.5281/zenodo.4697663
fatcat:ny66a4rvjzb6jmexnrpvobfgo4
... roposed to capture this phenomenon. The word frequency distributions produced by this model are shown to match the observations both analytically and empirically.##
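The "straight line on a log-log plot" reading of Zipf's law in this abstract can be checked numerically by fitting a least-squares line to log-frequency against log-rank. The sketch below uses synthetic frequencies and a hypothetical helper name, `loglog_slope`; it is not the generative model from the paper.

```python
import math

def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Expects at least two strictly positive frequencies.
    """
    # Rank 1 is the most frequent word.
    ordered = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(freq) for freq in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic frequencies following an exact power law with exponent 1.
synthetic = [1000.0 / rank for rank in range(1, 201)]
slope = loglog_slope(synthetic)
```

For an exact power law the fitted slope equals the negated exponent. On a corpus whose distribution is concave on the log-log scale, fitting the head and the tail of the rank list separately would give two different slopes.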
Trace Complexity of Network Inference
2013
arXiv
The network inference problem consists of reconstructing the edge set of a network given traces representing the chronology of infection times as epidemics spread through the network. This problem is a paradigmatic representative of prediction tasks in machine learning that require deducing a latent structure from observed patterns of activity in a network, which often require an unrealistically large number of resources (e.g., amount of available data, or computational time). A fundamental

arXiv:1308.2954v1
fatcat:7rkkytzocbgsflpgi73cs4lycu
... tion is to understand which properties we can predict with a reasonable degree of accuracy with the available resources, and which we cannot. We define the trace complexity as the number of distinct traces required to achieve high fidelity in reconstructing the topology of the unobserved network or, more generally, some of its properties. We give algorithms that are competitive with, while being simpler and more efficient than, existing network inference approaches. Moreover, we prove that our algorithms are nearly optimal, by proving an information-theoretic lower bound on the number of traces that an optimal inference algorithm requires for performing this task in the general case. Given these strong lower bounds, we turn our attention to special cases, such as trees and bounded-degree graphs, and to property recovery tasks, such as reconstructing the degree distribution without inferring the network. We show that these problems require a much smaller (and more realistic) number of traces, making them potentially solvable in practice.##
Motif Counting Beyond Five Nodes

2021
Zenodo

Understanding the distribution of graphlets allows us to make key inferences about the structural properties ...

doi:10.5281/zenodo.4698505
fatcat:it2h4tw5i5axtezwqjjtwtcicm
Marco Bressan, Flavio Chierichetti, Ravi Kumar, Stefano Leucci, and Alessandro Panconesi##
Mallows Models for Top-k Lists

2018
Zenodo

The classic Mallows model is a widely used tool to realize distributions on permutations. Motivated by common practical situations, in this paper, we generalize Mallows to model distributions on top-k lists by using a suitable distance measure between top-k lists. Unlike many earlier works, our model is both analytically tractable and computationally efficient. We demonstrate this by studying two basic problems in this model, namely, sampling and reconstruction, from both algorithmic and experimental points of view.

doi:10.5281/zenodo.4697980
fatcat:ha5kg3nqrnbgzgnecmzd3rnhyu
##
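As background for the sampling problem mentioned in this abstract, the sketch below samples from the classic Mallows model over full permutations using the standard repeated-insertion technique with Kendall tau distance. It is not the paper's top-k generalization; the function name `sample_mallows` and the parameter names are assumptions for illustration.

```python
import random

def sample_mallows(center, phi, rng):
    """Sample a permutation from a classic Mallows model via repeated insertion.

    `center` is the reference ranking and `phi` in [0, 1] is the dispersion:
    phi = 0 always returns `center`, phi = 1 gives a uniform random permutation.
    """
    ranking = []
    for i, item in enumerate(center):
        # Inserting at position j (0 <= j <= i) creates i - j inversions
        # relative to the center, so position j gets weight phi ** (i - j).
        weights = [phi ** (i - j) for j in range(i + 1)]
        position = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(position, item)
    return ranking

sample = sample_mallows(list(range(6)), 0.7, random.Random(1))
```

Smaller values of `phi` concentrate the samples around the center ranking, while values close to 1 spread them almost uniformly over all permutations.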
Learning a Mixture of Two Multinomial Logits

2021
Zenodo

Recently,

doi:10.5281/zenodo.4697670
fatcat:qed5ihwpxfhyhkyh5nghugtuva
Chierichetti et al. (2018) study choice models that are represented by distributions over permutations of the items in the universe; they show a series of lower bounds in that model. ...##
Counting Graphlets: Space vs Time

2021
Zenodo

This is the preprint version of the ACM WSDM 2017 paper, https://doi.org/10.1145/3018661.3018732.

doi:10.5281/zenodo.4698491
fatcat:53dcaeuqqvg35knbi74vchtnia
##
Light RUMs

2021
International Conference on Machine Learning

Flavio Chierichetti was supported in part by the PRIN project 2017K7XPAN, by a Google Focused Research Award, by BiCi - Bertinoro international Center for informatics, and by the "Dipartimenti di Eccellenza ... Note that the above upper bound improves the one in (Chierichetti et al., 2018a) from O(n^2) to O(n). ... These two definitions of RUMs are equivalent (see, e.g., Chierichetti et al., 2018a). MNLs and MNL Mixtures. A Multinomial Logit (aka, MNL) is a widely used kind of RUM. ...

##
Algorithms for ℓ_p Low Rank Approximation
2017
arXiv
We consider the problem of approximating a given matrix by a low-rank matrix so as to minimize the entrywise ℓ_p-approximation error, for any p ≥ 1; the case p = 2 is the classical SVD problem. We obtain the first provably good approximation algorithms for this version of low-rank approximation that work for every value of p ≥ 1, including p = ∞. Our algorithms are simple, easy to implement, work well in practice, and illustrate interesting tradeoffs between the approximation quality, the running time, and the rank of the approximating matrix.

arXiv:1705.06730v1
fatcat:pnxgtjuyevbezczictet52pzay
