The post potential reads, M2 course appeared first on MSR-INRIA.

- B. Hajek, S. Oh and J. Xu, “Minimax-optimal inference from partial rankings,” *Neural Information Processing Systems (NIPS)*, 2014.
- B. Hajek, Y. Wu, and J. Xu, “Achieving exact cluster recovery threshold via semidefinite programming,” *IEEE Transactions on Information Theory*, vol. 62, May 2016, pp. 2788-2797. (Also Arxiv 1412.6156)
- B. Hajek, Y. Wu, and J. Xu, “Computational lower bounds for community detection on random graphs,” *Proceedings Conf. on Learning Theory (COLT)*, June 2015, Paris. (Arxiv 1406.6625; poster; slides)
- B. Hajek and S. Sankagiri, “Recovering a Hidden Community in a Preferential Attachment Graph,” IEEE International Symposium on Information Theory, June 15-20, 2018, Vail, CO. (Full version at Arxiv 1801.06818)

Achieving the Bayes Error Rate in Synchronization and Block Models by SDP, Robustly

https://arxiv.org/abs/1904.09635

Testing for high-dimensional geometry in random graphs

Sébastien Bubeck, Jian Ding, Ronen Eldan, Miklós Z. Rácz

https://arxiv.org/pdf/1411.5713.pdf

‘On the Limitation of Spectral Methods: From the Gaussian Hidden Clique Problem to Rank One Perturbations of Gaussian Tensors’

Andrea Montanari, Daniel Reichman, and Ofer Zeitouni,

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 63, NO. 3, MARCH 2017

https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7779134

‘Information-theoretic thresholds for community detection in sparse networks.’

J. Banks, C. Moore, J. Neeman, and P. Netrapalli.

In Proceedings of the 29th Conference on Learning Theory, COLT 2016, New York, USA, June 23-26, 2016, pages 383–416, 2016.

http://proceedings.mlr.press/v49/banks16.pdf

A strengthening and a multipartite generalization of the Alon-Boppana-Serre Theorem,

B. Mohar, https://arxiv.org/pdf/1002.1084.pdf

The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices

Florent Benaych-Georges (PMA, CMAP), Raj Rao Nadakuditi (EECS)

https://arxiv.org/abs/0910.2120

Approximating the cut-norm via Grothendieck’s inequality

Noga Alon and Assaf Naor, https://web.math.princeton.edu/~naor/homepage%20files/cutnorm.pdf

Spectral Techniques applied to sparse random graphs

U. Feige and E. Ofek

Wiley Interscience 2005, https://onlinelibrary.wiley.com/doi/epdf/10.1002/rsa.20089

‘Robust reconstruction on trees is determined by the second eigenvalue’.

S. Janson and E. Mossel.

The Annals of Probability, 32(3B):2630–2649, 2004

https://projecteuclid.org/download/pdfview_1/euclid.aop/1091813626

‘BROADCASTING ON TREES AND THE ISING MODEL’

William Evans, Claire Kenyon, Yuval Peres and Leonard J. Schulman

The Annals of Applied Probability, 2000, Vol. 10, No. 2, 410–433

https://projecteuclid.org/download/pdf_1/euclid.aoap/1019487349

‘THE EIGENVALUES OF RANDOM SYMMETRIC MATRICES’

Z. FÜREDI and J. KOMLÓS

Combinatorica 1 (3) (1981) 233-241

https://faculty.math.illinois.edu/~z-furedi/PUBS/furedi_komlos_cca1981_random_eig.pdf

The post Job offer for a 2-year “Starting Research Position” in the “Distributed Machine Learning” project appeared first on MSR-INRIA.

**Profile:** We are looking for a candidate who obtained a PhD in Machine Learning in the last few years, with a strong publication record at competitive venues in the domain (e.g. ICML, COLT, NeurIPS) and a proven ability to autonomously drive her/his own research agenda. Previous experience in theoretical computer science or distributed computing is a plus.

The post Colloquium on « AI aspirations and Advances » by Eric Horvitz appeared first on MSR-INRIA.

https://www.lip6.fr/colloquium/

The post Karthikeyan Bhargavan awarded for his joint collaborations with Microsoft Research appeared first on MSR-INRIA.

https://www.microsoft.com/en-us/research/academic-program/outstanding-collaborator-award/

The post Conference with Satya Nadella (CEO of Microsoft) at Sorbonne University appeared first on MSR-INRIA.

The conference focuses on the changes in the economy and the job market driven by the so-called “digital revolution” and the current “data analytics” frenzy. There will be opening contributions from a startup (Iconem), Airbnb, Louis Vuitton, ENSAE ParisTech and Simplon.

**Satya Nadella’s keynote will be followed by a panel discussion on research in Machine Learning with Laurent Massoulié (Microsoft Research -Inria Joint Centre), Florent Perronnin (Facebook Artificial Intelligence Research Paris) and Nicolas Le Roux (Criteo).**

More details: the event will take place from 14:30 to 17:30 at the “grand amphi de la Sorbonne”. **Online registration is mandatory (register here).**

The post Workshop on “Networks: learning, information and complexity” at the Institut Henri Poincaré, Paris, on May 18-20, 2016. appeared first on MSR-INRIA.

Organized by Emmanuel Abbe (Princeton University), Sébastien Bubeck (MSR), Marc Lelarge (Inria) and Laurent Massoulié (MSR-Inria Joint Centre).

The workshop will be devoted to questions of learning, information and complexity in the context of networks. Specific topics of interest include: inference of network structure, community detection, synthesis of efficient random forests and deep neural networks, approaches from statistical physics, and limit objects of graph sequences.

**Attendance is open and free. As the number of registrations has reached the amphitheater’s capacity, we cannot guarantee that there will be room for would-be attendees who have not registered.**

Program (current version, subject to adjustments):

| Wednesday May 18 | Thursday May 19 | Friday May 20 |
| --- | --- | --- |
| 9:20 participants welcome | | |
| 9:30 Léon Bottou: Beyond statistical machine learning (slides) | 9:30 Gérard Biau: Collaborative Inference (slides) | 9:30 Jennifer Chayes: Graphons and Machine Learning: Modeling and Estimation of Sparse Massive Networks, Part I |
| 10:20 Yann Ollivier: Invariance principles for robust learning. An illustration with recurrent neural networks | 10:20 Devavrat Shah: Blind regression (slides) | 10:20 Christian Borgs: Graphons and Machine Learning: Modeling and Estimation of Sparse Massive Networks, Part II |
| 11:10-11:20 Coffee break | 11:10-11:20 Coffee break | 11:10-11:20 Coffee break |
| 11:20 Ronen Eldan: The power of depth for feedforward neural networks | 11:20 Sylvain Arlot: Analysis of purely random forest bias (slides) | 11:20 Amin Coja-Oghlan: The k-core revisited (slides) |
| 12:10-2:00 Lunch break and poster session 1 | 12:10-2:00 Lunch break and poster session 2 | 12:10-2:00 Lunch break |
| 2:00 Bruce Hajek: Algorithms and the computational gap for recovering a single community in a network (slides) | 2:00 Cris Moore: Information-theoretic bounds and phase transitions in community detection and high-dimensional clustering | 2:00 Yuval Peres: Rigidity and Tolerance for Perturbed Lattices |
| 2:50 Ravi Kannan: The planted Gaussian problem (slides) | 2:50 Elchanan Mossel: Shotgun assembly of graphs (slides) | 2:50 François Baccelli: Dynamical Systems on Point Processes and Geometric Routing in Stochastic Networks (slides) |
| 3:40-4:00 Coffee break | 3:40-4:00 Coffee break | 3:40-4:00 Coffee break |
| 4:00 Riccardo Zecchina: Accessible dense regions of solutions in the weights space of neural networks: general algorithmic schemes from a non-Gibbs measure | 4:00 Lenka Zdeborová: Solvable model of unsupervised feature learning (slides) | 4:00 Florent Krzakala: Mutual information and Phase transitions in Low Rank matrix Factorizations (slides) |
| 4:50 Ulrike von Luxburg: Geometry of unweighted k-nearest neighbor graphs | 4:50 Andrea Montanari: Phase transitions in semi-definite programming | 4:50 Ruediger Urbanke: TBD |
| 5:50 Workshop close | | |

Poster presenters:

**Caterina de Bacco**, Santa Fe Institute:

Community detection on multilayer networks

**Jess Banks**, Santa Fe Institute:

Information-theoretic thresholds for community detection in sparse networks

**Laura Florescu**, New York University:

Spectral thresholds in the bipartite SBM

**Mohsen Bayati**, Stanford University:

Online Decision-Making with High-Dimensional Covariates

**Robin Lamarche-Perrin**, UPMC:

Information-theoretic Compression of Weighted Graphs (joint work with Lionel Tabourier and Fabien Tarissan)

**Frederik Mallmann**, ENS:

Distance in the Forest Fire Model: How far are you from Eve?

**Jiaming Xu**, Simons Institute, UC Berkeley:

TBD

Titles and abstracts:

**Sylvain Arlot (U. Paris Sud): Analysis of purely random forests bias**.

Abstract: Random forests (Breiman, 2001) are a very effective and commonly used statistical method, but their full theoretical analysis is still an open problem.

As a first step, simplified models such as purely random forests have been introduced, in order to shed light on the good performance of Breiman’s random forests.

In the regression framework, the quadratic risk of a purely random forest can be written as the sum of two terms, which can be understood as an approximation error and an estimation error. Robin Genuer (2010) studied how the estimation error decreases when the number of trees increases for some specific model. In this talk, we study the approximation error (the bias) of some purely random forest models in a regression framework, focusing in particular on the influence of the size of each tree and of the number of trees in the forest.

Under some regularity assumptions on the regression function, we show that the bias of an infinite forest decreases at a faster rate (with respect to the size of each tree) than a single tree. As a consequence, infinite forests attain a strictly better risk rate (with respect to the sample size) than single trees.

This talk is based on a joint work with Robin Genuer. http://arxiv.org/abs/1407.3939

**François Baccelli (INRIA - UT Austin): Dynamical Systems on Point Processes and Geometric Routing in Stochastic Networks**

Abstract: This talk is motivated by the study of geometric routing algorithms used for navigating stationary point processes. The mathematical abstraction for such a navigation is a class of non-measure preserving dynamical systems on counting measures called point-maps. The talk will focus on two objects associated with a point map $f$ acting on a stationary point process $\Phi$:

* The $f$-probabilities of $\Phi$, which can be interpreted as the stationary regimes of the routing algorithm $f$ on $\Phi$. These probabilities are defined from the compactification of the action of the semigroup of point-map translations on the space of Palm probabilities. The $f$-probabilities of $\Phi$ are not always Palm distributions.

* The $f$-foliation of $\Phi$, a partition of the support of $\Phi$ which is the discrete analogue of the stable manifold of $f$, i.e., the leaves of the foliation are the points of $\Phi$ with the same asymptotic fate for $f$. These leaves are not always stationary point processes. There always exists a point-map allowing one to navigate the leaves in a measure-preserving way.

**Gérard Biau (UPMC): Collaborative Inference**

(based on joint work with B. Cadre and K. Bleakley)

Abstract: The statistical analysis of massive and complex data sets will require the development of algorithms that depend on distributed computing and collaborative inference. Inspired by this, we propose a collaborative framework that aims to estimate the unknown mean $\theta$ of a random variable $X$. In the model we present, a certain number of calculation units, distributed across a communication network represented by a graph, participate in the estimation of $\theta$ by sequentially receiving independent data from $X$ while exchanging messages via a stochastic matrix $A$ defined over the graph. We give precise conditions on the matrix $A$ under which the statistical precision of the individual units is comparable to that of a (gold standard) virtual centralized estimate, even though each unit does not have access to all of the data. We show in particular the fundamental role played by both the non-trivial eigenvalues of $A$ and the Ramanujan class of expander graphs, which provide remarkable performance for moderate algorithmic cost.

**Christian Borgs (MSR): Graphons and Machine Learning: Modeling and Estimation of Sparse Massive Networks – Part II**

Abstract: There are numerous examples of sparse massive networks, in particular the Internet, WWW and online social networks. How do we model and learn these networks? In contrast to conventional learning problems, where we have many independent samples, it is often the case for these networks that we can get only one independent sample. How do we use a single snapshot today to learn a model for the network, and therefore be able to predict a similar, but larger network in the future? In the case of relatively small or moderately sized networks, it’s appropriate to model the network parametrically, and attempt to learn these parameters. For massive networks, a non-parametric representation is more appropriate. In this talk, we first review the theory of graphons, developed over the last decade to describe limits of dense graphs, and the more recent theory describing sparse graphs of unbounded average degree, including power-law graphs. We then show how to use these graphons as non-parametric models for sparse networks. Finally, we show how to get consistent estimators of these non-parametric models, and moreover how to do this in a way that protects the privacy of individuals on the network.

Part I of this talk reviews the theory of graph convergence for dense and sparse graphs. Part II uses the results of Part I to model and estimate massive networks.

**Léon Bottou (Facebook): Beyond statistical machine learning**

Abstract: The assumption that both training and test examples are sampled from the same distribution plays a central role in the theory of machine learning. This assumption used to be reasonable for many applications of machine learning. This presentation first provides evidence that the current practice is increasingly departing from this ideal. This is especially true in the deep learning community. Since this situation generates interesting conceptual challenges, the rest of the presentation describes my attempts to identify the fundamental causes of this departure, to describe the corresponding challenges and opportunities, and to predict the conceptual changes that are required to understand “post-statistical” machine learning and exploit the opportunities it offers.

**Jennifer Chayes (MSR): Graphons and Machine Learning: Modeling and Estimation of Sparse Massive Networks – Part I**

Abstract: There are numerous examples of sparse massive networks, in particular the Internet, WWW and online social networks. How do we model and learn these networks? In contrast to conventional learning problems, where we have many independent samples, it is often the case for these networks that we can get only one independent sample. How do we use a single snapshot today to learn a model for the network, and therefore be able to predict a similar, but larger network in the future? In the case of relatively small or moderately sized networks, it’s appropriate to model the network parametrically, and attempt to learn these parameters. For massive networks, a non-parametric representation is more appropriate. In this talk, we first review the theory of graphons, developed over the last decade to describe limits of dense graphs, and the more recent theory describing sparse graphs of unbounded average degree, including power-law graphs. We then show how to use these graphons as non-parametric models for sparse networks. Finally, we show how to get consistent estimators of these non-parametric models, and moreover how to do this in a way that protects the privacy of individuals on the network.

Part I of this talk reviews the theory of graph convergence for dense and sparse graphs. Part II uses the results of Part I to model and estimate massive networks.

**Amin Coja-Oghlan (Mathematics Institute, Goethe University, Frankfurt): The k-core revisited**

Abstract: The k-core of a graph is the maximum subgraph of minimum degree k. In an important paper Pittel, Wormald and Spencer (JCTB 1996) identified the threshold for the emergence of a non-empty k-core in the random graph G(n,m) as well as the asymptotic size of the core beyond the threshold. This was followed up by a sequence of papers that determined, among other things, the asymptotic distribution of the number of vertices and edges in the k-core. In this talk, which is based on joint work with Oliver Cooley, Mihyun Kang and Kathrin Skubch, I will present a novel approach to the k-core problem that employs local weak convergence and the Warning Propagation message passing scheme.
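To make the definition in the abstract above concrete, here is a minimal Python sketch that computes the k-core by the classical peeling process (repeatedly deleting vertices of degree below k). This is not the Warning Propagation approach of the talk; the function name `k_core` and the toy graph are illustrative assumptions.

```python
from collections import deque


def k_core(adj, k):
    """Return the vertex set of the k-core of an undirected graph.

    adj maps each vertex to the set of its neighbours (assumed symmetric).
    Classical peeling: delete vertices of degree < k until none remain.
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    removed = set()
    queue = deque(v for v, d in degree.items() if d < k)
    while queue:
        v = queue.popleft()
        if v in removed:
            continue
        removed.add(v)
        # Deleting v lowers the degree of each surviving neighbour.
        for w in adj[v]:
            if w not in removed:
                degree[w] -= 1
                if degree[w] < k:
                    queue.append(w)
    return set(adj) - removed


# Hypothetical toy graph: a triangle {0, 1, 2} with a pendant path 2-3-4.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
core2 = k_core(toy, 2)  # the triangle survives, the path is peeled away
```

The peeling order does not affect the result, which is why a simple queue suffices.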

**Ronen Eldan (University of Washington): The Power of Depth for Feedforward Neural Networks.**

Abstract: We construct simple functions on $\mathbb{R}^d$, expressible by small 3-layer feedforward neural networks, which cannot be approximated by any 2-layer network, to more than a certain constant accuracy, unless its width is exponential in the dimension. The result holds for a large class of activation functions, including rectified linear units and sigmoids, and is a formal demonstration that depth, even if increased by 1, can be exponentially more valuable than width for standard feedforward neural networks. Moreover, compared to related results in the context of Boolean functions, our result requires fewer assumptions, and the proof techniques and construction are very different. Joint work with Ohad Shamir.

**Bruce Hajek (UIUC): Algorithms and the Computational Gap for Recovering a Single Community in a Network**

Abstract: The stochastic block model for one community is considered. Out of $n$ vertices a community consisting of $K$ vertices is drawn uniformly at random; two vertices are connected by an edge with probability $p$ if they both belong to the community and with probability $q$ otherwise, where $p > q > 0$ and $p/q$ is assumed to be bounded. A critical role is played by the effective signal-to-noise ratio $K^2(p-q)^2/((n-K)q)$. The main focus of this work is on weak recovery, with $o(K)$ misclassified vertices on average, in the sublinear regime $K=o(n)$. It is explained that most results readily translate to the problem of exact recovery with high probability if a linear time voting procedure is used for cleanup. It is shown that weak recovery is provided by a belief propagation algorithm running in near linear time if the signal-to-noise ratio exceeds $1/e$, and, conversely, if the signal-to-noise ratio is less than $1/e$ then no local algorithm can provide weak recovery. It is argued that the belief propagation algorithm outperforms spectral methods by a factor of $e$ in signal-to-noise ratio. Similar results are obtained for the problem with Gaussian observations with an elevated mean for pairs of vertices in the same community. It is found that these iterative algorithms, and also a semidefinite programming (SDP) algorithm, fall short of the information theoretic limit for recovery for $K=o(n/\log n)$ and achieve the information limit for $K=\omega(n/\log n)$.

Joint work with Yihong Wu and Jiaming Xu.
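The effective signal-to-noise ratio quoted in the abstract above can be evaluated directly. The following Python sketch computes it and compares it with the 1/e threshold stated for belief propagation; the function names and the numerical values are illustrative assumptions, and the abstract's statements are asymptotic, so a finite-size comparison is only indicative.

```python
import math


def effective_snr(n, K, p, q):
    """Effective signal-to-noise ratio K^2 (p - q)^2 / ((n - K) q)."""
    if not (0 < q < p <= 1) or not (0 < K < n):
        raise ValueError("need 0 < q < p <= 1 and 0 < K < n")
    return K**2 * (p - q) ** 2 / ((n - K) * q)


def above_bp_threshold(n, K, p, q):
    """True if the ratio exceeds 1/e, the threshold quoted in the abstract."""
    return effective_snr(n, K, p, q) > 1 / math.e


# Hypothetical parameters: a community of 100 vertices among 10,000.
weak_signal = above_bp_threshold(10_000, 100, 0.05, 0.01)    # ratio = 16/99
strong_signal = above_bp_threshold(10_000, 100, 0.08, 0.01)  # ratio = 49/99
```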

**Ravi Kannan (MSR): The planted Gaussian problem**

Abstract: Suppose $A$ is an $n$ by $n$ matrix whose random entries are each $N(0,1)$ except in a $k$ by $k$ principal minor (called the planted part) where they are each $N(\mu,\sigma^2)$. For what values of $\mu, \sigma$ can we recover the planted part even when $k \in o(\sqrt{n})$? For $\mu = 0$, we will show that if $\sigma^2 \geq 2$, indeed, we can do this. Also, we show that if $\mu = 0$ and $\sigma^2 < 2$, then no polynomial time statistical algorithm can do this. We consider questions with other values of $\mu, \sigma^2$ and the connection, if any, of the problem to the Planted Clique problem and the spiked vector model.

Joint work with Santosh Vempala

**Florent Krzakala (ENS): Mutual information and Phase transitions in Low Rank matrix Factorizations**

Abstract: We consider a probabilistic estimation of a low-rank matrix from non-linear element-wise measurements, a problem to which sparse PCA, sub-matrix localization, or the detection of communities hidden in a dense stochastic block model, can be mapped. Using the cavity method from statistical physics, we show that the minimum mean squared error (MMSE) depends on the output channel only through a single parameter: its Fisher information. We characterize the MMSE achievable information theoretically and with the message passing algorithm. Finally, we also show how to prove rigorously a large part of these results.

**Andrea Montanari (Stanford University): Phase transitions in semidefinite programming**

Abstract: Semidefinite programming (SDP) relaxations are among the most powerful algorithmic tools for solving high-dimensional statistical estimation problems. They are surprisingly well-suited for a broad range of tasks where data take the form of matrices or graphs. It has been observed several times that, when the ‘statistical noise’ is small enough, SDP relaxations correctly detect the underlying combinatorial structures.

I will present asymptotic predictions for several ‘detection thresholds,’ as well as for the estimation error above these thresholds. In particular, I will consider classical SDP relaxations for statistical problems motivated by graph synchronization and community detection in networks.

[Based on joint papers with Subhabrata Sen, and with Adel Javanmard and Federico Ricci-Tersenghi]

**Cris Moore (Santa Fe Institute): Information-theoretic bounds and phase transitions in community detection and high-dimensional clustering**

Over the past decade, a rich interaction has grown up between statistical physics and statistical inference. In particular, physics often predicts phase transitions where, if a data set becomes too sparse or too noisy, it suddenly becomes impossible to find its underlying structure, or even to distinguish it from a “null model” with no structure at all. For instance, in the community detection problem in networks, there is a transition below which the vertices cannot be labeled better than chance, and the network cannot be distinguished from an Erdos-Renyi random graph. Similarly, in the clustering problem in Gaussian mixture models, it becomes impossible to find the clusters, or even to tell whether clusters exist, i.e., to distinguish the data from a single Gaussian.

Many of the physics predictions have been made rigorous, but there is still a very lively interchange between the fields, both about these transitions and about asymptotically optimal algorithms. In particular, while efficient message-passing and spectral algorithms are known that succeed down to the so-called Kesten-Stigum bound, in a number of problems we believe that it is information-theoretically possible (but perhaps not computationally feasible) to go further. I will give a friendly introduction to this field, and present some new results proving this conjecture.

This is joint work with Jess Banks, Joe Neeman, Praneeth Netrapalli, and Jiaming Xu.

**Elchanan Mossel (Wharton, University of Pennsylvania): Shotgun Assembly of Graphs**

Abstract: We will present some results and some open problems related to shotgun assembly of graphs for random generating models.

Shotgun assembly of graphs is the problem of recovering a random graph or a randomly labelled graph from small pieces.

The question of shotgun assembly presents novel problems in random graphs, percolation, and random constraint satisfaction problems.

Based on joint works with Nathan Ross, Nike Sun and Uri Feige.

**Joe Neeman (UT Austin): Robust reconstruction and optimal detection in the stochastic block model**

Abstract: Consider a two-type branching process with Poisson offspring distribution. In the tree reconstruction problem, we get to observe the types of the current generation and we try to guess the type of their long-ago ancestor. In the robust tree reconstruction problem, we only get noisy observations of the current generation’s types. Remarkably, this extra noise doesn’t seem to affect the accuracy with which we can guess the ancestor’s type. We will discuss this phenomenon and an application to community detection in the stochastic block model.

**Yann Ollivier (CNRS): Invariance principles for robust learning. An illustration with recurrent neural networks.**

Abstract: The optimization methods used to learn models of data are often not invariant under simple changes in the representation of data or of intermediate variables. For instance, for neural networks, using neural activities in [0;1] or in [-1;1] can lead to very different final performance even though the two representations are isomorphic. Here we show how information theory, together with a Riemannian geometric viewpoint emphasizing independence from the details of data representation, leads to new, scalable algorithms for training models of sequential data, which detect more complex patterns and use fewer training samples.

For the talk, no familiarity will be assumed with Riemannian geometry, neural networks, information theory, or statistical learning.

**Yuval Peres (MSR): Rigidity and Tolerance for Perturbed Lattices**

Abstract: Consider a perturbed lattice $\{v+Y_v\}$ obtained by adding IID $d$-dimensional Gaussian variables $\{Y_v\}$ to the lattice points in $\mathbb{Z}^d$.

Suppose that one point, say $Y_0$, is removed from this perturbed lattice; is it possible for an observer, who sees just the remaining points, to detect that a point is missing?

In one and two dimensions, the answer is positive: the two point processes (before and after $Y_0$ is removed) can be distinguished by counting points in a large ball and averaging over its radius (cf. Sodin-Tsirelson (2004) and Holroyd and Soo (2011)). The situation in higher dimensions is more delicate, as this counting approach fails; our solution depends on a game-theoretic idea, in one direction, and on the unpredictable paths constructed by Benjamini, Pemantle and the speaker (1998), in the other. At the end of the talk I will discuss the (speculative) relevance of the solution to the planted clique problem. (Joint work with Allan Sly).

**Devavrat Shah (MIT): Blind Regression**

Abstract: We introduce the framework of *Blind Regression*, motivated by the problem of *Matrix Completion* for recommendation systems: given $n$ users and $m$ movies, the goal is to predict the unknown rating of a user for a movie using known observations, i.e. completing the partially observed matrix. Following the framework of non-parametric statistics, we posit that user $u$ and movie $i$ have features $x_1(u)$ and $x_2(i)$ respectively, and that their corresponding rating is $y_{ui} = f(x_1(u), x_2(i))$ for some unknown function $f$. In the setting of classical regression, for each known rating $y_{ui}$, the features $x = (x_1(u), x_2(i))$ are observed and the goal is to estimate the function $f$ using this data so as to predict the unknown ratings. In the setting of Matrix Completion, we *do not* observe the features, hence the term *Blind Regression*. This makes it challenging to predict the rating for an unknown user-movie pair.

Using inspiration from the classical Taylor’s expansion for differentiable functions, we provide a prediction algorithm that is consistent for all Lipschitz continuous functions. We provide finite sample analysis that suggests that even when observing a vanishing fraction of the matrix, the algorithm produces accurate predictions. Connections to classical collaborative filtering algorithms as well as Tensor completion will be explained.

Joint work with Christina Lee, Yihua Li and Dogyoon Song (MIT)

**Ruediger Urbanke (EPFL): TBD**

**Ulrike von Luxburg (Universität Tübingen): Geometry of unweighted k-nearest neighbor graphs**

Consider an unweighted k-nearest neighbor graph that has been built on a random sample from some unknown density p on R^d. Assume we are given nothing but the unweighted (!) adjacency matrix of the graph: we know who are the k nearest neighbors of whom, but we do not know the point locations or any distances or similarity values between the points. Is it then possible to recover the original point configuration or estimate the underlying density p, just from the adjacency matrix of the unweighted graph? I will present the answer to the question in my talk, as well as relations to the problem of ordinal embedding.
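To fix ideas about the input described in the abstract above, here is a minimal Python sketch that builds the unweighted adjacency matrix of a k-nearest neighbor graph from a point sample. The directed convention (row i marks the k nearest neighbors of point i), the Euclidean metric, and the name `knn_adjacency` are assumptions made for illustration; the recovery question posed in the talk starts from this matrix alone, with the points discarded.

```python
import math


def knn_adjacency(points, k):
    """Unweighted, directed k-NN adjacency matrix under Euclidean distance.

    adj[i][j] == 1 exactly when j is one of the k nearest neighbours of i.
    Ties are broken by index order (an arbitrary choice for this sketch).
    """
    n = len(points)
    adj = [[0] * n for _ in range(n)]
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            adj[i][j] = 1
    return adj


# Hypothetical sample of four points on a line.
sample = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
adjacency = knn_adjacency(sample, 1)
```

Note that the resulting matrix is not symmetric in general: point 2 has point 1 as its nearest neighbor, but not conversely.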

**Lenka Zdeborová (CEA): Solvable model of unsupervised feature learning**

Abstract: We introduce factorization of certain randomly generated matrices as a probabilistic model for unsupervised feature learning. Using the replica method we obtain an exact closed formula for the minimum mean-squared error and sample complexity for reconstruction of the features from data. An alternative way to derive this formula is via state evolution of the associated approximate message passing. Analyzing the fixed points of this formula we identify a gap between computationally tractable and information theoretically optimal inference, associated to a first order phase transition. The model can serve for benchmarking generic-purpose feature learning algorithms, and its analysis provides insight into theoretical understanding of unsupervised learning from data.

**Riccardo Zecchina (Politecnico di Torino): TBD**

The post Workshop on Community Detection at Institut Henri Poincaré on Feb 26-27 appeared first on MSR-INRIA.

Thursday February 26, 2015

9:30 Welcome

10am-10:45 **Lenka Zdeborová** (CEA), Asymptotically exact analysis of the sparse stochastic block model [slides]

10:45-11:30 **Martin Weigt** (UPMC)

11:30-12:15 **Charles Bordenave** (CNRS), Non-backtracking spectrum of random graphs [slides]

12:15-2pm Lunch break

2pm-2:45 **Joe Neeman** (UT Austin), Detection and recovery in the two-part symmetric stochastic block model

2:45-3:30 **Milan Vojnovic** (MSR), Load Balancing Tasks with Overlapping Requirements

3:30-4pm Coffee Break

4pm-4:45 **Pierre Borgnat** (ENS-Lyon), Multiscale community mining in networks with wavelets [slides]

4:45-5:30 **Emmanuel Abbe** (Princeton), Recovery thresholds in block models [slides]

Friday February 27, 2015

9:30-10:15 **Bruce Hajek** (UIUC), Detecting a community in a relatively sparse graph: understanding the fundamental limits of polynomial time algorithms [slides]

10:15-11am **Sofia Olhede** (UCL), Understanding Large Networks using Blockmodels

11am-11:30 Coffee Break

11:30-12:15 **Seyoung Yun** (MSR-Inria), joint work with **Alexandre Proutière** (KTH), Community detection with sampling and streaming [slides]

12:15-1pm **Yudong Chen** (Berkeley), Statistical-Computational Tradeoffs in Community Detection

1pm-2pm Lunch Break

2pm-2:45 **Florent Krzakala** (ENS Ulm), The Bethe Hessian Operator [slides]

2:45-3:15 Closing Remarks (open problems / future directions)

**Abstracts**

**Charles Bordenave**, Non-backtracking spectrum of random graphs

The non-backtracking matrix of a graph is a non-symmetric matrix on the oriented edges of a graph which has interesting algebraic properties. It appears notably in connection with the Ihara zeta function and in some generalizations of Ramanujan graphs. Recently, Krzakala, Moore, Mossel, Neeman, Sly, Zdeborová and Zhang have advocated the use of this matrix in the context of community detection to bypass some limitations of usual spectral clustering methods on sparse graphs. In this talk, we will study the largest eigenvalues of this matrix for the Erdos-Renyi graph G(n,c/n) and for simple inhomogeneous random graphs (stochastic block model). Our results confirm predictions from Krzakala et al. This is joint work with Marc Lelarge and Laurent Massoulié.
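As an illustration of the object discussed in the abstract above, the following Python sketch builds the non-backtracking matrix of a small graph. The convention used here is an assumption (some authors use the transpose): the entry indexed by oriented edges (u, v) and (x, y) equals 1 when x = v and y ≠ u, that is, when the second edge continues the first without immediately turning back. The function name is hypothetical.

```python
def non_backtracking_matrix(edges):
    """Build the non-backtracking matrix of a simple undirected graph.

    edges: list of undirected edges (u, v), u != v, without duplicates.
    Returns (oriented, B), where oriented lists the 2m oriented edges and
    B[i][j] == 1 iff oriented[j] starts where oriented[i] ends and is not
    the reversal of oriented[i].
    """
    oriented = []
    for u, v in edges:
        oriented.append((u, v))
        oriented.append((v, u))
    m = len(oriented)
    B = [[0] * m for _ in range(m)]
    for i, (u, v) in enumerate(oriented):
        for j, (x, y) in enumerate(oriented):
            if x == v and y != u:
                B[i][j] = 1
    return oriented, B


# Hypothetical example: a triangle on vertices 0, 1, 2.
oriented, B = non_backtracking_matrix([(0, 1), (1, 2), (0, 2)])
```

The matrix has one row per oriented edge, so it is 2m by 2m for a graph with m edges, and the row of (u, v) sums to the degree of v minus one.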

**Joe Neeman**, Detection and recovery in the two-part, symmetric stochastic block model

The stochastic block model is a simple model for communities in random graphs. Its simplest incarnation is the two-part, symmetric case, in which n vertices are divided into two classes of equal size; then edges are added independently, with probability p_n for an edge within a class and probability q_n for an edge between classes. We will discuss the problem of inferring the classes given the graph. Depending on p_n and q_n, there are four things that could happen:

1) nothing can be learned about the classes;

2) one can find a partition that is correlated with the classes;

3) one can recover the classes, up to o(n) errors; or

4) one can recover the classes exactly.
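The model described above is easy to simulate. The following is a minimal sketch, assuming fixed probabilities `p` and `q` in place of the sequences p_n and q_n; the helper names `sample_two_part_sbm` and `agreement` are hypothetical and only illustrate the setup.

```python
import random

def sample_two_part_sbm(n, p, q, seed=0):
    """Sample labels and an edge list from the two-part symmetric model."""
    if n % 2 != 0:
        raise ValueError("n must be even so the two classes have equal size")
    rng = random.Random(seed)
    # Two classes of equal size, assigned to vertices in random order.
    labels = [0] * (n // 2) + [1] * (n // 2)
    rng.shuffle(labels)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            # Independent edges: probability p within a class, q between classes.
            prob = p if labels[u] == labels[v] else q
            if rng.random() < prob:
                edges.append((u, v))
    return labels, edges

def agreement(labels, guess):
    """Fraction of vertices labeled correctly, up to swapping the class names."""
    matches = sum(a == b for a, b in zip(labels, guess))
    return max(matches, len(labels) - matches) / len(labels)

labels, edges = sample_two_part_sbm(n=10, p=0.8, q=0.1, seed=1)
```

Read against the four outcomes, an agreement that stays near 1/2 corresponds roughly to case 1, one bounded away from 1/2 to case 2, one tending to 1 to case 3, and one exactly equal to 1 to case 4. The maximum over the two labelings is there because the classes can only be identified up to swapping their names.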

**Milan Vojnovic**, Load Balancing Tasks with Overlapping Requirements

In several data centre applications, computational tasks are processed in a distributed system such that each task requires a set of distinct data inputs. To ensure scalability, the assignment of tasks to machines should guarantee that tasks requiring similar data inputs are collocated, and that the load is balanced across machines. In this talk we shall consider the problem of load balancing tasks with overlapping requirements, specified by an input bipartite graph which defines the set of requirements needed by individual tasks, under different definitions of the processing load of a machine. For example, the load of a machine may be proportional to the number of distinct requirements needed to serve the set of tasks assigned to this machine. In the online task assignment problem, tasks arrive one at a time and each task needs to be irrevocably assigned to one of the machines at its arrival time. We shall present results on these types of generalized load balancing problems for both arbitrary and hidden cluster random graph inputs.
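As an illustration of the online setting, here is a minimal sketch of one simple greedy rule, using the example load definition from the abstract (the number of distinct requirements on a machine). The rule, the function name `greedy_assign`, and the toy tasks are assumptions for illustration only; the abstract does not say which algorithms the talk analyzes.

```python
def greedy_assign(tasks, num_machines):
    """Assign each arriving task to the machine whose load would be smallest.

    tasks: iterable of requirement sets, processed in arrival order.
    Load of a machine = number of distinct requirements of its tasks.
    Returns (assignment, loads); assignments are never revised.
    """
    machine_reqs = [set() for _ in range(num_machines)]
    assignment = []
    for reqs in tasks:
        reqs = set(reqs)
        # Resulting load if the task went to machine m; ties go to the lowest index.
        best = min(
            range(num_machines),
            key=lambda m: (len(machine_reqs[m] | reqs), m),
        )
        machine_reqs[best] |= reqs
        assignment.append(best)
    return assignment, [len(r) for r in machine_reqs]

# Toy input: two groups of tasks with identical requirements.
tasks = [{"a", "b"}, {"c", "d"}, {"a", "b"}, {"c", "d"}]
assignment, loads = greedy_assign(tasks, num_machines=2)
# assignment == [0, 1, 0, 1] and loads == [2, 2]
```

On this input the rule collocates tasks that share requirements, because adding a task to a machine that already holds its inputs does not increase that machine's load.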

**Bruce Hajek**, Detecting a community in a relatively sparse graph: understanding the fundamental limits of polynomial time algorithms

This talk focuses on the problem of finding an underlying community, or dense subgraph, within a network using only knowledge of the network topology. We consider the planted cluster model, which is a simple extension of the classical Erdős-Rényi random graph. We derive a semidefinite programming (SDP) relaxation of the maximum likelihood estimator for recovering the community from the network. If the size of the community is linear in the total number of vertices, the performance guarantee of the SDP exactly matches the necessary information bound. However, if the community size is sub-linear in the total number of vertices, the performance guarantee of the SDP is far from the information limit. Building on average-case reductions, we show that, for both a community detection problem and a community recovery problem, there is a significant gap between the information limit and what computationally efficient procedures can achieve, conditional on the assumption that some instances of the planted clique problem cannot be solved in randomized polynomial time.

Based on joint work, available at arXiv 1406.6625 and 1412.6156, with Prof. Yihong Wu (Univ. Illinois) and Dr. Jiaming Xu (formerly Illinois, now at Wharton School of Statistics, U. Pennsylvania).

**Sofia Olhede**, Understanding Large Networks using Blockmodels

Networks have become pervasive in practical applications. Understanding large networks is hard, especially because a number of typical features present in such observations create technical analysis challenges. I will discuss some basic network models that are tractable for analysis, what sampling properties they can reproduce, and some results relating to their inference. I will especially touch on the importance of the stochastic block model as an analysis tool.

This is joint work with Patrick Wolfe (UCL).

**Pierre Borgnat**, Multiscale community mining in networks with wavelets

(joint work with N. Tremblay)

A signal processing approach is developed for the multiscale detection of communities in networks. The method relies on spectral graph wavelets, which provide a local, ego-centered view of the graph as seen from each node at a specific scale. This makes it possible to find communities of nodes according to the scale of analysis. We will show how to make the method suitable for the analysis of real data (addressing scalability issues and introducing a way to test the relevance of the clustering) and how it compares to other methods (for instance, methods using random walks), and we will discuss some applications to the analysis of social networks, simplified sensor networks, and genomics data.

**Yudong Chen**, Statistical-Computational Tradeoffs in Community Detection

The problem of graph clustering and community detection concerns identifying densely connected groups of nodes in a graph. In this talk we show that one can use computationally more expensive algorithms to achieve better statistical performance. In particular, we consider the classical planted clustering model (a.k.a. stochastic blockmodel), and show that the parameter space can be partitioned into four regimes (“simple”, “easy”, “hard”, and “impossible”), which correspond to progressively noisier problems such that: (1) a near-linear time algorithm succeeds in the “simple” regime, but provably fails in the “easy” and “hard” regimes; (2) a polynomial time algorithm succeeds in the “easy” regime, but provably fails in the “hard” regime; (3) a super-polynomial time algorithm succeeds in the “hard” regime, for which no polynomial time algorithm is known; (4) all algorithms fail in the “impossible” regime. Our results apply to the setting with an unbounded number of clusters.

Bio: Yudong Chen is currently a postdoc in EECS at UC Berkeley with Martin Wainwright. He obtained his Ph.D. in ECE from the University of Texas at Austin in 2013. His research interests include high-dimensional and robust statistics, convex optimization, and applications in community detection.

The post Workshop on Community Detection at Institut Henri Poincaré on Feb 26-27 appeared first on MSR-INRIA.

]]>The post NIPS’14 Workshop on Optimization for Machine Learning appeared first on MSR-INRIA.

]]>Webpage: http://opt-ml.org/

]]>The post New Scientific Report available appeared first on MSR-INRIA.

]]>

]]>