Probability models based on graphs, such as Markov random fields (MRFs), are used in many areas of the statistical and computational sciences. The problem of graphical model selection is the following:
given a set of samples drawn from some MRF, determine the unknown graph that underlies it. This problem arises in many application domains, including social network modeling, genome analysis, and spatial statistics. In the worst-case setting, it is known to be computationally intractable, but this does not preclude algorithms that are efficient for typical problems.
In the first part of this talk, we discuss some practical methods (meaning polynomial-time in problem parameters) for solving the graphical model selection problem. We show how methods based on $\ell_1$-regularized likelihood (or pseudolikelihood) can be used to recover the graph structure efficiently. For discrete random variables on a graph with $p$ vertices and maximum degree $d$, we show that a practical algorithm succeeds with high probability using $n = \Omega(d^3 \log p)$ samples.
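To make the $\ell_1$-regularized pseudolikelihood idea concrete, here is a minimal sketch in Python/numpy: it samples from a small chain Ising model (a hypothetical setup chosen for illustration; the coupling strength, chain topology, and regularization level are assumptions, not values from the talk) and then recovers one node's neighborhood by $\ell_1$-regularized logistic regression of that node on all others, solved with simple proximal gradient (ISTA) steps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a 5-node chain Ising model with +/-1 spins.
p = 5
theta = 0.8  # edge coupling strength (assumed for illustration)
adj = np.zeros((p, p))
for i in range(p - 1):
    adj[i, i + 1] = adj[i + 1, i] = theta

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gibbs_sample(n, burn=500, thin=5):
    # For an Ising model, P(x_i = +1 | rest) = sigmoid(2 * sum_j theta_ij x_j).
    x = rng.choice([-1.0, 1.0], size=p)
    samples = []
    for t in range(burn + n * thin):
        for i in range(p):
            x[i] = 1.0 if rng.random() < sigmoid(2.0 * adj[i] @ x) else -1.0
        if t >= burn and (t - burn) % thin == 0:
            samples.append(x.copy())
    return np.array(samples[:n])

X = gibbs_sample(2000)

# l1-regularized logistic regression of node 0 on the remaining nodes
# (the pseudolikelihood / neighborhood-selection step), via ISTA.
node = 0
y = (X[:, node] + 1) / 2          # map spin in {-1,+1} to a {0,1} label
Z = np.delete(X, node, axis=1)    # the other nodes are the predictors
n = len(y)
lam = 0.5 * np.sqrt(np.log(p) / n)  # regularization ~ sqrt(log p / n)
w, step = np.zeros(p - 1), 0.5
for _ in range(2000):
    grad = Z.T @ (sigmoid(Z @ w) - y) / n        # logistic-loss gradient
    w = w - step * grad
    w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)  # soft-threshold

# Nonzero (above a small tolerance) coefficients give the estimated neighborhood.
neighbors = {j + (j >= node) for j in range(p - 1) if abs(w[j]) > 0.1}
print("estimated neighborhood of node 0:", neighbors)  # true neighborhood: {1}
```

Running neighborhood selection at every node and combining the estimated edge sets (e.g. by union or intersection) yields the full graph estimate; the $\sqrt{\log p / n}$ regularization level mirrors the sample-size scaling discussed above.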
For bounded degree graphs, this scaling allows the graph size to grow exponentially in the sample size. In the second part of the talk, we compare this scaling to the information-theoretic limits of graphical model selection. We prove that no algorithm, even one with unlimited computational power, can reliably recover the graph if the sample size $n$ is less than $c d^2 \log p$, for some constant $c$. We conclude by discussing various open issues and ongoing work in the area, including alternative loss functions and graph recovery with dependent samples.
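The exponential scaling of graph size in sample size follows from a back-of-the-envelope rearrangement of the sufficient condition (with $C$ an unspecified constant):

```latex
n \;\ge\; C\, d^{3} \log p
\quad\Longleftrightarrow\quad
p \;\le\; \exp\!\left(\frac{n}{C\, d^{3}}\right),
```

so for bounded degree $d$ the number of vertices $p$ may grow exponentially in $n$. Comparing the exponents of $d$ in the achievable rate $d^3 \log p$ and the lower bound $d^2 \log p$ shows the two results differ by at most a factor of $d$.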
Based on joint work with John Lafferty (CMU), Garvesh Raskutti (Berkeley), Pradeep Ravikumar (UT Austin), Prasad Santhanam (UW Hawaii), and Bin Yu (Berkeley).