The 2nd Workshop on Machine Learning and Optimization at the ISM
Taking advantage of the simultaneous presence of several international
researchers in Tokyo, an informal workshop on topics blending Statistical
Modelling and Optimization approaches to Machine Learning will be held
on October 12th, 2007, at the Institute of Statistical Mathematics, Tokyo,
Japan. Many thanks to all speakers and participants!
NEW: Some slides can be downloaded below!
10:00 - 10:10 | Opening Remarks | |
10:10 - 11:05 | Collaborative filtering with kernels and spectral regularization | Jean-Philippe Vert |
11:05 - 12:00 | Bundle Methods for Machine Learning | Alexander J. Smola |
12:00 - 13:30 | Lunch break | |
13:30 - 14:25 | Hilbert Space Representations of Probability Distributions | Arthur Gretton |
14:25 - 15:20 | Measuring conditional dependence with kernels | Kenji Fukumizu |
15:20 - 15:40 | Break | |
15:40 - 16:35 | Cluster Identification in Nearest-Neighbor Graphs | Markus Maier |
16:35 - 17:30 | Epagogics: Beyond Newtonian deduction based paradigm towards 'universal' induction machines | Kunio Tanabe |
[ Access ]
Please follow this link for access information. The Workshop will be held in the conference room
(2F).
[ Organizers ]
Please contact Tomoko Matsui for any questions
regarding the workshop.
[ Detailed Program And Slides ]
slides | Collaborative filtering with kernels and spectral regularization | Jean-Philippe Vert |
I will present a general framework for Collaborative Filtering (CF), which
is the task of learning preferences of users for products, such as books
or movies, from a set of known preferences. A standard approach to CF is
to find a low-rank, or low trace norm, approximation to a partially observed
matrix of user preferences. We generalize this approach to estimation of
a compact operator, of which matrix estimation is a special case. We develop
a notion of spectral regularization which captures both rank constraint
and trace norm regularization. The major advantage of this approach is
that it provides a natural method of utilizing side-information, such as
age and gender, about the users (or objects) in question - previously a challenging
limitation of the low-rank approach. We provide a number of algorithms,
and test them on a standard CF dataset, with promising results. This
is a joint work with Jacob Abernethy, Francis Bach and Theodoros Evgeniou. |
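The trace-norm idea in this abstract can be illustrated with a short proximal-gradient sketch (not the authors' implementation; the toy data, regularization weight tau, and iteration count are illustrative choices). Soft-thresholding the singular values is the proximal operator of the trace norm, so alternating a gradient step on the observed entries with singular-value shrinkage completes a partially observed preference matrix:

```python
import numpy as np

def soft_threshold_svd(A, tau):
    # Proximal operator of the trace norm: shrink singular values by tau.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def complete_matrix(M, mask, tau=0.1, n_iter=200):
    # Proximal gradient for: min_X 0.5 * ||mask * (X - M)||^2 + tau * ||X||_*
    X = np.zeros_like(M)
    for _ in range(n_iter):
        X = soft_threshold_svd(X - mask * (X - M), tau)
    return X

rng = np.random.default_rng(0)
u, v = rng.standard_normal(8), rng.standard_normal(8)
M = np.outer(u, v)                    # rank-1 "preference" matrix
mask = rng.random(M.shape) < 0.7      # ~70% of entries observed
X = complete_matrix(M, mask)
```

Spectral regularization generalizes this: any penalty applied to the singular values (a rank cap, the shrinkage above, or something in between) fits the same template.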
slides | Bundle Methods for Machine Learning | Alexander J. Smola | We present a globally convergent method for regularized risk minimization problems.
Our method applies to Support Vector estimation, regression, Gaussian Processes,
and any other regularized risk minimization setting which leads to a convex
optimization problem. SVMPerf can be shown to be a special case of our
approach. In addition to the unified framework we present tight convergence
bounds, which show that our algorithm converges in O(1/ε) steps to ε precision for general convex problems, and in O(log(1/ε)) steps for continuously differentiable problems. We demonstrate the performance of our approach in experiments. |
slides | Hilbert Space Representations of Probability Distributions | Arthur Gretton |
Many problems in unsupervised learning require the analysis of features
of probability distributions. At the most fundamental level, we might wish
to determine whether two distributions are the same, based on samples from
each - this is known as the two-sample or homogeneity problem. We use kernel
methods to address this problem, by mapping probability distributions to
elements in a reproducing kernel Hilbert space (RKHS). Given a sufficiently
rich RKHS, these representations are unique: thus comparing feature space
representations allows us to compare distributions without ambiguity. Applications
include testing whether cancer subtypes are distinguishable on the basis
of DNA microarray data, and whether low frequency oscillations measured
at an electrode in the cortex have a different distribution during a neural
spike.
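The two-sample statistic sketched in this paragraph — the RKHS distance between mean embeddings, usually called the maximum mean discrepancy (MMD) — can be estimated directly from Gram matrices. A minimal sketch (not the test from the talk; the Gaussian kernel, bandwidth, and toy data are assumptions):

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Gaussian RBF Gram matrix between the rows of X and Y.
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    # Biased estimate of the squared distance between the kernel
    # mean embeddings of the two samples.
    return (rbf_kernel(X, X, sigma).mean()
            + rbf_kernel(Y, Y, sigma).mean()
            - 2 * rbf_kernel(X, Y, sigma).mean())

rng = np.random.default_rng(0)
P = rng.standard_normal((200, 2))
Q_same = rng.standard_normal((200, 2))         # same distribution as P
Q_diff = rng.standard_normal((200, 2)) + 1.5   # mean-shifted distribution
same, diff = mmd2(P, Q_same), mmd2(P, Q_diff)
```

With a sufficiently rich (characteristic) kernel, the statistic is near zero only when the two distributions coincide, which is what makes the comparison unambiguous.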
A more difficult problem is to discover whether two random variables drawn
from a joint distribution are independent. It turns out that any dependence
between pairs of random variables can be encoded in a cross-covariance
operator between appropriate RKHS representations of the variables, and
we may test independence by looking at a norm of the operator. We demonstrate
this independence test by establishing dependence between an English text
and its French translation, as opposed to French text on the same topic
but otherwise unrelated. Finally, we show that this operator norm is itself
a difference in feature means. |
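The cross-covariance operator norm of the second paragraph also has a compact empirical form (known as HSIC). A sketch under a Gaussian-kernel assumption, with toy numeric data standing in for the bilingual corpora:

```python
import numpy as np

def rbf_gram(Z, sigma=1.0):
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def hsic(X, Y, sigma=1.0):
    # Biased estimate of the squared HS norm of the empirical
    # cross-covariance operator: tr(K H L H) / n^2, H centering.
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(rbf_gram(X, sigma) @ H @ rbf_gram(Y, sigma) @ H) / n ** 2

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 1))
Y_dep = X + 0.1 * rng.standard_normal((200, 1))   # strongly dependent
Y_ind = rng.standard_normal((200, 1))             # independent of X
hsic_dep, hsic_ind = hsic(X, Y_dep), hsic(X, Y_ind)
```

The closing remark of the abstract corresponds to the fact that this statistic equals the MMD between the embedded joint distribution and the embedded product of the marginals.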
slides | Measuring conditional dependence with kernels | Kenji Fukumizu |
We propose a new measure of conditional dependence of random variables,
based on normalized cross-covariance operators on reproducing kernel Hilbert
spaces. Unlike previous kernel dependence measures, the proposed criterion
does not depend on the choice of kernel in the limit of infinite data,
for a wide class of kernels. At the same time, it has a straightforward
empirical estimate with good convergence behaviour. In the special case
of unconditional dependence, the measure is exactly the same as the mean
square contingency, which is one of the popular measures of dependence.
We discuss the theoretical properties of the measure, and demonstrate its
application in experiments. |
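In the unconditional special case, the normalized measure described above can be sketched as follows (a toy illustration, not the authors' code; the Gaussian kernel, regularization constant eps, and data are assumptions): each centered Gram matrix K is normalized as R = K(K + n·eps·I)^(-1), and dependence is scored by tr(R_X R_Y).

```python
import numpy as np

def rbf_gram(Z, sigma=1.0):
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def normalized_dependence(X, Y, eps=1e-2, sigma=1.0):
    # Normalized cross-covariance score: tr(R_X R_Y), where
    # R = K (K + n*eps*I)^{-1} and K is the centered Gram matrix.
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n
    def normalize(Z):
        K = H @ rbf_gram(Z, sigma) @ H
        return K @ np.linalg.inv(K + n * eps * np.eye(n))
    return np.trace(normalize(X) @ normalize(Y))

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1))
dep = normalized_dependence(X, np.sin(3 * X) + 0.1 * rng.standard_normal((100, 1)))
ind = normalized_dependence(X, rng.standard_normal((100, 1)))
```

The normalization is what removes the kernel dependence in the infinite-data limit: the eigenvalues of R saturate toward one as eps shrinks, rather than tracking the kernel's spectrum.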
| Cluster Identification in Nearest-Neighbor Graphs | Markus Maier |
Assume we are given a sample of points from some underlying distribution
which contains several distinct clusters. Our goal is to construct a neighborhood
graph on the sample points such that clusters are identified: that is,
the subgraph induced by points from the same cluster is connected, while
subgraphs corresponding to different clusters are not connected to each
other. We derive bounds on the probability that cluster identification
is successful, and use them to predict optimal values of k for the mutual
and symmetric k-nearest-neighbor graphs. We point out different properties
of the mutual and symmetric nearest-neighbor graphs related to the cluster
identification problem. |
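The mutual k-nearest-neighbor construction in this abstract can be sketched as follows (a minimal illustration with hypothetical toy clusters and an arbitrary choice of k, not the paper's experiments): two points are joined only if each is among the other's k nearest neighbors, and clusters are identified when connected components of the graph separate them.

```python
import numpy as np

def mutual_knn_graph(X, k):
    # Adjacency of the mutual k-NN graph: i ~ j iff each point is
    # among the other's k nearest neighbors.
    D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(D, np.inf)
    nn = np.argsort(D, axis=1)[:, :k]
    A = np.zeros(D.shape, dtype=bool)
    for i, neigh in enumerate(nn):
        A[i, neigh] = True
    return A & A.T                      # keep only mutual edges

def connected_components(A):
    # Label each vertex with its component via depth-first search.
    n = len(A)
    labels = -np.ones(n, dtype=int)
    c = 0
    for s in range(n):
        if labels[s] >= 0:
            continue
        stack, labels[s] = [s], c
        while stack:
            i = stack.pop()
            for j in np.nonzero(A[i])[0]:
                if labels[j] < 0:
                    labels[j] = c
                    stack.append(j)
        c += 1
    return labels

rng = np.random.default_rng(0)
# Two well-separated Gaussian clusters of 40 points each
X = np.vstack([rng.normal(0, 0.3, (40, 2)), rng.normal(5, 0.3, (40, 2))])
labels = connected_components(mutual_knn_graph(X, k=5))
```

The interesting regime is the choice of k: too small and a cluster fragments into several components, too large and distinct clusters merge, which is why the probability bounds in the talk yield an optimal k.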
| Epagogics: Beyond Newtonian deduction based paradigm towards 'universal' induction machines | Kunio Tanabe |