ANU-ISM Workshop on Data Science (online)

March 24, 2021
Alan Welsh (ANU)
Koji Kanefuji (ISM)
【Program Committee】
Tao Zou (ANU)
Shogo Kato (ISM)
This timetable is given in Japan Standard Time. Australian Eastern Daylight Time is 2 hours ahead of Japan Standard Time.
The Australian National University (ANU) and the Institute of Statistical Mathematics (ISM) signed a Memorandum of Understanding (MOU) in 2014.
Program (in Japan Standard Time)
11:00-11:05 Welcome Address
Hiroe Tsubaki (ISM)
11:05-11:55 Keynote Speech [Chair: Kunio Shimizu, ISM]

▼"A CLT for the intrinsic Fréchet mean on compact Riemannian manifolds"
Andrew Wood (ANU)

In object data analysis the choice of metric is typically a central question. In particular situations where the objects of interest (e.g. signed or unsigned directions or the shape of a configuration of points) can be represented as points on a Riemannian manifold, when calculating a Fréchet mean, should one use the intrinsic Riemannian metric or embed the manifold in a Euclidean space and use the resulting Euclidean metric? In this talk a new result is presented which shows that in some cases the CLT for the Fréchet mean based on the intrinsic Riemannian metric exhibits non-standard behaviour. The cause of such non-standard behaviour is irregular behaviour at the cut locus of the population Fréchet mean, as will be explained in the talk. Cases in which this non-standard behaviour arises will be precisely characterised and implications of these findings will be discussed.

This is joint work with Thomas Hotz (Ilmenau, Germany) and Huiling Le (Nottingham, UK).

12:00-13:20 Session I [Chair: Francis Hui, ANU]

▼"Elliptical symmetry models and robust estimation methods on spheres"
Janice Scealy (ANU)

First, a new distribution is proposed for analysing directional data that is a novel transformation of the von Mises–Fisher distribution. The new distribution has ellipse-like symmetry, as does the Kent distribution; however, unlike the Kent distribution the normalising constant in the new density is easy to compute and estimation of the shape parameters is straightforward. To accommodate outliers, the model also incorporates an additional shape parameter which controls the tail-weight of the distribution. Next, we define a more general semi-parametric elliptical symmetry model on the sphere and propose two new robust direction estimators, both of which are analogous to the affine-equivariant spatial median in Euclidean space. We calculate influence functions and show that the new direction estimators are standardised bias robust in the highly concentrated case. To illustrate our new models and estimation methods, we analyse archaeomagnetic data and lava flow data from two recently compiled online geophysics databases.

This is joint work with Andrew Wood.

▼"Parameter estimation for a Cauchy family of distributions on the sphere"
Shogo Kato (ISM)

A Cauchy family of distributions on the sphere is proposed as a spherical extension of the wrapped Cauchy family on the circle. Some properties of the proposed family, especially those related to parameter estimation, are discussed. Three estimators for the spherical Cauchy family are presented, namely, a method of moments estimator, the maximum likelihood estimator, and an asymptotically efficient estimator. The method of moments estimator and the asymptotically efficient estimator are expressed in closed form. A simple algorithm is presented to compute the maximum likelihood estimate numerically. The EM algorithm is also available for maximum likelihood estimation by transforming the spherical Cauchy family into a t-family on Euclidean space via the stereographic projection. Asymptotic properties of the proposed estimators are considered. A simulation study is carried out to compare the estimators in terms of their performance for finite sample sizes.

This is joint work with Peter McCullagh of the University of Chicago, USA.

13:20-14:00 Break (or Lunch) + Casual Discussion
14:00-15:20 Session II [Chair: Yoshinori Kawasaki, ISM]

▼"From Covariance to Correlation Regression Analysis"
Tao Zou (ANU)

In the analysis of multivariate or multi-response data, researchers are often interested in studying how the covariations between responses are related to one or more similarity/distance measures. To address such research questions, this talk reviews the covariance regression analysis of Zou et al. (2017, 2020, 2021) and proposes a novel joint mean and correlation regression model, which is applicable to a wide variety of correlated discrete and (semi-)continuous responses. The model involves regressing the mean of each response against a set of covariates and the correlations between responses against a set of similarity measures. This provides explicit quantification of how observed similarity measures affect the between-response correlation after accounting for differences in response means due to covariates.

We develop a constrained algorithm which iterates between solving estimating equations for the mean regression coefficients and correlation regression parameters. Under a general setting where the number of responses can tend to infinity with the number of clusters, we demonstrate that the proposed joint estimator is consistent and asymptotically normal, with differing rates of convergence. Simulations and an empirical example from ecology are presented to illustrate the usefulness of the proposed model, and how inference and new insights can be obtained by simultaneously modelling both the mean of each response and the correlations between them.

This is joint work with Zhi Yang Tho and Francis Hui at ANU.

▼"A Bayesian construction of asymptotically unbiased estimators"
Shuhei Mano (ISM)

A differential geometric framework to construct an asymptotically unbiased estimator of a function of a parameter is presented. The derived estimator asymptotically coincides with the uniformly minimum variance unbiased estimator, if a complete sufficient statistic exists. The framework is based on maximum a posteriori estimation, where the prior is chosen such that the estimator is unbiased. The framework is demonstrated up to second-order asymptotic unbiasedness (unbiased up to O(n^{-1}) for a sample of size n). The bias of an estimator emerges as a departure from a kind of harmonicity of the estimand, and multiplication by a prior is equivalent to modifying the model manifold such that the departure from harmonicity is canceled out. For a given estimand, the prior is chosen by solving a first-order differential equation. Conversely, for a given prior, the estimators whose bias can be reduced are identified by solving an elliptic partial differential equation. The integration of these equations is discussed, and a family of invariant priors, which generalizes the Jeffreys prior, is mentioned as a specific example. As an illustrative example, estimation of the shrinkage factor in the linear mixed-effects model is discussed.

This is joint work with Masayo Y. Hirose at Kyushu University, and is available at arXiv:2011.14747.

15:20-15:25 Closing Address
Alan Welsh (ANU)