ISM Symposium on Environmental Statistics 2023

22 March, 2023
Auditorium @ The Institute of Statistical Mathematics
This symposium will be held in person, and attendance is limited to 50 participants. Registration will close once this limit is reached. If you wish to attend, please register using the Google form below.

Please note that only those who have registered will receive a PDF copy of the proceedings.
There is no registration fee.
【Host Organization】
The Institute of Statistical Mathematics, Japan
【Co-host Organization】
Grant-in-Aid for Scientific Research (S) 18H05290 [Introduction of general causality to various observations and the innovation for its optimal statistical inference]
【Main Subjects】
Statistical methods supporting environmental statistics
・Spatial statistics
・Space-time modeling
・Model selection
・Bayesian inference
・Markov Chain Monte Carlo
・Directional Statistics
【Organizers】
Daisuke Murakami (The Institute of Statistical Mathematics)
Alan H. Welsh (The Australian National University, Australia)
Koji Kanefuji (The Institute of Statistical Mathematics)
Kunio Shimizu (The Institute of Statistical Mathematics)
Satoshi Yamashita (The Institute of Statistical Mathematics)
【Invited Speakers】
Alexis Comber (University of Leeds, UK)
Daisuke Murakami (The Institute of Statistical Mathematics, Japan)
Hsin-Cheng Huang (Academia Sinica, Taiwan)
Toshihiro Hirano (Kanto Gakuin University, Japan)
Andrew Wood (Australian National University, Australia)
Yoshinori Kawasaki (The Institute of Statistical Mathematics, Japan)
Alan H. Welsh (Australian National University, Australia)
Shonosuke Sugasawa (The University of Tokyo, Japan)
Program (* denotes the presenter)
9:45–9:55 Opening Address
Hiroe Tsubaki (Director-General, The Institute of Statistical Mathematics)
〈Session 1〉 Chairperson: Kunio Shimizu (The Institute of Statistical Mathematics)
10:00–10:45 Geographical Gaussian Process GAMs - An Alternative to MGWR.
Alexis Comber* (University of Leeds, UK), Paul Harris (Rothamsted Research, UK), Chris Brunsdon (Maynooth University, Ireland)

This talk describes a novel spatially varying coefficient (SVC) regression model: a Geographical Gaussian Process Generalized Additive Model (GGP-GAM). It uses a Generalized Additive Model (GAM) framework but with Gaussian Process (GP) splines parameterised at observation locations. The proposed GGP-GAM is multiscale in that it fits a geographic GP spline for each covariate, but it has fewer theoretical and technical limitations than Geographically Weighted Regression (GWR) and Multiscale GWR (MGWR). A GGP-GAM was fitted to simulated coefficient data with varying degrees of spatial heterogeneity and the results were compared with MGWR. For each fit metric, the GGP-GAM outperforms MGWR. It was then applied to a case study modelling the UK EU membership referendum (Brexit) result with predictor variables describing different socio-economic factors. The spatially varying GGP-GAM coefficient estimates show different scales of spatial non-stationarity in their relationship with the UK’s Brexit vote. Several areas of further work are identified, including facilitating comparison with alternative SVC models such as MGWR, GGP-GAM calibration and GAM tuning, and linking the GGP-GAM spline smoothing parameters to more intuitive user understandings of process spatial heterogeneity.

10:45–11:30 Sub-model Aggregation for Scalable Spatial Mixed Modeling.
Daisuke Murakami* (The Institute of Statistical Mathematics), Shonosuke Sugasawa (The University of Tokyo)

The aim of this study is to develop an approach that aggregates/combines global and local sub-models to build a flexible and scalable spatial regression model, including a spatially varying coefficient model, which we will focus on. To aggregate sub-models, we use a generalized product-of-experts method, which is widely used in the machine learning literature. The resulting method has the following practically useful properties: computationally efficient; each sub-model can be estimated independently; the marginal likelihood is available in closed form. Furthermore, this method is useful for improving the accuracy of spatial process modelling. The accuracy and computational efficiency are investigated by Monte Carlo experiments. The method is then applied to an analysis of residential land prices in Japan.
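
As a rough illustration of the aggregation idea, the sketch below combines Gaussian predictions from several sub-models using a generalized product-of-experts rule, in which each expert's precision is scaled by a weight before the precisions are summed. This is a generic textbook form, not the speakers' implementation; the function name `gpoe_combine` and the choice of weights are assumptions made for this example.

```python
import numpy as np

def gpoe_combine(means, variances, weights):
    """Combine Gaussian expert predictions with a generalized product of experts.

    Each expert k supplies a predictive mean and variance. The combined
    density is proportional to the product of the expert densities raised
    to the power weights[k], which is again Gaussian.
    (Hypothetical helper for illustration only.)
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Weighted precisions of the individual experts.
    precisions = weights / variances
    # Combined precision is the sum of weighted precisions.
    combined_var = 1.0 / precisions.sum(axis=0)
    # Combined mean is the precision-weighted average of expert means.
    combined_mean = combined_var * (precisions * means).sum(axis=0)
    return combined_mean, combined_var

# Two experts with equal variance and equal weights that sum to one.
mean, var = gpoe_combine([1.0, 3.0], [1.0, 1.0], [0.5, 0.5])
```

Because each expert enters only through its own mean and variance, the sub-models can be fitted independently and combined afterwards, which is the property the abstract highlights.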

11:30–12:20 Lunch Break
〈Session 2〉 Chairperson: Shuhei Mano (The Institute of Statistical Mathematics)
12:20–13:05 Nonstationary Spatial Modeling by Segmenting Spatial Processes into Stationary Components.
Hsin-Cheng Huang* (Academia Sinica, Taiwan)

In this study, we tackle the issue of nonstationary spatial data by developing a novel approach to specify the nonstationary covariance function. We begin by using a robust local estimate of the spatial covariance and constructing a test for stationarity. The test statistic is generated by clustering data locations using Voronoi tessellation. If the stationarity assumption is not met, the region is partitioned into Voronoi polygons, and within each, the process becomes approximately stationary. Our model builds upon this partition and uses a linear combination of stationary processes with spatially varying weights. Unlike piecewise stationary models, our model allows for a smooth or sharp change in the spatial covariance function across subregions governed by a tuning parameter. It reverts to a global stationary process when all stationary components share a common spatial covariance structure. We use maximum composite likelihood to estimate the parameters of our model and develop a doubled kriging method based on a divide-and-conquer strategy. Some numerical results show that our approach is flexible and computationally efficient.

13:05–13:50 Multi-resolution Filters via Linear Projection for Large Spatio-temporal Datasets.
Toshihiro Hirano* (Kanto Gakuin University) and Tsunehiro Ishihara (Takasaki City University of Economics)

Recently, large-scale spatio-temporal data have been measured by compact sensing devices mounted on satellites. Since these kinds of datasets are often incomplete, it is useful to create the prediction surface of a spatial field. For example, this prediction surface can be applied to forecasting an extreme weather phenomenon. To this end, we consider the Kalman filter based on the linear Gaussian state-space model. However, the Kalman filter is impractically time-consuming when the number of locations in spatio-temporal datasets is large. To address this problem, we propose a multi-resolution filter via linear projection (MRF-lp). The MRF-lp can be regarded as an extension of Hirano (2021) to spatio-temporal datasets and a generalization of a multi-resolution filter (MRF) developed by Jurek and Katzfuss (2021). Consequently, our proposed MRF-lp inherits some desirable features of the MRF such as the preservation of the block-sparse structure of some matrices through time and the resulting scalability. In addition, we discuss extensions of the MRF-lp to nonlinear and non-Gaussian cases based on Jurek and Katzfuss (2022). Some simulations demonstrate that the MRF-lp performs well.
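
For readers unfamiliar with the baseline, the following is a minimal sketch of one predict-and-update step of the standard Kalman filter for a linear Gaussian state-space model. It is not the MRF-lp; it only shows the dense matrix operations whose cost grows quickly with the number of locations. The function name `kalman_step` and the matrix names `A`, `Q`, `H`, `R` are notation assumed for this example.

```python
import numpy as np

def kalman_step(m, P, y, A, Q, H, R):
    """One step of the standard Kalman filter (illustrative sketch).

    State model:       x_t = A x_{t-1} + w_t,  w_t ~ N(0, Q)
    Observation model: y_t = H x_t + v_t,      v_t ~ N(0, R)
    m, P are the filtering mean and covariance at time t-1.
    """
    # Prediction step.
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q

    # Update step. S is the innovation covariance.
    S = H @ P_pred @ H.T + R
    # Kalman gain K = P_pred H^T S^{-1}, computed via a linear solve
    # (S is symmetric, so transposing the solve result gives K).
    K = np.linalg.solve(S, H @ P_pred).T
    m_new = m_pred + K @ (y - H @ m_pred)
    P_new = P_pred - K @ S @ K.T
    return m_new, P_new

# Scalar example: random-walk state with no process noise, unit observation noise.
m1, P1 = kalman_step(
    m=np.array([0.0]), P=np.array([[1.0]]), y=np.array([2.0]),
    A=np.array([[1.0]]), Q=np.array([[0.0]]),
    H=np.array([[1.0]]), R=np.array([[1.0]]),
)
```

With n spatial locations in the state, `P_pred` is an n-by-n dense matrix, so storing and updating it becomes the bottleneck that multi-resolution approximations aim to avoid.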

〈Session 3〉 Chairperson: Shogo Kato (The Institute of Statistical Mathematics)
13:50–14:35 Score Matching: Theory, Applications and Future Potential.
Andrew Wood* (Australian National University)

There are many statistical models whose probability density function is known up to proportionality but whose normalising constant is intractable. Score matching is a remarkable method of parameter estimation due to Hyvärinen (2005) which avoids calculation of the normalising constant, yet provides a consistent and asymptotically normal estimator of the unknown parameter vector in a parametric model. This talk will review the score matching approach and then discuss some new methodological developments and applications of score matching which go beyond the continuous IID (independent and identically distributed) case, which has been the main focus of attention in the literature to date. A situation of importance in the environmental and geosciences where score matching has the potential to be very useful is in the analysis of samples of directions (represented as unit vectors) that are spatially dependent. The latter part of this talk will discuss how score matching can be used in this situation and an alternative approach will also be discussed briefly.
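
To make the idea concrete, here is a small sketch of score matching in the simplest continuous IID setting, a univariate Gaussian parameterised by mean `mu` and precision `lam`. The empirical objective is the sample average of the derivative of the score plus half the squared score, and it never touches the normalising constant. The Gaussian is chosen only because the minimiser has a closed form; the function names are hypothetical and this is not the methodology presented in the talk.

```python
import numpy as np

def score_matching_objective(x, mu, lam):
    """Empirical score matching objective for a N(mu, 1/lam) model.

    The score is d/dx log p(x) = -lam * (x - mu), and its derivative in x
    is -lam. Neither depends on the normalising constant.
    """
    x = np.asarray(x, dtype=float)
    score = -lam * (x - mu)
    dscore = -lam
    return np.mean(dscore + 0.5 * score**2)

def score_matching_gaussian(x):
    """Closed-form minimiser of the objective above (illustrative)."""
    x = np.asarray(x, dtype=float)
    mu_hat = x.mean()
    # Setting the derivative in lam to zero gives lam = 1 / mean((x - mu)^2).
    lam_hat = 1.0 / np.mean((x - mu_hat) ** 2)
    return mu_hat, lam_hat

mu_hat, lam_hat = score_matching_gaussian([0.0, 2.0, 4.0])
```

In this special case the estimator coincides with maximum likelihood; the appeal of the method lies in models where the likelihood cannot be normalised but the score can still be written down.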

14:35–15:20 GARCH-UGH: A Bias-reduced Approach for Dynamic Extreme Value-at-Risk Estimation in Financial Time Series.
Yoshinori Kawasaki* (The Institute of Statistical Mathematics)

The Value-at-Risk (VaR) is a widely used instrument in financial risk management. Estimating the VaR of loss return distributions at extreme levels is an important problem in financial applications, from both operational and regulatory perspectives; in particular, the dynamic estimation of extreme VaR given the recent past has received substantial attention. We propose a new two-step bias-reduced methodology for estimating the one-step-ahead dynamic extreme VaR, called GARCH-UGH (Unbiased Gomes-de Haan), whereby financial returns are first filtered using an AR-GARCH model, and then a bias-reduced estimator of extreme quantiles is applied to the standardized residuals. Our results indicate that the GARCH-UGH estimates of the dynamic extreme VaR are more accurate than those obtained by historical simulation, conventional AR-GARCH filtering with Gaussian or Student-t innovations, or AR-GARCH filtering with standard extreme value estimates, in both in-sample and out-of-sample backtesting of historical daily returns on several financial time series. [This is joint work with Hibiki Kaibuchi (current affiliation: Mizuho-DL Financial Technology Co., Ltd.) and Gilles Stupfler (current affiliation: University of Angers).]
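
The sketch below illustrates only the second step, and only in its standard (not bias-reduced) form: the Hill estimator of the tail index combined with the Weissman extrapolation formula for an extreme quantile. It corresponds to the "standard extreme value estimates" comparator mentioned in the abstract, not to the UGH estimator itself, and it assumes the inputs are already standardized residuals from some fitted AR-GARCH model. The function name and arguments are hypothetical.

```python
import numpy as np

def hill_weissman_quantile(residuals, k, p):
    """Hill tail-index estimate and Weissman extreme quantile (sketch).

    residuals : standardized residuals (assumed already filtered);
                the k+1 largest values must be positive.
    k         : number of upper order statistics used.
    p         : small exceedance probability of the target quantile.
    """
    x = np.sort(np.asarray(residuals, dtype=float))
    n = x.size
    # Threshold is the (k+1)-th largest observation.
    threshold = x[n - k - 1]
    top = x[n - k:]
    # Hill estimator: mean log-excess of the top k values over the threshold.
    gamma = np.mean(np.log(top) - np.log(threshold))
    # Weissman extrapolation beyond the threshold.
    quantile = threshold * (k / (n * p)) ** gamma
    return gamma, quantile

gamma, q = hill_weissman_quantile([1.0, 2.0, 4.0, 8.0, 16.0], k=2, p=0.1)
```

In a two-step scheme of this kind, the residual quantile would then be rescaled by the one-step-ahead conditional mean and volatility forecasts from the filtering model to give the dynamic VaR.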

15:20–15:40 Coffee Break
〈Session 4〉 Chairperson: Koji Kanefuji (The Institute of Statistical Mathematics)
15:40–16:25 Sparse Sliced Inverse Regression via Cholesky Matrix Penalization.
Linh H. Nghiem (University of Sydney, Australia), Francis K.C. Hui (Australian National University), Samuel Muller (Macquarie University, Australia), and Alan H. Welsh* (Australian National University)

For a regression problem with a scalar outcome and a p-dimensional covariate, sufficient dimension reduction refers to a class of methods that try to express the outcome as a function of a few linear combinations of covariates without losing information about the relationship. The sufficient dimension reduction is sparse if only a few of the coefficients in each linear combination of covariates are non-zero. We introduce a new sparse sliced inverse regression estimator called Cholesky matrix penalization and an adaptive version of it for achieving sparsity in estimating the dimensions of the central subspace. The new estimators use the Cholesky decomposition of the covariance matrix of the covariates and include a regularization term in the objective function to achieve sparsity in a computationally efficient manner. We establish the theoretical values of the tuning parameters that achieve estimation and variable selection consistency for the central subspace. Furthermore, we propose a new projection information criterion to select the tuning parameter for our proposed estimators and prove that the new criterion facilitates selection consistency. The Cholesky matrix penalization estimator inherits the strength of the Matrix Lasso and the Lasso sliced inverse regression estimator; it has superior performance in numerical studies and can be extended to other sufficient dimension reduction methods in the literature.

16:25–17:10 Scalable Bayesian Spatio-temporal Predictive Synthesis.
Shonosuke Sugasawa* (The University of Tokyo), Daisuke Murakami (The Institute of Statistical Mathematics), Kenichiro McAlinn (Temple University, USA)

Bayesian predictive synthesis is a general framework for synthesizing multiple predictive distributions through Bayesian updating. Here, we adopt this framework to develop a flexible model combination for spatio-temporal prediction. To this end, we use a latent variable model with spatially and temporally varying regression coefficients and develop a scalable computation algorithm to obtain estimates of unknown hyperparameters and posterior distributions of the latent variables. We demonstrate the proposed method through some numerical examples.

17:10–17:20 Closing Address
Alan H. Welsh (Australian National University) & Daisuke Murakami (The Institute of Statistical Mathematics)