ISM Symposium on Environmental Statistics 2025

【Date】
24 March, 2025
【Venue】
Auditorium @ The Institute of Statistical Mathematics
This symposium will be held in person. Attendance is limited to 50 participants, and registration will close once that number is reached. If you wish to attend, please register using the Google form below.
https://forms.gle/mYVDjsTStDLjhjm39

Please note that only those who have registered will receive a PDF copy of the proceedings.
There is no registration fee.
【Main Subjects】
Statistical methods supporting environmental statistics:
・Spatial statistics
・Space-time modeling
・Model selection
・Bayesian inference
・Markov chain Monte Carlo
・Directional statistics
etc.
【Organizers】
Alan H. Welsh (The Australian National University)
Daisuke Murakami (The Institute of Statistical Mathematics)
Shogo Kato (The Institute of Statistical Mathematics)
Koji Kanefuji (The Institute of Statistical Mathematics)
【Invited Speakers】
Alan H. Welsh (Australian National University, Australia)
Andrew Wood (Australian National University, Australia)
Elias Teixeira Krainski (King Abdullah University of Science and Technology, Saudi Arabia)
Giovanna Jona Lasinio (Sapienza University of Rome, Italy)
Hideyasu Shimadzu (Kitasato University, Japan)
Rafael de Andrade Moral (Maynooth University, Ireland)
Shogo Kato (The Institute of Statistical Mathematics, Japan)
Daisuke Murakami (The Institute of Statistical Mathematics, Japan)
【Program】 (* denotes the presenter)
9:40—9:50 Opening Address
Hiroe Tsubaki (Director-General, The Institute of Statistical Mathematics)
〈Session 1〉 Chairperson: Shonosuke Shugasawa (Keio University)
10:00—10:40 The spacetime SPDE models and applications
Elias Teixeira Krainski* (King Abdullah University of Science and Technology), Finn Lindgren (The University of Edinburgh), Haakon Bakka (King Abdullah University of Science and Technology), David Bolin (King Abdullah University of Science and Technology), and Håvard Rue (King Abdullah University of Science and Technology)

(Abstract)
Partial differential equations are used to model phenomena in many different fields. A stochastic partial differential equation (SPDE) introduces random forcing to capture the nature of real-world observations. In this talk, we introduce the spacetime extension of the SPDE approach, which has a direct link with statistical models based on Gaussian random fields (GRFs). We also present a discretization approach that makes computation efficient when fitting to real data. Finally, we present the software implementation and illustrative applications with real-world datasets.

10:40—11:20 Spatial curriculum learning for modeling non-stationary processes in regression coefficients
Daisuke Murakami (The Institute of Statistical Mathematics)

(Abstract)
This study develops a curriculum learning algorithm for modeling non-stationary spatial processes in regression coefficients. Curriculum learning is a machine learning approach in which the model is trained on increasingly complex data or tasks over time. Following this idea, we propose a boosting algorithm that learns coarser spatial processes first, followed by finer processes. In each learning step, a local model, which may explain anisotropic patterns, is estimated and added to the ensemble. The performance of the developed method is verified by simulation experiments and an application to residential land price data.

11:20—12:00 Double descent and noise in fitting linear regression models
Insha Ullah (Australian National University) and Alan H. Welsh* (Australian National University)

(Abstract)
"Double descent" is used in statistical machine learning to describe the fact that models with more parameters than observations can have better predictive performance (as measured by the test error) than models with fewer parameters than observations. This challenge to the belief that simpler models are generally better means we need to rethink fundamental statistical ideas. We explore the effects of including noise predictors and noise observations when fitting linear regression models. We present empirical and theoretical results that show that double descent occurs in both cases, albeit with contradictory implications: the implication for noise predictors is that complex models are often better than simple ones, while the implication for noise observations is that relatively simple models are often better than complex ones. That is, double descent is not just a high-dimensional big data/machine learning phenomenon but can also occur in small datasets fitted with simple statistical models. We resolve this contradiction by showing that it is not the model complexity but rather the implicit shrinkage induced by the inclusion of noise in the model that drives the double descent. We also show that including noise observations in the model makes the (usually unbiased) ordinary least squares estimator biased and indicates that the ridge regression estimator may need a negative ridge parameter to avoid over-shrinkage.

12:00—13:00 Lunch Break
〈Session 2〉 Chairperson: Keiichi Fukaya (National Institute for Environmental Studies)
13:00—13:40 Bayesian Hierarchical Modelling Applied to Wildlife Monitoring
Rafael de Andrade Moral* (Maynooth University), Luciano M. Verdade (University of Sao Paulo), and Niamh Mimnagh (Maynooth University)

(Abstract)
In this talk, I will explore statistical advances in wildlife population monitoring, focusing on methods for estimating multispecies animal abundance. Traditional management approaches typically prioritise species of economic or ecological concern, leaving millions of others unmonitored. Yet, long-term monitoring is critical for understanding ecosystem dynamics, especially those influenced by climate change. In this context, I will first discuss a novel multispecies N-mixture modelling framework capable of estimating abundance and interspecies correlations for unmarked animal populations, while accounting for imperfect detection, zero-inflation, and serial autocorrelation. I will then introduce a novel framework, called the triple Poisson model, which can be used to estimate animal abundance using scarce data on animal vestiges (such as scats, fur and footprints). Although it does not incorporate spatial or temporal trends (this is the subject of future work), the framework offers better inference from limited observations, making vestige-based monitoring a more viable alternative for large-scale ecological studies. Finally, I will highlight the importance of adding a third dimension (plant biomass) to traditional wildlife monitoring programs, which may be estimated remotely, reducing even further the costs of establishing a potential wildlife monitoring protocol.

13:40—14:20 Myriad measures, myriad diversity—how biodiversity metrics shape our framing of contemporary changes in nature
Hideyasu Shimadzu (Kitasato University)

(Abstract)
The UN’s Decade on Ecosystem Restoration (2021–2030) highlights the urgent need to address unprecedented biodiversity change in the Anthropocene. Effective restoration efforts depend upon deep insight into the modern state of biodiversity across space and time. While numerous measures have been proposed to quantify biodiversity, some can yield results that conflict with each other due to underlying conceptual ambiguities. A crucial yet underappreciated aspect is the lack of a unifying framework that clarifies the fundamental nature of biodiversity indices and what they actually measure. We revisit classical biodiversity concepts (alpha-, beta- and gamma-diversity) through the lens of abundance distributions. We formalize biodiversity change as shifts in these distributions, considering widely used indices as estimators of distributional deviations between ecological states. To unify these indices, we introduce Kullback-Leibler divergence and an information geometric framework, providing a cohesive theoretical structure that encompasses diverse biodiversity metrics. We illustrate how this approach enhances biodiversity analysis, offering deeper insights into the drivers of observed diversity patterns and improving the interpretability of biodiversity assessments.

14:20—15:00 Presence-only data. Learning about marine life by merging different data sources
Giovanna Jona Lasinio (Sapienza University of Rome)

(Abstract)
Understanding population status is often challenged by scant abundance and distribution data for many threatened species in marine and terrestrial environments. Occurrence records are often scarce and opportunistic, and fieldwork to retrieve additional data is expensive and prone to failure, particularly if the species is highly mobile. Furthermore, information on the "absence" of the species is never available. Integrating various data sources becomes crucial to developing species distribution models for informed sampling and conservation purposes. Dolphins, sea turtles, and other marine animals are currently monitored in the Italian Mediterranean. ISPRA, Sapienza, and other scientific institutions have implemented rigorous survey designs to collect presence data on these species. However, due to the high cost, these designs explore only a specific portion of the Italian Mediterranean, limiting the generalization of the results. Other marine species, such as the white shark, are rare but persistent inhabitants of the Mediterranean Sea. Information on the presence of such species is rarely connected to rigorous surveys. Here, we will explore some examples where occasional sightings ("citizen science") of a species are included in the species distribution model. In this presentation, we aim to discuss modelling challenges linked to the fact that (i) the data record only the presence (no information on the absence is available), (ii) the occasional sightings are not connected to a known observation mechanism, and (iii) sampling biases must be included in the model too. Solutions are proposed in the framework of nonhomogeneous point processes estimated under the Bayesian paradigm, addressing computational issues through INLA and inlabru.

15:00—15:45 Coffee Break
〈Session 3〉 Chairperson: Daisuke Kurisu (University of Tokyo)
15:45—16:25 Robust functional principal components analysis for non-Euclidean random objects
Andrew T.A. Wood* (Australian National University), Jiazhen Xu (Australian National University), and Tao Zou (Australian National University)

(Abstract)
Functional data analysis offers a diverse toolkit of statistical methods tailored for analysing samples of real-valued random functions. Recently, samples of time-varying random objects, such as time-varying networks, have been increasingly encountered in modern data analysis. These data structures represent elements within general metric spaces that lack local or global linear structures, rendering traditional functional data analysis methods inapplicable. Moreover, the existing methodology for time-varying random objects does not work well in the presence of outlying objects. In this paper, we propose a robust method for analysing time-varying random objects. Our method employs pointwise Fréchet medians and then constructs pointwise distance trajectories between the individual time courses and the sample Fréchet medians. This representation effectively transforms time-varying objects into functional data. A novel robust approach to functional principal component analysis, based on a Winsorized U-statistic estimator of the covariance structure, is introduced. The proposed robust analysis of these distance trajectories is able to identify key features of object trajectories over time and is useful for downstream analysis. To illustrate the efficacy of our approach, numerical studies focusing on (i) dynamic networks and (ii) time-varying spherical data are conducted. The results indicate that the proposed method exhibits good all-round performance and surpasses the existing approach in terms of robustness when handling time-varying object data.

16:25—17:05 Regression for spherical data using a scaled link function
Shogo Kato* (The Institute of Statistical Mathematics), Kassel L. Hingee (Australian National University), Janice L. Scealy (Australian National University), and Andrew T.A. Wood (Australian National University)

(Abstract)
Spherical data, consisting of observations that take values on the unit sphere, appear in numerous academic fields. In this talk, we consider a regression problem in which both covariates and responses take values on unit spheres of possibly different dimensions. We begin with a brief review of some existing works on regression in this context. Then we propose a novel link function and present its properties. The proposed link function generalizes the Möbius transformation on the sphere, which is an isotropic mapping, allowing for control over the scale of each axis of the spherical covariate. The link function has parameters that can be clearly interpreted and includes several well-known link functions as special cases. For the error distributions of the proposed regression, we adopt two distributions, namely, the von Mises–Fisher distribution and its scaled extension by Scealy and Wood (2019). Maximum likelihood estimation for the proposed regression model is discussed, and an application of the model is presented.
Reference:
Scealy, J.L. and Wood, A.T.A. (2019). Scaled von Mises–Fisher distributions and regression models for paleomagnetic directional data. Journal of the American Statistical Association, 114(528), 1547–1560.

17:10—17:15 Closing Address
Yoshinori Kawasaki (Vice-Director General, The Institute of Statistical Mathematics)