Proceedings of the Institute of Statistical Mathematics Vol.70, No.2, 133-151 (2022)

Non-Gaussianity on Cumulonimbus Prediction Using a Particle Filter at Storm-scale

Takuya Kawabata
(Department of Observation & Data Assimilation, Meteorological Research Institute)
Genta Ueno
(The Institute of Statistical Mathematics/Department of Statistical Science, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies, SOKENDAI/Center for Data Assimilation Research and Applications, Joint Support-Center for Data Science Research (ROIS-DS))

We implement a particle filter based on the sampling-importance-resampling algorithm in a storm-scale numerical weather prediction model that can explicitly resolve cumulonimbus. Our purpose is to investigate the origin of the chaos caused by non-Gaussianity in the initiation and development of cumulonimbus. To this end, we compare the Bayesian information criterion (BIC) of three distribution models (Gaussian, Gaussian mixture, and histogram), select the best model, and thereby objectively determine whether the data follow a Gaussian or a non-Gaussian distribution. The results show that the updraft in the upper part of a local front becomes non-Gaussian first, and that the non-Gaussianity then propagates to water vapor in the same region. Once a cumulus initiates and begins to develop, other variables such as potential temperature, cloud water, and rain water become non-Gaussian in sequence. When the cumulonimbus matures, the whole evaluation area is non-Gaussian, which indicates that the cumulonimbus itself is non-Gaussian. We therefore find that the origin of the overall non-Gaussianity is the updraft in the upper region of the local front.

Key words: Particle filter, non-Gaussianity, cumulonimbus, storm-scale.
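
As an illustration of the model-selection step described in the abstract above, the following minimal sketch compares the BIC of a Gaussian fit and a histogram fit for a one-dimensional sample. It is not the authors' implementation: the function names `bic_gaussian` and `bic_histogram`, the fixed number of bins, and the synthetic bimodal sample are assumptions made for illustration, and the Gaussian mixture model that the paper also considers is omitted for brevity.

```python
import math
import random


def bic_gaussian(x):
    """BIC of a Gaussian fitted by maximum likelihood (2 free parameters)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n  # maximum-likelihood variance
    loglik = -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)
    return -2.0 * loglik + 2 * math.log(n)


def bic_histogram(x, bins=10):
    """BIC of an equal-width histogram density (bins - 1 free parameters).

    Assumes the sample is not degenerate (max(x) > min(x)).
    """
    n = len(x)
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in x:
        j = min(int((v - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[j] += 1
    # The density in bin j is counts[j] / (n * width); empty bins contribute nothing.
    loglik = sum(c * math.log(c / (n * width)) for c in counts if c > 0)
    return -2.0 * loglik + (bins - 1) * math.log(n)


# Hypothetical ensemble values of one variable at one grid point (bimodal).
rng = random.Random(0)
bimodal = ([rng.gauss(-3.0, 0.5) for _ in range(200)]
           + [rng.gauss(3.0, 0.5) for _ in range(200)])

scores = {"gaussian": bic_gaussian(bimodal), "histogram": bic_histogram(bimodal)}
best_model = min(scores, key=scores.get)  # the model with the lowest BIC is selected
```

For this clearly bimodal sample the histogram attains the lower BIC, so under this criterion the sample would be classed as non-Gaussian.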


Proceedings of the Institute of Statistical Mathematics Vol.70, No.2, 153-163 (2022)

Forecasting Auroral Activity Using Data Assimilation

Yoshizumi Miyoshi
(Institute for Space-Earth Environmental Research, Nagoya University)
Genta Ueno
(The Institute of Statistical Mathematics)
Ryota Yamamoto
(Institute for Space-Earth Environmental Research, Nagoya University)
Shinobu Machida
(Institute for Space-Earth Environmental Research, Nagoya University)
Masahito Nose
(Institute for Space-Earth Environmental Research, Nagoya University)
Daikou Shiota
(Radio Research Institute, National Institute of Information and Communications Technology)
Satoko Nakamura
(Institute for Space-Earth Environmental Research, Nagoya University)

Auroral electrojet indices (AU, AL, AE) are proxies for substorms and auroral activity. Forecasting these indices is crucial for space weather forecasting. This study develops a data assimilation code to estimate the AU index based on the model proposed by Goertz et al. The state vector includes the AU index along with coupling parameters for the solar wind, magnetosphere, and ionosphere. The AU index provided by WDC-C2, Kyoto University, is used as the observation vector. Data assimilation makes it possible to estimate the coupling parameters dynamically, and this approach significantly improves the forecasting performance. The estimated coupling parameters show semi-annual and long-term modulations. According to a previous model, the coupling parameters are a function of the ionospheric conductance. The seasonal and yearly variations of the coupling parameters estimated by data assimilation are therefore expected to correspond to seasonal and yearly variations of the ionospheric conductance.

Key words: Data assimilation, particle filter, aurora activity, space weather.


Proceedings of the Institute of Statistical Mathematics Vol.70, No.2, 165-179 (2022)

Simultaneous Data Assimilation of Meteorological Fields and Atmospheric Concentration Fields Using Variable Localization in the Ensemble Kalman Filter

Tsuyoshi Thomas Sekiyama
(Meteorological Research Institute, Japan Meteorological Agency)
Mizuo Kajino
(Meteorological Research Institute, Japan Meteorological Agency)

The ensemble Kalman filter (EnKF) explicitly derives background error covariance matrices, which are subsequently used to calculate the Kalman gains. The derived background error covariance can be modified during the data assimilation process. For example, the covariance can be arbitrarily increased to spread the ensemble perturbations (i.e., covariance inflation) or decreased according to the physical distance between state variables (i.e., covariance localization). Covariance localization is a key reason why the EnKF works in practice with far fewer samples than the degrees of freedom of the system in meteorology. Localization can be applied not only according to physical distance but also between state variables with small correlations (i.e., variable localization). In this study, while simultaneously assimilating meteorological data (wind, temperature, pressure, etc.) and atmospheric concentration data using the EnKF, we attempted to set the covariances to zero for combinations of variables with small correlations by variable localization. By minimizing the effects of sampling errors, this improved the accuracy of the concentration analysis that uses wind observation information. In contrast, under the conditions of this study, the information from concentration observations did not improve the accuracy of the wind analysis.

Key words: Data assimilation, ensemble Kalman filter, variable localization, weather simulation, atmospheric chemistry simulation.
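
The idea of variable localization described in the abstract above can be sketched as an element-wise product of the ensemble covariance with a 0/1 mask. The following is a toy illustration only, not the authors' system: the three-variable state (wind, temperature, concentration), the ensemble values, the choice of which pair to mask, and all function names are assumptions.

```python
def sample_covariance(ensemble):
    """Sample covariance of an ensemble given as a list of state vectors."""
    m = len(ensemble)
    n = len(ensemble[0])
    mean = [sum(member[i] for member in ensemble) / m for i in range(n)]
    return [[sum((mem[i] - mean[i]) * (mem[j] - mean[j]) for mem in ensemble) / (m - 1)
             for j in range(n)] for i in range(n)]


def localize(cov, mask):
    """Element-wise (Schur) product of the covariance with a localization mask."""
    size = len(cov)
    return [[cov[i][j] * mask[i][j] for j in range(size)] for i in range(size)]


def kalman_gain_scalar_obs(cov, obs_index, obs_error_var):
    """Kalman gain for a single direct observation of state variable obs_index."""
    denom = cov[obs_index][obs_index] + obs_error_var
    return [cov[i][obs_index] / denom for i in range(len(cov))]


# Hypothetical 4-member ensemble; each member is [wind, temperature, concentration].
ensemble = [
    [2.0, 290.0, 1.0],
    [3.0, 291.0, 1.4],
    [1.5, 289.5, 0.9],
    [2.5, 290.5, 1.5],
]

# Assumed mask: keep wind-concentration covariance, zero temperature-concentration.
mask = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
]

cov = sample_covariance(ensemble)
cov_loc = localize(cov, mask)

# Assimilate one temperature observation (index 1) with error variance 0.25.
K_raw = kalman_gain_scalar_obs(cov, 1, 0.25)
K_loc = kalman_gain_scalar_obs(cov_loc, 1, 0.25)
```

With the mask applied, the temperature observation no longer updates the concentration (`K_loc[2]` is zero), while the gains for wind and temperature are unchanged. A small ensemble would otherwise pass a spurious sampled correlation into the concentration analysis.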


Proceedings of the Institute of Statistical Mathematics Vol.70, No.2, 181-193 (2022)

Background Error Covariance Matrix Factorization in Variational Data Assimilation for Atmospheric State Analysis

Toshiyuki Ishibashi
(Meteorological Research Institute, Japan Meteorological Agency)

Accurate analysis of the global atmospheric state is a difficult scientific problem because of the chaotic nature of the atmosphere. Data assimilation enables highly accurate atmospheric state analysis by consistently integrating vast amounts of information on the atmospheric state through relationships between probability density functions (Bayes' theorem). Because the background error covariance matrix (BECM) of the model prediction has complex spatiotemporal structures, accurate estimation of the BECM is a major research theme in atmospheric analysis. This paper reviews the formulation of the BECM in variational global atmospheric analysis, with a particular focus on the factorization of the BECM, which is important in variational data assimilation. In recent years, the accuracy of atmospheric analysis has improved remarkably thanks to highly accurate BECM factorization using ensemble forecasts and localization matrices, and four matrix representations of such factorizations exist. These representations have problems; for example, the relationships between them have not been completely clarified. Recently, a general form of the BECM factorization has been presented, and all of these matrix representations have been shown to be derivable from it. It has also been shown that, for the problem of non-regularity of the factorized BECM, regularity can be maintained by using the degrees of freedom available within the approximation accuracy of the theory, and that the non-regularity of the factorized BECM does not affect the solution in a specific minimization algorithm.

Key words: Data assimilation, variational method, atmospheric science, background error covariance matrix, ensemble, localization.


Proceedings of the Institute of Statistical Mathematics Vol.70, No.2, 195-208 (2022)

Estimation of a Posterior Error Covariance Matrix Using Conjugate Vectors and the BFGS Formula

Yosuke Niwa
(Earth System Division, National Institute for Environmental Studies/Department of Atmosphere, Ocean, and Earth System Modeling Research, Meteorological Research Institute, Japan Meteorological Agency)
Yosuke Fujii
(Department of Atmosphere, Ocean, and Earth System Modeling Research, Meteorological Research Institute, Japan Meteorological Agency/The Institute of Statistical Mathematics)

The four-dimensional variational method is commonly used for data assimilation and inverse problems. However, it does not automatically provide a posterior error estimate. Niwa and Fujii (2020) developed a technique to estimate the posterior error covariance matrix within the framework of the four-dimensional variational method. Their technique adopts a quasi-Newton method with the BFGS formula, which is a conventional optimization method. To improve the accuracy of the estimated posterior error covariance matrix, their technique also employs an exact line search, an ensemble method, and orthogonalization to increase the number of conjugate vectors used in the BFGS formula. This report first explains that, with preconditioning in the BFGS formula, an analytical posterior error covariance matrix can be obtained with the same number of iterations (or vector pairs used in the BFGS formula) as the number of observations, and then describes the technique of Niwa and Fujii (2020) in detail. Finally, results of applying this technique to an inverse problem of atmospheric CO2 are presented.

Key words: Posterior error covariance matrix, data assimilation, inverse analysis, four-dimensional variational method, BFGS formula, quasi-Newton method.
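
The statement in the abstract above that, with preconditioning, the BFGS formula recovers the posterior error covariance matrix in as many iterations as there are observations can be checked on a small linear-Gaussian toy problem. The sketch below is only an illustration under strong assumptions, not the technique of Niwa and Fujii (2020): the prior and observation error covariances are both taken to be the identity (so the problem is already preconditioned), the observation operator `H` and innovation `d` are random placeholders, and the iteration starts from the prior state with an identity inverse-Hessian estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 6, 2                       # state dimension, number of observations
H = rng.standard_normal((p, n))   # hypothetical linear observation operator
d = rng.standard_normal(p)        # hypothetical innovation (observation minus prior)

# Cost: J(v) = 0.5 * v.v + 0.5 * (H v - d).(H v - d), so grad J = A v - b.
A = np.eye(n) + H.T @ H           # Hessian; its inverse is the posterior covariance
b = H.T @ d

v = np.zeros(n)                   # start from the prior state
Hinv = np.eye(n)                  # initial inverse-Hessian estimate
g = A @ v - b                     # gradient at the starting point

for _ in range(p):                # as many iterations as observations
    direction = -Hinv @ g
    # Exact line search for a quadratic cost.
    alpha = -(g @ direction) / (direction @ A @ direction)
    s = alpha * direction         # step in the control variable
    y = A @ s                     # corresponding change in the gradient
    rho = 1.0 / (y @ s)
    V = np.eye(n) - rho * np.outer(s, y)
    Hinv = V @ Hinv @ V.T + rho * np.outer(s, s)  # BFGS inverse-Hessian update
    v = v + s
    g = g + y

posterior_cov_exact = np.linalg.inv(A)
max_abs_error = np.max(np.abs(Hinv - posterior_cov_exact))
```

In this setting the Hessian is the identity plus a rank-`p` term and the initial gradient lies in the range of that term, which is why `p` vector pairs should be enough for `Hinv` to match the analytical posterior covariance up to rounding error.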


Proceedings of the Institute of Statistical Mathematics Vol.70, No.2, 209-233 (2022)

Ensemble Member Generation Based on the BFGS Formula in a Variational Data Assimilation System

Yosuke Fujii
(Meteorological Research Institute, Japan Meteorological Agency/Numerical Prediction Development Center, Japan Meteorological Agency/The Institute of Statistical Mathematics)
Takuma Yoshida
(Numerical Prediction Development Center, Japan Meteorological Agency/Meteorological Research Institute, Japan Meteorological Agency)
Yutaro Kubo
(Numerical Prediction Development Center, Japan Meteorological Agency/Meteorological Research Institute, Japan Meteorological Agency)

This paper proposes a method for generating perturbations for ensemble predictions using information on the gradient of the cost function obtained during optimization in a variational data assimilation system in which a quasi-Newton method based on the Broyden–Fletcher–Goldfarb–Shanno (BFGS) formula is used to optimize the analysis variables. The proposed method generates perturbations as a linear combination of the approximated dominant singular vectors of the operator that combines the model and observation operators, and it can approximate the analysis (posterior) error variance–covariance matrix. As a practical example, we show the perturbations of oceanic initial conditions generated by the global ocean data assimilation system in the coupled atmosphere–ocean prediction system at the Japan Meteorological Agency. We also present an evaluation of the effect of using the perturbations in ensemble predictions with the coupled atmosphere–ocean prediction system.

Key words: Ensemble prediction, BFGS formula, quasi-Newton method, variational method, data assimilation.


Proceedings of the Institute of Statistical Mathematics Vol.70, No.2, 235-250 (2022)

Ensemble-based Variational Data Assimilation Approach and Its Extension for Count Data

Shin'ya Nakano
(The Institute of Statistical Mathematics/Center for Data Assimilation Research and Applications, Joint Support-Center for Data Science Research/Department of Statistical Science, School of Multidisciplinary Sciences, The Graduate University for Advanced Studies, SOKENDAI)

Ensemble-based variational approaches are a class of data assimilation methods that solve four-dimensional variational data assimilation problems by using ensemble simulations under various initial conditions and parameter settings. In contrast with the adjoint method, which is usually employed for four-dimensional variational data assimilation, these ensemble-based methods can easily be implemented without editing the simulation code, allowing the model to be treated as a black box. A limitation of existing ensemble-based methods is that they are derived under the assumption that the conditional distributions of observations given the system states are Gaussian. Hence, they are not immediately applicable to observations that obey other distributions. This study derives an ensemble-based algorithm applicable to data assimilation into a black-box simulation model when the conditional distributions of observations are given by Poisson distributions.

Key words: Data assimilation, four-dimensional ensemble variational method, iterative ensemble Kalman smoother, Gauss-Newton method, Poisson distribution.
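
To make concrete why a Gaussian-based derivation does not carry over directly to count data, the following sketch contrasts the observation term of the cost function (the negative log-likelihood) under Gaussian and Poisson assumptions. It illustrates the two likelihoods only and is not the ensemble-based algorithm derived in the paper. The function names and sample values are assumptions, and the predicted rates stand in for the output of a black-box simulation.

```python
import math


def gaussian_nll(obs, means, sigma):
    """Negative log-likelihood of independent Gaussian observations."""
    return sum(0.5 * ((y - m) / sigma) ** 2 + math.log(sigma)
               + 0.5 * math.log(2.0 * math.pi)
               for y, m in zip(obs, means))


def poisson_nll(counts, rates):
    """Negative log-likelihood of independent Poisson counts given positive rates."""
    return sum(lam - y * math.log(lam) + math.lgamma(y + 1)
               for y, lam in zip(counts, rates))


def poisson_nll_gradient(counts, rates):
    """Derivative of the Poisson negative log-likelihood with respect to each rate."""
    return [1.0 - y / lam for y, lam in zip(counts, rates)]


# Hypothetical observed counts and model-predicted rates.
counts = [0, 3, 7]
rates = [0.5, 2.5, 8.0]

cost_poisson = poisson_nll(counts, rates)
grad_poisson = poisson_nll_gradient(counts, rates)
```

The Gaussian term is quadratic in the predicted mean, which is what Gauss-Newton-type derivations exploit. The Poisson term is not quadratic, and its gradient `1 - y / lam` depends on the rate itself, so the observation term has to be handled differently.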


Proceedings of the Institute of Statistical Mathematics Vol.70, No.2, 251-267 (2022)

Introduction to the Signature Method

Nozomi Sugiura
(Global Ocean Observation Research Center, Japan Agency for Marine-Earth Science and Technology)

Sequential data observed in Earth science can be regarded as paths in a multidimensional space. Instead of treating a path as a mere series of vectors, it is useful to convert it into a sequence of numbers called the signature of the path. The signature can faithfully describe the order of points and the nonlinearity between dimensions contained in a path. As a result, any nonlinear function defined on a set of paths can be approximated by a linear combination of the terms in the signature of each path. Hence, when learning from a set of labeled sequential data, linear regression can be applied to the signature-label pairs, yielding state-of-the-art results even when the labels are determined by a nonlinear function. Incorporating signature methods into machine learning and data assimilation that use sequential data should allow us to extract information that has previously been overlooked.

Key words: Signature, machine learning, sequential data.
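
As a small illustration of how the signature encodes the order of points in a path, the following sketch computes the first two signature levels of a piecewise-linear path. The function name `signature_depth2` and the example paths are assumptions made for illustration, and only depth 2 is computed; practical applications typically use a dedicated library and higher truncation depths.

```python
def signature_depth2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path.

    path: list of points, each a list or tuple of coordinates.
    Level 1 holds the total increment in each dimension.
    Level 2 holds the iterated integrals S[i][j] of dx_i followed by dx_j.
    """
    dim = len(path[0])
    level1 = [0.0] * dim
    level2 = [[0.0] * dim for _ in range(dim)]
    for start, end in zip(path, path[1:]):
        delta = [e - s for s, e in zip(start, end)]
        # Chen's identity for appending one linear segment: the increment
        # accumulated so far is paired with the new segment, and the segment's
        # own contribution is added.
        for i in range(dim):
            for j in range(dim):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(dim):
            level1[i] += delta[i]
    return level1, level2


# Two paths with the same endpoints but a different order of moves.
right_then_up = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
up_then_right = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

s1_a, s2_a = signature_depth2(right_then_up)
s1_b, s2_b = signature_depth2(up_then_right)
```

Both paths have the same level-1 terms, because the total increments agree, but their off-diagonal level-2 terms differ. This is the sense in which the signature records order as well as displacement.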