## Stochastic Prediction of Earthquakes—A Strategy for the Research

Practical earthquake forecasting should provide the probability, with its uncertainty, of an earthquake's location, time, and magnitude. This requires statistical modeling of the effects of abnormal phenomena, incorporating various prediction scenarios based on scientific knowledge of geophysics, geology, and disaster history. We need to detect significant deviations (anomalies) of observed data from predicted data in various potentially useful databases associated with seismic activity; that is, we need to detect abnormalities through an appropriate diagnostic analysis. Such anomalous phenomena then need to be analyzed to model their statistical causality as precursors of large earthquakes. For this purpose, a stochastic point process is useful for predicting the space-time intensity rates of expected earthquakes, which enables us to calculate the probability of a large earthquake. Abnormal phenomena of only a single type may not sufficiently raise the secular probability of a large earthquake. However, when abnormal phenomena of several types are observed at the same time, the probability can be substantially increased. Through a variety of observations of long-term, medium-term, and short-term anomalies, we should accumulate the knowledge of abnormal phenomena needed to constitute a ``multi-elements prediction formula'' model.
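The multiplicative combination of anomalies described above can be sketched in a few lines. This is only a minimal illustration of a Utsu-style multi-element formula under the strong assumption that the precursory anomalies are conditionally independent; the function and parameter names are our own, not the article's:

```python
def combined_rate(base_rate, gains):
    """Multi-element combination: each independently observed anomaly
    multiplies the secular (baseline) hazard rate by its probability gain."""
    rate = base_rate
    for g in gains:
        rate *= g
    return rate

# Example: a baseline rate of 1e-4 per day, with two simultaneous anomalies
# carrying probability gains of 10 and 5, yields 5e-3 per day.
```

Under independence the gains simply multiply, which is why simultaneous anomalies of several types can raise the probability far more than any single anomaly.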

Key words: Anomalies, point process models, conditional intensity function, reference seismicity model, multi-elements prediction formula, probability gain.

## Evaluation Methods of Earthquake Forecasts

Objective evaluation of forecasting performance is essential in research on earthquake predictability. Since the occurrence probabilities of large and small earthquakes are completely different, the score for a successful prediction of a rarely occurring large earthquake should differ substantially from that for small earthquakes. Similar reasoning applies to predictions in seismically inactive versus active regions. First of all, it is necessary to build empirical models of seismicity in different regions, which can serve as references for forecasting future seismicity. The significance of earthquake forecasts can then be evaluated by the log-likelihood ratio of the forecast's performance to that of the reference, i.e., the information gain. The Akaike information criterion (AIC) is useful for estimating the information gain and for judging, from currently available data, whether a proposed model will predict better than the reference model. Because forecasting algorithms are underdeveloped and prediction experience is scarce, predictions are often issued not as probabilities but as earthquake warnings (binary predictions). This article also explains how to use a gambling score to evaluate such binary predictions. This method likewise needs a reference model: each time a prediction succeeds or fails, the predictor is rewarded or penalized under a gambling rule that is fair with respect to the reference model. As the reference model, the uniform distribution (homogeneous Poisson process) for the occurrence times and locations of earthquakes has been used, together with the Gutenberg-Richter law (exponential distribution) for earthquake magnitudes. However, when a more reasonable nonhomogeneous Poisson process is used as the reference model, the warning-type predictions currently available rarely score better.
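As a rough sketch of the fair-bet idea (not the article's exact scoring rule), suppose the predictor stakes one unit on each alarm and the reference model assigns probability p0 to an earthquake falling in the alarmed bin. A fair rule pays (1 - p0)/p0 on success and forfeits the stake on failure, so the expected score under the reference model is zero:

```python
def gambling_score(bets):
    """bets: list of (p0, occurred), where p0 is the reference model's
    probability for the alarmed space-time-magnitude bin and `occurred`
    says whether a qualifying earthquake actually happened there."""
    score = 0.0
    for p0, occurred in bets:
        if occurred:
            score += (1.0 - p0) / p0   # reward for a successful alarm
        else:
            score -= 1.0               # the unit stake is lost
    return score
```

A predictor who only reproduces the reference model scores zero on average; a positive cumulative score indicates genuine skill beyond the reference, which is why the choice of reference model matters so much.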

Key words: Probability forecast, reference forecast, warning-type predictions, information gain, gambling score, Akaike Information Criterion.

## Modeling Seismicity Anomalies

The epidemic-type aftershock sequence (ETAS) model is useful for statistical analysis of time sequences of earthquake occurrences. The conditional intensity function of this point process model is defined as the occurrence rate of an earthquake in the immediate future, and is indispensable for probability forecasting of earthquakes in a given future period. It also provides a standard seismicity model that is useful for detecting anomalies in a series of earthquake occurrences; remodeling the intensity function can then improve earthquake forecasting. In this manuscript, methods for applying the stationary ETAS model to analyze seismicity changes are explained in detail. A non-stationary extension of the model is then proposed for inverting abnormal seismicity, and is applied to a dataset of swarm activity triggered by the 2011 Tohoku-Oki mega-earthquake.
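For concreteness, the conditional intensity of the temporal ETAS model has the well-known form λ(t | H_t) = μ + Σ_{t_i < t} K exp(α(M_i − M_0)) / (t − t_i + c)^p, which a few lines of code can evaluate directly. The parameter values in the example are placeholders for illustration, not fitted estimates:

```python
import math

def etas_intensity(t, events, mu, K, alpha, c, p, m0):
    """Conditional intensity (events per unit time) of the temporal ETAS
    model at time t, given the history `events` as (t_i, M_i) pairs."""
    rate = mu  # background seismicity rate
    for t_i, m_i in events:
        if t_i < t:
            # each past event adds an Omori-Utsu decaying contribution,
            # scaled exponentially by its magnitude above the reference m0
            rate += K * math.exp(alpha * (m_i - m0)) / (t - t_i + c) ** p
    return rate
```

With an empty history the intensity reduces to the background rate μ, which is the sense in which the stationary ETAS model serves as a reference: systematic departures of observed occurrence rates from this intensity flag seismicity anomalies.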

Key words: Point process, conditional intensity function, stationary ETAS model, anomaly in seismicity, non-stationary ETAS model.

## Real-time Short- and Intermediate-term Forecasting of Aftershocks after a Main Shock

A large earthquake triggers numerous aftershocks, and some strong aftershocks can cause additional damage in the disaster area. Thus, operational forecasting of aftershock activity has been carried out to reduce earthquake risks. However, current forecasting methods face some problems. First, early forecasting is very difficult because of the severe shortage of data shortly after a main shock, even though aftershocks occur very frequently during that period. Second, because aftershock activity lasts a long time, it is also important to achieve intermediate-term forecasting as soon as possible; nevertheless, this is not easy to do from limited data. To overcome these difficulties, we have employed statistical methodology to develop a practical forecasting method. In this contribution, we introduce our recent work on aftershock forecasting and show the effectiveness of our method using actual data.
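One standard ingredient of operational aftershock forecasts, sketched here for illustration (the authors' actual method is Bayesian and more elaborate), is the Reasenberg-Jones combination of the Omori-Utsu decay with the Gutenberg-Richter law, which gives the probability of at least one aftershock above a target magnitude in a given time window. The parameter values in the test are generic, not region-specific estimates:

```python
import math

def rj_probability(t1, t2, a, b, p, c, mainshock_mag, target_mag):
    """Reasenberg-Jones probability of one or more aftershocks of
    magnitude >= target_mag during [t1, t2] days after the main shock.
    Assumes p != 1 so the Omori-Utsu integral has a closed form."""
    # rate(t) = 10**(a + b*(mainshock_mag - target_mag)) / (t + c)**p
    amp = 10.0 ** (a + b * (mainshock_mag - target_mag))
    # expected number of qualifying aftershocks in the window
    expected = amp * ((t2 + c) ** (1.0 - p) - (t1 + c) ** (1.0 - p)) / (1.0 - p)
    # Poisson probability of at least one event
    return 1.0 - math.exp(-expected)
```

The formula makes the two difficulties in the abstract concrete: just after the main shock the generic parameters (a, b, p, c) must stand in for unavailable sequence-specific estimates, and the slow power-law decay means the window [t1, t2] must extend far into the future for intermediate-term statements.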

Key words: Statistical seismology, point process, probability forecast, Bayesian statistics.

## Point Process Models for Recurrent Earthquakes at Active Faults and Their Long-term Forecast

There are many active faults in inland Japan that carry risks of future catastrophic earthquakes. However, their occurrence probabilities vary widely because their earthquakes recur quasi-periodically and their cycles differ greatly. This paper introduces point processes, such as renewal processes, for evaluating the hazards of such `recurrent earthquakes'. Since inland active faults in Japan have very long activity cycles, forecasting is difficult because of the scarcity and unreliability of historical data. For stability of predictive performance, Bayesian prediction may be appropriate to deal with the resulting uncertainty in data and parameters. Some illustrations of forecasting for active faults in Japan are presented.

In recent years, small earthquakes that repeat on the subduction zones of crustal plates have also attracted much attention for earthquake prediction. Their recurrence cycles are relatively short, so they can be used in medium-term forecasting experiments to validate prediction methods. Small repeating earthquakes are also analyzed for monitoring interplate slip, because their recurrence times reflect the quasi-static slip rate around their fault patches. Here, a space-time model extended from a renewal process is used to estimate the spatio-temporal distribution of the slip rate on plate boundaries.
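As an illustration of the renewal-process approach (with parameter values chosen purely for demonstration), the Brownian Passage Time distribution in the key words yields the conditional probability of rupture in the next ΔT years given an elapsed quiet period, via its closed-form CDF:

```python
import math

def bpt_cdf(t, mu, alpha):
    """CDF of the Brownian Passage Time (inverse Gaussian) distribution
    with mean recurrence interval mu and aperiodicity alpha."""
    d = math.sqrt(t / mu)
    u1 = (d - 1.0 / d) / alpha
    u2 = (d + 1.0 / d) / alpha
    phi_u1 = 0.5 * (1.0 + math.erf(u1 / math.sqrt(2.0)))
    # erfc keeps the tiny tail term accurate before the large exp factor
    tail = 0.5 * math.erfc(u2 / math.sqrt(2.0))
    return phi_u1 + math.exp(2.0 / alpha ** 2) * tail

def conditional_probability(elapsed, window, mu, alpha):
    """P(rupture within `window` years | quiet for `elapsed` years)."""
    f0 = bpt_cdf(elapsed, mu, alpha)
    return (bpt_cdf(elapsed + window, mu, alpha) - f0) / (1.0 - f0)
```

In Bayesian long-term forecasting, this conditional probability would additionally be averaged over the posterior of (mu, alpha), which is precisely where the scarcity and unreliability of historical recurrence data enter.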

Key words: Recurrent earthquakes, long-term forecast, Brownian Passage Time distribution, renewal processes, Bayesian prediction.

## Inversion Analysis of GPS Data and Forecast of Earthquake Occurrence

Earthquakes occur as brittle fracture of rocks, so their deterministic prediction is difficult. Since the 1890s, various statistical models have been proposed for the probabilistic prediction of future earthquakes from past seismicity data. In contrast, the 1990s saw the birth of a new research field that aims to understand earthquake generation on the basis of physics, and nowadays physics-based earthquake generation models governed by various fault constitutive (friction) laws have been proposed. In these physical models, the occurrence of earthquakes is described as the release process of shear stress acting on faults. Therefore, the physical and statistical models must be closely related to each other. However, it is not easy to determine the actual stress state of the earth's crust. At present, we can only estimate stress patterns from inversion analysis of focal mechanism solutions or CMT (Centroid Moment Tensor) solutions of seismic events, and stress changes from inversion analysis of GPS (Global Positioning System) data. This article introduces the basic idea and method of GPS data inversion, and examines the inversion results from the viewpoint of earthquake prediction.
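At its core, such a geodetic inversion is a regularized linear least-squares problem, d = G m, where d stacks the GPS displacement components, m the slip (or strain) parameters on the fault, and G the elastic response kernel. A minimal sketch with Tikhonov damping follows; the symbols and the damping scheme are illustrative, and the article's actual formulation may differ:

```python
import numpy as np

def damped_least_squares(G, d, lam):
    """Solve min ||G m - d||^2 + lam^2 ||m||^2 via the normal equations.
    The damping lam stabilizes the inversion when G is ill-conditioned,
    as is typical when many slip parameters are constrained by few GPS sites."""
    n = G.shape[1]
    return np.linalg.solve(G.T @ G + lam ** 2 * np.eye(n), G.T @ d)
```

With lam = 0 and a well-conditioned kernel this reduces to ordinary least squares; in practice lam trades data fit against model roughness and is chosen by criteria such as ABIC in this literature.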

Key words: Earthquake generation, statistical model, physical model, crustal stress, GPS data, inversion analysis.

## Earthquake Forecasting Based on the Correlation between Earth Tides and Earthquake Occurrences

Many studies have investigated and discussed the correlation between the occurrence of earthquakes and the periodic stress/strain changes due to earth tides. This article introduces typical statistical methods for examining such correlations and reviews recent achievements. In particular, it is proposed that the significance or non-significance of the correlation can be used to gauge the stress state of the earth's crust, and that this information may be valuable for earthquake forecasting. Additionally, this article suggests possible future developments of new statistical approaches that would allow a more comprehensive evaluation of such correlations.
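A typical statistical method in this literature is the Schuster test: each earthquake is assigned a phase angle θ_i within the tidal cycle, and under the null hypothesis of no tidal correlation the resultant length R of the N unit phasors gives the p-value exp(−R²/N). A minimal sketch:

```python
import math

def schuster_p(phases):
    """Schuster test p-value for non-uniformity of tidal phase angles
    (in radians). A small p-value suggests tidal modulation of seismicity."""
    n = len(phases)
    c = sum(math.cos(t) for t in phases)
    s = sum(math.sin(t) for t in phases)
    return math.exp(-(c * c + s * s) / n)
```

Phases clustered near one tidal phase give a p-value near zero, while phases spread evenly over the cycle give a p-value near one; repeated application to regional subcatalogs is one way the significance of the correlation can be mapped as a proxy for crustal stress state.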

Key words: Seismicity, earth tides, point-process analysis, earthquake forecasting.

## Probability of Recovery from Default Loan to Performing Loan

We develop a model that estimates the probability of recovery of a defaulted company and analyse the recovery process. We use corporate loan data of a regional bank, recorded every half year from September 2007 to March 2012. Borrower characteristics, financial characteristics, and time-period categorical variables are used as explanatory variables in the model selection. We apply the Yeo-Johnson (YJ) transformation to the explanatory variables; it extends the Box-Cox transformation to negative values. After the model selection, we fit a logistic regression with YJ-transformed and untransformed borrower characteristics, time-period categorical variables, and other covariates. These variables are chosen by comparing model fit using the AIC, the AUC, and the Hosmer-Lemeshow test statistic. We thus confirm that the YJ transformation improves the model's estimation performance.
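For reference, the Yeo-Johnson transformation with parameter λ is defined piecewise so that, unlike the Box-Cox transformation, it also applies to negative values. A direct implementation (outside the paper's model-fitting pipeline) looks like this:

```python
import math

def yeo_johnson(x, lam):
    """Yeo-Johnson power transformation of a single value x with
    parameter lam; reduces to the identity at lam = 1."""
    if x >= 0:
        if lam != 0:
            return ((x + 1.0) ** lam - 1.0) / lam
        return math.log(x + 1.0)          # lam == 0 branch
    if lam != 2:
        return -(((-x + 1.0) ** (2.0 - lam) - 1.0) / (2.0 - lam))
    return -math.log(-x + 1.0)            # lam == 2 branch
```

In practice λ is estimated (e.g. by maximum likelihood) per explanatory variable before the logistic regression is fitted; the piecewise form keeps the transformation monotone and smooth at zero.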

Key words: Credit risk, probability of recovery, LGD.

## ``Obake (Ghost) Surveys'' Revealing Underlying Structure of Heart and Mind: Some Relevant Data from Asia Pacific Values Survey (APVS)

The Institute of Statistical Mathematics (ISM) has conducted a number of cross-national surveys using statistical sampling methods to compare the characteristics of people in different countries: the Survey on the Japanese National Character since 1953, and Cross-national Comparison Surveys since 1971. In this context, the so-called *Obake* Surveys, conducted by Chikio Hayashi's group since the late 1970s, intend to go one step further in an attempt to reveal the deep structure of the Japanese heart and mind.

This paper presents some relevant data collected in five countries/regions for the ongoing Asia Pacific Values Survey (APVS). We focus on APVS questionnaire items that overlap with those of the *Obake* Surveys. In particular, we examine Hayashi's personality classification of ``rational people vs. non-rational people'' in a cross-national context, in order to identify the basic set of information upon which standards for each nation can be built. First we explain Hayashi's *Obake* Surveys, including their background, questionnaire items, and results. Then we look at the APVS questionnaire items related to the *Obake* Surveys, i.e. religious attitudes, interest in supernatural powers, and views on life and death, so as to make a cross-national comparison of in-depth feelings. The data show that classifying people as ``rational people vs. non-rational people,'' as Hayashi did for the Japanese in the past, is not so relevant in other countries/regions. Although our data on the Japanese do not seem to replicate Hayashi's result in exactly the same way, a similar classification can be obtained with a slightly different treatment of the item categories. Moreover, the data from the other countries/regions also roughly confirm a similar classification, but with more variation. This suggests that no fixed set of items can provide a universal scale for classifying people as ``rational people vs. non-rational people.'' Lastly, perspectives for future research are given.

Key words: Probability sampling survey, The Asia Pacific Values Survey (APVS), one's opinion on human life and death, religious belief, *Obake* Survey, Cultural Manifold Analysis (CULMAN).