Proceedings of the Institute of Statistical Mathematics Vol.47, No.2, 277-290(1999)

to Fish Population Dynamics

Yoshiharu Matsumiya

(Ocean Research Institute, University of Tokyo)

The data used in fish population dynamics are often subject to considerable stochastic variation and measurement error, so sound statistical techniques are needed in their analysis. The Akaike information criterion (AIC) was introduced by Akaike for the purpose of selecting an optimal model from a set of proposed models. Recently, there has been increasing interest in applying AIC to fish population dynamics. The estimation of the stock-recruitment relationship, the DeLury method of estimating population abundance, the estimation of mortality rates from tag recoveries, and the determination of the mesh selectivity curve are studied using AIC and the maximum likelihood method. The advantages of parameter estimation and model selection by practical application of AIC are shown through these examples. Long-term approaches are also discussed.
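As a minimal sketch of AIC-based model selection for a stock-recruitment relationship, the following compares a density-independent model with a Ricker-type model fitted by maximum likelihood. The choice of these two models, the log-normal error assumption, and the data values are illustrative assumptions, not taken from the paper.

```python
import math

def gaussian_aic(residuals, n_mean_params):
    """AIC = -2 * max log-likelihood + 2k for i.i.d. Gaussian errors."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n      # MLE of the error variance
    loglik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    k = n_mean_params + 1                           # +1 for the variance
    return -2 * loglik + 2 * k

def residuals_constant(y):
    """Density-independent model: log(R/S) = log(a) + error."""
    mean = sum(y) / len(y)
    return [yi - mean for yi in y]

def residuals_line(x, y):
    """Ricker-type model: log(R/S) = log(a) - b*S + error (least squares)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical data: spawner abundance S and log(R/S) with small perturbations.
spawners = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
noise = [0.05, -0.04, 0.02, -0.03, 0.04, -0.05, 0.03, -0.02]
log_rps = [1.5 - 0.3 * s + e for s, e in zip(spawners, noise)]

aic_const = gaussian_aic(residuals_constant(log_rps), n_mean_params=1)
aic_ricker = gaussian_aic(residuals_line(spawners, log_rps), n_mean_params=2)
best = "Ricker" if aic_ricker < aic_const else "density-independent"
print(f"AIC constant={aic_const:.2f}, AIC Ricker={aic_ricker:.2f}, selected: {best}")
```

Both models are fitted to the same response, log(R/S), so their AIC values are directly comparable; the model with the smaller AIC is selected.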

**Key words: Akaike information criterion (AIC), fish population dynamics, estimation of population abundance, stock-recruitment relationship, population parameters.**

Proceedings of the Institute of Statistical Mathematics Vol.47, No.2, 291-306(1999)

Aligned Current Systems as an Example of

Knowledge Discovery from the Large Database

Tomoyuki Higuchi

(The Institute of Statistical Mathematics)

Characteristics of large-scale field-aligned currents (LSFAC) observed above the Earth's ionosphere are highly variable, and we have been depending on visual examination to identify LSFAC systems. The objective of this paper is to report a new procedure that we developed to automatically identify the spatial structure of LSFAC from satellite magnetic field measurements. Depending on the number of LSFAC sheets crossed by a satellite and also on the intensity and flow direction (upward/downward) of each LSFAC, a plot of the east-west magnetic component can have any shape. The required task is to automatically fit line segments to the plot. The procedure is based on the concept of the first-order *B*-spline (polyline) fitting with variable node positions. The number of node points, which determines the number of FAC sheets, is one of the fitting parameters and is optimized for each orbit so that the Akaike Information Criterion (AIC) is minimized.
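The idea of fitting a polyline with variable node positions and choosing the number of nodes by AIC can be sketched as follows. This is not the author's procedure: the exhaustive search over a grid of candidate node positions, the Gaussian error model, the parameter count, and the synthetic data are all assumptions made for illustration.

```python
import math
from itertools import combinations

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def polyline_rss(xs, ys, nodes):
    """Residual sum of squares of the least-squares continuous polyline."""
    # Basis: intercept, slope, and one hinge function max(x - t, 0) per node.
    rows = [[1.0, x] + [max(x - t, 0.0) for t in nodes] for x in xs]
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    coef = solve(ata, aty)
    return sum((y - sum(c * v for c, v in zip(coef, r))) ** 2
               for r, y in zip(rows, ys))

def aic(rss, n, n_nodes):
    # Assumed parameter count: intercept, slope, one slope change and one
    # position per node, plus the error variance.
    k = 2 + 2 * n_nodes + 1
    return n * (math.log(2 * math.pi * rss / n) + 1) + 2 * k

def best_polyline(xs, ys, candidates, max_nodes):
    """Search node sets of size 0..max_nodes and return (AIC, nodes) of the best."""
    best = None
    for m in range(max_nodes + 1):
        for nodes in combinations(candidates, m):
            score = aic(polyline_rss(xs, ys, nodes), len(xs), m)
            if best is None or score < best[0]:
                best = (score, nodes)
    return best

def truth(x):
    """Synthetic profile with nodes at x = 10 and x = 20 (hypothetical)."""
    if x < 10:
        return 0.0
    return x - 10.0 if x < 20 else 30.0 - x

xs = [float(i) for i in range(30)]
ys = [truth(x) + 0.05 * math.sin(7 * i) for i, x in enumerate(xs)]
score, nodes = best_polyline(xs, ys, candidates=xs[2:28], max_nodes=2)
print("selected nodes:", nodes)
```

The exhaustive grid search is only practical for a small number of nodes; it stands in here for whatever optimization of node positions is actually used.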

**Key words: AIC, linear spline with variable nodes, principal component analysis, discovery science.**

Proceedings of the Institute of Statistical Mathematics Vol.47, No.2, 307-326(1999)

Multivariate Structural Time Series Model

Fumiyo N. Kondo

(Department of Statistical Science, The Graduate University for Advanced Studies)

This paper focuses on the role of the AIC as a model selection criterion in constructing a multivariate structural time series model, through empirical analyses of POS scanner data. Computing capacity improves year by year, and a huge amount of highly detailed data is available at retailers. In this environment, we set up dozens of models within a unified framework that enables us to deal with more factors of data variation. Our models consist of a trend component, day-of-the-week variation, and a price deal component. They are flexible enough to handle time-varying parameters, which can capture the characteristics of scanner data, i.e., a competitive structure among products and a temporal relationship among data points. A group of models was constructed in the following three stages, and the models were compared at each stage: 1) determination of a price function expressing the relationship between the price and the sales of a product (univariate model); 2) a multivariate model expressing the competitive structure among products; 3) time-varying parameter structural models that can express changes in the competitive structure. The AIC was used as a unified model selection criterion for comparing different models, and analyses were conducted on daily and weekly POS data. The analyses showed that a reasonable model was chosen as the best model at each stage. For example, in the last-stage comparison, the AIC showed a time-varying model that can express structural changes to be superior to constant-parameter models. Thus, within a model framework that a researcher establishes freely, the use of the AIC as a unified model selection criterion makes it possible to choose the best model among dozens of alternatives.

**Key words: State space model, unified model selection criterion, daily & weekly scanner data, time-varying parameter, competitive structure.**

Proceedings of the Institute of Statistical Mathematics Vol.47, No.2, 327-342(1999)

via Jump Diffusion Process

Mitsunori Iino

(Department of Statistical Science, The Graduate University for Advanced Studies)

Tohru Ozaki

(The Institute of Statistical Mathematics)

A new method of estimating time series models that contain jump components is developed.

Our method is composed of a bank of filters and a new jump detection algorithm. Assuming that each observation comes from one of a finite number of states, we detect jumps after the events using a posterior probability. As a result, we can detect low-frequency, large-amplitude jump components that could not be detected in previous studies.

**Key words: Jump diffusion process, state space model, AIC, jump detection, finance.**

Proceedings of the Institute of Statistical Mathematics Vol.47, No.2, 343-358(1999)

Selecting Pharmacokinetic Sampling Points

Akifumi Yafune

(Division of Biostatistics, Center for Clinical Pharmacy and Clinical Sciences,

Kitasato University Graduate School

and Kitasato Institute Bio-Iatric Center)

Makio Ishiguro and Genshiro Kitagawa

(The Institute of Statistical Mathematics)

In clinical areas, various kinds of parametric models are used to estimate time courses of repeated measurements obtained from a subject. A typical example of such models is the "pharmacokinetic model", which is routinely used for estimating the pharmacokinetic profile in human subjects. The number of sampling points per subject has to be limited because the physical burden on subjects becomes heavier as the number of sampling points increases. The limited measurement points therefore have to be selected carefully. This paper describes an approach for selecting the optimum measurement point for the estimation of the pharmacokinetic profile. The selection is made among given candidates, based on the goodness of estimation evaluated by the Kullback-Leibler information. This information measures the discrepancy of an estimated time course from the true one specified by a given appropriate pharmacokinetic model. The proposed approach is applied to actual pharmacokinetic observations to show how it works in practice.

**Key words: Estimation of pharmacokinetic profile, population pharmacokinetics, selection of the optimum sampling point.**

Proceedings of the Institute of Statistical Mathematics Vol.47, No.2, 359-373(1999)

and Information Criteria

Seiya Imoto and Sadanori Konishi

(Graduate School of Mathematics, Kyushu University)

We investigate the problem of estimating *B*-spline nonlinear regression models. *B*-spline nonlinear regression models represented by probability density functions are estimated using the maximum penalized likelihood method. The essential problem of model construction lies in the choice of the smoothing parameter and the number of basis functions, for which several procedures such as generalized cross-validation, AIC and Akaike's Bayesian information criterion have been proposed.

The purpose of the present paper is to consider this model estimation problem from an information theoretic point of view and to give a criterion as an estimator of Kullback-Leibler information. Numerical experiments are conducted to compare the properties of various types of criteria.

**Key words: Spline, nonlinear regression, model evaluation, smoothing parameter, information criteria.**

Proceedings of the Institute of Statistical Mathematics Vol.47, No.2, 375-394(1999)

Genshiro Kitagawa

(The Institute of Statistical Mathematics)

Sadanori Konishi

(Graduate School of Mathematics, Kyushu University)

The information criterion AIC is obtained by assuming the use of maximum likelihood estimators. However, the basic idea of correcting the bias of the log-likelihood as an estimator of the expected log-likelihood can be applied to a wider class of models and estimation procedures. Indeed, Takeuchi (1976) proposed the criterion TIC for the situation where the model class does not contain the true model. Konishi and Kitagawa (1996) showed that this method of bias correction of the log-likelihood can be generalized to estimators defined by statistical functionals and derived the GIC (Generalized Information Criterion). On the other hand, Ishiguro et al. (1997) proposed a bootstrap-based information criterion, EIC (Extended Information Criterion), which can be applied to a very broad class of models and estimation methods.
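For orientation, the standard textbook forms of AIC and TIC can be written as follows. The notation is an assumption introduced here and may differ from the article's: $\hat\theta$ is the maximum likelihood estimate, $k$ the number of free parameters, and $\hat I$, $\hat J$ are sample estimates of the matrices $I$ and $J$ defined below.

$$
\begin{aligned}
\mathrm{AIC} &= -2\sum_{\alpha=1}^{n}\log f(x_\alpha\mid\hat\theta) + 2k,\\
\mathrm{TIC} &= -2\sum_{\alpha=1}^{n}\log f(x_\alpha\mid\hat\theta) + 2\,\mathrm{tr}\bigl(\hat I\,\hat J^{-1}\bigr),\\
I &= \mathrm{E}\!\left[\frac{\partial \log f(X\mid\theta)}{\partial\theta}\,\frac{\partial \log f(X\mid\theta)}{\partial\theta^{\top}}\right],\qquad
J = -\,\mathrm{E}\!\left[\frac{\partial^{2}\log f(X\mid\theta)}{\partial\theta\,\partial\theta^{\top}}\right].
\end{aligned}
$$

When the model class contains the true distribution, $I = J$, so $\mathrm{tr}(I J^{-1}) = k$ and TIC reduces to AIC.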

Recently, Konishi and Kitagawa (1998) and Kitagawa and Konishi (1998) extended the method used for the derivation of GIC and established a theory for the second-order bias correction and for the variance reduction of the bootstrap log-likelihood. The magnitude of the bias terms in information criteria such as TIC and GIC is also explained by this method.

In this article, we explain the GIC and its refinements and illustrate them with a simple model. In Section 2, we review Akaike's method of statistical model evaluation and present the information criteria AIC, TIC, GIC and EIC. Section 3 is devoted to a brief derivation of GIC. The second-order bias-corrected information criterion is presented in Section 4, and a method of reducing the bootstrap variance in computing EIC is discussed in Section 5. Numerical examples are given in Section 6.
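The bootstrap bias correction behind EIC can be sketched for a simple normal model as follows. This is a plain bootstrap estimate of the bias of the log-likelihood without any variance reduction; the sample, the number of bootstrap replicates, and the function names are assumptions for illustration and are not taken from the article.

```python
import math
import random

def fit_normal(xs):
    """Maximum likelihood estimates (mean, variance) of a normal model."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, var

def loglik(xs, mu, var):
    """Normal log-likelihood of the data xs at parameters (mu, var)."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
               for x in xs)

def eic_normal(xs, n_boot=500, seed=0):
    """Return (EIC, bootstrap bias estimate) for the normal model."""
    rng = random.Random(seed)
    n = len(xs)
    bias = 0.0
    for _ in range(n_boot):
        star = [rng.choice(xs) for _ in range(n)]   # bootstrap resample
        mu_s, var_s = fit_normal(star)              # estimate on the resample
        # Optimism: fit on the resample minus fit of the same estimate
        # evaluated on the original sample.
        bias += loglik(star, mu_s, var_s) - loglik(xs, mu_s, var_s)
    bias /= n_boot
    mu, var = fit_normal(xs)
    return -2 * loglik(xs, mu, var) + 2 * bias, bias

# Synthetic sample (hypothetical): 50 draws from a standard normal.
data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(50)]

eic, bias = eic_normal(sample)
mu_hat, var_hat = fit_normal(sample)
aic = -2 * loglik(sample, mu_hat, var_hat) + 2 * 2  # two free parameters
print(f"bootstrap bias={bias:.2f}, EIC={eic:.2f}, AIC={aic:.2f}")
```

For a correctly specified normal model the bootstrap bias estimate should be in the neighbourhood of the parameter count, 2, so EIC and AIC are expected to be close in this example.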

**Key words: Modeling, model evaluation, log-likelihood, Kullback-Leibler information, statistical functional, bias correction, second order correction.**

Proceedings of the Institute of Statistical Mathematics Vol.47, No.2, 395-424(1999)

Makio Ishiguro

(The Institute of Statistical Mathematics)

Considering that statistical work with information criteria consists of the following steps:

- development of a new model;
- fitting of the model to data;
- publication of the findings and the methods employed,

the author composes this article as a package of

- an introduction to the debugging support subroutine 'bug' and its tutorial manual;
- an introduction to the subroutine program 'DALL' for the numerical maximization of the log likelihood and its tutorial manual; and
- a proposal for the use of a copyright notice under the Open Market Licence (OML) condition as a component technology for software distribution.

'bug' and DALL are distributed from http://www.ism.ac.jp/software/ismlib/soft.html

Information about OML can be obtained from http://www.ism.ac.jp/software/ismlib/ismlib.e.html
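To illustrate what numerical maximization of a log likelihood by a quasi-Newton method involves, here is a generic BFGS sketch with a numerical gradient and a backtracking line search. It is not the DALL subroutine and does not reproduce its interface; the normal model, the parameterization by log sigma, the data values, and all function names are assumptions for illustration.

```python
import math

def neg_loglik(params, data):
    """Negative Gaussian log likelihood; params = [mu, log_sigma]."""
    mu, log_sigma = params
    inv_var = math.exp(-2.0 * log_sigma)
    return sum(0.5 * math.log(2.0 * math.pi) + log_sigma
               + 0.5 * (x - mu) ** 2 * inv_var for x in data)

def safe(f, p):
    """Evaluate f(p), treating numerical overflow as an infinitely bad point."""
    try:
        return f(p)
    except OverflowError:
        return math.inf

def num_grad(f, p, h=1e-6):
    """Central-difference gradient of f at p."""
    grad = []
    for i in range(len(p)):
        up, dn = p[:], p[:]
        up[i] += h
        dn[i] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad

def bfgs_minimize(f, p0, max_iter=200, tol=1e-5):
    """Minimize f from p0 with BFGS updates of the inverse Hessian."""
    n = len(p0)
    p = list(p0)
    h_inv = [[float(i == j) for j in range(n)] for i in range(n)]
    g = num_grad(f, p)
    for _ in range(max_iter):
        if max(abs(gi) for gi in g) < tol:
            break
        d = [-sum(h_inv[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * di for gi, di in zip(g, d))
        f_p, step = f(p), 1.0
        while step > 1e-12:                       # backtracking (Armijo)
            q = [pi + step * di for pi, di in zip(p, d)]
            if safe(f, q) <= f_p + 1e-4 * step * slope:
                break
            step *= 0.5
        else:
            break                                 # line search failed
        g_new = num_grad(f, q)
        s = [qi - pi for qi, pi in zip(q, p)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:                            # keep h_inv positive definite
            hy = [sum(h_inv[i][j] * y[j] for j in range(n)) for i in range(n)]
            yhy = sum(yi * hyi for yi, hyi in zip(y, hy))
            for i in range(n):
                for j in range(n):
                    h_inv[i][j] += ((1.0 + yhy / sy) * s[i] * s[j] / sy
                                    - (hy[i] * s[j] + s[i] * hy[j]) / sy)
        p, g = q, g_new
    return p

# Hypothetical data; maximizing the log likelihood = minimizing its negative.
data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.2, 5.5]
mu_hat, log_sigma_hat = bfgs_minimize(lambda p: neg_loglik(p, data), [0.0, 0.0])
sigma_hat = math.exp(log_sigma_hat)
print(f"mu_hat={mu_hat:.4f}, sigma_hat={sigma_hat:.4f}")
```

For this model the maximum likelihood estimates are available in closed form (the sample mean and the root mean squared deviation), which gives a convenient check on the optimizer.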

**Key words: Debugging, log likelihood, numerical optimization, quasi-Newton method, copyright notice.**