Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 189-200(2005)

Dimensionality Reduction in Regression with Positive Definite Kernels

Kenji Fukumizu

(The Institute of Statistical Mathematics)

This paper explains our recent research on dimensionality reduction in regression with positive definite kernels, following Fukumizu et al. (2004). In regression problems, where a response variable $Y$ is explained by explanatory variables $X$, the best subspace of the explanatory space for explaining $Y$ is called the *effective subspace*. The goal of dimensionality reduction in regression is to find this effective subspace. We formulate the problem in terms of conditional independence of variables, and show that this conditional independence can be characterized using covariance operators on reproducing kernel Hilbert spaces, which are given by positive definite kernels. Based on this theoretical result, we derive a method for estimating the effective subspace from a finite sample. This dimensionality reduction method is very general and applicable to a wide class of data, because it requires no strong assumptions on the conditional or marginal distributions, or on the dimension of the variables. Experimental results on real data demonstrate the effectiveness of the method.

**Key words: Kernel method, positive definite kernel, Hilbert space, dimensionality reduction, variable selection, regression.**
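As a rough illustration of the covariance-operator criterion described above, the following sketch evaluates how well a candidate direction explains $Y$ via a regularized trace of kernel Gram matrices. The kernel choice, regularization constant, function names, and toy data are ours, for illustration only; this is not the authors' estimator in full detail.

```python
import numpy as np

def centered_gram(X, sigma=1.0):
    # Gaussian Gram matrix, centered: H K H with H = I - (1/n) 11^T
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2.0 * sigma ** 2))
    n = len(X)
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    return H @ K @ H

def kdr_objective(X, Y, B, eps=0.01):
    # Trace criterion: small when Z = XB retains the information in X
    # that is relevant for predicting Y (illustrative sketch of the idea)
    Z = X @ B
    GY, GZ = centered_gram(Y), centered_gram(Z)
    n = len(X)
    return float(np.trace(GY @ np.linalg.inv(GZ + n * eps * np.eye(n))))

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 2))
Y = X[:, :1] ** 2                      # Y depends on the first coordinate only
B_good = np.array([[1.0], [0.0]])      # candidate effective direction
B_bad = np.array([[0.0], [1.0]])       # irrelevant direction
```

Evaluating the objective for both candidates, the direction that actually carries the information about $Y$ should score lower.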


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 201-210(2005)

Tomoko Matsui

(The Institute of Statistical Mathematics)

Kunio Tanabe

(Department of Science and Engineering, Waseda University)

This paper applies the dual Penalized Logistic Regression Machine (dPLRM) to text-independent speaker identification. The machine implicitly discovers speaker characteristics relevant to discrimination solely from training data, through the mechanism of kernel regression. The speaker identification method based on dPLRM therefore does not require the conventional feature extraction process, which depends on prior knowledge. We show that the dPLRM method is competitive with the conventional Gaussian mixture model (GMM)-based method using 26-dimensional Mel-frequency cepstral coefficients (MFCCs) extracted from training speech uttered by 10 male speakers in three different sessions, even though the dPLRM method directly handles coarse 256-dimensional log-power spectrum data. It is also shown that the dPLRM method outperforms the GMM-based method especially as the amount of training data becomes smaller.

**Key words: Speaker recognition, speaker identification, kernel regression, dual Penalized Logistic Regression Machine, implicit feature extraction.**
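The dPLRM itself is Tanabe's specific machine; as a generic sketch in the same spirit, the following fits a binary kernel logistic regression in the dual by gradient ascent on a penalized log-likelihood. The kernel, step size, penalty, and toy "speakers" are all illustrative assumptions, not the paper's setup.

```python
import numpy as np

def rbf_gram(X, sigma=1.0):
    # Gaussian (RBF) kernel Gram matrix
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def fit_kernel_logistic(X, y, lam=0.01, lr=0.5, steps=2000):
    # Dual parameterization: p_i = sigmoid((K a)_i); ascend
    # sum_i log p(y_i | x_i) - (lam/2) a^T K a by gradient steps
    K = rbf_gram(X)
    a = np.zeros(len(X))
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-K @ a))
        a += lr * (K @ (y - p) - lam * (K @ a)) / len(X)
    return a, K

# Toy two-"speaker" data: two Gaussian blobs in a 2-D feature space
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.5, 0.5, (20, 2)), rng.normal(1.5, 0.5, (20, 2))])
y = np.array([0.0] * 20 + [1.0] * 20)
a, K = fit_kernel_logistic(X, y)
acc = float(np.mean((K @ a > 0) == (y == 1)))
```

The classifier's discriminative information lives entirely in the kernel and the dual coefficients, mirroring the abstract's point that no hand-crafted feature extraction step is required.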


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 211-229(2005)

Kazuyuki Nakamura

(Department of Statistical Science, The Graduate University for Advanced Studies; JST CREST)

Genta Ueno

(The Institute of Statistical Mathematics; JST CREST)

Tomoyuki Higuchi

(The Institute of Statistical Mathematics; JST CREST)

The data assimilation technique, developed in meteorology and oceanography, aims at fitting the states of a physical simulation model to observations. It is motivated by the need to provide a good initial condition for a nonlinear simulation model and to realize online estimation of model parameters. To attain these goals, sequential data assimilation methods such as the Ensemble Kalman Filter are used. The Ensemble Kalman Filter is methodologically similar to the Particle Filter in that both are ensemble-based, but it has a different structure: for example, a linear update is performed instead of resampling at the filtering step, and in almost all cases the approximated probability density does not converge to the true probability density even as the number of ensemble members goes to infinity. This paper reviews the concept of data assimilation and formulates it using a nonlinear state space model. On the basis of this formulation, the Ensemble Kalman Filter is reviewed, focusing on its differences from and similarities to the Particle Filter. We then demonstrate assimilation experiments with nonlinear observations on a small-scale simulation, showing the superiority of the Particle Filter over the Ensemble Kalman Filter under these conditions. Applicability to actual systems involving nonlinear observations is also suggested.

**Key words: Data assimilation, Particle Filter, Ensemble Kalman Filter, state space model.**
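The structural contrast drawn in the abstract — a linear Kalman-gain update versus weighting-and-resampling — can be sketched on a scalar toy model. All model constants and the linear-Gaussian system are our illustrative assumptions, chosen so both filters should track the truth.

```python
import math, random

random.seed(0)
PHI, Q, R, N, T = 0.9, 0.1, 0.2, 500, 30   # AR(1) system, toy constants

def forecast(x):
    # one step of the system model with process noise
    return PHI * x + random.gauss(0.0, math.sqrt(Q))

def enkf_step(ens, y):
    # Ensemble Kalman Filter: linear update with perturbed observations,
    # using the ensemble sample variance -- no resampling
    ens = [forecast(x) for x in ens]
    m = sum(ens) / N
    P = sum((x - m) ** 2 for x in ens) / (N - 1)
    K = P / (P + R)
    return [x + K * (y + random.gauss(0.0, math.sqrt(R)) - x) for x in ens]

def pf_step(ens, y):
    # Particle Filter: importance weighting by the observation
    # likelihood, then resampling -- no linear update
    ens = [forecast(x) for x in ens]
    w = [math.exp(-(y - x) ** 2 / (2.0 * R)) for x in ens]
    return random.choices(ens, weights=w, k=N)

truth = 0.0
enkf = [random.gauss(0, 1) for _ in range(N)]
pf = enkf[:]
err_enkf, err_pf = [], []
for _ in range(T):
    truth = forecast(truth)
    y = truth + random.gauss(0.0, math.sqrt(R))
    enkf, pf = enkf_step(enkf, y), pf_step(pf, y)
    err_enkf.append((sum(enkf) / N - truth) ** 2)
    err_pf.append((sum(pf) / N - truth) ** 2)
rmse_enkf = math.sqrt(sum(err_enkf) / T)
rmse_pf = math.sqrt(sum(err_pf) / T)
```

On this linear-Gaussian toy problem both filters behave similarly; the differences the paper studies appear with nonlinear observations, where the EnKF's linear update is only an approximation.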


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 231-259(2005)

Nurse Scheduling —Site Research, Modeling and Algorithms—

Atsuko Ikegami

(Faculty of Science and Technology, Seikei University)

This paper deals with the scheduling of nurses to staff shifts at a hospital. The basic difficulty lies in the necessity of maintaining a certain level of service and skill in the makeup of every shift, while balancing the workload among the nurses involved. As a result, it is usually impossible to develop a schedule that satisfies all the requirements in spite of the time and resources spent in the effort.

In this paper, we present our latest developments by demonstrating the entire process of problem solving, from modeling based on site research to algorithm development, together with a survey of possible applications. We first describe the nurse scheduling problem in Japan, and then present our model, which accounts for the problems identified in Japanese scheduling systems. We propose an efficient approach that takes advantage of the structure clarified in the modeling, and based on this approach, we show an algorithm for a 2-shift system that provides efficient, appropriate solutions. Finally, to assess the possibility of efficient algorithms that can handle both 2-shift and 3-shift systems, we show the results of preliminary computational experiments using newly developed prototype algorithms. The proposed model and approach can be adapted for the majority of hospitals in Japan, as well as for some hospitals in other countries, and could be applied to many other staff scheduling problems in the fields of health services, business and logistics.

**Key words: Nurse scheduling, questionnaire survey, mathematical programming model, block-angular problem, tabu search.**
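The keywords mention tabu search; as a minimal sketch of that metaheuristic on a drastically simplified staffing model (one shift type, a coverage target per day, a workload cap per nurse — all toy assumptions, far from the paper's full model), one can write:

```python
import random

random.seed(1)
NURSES, DAYS, REQ, MAXW = 6, 7, 3, 4   # toy instance

def cost(s):
    # day-coverage deviation plus per-nurse workload above MAXW
    c = sum(abs(sum(s[n][d] for n in range(NURSES)) - REQ) for d in range(DAYS))
    return c + sum(max(0, sum(row) - MAXW) for row in s)

def flip_cost(s, n, d):
    # cost after tentatively flipping one assignment cell
    s[n][d] ^= 1
    c = cost(s)
    s[n][d] ^= 1
    return c

def tabu_search(iters=300, tenure=10):
    s = [[random.randint(0, 1) for _ in range(DAYS)] for _ in range(NURSES)]
    best, best_cost, tabu = [r[:] for r in s], cost(s), {}
    for it in range(iters):
        # best non-tabu single-cell flip (may be uphill, escaping local minima)
        moves = [(n, d) for n in range(NURSES) for d in range(DAYS)
                 if tabu.get((n, d), -1) < it]
        n, d = min(moves, key=lambda m: flip_cost(s, *m))
        s[n][d] ^= 1
        tabu[(n, d)] = it + tenure       # forbid undoing the move for a while
        if cost(s) < best_cost:
            best, best_cost = [r[:] for r in s], cost(s)
    return best, best_cost

schedule, violations = tabu_search()
```

The tabu list is what distinguishes this from plain local search: recently flipped cells may not be flipped back, which lets the search climb out of local minima.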


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 261-284(2005)

Prediction of Annual Demand for Passenger Cars in Japan: Analysis Based on Registration-cohort-wise Hazard Model of Scrapping Cars

Yoh Fujisaki

(The Graduate University for Advanced Studies)

Kunio Tanabe

(Department of Science and Engineering, Waseda University)

Prediction of demand for passenger cars has been widely practiced using multiple regression on various economic variables. However, this tends to give less than satisfactory results. Based on an analysis of annual data on scrapped cars by registration cohort, we propose a new method for predicting annual demand for passenger cars in Japan. By examining the annual registration and scrap volume data, the hazard rate of scrapping a passenger car is found to be well approximated by the logistic function, and the ‘Logistic Hazard Distribution (LHD)’ is introduced for the probability of a passenger car being scrapped over time. Using this distribution model, the scrap probability for each registration cohort is estimated in order to estimate the annual total volume of scrapped cars. Annual demand for passenger cars is also found to be almost proportional to the scrap volume of the subsequent year. By combining these two observations, a new method for predicting annual demand for passenger cars is introduced. To obtain a better prediction, we also introduce perturbation terms in the LHD to accommodate irregularities in scrap volumes caused by safety-regulation legislation. Numerical constrained optimization is used extensively to obtain the maximum truncated likelihood estimates of the parameters, and the AIC statistic is employed to determine the number of perturbation terms. It is demonstrated that much better prediction is possible with our method than with the existing one, and that sudden changes in demand may be predicted far in advance.

**Key words: Prediction of annual demand for passenger cars, scrapped cars, legislation for safety regulation, logistic hazard distribution, LHD, convoluted scrapping cars, registration cohort analysis.**
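The mechanics of a logistic hazard can be sketched as follows: an annual scrapping probability that rises with car age along a logistic curve, turned into a distribution of scrapping age by multiplying survival-to-date by the current hazard. The parameter values below are illustrative placeholders, not the paper's estimates.

```python
import math

def hazard(age, k=0.7, t0=10.0, hmax=0.95):
    # logistic hazard: annual scrapping probability rising with car age
    return hmax / (1.0 + math.exp(-k * (age - t0)))

def scrap_distribution(max_age=40):
    # P(scrapped in year t) = P(survived to t) * hazard(t)
    probs, surv = [], 1.0
    for t in range(max_age):
        probs.append(surv * hazard(t))
        surv *= 1.0 - hazard(t)
    return probs

probs = scrap_distribution()
```

With these placeholder parameters the scrapping-age distribution integrates to (essentially) one and peaks at the hazard's midpoint `t0`, i.e. around age 10.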


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 285-295(2005)

Reconstructing Family Trees from Nuzi Personal Names

Sumie Ueda

(The Institute of Statistical Mathematics)

Kumi Makino

(Lake Biwa Museum, Shiga Prefecture)

Yoshiaki Itoh

(The Institute of Statistical Mathematics)

We introduce a computer algorithm for reconstructing family trees from the book ‘Nuzi Personal Names’ (Gelb et al., 1943) by unifying the family trees given in it.

The clay tablets discovered at Nuzi, a city located in northeast Iraq in about the 15th century B.C., belong either to private archives found in the houses of rich families or to official archives in the palace. The texts on the tablets concern land transactions (buying, renting, exchanging), family contracts in the form of marriage documents, litigations, loans, slavery contracts and many others, which enable us to reconstruct the social and economic life of Nuzi in the middle of the 2nd millennium B.C. The several thousand personal names mentioned in the texts furnish the main source for evaluating the ethnic background of Nuzi in this period. ‘Nuzi Personal Names’ gives an index of personal names with kinships.

To estimate family trees from the kinship data, we take the kinship information for each personal name from the book ‘Nuzi Personal Names’ and compile a database. Several names are written with more than 10 different spellings, and there are many common names. We say that family tree $A$ and family tree $B$ are consistent with each other if and only if at least two names are common to $A$ and $B$ and there is no contradiction in the kinship relations of the names appearing in both. We apply a sequential algorithm that compares each family tree with the others and unifies two family trees whenever they are consistent with each other, continuing until no family trees remain to be unified.

We obtain three family trees spanning five generations, three spanning four generations, 62 spanning three generations, and so on. The largest family tree, that of the ‘Teḫip-tilla’ family, spans five generations with 15 members. We constructed the family trees purely from the personal-name data by using the above sequential algorithm. Our results agree with those obtained by philologists from the details of the clay tablets. Some of the family trees obtained in our paper will help in understanding the Nuzi society of the 15th century B.C.

**Key words: Nuzi personal names, database, computer algorithm, family tree.**
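The unification rule — merge two trees when they share at least two names and agree on all common kinship relations — can be sketched compactly if each tree is simplified to a mapping from person to recorded father (a deliberate simplification of the book's richer kinship data; the names below are placeholders):

```python
def names(tree):
    # every individual mentioned in a tree (children and fathers)
    return set(tree) | set(tree.values())

def consistent(a, b):
    # at least two shared names, and no person with conflicting fathers
    if len(names(a) & names(b)) < 2:
        return False
    return all(a[p] == b[p] for p in set(a) & set(b))

def unify_all(trees):
    # sequential unification: merge any consistent pair until none remains
    trees = [dict(t) for t in trees]
    changed = True
    while changed:
        changed = False
        for i in range(len(trees)):
            for j in range(i + 1, len(trees)):
                if consistent(trees[i], trees[j]):
                    trees[i].update(trees[j])
                    del trees[j]
                    changed = True
                    break
            if changed:
                break
    return trees

# two fragments sharing the pair A, B merge; the unrelated X-Y tree does not
result = unify_all([{"A": "B"}, {"B": "C", "A": "B"}, {"X": "Y"}])
```

The two-shared-names requirement is what guards against spurious merges caused by the many common names noted in the abstract.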


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 297-315(2005)

Optimal Magnetic Shielding Design of the MAGLEV with Second-order Cone Programming

Takashi Tsuchiya

(The Institute of Statistical Mathematics)

Takashi Sasakawa

(Railway Technical Research Institute)

This paper deals with an application of second-order cone programming to an optimal magnetic shielding design problem for the MAGLEV (superconducting MAGnetically LEVitated vehicle), a new bullet train under development in Japan. The MAGLEV is held in the air by a strong magnetic field and propelled by linear synchronous motors. Each car is equipped with several superconducting magnet units which generate the magnetic field. Passengers inside the car need to be shielded from the magnetic field outside. The optimal design problem of magnetic shielding is to minimize the weight of the shielding by adjusting its thickness, taking into account the external magnetic field.

After some appropriate simplification, this optimization problem is formulated as a second-order cone program and is solved with the primal-dual interior-point algorithm. Taking advantage of its efficiency and stability, the algorithm is further applied to robust design of magnetic shielding.

**Key words: Second-order cone programming, interior-point methods, magnetic shielding, MAGLEV train, optimal design, robust optimization.**
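A second-order cone constraint has the form $\|\bar{x}\|_2 \le x_0$. As a small self-contained illustration of the geometry these solvers work with (the standard Euclidean projection onto the cone — a textbook formula, not the paper's shielding model), consider:

```python
import math

def project_soc(x0, xbar):
    # Euclidean projection of (x0, xbar) onto {(t, v) : ||v||_2 <= t}
    nrm = math.sqrt(sum(v * v for v in xbar))
    if nrm <= x0:
        return x0, list(xbar)              # already inside the cone
    if nrm <= -x0:
        return 0.0, [0.0] * len(xbar)      # inside the polar cone: project to 0
    alpha = (x0 + nrm) / 2.0               # otherwise: land on the boundary
    return alpha, [alpha * v / nrm for v in xbar]
```

For example, the point $(0, (3,4))$ projects to $(2.5, (1.5, 2.0))$, which lies exactly on the cone boundary since $\|(1.5, 2.0)\|_2 = 2.5$.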


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 317-329(2005)

Knowledge Discovery from Large Scale Databases Based on Closed Pattern Enumeration

Takeaki Uno

(National Institute of Informatics)

Hiroki Arimura

(Graduate School, Information Science and Technology, Hokkaido University)

Learning and knowledge discovery by database analysis play important roles in science, industry, and economics. Finding distinctive patterns and association rules is a common approach to such learning and knowledge discovery. However, this takes a long time if the database is huge, and a huge number of unnecessary candidate patterns must be handled. In this paper, we propose a method for efficiently finding such patterns and rules by using the closed pattern mining algorithm LCM, which we proposed previously. Closed patterns are representatives of the patterns having the same denotation; hence we can reduce both the number of candidate patterns and the computation time. We also propose an algorithm for evaluating rules quickly, which allows us to enumerate all association rules of high confidence or with other interesting properties.

**Key words: Frequent pattern, closed pattern, association rule, algorithm, data mining, machine learning.**
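A closed pattern is the intersection of all transactions containing it, so patterns with the same denotation collapse to one representative. The brute-force sketch below makes that definition concrete; it is only an illustration of the concept — LCM itself enumerates each closed pattern without duplicates and far more efficiently.

```python
from itertools import combinations

def closure(itemset, db):
    # the closed pattern for an itemset: the intersection of all
    # transactions that contain it
    covers = [t for t in db if itemset <= t]
    out = set(covers[0])
    for t in covers[1:]:
        out &= t
    return frozenset(out)

def closed_frequent(db, minsup):
    # naive enumeration: take the closure of every frequent itemset;
    # itemsets with the same denotation yield the same closed pattern
    items = set().union(*db)
    result = set()
    for r in range(1, len(items) + 1):
        for comb in combinations(sorted(items), r):
            s = set(comb)
            if sum(1 for t in db if s <= t) >= minsup:
                result.add(closure(s, db))
    return result

db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
closed = closed_frequent(db, minsup=2)
```

In this toy database five frequent itemsets collapse to just three closed patterns, which is exactly the candidate-reduction effect the abstract describes.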


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 331-348(2005)

Solving Least Squares Problems Using GMRES Methods

Ken Hayami

(National Institute of Informatics)

Tokushi Ito

(Business Design Laboratory Co., Ltd.)

The most commonly used iterative method for solving large sparse least squares problems $\min_{x \in \mathbb{R}^n} \| b - Ax \|_2$, $A \in \mathbb{R}^{m \times n}$, is the CGLS method, which applies the (preconditioned) conjugate gradient method to the normal equations $A^\top A x = A^\top b$.

This paper considers alternative methods that use an $n \times m$ matrix $B$ and apply the Generalized Minimal Residual (GMRES) method, a robust Krylov subspace iterative method for solving systems of linear equations with a nonsymmetric coefficient matrix, to $\min_{z \in \mathbb{R}^m} \| b - ABz \|_2$ or $\min_{x \in \mathbb{R}^n} \| Bb - BAx \|_2$.

Next, we give a sufficient condition on $B$ for the proposed methods to give a least squares solution without breakdown for arbitrary $b$, for over-determined, under-determined and possibly rank-deficient problems. Then, as an example of $B$, we propose the IMGS($l$) method, an incomplete QR decomposition.

Finally, we show by numerical experiments on full-rank over-determined and under-determined problems that, for ill-conditioned problems, the proposed method using the IMGS(0) method, which is equivalent to diagonal scaling, gives a least squares solution faster than previous preconditioned CGLS methods.

**Key words: Least squares problems, iterative method, Krylov subspace method, GMRES method, CGLS method, preconditioning.**
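The CGLS baseline mentioned above applies conjugate gradients to the normal equations while only ever multiplying by $A$ and $A^\top$, never forming $A^\top A$. A minimal unpreconditioned sketch (plain lists, no sparse storage — purely illustrative):

```python
def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def matTvec(A, y):
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cgls(A, b, iters=50):
    # conjugate gradients on A^T A x = A^T b, using only products
    # with A and A^T (the matrix A^T A is never formed)
    x = [0.0] * len(A[0])
    r = list(b)                 # residual b - A x
    s = matTvec(A, r)           # normal-equations residual A^T r
    p = list(s)
    gamma = dot(s, s)
    for _ in range(iters):
        q = matvec(A, p)
        qq = dot(q, q)
        if qq == 0.0:
            break
        alpha = gamma / qq
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = matTvec(A, r)
        gamma_new = dot(s, s)
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
        if gamma < 1e-28:       # A^T r ~ 0: least squares optimality reached
            break
    return x

# over-determined 3x2 example whose least squares solution is x = [1, 2]
x = cgls([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0])
```

Since the iteration works on $A^\top A$, its convergence degrades with the squared condition number of $A$, which is the weakness that motivates the GMRES-based alternatives of this paper.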


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 349-360(2005)

A Pivoting Algorithm for a Class of Second-order Cone Programming

Keisuke Kurita

(Department of Computer Science, Graduate School of Electro-Communications, The University of Electro-Communications)

Masakazu Muramatsu

(Department of Computer Science, The University of Electro-Communications)

A pivoting algorithm proposed by Muramatsu (2003) for solving a class of second-order cone programming (SOCP) problems is implemented for the first time. Numerical experiments show that the performance of the algorithm depends heavily on the choice of pivot selection rule. Furthermore, we propose an improvement of the pivoting procedure of Muramatsu (2003), which enhances the efficiency of the algorithm. The algorithm performs best with the *maximum decrease strategy* pivoting rule proposed by Muramatsu (2003) combined with the improved pivoting procedure.

**Key words: Second-order cone programming problems, pivoting method.**


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 361-373(2005)

Solving State-constrained Optimal Control Problems by Dual Sequential Quadratic Programming with Cutting Plane Strategy

Satoshi Ito

(The Institute of Statistical Mathematics)

An approach based on dual sequential quadratic programming (SQP) coupled with a cutting plane strategy is proposed as a numerical method for solving optimal control problems with state trajectory inequality constraints. This method applies the dual SQP to a sequence of relaxed problems with a finite number of pointwise state inequality constraints generated in a cutting plane sense, and thus converts the original optimal control problem into a series of finite-dimensional quadratic programs. The approach can also be viewed as a sequential estimation procedure of the multiplier for the state trajectory constraints by nondecreasing step functions and is hence expected to perform well on high-order state constraints.

**Key words: Optimal control problem, state constraint, sequential quadratic programming, cutting plane method.**
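The cutting plane strategy — replace a trajectory constraint that must hold at every time with a growing finite set of pointwise constraints, adding the most violated point each round — can be shown on a deliberately tiny semi-infinite toy problem (ours, not the paper's optimal control formulation): minimize $(x-3)^2$ subject to $xt \le 1$ for all $t \in [0,1]$.

```python
def solve_relaxed(cuts, c=3.0):
    # minimize (x - c)^2 subject to x*t <= 1 for finitely many cuts t;
    # for this scalar toy problem the solution is a simple clamp
    x = c                                  # unconstrained minimizer
    for t in cuts:
        if t > 0.0:
            x = min(x, 1.0 / t)
    return x

def cutting_plane(tol=1e-9, grid=101):
    # relax "x*t <= 1 for all t in [0,1]" to pointwise cuts, adding the
    # most violated time point until no violation remains
    ts = [i / (grid - 1) for i in range(grid)]
    cuts = []
    while True:
        x = solve_relaxed(cuts)
        viol, t_star = max((x * t - 1.0, t) for t in ts)
        if viol <= tol:
            return x, cuts
        cuts.append(t_star)

x_opt, cuts = cutting_plane()
```

Here a single cut at $t = 1$ suffices and the method stops at $x = 1$; in the paper's setting each relaxed problem is a finite-dimensional quadratic program solved via the dual SQP.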


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 375-389(2005)

Atsuyuki Kogure

(Faculty of Policy Management, Keio University)

Masahiko Sagae

(Faculty of Engineering, Gifu University)

We consider the problem of estimating a density from percentiles. Prior results such as Lecoutre (1987) and Scott (1992) suggest that a histogram built from percentiles suffers from large bias. We re-examine the bias problem and show by asymptotic arguments that the kernel method alleviates it. To confirm the theoretical results, we present simple simulation studies.

**Key words: Histogram, percentiles, bin frequency, kernel estimator, bias.**
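The two estimators being compared can be sketched as follows: given percentiles, the histogram spreads each probability increment uniformly between consecutive percentiles, while a kernel estimator smooths the same increments with a Gaussian kernel. The bandwidth, bin placement, and uniform-deciles example are our illustrative choices, not the paper's exact estimators.

```python
import math

def hist_density(quantiles, p_step):
    # piecewise-constant estimate: mass p_step spread uniformly
    # between consecutive percentiles
    return [p_step / (b - a) for a, b in zip(quantiles, quantiles[1:])]

def kernel_density(x, quantiles, p_step, h=0.1):
    # Gaussian-kernel smoothing of the same masses, placed at bin midpoints
    mids = [(a + b) / 2.0 for a, b in zip(quantiles, quantiles[1:])]
    norm = h * math.sqrt(2.0 * math.pi)
    return sum(p_step * math.exp(-(x - m) ** 2 / (2.0 * h * h)) / norm
               for m in mids)

# deciles of the Uniform(0, 1) distribution, each bin carrying mass 0.1
q = [i / 10.0 for i in range(11)]
hist = hist_density(q, 0.1)
kd = kernel_density(0.5, q, 0.1)
```

For the uniform density both estimators recover the true value 1 in the interior; the bias differences the paper analyzes arise for curved densities and near the support boundary.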


Proceedings of the Institute of Statistical Mathematics Vol.53, No.2, 391-404(2005)

The Layered Least Squares Method —Limit of Weighted Least Squares Method—

Takashi Tsuchiya

(The Institute of Statistical Mathematics)

The weighted least squares method is a simple generalization of the ordinary least squares method. This paper deals with an extreme case where the ratio among the weights in the weighted least squares problem diverges to infinity. It is known that a limiting solution exists and is well-defined. This limit of the weighted least squares method is referred to as the layered least squares method. In this paper, we analyze the behavior of the weighted least squares solution when the ratio among the weights tends to infinity, and establish bounds on the difference between the weighted least squares solution and the layered least squares solution in terms of the ratios and condition numbers of the matrix involved in the least squares problem.

**Key words: Weighted least squares problems, layered least squares problems, condition numbers, interior-point algorithms.**
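The limiting behavior described above can be seen on a scalar toy problem (ours, purely for illustration): minimize $w(x-a)^2 + (x-b)^2$. As $w \to \infty$ the heavily weighted block is solved first, pinning $x = a$, which is the layered least squares solution; the gap shrinks like $1/w$.

```python
def weighted_ls(w, a=2.0, b=5.0):
    # closed-form minimizer of w*(x - a)^2 + (x - b)^2;
    # the layered (w -> infinity) limit is x = a
    return (w * a + b) / (w + 1.0)

# distance to the layered solution for w = 10^2, 10^3, 10^4
gaps = [abs(weighted_ls(10.0 ** k) - 2.0) for k in (2, 3, 4)]
```

Each tenfold increase in the weight ratio shrinks the gap by roughly a factor of ten, matching the kind of ratio-dependent bounds the paper establishes (its bounds also involve condition numbers of the matrix, which are trivial in this scalar example).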
