Proceedings of the Institute of Statistical Mathematics Vol.49, No.1, 9-21(2001)
This review explains how principal component analysis (PCA) and multidimensional scaling (MDS) are used to explore how information is represented in the visual areas of the brain. After briefly reviewing the anatomical and physiological findings on the visual areas, I introduce two studies: the comparison of the physiological and physical spaces of faces by Young and Yamane, and the comparison of the psychological and physical spaces of two-dimensional shapes by Makioka et al. Young and Yamane analyzed the population of face-responsive neurons in area AIT by means of MDS. They compared the resulting configuration with one derived from physical measurements of the faces and found the two to be similar. Makioka et al. likewise found a similarity between the psychological and physical spaces. Recently, we used MDS to analyze the dynamical behavior of a population of face-responsive neurons in the monkey temporal cortex. We found that the face representation is embedded in the population with a hierarchical structure, and that this internal representation changes dynamically. In the early phase (90 ms–140 ms), the population formed clusters corresponding to rough categories. In the later phase (140 ms–190 ms), each cluster expanded to form sub-clusters corresponding to finer categories.
Key words: Brain, visual area, neuron, information representation, PCA, MDS.
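The kind of analysis described above can be sketched with classical (Torgerson) MDS. This is only an illustration under stated assumptions: the abstract does not say which MDS variant or distance measure was used, and the response matrix below is random, hypothetical data (rows are stimuli, columns are neurons), not recorded firing rates.

```python
import numpy as np

def classical_mds(dist, n_components=2):
    """Embed n items from an (n, n) matrix of pairwise distances (Torgerson MDS)."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering      # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale              # (n, n_components) coordinates

# Hypothetical population responses: 6 stimuli x 40 neurons.
rng = np.random.default_rng(0)
responses = rng.normal(size=(6, 40))
dist = np.linalg.norm(responses[:, None, :] - responses[None, :, :], axis=-1)
coords = classical_mds(dist, n_components=2)      # one 2-D point per stimulus
```

Stimuli that evoke similar population responses land close together in `coords`, which is what allows such a configuration to be compared with one built from physical measurements of the stimuli.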
Proceedings of the Institute of Statistical Mathematics Vol.49, No.1, 23-42 (2001)
This article describes how principal component analysis is used in facial image recognition and introduces some recent improvements in the field.
First, we describe the role of principal component analysis in image recognition technology. We then point out some difficulties in facial image recognition, for example image changes caused by changes in illumination and nonlinear distributions caused by changes in head pose.
Finally, we introduce some improvements to principal component analysis and show how they address these problems.
Key words: Pattern recognition, computer vision, principal component analysis, facial image recognition.
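The basic role of principal component analysis in face recognition can be sketched as follows: images are projected onto a few principal axes and a query is matched to the nearest gallery image in that subspace. This is a generic eigenface-style sketch, not the specific methods surveyed in the article; the 8 x 8 "images", the three identities, and all function names are hypothetical.

```python
import numpy as np

def fit_pca(images, n_components):
    """images: (n_samples, n_pixels). Returns the mean image and principal axes."""
    mean = images.mean(axis=0)
    _, _, vt = np.linalg.svd(images - mean, full_matrices=False)
    return mean, vt[:n_components]                # rows of vt are orthonormal axes

def project(images, mean, axes):
    """Coordinates of each image in the principal subspace."""
    return (images - mean) @ axes.T

def nearest_label(query, gallery_codes, gallery_labels, mean, axes):
    """Label of the gallery image closest to the query in the subspace."""
    code = project(query[None, :], mean, axes)
    idx = np.argmin(np.linalg.norm(gallery_codes - code, axis=1))
    return gallery_labels[idx]

# Synthetic stand-in for face images: 3 identities, 5 noisy samples each.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(3, 64))
labels = np.repeat(np.arange(3), 5)
gallery = prototypes[labels] + 0.1 * rng.normal(size=(15, 64))

mean, axes = fit_pca(gallery, n_components=2)
codes = project(gallery, mean, axes)
query = prototypes[1] + 0.1 * rng.normal(size=64)
predicted = nearest_label(query, codes, labels, mean, axes)
```

A linear subspace of this kind is exactly what illumination and head-pose changes strain, which is why the article goes on to discuss improvements.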
Proceedings of the Institute of Statistical Mathematics Vol.49, No.1, 43-56(2001)
Principal component analysis is a powerful tool for investigating the protein energy landscape, since protein fluctuations are highly anisotropic in nature. Because of the extreme complexity of protein three-dimensional structures, the protein energy surface in the native state contains a large number of substates that are energetically comparable with one another and distinguishable by their conformations. The Jumping-Among-Minima (JAM) model is proposed in order to explore the protein energy landscape, and a novel picture of that landscape is discussed.
Key words: Protein, energy landscape, JAM model, harmonicity, anharmonicity.
Proceedings of the Institute of Statistical Mathematics Vol.49, No.1, 57-75(2001)
To determine the three-dimensional structures of biological macromolecules such as proteins and DNA, distance geometry calculations are applied to NMR experimental information on the distances between pairs of hydrogen atoms. These calculations are based on two algorithms: embedding and multi-dimensional simulated annealing. The principle and applications of the latter method, developed by the authors, are described. To analyze the dynamic structure formation of these biopolymers, we need free energy landscapes in the multi-dimensional conformational space, which are obtained by recently developed enhanced conformational sampling methods. The authors' group has recently developed the multicanonical molecular dynamics method and applied it to many biological molecular systems, including peptides and local fragments of proteins. Principal component analysis is a powerful tool for interpreting these complicated free energy landscapes, and it has led to new findings.
Key words: Distance geometry, simulated annealing, free energy landscape, multicanonical ensemble, principal component analysis.
Proceedings of the Institute of Statistical Mathematics Vol.49, No.1, 77-107(2001)
Recovering the camera motion and the object shape from multiple images is a fundamental and important problem in computer vision. In particular, the version of the problem in which point correspondences are given is the most fundamental and most important. Many methods have been proposed to solve it. Among them, the factorization method is an excellent one because it is numerically stable and gives a good reconstruction, although it is based on an affine approximation of the perspective projection. The factorization method is useful not only for solving the problem but also for understanding its mathematical meaning under affine approximated projection. This paper gives a mathematical analysis of recovering the camera motion and the object shape by the factorization method from multiple affine approximated projection images under point correspondences. It also considers how to recover the camera motion and the object shape from perspective images by estimating the affine approximated projection images from them, as well as the recursive factorization method.
Key words: Structure from motion, factorization method, recursive, Metric Affine Projection, perspective projection.
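The core step of the factorization method can be sketched as a rank-3 singular value decomposition of the centered measurement matrix. This is a minimal sketch in the style of the well-known Tomasi-Kanade formulation, assuming noise-free affine (here orthographic) projections; it stops at the affine reconstruction and omits the metric constraints that remove the remaining 3 x 3 ambiguity. The synthetic cameras and points are hypothetical.

```python
import numpy as np

def affine_factorization(tracks):
    """tracks: (2F, P) image coordinates of P points tracked over F frames.

    Returns motion (2F, 3), shape (3, P) and the centered measurement matrix.
    The factors are determined only up to an invertible 3x3 matrix.
    """
    centered = tracks - tracks.mean(axis=1, keepdims=True)   # removes translation
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    root = np.sqrt(s[:3])
    motion = u[:, :3] * root
    shape = root[:, None] * vt[:3]
    return motion, shape, centered

# Synthetic scene: 20 points seen by 4 orthographic cameras.
rng = np.random.default_rng(0)
points = rng.normal(size=(3, 20))
frames = []
for _ in range(4):
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))             # random orthogonal matrix
    frames.append(q[:2] @ points + rng.normal(size=(2, 1)))  # project and translate
tracks = np.vstack(frames)

motion, shape, centered = affine_factorization(tracks)
residual = np.linalg.norm(centered - motion @ shape)         # near zero without noise
```

With noisy tracks the same truncation gives the best rank-3 approximation in the least-squares sense, which is one reason the method is numerically stable.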
Proceedings of the Institute of Statistical Mathematics Vol.49, No.1, 109-131(2001)
The use of principal component analysis and factor analysis (PCA/FA) on multivariate time series is discussed. The standard results on PCA/FA rely on the assumption of independent sampling from an identical multivariate distribution, which is almost always inappropriate in time series settings. In this paper, we review two approaches to PCA/FA that accommodate serially correlated observations. The first approach applies the discrete Fourier transform (DFT) to the time series; the asymptotic independence of the DFT makes the ordinary PCA/FA techniques valid in the frequency domain. The second group of methods is more direct in the sense that an explicit time series model is given to the latent factor series, which can be called dynamic factor analysis. We detail several styles of modeling dynamic factors, namely simultaneous structural equations, structural time series models, and the canonical transformation of time series. Reduced rank regression models and error correction models are also related to the dynamic factor model, but they do not assume any explicit model for the latent factor series. We also note that principal component analyses in the time domain pay little attention to the serial correlation of a time series. A numerical example in the final section highlights the importance of considering a lagged common factor, which is not detected by a naive application of PCA.
Key words: Multivariate time series, principal component analysis, factor analysis, frequency domain, time domain.
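The point about a lagged common factor can be illustrated with a small simulation. This is not the paper's numerical example; it is a hypothetical two-series model in which one series loads on a white-noise factor f_t and the other on f_{t-1}, with an assumed noise level of 0.3.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
factor = rng.normal(size=n + 1)                 # white-noise common factor
x1 = factor[1:] + 0.3 * rng.normal(size=n)      # x1_t loads on f_t
x2 = factor[:-1] + 0.3 * rng.normal(size=n)     # x2_t loads on f_{t-1}

def lagged_corr(a, b, lag):
    """Sample correlation between a_t and b_{t+lag}, for lag >= 0."""
    if lag == 0:
        return np.corrcoef(a, b)[0, 1]
    return np.corrcoef(a[:-lag], b[lag:])[0, 1]

corr_lag0 = lagged_corr(x1, x2, 0)              # what ordinary PCA sees
corr_lag1 = lagged_corr(x1, x2, 1)              # reveals the shared factor

# Eigenvalues of the contemporaneous correlation matrix: both close to 1,
# so ordinary PCA finds no dominant common component.
eigvals = np.linalg.eigvalsh(np.corrcoef(np.vstack([x1, x2])))
```

The contemporaneous correlation is near zero while the lag-one cross-correlation is large, so a PCA of the contemporaneous covariance alone would miss the common factor entirely.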
Proceedings of the Institute of Statistical Mathematics Vol.49, No.1, 133-153(2001)
The need to extract features from large-scale data, such as data on genetic systems or nonlinear dynamical systems, is growing. Non-metric multidimensional scaling (NMDS), a kind of multivariate analysis used mainly in the social sciences, may be of use for such feature extraction. NMDS tries to embed objects into a certain metric space, e.g., a Euclidean space, so that the rank order of the distances between objects is preserved as far as possible. The method may be regarded as the most unprejudiced way to extract features, but conventional approaches do not seem to have paid particular attention to large-scale data. We propose a new NMDS algorithm that is efficient for large-scale data and introduce a statistical test to evaluate how well the resulting configuration explains the data.
Key words: Multidimensional scaling, large scale data, information compression, goodness of fit.
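What "preserving the rank order of distances" means can be made concrete with a simple check. The sketch below is neither the proposed algorithm nor the proposed statistical test; it only computes a Spearman-type rank correlation between given dissimilarities and the distances in a candidate configuration, on hypothetical random data and assuming no tied values.

```python
import numpy as np

def pair_values(matrix):
    """Upper-triangular entries (one value per unordered pair of objects)."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]

def ranks(values):
    """Ranks 0..m-1 of the values; assumes there are no ties."""
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(len(values))
    return r

def rank_agreement(dissim, coords):
    """Rank correlation between dissimilarities and embedded distances."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.corrcoef(ranks(pair_values(dissim)), ranks(pair_values(dist)))[0, 1]

rng = np.random.default_rng(0)
coords = rng.normal(size=(8, 2))
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
dissim = dist ** 3                               # monotone distortion of the distances
score = rank_agreement(dissim, coords)           # rank order fully preserved
score_random = rank_agreement(dissim, rng.normal(size=(8, 2)))
```

Because only ranks enter the comparison, any monotone distortion of the dissimilarities leaves the agreement unchanged, which is the sense in which NMDS makes few assumptions about the data.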
Proceedings of the Institute of Statistical Mathematics Vol.49, No.1, 155-174 (2001)
Generalized binomial coefficients for partitions of nonnegative integers appear in a class of distribution functions arising in multivariate analysis, including those of the eigenroots of non-central F matrices. The coefficients of degree up to and including eight were tabulated by Pillai and Jouris (1969). No efficient algorithm, however, had been available for obtaining those of higher degree, because of the difficulty of computing zonal polynomials.
Recently, an algorithm for expressing zonal polynomials of arbitrary order in terms of elementary symmetric polynomials has been proposed by Hashiguchi, the second author of this article, and his coworkers, including the third author. Based on their work, this article proposes a procedure for obtaining the partitional generalized binomial coefficients as rational numbers.
Let $\mathcal{C}_\kappa(Y)$ and $\mathcal{E}_\kappa(Y)$ denote the zonal polynomial and the elementary symmetric polynomial, respectively, identified with a partition $\kappa$ of length not greater than $p$ and a $p \times p$ symmetric matrix $Y$ of independent variables. To obtain the $p$-variate binomial coefficients, $\mathcal{C}_\kappa(I+Y)$ must be expanded into a linear combination of the elements of $\{\mathcal{C}_\sigma(Y) \mid \deg(\sigma) \le \deg(\kappa)\}$, where $I$ is the identity matrix. The expansion is carried out in three steps. First, using the results of Hashiguchi et al. (2000), $\mathcal{C}_\kappa(I+Y)$ is transformed into a linear combination of the elements of $\{\mathcal{E}_\lambda(I+Y) \mid \deg(\lambda) = \deg(\kappa),\ \lambda \succeq \kappa\}$. Second, these are expanded into a series in $\mathcal{E}_\mu(Y)$ for $\deg(\mu) \le \deg(\lambda)$. Third, that series is transformed back into a series in $\mathcal{C}_\sigma(Y)$, where $\deg(\sigma) = \deg(\mu)$ and $\sigma \succeq \mu$. The ordering denoted by the symbol $\succeq$ between two partitions of the same degree is the lexicographic ordering.
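For orientation, the expansion in question can be written out explicitly. The abstract does not restate the normalization, so the form below follows the convention commonly attributed to Constantine and should be read as an assumption; the symbols $k$, $s$ and $\binom{\kappa}{\sigma}$ are introduced here for illustration.

\[
\frac{\mathcal{C}_\kappa(I+Y)}{\mathcal{C}_\kappa(I)}
  = \sum_{s=0}^{k} \sum_{\sigma \vdash s} \binom{\kappa}{\sigma}
    \frac{\mathcal{C}_\sigma(Y)}{\mathcal{C}_\sigma(I)},
\qquad k = \deg(\kappa).
\]

In the scalar case $p = 1$, with $\kappa = (k)$ and $\mathcal{C}_{(k)}(y) = y^k$, this reduces to the ordinary binomial theorem $(1+y)^k = \sum_{s=0}^{k} \binom{k}{s} y^s$, which is why the coefficients $\binom{\kappa}{\sigma}$ are called generalized binomial coefficients.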
The authors have already obtained all the bivariate binomial coefficients of degree up to 30, and these have been published on the World Wide Web. The coefficients of degrees 9 and 10 are also shown in this article. As an illustration, their application to Roy's largest root test is discussed, with numerical computation of relevant quantities, including the means, variances, and quantiles of the distributions of the largest eigenroots of F matrices, as well as the powers of the tests.
Key words: Distribution of eigenroots, elementary symmetric function, multivariate analysis, non-central F matrix, partition of nonnegative integer, zonal polynomial.
Proceedings of the Institute of Statistical Mathematics Vol.49, No.1, 175-198(2001)
A family of probability density functions containing a part that consists of the sum of a main variable and its reciprocal is introduced as a family of symmetric reciprocal distributions. Various functional forms of the probability density functions in the family are given, and several interesting identities of the density functions that characterize the family are shown. The family of probability distributions with reciprocal structures in exponential functions is introduced as a family of exponential reciprocal distributions. This family includes the inverse Gaussian distribution, the Birnbaum–Saunders distribution, and other important known distributions. Further, we see that the generalized inverse Gaussian distribution also belongs to the family. Some other symmetric reciprocal distributions that appear to be new are also derived. In addition, bounds for evaluating the probabilities of the reciprocal inverse Gaussian distribution are given, based on inequalities for the incomplete gamma function ratio. Some tables and figures concerning those bounds are presented.
Key words: Symmetric reciprocal function, symmetric function in a wide sense, symmetric reciprocal arithmetic mean, symmetric reciprocal geometric mean, symmetric reciprocal family of distributions, generalized inverse Gaussian distribution.