Proc. Inst. Statist. Math. 56-2

Extracting Pseudo-biclusters from Gene Expression Data Based on Suffix Tree

Tetsuro Namba

(Division of Computer Science, IST, Hokkaido University)

Makoto Haraguchi

(Division of Computer Science, IST, Hokkaido University)

Yoshiaki Okubo

(Division of Computer Science, IST, Hokkaido University)

This paper describes a method for finding pseudo-biclusters in gene expression data. For time series data, a linear-time algorithm based on a suffix tree has been proposed. Although this algorithm can efficiently enumerate all maximal biclusters, many of the resulting clusters often overlap. By combining such clusters, we observe that all genes in the combined cluster behave quite similarly within a common time span but differently after it. This observation is expected to provide valuable suggestions to experts. We therefore introduce the notion of a pseudo-bicluster, which consists of several maximal biclusters with some overlap, and design a polynomial-time algorithm that finds pseudo-biclusters using a suffix tree. Experimental results on gene expression data of an ascidian (Hoya) are also presented, including an interesting cluster actually extracted by the method.

Key words: Biclustering, pseudo-bicluster, suffix tree, gene expression data, time series data.
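
The overlap between maximal biclusters described in the abstract above can be illustrated with a small sketch. It is not the authors' algorithm: it assumes each expression series is discretized into up/down/no-change symbols ('U', 'D', 'N') with a hypothetical threshold eps, and it replaces the suffix tree with a brute-force dictionary over all contiguous spans, which is far slower than linear time. All function names and the toy data are invented for illustration.

```python
from collections import defaultdict

def discretize(series, eps=0.5):
    """Map consecutive differences to 'U' (up), 'D' (down), or 'N' (no change)."""
    symbols = []
    for prev, cur in zip(series, series[1:]):
        delta = cur - prev
        symbols.append("U" if delta > eps else "D" if delta < -eps else "N")
    return "".join(symbols)

def contiguous_biclusters(patterns, min_genes=2, min_len=2):
    """Group genes sharing the same symbol pattern over the same span.

    Keys are (start, end, pattern), with start/end indexing transitions
    (half-open). Brute force over all spans, not a suffix tree.
    """
    groups = defaultdict(list)
    for gene, pattern in patterns.items():
        n = len(pattern)
        for start in range(n):
            for end in range(start + min_len, n + 1):
                groups[(start, end, pattern[start:end])].append(gene)
    return {k: sorted(v) for k, v in groups.items() if len(v) >= min_genes}

def maximal_only(clusters):
    """Drop clusters whose span can be widened without losing any gene."""
    result = {}
    for (s, e, p), genes in clusters.items():
        extended = any(
            s2 <= s and e <= e2 and (s2, e2) != (s, e) and set(genes) <= set(g2)
            for (s2, e2, _), g2 in clusters.items()
        )
        if not extended:
            result[(s, e, p)] = genes
    return result

# Toy expression data (assumed values, not from the paper).
expression = {
    "g1": [1, 3, 5, 4, 2],
    "g2": [0, 2, 4, 3, 5],
    "g3": [2, 4, 6, 5, 3],
}
patterns = {gene: discretize(series) for gene, series in expression.items()}
maximal = maximal_only(contiguous_biclusters(patterns))
for (start, end, pattern), genes in sorted(maximal.items()):
    print(start, end, pattern, genes)
```

In this toy data the two maximal biclusters overlap: all three genes share the pattern over the first three transitions, while only g1 and g3 continue together over the fourth. This is the kind of "similar within a common span, different afterwards" behavior that motivates combining overlapping clusters; the precise definition of a pseudo-bicluster is given in the paper, not here.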

Feature Extraction by Geometric Algebra from Geometric Data

Minh Tuan Pham

(School of Engineering, Nagoya University)

Kanta Tachibana

(School of Engineering, Nagoya University)

Eckhard Hitzer

(School of Engineering, University of Fukui)

Sven Buchholz

(Institute for Informatics, University of Kiel)

Tomohiro Yoshikawa

(School of Engineering, Nagoya University)

Takeshi Furuhashi

(School of Engineering, Nagoya University)

Most conventional methods of feature extraction for pattern recognition do not pay sufficient attention to the inherent geometric properties of data, even where the data have characteristic spatial features. In this study, we introduce geometric algebra to systematically extract invariant geometric features from spatial data given in a vector space. Geometric algebra is a multidimensional generalization of complex numbers and of quaternions, and can accurately describe oriented spatial objects and the relations between them. We further propose a combination of several geometric features using Gaussian mixture models. We demonstrate our new method on the classification of handwritten digits and alphabetic characters.

Key words: Geometric algebra, feature extraction, Gaussian mixture model, pattern recognition, mixture of experts.
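
As a minimal illustration of the kind of invariant quantity geometric algebra provides, the following sketch computes the geometric product of two 2D vectors, which splits into a scalar (inner product) part and a bivector (outer product) part. Both parts are unchanged when the two vectors are rotated together. This is only a worked example of the underlying algebra under that 2D assumption; the function names are hypothetical, and it does not reproduce the features or the Gaussian mixture combination used in the paper.

```python
import math

def geometric_product_2d(a, b):
    """Return (scalar, bivector) parts of the geometric product ab in 2D.

    scalar   = a . b, the inner product
    bivector = coefficient of e1^e2 in the outer product a ^ b
    """
    scalar = a[0] * b[0] + a[1] * b[1]
    bivector = a[0] * b[1] - a[1] * b[0]
    return scalar, bivector

def rotate(v, theta):
    """Rotate a 2D vector counterclockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

a, b = (1.0, 2.0), (3.0, -1.0)
before = geometric_product_2d(a, b)
after = geometric_product_2d(rotate(a, 0.7), rotate(b, 0.7))

# In 2D the product matches complex arithmetic: conj(a) * b.
as_complex = complex(*a).conjugate() * complex(*b)

print(before)      # scalar and bivector parts for the original vectors
print(after)       # same values, up to rounding, after a joint rotation
print(as_complex)  # real part = scalar, imaginary part = bivector
```

The agreement with `conj(a) * b` reflects the sense in which geometric algebra generalizes complex numbers: in two dimensions the even part of the algebra behaves like the complex plane.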

Path Analysis in a Supermarket and String Analysis Technique

Katsutoshi Yada

(Faculty of Commerce, Kansai University)

This paper demonstrates the applicability and usefulness of a string analysis technique for deriving rules that characterize customers' visiting patterns in sales areas. It focuses on the stationary states of customers in particular sales areas of a store. We apply a string analysis technique, EBONSAI, to sales area visiting patterns in order to deal effectively with a huge stream of data. Experiments were conducted to extract useful rules and findings about the characteristics of these visiting patterns, and we discuss the problems that remain in existing string analysis techniques.

Key words: Supermarket, marketing, RFID, string analysis technique, EBONSAI.