Proceedings of the Institute of Statistical Mathematics Vol.68, No.2, 197-208 (2020)

Trend of IR in Japan: Emergence of IR Including Management, Teaching and Learning and Research

Reiko Yamada
(Faculty of Social Studies, Doshisha University)

In recent years, interest in institutional research (IR), which was developed at universities in the United States in the 1960s, has increased. In the US, IR is well established in areas such as university management support, decision support, strategic planning, academic improvement and assessment, but research-related IR is decentralized and is conducted by the department in charge of research rather than within the organization's IR division. In Japan, by contrast, many universities aim to achieve a global reach, and the number of universities engaged in "Research IR" is increasing due to external pressure to raise their ranking. IR in Japan currently changes with policy trends, and its varied functions make a unified definition extremely difficult to provide. This paper first examines the definition of IR and explores how IR trends relate to the environment surrounding higher education in Japan. Next, we define research IR as measuring the quality of the journals in which work is published (qualitative indicators) and the number of articles and citations (quantitative indicators), among others, and use a web survey targeting university research administrators (URAs) to examine what type of university is affected by the visualization of research results through research IR activities. The survey results suggest that research IR may have had a beneficial effect, not only in the immediate past but also over a 10-year span, on universities whose scientific research funding per awarded project is small. Although the data are limited at this time, research IR is not necessarily developed only for large-scale research universities; it also has a positive effect on universities whose scientific research funding per awarded project is small.

Key words: Decision making support, IR, research support, academic excellence, ranking, quality assurance.

Proceedings of the Institute of Statistical Mathematics Vol.68, No.2, 209-218 (2020)

Author Identification for Scientific Database with Topic Modeling and Its Performance Comparison

Tomokazu Fujino
(International College of Arts and Sciences, Fukuoka Women's University)
Hiroka Hamada
(The Institute of Statistical Mathematics)

We propose a method for extracting, from a scientific literature database, the list of articles whose authors are researchers at a specific organization. The method uses topic modeling, a technique from statistical natural language processing. Topic modeling is applied to papers that include the organization name, and a feature vector is created for each author. Based on these vectors, author identification is performed on papers that include the names of the organization's researchers but do not contain the organization name. We compared the discrimination performance of several topic models: Latent Dirichlet Allocation, Dirichlet Multinomial Regression, and the Correlated Topic Model.

Key words: Scientific database, evaluating research performance, statistical natural language processing, institutional research.
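
The identification step described in the abstract above can be illustrated with a minimal sketch. It assumes the topic model has already produced a topic-proportion vector for each paper. The function names, the averaging of an author's known papers into a profile, the cosine similarity measure, and the 0.8 threshold are all illustrative assumptions, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def author_profile(topic_vectors):
    """Average the topic vectors of papers known to belong to one author.

    These would be papers that carry the organization name (hypothetical input).
    """
    n = len(topic_vectors)
    return [sum(column) / n for column in zip(*topic_vectors)]


def is_same_author(profile, candidate, threshold=0.8):
    """Decide whether a paper lacking the organization name matches the profile.

    The threshold is an assumed value chosen for illustration only.
    """
    return cosine(profile, candidate) >= threshold


# Toy topic proportions over three topics (made-up numbers).
profile = author_profile([[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]])
print(is_same_author(profile, [0.75, 0.15, 0.10]))  # similar topic mix
print(is_same_author(profile, [0.05, 0.05, 0.90]))  # very different topic mix
```

In this toy case the first candidate shares the author's dominant topic and is accepted, while the second is rejected. In practice, the decision rule would be compared across topic models, as the abstract describes.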

Proceedings of the Institute of Statistical Mathematics Vol.68, No.2, 219-231 (2020)

Understanding Research Trends Based on Article Abstracts Using Topic Modeling

Mio Takei
(The Institute of Statistical Mathematics)
Tomokazu Fujino
(Faculty of International College of Arts and Sciences, Fukuoka Women's University)
Junji Nakano
(The Institute of Statistical Mathematics/Faculty of International Economics, Chuo University)

The financial difficulties experienced by universities due to declining birth rates and aging populations are becoming a social problem. To strategically select support targets in these institutions, it is necessary to identify and evaluate the trend of research activities inside and outside universities. Research evaluation methods often use article citation information such as the impact factor. However, several problems with this approach have been pointed out. Therefore, we employ a model that applies the Hierarchical Dirichlet Process (HDP) to Latent Dirichlet Allocation (LDA) to infer topics from article abstracts, in which the research content is directly expressed, and present a method for determining the research trend of each target organization and group. We evaluate the method using abstracts from representative journals in the field of statistical sciences and from institutes related to statistical sciences. The analysis confirms that the results identify the research characteristics of each target group and the research trends for each publication year.

Key words: Topic modeling, nonparametric Bayesian statistics, Hierarchical Dirichlet Process, institutional research.

Proceedings of the Institute of Statistical Mathematics Vol.68, No.2, 233-246 (2020)

Visualization of Research Fields Achieving Good Results in a Large University

Takamitsu Funayama
(Tohoku Medical Megabank Organization, Tohoku University)
Yoshiro Yamamoto
(School of Science, Tokai University)
Tomokazu Fujino
(International College of Arts and Sciences, Fukuoka Women's University)

Large universities employ many researchers across a wide range of fields, which makes it difficult to grasp the university's overall research activities. Understanding the research situation on a campus is necessary not only for evaluation but also for determining future support. In this study, we therefore extracted text data comprising the titles and abstracts of papers contained in an academic literature database and used a topic model to estimate the research fields of those papers. We also attempted to estimate which research fields were achieving good results. Visualizing the topic model output with a self-organizing map (SOM) made it possible to grasp the features of each topic and the relationships between topics. Using an example, we show that the SOM visualization makes it easy to grasp the university's research trends and their changes over time.

Key words: Topic model, SOM, data visualization.

Proceedings of the Institute of Statistical Mathematics Vol.68, No.2, 247-264 (2020)

Citations of Academic Articles and Statistical Articles in Fields of Sciences

Livia Lin-Hsuan Chang
(Department of Statistical Science, School of Multidisciplinary Sciences, Graduate University for Advanced Studies)
Frederick Kin Hing Phoa
(Institute of Statistical Science, Academia Sinica)
Junji Nakano
(Department of Global Management, Chuo University/The Institute of Statistical Mathematics)

Statistics has attracted growing attention in recent years due to the rise of big data analysis and machine learning. Statistical methods are widely used in academic studies that require statistical analysis to objectively support their conclusions. Modern society has many academic fields, and competition among them is severe. For statistics to survive this competition, it is important for statisticians to measure the influence of articles in the field of statistics relative to those in other academic fields. In this work, we analyze citations within each academic field, focusing on citations of statistical articles. We used the "Web of Science" database of academic articles to define academic fields and to count the citations needed for the study.

Key words: Academic fields, citation analysis, Web of Science.
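
The kind of counting described in the abstract above can be sketched as follows. The record layout (a mapping from article ID to field, and a list of citing/cited pairs), the function names, and the toy data are all hypothetical. The sketch does not reproduce the Web of Science schema or the paper's actual field definitions.

```python
from collections import Counter


def field_citation_counts(field_of, citations):
    """Count citations between fields.

    field_of:  dict mapping article ID -> field label (assumed layout)
    citations: iterable of (citing_id, cited_id) pairs (assumed layout)
    Returns a Counter keyed by (citing_field, cited_field).
    """
    counts = Counter()
    for citing_id, cited_id in citations:
        counts[(field_of[citing_id], field_of[cited_id])] += 1
    return counts


def citation_share(counts, citing_field, cited_field):
    """Fraction of a field's outgoing citations that point to cited_field."""
    total = sum(n for (src, _), n in counts.items() if src == citing_field)
    if total == 0:
        return 0.0
    return counts[(citing_field, cited_field)] / total


# Toy data (made up): two statistics articles and two biology articles.
field_of = {"a1": "statistics", "a2": "statistics",
            "b1": "biology", "b2": "biology"}
citations = [("b1", "a1"), ("b2", "a1"), ("b1", "b2"), ("a2", "a1")]

counts = field_citation_counts(field_of, citations)
print(counts[("biology", "statistics")])
print(citation_share(counts, "biology", "statistics"))
```

Comparing such shares across citing fields gives one simple view of how much each field relies on statistical articles.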

Proceedings of the Institute of Statistical Mathematics Vol.68, No.2, 265-285 (2020)

Using an Academic Literature Database to Evaluate International Interdisciplinary Fusion in IoT Research through Coauthor Analysis

Yuji Mizukami
(College of Industrial Technology, Nihon University)
Junji Nakano
(Faculty of Global Management, Chuo University/The Institute of Statistical Mathematics)

In 2011, the German Academy of Technology and the German Federal Ministry of Education and Science announced the Industry 4.0 technical framework, which aims to make all social systems more efficient, create new industries, and improve intellectual productivity. The foundational technologies of Industry 4.0 are the Internet of Things (IoT), big data, and artificial intelligence. This paper focuses on research on IoT technology that bridges the physical space and cyberspace, and analyzes each country's research promotion strategy from the standpoint of integrating different fields. In our analysis, we conducted an international comparison to examine the level of fusion among different domains of IoT research. We considered varied perspectives and approaches, and employed methods derived from a series of studies to perform principal component analysis and hierarchical clustering analysis.

Key words: IR, research ability, coauthor analysis, innovation.

Proceedings of the Institute of Statistical Mathematics Vol.68, No.2, 287-303 (2020)

Fusing Adjacent Classes in an Ordinal Logistic Model via Group Regularization

Mizuho Naganuma
(Graduate School of Informatics and Engineering, The University of Electro-Communications; Now at Macromill, Inc.)
Kohei Yoshikawa
(Graduate School of Informatics and Engineering, The University of Electro-Communications)
Shuichi Kawano
(Graduate School of Informatics and Engineering, The University of Electro-Communications)

This paper aims to fuse adjacent classes in an ordinal logistic model in light of the multi-class classification problem. Fusing the classes enables us to easily interpret the constructed model and remove irrelevant classes. Fusion of classes is performed when two adjacent classes have the same posterior probability. To this end, we developed an ordinal logistic model with group regularization for fusing adjacent classes. We established an estimation algorithm based on the alternating direction method of multipliers, and used Monte Carlo simulations and real data analysis to investigate the usefulness of our proposed method.

Key words: Adjacent-categories logit model, alternating direction method of multipliers, group lasso, ordinal categorical data.
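
To make the fusion condition in the abstract above concrete, the following is a sketch of one formulation consistent with the listed key words (adjacent-categories logit model, group lasso). The notation (K ordered classes, an intercept and coefficient vector for each adjacent pair, a tuning parameter lambda, and a log-likelihood written as ell) is assumed here for illustration, and the exact penalty used in the paper may differ.

```latex
% Adjacent-categories logit model for classes k = 1, ..., K-1 (assumed notation)
\log \frac{P(Y = k+1 \mid \boldsymbol{x})}{P(Y = k \mid \boldsymbol{x})}
  = \beta_{0k} + \boldsymbol{\beta}_k^{\top} \boldsymbol{x},
  \qquad k = 1, \dots, K-1

% Group-lasso-type penalized estimation (assumed form):
% each group collects all coefficients of one adjacent pair of classes
\min_{\{\beta_{0k},\, \boldsymbol{\beta}_k\}}
  \; -\ell\bigl(\{\beta_{0k}, \boldsymbol{\beta}_k\}\bigr)
  + \lambda \sum_{k=1}^{K-1}
    \bigl\| (\beta_{0k}, \boldsymbol{\beta}_k^{\top})^{\top} \bigr\|_2
```

Under this assumed form, when an entire group is shrunk to zero the log-odds between classes k and k+1 is zero for every x. The two classes then have equal posterior probabilities, which is the condition for fusing them stated in the abstract.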