Seminar on topic models and deep learning

9 March 2018 (Fri.) 13:30-15:30

Admission Free, No Booking Necessary

The Institute of Statistical Mathematics, Tokyo, Japan.
Seminar room 2 (3rd floor ・ D304)

"Unsupervised probabilistic topic modeling"

Hao Lei (National University of Singapore)

Probabilistic topic modeling extracts key information, the ‘topics’, from unstructured text data. An extensively studied and widely applied model is Latent Dirichlet Allocation (LDA). One of the unsolved problems in the field is determining the number of topics. The usual approach is to try different numbers, for example 10, 20, 30, etc., and compare their performance on a validation dataset.
In this project, we propose an automatic method to find the ‘optimal’ number of topics as well as the key words in each topic, so that the probability distribution is concentrated on fewer ‘significant’ words. We apply the method to financial news data comprising 644,211 articles from 2006-04-02 to 2017-04-01.
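
For readers unfamiliar with the usual approach mentioned above (trying several topic counts and comparing them on a validation set), a minimal sketch in Python might look as follows. It is not the speaker's proposed method. It assumes scikit-learn is installed and uses held-out perplexity as the validation metric, which is an assumption, since the abstract does not name a metric. The function names `lda_validation_perplexity` and `select_num_topics` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def lda_validation_perplexity(num_topics: int,
                              train_docs: List[str],
                              val_docs: List[str]) -> float:
    """Fit LDA with `num_topics` topics on train_docs; return held-out perplexity.

    Assumption: scikit-learn is available. It is imported lazily so that
    the selection logic below can be reused with any other scorer.
    """
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    vectorizer = CountVectorizer()
    x_train = vectorizer.fit_transform(train_docs)
    x_val = vectorizer.transform(val_docs)  # reuse the training vocabulary
    lda = LatentDirichletAllocation(n_components=num_topics, random_state=0)
    lda.fit(x_train)
    return lda.perplexity(x_val)  # lower is better


def select_num_topics(candidates: Iterable[int],
                      score: Callable[[int], float]) -> Tuple[int, Dict[int, float]]:
    """Return the candidate with the lowest validation score, plus all scores."""
    scores = {k: score(k) for k in candidates}
    if not scores:
        raise ValueError("candidates must not be empty")
    best_k = min(scores, key=scores.get)
    return best_k, scores
```

With lists of documents `train_docs` and `val_docs` at hand, the grid search would be called as `select_num_topics([10, 20, 30], lambda k: lda_validation_perplexity(k, train_docs, val_docs))`. Each candidate requires a full model fit, which is part of what makes an automatic method attractive.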


"Topic modeling and sentiment analysis on Japanese financial analyst reports"

Hitoshi Iwasaki (National University of Singapore)

We propose an asset pricing model using natural language processing (NLP). Although a series of works has been done on news articles and corporate disclosures, analyst reports have not been studied as much as other texts because of their limited availability. We gathered 76,384 Japanese analyst reports on the stocks listed on the Tokyo Stock Exchange in 2016 and 2017 and performed Latent Dirichlet Allocation (LDA) analysis to identify key topics that influence stock returns on the subsequent trading days. A variety of sentiment models, such as Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN), are then adopted to assign sentiments based on the key topics. We conduct an empirical test with the calculated topic sentiment scores to quantify the influence of each key topic.


"Interpretable forecasting of financial time series with deep learning"

Ilija Ilievski (National University of Singapore)

In this talk I will present our deep learning approach to forecasting financial multivariate time series that indicate the market sentiment towards a financial asset. The interpretable deep neural network reveals the essential dependence between the time series' variables, and in contrast to the widely used vector autoregressive model, the deep learning model dynamically adapts the dependence coefficients to the ever-changing market conditions. Thus, the proposed method permits the study of the inter-variable relationships, which yields a better understanding of the asset's future price movements and consequently increases the profitability of the asset's trading activities. I will conclude the talk with dependence analysis and forecasting performance for financial assets from different sectors and with vastly different market capitalisation.
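
As background for the contrast drawn above, the standard vector autoregressive model of order p uses coefficient matrices that stay fixed over time. The second line below is only an illustrative time-varying analogue: its notation is assumed here and is not taken from the talk, whose actual network architecture is not described in the abstract.

```latex
% VAR(p): constant coefficient matrices A_1, ..., A_p
y_t = c + \sum_{i=1}^{p} A_i \, y_{t-i} + \varepsilon_t

% Illustrative time-varying analogue (assumed notation, not the speaker's model)
y_t = c_t + \sum_{i=1}^{p} A_{i,t} \, y_{t-i} + \varepsilon_t
```

Here y_t is the vector of observed variables at time t, c is an intercept vector, and ε_t is a noise term. Entry (j, k) of A_i measures how strongly variable k at lag i influences variable j.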



Discussant: Ying Chen (National University of Singapore)