The 47th Statistical Machine Learning Seminar

Date & Time
August 19, 2019 (Mon) 16:00 - 17:30

Admission Free, No Booking Necessary

Place
Seminar Room 5 (D313, D314), The Institute of Statistical Mathematics
Speaker
Burcu Can
Hacettepe University (Turkey)
*The talk will be given in English
Title
Bayesian Models in Unsupervised Learning of Morphology and Syntax
Abstract
Agglutinative languages build words from sequences of morphemes. Although this morphemic structure enables productive word formation that handles both syntax and semantics when new words are generated, it also causes sparsity in the language, which is one of the most serious problems in natural language processing. One way to mitigate sparsity is morphological segmentation. One of the topics I will discuss in this talk is our recent work on unsupervised morphological segmentation using non-parametric Bayesian models. Here, we use tree-structured Dirichlet processes for morphological segmentation, where the words are placed on a forest of trees whose nodes each represent a morphological paradigm. Sparsity also causes problems in syntax. I will also discuss another model in which part-of-speech (PoS) tags and stems are learned simultaneously using a Bayesian model.