Burcu Can, Hacettepe University (Turkey) *The talk will be given in English
Title
Bayesian Models in Unsupervised Learning of Morphology and Syntax
Abstract
Agglutinative languages build words from sequences of morphemes.
Although this morphemic structure enables productive word formation
that encodes both syntax and semantics in newly generated words, the
same productivity causes data sparsity, which is one of the most
serious problems in natural language processing. One solution to
mitigate this sparsity is morphological segmentation. One topic of
this talk is our recent work on unsupervised morphological
segmentation using non-parametric Bayesian models. Here, we use
tree-structured Dirichlet processes for
morphological segmentation, where words are placed in a forest of
trees whose nodes each represent a morphological paradigm. Sparsity
poses a further problem for syntax. I will therefore also discuss
another model in which part-of-speech (PoS) tags and stems are
learned jointly using a Bayesian model.