Publications MS Theses

Interpretable Depression Detection from Social Media using Hierarchical Attention Network with Depression Indicators

Hoyun Song
MS Thesis, KAIST, 2018.
To diagnose depression, one of the most harmful mental disorders, more effectively, many researchers have turned to social media and analyzed differences in language use. However, detecting depression from social media faces problems such as the small proportion of posts that contain depression indicators and the difficulty of distinguishing depressive symptoms from temporarily depressed feelings. To address these problems, we propose hierarchical attention with depressive indicators, inspired by the process by which a person with domain knowledge diagnoses depression. Our model provides not only interpretations but also visualizations of them, using weights learned through the attention mechanism. With this model, we can investigate different aspects of posts with depressive indicators based on psychological theories, which will help researchers find useful evidence for depressive characteristics.

Mitigating Stereotypes in Word Embedding through Sentiment Modulation

Huije Lee
MS Thesis, KAIST, 2018.
Word embedding is an influential framework for quantifying the meaning of a word and is widely used at the pre-processing level of machine learning for natural language processing (NLP). However, word embeddings trained on a large number of contexts encode not only the general syntactic and semantic meaning of a word but also the stereotypes and biases that people may have. This thesis proposes a method that indirectly mitigates the stereotypes in a trained word embedding by modulating the dimension of sentimental attributes in a human entity, without imposing equal probability on the compatible social groups. To prevent the word embedding from producing problematic predictions such as a stereotype threat, we modulate the strength of the association between a human entity and a sentimental attribute and thereby indirectly reduce the gender bias of the embedding model. We show that the proposed method preserves the overall embedding performance. We also confirm experimentally that increasing the strength of the association between human entities and sentimental attributes amplifies the model bias.

Using syntactic structure to extract prominent gene regulatory network from the literature

Wonsuk Yang
MS Thesis, KAIST, 2017.

Computational Identification of Sequence Variation and Environmental Condition in Clinical Depression from Biomedical Literature

Jinseon You
MS Thesis, KAIST, 2016.
Clinical depression is a complex disease known to be influenced by various factors. Because genetic and environmental factors are frequently cited as the most influential causes of depression, many studies have tried to identify the genes or proteins and the environmental conditions associated with it. While a number of text-mining (TM) systems that identify information about genetic factors in the biomedical literature have consequently been developed, there is currently no TM system specifically targeted at extracting environmental conditions. As a result, these TM systems provide biologists with only incomplete information about depression and cannot help them discover its etiology and treatment. In this thesis, we propose a TM system that considers the interaction between genetic and environmental factors associated with depression. The system identifies not only relations between a sequence variation and depression but also changes in those relations according to environmental conditions. To develop the system, we split it into two TM subsystems. The first subsystem is applied to an existing system for extracting the relations between a sequence variation and depression from the biomedical literature, and classifies the relations as positive or negative at the document level. Using a dictionary of candidate terms for environmental conditions, the second subsystem identifies the conditions in the biomedical literature containing the binary relations, and uses sentence dependencies to exclude terms wrongly classified as conditions. The system is the first TM system to consider a ternary relation among sequence variation, disease, and condition. Through the system, we are able to provide more comprehensive information about depression than other systems.
We expect that, as the system is applied to other diseases, biologists will be able to easily identify diverse information associated with changes in the symptoms of diseases, including depression.

Synchronization of Non-Manual Signals in Sign Language with Sequence Prediction

Jung-Ho Kim
MS Thesis, KAIST, 2016.
Sign language has various types of non-manual signals, which carry important linguistic information such as feeling, semantic difference, and nuance. Investigating the nature of non-manual signals in the Bible and literature corpus, we find that several types of non-manual signals appear on a single word, which suggests that context may exist in signed utterances. This thesis experimentally examines the nature of non-manual signals and proposes a prediction model for the non-manual signal sequence, along with an advanced approach. The correlation between non-manual signals is measured using their co-occurrence rate. The results show close correlations among 'Trunk', 'Head', 'Brow to Eye-gaze' and 'Mouth'. To verify the existence of the context, we propose a prediction model using conditional random fields trained on sequences of 'gloss'-'non-manual signal' pairs, which outperforms a 'gloss'-'non-manual signal' dictionary-based approach. This result suggests that the proposed model can predict synchronized non-manual signals when it is trained with other non-manual signals, and that accuracy is expected to increase as such signals are fine-tuned. All experiments show better performance when a sequence of 'Brow to Eye-gaze' is used as training data.

Mention-Level Gene Normalization on Multi-Species and Multiple Identifiers

Joon-Yeob Kim
MS Thesis, KAIST, 2014.

Generating Chatting Messages in a Consistent Style with Authorship Attribution Methods

Sang-Chae Kim
MS Thesis, KAIST, 2013.

Fairy Tale Summarization through Sentence Selection

SeungJoo An
MS Thesis, KAIST, 2012.

Identifying Sentence Types in Korean with Morpho-Syntactic Analysis

Jin-Woo Chung
MS thesis, KAIST, 2011.

Automatic Sign Language Generation Reflecting the Relationship between Entities

SangYoon Jung
MS thesis, KAIST, 2010.

Extracting Melodies from Piano Music Based on Characteristics of Music

Yoonjae Choi
MS thesis, KAIST, 2009.

Function-focused Gene Clustering by Utilizing Granularities of Gene Functions

Tak-eun Kim
MS thesis, KAIST, 2009.

Automatic Identification of the Relation between Dependency Relations and Definitions of GO Concepts

Seung-Cheol Baek
MS thesis, KAIST, 2009.

Computational Processing of Verb Agreement for Automatic Generation of Sign Language Animation

Sangha Kim
MS thesis, KAIST, 2008.

Document Similarity Assessment with Natural Language Processing: Applications to Background Music Recommendation for Blog Articles

Doojin Park
MS thesis, KAIST, 2007.

Generation of Coherent Gene Summary

Chan-Goo Kang
MS thesis, KAIST, 2006.

Identification of Emotional Flow from Natural Language Documents

Hye-Jin Min
MS thesis, KAIST, 2005.

Automated Digital Cinematography with Natural Language Processing

Semin Jang
MS thesis, KAIST, 2004.

Automatic Translation of Korean into Korean Sign Language with Combinatory Categorial Grammar

Jiwon Choi
MS thesis, KAIST, 2004.

Applications to Molecular Interactions: Customized Visualization for Knowledge Discovery with Information Extraction

Changsu Lee
MS thesis, KAIST, 2004.
(Outstanding M.S. Thesis Award, 2004. 2.)

Anaphora Resolution for Contextually Appropriate Text Animation

Kyung Wha Hong
MS thesis, KAIST, 2004.

Integrated Morphological Analysis for Korean in a Combinatory Categorial Grammar Framework

Ho-Joon Lee
MS thesis, KAIST, 2003.

Diphone-based Intonation Generation for Korean with Combinatory Categorial Grammar

Lee Hwa Jin
MS thesis, KAIST, 2002.

Automatic Synthesis of Multimedia Tales with Combinatory Categorial Grammar

Hyun Sook Kim
MS thesis, KAIST, 2002.

Computational Processing of Honorifics in Korean with Combinatory Categorial Grammar

O Shik Kwon
MS thesis, KAIST, 2002.

Computational Processing of Floating Quantifiers in Korean with Combinatory Categorial Grammar

Jin-Bok Lee
MS thesis, KAIST, 2001.

Coordinate Constructions in Korean and Parsing Issues in Combinatory Categorial Grammar

Hyung-joon Cho
MS thesis, KAIST, 2000.