I have read “A Topic Model for Word Sense Disambiguation” by Jordan Boyd-Graber, David Blei, and Xiaojin Zhu, which appeared in EMNLP 2007. For the focus paper, one thing I feel is missing is inter-rater agreement, especially in a situation where the raters are not highly reliable. Agreement could be calculated using measures such as Fleiss’ Kappa. Another issue is that they could also address the problem of semantically similar topics (topics that have similar word distributions); as the authors suggest, this problem means that users cannot distinguish the intruder word from the intended topic words. In real semantic applications, people may wish to get rid of such duplicated topics, and could do so if the similarity between two topics were known.
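To make the first suggestion concrete, here is a minimal sketch of Fleiss' Kappa for multiple raters; the rating matrix below is hypothetical example data (5 items, 4 raters, 3 sense labels), not anything from the paper.

```python
def fleiss_kappa(ratings):
    """ratings[i][j] = number of raters who put subject i in category j.
    Assumes every subject was rated by the same number of raters."""
    n_subjects = len(ratings)
    n_raters = sum(ratings[0])
    # Per-subject agreement: fraction of rater pairs that agree, averaged.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ) / n_subjects
    # Chance agreement from the marginal category proportions.
    total = n_subjects * n_raters
    p_e = sum((sum(row[j] for row in ratings) / total) ** 2
              for j in range(len(ratings[0])))
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical data: each row counts how many of 4 raters chose each sense.
ratings = [
    [4, 0, 0],
    [0, 4, 0],
    [2, 2, 0],
    [1, 1, 2],
    [3, 1, 0],
]
print(round(fleiss_kappa(ratings), 3))  # 0.31: only fair agreement
```

A value near 1 would indicate the raters are reliable; a value like the 0.31 above would support reporting agreement alongside the intruder-detection accuracy.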
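For the second suggestion, one standard way to quantify the similarity between two topics is the Jensen-Shannon divergence between their word distributions; a sketch with made-up 3-word topic distributions:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence with log base 2, so values lie in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical topic-word distributions over a 3-word vocabulary.
topic_a = [0.5, 0.3, 0.2]
topic_b = [0.45, 0.35, 0.2]   # near-duplicate of topic_a
topic_c = [0.1, 0.1, 0.8]     # clearly different topic
print(js_divergence(topic_a, topic_b) < js_divergence(topic_a, topic_c))  # True
```

Topic pairs whose divergence falls below some threshold could then be flagged as duplicates and merged or removed.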
Since the authors suggest that task-specific evaluation measures should be preferred over perplexity, I decided to read the paper above. The Boyd-Graber 2007 paper proposes a hierarchical Bayesian model that integrates two Bayesian models: LDA and WordNet-Walk. In the graphical model view, the model basically inserts a path node into the LDA model between the topic node and the word node. The state space of this node spans all WordNet paths from the root synset to a synset that could generate the observed word. Training is done through Gibbs sampling, and during inference the lowest synset on the maximum-probability path out of the assigned topic is taken as the sense of the target word.

The evaluation is done with different numbers of topics, and the results turn out to be inferior to the state of the art. Besides the issues raised in the error analysis, I feel the model also does not address the path-length issue: even if the probabilities estimated for each edge on a path are large, if the path is much longer than the competing paths, the model will not be able to pick the correct sense confidently. Being Bayesian, the model can be integrated into a bigger model, as the authors claim; however, as with the Haghighi paper, the model itself is already complex for inference, and when combined into a more complex model, inference will become much harder and how good the approximation would be is unknown. In general, this paper provides an elegant generative model and shows that more knowledge and sense-specific features are needed in order to do better in WSD.
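The path-length concern can be illustrated numerically: if a path's score is the product of its edge probabilities, a deep synset loses to a shallow one even when every edge on the deep path is individually stronger. The edge probabilities below are made up purely for illustration, not taken from the paper.

```python
import math

# Toy edge probabilities (hypothetical): a deep sense whose every edge is
# strong versus a shallow sense with mediocre edges.
long_path  = [0.9] * 8   # 8 edges from root to a deep synset
short_path = [0.7] * 2   # 2 edges to a shallow synset

score_long  = math.prod(long_path)    # 0.9**8 ≈ 0.430
score_short = math.prod(short_path)   # 0.7**2 = 0.49
print(score_short > score_long)  # True: the shallow sense wins anyway
```

This suggests some form of path-length normalization (e.g. a per-edge geometric mean) might help the model compare senses at different depths fairly.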