CMU Advanced NLP Seminar 2011: Pre-meeting commentary for 3/3

Wednesday, March 2, 2011

Pre-meeting commentary for 3/3

Focus Paper: Tense Sense Disambiguation: a New Syntactic Polysemy Task. Roi Reichart and Ari Rappoport. EMNLP 2010
Additional Reading: Syntactic features for high precision Word Sense Disambiguation. David Martinez et al. COLING 2002

The additional reading (Martinez et al.) explores novel syntactic features for improving precision in Word Sense Disambiguation. The authors use two existing supervised machine learning methods, Decision Lists and AdaBoost. In addition to the standard n-gram type features already used by earlier WSD models, they add features recording dependencies between the target word and specific other words for a given word sense, as well as features for subcategorization frames. They then experiment with various types of thresholding in order to boost precision at the expense of recall.
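To make the three feature families concrete, here is a minimal sketch of how they might be extracted for one target word. Everything in it is an assumption made for illustration: the function name `extract_features`, the feature string formats, and the input format (a token list plus pre-parsed `(head_index, relation, dependent_index)` triples) are hypothetical and are not taken from the paper, which uses its own parser and feature inventory.

```python
def extract_features(tokens, target_idx, dependencies, window=1):
    """Return a set of feature strings for the word at target_idx.

    tokens:       list of word strings
    dependencies: list of (head_idx, relation, dependent_idx) triples
                  (hypothetical format, assumed to come from some parser)
    """
    feats = set()

    # Basic n-gram style features: neighbouring words, keyed by offset.
    for offset in range(-window, window + 1):
        pos = target_idx + offset
        if offset != 0 and 0 <= pos < len(tokens):
            feats.add(f"word[{offset:+d}]={tokens[pos]}")

    # Dependency features: the relation plus the specific word it links to.
    governed = []
    for head, rel, dep in dependencies:
        if head == target_idx:
            feats.add(f"dep:{rel}>{tokens[dep]}")
            governed.append(rel)
        elif dep == target_idx:
            feats.add(f"head:{rel}<{tokens[head]}")

    # Subcategorization-style feature: the set of relations the target
    # governs, ignoring which words fill them.
    if governed:
        feats.add("subcat=" + "+".join(sorted(governed)))
    return feats


tokens = ["she", "opened", "the", "account"]
deps = [(1, "subj", 0), (1, "obj", 3), (3, "det", 2)]
features = extract_features(tokens, 1, deps)
```

The dependency features are lexicalized (they name the specific word), while the subcategorization feature abstracts over the words, which is one plausible reading of why the two families trade off precision and coverage differently.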
Without thresholding, they found that the word-specific dependency features helped precision dramatically, but the subcategorization features led to a higher F score, with the combination of the two being slightly better than the subcategorization features alone. They further find that AdaBoost works significantly better than Decision Lists when syntactic features are taken into account, although not with the basic feature set. AdaBoost combined with syntactic features outperforms either method with only basic features.
The authors did not find an effective thresholding mechanism for AdaBoost. With Decision Lists, however, using only high-confidence features made it possible to reach 95% precision, at the cost of recall dropping to 7%.
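The precision/recall trade-off from thresholding a decision list can be sketched as follows. This is an illustrative toy, not the authors' implementation: it assumes a Yarowsky-style decision list in which each (feature, sense) rule is weighted by a smoothed log-likelihood ratio, and the names `train_decision_list`, `classify`, `precision_recall`, the smoothing constant `alpha`, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_decision_list(examples, alpha=0.1):
    """examples: list of (feature_set, sense). Returns rules sorted by weight."""
    counts = defaultdict(Counter)
    senses = set()
    for feats, sense in examples:
        senses.add(sense)
        for f in feats:
            counts[f][sense] += 1

    rules = []
    for f, by_sense in counts.items():
        total = sum(by_sense.values())
        for sense in sorted(senses):
            # Smoothed log-likelihood ratio of this sense vs. all others.
            hit = by_sense[sense] + alpha
            miss = total - by_sense[sense] + alpha
            rules.append((math.log(hit / miss), f, sense))
    rules.sort(key=lambda r: r[0], reverse=True)
    return rules


def classify(rules, feats, threshold=0.0):
    """Apply the strongest matching rule; abstain (None) below the threshold."""
    for weight, f, sense in rules:
        if weight < threshold:
            return None  # all remaining rules are weaker still
        if f in feats:
            return sense
    return None


def precision_recall(rules, test, threshold):
    """Precision over answered instances, recall over all instances."""
    answered = correct = 0
    for feats, gold in test:
        pred = classify(rules, feats, threshold)
        if pred is not None:
            answered += 1
            correct += pred == gold
    precision = correct / answered if answered else 0.0
    recall = correct / len(test) if test else 0.0
    return precision, recall


train = [({"a"}, "s1"), ({"a"}, "s1"), ({"a"}, "s1"),
         ({"b"}, "s2"), ({"a", "c"}, "s1"), ({"c"}, "s2")]
test = [({"a"}, "s1"), ({"b"}, "s2"), ({"c"}, "s1"), ({"c"}, "s2")]

rules = train_decision_list(train)
p_hi, r_hi = precision_recall(rules, test, threshold=3.0)
p_lo, r_lo = precision_recall(rules, test, threshold=0.0)
```

Raising the threshold makes the classifier abstain on every instance whose best matching rule is weak, so precision is computed over fewer but more reliable answers while recall falls. One plausible reason this is harder with AdaBoost is that its output is a sum over many weak rules rather than the weight of a single rule, although that explanation is a guess and not a claim from the paper.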
