CMU Advanced NLP Seminar 2011: Week 3 - Pre-meeting Commentary

Focus paper: An Entity-Level Approach to Information Extraction

Related paper: A Unified Model of Phrasal and Sentential Evidence for Information Extraction

Authors: Siddharth Patwardhan and Ellen Riloff

Patwardhan and Riloff present GLACIER, a probabilistic model for extracting role-filling entities from sentences. Like Haghighi and Klein, the authors look beyond local context to identify role fillers, though not as far as the focus work does: context is still limited to the sentence level. The system first applies a sentential event recognizer to determine whether a sentence discusses a relevant event, and then a plausible role-filler recognizer to extract the entities that fill roles in that event. The role-filler recognizer is a Naive Bayes classifier over contextual features generated by off-the-shelf NLP tools such as named entity recognizers, shallow parsers, and semantic dictionaries. The sentential event recognizer, implemented as either a Naive Bayes or an SVM classifier, uses similar features computed for every NP in the sentence, plus additional sentence-level features. GLACIER outperforms a baseline that relies on local context alone on test data, primarily because it (1) extracts entities whose local context is inconclusive but whose sentence-level context is clearer, and (2) reduces false positives by identifying sentences that do not describe a relevant event and skipping extraction on them. Viewed alongside the focus work, these results provide an intermediate data point that helps demonstrate how incrementally widening the context scope and increasing model sophistication can improve the performance of IE systems.
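
The two-stage design described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the two recognizers each emit a probability and that the scores are combined as a product compared against a fixed threshold. The class names, the function `extract_role_fillers`, the 0.5 threshold, and all probabilities are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NounPhrase:
    text: str
    # Hypothetical output of the plausible role-filler recognizer.
    p_filler_given_event: float


@dataclass
class Sentence:
    # Hypothetical output of the sentential event recognizer.
    p_event: float
    noun_phrases: List[NounPhrase]


def extract_role_fillers(sentence: Sentence, threshold: float = 0.5) -> List[str]:
    """Return NPs whose combined score reaches the threshold.

    Assumption: the sentence-level and phrase-level scores are multiplied.
    """
    extracted = []
    for noun_phrase in sentence.noun_phrases:
        joint = sentence.p_event * noun_phrase.p_filler_given_event
        if joint >= threshold:
            extracted.append(noun_phrase.text)
    return extracted


# A sentence that likely describes a relevant event: one NP is extracted.
event_sentence = Sentence(0.9, [NounPhrase("the mayor", 0.6),
                                NounPhrase("yesterday", 0.1)])
print(extract_role_fillers(event_sentence))

# A sentence unlikely to describe an event: even a strong NP score is suppressed.
non_event_sentence = Sentence(0.2, [NounPhrase("the mayor", 0.9)])
print(extract_role_fillers(non_event_sentence))
```

Under these assumptions, the second call shows how a low sentence-level score can suppress an extraction that phrase-level evidence alone would have accepted, which corresponds to the false-positive reduction noted in point (2).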

Wednesday, February 2, 2011

Week 3 - Pre-meeting Commentary - Michael
