CMU Advanced NLP Seminar 2011: Comments week 3

Focus paper: An Entity-Level Approach to Information Extraction
Aria Haghighi and Dan Klein, ACL 2010
Pre-meeting

In addition to the focus paper, I read:

Using Document Level Cross-Event Inference to Improve Event Extraction
Shasha Liao and Ralph Grishman, ACL 2010
http://aclweb.org/anthology-new/P/P10/P10-1081.pdf

Both papers address the template-filling problem, and both try to use more than just the local context as evidence. The focus paper presents a generative model, while the related paper adds global context features to its classifiers.

The related paper by Liao and Grishman presents an approach that uses document-level information to improve information extraction (event and role extraction). Their approach is built on several intuitions:

- First extract easier cases, then use this information to tag the harder cases
- If a word triggers a particular event, other instances of the word probably also trigger events of the same type
- There is a strong correlation between event types
- Role consistency
- Document level information can make labeling more consistent.

Their approach first applies a baseline, state-of-the-art IE system, which extracts information independently for each sentence. In the second step, they keep only the high-confidence events and roles, selected with a heuristic threshold. Two additional classifiers are then applied, whose features are built from these high-confidence events and roles. For example, one feature is a binary indicator of whether a particular event type is also present elsewhere in the document.
I think their intuitions are really nice, and their paper provides a nice analysis to motivate them. However, I think their final approach could be more sophisticated than generating binary indicator features to represent document-level information.
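
To make the second step concrete for myself, here is a minimal sketch of how such binary document-level features could be computed from first-pass output. This is my own illustration, not the authors' code: the `Extraction` record, the `document_level_features` function, the three event types, and the 0.8 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical record for one first-pass (sentence-level) event extraction.
@dataclass(frozen=True)
class Extraction:
    doc_id: str
    sentence_id: int
    trigger: str
    event_type: str
    confidence: float

# Illustrative subset of event types and a made-up confidence threshold;
# the paper chooses its threshold heuristically.
EVENT_TYPES = ("Attack", "Die", "Injure")
THRESHOLD = 0.8

def document_level_features(candidate, first_pass, threshold=THRESHOLD):
    """Return one binary indicator per event type: 1 if that type was
    extracted with high confidence in another sentence of the same document."""
    present = {
        ex.event_type
        for ex in first_pass
        if ex.doc_id == candidate.doc_id
        and ex.sentence_id != candidate.sentence_id
        and ex.confidence >= threshold
    }
    return {f"doc_has_{t}": int(t in present) for t in EVENT_TYPES}

# Toy first-pass output: two confident events and one uncertain one in d1.
first_pass = [
    Extraction("d1", 0, "attack", "Attack", 0.95),
    Extraction("d1", 1, "died", "Die", 0.90),
    Extraction("d1", 2, "hit", "Attack", 0.40),
    Extraction("d2", 0, "wounded", "Injure", 0.99),
]

# Features for the uncertain "hit" candidate, to be fed to a second classifier.
features = document_level_features(first_pass[2], first_pass)
print(features)
```

In this toy setup the uncertain "hit" candidate sees that Attack and Die events were confidently found elsewhere in the same document, which is the kind of evidence a second-pass classifier could use to tag it. The indicators carry no information about how close or how related the other events are, which is the limitation I mention above.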

I like the approach of Haghighi and Klein because they present a generative model. Furthermore, they seem to use more sophisticated distance information (such as tree distance) than Liao and Grishman do.

Wednesday, February 2, 2011
