CMU Advanced NLP Seminar 2011: Week 3 - Pre-meeting commentary

This week's focus paper presents a generative approach to the template-filling problem. The generative process has three components: a semantic component that generates entities for roles, a discourse component that generates entity indicators for each mention, and a mention-generation component. Learning is done through variational EM. A major assumption of the paper is a one-to-one mapping between roles and entities.

The related papers read this week overlapped considerably. They covered (1) coreference resolution, (2) another approach to role filling that uses event recognition, and (3) event extraction.

Matt, Michael and I read "A unified model of phrasal and sentential evidence for information extraction" by S. Patwardhan and E. Riloff. The paper proposes using the classification of whether a sentence conveys an event as one factor in deciding whether a mention in that sentence fills a role. The authors claim that classifying a sentence as an event sentence is one of the paper's significant contributions. The GLACIER model outperforms a context-only baseline on test data, primarily by (1) extracting entities whose local context is inconclusive but whose sentence-level context is clearer, and (2) reducing false positives by identifying sentences that do not describe an event and not attempting entity extraction from them.
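One way to picture how sentence-level and phrase-level evidence might be combined is as a product of two probabilities. The sketch below is only an illustration under that assumption, not the paper's implementation: the function names, the threshold, and all probability values are hypothetical.

```python
def filler_score(p_event_sentence, p_filler_given_event):
    """Joint score, assumed here to be P(event sentence) * P(filler | event sentence)."""
    return p_event_sentence * p_filler_given_event


def extract(candidates, threshold=0.5):
    """Keep phrases whose joint score clears the (hypothetical) threshold.

    Each candidate is (phrase, P(event sentence), P(filler | event sentence)).
    """
    return [
        phrase
        for phrase, p_sentence, p_filler in candidates
        if filler_score(p_sentence, p_filler) >= threshold
    ]


# Made-up numbers, for illustration only.
candidates = [
    ("the embassy", 0.90, 0.70),  # clear sentence, clear local context
    ("the meeting", 0.20, 0.70),  # sentence is unlikely to describe an event
    ("a bystander", 0.95, 0.55),  # weak local context, strong sentence context
]

extracted = extract(candidates)
```

With these numbers, "the meeting" is dropped because its sentence is unlikely to be an event sentence, while "a bystander" is kept because strong sentence-level evidence compensates for inconclusive local context.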

Daniel, Alan and Weisi read "Coreference Resolution in a Modular, Entity-Centered Model". This model differs from the focus paper in that its discourse component uses a log-linear model over multiple features rather than tree distance. Another difference is that it does not assume that a role maps to a single entity.
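As a reminder of what a log-linear model over features looks like, here is a minimal sketch that turns weighted feature sums into a probability distribution over candidate antecedents. The feature names, weights, and candidates are hypothetical and are not taken from the paper.

```python
import math


def log_linear_probs(feature_vectors, weights):
    """Return softmax probabilities over candidates.

    Each candidate's score is the dot product of its features with the weights.
    """
    scores = [
        sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for feats in feature_vectors
    ]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical weights: tree distance is just one feature among several.
weights = {"tree_distance": -0.5, "same_sentence": 1.0, "is_subject": 0.8}

# Hypothetical candidate antecedents for a single mention.
candidate_features = [
    {"tree_distance": 1, "same_sentence": 1, "is_subject": 1},
    {"tree_distance": 4, "same_sentence": 0, "is_subject": 0},
    {"tree_distance": 2, "same_sentence": 1, "is_subject": 0},
]

probs = log_linear_probs(candidate_features, weights)
```

A model based on tree distance alone would correspond to keeping only the `tree_distance` weight; the log-linear form lets other cues shift the distribution.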

Dong read "An Entity-Level Approach to Information Extraction". It applies a much broader context, the whole document, to the template-filling problem. It first identifies the easy cases and then uses that information to handle the hard cases. It assumes that mentions are likely to trigger the same event in a given context, that event types are strongly correlated, that roles are consistent across events, and that the document serves as a strong context.

Brendon read "FASTUS: A Cascaded Finite-State Transducer for Extracting Information from Natural-Language Text", which describes a fast transducer for extracting events and their attributes. It is a pipeline of pattern recognizers. The domain-independent and domain-specific operations are clearly demarcated, which makes the system easy to adapt to new domains.
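The idea of a cascade of pattern recognizers can be sketched with a toy pipeline in which each stage consumes the previous stage's output. This is an illustration only and does not reproduce FASTUS's actual stages; the stage functions, the "acquired" pattern, and the example sentence are all hypothetical.

```python
import re


def tokenize(text):
    """Domain-independent stage: split text into word tokens."""
    return re.findall(r"\w+", text)


def group_names(tokens):
    """Domain-independent stage: merge runs of capitalized tokens into NAME groups."""
    groups, current = [], []
    for tok in tokens:
        if tok[0].isupper():
            current.append(tok)
            continue
        if current:
            groups.append(("NAME", " ".join(current)))
            current = []
        groups.append(("WORD", tok))
    if current:
        groups.append(("NAME", " ".join(current)))
    return groups


def match_acquisition(groups):
    """Domain-specific stage: recognize the pattern NAME 'acquired' NAME."""
    events = []
    for i in range(len(groups) - 2):
        left, verb, right = groups[i], groups[i + 1], groups[i + 2]
        if left[0] == "NAME" and verb == ("WORD", "acquired") and right[0] == "NAME":
            events.append({"event": "acquisition", "buyer": left[1], "target": right[1]})
    return events


def run_pipeline(text, stages):
    """Feed the output of each stage into the next one."""
    data = text
    for stage in stages:
        data = stage(data)
    return data


events = run_pipeline(
    "Acme Corp acquired Widget Works yesterday",
    [tokenize, group_names, match_acquisition],
)
```

In this toy version, adapting to a new domain would mean replacing only the last stage, while the earlier stages stay unchanged.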

Thursday, February 3, 2011
