Tuesday, April 19, 2011
Pre-meeting Post from Weisi Duan
I have read the paper “Automatically Identifying the Arguments of Discourse Connectives” by Ben Wellner and James Pustejovsky, which appeared in EMNLP 2007. The paper targets the problem of identifying discourse segments and whether a relation exists between two identified segments. The authors formulate the problem as: given a connective, what is its discourse segment? (The relation seems to exist already, since the connective is given as input.) The method employs a log-linear model over features between the connective and each candidate argument (the lexical head of a discourse segment). Candidate heads are identified with the head-finding algorithm of Collins (1999). The training method is not well described: the authors claim “the correct candidate receives 100% probability mass and the wrong ones receive 0”, which sounds like Collins’ perceptron and definitely not maximum entropy. The authors conduct a feature ablation and find that dependency-parse features work better than constituent-parse features. To exploit features between the two arguments that one connective can have, the two models (one per argument of the connective) are further combined in a Collins-style perceptron, and the results turn out better than those of the individual models. Evaluation compares the predicted arguments (lexical heads) against those in the Penn Discourse TreeBank.

I like the paper, but the formulation of the problem is a little sloppy in that the mapping from argument to discourse segment is not one-to-one, which means finding the argument does not necessarily lead to the correct discourse segment. If finding the argument is their whole goal, they should probably treat the different segments as hidden variables and sum them out.
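To make the modeling setup concrete, here is a minimal sketch of a log-linear ranker over candidate argument heads for a given connective. This is my own illustration, not the authors' code: the feature names and the candidate representation are invented, and the update rule simply pushes probability mass toward the gold candidate (the “100% / 0%” target the paper describes), which is an ordinary maximum-entropy gradient step rather than whatever training procedure they actually used.

```python
import math
from collections import defaultdict

def features(connective, candidate):
    # Hypothetical string-valued features pairing a connective with a
    # candidate argument head; the names are invented for illustration.
    return [
        f"conn={connective}",
        f"head={candidate['head']}",
        f"conn+head={connective}+{candidate['head']}",
        f"path={candidate['path']}",  # e.g. a dependency path feature
    ]

class LogLinearRanker:
    """P(c | conn) = exp(w.f(conn,c)) / sum_c' exp(w.f(conn,c'))."""

    def __init__(self):
        self.w = defaultdict(float)

    def probs(self, connective, candidates):
        raw = [sum(self.w[f] for f in features(connective, c))
               for c in candidates]
        m = max(raw)                      # subtract max for stability
        exps = [math.exp(r - m) for r in raw]
        z = sum(exps)
        return [e / z for e in exps]

    def predict(self, connective, candidates):
        p = self.probs(connective, candidates)
        return max(range(len(candidates)), key=lambda i: p[i])

    def update(self, connective, candidates, gold_idx, lr=0.1):
        # Gradient step toward the target distribution that puts all
        # probability mass on the gold candidate.
        p = self.probs(connective, candidates)
        for i, c in enumerate(candidates):
            target = 1.0 if i == gold_idx else 0.0
            for f in features(connective, c):
                self.w[f] += lr * (target - p[i])
```

A second stage along the lines the paper describes would then rerank joint (Arg1, Arg2) head pairs with a perceptron, using features over both heads at once; the sketch above only covers the per-argument model.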