Thursday, January 20, 2011

Summary of the first week reading on semantic parsing

Reading the comments by different people has been a great pleasure. People have looked at the problem from different perspectives. For example, some have used unsupervised methods to attack the problem, which are cheaper but weaker in evaluation because of the lack of a gold standard.

As far as I can see, the comments on the selected papers fall into the following categories:

Lexicon induction and weight initialization: Michael read the paper "Learning for Semantic Parsing with Statistical Machine Translation" (Wong and Mooney, 2006), which uses the IBM alignment models to generate the lexicon. That paper uses the lexicon generated by IBM Model 5 directly for semantic parsing, whereas the Kwiatkowski paper uses only IBM Model 1 to initialize the feature weights. It was suggested that IBM Model 5 requires more expert knowledge of the task to work well.
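As a rough illustration of how such word-to-constant alignment probabilities could be estimated and then turned into initial weights, here is a minimal IBM Model 1-style EM sketch. The toy data and names are mine, not from either paper:

```python
from collections import defaultdict

# Toy parallel data: each pair is (sentence words, logical-form constants).
# The example data are illustrative only.
pairs = [
    (["what", "states", "border", "texas"], ["state", "next_to", "texas"]),
    (["what", "rivers", "run", "through", "texas"], ["river", "loc", "texas"]),
]

# t[(c, w)] ~ t(constant | word), initialized uniformly (unnormalized).
t = defaultdict(lambda: 1.0)

for _ in range(10):                                  # a few EM iterations
    count = defaultdict(float)
    total = defaultdict(float)
    for words, constants in pairs:
        for c in constants:
            z = sum(t[(c, w)] for w in words)        # normalizer over possible alignments
            for w in words:
                p = t[(c, w)] / z                    # expected count of aligning c to w
                count[(c, w)] += p
                total[w] += p
    for (c, w), v in count.items():                  # M-step: renormalize per word
        t[(c, w)] = v / total[w]

# The resulting probabilities could then seed initial lexical feature weights,
# e.g. log t(constant | word) for each (word, constant) pair in the lexicon.
print(sorted(t.items(), key=lambda kv: -kv[1])[:5])
```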

Unsupervised methods: Dong read the paper "Unsupervised Semantic Parsing" (Poon and Domingos, EMNLP 2009), which employs an unsupervised bottom-up method to obtain the semantic representation. The method has the advantage of using clustering to account for syntactic variations of the same meaning. However, because of its unsupervised nature, the method is difficult to evaluate: there is no gold standard, and the final evaluation, a comparison against information extraction methods, does not seem to fully reflect performance on semantic parsing itself.

Alternative models:
Matt read the paper "Wide-coverage semantic representations from a CCG parser" and compared the generality of the focused paper's model with that of the selected paper's model. He suggested that examining the failure modes and coming up with new ways to split the logical form would be a good way to improve performance (a toy illustration of such a split appears after this list).
Daniel read the paper "A Generative Model for Parsing Natural Language to Meaning Representations" (Lu et al., 2008), which jointly models the sentence and its meaning representation, similar to the CCG approach. The drawback of the model is that it generates the trees top-down and thus uses less context, and therefore needs re-ranking at the end. Daniel also compared the training methods of the focused paper's model and the selected paper's model: the former alternates between refining the lexicon and re-parsing the training data, while the latter uses EM.
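To make the splitting idea concrete, here is a toy illustration of factoring one logical form into a function part and an argument part whose composition recovers the original meaning. The focused paper does this with higher-order unification over lambda-calculus terms; the Python predicates and the toy world below are only my illustrative stand-ins:

```python
# Whole logical form for "states that border Texas", written as a Python
# predicate over a toy world (the data are illustrative only).
STATES = {"texas", "oklahoma", "kansas"}
BORDERS = {("oklahoma", "texas"), ("kansas", "oklahoma")}

whole = lambda x: x in STATES and (x, "texas") in BORDERS

# One possible split: a function part and an argument part whose
# composition gives the original meaning back.
func = lambda p: (lambda x: p(x) and (x, "texas") in BORDERS)  # λp.λx. p(x) ∧ borders(x, texas)
arg = lambda x: x in STATES                                    # λx. state(x)

recombined = func(arg)
assert all(whole(e) == recombined(e) for e in STATES | {"nevada"})
print([e for e in sorted(STATES) if recombined(e)])  # ['oklahoma']
```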

Linguistics:
Brendan read the review "Constraint-based approaches to grammar: alternatives to transformational syntax" and gave a detailed description of the mechanism of the CCG formalism through a few examples. Because there are several variants of the CCG formalism, he suggested that the one used in the focused paper might differ from the one introduced in the review.
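To make the mechanism concrete, below is a tiny sketch of forward and backward application, the two most basic CCG combinators, paired with lambda-calculus semantics. The lexical entries and the string encoding of categories are invented for illustration:

```python
# Minimal CCG categories with the two application combinators; the
# semantics are Python lambdas. The lexical entries are invented examples.

def strip_parens(cat):
    return cat[1:-1] if cat.startswith("(") and cat.endswith(")") else cat

def forward_apply(left, right):
    """X/Y applied to Y gives X (the left category consumes the right one)."""
    (lcat, lsem), (rcat, rsem) = left, right
    if lcat.endswith("/" + rcat):
        return (strip_parens(lcat[:-(len(rcat) + 1)]), lsem(rsem))

def backward_apply(left, right):
    """Y followed by X\\Y gives X (the right category consumes the left one)."""
    (lcat, lsem), (rcat, rsem) = left, right
    if rcat.endswith("\\" + lcat):
        return (strip_parens(rcat[:-(len(lcat) + 1)]), rsem(lsem))

# Lexicon: word -> (syntactic category, semantics)
lexicon = {
    "Texas":    ("NP", "texas"),
    "Oklahoma": ("NP", "oklahoma"),
    "borders":  ("(S\\NP)/NP", lambda y: (lambda x: ("borders", x, y))),
}

# Derivation for "Texas borders Oklahoma":
vp = forward_apply(lexicon["borders"], lexicon["Oklahoma"])  # (S\NP, λx.borders(x, oklahoma))
s = backward_apply(lexicon["Texas"], vp)                     # (S, borders(texas, oklahoma))
print(s)  # ('S', ('borders', 'texas', 'oklahoma'))
```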

Similar problems:
Alan read the paper "Learning Context-Dependent Mappings from Sentences to Logical Form". He suggested that, compared to the selected paper, the focused paper looks at a more general problem, mapping sentences to logical forms across multiple languages. He also noted that "The 2009 paper presents both a simple method of contextual analysis along with a bare-bones linear model to produce roughly 80% accuracy. Looking at the results, we also see that even examining just the most recent statement before a sentence, almost doubles the accuracy."
I have also read the paper "Learning Context-Dependent Mappings from Sentences to Logical Form". I feel the contribution of the paper is that it uses the hidden derivation to represent the output and manages to learn a model over the derivation and the input sentence. One shortcoming, I feel, is that the method relies heavily on heuristically generated rules derived from the ATIS data set, which might hinder its generalization to other domains.

For the focused paper, people have compared the focused model with the selected models. In general, the focused model generalizes to different representations of semantics, and the approaches used in its individual components could still be improved, e.g., the logical-form splitting method.
