Collective journal for participants in the Advanced Natural Language Processing Seminar at the Language Technologies Institute, Carnegie Mellon University, in Spring 2011.
Thursday, January 27, 2011
Metaphor Comics
The paper I mentioned on metaphors in sign language will be the subject of discussion for Monday's Linguistics Reading Group (which meets at 2:30 in GHC 7501). Anyone who's interested is welcome to participate.
- Nathan
Reading for 2/3/11: Haghighi and Klein, 2010
Request (new!): When you post to the blog, please include:
- Your name (plus "leader") if you are leading the discussion
- Which focus paper this post relates to
- Whether this is the pre-meeting review or the post-meeting summary
- Leave a comment on this post (non-anonymously) giving the details of the related paper you will read (include a URL), by Monday, January 31.
- Post your commentary (a paragraph) as a new blog post, by Wednesday, February 2.
Summary of week 2 commentary
Dani, Alan, Dhananjay and I read papers on metaphor detection.
Dani's paper, Metaphor Identification Using Verb and Noun Clustering, combines a small amount of seed knowledge in the form of source-target domain mappings with word clustering in order to generalize those mappings. The clustering is done using parse information and a spectral clustering algorithm. To evaluate, they sampled randomly from the output of their system, and had human annotators judge the sampled sentences, obtaining a precision of .79.
Alan's paper, Comparing Semantic Role Labeling with Typed Dependency Parsing in Computational Metaphor Identification, focused on a slightly different task: finding patterns in text that commonly indicate the use of metaphor. The paper found that semantic role labels are slightly more useful than typed dependency arcs for extracting semantic relations from text, but overall it spent more time discussing the problem than its solution.
Dhananjay read the paper Catching Metaphors, which used a maximum entropy classifier to detect the metaphorical usage of verbs. The features used in the model were the prior belief of each verb being used metaphorically and the types of the verb's arguments. The paper used WSJ data that the authors annotated themselves. A high accuracy of 96.98% is reported, although this is weakened by the fact that over 90% of the annotated verbs were marked as metaphorical.
I read the paper Hunting Elusive Metaphors Using Lexical Resources. This paper looked at a wider range of phenomena than other metaphor detection papers (nouns, verbs, and adjectives), but used relatively simplistic techniques. In order to discover IsA metaphors, they simply checked whether the first noun is a hyponym of the second, using WordNet. For verb and adjective metaphors, they used a method based on computing how frequently a noun's hyponyms occur as arguments of the predicate. Their evaluation was not well explained, and their results were unimpressive.
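The IsA check is simple enough to sketch; here is a toy reconstruction (my guess at the logic, not the authors' code) using NLTK's WordNet interface:

from nltk.corpus import wordnet as wn

def isa_is_literal(noun1, noun2):
    """True if some sense of noun1 lies below some sense of noun2 in WordNet."""
    targets = set(wn.synsets(noun2, pos=wn.NOUN))
    for syn in wn.synsets(noun1, pos=wn.NOUN):
        ancestors = set(syn.closure(lambda s: s.hypernyms()))  # walk the whole hypernym chain
        if targets & ancestors:
            return True
    return False

print(isa_is_literal("dog", "animal"))    # True  -> "a dog is an animal" is literal
print(isa_is_literal("lawyer", "shark"))  # False -> "that lawyer is a shark" is flagged as metaphor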
Next, Matt, Michael, Dong, and Weisi read papers that looked at metaphor interpretation.
Matt and Michael read the paper A Fluid Knowledge Representation for Understanding and Generating Creative Metaphors (Veale and Hao, 2008), which attempts both metaphor interpretation and generation; see Matt's commentary below.
Dong and Weisi's paper, Automatic Metaphor Interpretation as a Paraphrasing Task, addressed the problem of finding literal paraphrases of metaphorical verbs. They use a variety of methods to obtain and filter a list of possible paraphrases, using WordNet similarity, likelihood given the context, and a selectional preference measure. They hand-annotated a set of sentences with a ranked list of possible verb paraphrases, and evaluated their system on first-choice accuracy and mean reciprocal rank, getting an accuracy of .81. Weisi compares this task to word sense disambiguation, noting that it is much easier, since the problem of detecting metaphor is already taken care of.
Overall, the topic of metaphor in NLP seems to suffer from a lack of a good definition, and no standardization of evaluation. None of the papers present results that are comparable to any of the others, so it is hard to say conclusively what sorts of techniques are preferable.
Wednesday, January 26, 2011
Comments for week 2
T. Veale and Y. Hao. 2008. A fluid knowledge representation for understanding and generating creative metaphors. In Proceedings of COLING 2008, pages 945–952, Manchester, UK.
The goal of this paper is metaphor interpretation and generation (ambitious!). Michael gives a good overview of their approach, which I would break down into three steps:
1. Extract facts
2. Link facts
3. Use knowledge representation to interpret metaphors
#1 seems straightforward and they accomplish it using WordNet and the web. They also have empirical results demonstrating the quality of their facts.
#2 seems more difficult, and they give a small amount of detail about how they identify closely related facts using semantic relations in WordNet. #3 they mostly explain by examples linking together two seemingly unrelated nouns like "Pope" and "Don (Crime Father)". It seems to me that the algorithm can be thought of as constructing a graph using some heuristic rules and then finding a path (any path?) through the graph from A to B. It's unclear to me how this could be used to interpret a new metaphor, and the authors don't seem to address this directly. This also seems to contrast with the earlier cited work, which is also called a slipnet and is referred to as a "probabilistic network". As far as I can tell, there is nothing stochastic about their approach. I briefly endeavored to read the earlier work (Hofstadter, 1994) but gave up because it is 80 pages.
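To make my reading concrete, here is a toy sketch of the graph-and-path picture (entirely my guess at the shape of the algorithm; the facts are made up):

from collections import deque

def build_graph(facts):
    """facts: (term1, relation, term2) triples, e.g. harvested from WordNet or the web."""
    graph = {}
    for a, rel, b in facts:
        graph.setdefault(a, []).append((b, rel))
        graph.setdefault(b, []).append((a, rel))
    return graph

def connect(graph, source, target):
    """Breadth-first search for any path linking the two concepts."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for neighbor, _ in graph.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

facts = [("Pope", "leads", "church"), ("Don", "leads", "crime family"),
         ("church", "is-a", "organization"), ("crime family", "is-a", "organization")]
print(connect(build_graph(facts), "Pope", "Don"))
# ['Pope', 'church', 'organization', 'crime family', 'Don']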
They also don't have any way of measuring the quality of their interpretations which, in fairness, seems like a difficult task.
Matt
Related Paper - Jan 27
E. Shutova, L. Sun and A. Korhonen. 2010. Metaphor Identification Using Verb and Noun Clustering. In Proceedings of COLING 2010, Beijing, China.
http://www.cl.cam.ac.uk/~es407/papers/Coling10.pdf
The paper describes a word clustering approach to metaphor identification. Their decision to use word clustering is based on the hypothesis that target concepts associated with a source concept appear in similar lexico-syntactic environments, and that clustering will capture this relatedness by association. The method starts with a small set of seed source-target domain mappings, extracts rich features from a shallow parser, and uses a spectral method to perform noun and verb clustering. The resulting noun clusters are treated as target concepts associated with the same source domain, and the resulting verb clusters as the source-domain lexicon.
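The clustering step might look roughly like this toy sketch (this is not the authors' pipeline; the nouns, the feature matrix, and the column meanings are all made up for illustration):

import numpy as np
from sklearn.cluster import SpectralClustering

nouns = ["excitement", "enthusiasm", "anger", "doubt", "comment", "remark"]
# Hypothetical feature matrix: rows = nouns, columns = counts of parser relations
# such as "direct object of stir/spark/hurl/cast" from a large corpus.
X = np.array([
    [5, 3, 0, 0],
    [4, 4, 0, 0],
    [3, 2, 0, 1],
    [0, 0, 2, 4],
    [0, 0, 5, 2],
    [0, 1, 4, 3],
], dtype=float)

labels = SpectralClustering(n_clusters=2, affinity="rbf", random_state=0).fit_predict(X)
for noun, label in zip(nouns, labels):
    print(label, noun)
# Nouns that fall in the same cluster as a seed target (e.g. "excitement" from
# "stir excitement") inherit that seed's source-target mapping.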
As for the results, they were able to get some nice metaphors that represent broad semantic classes, such as {swallow anger, hurl comment, spark enthusiasm, etc.}, from the seeds {stir excitement, throw remark, cast doubt}, which the WordNet-based baseline cannot acquire. They evaluated the methods using precision, and got 0.79 (baseline 0.44). I don't find these numbers convincing, though, since they randomly sampled sentences annotated by the systems and asked five human annotators to judge them, but they did not report the size of the sample (or maybe I missed it?). Also, though there is no large annotated corpus for metaphor identification, it would be nice if they had reported recall on smaller data just to get an idea of the coverage of the method.
Readings for Jan. 27
The additional paper I read attempted to use semantic role labeling in place of typed dependency parsing to improve computational metaphor identification (CMI), an approach that aims to identify patterns that indicate metaphors rather than to identify each individual metaphor. I suppose you could call it a higher-order version of metaphor detection. Unfortunately, the paper doesn't have much to report in terms of results. Much of the paper was spent asking questions and giving background rather than actually talking about the significance of the research. It turns out that semantic role labeling proves slightly more effective at extracting relationships that have more semantic importance, but it's a double-edged sword in that the granularity may be too fine to be of effective use as input to other systems.
Commentary for Jan 27th
Related paper for Jan 27
http://www.aclweb.org/anthology-new/W/W06/W06-3506.pdf
It tries to classify whether a particular verb occurrence is literal or metaphorical using a maxent classifier; the paper restricts the problem to verbs. It works on a corpus annotated with PropBank. The feature set is the bias of the verb (the proportion of metaphorical to literal occurrences) and the types of the verb's arguments. Based on cross-validation, the authors claim an accuracy of 96.98%.
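As a toy illustration of that kind of model (not the paper's code; the feature names and tiny training set are invented), a binary maxent classifier is just logistic regression over those features:

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each instance: the verb's prior "metaphor bias" plus coarse argument types.
train = [
    ({"verb_bias": 0.9, "arg0=PERSON": 1, "arg1=ABSTRACT": 1}, "metaphorical"),
    ({"verb_bias": 0.9, "arg0=PERSON": 1, "arg1=PHYSICAL": 1}, "literal"),
    ({"verb_bias": 0.2, "arg0=PERSON": 1, "arg1=PHYSICAL": 1}, "literal"),
    ({"verb_bias": 0.8, "arg0=ORG": 1, "arg1=ABSTRACT": 1}, "metaphorical"),
]
X, y = zip(*train)

model = make_pipeline(DictVectorizer(), LogisticRegression())
model.fit(X, y)
print(model.predict([{"verb_bias": 0.85, "arg0=PERSON": 1, "arg1=ABSTRACT": 1}]))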
Comments on Davies and Russell
It was completely different from the focus paper. The focus paper was about NLP work on metaphors, where metaphor was very poorly defined, often just as "figurative language with unusual argument types" or so. In fact, when the paper quoted Nunberg 1987, I think that was an awfully good takedown of the entire premise of the article -- aren't metaphors just another word sense? What's interesting about metaphors is that their **semantics** is derived from, or somehow implicated by, the semantics of the other non-metaphorical word senses. The metaphor recognition task of finding selectional restriction violations seems kind of contrived. How is it useful or meaningful to claim "cold" as in "cold person" is metaphorical? Maybe the other theoretical work cited has more details (like Lakoff or Gentner), but it wasn't explained.
The Davies and Russell paper focuses on defining analogical reasoning and giving it a normative account. It's from KR&R AI, no language involved. It says that analogical reasoning often takes the form of
inferring a conclusion property Q holds of target object T
because T is similar to source object S by sharing properties P
[[Note: I think "source" and "target" are standard terms in the literature. Maybe Lakoff introduced them? Lakoff predates this work and is cited.]]
P(S) ^ Q(S)
P(T)
-------
Q(T)
The paper points out there are analogical reasoning systems that use heuristic similarity of S and T to justify Q(S) => Q(T).
They work out a "determination rule" among predicates that I interpret as saying the properties P and Q are either correlated or inverse-correlated, but not unrelated. (Actually a deterministic correlation):
(∀x P(x) => Q(x)) v (∀x P(x) => ~Q(x))
The important property this has is non-redundancy. If you just said (P(x) => Q(x)) as background knowledge, that's not analogical reasoning, because you get the target conclusion without having to use information about the source object. Instead, you say that P determines whether or not Q is true, but don't take a stance whether it's a positive or negative implication. You then apply information about the source to derive the implication for the target.
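Spelled out in the same notation, the way I read the inference (my own reconstruction, not copied from the paper):

Det(P,Q):    (∀x P(x) => Q(x)) v (∀x P(x) => ~Q(x))
P(S), Q(S):  the second disjunct would give ~Q(S), contradicting Q(S),
             so the first disjunct must hold: ∀x P(x) => Q(x)
P(T):        therefore Q(T)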
[[They cite different work by Davies that relates this to statistical correlation and regression]]
Properties 'P' have to do with relevance, so you don't make inferences based on similarity from spurious properties. They contrast this with methods based on heuristic similarity between S and T.
This is mostly the first half of the paper. I got confused when they made it more general; the determination rule is actually a second-order thing, Det(P,Q). They talk a little about an implementation within a logic programming system. The examples weren't very convincing of its usefulness.
Anyways, this seems like a reasonable starting point to me for interpretation of metaphor. Naively, I might think that the semantic implications of a metaphorical statement (with its target sense T) can be inferred by analogical reasoning from the source (non-metaphorical) sense S. Actually this seems kind of definitional for what a metaphor is. (Oh: what IS a metaphor, anyway? Why doesn't the focus paper tell us?? The Wilks definition is crap.)
But there's lots of hoops to jump through before getting to interpretation. It would probably be useful to read less formal background theory like Gentner or something to understand the problem better.
Comments week 2
E. Shutova. 2010. Automatic Metaphor Interpretation as a Paraphrasing Task. In Proceedings of NAACL 2010, Los Angeles, USA.
http://www.cl.cam.ac.uk/~es407/papers/NAACL.pdf
This paper frames metaphor interpretation as a paraphrasing task. Given a metaphorical expression, the system returns a ranked list of literal substitutions. The author only focuses on single-word metaphors expressed by a verb. First, paraphrases are ranked according to their likelihood in the context. Unrelated substitutions are then removed by keeping only terms that are a hypernym of the metaphor, or share a hypernym with it, according to WordNet. A selectional preference measure is then used to filter out metaphors and rerank the paraphrases (how they did the reranking wasn't very clear to me). They showed in their evaluation that the last step increased performance a lot. I liked their approach, because the steps they performed are intuitive and relatively simple.
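Schematically, the pipeline reads something like the sketch below (my paraphrase of the paper, not Shutova's code; the context model and selectional-preference scorer are left as stubs to be filled in):

from nltk.corpus import wordnet as wn

def wordnet_related(metaphor_verb, candidate):
    """Keep a candidate only if it is a hypernym of the metaphorical verb or shares one with it."""
    syns_m = wn.synsets(metaphor_verb, pos=wn.VERB)
    syns_c = wn.synsets(candidate, pos=wn.VERB)
    up_m = {h for s in syns_m for h in s.closure(lambda x: x.hypernyms())} | set(syns_m)
    up_c = {h for s in syns_c for h in s.closure(lambda x: x.hypernyms())} | set(syns_c)
    return bool(up_m & up_c)

def interpret(metaphor_verb, candidates, context_score, selpref_score):
    # 1. rank all candidate verbs by their likelihood in the sentence context
    ranked = sorted(candidates, key=context_score, reverse=True)
    # 2. discard candidates unrelated to the metaphorical verb in WordNet
    related = [v for v in ranked if wordnet_related(metaphor_verb, v)]
    # 3. rerank by how strongly each verb selects for the observed arguments
    return sorted(related, key=selpref_score, reverse=True)

# e.g. for "stir excitement": interpret("stir", ["provoke", "mix", "swallow"], ...)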
They evaluate their system in two different ways: by looking at the accuracy of the paraphrase ranked first, and at the MRR with a cutoff at rank 5. I think their performance was pretty good, with an accuracy of 0.81 when only looking at the first returned paraphrase. However, there are often multiple suitable substitutions for a metaphor (their annotators also had to list all suitable literal paraphrases they could come up with for the particular verb). It would have been interesting to look not only at which rank the first correct paraphrase occurred, but also, for example, at how many of the paraphrases in the top x were actually correct (precision instead of MRR).
Related Paper for Jan 27
Friday, January 21, 2011
Comment for readings of the second week
Thursday, January 20, 2011
CCG parsers
(2) Curran and Clark's industrial-strength parser that Noah was talking about. Link.
Here's the parser in action... They have a web demo!
I bought a house on Thursday that was red .
Command-line is fun. I bolded the line with the type-annotated parse. They also have a separate tool that outputs semantic structures based on this.
~/sw/nlp/candc/candc-1.00 % echo "I bought a house on Thursday that was red ." | bin/pos --model models/pos | bin/parser --parser models/parser --super models/super
tagging total: 0.01s usr: 0.00s sys: 0.00s
total total: 2.53s usr: 2.45s sys: 0.09s
# this file was generated by the following command(s):
# bin/parser --parser models/parser --super models/super
1 parsed at B=0.075, K=20
1 coverage 100%
(det house_3 a_2)
(dobj on_4 Thursday_5)
(ncmod _ house_3 on_4)
(xcomp _ was_7 red_8)
(ncsubj was_7 that_6 _)
(cmod that_6 house_3 was_7)
(dobj bought_1 house_3)
(ncsubj bought_1 I_0 _)
I|PRP|NP bought|VBD|(S[dcl]\NP)/NP a|DT|NP[nb]/N house|NN|N on|IN|(NP\NP)/NP Thursday|NNP|N that|WDT|(NP\NP)/(S[dcl]\NP) was|VBD|(S[dcl]\NP)/(S[adj]\NP) red|JJ|S[adj]\NP .|.|.
1 stats 5.8693 232 269
use super = 1
beta levels = 0.075 0.03 0.01 0.005 0.001
dict cutoffs = 20 20 20 20 150
start level = 0
nwords = 10
nsentences = 1
nexceptions = 0
nfailures = 0
run out of levels = 0
nospan = 0
explode = 0
backtrack on levels = 0
nospan/explode = 0
explode/nospan = 0
nsuccess 0 0.075 1 <--
nsuccess 1 0.03 0
nsuccess 2 0.01 0
nsuccess 3 0.005 0
nsuccess 4 0.001 0
total parsing time = 0.008075 seconds
sentence speed = 123.839 sentences/second
word speed = 1238.39 words/second
Reading for 1/27/11: Shutova, 2010
Author: Ekaterina Shutova
Venue: ACL 2010
Leader: Daniel
Reminders:
- Leave a comment on this post (non-anonymously) giving the details of the related paper you will read (include a URL), by Monday, January 24.
- Post your commentary (a paragraph) as a new blog post, by Wednesday, January 26.
Commentary for January 20
Summary of the first week reading on semantic parsing
As far as I could see, the comments on the selected papers fall into the following categories:
Lexicon induction and weight initialization: Michael read the paper "Learning for Semantic Parsing with Statistical Machine Translation" (Wong and Mooney, 2006), which utilizes the IBM alignment models to generate the lexicon. The paper uses the lexicon generated by IBM Model 5 directly for semantic parsing, while the Kwiatkowski paper uses only IBM Model 1 to initialize the feature weights. It is suggested that IBM Model 5 requires more expert knowledge of the task to make it work well.
Unsupervised methods: Dong read the paper "Unsupervised Semantic Parsing" (Hoifung Poon and Pedro Domingos, EMNLP 2009), which employs an unsupervised bottom-up method to obtain the semantic representation. The method has the advantage of using clustering to account for syntactic variations of the same meaning. However, because of its unsupervised nature, the method is difficult to evaluate, owing to the lack of a gold standard. The final evaluation, a comparison with information extraction methods, does not seem to fully reflect performance on semantic parsing itself.
Alternative models:
Matt read the paper "Wide-coverage semantic representations from a CCG parser" and compared the generality of the focus paper's model with that of the selected paper's model. He suggested that examining the failure modes and coming up with new ways to split the logical form would be a good way to improve performance.
Daniel read the paper "A Generative Model for Parsing Natural Language to Meaning Representations" (Lu et al., 2008), which jointly models the sentence and its semantics, similar to CCG. The drawback of the model is that it generates the trees top-down and thus uses less context, and therefore needs re-ranking at the end. Daniel also compared the training methods used in the focus paper's model and the selected paper's model: the former alternates between refining the lexicon and re-parsing the training data, while the latter uses EM.
Linguistics:
Brendan read the review "Constraint-based approaches to grammar: alternatives to transformational syntax" and gave a detailed description of the mechanics of the CCG formalism through a few examples. Because there are different variations of the CCG formalism, he suggested that the one used in the focus paper might be different from the one introduced in the review.
Similar problems:
Alan read the paper "Learning Context-Dependent Mappings from Sentences to Logical Form". He suggested that the focus paper looks at a more general problem, mapping to logical forms across multiple languages, compared to the selected paper. He also suggested that "The 2009 paper presents both a simple method of contextual analysis along with a bare-bones linear model to produce roughly 80% accuracy. Looking at the results, we also see that even examining just the most recent statement before a sentence, almost doubles the accuracy."
I have also read the paper "Learning Context-Dependent Mappings from Sentences to Logical Form". I feel the contribution of the paper is that it utilizes the hidden derivation to represent the output, and manages to learn a model over it and the input sentence. One shortcoming, I feel, is that the method relies heavily on heuristically generated rules obtained from the ATIS data set, which might hinder its generalization to other domains.
For the focus paper, people compared the focus model with their selected models. In general, the focus model generalizes to different representations of semantics, and the approaches used in different components of the model could be improved, e.g. the logical-form splitting method.
Wednesday, January 19, 2011
Summary of CCG review (Week 1)
Marcel proved completeness
Lexicon:
Marcel: NP
proved: (S\NP)/NP
completeness: NP
A/B means "combine with right, of type B, yielding A." So proved and completeness join together to yield
[proved completeness]: S\NP
A\B means "combine with left, of type B, yielding A." So this joins with Marcel on the left to get
[Marcel [proved completeness]]: S
You can also associate lambda calculus expressions with these expressions (I don't know their names), to get the logical forms that we see in the focus paper. There are quite a few more details for how these composition operations work, of course, but there's a core set of principles that seem reasonable.
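To make the composition concrete, here is a toy sketch (my own illustration, not from the review or the focus paper) of forward and backward application with lambda-calculus semantics attached; the category splitting is only robust enough for this one example:

def strip_parens(cat):
    return cat[1:-1] if cat.startswith("(") and cat.endswith(")") else cat

def forward_apply(left, right):
    """X/Y applied to a Y on its right yields an X."""
    cat, sem = left
    arg_cat, arg_sem = right
    result, _, expected = cat.rpartition("/")
    assert expected == arg_cat, "category mismatch"
    return (strip_parens(result), sem(arg_sem))

def backward_apply(left, right):
    """A Y followed by X\\Y yields an X."""
    arg_cat, arg_sem = left
    cat, sem = right
    result, _, expected = cat.rpartition("\\")
    assert expected == arg_cat, "category mismatch"
    return (strip_parens(result), sem(arg_sem))

# Toy lexicon for "Marcel proved completeness"; the lambdas supply the semantics.
marcel = ("NP", "marcel")
completeness = ("NP", "completeness")
proved = ("(S\\NP)/NP", lambda obj: lambda subj: "proved(%s,%s)" % (subj, obj))

vp = forward_apply(proved, completeness)  # ('S\\NP', <function>)
print(backward_apply(marcel, vp))         # ('S', 'proved(marcel,completeness)')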
The review is fairly long (60 pages) and goes over a variety of linguistic phenomena that CCG can handle, and talks a lot about how it relates to other syntactic theories. According to the review, it is the best thing since sliced bread.
I was hoping this would help me understand Section 4 of the focus paper but I'm still confused. I get the impression they only use part of CCG as it's described by Steedman and Baldridge, and then they do something weird too.
Comments for week 1
[1] Bos, J., Clark, S., Steedman, M., Curran, J. R., & Hockenmaier, J. (2004). Wide-coverage semantic representations from a CCG parser. In Proceedings of the International Conference on Computational Linguistics
Commentary for Jan 20th
Commentary for week of Jan 20
Commentary for Jan. 19th, 2011
Dong's comments for week 1
For this week's reading I read the required paper and the following related paper:
Unsupervised Semantic Parsing, Hoifung Poon, Pedro Domingos, EMNLP 2009
http://aclweb.org/anthology/D/D09/D09-1001.pdf
Both have as goal mapping sentences to logical form, but their approaches and settings are very different. I will highlight some key differences between the related paper and the required paper.
Setting: Supervised versus unsupervised. Kwiatkowski et al. use as training data sentences with corresponding logical representations. Poon et al.’s approach is unsupervised.
Approach: Kwiatkowski et al. use a top-down approach. They start with logical forms that map sentences completely. These forms are then iteratively refined with a restricted higher-order unification procedure. Poon et al. use a bottom-up approach. They start with lambda-form clusters at the atom level and then recursively build up larger clusters using two operations (merge and compose).
What they learn: Kwiatkowski et al. learn a CCG grammar (thus both syntax and semantics). Poon et al. focus only on semantics, and use an existing parser (the Stanford parser) for the syntax.
The nice thing about Kwiatkowski et al.'s approach is that it is more general than previous work (it can handle different languages and meaning representations), while still performing comparably to less general approaches.
What I liked about Poon’s approach is the idea of clustering to account for syntactic variations of the same meaning. However, Poon et al.'s work was more difficult to evaluate, because no gold standard was available. They therefore performed a task-based evaluation (question answering) and they compared their approach with information extraction systems. Because of their evaluation setup, their performance on semantic parsing was less clear to me.
Monday, January 17, 2011
Comments for the required paper and additional paper
For the additional paper, I read the paper:
Luke S. Zettlemoyer, Michael Collins. Learning Context-dependent Mappings from Sentences to Logical Form. In Proceedings of the Joint Conference of the Association for Computational Linguistics and International Joint Conference on Natural Language Processing (ACL-IJCNLP), 2009.
The idea is to map context-dependent sentences to logical forms. The proposed method starts with an incomplete logical form for the target sentence as the first step; this incomplete logical form is then combined with previous logical forms to obtain the final logical form as the second step. For the second step, the authors propose a model whose solution space consists of derivations, which are different configurations for deriving a final logical form. Inference is done by searching for the best derivation with beam search, based on the score of the derivation. Learning is done with Collins' perceptron, which tunes the weight vector corresponding to the feature vector of the derivation. I feel the most interesting idea is that they use derivations as the output and search for the best derivation, which in turn generates the final logical form.
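As a rough sketch of how I picture the learning loop (not the authors' code; the data layout is invented, exhaustive search stands in for beam search, and the "gold" derivation stands in for the best derivation that produces the correct logical form, since derivations are hidden):

from collections import defaultdict

def score(weights, features):
    return sum(weights[f] * v for f, v in features.items())

def best_derivation(weights, derivations):
    # Stand-in for beam search over the space of candidate derivations.
    return max(derivations, key=lambda d: score(weights, d["features"]))

def perceptron_train(data, epochs=5):
    weights = defaultdict(float)
    for _ in range(epochs):
        for example in data:
            pred = best_derivation(weights, example["derivations"])
            gold = example["gold_derivation"]
            if pred["logical_form"] != gold["logical_form"]:
                # Standard perceptron update: reward gold features, penalize predicted ones.
                for f, v in gold["features"].items():
                    weights[f] += v
                for f, v in pred["features"].items():
                    weights[f] -= v
    return weights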
Thursday, January 13, 2011
Reading for 1/20/11: Kwiatkowski et al., 2010
Inducing Probabilistic CCG Grammars from Logical Form with Higher-order Unification
Authors: Tom Kwiatkowski, Luke Zettlemoyer, Sharon Goldwater, and Mark Steedman
Venue: EMNLP 2010
Leader: Weisi
Reminders:
- Leave a comment on this post (non-anonymously) giving the details of the related paper you will read (include a URL), by Monday, January 17.
- Post your commentary (a paragraph) as a new blog post, by Wednesday, January 19.
- If you haven't received a message from the 11-713 mailing list listing the schedule for leading discussions, let Tae know.