Thursday, January 20, 2011

CCG parsers

(1) NLTK has a tiny little CCG parser. Link

(2) Curran and Clark's industrial-strength parser that Noah was talking about. Link.

Here's the parser in action... They have a web demo!

I bought a house on Thursday that was red .

The command line is fun. I bolded the line with the type-annotated parse (the one beginning with I|PRP|NP). They also have a separate tool that outputs semantic structures based on this.


~/sw/nlp/candc/candc-1.00 % echo "I bought a house on Thursday that was red ." | bin/pos --model models/pos | bin/parser --parser models/parser --super models/super
tagging total: 0.01s usr: 0.00s sys: 0.00s
total total: 2.53s usr: 2.45s sys: 0.09s
# this file was generated by the following command(s):
# bin/parser --parser models/parser --super models/super

# this file was generated by the following command(s):
# bin/parser --parser models/parser --super models/super

1 parsed at B=0.075, K=20
1 coverage 100%
(det house_3 a_2)
(dobj on_4 Thursday_5)
(ncmod _ house_3 on_4)
(xcomp _ was_7 red_8)
(ncsubj was_7 that_6 _)
(cmod that_6 house_3 was_7)
(dobj bought_1 house_3)
(ncsubj bought_1 I_0 _)
I|PRP|NP bought|VBD|(S[dcl]\NP)/NP a|DT|NP[nb]/N house|NN|N on|IN|(NP\NP)/NP Thursday|NNP|N that|WDT|(NP\NP)/(S[dcl]\NP) was|VBD|(S[dcl]\NP)/(S[adj]\NP) red|JJ|S[adj]\NP .|.|.

1 stats 5.8693 232 269

use super = 1
beta levels = 0.075 0.03 0.01 0.005 0.001
dict cutoffs = 20 20 20 20 150
start level = 0
nwords = 10
nsentences = 1
nexceptions = 0
nfailures = 0
run out of levels = 0
nospan = 0
explode = 0
backtrack on levels = 0
nospan/explode = 0
explode/nospan = 0
nsuccess 0 0.075 1 <--
nsuccess 1 0.03 0
nsuccess 2 0.01 0
nsuccess 3 0.005 0
nsuccess 4 0.001 0
total parsing time = 0.008075 seconds
sentence speed = 123.839 sentences/second
word speed = 1238.39 words/second
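If you want to poke at this output from a script, here is a rough Python sketch that splits the type-annotated line into (word, POS tag, CCG category) triples and reads the dependency lines like (ncsubj bought_1 I_0 _). The helper names parse_supertags and parse_gr are made up for illustration and are not part of the C&C tools. The format assumptions (fields separated by "|", arguments written as word_index, "_" for an empty slot) are inferred only from the run above, so treat this as a starting point rather than a real reader for the format.

```python
# Sketch only: parse_supertags and parse_gr are hypothetical helper names,
# not part of the C&C tools. The format is inferred from the output above.

# The type-annotated line, copied from the parser run above.
SUPERTAGGED = (
    r"I|PRP|NP bought|VBD|(S[dcl]\NP)/NP a|DT|NP[nb]/N house|NN|N "
    r"on|IN|(NP\NP)/NP Thursday|NNP|N that|WDT|(NP\NP)/(S[dcl]\NP) "
    r"was|VBD|(S[dcl]\NP)/(S[adj]\NP) red|JJ|S[adj]\NP .|.|."
)

# A few of the dependency lines, copied from the same run.
GR_LINES = [
    "(det house_3 a_2)",
    "(ncmod _ house_3 on_4)",
    "(ncsubj bought_1 I_0 _)",
]


def parse_supertags(line):
    """Split 'word|POS|category' tokens into (word, pos, category) triples."""
    triples = []
    for token in line.split():
        # Split from the right so only the last two '|' act as separators.
        word, pos, category = token.rsplit("|", 2)
        triples.append((word, pos, category))
    return triples


def parse_gr(line):
    """Parse '(label arg ...)' into (label, [(word, index) or None, ...])."""
    label, *args = line.strip().strip("()").split()
    parsed = []
    for arg in args:
        if arg == "_":
            # Assumed to mark an empty slot.
            parsed.append(None)
        else:
            word, index = arg.rsplit("_", 1)
            parsed.append((word, int(index)))
    return label, parsed


if __name__ == "__main__":
    for word, pos, category in parse_supertags(SUPERTAGGED):
        print(f"{word:10} {pos:5} {category}")
    for gr in GR_LINES:
        print(parse_gr(gr))
```

The indices in the dependency lines appear to be 0-based positions in the sentence, so house_3 lines up with the fourth triple from parse_supertags.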
