CMU Advanced NLP Seminar 2011: Commentary for Feb 2

I read the paper "Coreference Resolution in a Modular, Entity-Centered Model." It presents a mostly unsupervised approach to coreference resolution, using methods similar to those of the focus paper. Instead of matching mentions to slots in frames, it matches mentions to entities. The system works with a three-level hierarchy: mentions, which are present in the text; entities, which are abstract; and types, which are classes of entities. The type level lets the model generalize across many different entities.

The system uses a hierarchical generative process. First, a list of entities is drawn by drawing a list of types and then an entity from each type. Next, mentions are drawn from the entities using a sequential distance-dependent Chinese restaurant process. Finally, each mention generates a surface realization.
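
To make the mention step concrete, here is a minimal sketch of a sequential distance-dependent Chinese restaurant process in its usual link-based form: each mention either links to an earlier mention or starts a new entity. This is not the paper's exact parameterization. The exponential decay, the `alpha` and `decay` parameters, and the function names are all assumptions of mine for illustration, and the sketch leaves out the type and surface-realization levels entirely.

```python
import math
import random


def sample_antecedents(n_mentions, alpha=1.0, decay=0.5, rng=None):
    """Sample one antecedent link per mention, left to right.

    links[i] == i means mention i starts a new entity; otherwise links[i]
    is the index of an earlier mention it corefers with.
    alpha and decay are illustrative assumptions, not values from the paper.
    """
    rng = rng or random.Random(0)
    links = []
    for i in range(n_mentions):
        # Weight for linking to earlier mention j shrinks with distance i - j;
        # the final weight (alpha) is for the self-link, i.e. a new entity.
        weights = [math.exp(-decay * (i - j)) for j in range(i)] + [alpha]
        links.append(rng.choices(range(i + 1), weights=weights)[0])
    return links


def links_to_entities(links):
    """Turn antecedent links into an entity id for each mention."""
    entity_of = []
    next_id = 0
    for i, j in enumerate(links):
        if j == i:
            # Self-link: this mention introduces a new entity.
            entity_of.append(next_id)
            next_id += 1
        else:
            # Otherwise inherit the entity of the chosen antecedent.
            entity_of.append(entity_of[j])
    return entity_of
```

For example, `links_to_entities([0, 0, 2, 1])` puts mentions 0, 1, and 3 in one entity and mention 2 in another. Because nearby mentions get more weight, recent mentions are more likely antecedents, which is the discourse intuition behind the distance dependence.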

To train the model, each level of the generative hierarchy is updated using EM in turn, until all have converged. This approximates standard EM over the full model, which would be computationally infeasible. All of the training is done on unlabeled data, except for prototypes of the types, which are hard-coded at the beginning of training.
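
The "in turn" schedule can be sketched as a simple round-robin loop. This shows only the control flow, under my own assumptions: `train_in_turn`, the scalar parameters, and the toy updaters are hypothetical, and the toy updates are exact coordinate maximizers of a small quadratic standing in for the per-level EM updates, not EM steps from the paper.

```python
def train_in_turn(params, updaters, tol=1e-6, max_rounds=100):
    """Update each parameter block in turn until no block moves more than tol.

    params:   dict mapping block name -> current (scalar) value
    updaters: dict mapping block name -> function(params) -> new value,
              a stand-in for one level's update with the others held fixed
    Returns the final params and the number of rounds used.
    """
    for round_idx in range(max_rounds):
        largest_change = 0.0
        for name, update in updaters.items():
            new_value = update(params)
            largest_change = max(largest_change, abs(new_value - params[name]))
            params[name] = new_value
        if largest_change < tol:
            return params, round_idx + 1
    return params, max_rounds


# Toy stand-ins: maximize -(x - 1)^2 - (y - x)^2 one coordinate at a time.
toy_updaters = {
    "x": lambda p: (1.0 + p["y"]) / 2.0,  # best x with y held fixed
    "y": lambda p: p["x"],                # best y with x held fixed
}

final_params, rounds_used = train_in_turn({"x": 0.0, "y": 0.0}, toy_updaters)
```

In the toy problem both values approach 1. The point is only that each block is improved with the others frozen, which is cheaper than a joint update but can follow a different path than running EM on the whole model at once.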

The paper's explicit modeling of discourse is interesting, and it seems like a good starting point for generative models of discourse that could be incorporated into other models.

Wednesday, February 2, 2011
