CMU Advanced NLP Seminar 2011: Pre-meeting

Focus paper: Poetic Statistical Machine Translation: Rhyme and Meter

Related Paper: Automatic Analysis of Rhythmic Poetry with Applications to Generation and Translation

This paper presents a generative model of rhythm in poetry and shows how it can be applied to poetry translation. The paper deals only with sonnets, a poetic form written in fairly strict iambic pentameter. The model of rhythm combines two elements: the CMU pronunciation dictionary and an FST-based model of stress. The FST model learns the stress pattern of each word by constraining each line to have one of four standard stress patterns and using EM to learn the weights of word-specific pronunciation FSTs. For poetry generation, the authors mention that the FST model is augmented with the CMU dictionary, but do not discuss how. On the task of assigning stresses to words in held-out data, the FST method achieves 94% per-word accuracy and 81% per-line accuracy. The authors note that the main problem with this model is that the learned lexical pronunciation probabilities are context independent, which makes single-syllable words difficult to handle, since their stress depends heavily on where they fall in the line.
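To make the line-level constraint concrete, here is a minimal Python sketch of the decoding side only: given per-word stress probabilities, it finds the most probable assignment that concatenates exactly to one of a few line templates. Everything in it is an assumption for illustration. The lexicon probabilities are made up, the four templates are guesses rather than the four patterns the paper uses, and a plain dynamic program stands in for the FSTs. The EM training step is not shown.

```python
from functools import lru_cache

# Hypothetical toy lexicon: word -> {stress string: probability}.
# "1" = stressed syllable, "0" = unstressed. All values are invented.
LEXICON = {
    "shall": {"0": 0.7, "1": 0.3},
    "i": {"0": 0.6, "1": 0.4},
    "compare": {"01": 0.9, "10": 0.1},
    "thee": {"0": 0.5, "1": 0.5},
    "to": {"0": 0.8, "1": 0.2},
    "a": {"0": 0.9, "1": 0.1},
    "summer's": {"10": 0.95, "01": 0.05},
    "day": {"1": 0.7, "0": 0.3},
}

# Assumed line templates: plain iambic pentameter plus illustrative variants.
TEMPLATES = ["0101010101", "1001010101", "01010101010", "10010101010"]


def best_scansion(words, template, lexicon=LEXICON):
    """Best (probability, per-word stresses) whose concatenation equals template.

    Returns (0.0, None) if no assignment fits the template.
    """

    @lru_cache(maxsize=None)
    def solve(i, pos):
        if i == len(words):
            # Valid only if the template is consumed exactly.
            return (1.0, ()) if pos == len(template) else (0.0, None)
        best = (0.0, None)
        for stress, p in lexicon[words[i]].items():
            if template.startswith(stress, pos):
                rest_p, rest = solve(i + 1, pos + len(stress))
                if rest is not None and p * rest_p > best[0]:
                    best = (p * rest_p, (stress,) + rest)
        return best

    return solve(0, 0)


def scan_line(words):
    """Pick the template with the highest-probability scansion."""
    results = [(best_scansion(words, t), t) for t in TEMPLATES]
    (prob, stresses), template = max(results, key=lambda r: r[0][0])
    return template, stresses, prob


line = "shall i compare thee to a summer's day".split()
template, stresses, prob = scan_line(line)
```

Note how the single-syllable words get their stress entirely from their position in the template. That is the context effect a context-independent word model cannot capture on its own.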

For generation, the authors use a language-model-based method combined with the pronunciation model and a simplified model of rhyme. The generation model is not evaluated directly, but the poems it produces are at least amusing.
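The summary does not say what the simplified rhyme model looks like. As a point of reference, a common heuristic over CMU-style pronunciations treats two words as rhyming when their phonemes match from the last primary-stressed vowel to the end of the word. The sketch below assumes that heuristic, not the authors' model, and uses a small hand-typed dictionary in CMU dictionary notation (ARPAbet phonemes with stress digits on vowels) instead of loading the real file.

```python
# Hand-typed entries in CMU dictionary notation; a real system would
# load the full dictionary instead of this hypothetical toy table.
TOY_CMUDICT = {
    "day": ["D", "EY1"],
    "may": ["M", "EY1"],
    "shines": ["SH", "AY1", "N", "Z"],
    "declines": ["D", "IH0", "K", "L", "AY1", "N", "Z"],
}


def rhyme_part(phones):
    """Phonemes from the last primary-stressed vowel to the end of the word."""
    for idx in range(len(phones) - 1, -1, -1):
        if phones[idx].endswith("1"):
            return phones[idx:]
    # No primary stress marked: fall back to the last vowel of any stress.
    for idx in range(len(phones) - 1, -1, -1):
        if phones[idx][-1].isdigit():
            return phones[idx:]
    return phones


def rhymes(word_a, word_b, pron=TOY_CMUDICT):
    """True if two distinct words share the same rhyme part."""
    if word_a == word_b:
        return False
    return rhyme_part(pron[word_a]) == rhyme_part(pron[word_b])
```

With this table, `rhymes("day", "may")` and `rhymes("shines", "declines")` hold, while `rhymes("day", "shines")` does not.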

For translation, the pronunciation model is applied as part of the language model in a PBMT system. The authors do not perform any quantitative evaluation, due to the intrinsic difficulty of evaluating poetry automatically. They find that when testing on training data, the resulting poems tend to resemble human translations more closely than the outputs of a pure PBMT system, but when testing on held-out test data, the system often fails to produce any output.

Thursday, April 14, 2011

Pre-meeting - Daniel
