Related paper: Unsupervised Modeling of Twitter Conversations;
Alan Ritter, Colin Cherry, Bill Dolan, NAACL 2010
Focus paper: Characterizing Microblogs with Topic Models
The related paper I read this week proposes an unsupervised method for discovering dialogue structure, or "dialogue acts", in Twitter conversations. The idea is to automatically extract information about the nature of the interactions between people in new media such as Twitter. This is a pretty cool problem, since the conversational aspects of English (or any language) seem to be among the harder problems one could pose in NLP. The authors crawled Twitter using its API and obtained the posts of a sample of users along with all the replies to those posts, extracting entire conversation trees. In total, the data amounted to 1.3 million conversations. Only 10,000 randomly chosen conversations are actually used, and scaling the models to the entire corpus is left for future work. The authors introduce three models: the EM Conversation model, the Conversation+Topic model, and the Bayesian Conversation model, the second being an extension of the first.
The Conversation+Topic model is basically a modified HMM borrowed from some previous work on multi-document summarization. Using the Conversation model alone wasn't quite good enough, since topic and dialogue structure were mixed together in the results, whereas the focus is on dialogue structure. The authors use an LDA-style framework to extend the model so that it accounts for topic, and thus separates content words from dialogue indicators. For inference, the HMM dynamic programming is swapped for Gibbs sampling. Slice sampling is also applied.
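To make the "separate content words from dialogue indicators" idea concrete, here is a rough generative sketch. It is not the paper's actual model: the acts, word lists, transition probabilities, and the single switch probability `p_topic` are all made-up assumptions, and the model is simplified so that each word comes from one of just two sources (the post's dialogue act or the conversation's topic).

```python
import random

# Hypothetical toy parameters (not taken from the paper).
ACT_TRANS = {
    "question": {"question": 0.2, "answer": 0.8},
    "answer": {"question": 0.4, "answer": 0.6},
}
ACT_WORDS = {
    "question": ["what", "why", "how", "?"],
    "answer": ["yes", "no", "because", "thanks"],
}
TOPIC_VOCAB = ["game", "movie", "weather", "coffee", "exam"]


def generate_conversation(n_posts, words_per_post, p_topic, rng):
    """Sample a toy conversation; every word is tagged with its source."""
    # Each conversation gets its own small set of topic words.
    topic_words = rng.sample(TOPIC_VOCAB, 2)
    act = "question"  # assumption: conversations open with a question
    posts = []
    for _ in range(n_posts):
        post = []
        for _ in range(words_per_post):
            # Switch variable: does this word carry topic or dialogue structure?
            if rng.random() < p_topic:
                post.append((rng.choice(topic_words), "topic"))
            else:
                post.append((rng.choice(ACT_WORDS[act]), "act"))
        posts.append((act, post))
        # HMM step: the next post's act depends only on the current act.
        nxt = ACT_TRANS[act]
        act = rng.choices(list(nxt), weights=list(nxt.values()))[0]
    return topic_words, posts


topic_words, posts = generate_conversation(3, 5, 0.5, random.Random(0))
for act, post in posts:
    print(act, post)
```

Inference runs this story backwards: given only the words, a sampler has to guess each post's act and each word's source, which is where Gibbs sampling comes in.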
To evaluate the set of generated dialogue acts, the authors carry out both qualitative and quantitative evaluations. The qualitative evaluation really only focuses on the Conversation+Topic model: they go through a 10-act model, showing the transition probabilities between dialogue acts. Most of the acts are reasonable and do a good job of illustrating the fact that Twitter is a microblog. They also display word lists and example posts for each dialogue act, which are fairly convincing. For the quantitative evaluation, the paper introduces a new task, conversation ordering: given a random set of conversations, all permutations of the posts in each conversation are generated, and the probability of each permutation is evaluated as if it were an unseen conversation. Although useful, this metric does not appear to directly imply anything about the interpretability of the model.
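The conversation ordering task can be sketched roughly as follows. This is a toy illustration, not the authors' code: the two-act HMM and all of its probabilities are made up, and comparing the top-scoring permutation to the original order with Kendall's tau is my own assumption about how to score the result.

```python
import itertools
import math

# Hypothetical toy HMM over two dialogue acts (all numbers are made up).
ACTS = ["question", "answer"]
START = {"question": 0.8, "answer": 0.2}
TRANS = {
    "question": {"question": 0.2, "answer": 0.8},
    "answer": {"question": 0.4, "answer": 0.6},
}
EMIT = {
    "question": {"what": 0.4, "?": 0.4, "yes": 0.1, "thanks": 0.1},
    "answer": {"what": 0.1, "?": 0.1, "yes": 0.4, "thanks": 0.4},
}


def post_likelihood(act, post):
    """P(post | act) as a bag of words; unseen words get a tiny floor."""
    p = 1.0
    for word in post:
        p *= EMIT[act].get(word, 1e-6)
    return p


def log_prob(conversation):
    """Forward algorithm: log P(posts), summed over all act sequences."""
    alpha = {a: START[a] * post_likelihood(a, conversation[0]) for a in ACTS}
    for post in conversation[1:]:
        alpha = {
            a: sum(alpha[b] * TRANS[b][a] for b in ACTS) * post_likelihood(a, post)
            for a in ACTS
        }
    return math.log(sum(alpha.values()))


def best_ordering(conversation):
    """Score every permutation of the posts and return the most probable one."""
    orders = itertools.permutations(range(len(conversation)))
    return max(orders, key=lambda o: log_prob([conversation[i] for i in o]))


def kendall_tau(order):
    """Kendall's tau between `order` and the original order (needs >= 2 posts)."""
    pairs = list(itertools.combinations(range(len(order)), 2))
    concordant = sum(1 for i, j in pairs if order[i] < order[j])
    return (2 * concordant - len(pairs)) / len(pairs)


conversation = [["what", "?"], ["yes"]]
order = best_ordering(conversation)
print(order, kendall_tau(order))
```

Enumerating permutations is factorial in the number of posts, so this brute-force scoring is only practical for short conversations.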
It seems the paper does a decent job of taking a first step toward unsupervised dialogue act tagging. The work doesn't seem overly complex, but the observations and the data they collected seem like they could be useful for at least a couple more runs.