To recap, I read "Latent features in automatic tense translation between Chinese and English."
The paper is a basic introduction to the difficulties inherent in translating Chinese verbs to English, particularly with regard to determining tense. It argues that the deeper features humans use when translating need to be identified, and that they should become the focus of more advanced automatic extraction methods. As a form of comparison and evaluation, tense classifiers using latent features are shown to outperform those using only surface features. The paper seems to promote a more human-cognition-oriented approach to the problem, and to NLP in general, targeting the information that currently remains difficult to extract.
I feel Chinese may have been chosen to demonstrate a more interesting or extreme case of the tense classification problem. The feature space explored includes surface features, latent features, telicity and punctuality features, and temporal ordering features. The basic ideas were fairly easy for me to follow given my relative fluency in Chinese, so I won't go into the details in this blog post.
The paper describes experiments with both CRF (conditional random field) learning and classification tree learning. The classifiers are trained on surface features and latent features separately, and then on both together. The two approaches yield similar accuracy and perform roughly 15% better than the baseline systems. I feel a large part of the paper's aim is to reorient focus onto extracting latent features, as the authors present the difficulty of the task as the key to advancing current extraction methods.
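To make the surface-vs-latent distinction concrete, here is a toy rule-based sketch of how such features might feed a tense decision. This is my own illustration, not the paper's actual model, and the feature names (aspect_marker, telic) are hypothetical stand-ins for the paper's feature inventory:

```python
# Toy illustration of tense classification from a feature dict.
# "Surface" features are directly visible in the Chinese text
# (e.g. the perfective aspect marker 了 "le"); "latent" features
# like telicity must be inferred, as a human translator would.

def classify_tense(features):
    """Return a coarse English tense label for a Chinese verb."""
    # Surface cue: an overt perfective marker often signals past tense.
    if features.get("aspect_marker") == "le":
        return "past"
    # Latent cue: a telic event (one with an inherent endpoint) with no
    # overt marker still tends to translate as past tense.
    if features.get("telic"):
        return "past"
    # Default: stative or habitual readings map to present tense.
    return "present"

# A verb carrying the surface marker 了 is classified from surface alone.
print(classify_tense({"aspect_marker": "le"}))  # past
# With no marker, only the latent telicity feature recovers the tense.
print(classify_tense({"telic": True}))          # past
print(classify_tense({"telic": False}))         # present
```

The point of the sketch is that the second case is invisible to a surface-only classifier, which is exactly the gap the latent features are meant to close.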
I will write up the pre-meeting summary before class tomorrow.