The paper presents and tests a hypothesis about what additional information human translators use to determine the tenses to be used during translation (in this case between Chinese and English). Since human translators still outperform current automated systems, the goal is to identify where effort should be focused when advancing automatic tense-prediction methods.
The experiment consists of training both conditional random fields (CRFs) and classification trees on surface features and latent features (each alone and in combination), and measuring accuracy against the tenses from gold-standard, best-ranked human-generated English translations. (Here we only consider whether the classifier can correctly predict the verb tense to use in the translation; so for Chinese to English, the classifier predicts which English tense is best suited for each verb.) The paper also details how these gold standards are generated from all the human annotations they collected.
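To make the evaluation concrete, here is a minimal sketch of the accuracy measure described above: predicted tenses are compared against the gold-standard tenses taken from the best-ranked human translation. The function name and all data are my own invention for illustration, not the paper's actual code.

```python
def tense_accuracy(predicted, gold):
    """Fraction of verbs whose predicted tense matches the gold-standard tense."""
    assert len(predicted) == len(gold)
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# Toy example: four verbs, with the tense a classifier predicted and the
# tense used in the best-ranked human translation (invented data).
predicted = ["past", "past", "present", "past"]
gold      = ["past", "present", "present", "past"]

print(tense_accuracy(predicted, gold))  # → 0.75
```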
In terms of surface features versus latent features, surface features seem to be those that can be easily extracted: whether the verb is inside a quote, the presence of signal adverbs before or after the verb, the presence of signal markers between two verbs, the character distance between two verbs, whether the verbs are in the same clause, and so on. When the paper talks about latent features, it addresses mainly three. One is telicity, which specifies whether the verb's action can be bounded within a certain time frame. Another is punctuality, which says whether a verb can be associated with a single point event. The third is temporal ordering, which describes one of six relationships between two events, namely precession, succession, inclusion, subsumption, overlap, and none. The idea is that latent features require deeper semantic analysis of the source text that is generally only feasible through human processing.
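As a rough illustration of why surface features are "easy", here is a sketch of extracting a few of them for one verb in a toy tokenized clause. The feature names, the adverb list, and the input representation are all my own assumptions; the paper's actual feature templates may differ.

```python
# Illustrative list of temporal signal adverbs (invented, not from the paper).
SIGNAL_ADVERBS = {"yesterday", "already", "tomorrow"}

def surface_features(tokens, verb_index, in_quote):
    """Return a dict of easily extracted (surface) features for one verb."""
    before = tokens[:verb_index]
    after = tokens[verb_index + 1:]
    return {
        "in_quote": in_quote,                                   # verb inside a quotation?
        "adverb_before": any(t in SIGNAL_ADVERBS for t in before),
        "adverb_after": any(t in SIGNAL_ADVERBS for t in after),
        "verb_position": verb_index,                            # crude distance proxy
    }

tokens = ["yesterday", "he", "finished", "the", "report"]
print(surface_features(tokens, verb_index=2, in_quote=False))
```

Latent features like telicity or temporal ordering have no comparably shallow extraction rule, which is the paper's point: they demand semantic judgment about the event itself.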
So in the end, the results show that the classifiers trained only on latent features outperform those trained only on surface features, which supports the paper's claim that if we could achieve deeper automatic semantic analysis, we could improve performance on this problem, and perhaps on tense-disambiguation problems in general.
In hindsight, maybe not the greatest of papers, but I thought I would try to clear some things up since it was pretty rough when I presented yesterday. It's very possible I misunderstood some part, so please let me know if something seems off.