I have read the paper “Statistical Phrase-Based Translation” as the focus paper. The paper describes a framework in which different phrase extraction heuristics can be plugged in to generate the phrase set. The authors compare the performance of several phrase extraction methods against IBM Model 4, and conclude that phrases help translation, while restricting the model to syntactically motivated phrases does not. The beam search in the decoder feels a little ad hoc, and the search operations are not clear enough to me in certain cases, such as whether phrases can overlap, and how the distortion probability is used during decoding, e.g. whether there can be an operation like swapping two phrases in a hypothesis.
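To make my question about overlap and distortion concrete, here is a minimal sketch of how I currently read the decoder: a hypothesis keeps track of which source positions are already translated, an extension picks an untranslated source span, and the distortion score is the paper's d(a_i - b_{i-1}) = alpha^|a_i - b_{i-1} - 1|, where a_i is the start of the current source phrase and b_{i-1} is the end of the previous one. Under this reading, reordering comes from choosing source spans out of order rather than from a separate swap operation. The function names, the set-based coverage, and the value alpha = 0.5 are my own assumptions for illustration, not the authors' implementation.

```python
def distortion_cost(start, prev_end, alpha=0.5):
    """Distortion score alpha ** |start - prev_end - 1| (1-based source positions).

    alpha = 0.5 is an arbitrary illustrative value, not taken from the paper.
    """
    return alpha ** abs(start - prev_end - 1)


def extend(coverage, prev_end, span, alpha=0.5):
    """Try to extend a hypothesis with the source span (start, end), inclusive.

    Returns (new_coverage, new_prev_end, distortion), or None when the span
    overlaps source words that are already translated.
    """
    start, end = span
    positions = set(range(start, end + 1))
    if positions & coverage:
        return None  # overlapping phrases are rejected in this sketch
    return coverage | positions, end, distortion_cost(start, prev_end, alpha)


# Hypothetical 4-word source sentence: translate words 3-4 first, then 1-2.
cov, prev_end = frozenset(), 0
cov, prev_end, d_first = extend(cov, prev_end, (3, 4))
cov, prev_end, d_second = extend(cov, prev_end, (1, 2))

# A monotone first step carries no distortion penalty.
_, _, d_monotone = extend(frozenset(), 0, (1, 2))

# A span that touches already covered words cannot be used again.
overlap_result = extend(frozenset({1, 2}), 2, (2, 3))
```

In this sketch the out-of-order choice is penalized twice (once for jumping ahead to word 3, once for jumping back to word 1), which is how I would expect a "swap" to be scored if it is not a dedicated operation.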
For the focus paper, because of my limited knowledge of machine translation frameworks, I am not sure how the extracted phrase sets are used in decoding, e.g. how the parameters for them as features are estimated: generatively, as for a language model, or discriminatively.
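As far as I can tell, the paper estimates the phrase translation probability by relative frequency over the extracted phrase pairs, phi(f|e) = count(f, e) / sum over f' of count(f', e), which is closer to count-based LM estimation than to discriminative training. A minimal sketch of that estimate is below; the function name and the toy phrase pairs are made up for illustration, and how the remaining weights are tuned is still unclear to me.

```python
from collections import Counter


def phrase_table(pairs):
    """Relative-frequency estimate phi(f | e) from extracted (f, e) phrase pairs."""
    pair_counts = Counter(pairs)
    e_counts = Counter(e for _, e in pairs)
    return {(f, e): c / e_counts[e] for (f, e), c in pair_counts.items()}


# Hypothetical extracted phrase pairs (foreign phrase, English phrase).
pairs = [
    ("das haus", "the house"),
    ("das haus", "the house"),
    ("haus", "the house"),
    ("haus", "house"),
]
phi = phrase_table(pairs)
```

With these toy counts, "the house" was seen three times, twice with "das haus", so that pair gets two thirds of the probability mass for that English phrase.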