This week I read "Combining Multiple Weak Clusterings" by Topchy, Jain, and Punch, from the Proceedings of the IEEE International Conference on Data Mining, 2003. http://www.cse.msu.edu/prip/Files/topchy_combination.pdf
This paper presented several interesting points. First, it sets up a formal definition of the cluster combination problem: given a set of clusterings of the same data, find a consensus clustering, derived from some combination of them, that is "better" than any individual one. The paper recasts this combination problem as a clustering problem in its own right, where the critical component is the consensus function that defines how the different clusterings are combined. It then relates this formulation to the median partition problem and to an intra-class variance criterion. Seven different consensus functions are studied, evaluated by the mis-assignment rate of the consensus partition (the true number of clusters is made available to each method). The paper's second focus is combining weak clustering algorithms, specifically k-means run on the data after projection into a lower dimension, and random splits of the data by hyperplanes. The experiments cross these weak clusterings with the consensus functions over different dimensionalities and different values of k.
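To make the weak-clustering idea concrete, here is a minimal sketch of one instance of it: run k-means on several random 1-D projections of the data, then combine the resulting weak partitions with a simple co-association (pairwise agreement) consensus and take connected components of the thresholded agreement graph. The helper names (`kmeans_1d`, `co_association`, `consensus_labels`) and the specific consensus step are my own illustration, not the paper's exact consensus functions, and the threshold of 0.5 is an assumption.

```python
import numpy as np

def kmeans_1d(x, k, iters=20, rng=None):
    """Tiny k-means on 1-D data; returns integer cluster labels."""
    rng = rng if rng is not None else np.random.default_rng(0)
    centers = rng.choice(x, size=k, replace=False)
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean()
    return labels

def co_association(label_sets, n):
    """Fraction of weak clusterings that put each pair of points together."""
    m = np.zeros((n, n))
    for labels in label_sets:
        m += (labels[:, None] == labels[None, :])
    return m / len(label_sets)

def consensus_labels(m, threshold=0.5):
    """Connected components of the thresholded co-association graph."""
    n = m.shape[0]
    labels = -np.ones(n, dtype=int)
    cur = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        stack = [i]
        while stack:
            u = stack.pop()
            if labels[u] >= 0:
                continue
            labels[u] = cur
            stack.extend(np.nonzero((m[u] >= threshold) & (labels < 0))[0].tolist())
        cur += 1
    return labels

rng = np.random.default_rng(42)
# Toy data: two well-separated Gaussian blobs in 5 dimensions.
X = np.vstack([rng.normal(0, 0.3, (30, 5)), rng.normal(3, 0.3, (30, 5))])

# 25 weak clusterings: k-means on random 1-D projections.
partitions = [kmeans_1d(X @ rng.normal(size=5), k=2, rng=rng) for _ in range(25)]

labels = consensus_labels(co_association(partitions, len(X)))
```

Any single projection here can be a poor clustering (if the random direction collapses the two blobs together), but the averaged pairwise agreement across many projections recovers the blob structure, which is the intuition the paper formalizes.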
The idea in this paper seems pretty intuitive, which probably explains why it was published some time ago. Theoretically it makes sense that you should be able to form a better clustering by combining multiple clusterings of the same data, but because the argument telescopes (you could just as well combine the combinations, and so on), the practicality and marginal benefits seem limited in significance.