Recombining dependent data: an Order Statistics approach
By Adolfo Alvarez and Daniel Peña in Research
December 1, 2009
Abstract
This article discusses the problem of forming groups from previously split data. Cluster analysis algorithms such as SAR, proposed by Peña, Rodriguez and Tiao (2004), divide the sample into small, very homogeneous groups and then recombine them to form the final data configuration. This kind of splitting produces dependent data, in the sense that the groups are disjoint, so the traditional tests for homogeneity of means or variances cannot be used.
We propose an alternative based on order statistics. By studying the distribution and some moments of linear combinations of order statistics, it is possible to decide when disjoint data groups can be recombined, that is, when their union forms a sample from a single population.
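To make the role of order statistics concrete, here is a minimal Python sketch under a simplified setting that is an assumption for illustration and is not taken from the article: a univariate Uniform(0,1) sample of size n is sorted and its m smallest values form one group. The group mean is then a linear combination of the first m order statistics, and since E[U_(i)] = i / (n + 1), its expectation is (m + 1) / (2(n + 1)) rather than the population mean 0.5. The function names are hypothetical, and this is not the SAR procedure or the authors' test.

```python
import random
import statistics


def lower_group_mean(sample, m):
    """Mean of the m smallest observations.

    This is a linear combination of the order statistics
    X_(1), ..., X_(m), each with weight 1/m.
    """
    ordered = sorted(sample)
    return sum(ordered[:m]) / m


def expected_lower_group_mean_uniform(n, m):
    """Exact expectation of lower_group_mean for n Uniform(0,1) draws.

    Uses E[U_(i)] = i / (n + 1), so the average over i = 1..m
    equals (m + 1) / (2 * (n + 1)).
    """
    return (m + 1) / (2 * (n + 1))


def simulate(n=20, m=10, reps=20000, seed=0):
    """Monte Carlo estimate of the expected lower-group mean."""
    rng = random.Random(seed)
    means = [
        lower_group_mean([rng.random() for _ in range(n)], m)
        for _ in range(reps)
    ]
    return statistics.fmean(means)


if __name__ == "__main__":
    n, m = 20, 10
    print("exact expectation:", expected_lower_group_mean_uniform(n, m))
    print("simulated mean:   ", simulate(n, m))
```

In this toy setting the two halves of a perfectly homogeneous sample have clearly different expected means, which is why a standard two-sample comparison would wrongly suggest two populations, while a reference distribution derived from order statistics accounts for the way the groups were formed.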