Multivariate spacings based on data depth: I. Construction of nonparametric multivariate tolerance regions
Annals of Statistics
This paper introduces and studies multivariate spacings. The spacings are developed using the order statistics derived from data depth. Specifically, the spacing between two consecutive order statistics is the region which bridges the two order statistics, in the sense that the region contains all the points whose depth values fall between the depth values of the two consecutive order statistics. These multivariate spacings can be viewed as a data-driven realization of the so-called
... ly equivalent blocks." These spacings take the form of center-outward layers of "shells" ("rings" in the two-dimensional case), where the shapes of the shells closely follow the underlying probabilistic geometry. The properties and applications of these spacings are studied. In particular, the spacings are used to construct tolerance regions. The construction of tolerance regions is nonparametric and completely data driven, and the resulting tolerance region reflects the true geometry of the underlying distribution. This differs from most existing approaches, which require the shape of the tolerance region to be specified in advance. The proposed tolerance regions are shown to meet the prescribed specifications, in terms of β-content and β-expectation. They are also asymptotically minimal under elliptical distributions. Finally, a simulation and comparison study of the proposed tolerance regions is presented.
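The construction described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the Mahalanobis depth, D(x) = 1 / (1 + (x - mean)' S^{-1} (x - mean)), as the depth function, since the abstract does not name one. The function names (`mahalanobis_depth`, `depth_order`, `spacing_index`, `in_tolerance_region`) are hypothetical, and the rule r = ceil(β(n + 1)) used in the usage lines is an assumption motivated by the classical coverage result for equivalent blocks, not a formula taken from the abstract.

```python
import math
import numpy as np


def mahalanobis_depth(points, sample):
    """Mahalanobis depth of each row of `points` with respect to `sample`.

    Assumption: this depth is used only for illustration; any other
    data depth could be substituted.
    """
    mu = sample.mean(axis=0)
    cov = np.cov(sample, rowvar=False)
    diff = np.atleast_2d(points) - mu
    # Squared Mahalanobis distance of every point from the sample mean.
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return 1.0 / (1.0 + d2)


def depth_order(sample):
    """Sample depth values sorted from deepest to shallowest."""
    return np.sort(mahalanobis_depth(sample, sample))[::-1]


def spacing_index(points, sample):
    """Number of sample points strictly deeper than each query point.

    Index 0: at least as deep as the deepest sample point.
    Index i (1 <= i <= n - 1): the depth lies between the depths of the
    i-th and (i + 1)-th depth order statistics, i.e. the i-th shell.
    Index n: shallower than every sample point.
    """
    sorted_depths = depth_order(sample)
    d = mahalanobis_depth(points, sample)
    return np.sum(sorted_depths[None, :] > d[:, None], axis=1)


def in_tolerance_region(points, sample, r):
    """True where a point is at least as deep as the r-th deepest sample point.

    The region is the union of the innermost shells, so its shape is
    inherited from the depth contours rather than fixed in advance.
    """
    threshold = depth_order(sample)[r - 1]
    return mahalanobis_depth(points, sample) >= threshold


# Usage: simulated bivariate normal data (illustrative only).
rng = np.random.default_rng(0)
sample = rng.normal(size=(200, 2))
beta = 0.9
r = math.ceil(beta * (len(sample) + 1))  # assumed rule, see the lead-in
fresh = rng.normal(size=(5000, 2))
coverage = in_tolerance_region(fresh, sample, r).mean()
print(f"r = {r}, empirical coverage on fresh draws = {coverage:.3f}")
```

With the Mahalanobis depth every shell is an elliptical ring, so this sketch does not show the shape-adaptive behavior that other depth functions can provide. Swapping in a different depth only requires replacing `mahalanobis_depth`.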