EMPIRICAL FOURIER METHODS FOR INTERVAL CENSORED DATA

Peter Hall, John Braun, Thierry Duchesne
2018 Statistica Sinica
Methods for estimating the probability density function are considered under the circumstance that the underlying measurements are interval-censored. Density and distribution function estimators are proposed under parametric and nonparametric assumptions on the censoring mechanism. Conditions for identifiability and consistency of the estimates are established theoretically, and it is shown that under such conditions, the estimates converge to the truth at a polynomial rate in the inverse sample size. An online supplement contains the technical arguments as well as practical guidelines for numerical implementation of the proposed methods.
[The core of the theory in this paper was originally drafted by Peter Hall in early 2010, following discussions at a workshop on mismeasured data held in Canada in December, 2009 at which Peter was the keynote speaker. The co-authors are grateful for the follow-up conversations held with Peter by long distance over the years prior to his regretful passing.]
on Fourier deconvolution approaches to measurement error problems, and in discussions during and after the workshop, he became quite interested in how these approaches might be adapted to problems where the data were interval-censored. These discussions were immensely appreciated by all involved. By the end of the workshop, Peter indicated in his polite but clear way that he had quite seen enough of snow (it had been falling almost continuously for the entire event), and he happily boarded an airplane bound for Hong Kong. Upon arrival in Hong Kong, Peter sent an email message to one of us (JB) containing an attachment of a carefully typed 6-page draft manuscript encapsulating some of the ideas discussed at the meeting. Much of that draft appears verbatim in Sections 2 and 3 of the present paper. Over the next few weeks, there were several emails back and forth concerning implementation of the proposed Fourier approach, and by January 8, 2010, Peter had completed most of the theory outlined in Sections 4 and 5. Numerical issues continued to cause trouble over the ensuing months, seemingly contradicting the theoretical results concerning the consistency of the new estimator. The method seemed to require enormous sample sizes in order to work, so it did not appear to be a practical contribution to the literature on interval-censored density estimation.
The work was abandoned until TD was approached with questions about the consistency of a competing estimator (Braun et al., 2005). Peter's theoretical ideas were brought into these discussions, and interest in implementing the Fourier method was rekindled. We had a few brief email conversations with Peter and discussed plans for the three of us to publish this paper, but Peter's illness brought those conversations to a close, and it was with sadness that we learned of his passing. In October, 2016, we made one more attempt at numerically implementing the method, scanning Peter's carefully constructed theoretical arguments for clues that might assist us in practically implementing the method. Gradually, we began to see that our earlier numerical efforts had been based on unnecessary simplifications, leading to horribly suboptimal solutions; full implementation of the technique was, in fact, not only possible, but it also gave very good results. This paper sets out, then, to show that a Fourier method for kernel density and distribution function estimation for interval-censored data can work, both theoretically and practically.
doi:10.5705/ss.202017.0057 fatcat:nlswbvsrbrdxpbgsau55xpqdxu