A new method of bandwidth selection for kernel density estimators is proposed. The method, termed indirect cross-validation (ICV), makes use of so-called selection kernels. Least-squares cross-validation (LSCV) is used to select the bandwidth of a selection-kernel estimator and this bandwidth is appropriately rescaled for use in a Gaussian kernel estimator. The proposed selection kernels are linear combinations of two Gaussian kernels and need not be unimodal or positive. A theory is developed showing that the relative error of ICV bandwidths can converge to 0 at a rate of n^{-1/4}, which is substantially better than the n^{-1/10} rate of LSCV. Interestingly, the selection kernels that are best for purposes of bandwidth selection are very poor if used to actually estimate the density function. This property appears to be part of the larger and well-documented paradox to the effect that "the harder the estimation problem, the better cross-validation performs." The ICV method uniformly outperforms LSCV in a simulation study, a real data example, and a simulated example in which bandwidths are chosen locally. Supplemental materials for the article are available online.
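The two-step idea described in the abstract (run LSCV with a selection kernel, then rescale the result for a Gaussian kernel) can be illustrated with a minimal Python sketch. Several details here are assumptions rather than statements from the abstract: the selection kernel is taken to be L(u) = (1 + alpha) * phi(u) - (alpha / sigma) * phi(u / sigma), which is one way to form a linear combination of two Gaussian kernels; the rescaling constant is derived from the standard asymptotically optimal bandwidth formula; the default values of `alpha` and `sigma` and the search grid are illustrative only; and all function names are hypothetical.

```python
import numpy as np

def norm_pdf(u, s=1.0):
    """Density of N(0, s^2) evaluated at u."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def sel_kernel(u, alpha, sigma):
    """Assumed selection kernel: (1 + alpha) * N(0, 1) - alpha * N(0, sigma^2)."""
    return (1 + alpha) * norm_pdf(u) - alpha * norm_pdf(u, sigma)

def sel_kernel_conv(u, alpha, sigma):
    """Convolution of the selection kernel with itself (Gaussians convolve in closed form)."""
    return ((1 + alpha) ** 2 * norm_pdf(u, np.sqrt(2.0))
            - 2 * alpha * (1 + alpha) * norm_pdf(u, np.sqrt(1 + sigma ** 2))
            + alpha ** 2 * norm_pdf(u, sigma * np.sqrt(2.0)))

def lscv(b, x, alpha, sigma):
    """LSCV criterion for the selection-kernel estimator with bandwidth b."""
    n = len(x)
    d = (x[:, None] - x[None, :]) / b              # all pairwise scaled differences
    int_f2 = sel_kernel_conv(d, alpha, sigma).sum() / (n ** 2 * b)
    k = sel_kernel(d, alpha, sigma)
    loo = (k.sum() - np.trace(k)) / (n * (n - 1) * b)  # leave-one-out average
    return int_f2 - 2.0 * loo

def rescale_constant(alpha, sigma):
    """Assumed rescaling: ratio of asymptotically optimal bandwidths, Gaussian vs. selection kernel."""
    r_k = 1.0 / (2.0 * np.sqrt(np.pi))             # integral of phi^2
    r_l = sel_kernel_conv(0.0, alpha, sigma)       # integral of L^2
    mu2_l = (1 + alpha) - alpha * sigma ** 2       # second moment of L (that of phi is 1)
    return (r_k * mu2_l ** 2 / r_l) ** 0.2

def icv_bandwidth(x, alpha=2.42, sigma=5.06, grid=None):
    """Grid-search LSCV for the selection kernel, then rescale for a Gaussian kernel."""
    x = np.asarray(x, dtype=float)
    if grid is None:
        # Illustrative search grid; not taken from the article.
        grid = np.std(x) * np.geomspace(0.01, 3.0, 200)
    scores = [lscv(b, x, alpha, sigma) for b in grid]
    b_hat = grid[int(np.argmin(scores))]
    return rescale_constant(alpha, sigma) * b_hat
```

Setting `alpha=0` reduces the assumed selection kernel to the Gaussian kernel, so the rescaling constant becomes 1 and the sketch falls back to ordinary LSCV. A grid search is used instead of a numerical optimizer because cross-validation criteria can have several local minima.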
Number of pages: 9
Journal: Journal of the American Statistical Association
State: Published - Mar 2010
Bibliographical note
Funding Information:
Olga Y. Savchuk is Visiting Assistant Professor, Binghamton University, Binghamton, NY 13902-6000 (E-mail: firstname.lastname@example.org). Jeffrey D. Hart is Professor, Department of Statistics, Texas A&M University, College Station, TX 77843-3143 (E-mail: email@example.com). Simon J. Sheather is Professor and Head, Department of Statistics, Texas A&M University, College Station, TX 77843-3143 (E-mail: firstname.lastname@example.org). The authors are grateful to David Scott and George Terrell for providing valuable insight about cross-validation, and to three referees and an associate editor, whose comments led to a much improved final version of our paper. The research of Savchuk and Hart was supported in part by NSF grant DMS-0604801.
- Bandwidth selection
- Kernel density estimation
- Local cross-validation
- Simulation of Bayes risk
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty