Progress in data-based bandwidth selection for kernel density estimation

M. C. Jones, J. S. Marron, S. J. Sheather

Research output: Contribution to journal › Article › peer-review

108 Scopus citations

Abstract

We review the extensive recent literature on automatic, data-based selection of a global smoothing parameter in univariate kernel density estimation. Proposals are presented in a unified framework, making considerable reference to their theoretical properties as we go. The results of a major simulation study of the practical performance of many of these methods are summarised. Also, our remarks are further consolidated by describing a small portion of our practical experience on real datasets. Our comparison of methods' practical performance demonstrates that improvements to be gained by using the better methods can be, and often are, considerable. It will be seen that achieving optimal theoretical performance (up to bounds derived by Hall and Marron, 1991) and acceptable practical performance is not accomplished by the same techniques. We put much effort into making good practical choices whenever options arise. We emphasise that arguably the two best known bandwidth selection methods cannot be advocated for general practical use; these are "least squares cross-validation" (which suffers from too much variability) and normal-based "rules-of-thumb" (which are too biased towards oversmoothing). A number of methods that do seem to be worthy of further consideration are listed. We show why our overall current preference is for the method of Sheather and Jones (1991). It is hoped that the lessons learned in this comparatively simple setting will also prove useful in many other smoothing situations.
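
As a rough illustration of the two best-known selectors the abstract cautions against, the sketch below computes a normal-based rule-of-thumb bandwidth and a least squares cross-validation (LSCV) bandwidth. It assumes a Gaussian kernel, the standard normal-reference constant (4/(3n))^(1/5), and a simple grid search; the function names and the simulated two-component normal mixture are hypothetical and do not come from the paper. It does not implement the Sheather and Jones (1991) method that the authors prefer.

```python
import numpy as np


def normal_pdf(x, s):
    """Density of N(0, s^2) evaluated at x."""
    return np.exp(-0.5 * (x / s) ** 2) / (s * np.sqrt(2.0 * np.pi))


def normal_reference_bandwidth(x):
    """Normal-based rule of thumb for a Gaussian kernel (assumed form)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    return (4.0 / (3.0 * n)) ** 0.2 * x.std(ddof=1)


def lscv_score(x, h):
    """LSCV criterion: integral of fhat^2 minus twice the mean leave-one-out density."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x[:, None] - x[None, :]  # all pairwise differences
    # The integral of the squared Gaussian-kernel estimate has a closed form:
    # a double sum of normal densities with standard deviation sqrt(2) * h.
    integral_sq = normal_pdf(d, np.sqrt(2.0) * h).sum() / n**2
    # Sum over i != j, obtained by removing the n diagonal terms.
    off_diag = normal_pdf(d, h).sum() - n * normal_pdf(0.0, h)
    return integral_sq - 2.0 * off_diag / (n * (n - 1))


def lscv_bandwidth(x, grid):
    """Grid value minimising the LSCV criterion (grid search is an assumption)."""
    scores = [lscv_score(x, h) for h in grid]
    return grid[int(np.argmin(scores))]


# Hypothetical bimodal sample, used only to compare the two selectors.
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-2.0, 1.0, 150), rng.normal(2.0, 0.5, 150)])
grid = np.linspace(0.05, 1.5, 60)
print("normal reference:", normal_reference_bandwidth(sample))
print("LSCV:", lscv_bandwidth(sample, grid))
```

Rerunning the comparison with different seeds is one informal way to see the sampling variability of the LSCV choice that the abstract mentions.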

Original language: English
Pages (from-to): 337-381
Number of pages: 45
Journal: Computational Statistics
Volume: 11
Issue number: 3
State: Published - 1996

Keywords

  • Automatic methods
  • Cross-validation
  • Curve estimation
  • Functional estimation
  • Mean integrated squared error
  • Normal mixture
  • Oversmoothing
  • Rates of convergence
  • Scale estimation
  • Smoothed bootstrapping
  • Smoothing parameter

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Computational Mathematics
