Efficient computation of statistical procedures based on all subsets of a specified size

John E. Hinkle, Arnold J. Stromberg

Research output: Contribution to journal › Article › peer-review

1 Scopus citations

Abstract

Many statistical techniques require that computations be done on all subsets of size r in a data set of size n. Typically, this is done lexicographically, i.e., with nested for-loops. If a one-point exchange update formula is available, then it is used in the inner loop. In this paper we discuss a method of counting through all subsets of size r in a data set of size n by changing only one element between successive subsets. Such methods have been studied in the applied mathematics literature but are mostly unknown to statisticians. The advantage of such methods is that an update formula can be used at every step, thus potentially saving computation time. The method used to compute the next subset in the list requires some computation time, and thus the new method will only be faster if the update formula is sufficiently faster than doing the computation from scratch.
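The abstract does not say which one-element-exchange ordering the paper uses. As an illustration only, the sketch below uses the classic "revolving door" Gray code for combinations, in which the r-subsets of n items are the r-subsets that omit the last item, followed by the reversed list of (r-1)-subsets with the last item added. The function names `revolving_door` and `subset_sums` are hypothetical, and the subset sum stands in for any statistic that has a cheap exchange update.

```python
def revolving_door(n, r):
    """List all r-subsets of range(n) so that neighbours differ by one exchange.

    Illustrative revolving-door ordering; not necessarily the paper's algorithm.
    The full list is materialised for clarity, not for efficiency.
    """
    if r == 0:
        return [()]
    if r == n:
        return [tuple(range(n))]
    # Subsets that omit item n-1, then subsets that contain it (in reverse order).
    without_last = revolving_door(n - 1, r)
    with_last = [s + (n - 1,) for s in reversed(revolving_door(n - 1, r - 1))]
    return without_last + with_last


def subset_sums(data, r):
    """Sum every r-subset of data, updating with one exchange per step."""
    subsets = revolving_door(len(data), r)
    current = set(subsets[0])
    total = sum(data[i] for i in current)  # only the first subset is computed from scratch
    sums = [total]
    for s in subsets[1:]:
        new = set(s)
        (out_idx,) = current - new  # the single index that leaves
        (in_idx,) = new - current   # the single index that enters
        total += data[in_idx] - data[out_idx]  # exchange update
        current = new
        sums.append(total)
    return subsets, sums
```

For example, `revolving_door(4, 2)` returns `[(0, 1), (1, 2), (0, 2), (2, 3), (1, 3), (0, 3)]`: each subset differs from its predecessor by removing one index and adding another. As the abstract notes, a scheme like this only pays off when the exchange update is sufficiently cheaper than recomputing the statistic, since finding the next subset has its own cost.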

Original language: English
Pages (from-to): 489-500
Number of pages: 12
Journal: Communications in Statistics - Theory and Methods
Volume: 25
Issue number: 3
State: Published - 1996

Bibliographical note

Funding Information:
John Hinkle is a Ph.D. student and Arnold Stromberg is an Assistant Professor in the Department of Statistics at the University of Kentucky, 817 Patterson Office Tower, Lexington, KY 40506-0027. Hinkle was partially supported by the College of Arts and Sciences at the University of Kentucky, and initial work on this manuscript by Stromberg was supported by NSF grant DMS-9204380 and NSA grant MDA-904-92-H-3077. The authors thank Carl Brezovec and Carl Lee for their assistance.

Keywords

  • Cook's distance
  • Distributed computing
  • Gray code
  • Jackknife

ASJC Scopus subject areas

  • Statistics and Probability
