Mixed modeling and sample size calculations for identifying housekeeping genes

Hongying Dai, Richard Charnigo, Carrie A. Vyhlidal, Bridgette L. Jones, Madhusudan Bhandary

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


Normalization of gene expression data using internal control genes that have biologically stable expression levels is an important step in analyzing reverse transcription polymerase chain reaction (RT-PCR) data. We propose a three-way linear mixed-effects model to select optimal housekeeping genes. The mixed-effects model can accommodate multiple continuous and/or categorical variables with sample random effects, gene fixed effects, systematic effects, and gene by systematic effect interactions. We propose using the intraclass correlation coefficient among gene expression levels as the stability measure to select housekeeping genes that have low within-sample variation. Global hypothesis testing is proposed to ensure that the selected housekeeping genes are free of systematic effects and of gene by systematic effect interactions. The gene combination with the highest lower bound of the 95% confidence interval for the intraclass correlation coefficient and no significant systematic effects is selected for normalization. A sample size calculation based on the estimation accuracy of the stability measure is offered to help practitioners design experiments to identify housekeeping genes. We compare our methods with geNorm and NormFinder in three case studies. A free software package written in SAS (Cary, NC, U.S.A.) is available at http://d.web.umkc.edu/daih under the software tab.
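
As a rough illustration of the stability measure described above, the following Python sketch estimates an intraclass correlation coefficient for a set of candidate genes and ranks gene combinations by it. It is not the authors' SAS package and it does not fit their three-way model: it assumes a simplified two-way layout (random sample effects, fixed gene effects, no systematic effects or interactions), uses the classical ANOVA estimator, and ranks by the point estimate rather than by the lower bound of the 95% confidence interval. The function names `icc_consistency` and `rank_gene_sets` and the input layout are hypothetical.

```python
from itertools import combinations


def icc_consistency(rows):
    """ANOVA estimate of the ICC for a samples-by-genes table.

    rows[i][j] is the expression value (for example a Cq value) of gene j
    in sample i. Assumed model (a simplification of the abstract's model):
        y_ij = mu + gene_j + sample_i + error_ij
    with gene_j fixed and sample_i random, so that
        ICC = var(sample) / (var(sample) + var(error)).
    """
    n, k = len(rows), len(rows[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two samples and two genes")
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    # Between-sample mean square.
    ms_sample = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    # Residual mean square after removing sample and gene effects.
    ss_error = sum(
        (rows[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_sample - ms_error) / (ms_sample + (k - 1) * ms_error)


def rank_gene_sets(expr, size=2):
    """Rank all gene combinations of a given size by estimated ICC.

    expr maps a gene name to its list of values, one per sample, with
    samples in the same order for every gene. Returns (icc, genes) pairs,
    highest ICC first.
    """
    scored = []
    for combo in combinations(sorted(expr), size):
        rows = [list(values) for values in zip(*(expr[g] for g in combo))]
        scored.append((icc_consistency(rows), combo))
    return sorted(scored, reverse=True)
```

Calling `rank_gene_sets(expr, size=2)` on a dictionary of per-sample values returns the candidate pairs ordered from most to least stable under this simplified measure. A fuller treatment along the lines of the abstract would fit the mixed model with systematic effects, test those effects globally, and compare confidence-interval lower bounds.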

Original language: English
Pages (from-to): 3115-3125
Number of pages: 11
Journal: Statistics in Medicine
Issue number: 18
State: Published - Aug 15 2013


Keywords

  • Housekeeping gene
  • Intraclass correlation coefficient (ICC)
  • Linear mixed-effects model (LMM)
  • Normalization
  • RT-PCR
  • Systematic effect

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

