Robust statistical methods for hit selection in RNA interference high-throughput screening experiments

Xiaohua Douglas Zhang, Xiting Cindy Yang, Namjin Chung, Adam Gates, Erica Stec, Priya Kunapuli, Dan J. Holder, Marc Ferner, Amy S. Espeseth

Research output: Contribution to journal; Article; peer-review

81 Scopus citations


RNA interference (RNAi) high-throughput screening (HTS) experiments carried out using large (>5000 short interfering [si]RNA) libraries generate a huge amount of data. To use these data to identify the most effective siRNAs tested, it is critical to adopt and develop appropriate statistical methods. To address the questions in hit selection of RNAi HTS, we proposed a quartile-based method that is robust to outliers, true hits and nonsymmetrical data. We compared it with the more traditional tests, mean ± k standard deviation (SD) and median ± k median absolute deviation (MAD). The results suggested that the quartile-based method selected more hits than mean ± k SD under the same preset error rate. The number of hits selected by median ± k MAD was close to that selected by the quartile-based method. Further analysis suggested that the quartile-based method had the greatest power in detecting true hits, especially weak or moderate true hits. Our investigation also suggested that platewise analysis (determining effective siRNAs on a plate-by-plate basis) can adjust for systematic errors in different plates, while an experimentwise analysis, in which effective siRNAs are identified in an analysis of the entire experiment, cannot. However, experimentwise analysis may detect a cluster of true positive hits placed together in one or several plates, while platewise analysis may not. To display hit selection results, we designed a specific figure called a plate-well series plot. We thus suggest the following strategy for hit selection in RNAi HTS experiments. First, choose the quartile-based method, or median ± k MAD, for identifying effective siRNAs. Second, perform the chosen method experimentwise on transformed/normalized data, such as percentage inhibition, to check the possibility of hit clusters. If a cluster of selected hits is observed, repeat the analysis based on untransformed data to determine whether the cluster is due to an artifact in the data. If no clusters of hits are observed, select hits by performing platewise analysis on transformed data. Third, adopt the plate-well series plot to visualize both the data and the hit selection results, as well as to check for artifacts.
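
The three hit-selection rules and the platewise versus experimentwise distinction can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the MAD is rescaled by the usual normal-consistency factor (about 1.4826), the quartile-based limits are taken to be Q1 − c·IQR and Q3 + c·IQR with c chosen so that the limits coincide with mean ± k SD for normally distributed data, and all function names and the toy plate values are hypothetical rather than taken from the paper.

```python
from statistics import NormalDist, mean, median, quantiles, stdev

# 75th percentile of the standard normal distribution (about 0.6745).
Z75 = NormalDist().inv_cdf(0.75)


def mean_sd_limits(values, k=3.0):
    """Classical limits: mean +/- k * SD (sensitive to outliers and hits)."""
    m, s = mean(values), stdev(values)
    return m - k * s, m + k * s


def median_mad_limits(values, k=3.0):
    """Robust limits: median +/- k * MAD, with the MAD rescaled (assumption)
    so that it estimates the SD when the data are normal."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scaled = mad / Z75  # same as multiplying by about 1.4826
    return med - k * scaled, med + k * scaled


def quartile_c(k=3.0):
    """Assumed constant c so that Q3 + c*IQR equals mean + k*SD under normality."""
    return (k - Z75) / (2 * Z75)


def quartile_limits(values, k=3.0):
    """Quartile-based limits: Q1 - c*IQR and Q3 + c*IQR (assumed formulation)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    c = quartile_c(k)
    return q1 - c * (q3 - q1), q3 + c * (q3 - q1)


def select_hits(values, limits):
    """Return the indices of wells whose value falls outside the limits."""
    lo, hi = limits
    return [i for i, v in enumerate(values) if v < lo or v > hi]


def platewise_hits(plates, limit_fn, k=3.0):
    """Compute limits separately for each plate; returns (plate, well) pairs."""
    hits = []
    for p, plate in enumerate(plates):
        hits.extend((p, w) for w in select_hits(plate, limit_fn(plate, k)))
    return hits


def experimentwise_hits(plates, limit_fn, k=3.0):
    """Compute one set of limits from all wells pooled across plates."""
    pooled = [v for plate in plates for v in plate]
    limits = limit_fn(pooled, k)
    return [(p, w) for p, plate in enumerate(plates)
            for w in select_hits(plate, limits)]


if __name__ == "__main__":
    # Toy data (hypothetical): one strong value in the last well, and a
    # second plate that carries a systematic shift of +20.
    plate = [10, 11, 9, 10, 12, 8, 10, 11, 9, 30]
    shifted = [v + 20 for v in plate]
    for fn in (mean_sd_limits, median_mad_limits, quartile_limits):
        print(fn.__name__, select_hits(plate, fn(plate)))
    print("platewise", platewise_hits([plate, shifted], median_mad_limits))
    print("experimentwise", experimentwise_hits([plate, shifted], median_mad_limits))
```

On this toy plate, mean ± 3 SD selects nothing because the single strong value inflates the SD, whereas the two robust rules flag that well. With the shifted second plate, the platewise analysis flags the last well on each plate, while the pooled experimentwise limits are widened by the plate offset and flag neither, which is one way of seeing why experimentwise analysis cannot adjust for systematic plate errors.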

Original language: English
Pages (from-to): 299-309
Number of pages: 11
Issue number: 3
State: Published - Apr 2006


Keywords

  • Experimentwise
  • High-throughput screening
  • Hit selection
  • Median
  • Median absolute deviation
  • Plate-well series plot
  • Platewise
  • Quartile-based method
  • RNA interference
  • Statistical methods

ASJC Scopus subject areas

  • Molecular Medicine
  • Genetics
  • Pharmacology

