TY - JOUR
T1 - Robust statistical methods for hit selection in RNA interference high-throughput screening experiments
AU - Zhang, Xiaohua Douglas
AU - Yang, Xiting Cindy
AU - Chung, Namjin
AU - Gates, Adam
AU - Stec, Erica
AU - Kunapuli, Priya
AU - Holder, Dan J.
AU - Ferner, Marc
AU - Espeseth, Amy S.
PY - 2006/4
Y1 - 2006/4
N2 - RNA interference (RNAi) high-throughput screening (HTS) experiments carried out using large (>5000 short interfering [si]RNA) libraries generate a huge amount of data. In order to use these data to identify the most effective siRNAs tested, it is critical to adopt and develop appropriate statistical methods. To address the questions in hit selection of RNAi HTS, we proposed a quartile-based method which is robust to outliers, true hits and nonsymmetrical data. We compared it with the more traditional tests, mean ± k standard deviation (SD) and median ± 3 median of absolute deviation (MAD). The results suggested that the quartile-based method selected more hits than mean ± k SD under the same preset error rate. The number of hits selected by median ± k MAD was close to that by the quartile-based method. Further analysis suggested that the quartile-based method had the greatest power in detecting true hits, especially weak or moderate true hits. Our investigation also suggested that platewise analysis (determining effective siRNAs on a plate-by-plate basis) can adjust for systematic errors in different plates, while an experimentwise analysis, in which effective siRNAs are identified in an analysis of the entire experiment, cannot. However, experimentwise analysis may detect a cluster of true positive hits placed together in one or several plates, while platewise analysis may not. To display hit selection results, we designed a specific figure called a plate-well series plot. We thus suggest the following strategy for hit selection in RNAi HTS experiments. First, choose the quartile-based method, or median ± k MAD, for identifying effective siRNAs. Second, perform the chosen method experimentwise on transformed/normalized data, such as percentage inhibition, to check the possibility of hit clusters. If a cluster of selected hits is observed, repeat the analysis based on untransformed data to determine whether the cluster is due to an artifact in the data. If no clusters of hits are observed, select hits by performing platewise analysis on transformed data. Third, adopt the plate-well series plot to visualize both the data and the hit selection results, as well as to check for artifacts.
KW - Experimentwise
KW - High-throughput screening
KW - Hit selection
KW - Median
KW - Median absolute deviation
KW - Plate-well series plot
KW - Platewise
KW - Quartile-based method
KW - RNA interference
KW - Statistical methods
UR - http://www.scopus.com/inward/record.url?scp=33646034336&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33646034336&partnerID=8YFLogxK
U2 - 10.2217/14622416.7.3.299
DO - 10.2217/14622416.7.3.299
M3 - Article
C2 - 16610941
AN - SCOPUS:33646034336
SN - 1462-2416
VL - 7
SP - 299
EP - 309
JO - Pharmacogenomics
JF - Pharmacogenomics
IS - 3
ER -