Statistical Methods for Environmental Data Subject to Detection Limits

Grants and Contracts Details


As researchers investigate the relationship between cancer and exposure to environmental chemicals such as trace elements, pesticides, and dioxins, they often find concentrations below the limit at which a value is deemed reliable enough to report numerically. The detection limit (DL) may be a fixed number in some studies, but in others it can vary widely from sample to sample. In the latter case, the DL may be correlated with the exposure level, as observed in a colon cancer study in Kentucky. Data subject to DLs present challenges for analysis and interpretation. In this proposal we focus on two important statistical problems encountered in the analysis of data from environmental epidemiologic studies: (a) estimation of the chemical distribution in a specific group; and (b) comparison of distributions among groups. For these two problems, ad hoc, parametric, and nonparametric methods have been proposed. Ad hoc methods are ill-advised unless relatively few measurements fall below DLs, and parametric methods can lead to markedly biased results when the parametric model is misspecified. Nonparametric methods have received increasing attention in recent years because of their robustness. However, current nonparametric methods simply borrow the commonly used methods for right-censored survival data and do not take into account two unique characteristics of environmental exposure data with DLs: (a) it is not meaningful to define the hazard function for an exposure measurement; and (b) DL values are observable for all subjects, including those whose actual exposure levels are detected. In addition, current nonparametric methods do not allow for sampling weights, which are typically present in survey data such as the National Health and Nutrition Examination Survey (NHANES).
Because of these issues, current nonparametric methods may cause four problems in the analysis of environmental exposure data with DLs: (a) lack of meaningful interpretation; (b) inefficient results; (c) inability to handle correlation between the exposure level and the DL; and (d) inability to handle survey data with sampling weights. To address these problems, we will develop unified and efficient nonparametric estimation and testing methods that can (a) deal with possible correlation between the exposure level and the DL and (b) incorporate sampling weights. We will adapt state-of-the-art methods for censored survival data to environmental exposure data with DLs. The proposed methods will be applied to data from a recently conducted colon cancer case-control study in Kentucky, an ongoing lung cancer case-control study in Kentucky, and NHANES.
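For readers unfamiliar with the borrowed survival-analysis approach that the abstract critiques, the following is a minimal sketch of it, not of the methods proposed in this project. It treats a nondetect as a left-censored value recorded at its DL and applies the Kaplan-Meier product-limit idea in reverse to estimate the cumulative distribution of the exposure. The function name `reverse_km_cdf` and its arguments are hypothetical, chosen for illustration. The sketch assumes the DL is independent of the exposure level and ignores sampling weights, which are exactly the limitations described above.

```python
from collections import Counter


def reverse_km_cdf(values, detected):
    """Estimate F(v) = P(X <= v) from left-censored data.

    values   -- measured concentration if detected, else the sample's DL
    detected -- True if the concentration was at or above its DL

    Returns a sorted list of (v, F(v)) at each distinct detected value.
    Illustrative sketch only: assumes DLs are independent of exposure
    and that every observation carries equal weight.
    """
    # Number of detected observations at each distinct value.
    det_counts = Counter(v for v, d in zip(values, detected) if d)

    cdf = {}
    prod = 1.0
    # Walk from the largest detected value down to the smallest.
    for v in sorted(det_counts, reverse=True):
        # F(v) is the product of factors for detected values above v.
        cdf[v] = prod
        # "At risk" at v: every observation recorded at or below v,
        # including nondetects whose DL is at or below v.
        at_risk = sum(1 for x in values if x <= v)
        prod *= 1.0 - det_counts[v] / at_risk
    return sorted(cdf.items())


if __name__ == "__main__":
    # Two nondetects at DL = 0.5 and two detected values.
    for v, f in reverse_km_cdf([0.5, 0.5, 1.0, 2.0],
                               [False, False, True, True]):
        print(v, round(f, 4))
```

With no nondetects, this estimate reduces to the ordinary empirical distribution function. Probability mass that remains below the smallest detected value is not assigned to any specific concentration, which is one reason interpretation below the DLs requires care.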
Effective start/end date: 5/1/15 → 4/30/17


  • National Cancer Institute: $150,459.00

