TY - GEN
T1 - Validation and implication of segmentation on Empirical Bayes for highway safety studies
AU - Souleyrette, R. R.
AU - Haas, R. P.
AU - Maze, T. H.
PY - 2007
Y1 - 2007
N2 - Typically, crash frequency is modelled as Poisson, where the standard deviation is the square root of the expected number. If the expected number of crashes is small, the standard deviation is a large percentage of the expected number of crashes, and the observed number of crashes provides only a crude estimate of the expected number. A better estimate is obtained when the expected number is large. For a specific location, there are two approaches for performing measurements where the expected number of crashes is large. One approach is to measure over a long period of time. However, data are not often available for long periods. Even if available, changes in conditions over time, such as increases in traffic volumes or improvements in infrastructure, may limit the useful time frame. Another approach is to perform measurements over a large number of similar locations, providing a relatively precise estimate of the distribution. Then, one can use the Empirical Bayes (EB) approach to combine the relatively precise estimate of the distribution with the less precise estimate of the expected number at the location of interest, resulting in an improved estimate of the expected number at that location. This paper explores the two approaches. It uses multiple years of data from the Highway Safety Information System for California intersections and highway links from the State of Iowa. Data from a single year are used to estimate the expected number of crashes at locations, following the EB approach. Data from multiple years at each location are then used to estimate the expected number of crashes at those locations, and the results from the two approaches are compared. No such large-scale validation has yet been performed. The effect of a priori segmentation of the highway system is also explored. Longer, homogeneous sections are found both to improve the statistical validity of models and to improve the EB correction of one-year section crash estimates.
AB - Typically, crash frequency is modelled as Poisson, where the standard deviation is the square root of the expected number. If the expected number of crashes is small, the standard deviation is a large percentage of the expected number of crashes, and the observed number of crashes provides only a crude estimate of the expected number. A better estimate is obtained when the expected number is large. For a specific location, there are two approaches for performing measurements where the expected number of crashes is large. One approach is to measure over a long period of time. However, data are not often available for long periods. Even if available, changes in conditions over time, such as increases in traffic volumes or improvements in infrastructure, may limit the useful time frame. Another approach is to perform measurements over a large number of similar locations, providing a relatively precise estimate of the distribution. Then, one can use the Empirical Bayes (EB) approach to combine the relatively precise estimate of the distribution with the less precise estimate of the expected number at the location of interest, resulting in an improved estimate of the expected number at that location. This paper explores the two approaches. It uses multiple years of data from the Highway Safety Information System for California intersections and highway links from the State of Iowa. Data from a single year are used to estimate the expected number of crashes at locations, following the EB approach. Data from multiple years at each location are then used to estimate the expected number of crashes at those locations, and the results from the two approaches are compared. No such large-scale validation has yet been performed. The effect of a priori segmentation of the highway system is also explored. Longer, homogeneous sections are found both to improve the statistical validity of models and to improve the EB correction of one-year section crash estimates.
KW - Count models
KW - Crash frequency estimations
KW - Empirical Bayes
KW - Segmentation for crash sampling
UR - http://www.scopus.com/inward/record.url?scp=38849100295&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38849100295&partnerID=8YFLogxK
U2 - 10.2495/EHR070101
DO - 10.2495/EHR070101
M3 - Conference contribution
AN - SCOPUS:38849100295
SN - 9781845640835
T3 - WIT Transactions on Biomedicine and Health
SP - 85
EP - 94
BT - Environmental Health Risk IV
T2 - 4th International Conference on the Impact of Environmental Factors on Health 2007
Y2 - 27 June 2007 through 29 June 2007
ER -