Abstract
Robust regression methods have many potential applications in big data problems. In this paper, we consider two such applications using publicly available data. The first application models taxi fares based on the trip distance of n = 49,800 taxi rides in New York City on Tuesday, January 15, 2013. The second application models airfare from the miles flown for n = 78,905 round-trip itineraries for single passengers, each consisting of two direct one-way flights within the contiguous domestic US market on Southwest Airlines in the fourth quarter of 2014. The robust estimates for both applications were obtained using PROC ROBUSTREG in SAS 9.4. In both cases, we find that the confidence intervals around the robust estimates of the regression parameters are very narrow, typically $0.01 or less. With confidence intervals this narrow, one is left with the impression that the robust estimates differ in some meaningful way across at least some of the robust methods. Finally, utilizing findings in Cox (Biometrika, 102:712–716, 2015), we argue that in such applications it is not surprising that the confidence intervals around the robust estimates are very narrow, thus producing the illusion of very high precision.
Original language | English |
---|---|
Title of host publication | Robust Rank-Based and Nonparametric Methods - Selected, Revised, and Extended Contributions |
Editors | Joseph W. McKean, Regina Y. Liu |
Pages | 101-120 |
Number of pages | 20 |
State | Published - 2016 |
Event | International Conference on Robust Rank-Based and Nonparametric Methods, 2015 - Kalamazoo, United States. Duration: Apr 9 2015 → Apr 10 2015 |
Publication series
Name | Springer Proceedings in Mathematics and Statistics |
---|---|
Volume | 168 |
ISSN (Print) | 2194-1009 |
ISSN (Electronic) | 2194-1017 |
Conference
Conference | International Conference on Robust Rank-Based and Nonparametric Methods, 2015 |
---|---|
Country/Territory | United States |
City | Kalamazoo |
Period | 4/9/15 → 4/10/15 |
Bibliographical note
Publisher Copyright: © Springer International Publishing Switzerland 2016.
Keywords
- Big data
- Illusion of very high precision
- Robust regression
- SAS
ASJC Scopus subject areas
- General Mathematics