TY - GEN
T1 - CoLUA
AU - Wen, Wei
AU - Yu, Tingting
AU - Hayes, Jane Huffman
PY - 2016/12/5
Y1 - 2016/12/5
N2 - Configuration bugs are among the dominant causes of software failures. Software organizations often use bug tracking systems to manage bug reports collected from developers and users. For software developers to understand and reproduce configuration bugs, it is vital for them to know whether a bug in the bug report is related to configuration issues; this is often not easily discerned because bug reports lack easy-to-spot terminology. In addition, to locate and fix a configuration bug, a developer needs to know which configuration options are associated with the bug. To address these two problems, we introduce CoLUA, a two-step automated approach that combines natural language processing, information retrieval, and machine learning. In the first step, CoLUA selects features from the textual information in the bug reports and uses various machine learning techniques to build classification models; developers can use these models to label a bug report as either a configuration bug report or a non-configuration bug report. In the second step, CoLUA identifies which configuration options are involved in the labeled configuration bug reports. We evaluate CoLUA on 900 bug reports from three large open-source software systems. The results show that CoLUA predicts configuration bug reports with high accuracy and that it effectively identifies the root causes of configuration options.
AB - Configuration bugs are among the dominant causes of software failures. Software organizations often use bug tracking systems to manage bug reports collected from developers and users. For software developers to understand and reproduce configuration bugs, it is vital for them to know whether a bug in the bug report is related to configuration issues; this is often not easily discerned because bug reports lack easy-to-spot terminology. In addition, to locate and fix a configuration bug, a developer needs to know which configuration options are associated with the bug. To address these two problems, we introduce CoLUA, a two-step automated approach that combines natural language processing, information retrieval, and machine learning. In the first step, CoLUA selects features from the textual information in the bug reports and uses various machine learning techniques to build classification models; developers can use these models to label a bug report as either a configuration bug report or a non-configuration bug report. In the second step, CoLUA identifies which configuration options are involved in the labeled configuration bug reports. We evaluate CoLUA on 900 bug reports from three large open-source software systems. The results show that CoLUA predicts configuration bug reports with high accuracy and that it effectively identifies the root causes of configuration options.
KW - bug reports
KW - configuration
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85013306212&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85013306212&partnerID=8YFLogxK
U2 - 10.1109/ISSRE.2016.29
DO - 10.1109/ISSRE.2016.29
M3 - Conference contribution
AN - SCOPUS:85013306212
T3 - Proceedings - International Symposium on Software Reliability Engineering, ISSRE
SP - 150
EP - 161
BT - Proceedings - 2016 IEEE 27th International Symposium on Software Reliability Engineering, ISSRE 2016
Y2 - 23 October 2016 through 27 October 2016
ER -