Abstract
We examine the effects of stemming on the tracing of software engineering artifacts. We compare two common stemming algorithms, the Porter stemmer and the Krovetz stemmer, to each other as well as to a baseline of no stemming. We evaluate the algorithms on eight tracing datasets. We run the experiment using the TraceLab experimental framework to allow for ease of repeatability and knowledge sharing among the tracing community. We compare the algorithms on precision at recall levels from 0.1 to 1.0 in increments of 0.1, as well as on mean average precision values. The experiment indicated that neither the Porter stemmer nor the Krovetz stemmer outperformed the other on all datasets tested.
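The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the TraceLab implementation used in the study. The function names are hypothetical, and the use of interpolated precision (the best precision at or beyond each recall level) is an assumption, since the abstract does not state how precision at a recall level was computed. Each input is a ranked list of candidate links for one source artifact, with `True` marking a correct link.

```python
from typing import List, Sequence

# Recall levels 0.1, 0.2, ..., 1.0, as listed in the abstract.
RECALL_LEVELS = [round(0.1 * i, 1) for i in range(1, 11)]


def precision_at_recall_levels(
    ranked_relevance: Sequence[bool],
    levels: Sequence[float] = RECALL_LEVELS,
) -> List[float]:
    """Interpolated precision at each recall level (assumed definition)."""
    total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        return [0.0] * len(levels)
    # Record (recall, precision) after each retrieved candidate link.
    points = []
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        hits += int(is_relevant)
        points.append((hits / total_relevant, hits / rank))
    result = []
    for level in levels:
        # Small tolerance guards against floating-point rounding.
        reachable = [p for r, p in points if r >= level - 1e-9]
        result.append(max(reachable) if reachable else 0.0)
    return result


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Mean of the precision values at the rank of each correct link."""
    total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_relevant


def mean_average_precision(queries: Sequence[Sequence[bool]]) -> float:
    """Average of average_precision over all source artifacts (queries)."""
    return sum(average_precision(q) for q in queries) / len(queries)
```

Under these assumptions, comparing a stemmer against the no-stemming baseline amounts to producing one ranked list per source artifact for each configuration and comparing the resulting values per dataset.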
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2019 IEEE/ACM 10th International Workshop on Software and Systems Traceability, SST 2019 |
| Pages | 37-44 |
| Number of pages | 8 |
| ISBN (Electronic) | 9781728122557 |
| State | Published - May 2019 |
| Event | 10th IEEE/ACM International Workshop on Software and Systems Traceability, SST 2019 - Montreal, Canada; Duration: May 27 2019 → … |
Publication series
| Name | Proceedings - 2019 IEEE/ACM 10th International Workshop on Software and Systems Traceability, SST 2019 |
|---|---|
Conference
| Conference | 10th IEEE/ACM International Workshop on Software and Systems Traceability, SST 2019 |
|---|---|
| Country/Territory | Canada |
| City | Montreal |
| Period | 5/27/19 → … |
Bibliographical note
Publisher Copyright: © 2019 IEEE.
Funding
We thank NSF for partially funding this work under grants CCF-1511117 and CICI 1642134.
| Funders | Funder number |
|---|---|
| National Science Foundation (NSF) | CCF-1511117, CICI 1642134 |
Keywords
- Empirical research
- Stemming
- Traceability
ASJC Scopus subject areas
- Software
- Safety, Risk, Reliability and Quality