SCMiner: Localizing system-level concurrency faults from large system call traces

Tarannum Shaila Zaman, Xue Han, Tingting Yu

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

11 Scopus citations

Abstract

Localizing concurrency faults that occur in production is hard because, (1) detailed field data, such as user input, file content and interleaving schedule, may not be available to developers to reproduce the failure; (2) it is often impractical to assume the availability of multiple failing executions to localize the faults using existing techniques; (3) it is challenging to search for buggy locations in an application given limited runtime data; and, (4) concurrency failures at the system level often involve multiple processes or event handlers (e.g., software signals), which can not be handled by existing tools for diagnosing intra-process(thread-level) failures. To address these problems, we present SCMiner, a practical online bug diagnosis tool to help developers understand how a system-level concurrency fault happens based on the logs collected by the default system audit tools. SCMiner achieves online bug diagnosis to obviate the need for offline bug reproduction. SCMiner does not require code instrumentation on the production system or rely on the assumption of the availability of multiple failing executions. Specifically, after the system call traces are collected, SCMiner uses data mining and statistical anomaly detection techniques to identify the failure-inducing system call sequences. It then maps each abnormal sequence to specific application functions. We have conducted an empirical study on 19 real-world benchmarks. The results show that SCMiner is both effective and efficient at localizing system-level concurrency faults.

Original languageEnglish
Title of host publicationProceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
Pages515-526
Number of pages12
ISBN (Electronic)9781728125084
DOIs
StatePublished - Nov 2019
Event34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019 - San Diego, United States
Duration: Nov 10 2019Nov 15 2019

Publication series

NameProceedings - 2019 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019

Conference

Conference34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019
Country/TerritoryUnited States
CitySan Diego
Period11/10/1911/15/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

Keywords

  • Concurrency Failures
  • Fault Localization
  • Multi Process Applications

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software
  • Control and Optimization

Fingerprint

Dive into the research topics of 'SCMiner: Localizing system-level concurrency faults from large system call traces'. Together they form a unique fingerprint.

Cite this