Models for monitoring and debugging tools for parallel and distributed software

Dan C. Marinescu, James E. Lumpp, Thomas L. Casavant, Howard Jay Siegel

Research output: Contribution to journal › Article › peer-review

39 Scopus citations

Abstract

Several aspects of the multidimensional problem of providing monitoring support tools for the debugging and performance analysis of software for distributed and parallel systems are presented. A formal event-action model at the process level and a layered architectural model are introduced. The application of the event-action model to the development of the layered architectural model is shown. This effort was motivated by the need to understand the ways in which a monitoring system may intrude upon a monitored system. An understanding of the fundamental ideas underlying the relationship between monitoring and monitored systems is necessary to build practical tools for software development. These models are currently being used in the development of monitoring tools for the PASM parallel processing system prototype.

Original language: English
Pages (from-to): 171-184
Number of pages: 14
Journal: Journal of Parallel and Distributed Computing
Volume: 9
Issue number: 2
State: Published - Jun 1990

Bibliographical note

Funding Information:
This work was supported by the National Science Foundation under Grants CCR-8704826 and CCR-8809600, by the NSF Software Engineering Research Center (SERC), by the SDI under ARO Contract DAAL03-86K-0106, and by the Naval Ocean Systems Center under the High Performance Computing Block, ONT. This portion of this project was done, in part, while the authors were at Purdue University.

The degree of intrusion that can be tolerated depends upon the nature of the application and upon the desired results of monitoring. To determine this level, it is important to understand that an intrusive monitor perturbs the execution of the parallel program by altering in an arbitrary manner the timing of events in the multiple threads of control being monitored. Multiple threads of control are defined as the case in which multiple processors are executing (potentially) independent instructions. Altering the timing of events may: (a) lead to incorrect results, (b) create (or mask) deadlock situations when the order of events in different threads of control is affected, (c) cause a real-time program to fail to meet its deadlines, (d) increase drastically the execution time of the program being monitored, (e) make the debugging of a parallel program a difficult task.
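The effect described above, in which monitoring overhead alters the timing and therefore the order of events across threads of control, can be illustrated with a small deterministic simulation. This is only a sketch under stated assumptions and is not the event-action model of the article: the event names, the durations, and the `probe_cost` parameter are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

# (completion time, thread name, event label)
Event = Tuple[float, str, str]


def timeline(thread: str,
             steps: List[Tuple[str, float]],
             probe_cost: float = 0.0) -> List[Event]:
    """Compute completion times for the events of one thread of control.

    probe_cost is a hypothetical per-event monitoring overhead that is
    added to the duration of every event in this thread.
    """
    t = 0.0
    events: List[Event] = []
    for label, duration in steps:
        t += duration + probe_cost
        events.append((t, thread, label))
    return events


def global_order(*timelines: List[Event]) -> List[str]:
    """Merge per-thread timelines into one globally ordered event list."""
    merged = sorted(e for tl in timelines for e in tl)
    return [f"{thread}:{label}" for _, thread, label in merged]


# Illustrative workloads: event labels and durations in arbitrary time units.
steps_a = [("a1", 1.0), ("a2", 2.0)]
steps_b = [("b1", 2.0), ("b2", 2.0)]

# No monitoring on either thread.
unmonitored = global_order(timeline("A", steps_a), timeline("B", steps_b))

# Only thread A is instrumented, so only its events are delayed.
monitored = global_order(timeline("A", steps_a, probe_cost=1.5),
                         timeline("B", steps_b))

print(unmonitored)
print(monitored)
```

In this toy setup the instrumented run observes the events of thread B ahead of those of thread A, an interleaving that does not occur in the uninstrumented run. A change of this kind is what can create or mask order-dependent behavior such as deadlock.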

Funding

Funders: Funder number
ARO: DAAL03-86K-0106
Naval Ocean Systems Center
National Science Foundation: CCR-8704826, CCR-8809600

    ASJC Scopus subject areas

    • Software
    • Theoretical Computer Science
    • Hardware and Architecture
    • Computer Networks and Communications
    • Artificial Intelligence
