Several aspects of the multidimensional problem of providing monitoring support tools for the debugging and performance analysis of software for distributed and parallel systems are presented. A formal event-action model at the process level and a layered architectural model are introduced. The application of the event-action model to the development of the layered architectural model is shown. This effort was motivated by the need to understand the ways in which a monitoring system may intrude upon a monitored system. An understanding of the fundamental ideas underlying the relationship between monitoring and monitored systems is necessary to build practical tools for software development. These models are currently being used in the development of monitoring tools for the PASM parallel processing system prototype.
|Number of pages|14|
|Journal|Journal of Parallel and Distributed Computing|
|State|Published - Jun 1990|
Bibliographical note
Funding Information:
This work was supported by the National Science Foundation under Grants CCR-8704826 and CCR-8809600, by the NSF Software Engineering Research Center (SERC), by the SDI under ARO Contract DAAL03-86-K-0106, and by the Naval Ocean Systems Center under the High Performance Computing Block, ONT. This portion of this project was done, in part, while the authors were at Purdue University.
The degree of intrusion that can be tolerated depends upon the nature of the application and upon the desired results of monitoring. To determine this level, it is important to understand that an intrusive monitor perturbs the execution of the parallel program by altering, in an arbitrary manner, the timing of events in the multiple threads of control being monitored. Multiple threads of control are defined as the case in which multiple processors are executing (potentially) independent instructions. Altering the timing of events may: (a) lead to incorrect results, (b) create (or mask) deadlock situations when the order of events in different threads of control is affected, (c) cause a real-time program to fail to meet its deadlines, (d) drastically increase the execution time of the program being monitored, or (e) make the debugging of a parallel program a difficult task.
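The effect of perturbed event timing on event order can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions and is not the event-action model of the paper: the function `global_order`, the event names `write_x` and `read_x`, the thread labels `P0` and `P1`, and the fixed per-event probe cost are all hypothetical. Each thread of control is modeled as a list of events with integer durations, and a monitor is modeled as a constant delay added before each event of an instrumented thread.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: an event is (name, time units elapsed since the
# previous event in the same thread of control).
Event = Tuple[str, int]


def global_order(
    threads: Dict[str, List[Event]],
    probe_cost: Optional[Dict[str, int]] = None,
) -> List[str]:
    """Return event names in the order they occur across all threads.

    probe_cost maps a thread label to the (assumed constant) delay that
    an intrusive monitor adds before each event of that thread.
    """
    probe_cost = probe_cost or {}
    timeline = []
    for thread_id, events in threads.items():
        clock = 0
        cost = probe_cost.get(thread_id, 0)
        for name, duration in events:
            clock += duration + cost  # monitoring delays the event
            timeline.append((clock, thread_id, name))
    timeline.sort()  # order by time; ties broken by thread label, then name
    return [name for _, _, name in timeline]


# Two threads of control sharing a variable x (illustrative only).
threads = {
    "P0": [("write_x", 10)],  # P0 writes x at t = 10
    "P1": [("read_x", 15)],   # P1 reads x at t = 15
}

unmonitored = global_order(threads)
monitored = global_order(threads, probe_cost={"P0": 10})

print(unmonitored)  # ['write_x', 'read_x']
print(monitored)    # ['read_x', 'write_x']
```

In this toy case, instrumenting only `P0` delays its write until after the read in `P1`, so the monitored run observes a different event order than the unmonitored run. This corresponds to cases (a) and (b) above: a change in the relative order of events in different threads of control can change results or create or mask synchronization problems.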
ASJC Scopus subject areas
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence