Consistent detection of global predicates

Robert Cooper, Keith Marzullo
1991 SIGPLAN Notices
A fundamental problem in debugging and monitoring is detecting whether the state of a system satisfies some predicate. If the system is distributed, then the resulting uncertainty in the state of the system makes such detection, in general, ill-defined. This paper presents three algorithms for detecting global predicates in a well-defined way. These algorithms do so by interpreting predicates with respect to the communication that has occurred in the system. Briefly, the first algorithm determines that the predicate was possibly true at some point in the past; the second determines that the predicate was definitely true in the past; while the third algorithm establishes that the predicate is currently true, but to do so it may delay the execution of certain processes. Our approach is in contrast to the considerable body of work that uses temporal predicates (i.e., predicates expressed over process histories) for distributed monitoring. Temporal predicates are more powerful, but also more complex to use. In many cases, the condition that the programmer wishes to monitor is simply and intuitively viewed as a predicate over the "instantaneous" state of the system. Using the possibly/definitely/currently interpretation, such a predicate becomes well-defined, without requiring it to be recast using temporal formulas. Further, our algorithms may be more efficient than techniques that use a notion of explicit time or process histories. Section 1 specifies the protocols and Section 2 gives an outline of their operation. This work arose as part of Meta, a toolkit supporting distributed system monitoring and control. The architecture addressed by Meta is more general than that needed for debugging, in that we are also concerned with several monitoring components reacting in a consistent and fault-tolerant manner. Section 3 discusses how the algorithms of this paper can be used to provide a breakpoint and tracepoint facility for Meta.

This work was supported by the Defense Advanced Research Projects Agency (DoD) under NASA Ames grant number NAG 2-593, Contract N00140-87-C-8904, and by grants from IBM and Siemens. The views, opinions, and findings contained in this report are those of the authors and should not be construed as an official Department of Defense position, policy, or decision.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1991 ACM 0-89791-457-0/91/0011/0167...$1.50
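
To make the "possibly" and "definitely" interpretations concrete, here is a minimal offline sketch in Python. It is not the paper's algorithms, which this abstract does not spell out; it is a brute-force illustration under stated assumptions. It assumes each process records its local states together with a vector clock (a stand-in chosen here for "the communication that has occurred"), and that component p of process p's clock equals the index of that state. The names `is_consistent`, `possibly`, and `definitely` are hypothetical. The "currently" interpretation is omitted because it requires delaying live processes, which an offline sketch cannot show.

```python
from itertools import product

# A history is a list of (vector_clock, local_state) pairs, one list per process.
# Assumed convention: component p of the clock of process p's k-th state is k.

def is_consistent(histories, cut):
    """A cut is consistent if no chosen state reflects an event of another
    process that lies beyond that process's chosen state."""
    n = len(histories)
    for p in range(n):
        clock = histories[p][cut[p]][0]
        for q in range(n):
            if clock[q] > cut[q]:
                return False
    return True

def global_state(histories, cut):
    return tuple(histories[p][i][1] for p, i in enumerate(cut))

def possibly(histories, predicate):
    """True if the predicate holds in at least one consistent global state."""
    ranges = [range(len(h)) for h in histories]
    return any(
        is_consistent(histories, cut) and predicate(global_state(histories, cut))
        for cut in product(*ranges)
    )

def definitely(histories, predicate):
    """True if every run from the initial to the final global state passes
    through a consistent global state in which the predicate holds."""
    initial = tuple(0 for _ in histories)
    final = tuple(len(h) - 1 for h in histories)
    if predicate(global_state(histories, initial)):
        return True
    # Search for a run that avoids the predicate entirely.
    seen = {initial}
    frontier = [initial]
    while frontier:
        cut = frontier.pop()
        if cut == final:
            return False  # found a run on which the predicate never held
        for p in range(len(histories)):
            nxt = cut[:p] + (cut[p] + 1,) + cut[p + 1:]
            if nxt[p] >= len(histories[p]) or nxt in seen:
                continue
            if not is_consistent(histories, nxt):
                continue
            if predicate(global_state(histories, nxt)):
                continue  # runs through this state do not avoid the predicate
            seen.add(nxt)
            frontier.append(nxt)
    return True

# Illustrative data: process 0 holds x, process 1 holds y.
# Process 0 sends a message at its first event; process 1 receives it
# at its first event, so the clock of y's second state is (1, 1).
histories = [
    [((0, 0), 0), ((1, 0), 1), ((2, 0), 0)],  # x: 0 -> 1 -> 0
    [((0, 0), 0), ((1, 1), 1)],               # y: 0 -> 1
]

both_one = lambda s: s[0] == 1 and s[1] == 1
print(possibly(histories, both_one))    # True
print(definitely(histories, both_one))  # False
```

In this example the predicate "x = 1 and y = 1" is possibly true, because the global state pairing x = 1 with y = 1 is consistent, but not definitely true, because the run in which process 0 finishes before process 1 takes its step never passes through that state. The exhaustive enumeration is exponential in the number of processes and is meant only to show the definitions.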
doi:10.1145/127695.122774 fatcat:emlmf5jjynabxlmxaixoaaaaji