On-line monitoring: a tutorial

B.A. Schroeder
1995 Computer  
On-line monitoring can complement formal techniques to increase application dependability. This tutorial outlines the concepts and identifies the activities that comprise event-based monitoring, describing several representative monitoring systems. Although monitoring has been around since the early 1960s with the advent of debuggers, the field has recently made some exciting advances. Monitoring systems today monitor distributed applications and are often themselves
... . In addition, they are increasingly seen as a viable solution to areas of growing concern: the lack of dependability and of tools to support distributed applications. Monitoring has succeeded in these areas and has matured in its ability to give users freedom in defining what is to be monitored. Monitoring gathers information about a computational process as it executes and can be classified by its functionality (see Figure 1). Dependability includes fault tolerance and safety. Performance enhancement includes dynamic system configuration, dynamic program tuning, and on-line steering. Correctness checking is the monitoring of an application to ensure consistency with a formal specification. It can be used to detect runtime errors or as a verification technique. Security monitoring attempts to detect security violations such as illegal login or attempted file access. Control includes cases where the monitoring system is part of the target system, a necessary component in providing computational functionality. Debugging and testing employ monitoring techniques to extract data values from an application being tested. Performance evaluation uses monitoring to extract data from a system that is later analyzed to assess system performance. I focus on four of the seven functional areas: dependability, performance enhancement, correctness checking, and security. The systems in these functional areas exhibit common characteristics. First, the monitor functions as an external observer of the target software. Unlike control monitors, external observers are not required to provide computational functionality. Second, the systems are designed to monitor the target software and respond while the target software is operational. This forces the monitoring system to react in a timely manner to events as they occur in the target system.
(Debuggers are not so constrained, because they either slow the application's execution rate or simply gather trace data for later analysis or replay.) Lastly, the monitoring component is a permanent part of the overall system, although at times it may run at reduced functionality. (This is unlike performance evaluation tools that are, like some hardware test tools, attached to a system.) We call a monitoring system that is an external observer, monitors a fully functioning application, and is generally intended to be permanent an on-line monitoring system. These systems often do more than just gather information; they interpret the gathered information and respond appropriately. On-line monitoring systems can therefore provide increased robustness, security, fault tolerance, and adaptability.
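
As a rough illustration of the external-observer idea described above, the following Python sketch shows a monitor that receives events from a running target, interprets them, and responds while the target keeps operating. It is not taken from the tutorial: the `Event` fields, the `OnlineMonitor` class, the failed-login rule, and the threshold of three are all hypothetical choices made for this example, loosely modeled on the security-monitoring case (illegal login attempts).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    """One occurrence reported by the target application (hypothetical schema)."""
    kind: str     # e.g. "login_failed" or "login_ok"
    subject: str  # e.g. a user name


class OnlineMonitor:
    """External observer: it provides no application functionality itself,
    it only interprets events as they arrive and responds to them."""

    def __init__(self, max_failed_logins: int = 3) -> None:
        # The threshold is an assumption made for this sketch.
        self.max_failed_logins = max_failed_logins
        self.failed: Dict[str, int] = defaultdict(int)
        self.responses: List[str] = []

    def observe(self, event: Event) -> None:
        """Interpret a single event immediately, not in a later offline pass."""
        if event.kind == "login_failed":
            self.failed[event.subject] += 1
            # Respond exactly once, when the threshold is first reached.
            if self.failed[event.subject] == self.max_failed_logins:
                self.respond(f"lock account {event.subject}")
        elif event.kind == "login_ok":
            self.failed[event.subject] = 0

    def respond(self, action: str) -> None:
        """Record the response; a real system would act on the target here."""
        self.responses.append(action)


# Simulated event stream standing in for an instrumented target application.
monitor = OnlineMonitor(max_failed_logins=3)
stream = [
    ("login_failed", "mallory"),
    ("login_ok", "alice"),
    ("login_failed", "mallory"),
    ("login_failed", "mallory"),
    ("login_failed", "mallory"),
]
for kind, user in stream:
    monitor.observe(Event(kind, user))

print(monitor.responses)
```

The monitor here is deliberately separate from the code that produces the events, which mirrors the distinction drawn above between external observers and control monitors; how events are actually captured from a target system is left out of this sketch.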
doi:10.1109/2.386988 fatcat:c22e7hnwrff7tfjr7a5lyugrcm