Using KIV to specify and verify architectures of knowledge-based systems

D. Fensel, A. Schnogge
Proceedings of the 12th IEEE International Conference on Automated Software Engineering
examples. For example, we have verified with KIV theorems from [3] that are used to distinguish different complexity classes in abduction. (7) Notice that these proofs are in no way trivial: the informal proof for theorem 5.3 of [3] took one page. It states that for the class of ordered monotonic abduction problems using a specific preference criterion, there is a polynomial algorithm for finding a best explanation. This is proved by presenting an algorithm that returns such a best explanation in polynomial time.
Formalizing the informal arguments of the correctness proof for this algorithm results in several hundred (machine-checkable) proof steps. Roughly, the interactive theorem proving system of KIV is comparable to systems such as PVS [5] and Isabelle [22]. For our purpose, the KIV system is especially well suited due to its facilities for structuring specifications and software modules (including automatic generation of proof obligations), its proof engineering facilities (such as an elaborate graphical user interface and reuse mechanisms), and the underlying dynamic logic. [20] identifies two kinds of approaches in software reuse: supporting the software development process with reusable components, or making parts of the development process reusable via program transformation techniques. Our approach provides support through formally specified and verified building blocks, i.e., components. The latter approach is taken by KIDS/SPECWARE [28], [29], which provides support in the derivation of efficient implementations from formal specifications. Here, problem-solving methods are not "first-order citizens" that describe reusable components or architectures but second-order transformation rules working on specifications. As in our approach, system development is viewed as a semi-automatic activity. At the technical level, the main differences are the use of dynamic logic for the declarative specification of procedural constructs in KIV and the use of category theory and sheaf theory to express transformations of algebraic specifications in SPECWARE. AMPHION ([18], [19]) is a knowledge-based software engineering system for the formal specification and automatic deductive synthesis of programs which consist of calls to subroutines from a library. It is specialized to application domains by means of a declarative domain theory and a library of subroutines. This specialization allows the automatic synthesis of programs from specifications.
Our approach is more general-purpose (but, of course, less automatic): the programs developed are combinations and instantiations of (mostly domain-independent) problem-solving methods (footnote 7: theorem 4.4 and the more difficult theorem 5.3; for both theorems only the total correctness of the algorithms, and not their complexity bounds, has been proven) rather than simply a sequence of calls to subroutines from a library. Furthermore, the (normal) user of the AMPHION system is not intended to create or modify the domain theory or the subroutine library. In our approach, the verification of user-defined problem-solving methods in a library with respect to their declarative specifications (competence) is done within the KIV system itself. From a modelling point of view, the main difference to the mentioned approaches stems from the fact that we specialize our approach to a specific type of system (i.e., knowledge-based systems), which allows us to introduce strong assumptions on the architecture of the system under development. In consequence, we are able to provide stronger conceptual guidance for system development. Also, none of the approaches in software engineering makes the distinction between a problem type (called a task in our framework) and a domain. In consequence, reusability is limited in their approaches. From a technical point of view, the main difference to the mentioned approaches stems from the fact that KIV uses dynamic logic, which enables the integrated specification and verification of the declarative and procedural parts of a system. Part of this support is the automatic generation of proof obligations that guarantee the proper relationships between declarative and procedural parts, as well as between composed specifications in general. Finally, we would like to mention some lines of our future work. The architecture used to specify knowledge-based systems can be expressed in the generic module concept of KIV.
However, this is connected with a loss of information because the KIV specification does not distinguish the different roles that specifications may have (goals, requirements, adapters, etc.). Therefore, not all of the desired proof obligations can be generated automatically, or at least not directly. Still, it seems possible to specialize the generic concepts of KIV. This would allow us to provide the automatic generation of the according proof obligations and of predefined modules and specification combinations to model the different aspects of a knowledge-based system. Based on this, we plan to develop a methodological framework for the stepwise development of correct specifications of knowledge-based systems. Here we have to take a look at approaches like KIDS/Specware, especially for the process of refining a task via a problem-solving method into subtasks. Acknowledgement. We would like to thank Rix Groenboom, John Penix, Annette ten Teije, Frank van Harmelen, Bob Wielinga, and the anonymous reviewers for very helpful comments and discussions, and Jeff Butler for correcting the English. parsimonious, and it remains to prove that there is no proper subset H of local-parsimonious-explanation which explains (at least) all the data explained by local-parsimonious-explanation, i.e., explain(local-parsimonious-explanation) ⊆ explain(H). We choose some hypothesis h ∈ local-parsimonious-explanation such that H ⊆ local-parsimonious-explanation \ h. Due to the monotony assumption we can derive explain(H) ⊆ explain(local-parsimonious-explanation \ h), and transitivity of ⊆ yields explain(local-parsimonious-explanation) ⊆ explain(local-parsimonious-explanation \ h). Since local-parsimonious-explanation is complete (see ii), it holds that explain(local-parsimonious-explanation) = all-data, and thus explain(local-parsimonious-explanation \ h) = all-data, i.e., local-parsimonious-explanation \ h is a complete set of hypotheses.
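The contradiction argument above relies only on monotony of explain, transitivity of ⊆, and completeness, so it can be replayed on a small finite instance. The following Python sketch is purely illustrative: the coverage relation COVERS, the data set, and the candidate explanation are our own assumptions, not taken from the paper.

```python
from itertools import chain, combinations

# Hypothetical coverage relation between hypotheses and data. Since
# explain is defined as a union over H, it is monotone by construction:
# H1 ⊆ H2 implies explain(H1) ⊆ explain(H2).
COVERS = {"h1": {"d1", "d2"}, "h2": {"d2", "d3"}, "h3": {"d3"}}
ALL_DATA = {"d1", "d2", "d3"}

def explain(H):
    return set().union(*(COVERS[h] for h in H)) if H else set()

def complete(H):
    return explain(H) == ALL_DATA

def locally_parsimonious(H):
    # Removing any single hypothesis destroys completeness.
    return all(not complete(H - {h}) for h in H)

H = frozenset({"h1", "h2"})
assert complete(H) and locally_parsimonious(H)

# Under monotony, local parsimony implies global parsimony:
# no proper subset of H is a complete explanation.
proper_subsets = chain.from_iterable(combinations(H, r) for r in range(len(H)))
assert all(not complete(set(s)) for s in proper_subsets)
```

The final assertion is exactly the property the KIV proof establishes in general: if some complete proper subset existed, deleting a single hypothesis would already preserve completeness, contradicting local parsimony.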
This, however, contradicts the minimality axiom of the (mapped) competence. The proof in KIV requires 14 proof steps and 7 interactions. The seven interactions concern the application of axioms of the different specifications during the proof process. The monotony assumption defines a natural subclass of abduction. For example, [17] examines its role in model-based diagnosis. The assumption holds for applications where no knowledge that constrains the fault behaviour of devices is provided, or where this knowledge respects the limited-knowledge-of-abnormal-behaviour assumption. This is used by [16] as a minimal diagnosis hypothesis to reduce the average-case effort of finding all parsimonious and complete explanations with GDE. The question arises of how to provide such assumptions that close the gap between task definitions and PSMs. In [12], we presented the idea of inverse verification. We start a proof with KIV that the competence of the PSM implies the goal of the task. This proof usually cannot succeed, but its gaps provide hints for assumptions that are necessary for it. That is, we use the technique of a mathematical proof to search for assumptions that are necessary to guarantee the relationship between the PSM and the task. When applying the interactive theorem prover to an impossible proof, it returns an open goal that cannot be proven but that would allow the proof to be finished. Therefore, such an open goal defines a sufficient assumption. Further proof attempts have to be made to refine it to necessary assumptions (see [12] for more details). (6) For example, the generation of counterexamples helps to find the essential aspects of an assumption.
Related Work, Conclusions and Future Work
We have shown in this paper how tasks and problem-solving methods can be specified and verified with KIV. KIV is well suited for both, as it combines algebraic specifications with imperative constructs that enable the specification of the reasoning behaviour.
The interactive theorem prover provides excellent support in processing the different automatically generated proof obligations. The modular concept of proofs and the reuse of proofs for partially modified specifications make the verification effort feasible. As a consequence of our modularized specification, we distinguish several proof obligations that arise in order to guarantee a consistent specification. Thus a separation of concerns is achieved that contributes to the feasibility of the verification. In addition, the proofs of the internal correctness of the components need not be repeated when a component is reused. Only the proofs that are concerned with the proper combination of the components have to be carried out when developing a KBS. The examples and proofs in our paper are kept simple. However, we have applied KIV also in more complex

Fig. 5 Connecting PSM and Task.

assumptions = enrich abduction problem with
  axioms
    complete(all-hypotheses),
    H1 ⊆ H2 → explain(H1) ⊆ explain(H2)
end enrich

mapping = actualize competence with assumptions by morphism
  correct → complete,
  input → all-hypotheses,
  local-minimal-set → local-parsimonious-explanation,
  objects → hypotheses,
  all-objects → all-hypotheses, ...
end actualize

adapter = module
  export explanation
  refinement
    representation of operations
    explanation implements explanation;
  import mapping
  variables res : hypotheses;
  implementation
    explanation(var res)
    begin
      res := local-parsimonious-explanation
    end

Adapter: Connecting Task and PSM

The description of an adapter maps the different terminologies of the task definition, the PSM, and the domain model, and introduces further requirements and assumptions that have to be made to relate the competence of a PSM with the functionality as it is introduced by the task definition.
Because the adapter relates the three other parts of a specification and establishes their relationship in a way that meets the specific application problem, those parts can be described independently and selected from libraries. Their consistent combination and their adaptation to the specific aspects of the given application must be provided by the adapter. Usually an adapter introduces new requirements or assumptions because, in general, most problems tackled with KBSs are inherently complex and intractable (cf. [11], [21]). A PSM can only solve such tasks with reasonable computational effort by introducing assumptions that restrict the complexity of the problem or by strengthening the requirements on domain knowledge. We have to introduce assumptions via the subspecification assumptions (cf. Figure 5) to ensure that the competence of our method implies the goal of the task. First, we have to require that the input of the method is a complete explanation. Based on the mappings, it is now easy to prove that the input requirement of the method is fulfilled (i.e., the input is correct). Second, based on the mappings, we can prove that our method set-minimizer finds a local-minimal set that is parsimonious in the sense that each subset that contains one element less is not a complete explanation. However, we cannot guarantee that it is parsimonious in general: smaller subsets of it that are complete explanations may exist. The adapter has to introduce a new requirement on domain knowledge, or an assumption (in the case that it does not follow from the domain model), to guarantee that the competence of the PSM is strong enough to achieve the goal of the task. The monotony assumption (cf. Figure 5) is sufficient (and necessary, cf. [12]) to prove that the (global) parsimoniousness of the result of the PSM follows from its local parsimoniousness.
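The morphism in the mapping is essentially a renaming of signatures: it translates PSM-level symbols into the task-level vocabulary. A hypothetical Python sketch (the dictionary transcribes the morphism of Figure 5; the rename helper is our own illustration, not part of KIV):

```python
# The adapter's morphism, transcribed as a renaming table from the
# PSM terminology to the task terminology.
ADAPTER_MORPHISM = {
    "correct": "complete",
    "input": "all-hypotheses",
    "local-minimal-set": "local-parsimonious-explanation",
    "objects": "hypotheses",
    "all-objects": "all-hypotheses",
}

def rename(psm_symbol: str) -> str:
    """Map a PSM symbol to its task-level counterpart (identity otherwise)."""
    return ADAPTER_MORPHISM.get(psm_symbol, psm_symbol)

# The competence axiom "local-minimal-set ⊆ input" thus reads, at the
# task level, "local-parsimonious-explanation ⊆ all-hypotheses".
axiom = (rename("local-minimal-set"), "⊆", rename("input"))
assert axiom == ("local-parsimonious-explanation", "⊆", "all-hypotheses")
```

Under this renaming, the PSM's competence statements become statements about the task's vocabulary, which is what allows the adapter's proof obligations to connect the two.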
To ensure the automatic generation and management of these proof obligations by KIV, we have to specify the adapter as a module that implements the task goal by importing the mappings and exporting the goal of the task (cf. Figure 5). KIV automatically generates three proof obligations for the adapter module, again formulated in dynamic logic:
(i) true
(ii) complete(res)
(iii) parsimonious(res)
The proofs of (i) and (ii) are trivial and fully automatic. (iii) is the real proof obligation of this module. The informal proof sketch is as follows: We unfold the definition of

Fig. 4 Verifying the PSM with KIV.

can be performed by selecting alternatives provided by a menu. For each of the proof obligations we formulate straightforward auxiliary lemmas i-lemma, ii-lemma, iii-lemma, and iv-lemma, one for each of these proof obligations, respectively. These auxiliary lemmas express the corresponding property of the hill-climbing subprocedure (cf. Figure 3). Using these lemmas, each of the proof obligations can now directly be proven with the interactive proof environment of KIV. Activating the standard set of predefined heuristics (by click) and then selecting the proper auxiliary lemma (by click) is enough. KIV automatically unfolds the control procedure, finds the appropriate instantiation of the lemma, and carries out the first-order reasoning (necessary, e.g., for (iii)). Thus the proofs of (i), (ii), (iii), and (iv), respectively, can each be carried out with one user interaction. It remains to prove the four lemmas. All of these proofs work by induction, and to construct them with the help of KIV one has to tell KIV (again by clicking) which kind of induction should be used. KIV is then able to unfold (and symbolically execute) the procedure hill-climbing and find the correct instantiation of the induction hypothesis. While KIV tries to construct the proofs, it comes up with subgoals reflecting certain properties of the inference actions.
We then interact by formulating these properties as first-order lemmas in the specification of the inferences, and KIV is able to automatically find and use them to close the open subgoals. Thus, with a few and almost straightforward user interactions (in addition to the formulation of the lemmas), the original proof obligations are reduced to the task of proving some properties of the inferences stated in first-order logic. These in turn can be derived from the axioms. Here again some user interaction is required, mostly selecting the appropriate axioms (and also one quantifier instantiation). Besides this, KIV does all the first-order reasoning. We now give a sketch of the proofs:
i-lemma. The termination of the PSM is proven by induction on the first parameter of hill-climbing, where the order ⊂ is used (well-founded, since we deal with finite sets only; footnote 5). In the induction step we use the fact that
select-one-correct(O, generate-successors(O)) ≠ O → select-one-correct(O, generate-successors(O)) ⊂ O.
This is equivalent to select-one-correct(O, generate-successors(O)) ⊆ O, which can be proved as an instantiation of a stronger lemma which is used in the proof of ii-lemma.
ii-lemma. The proof is carried out by induction on the recursive calls of the hill-climbing procedure in a terminating run. The proof uses a property of the inferences that is proven by using the (three) axioms for the inferences generate-successors and select-one-correct and a suitable case distinction (i.e., four interactions).
iii-lemma. The proof is carried out, as for ii-lemma, by induction on the recursive calls of the hill-climbing procedure in a terminating run. For this it is enough to use the property of select-one-correct that it yields a correct object set whenever it does not yield its first argument as result. This property follows immediately from the axioms.
iv-lemma.
The proof is carried out, as for ii-lemma and iii-lemma, by induction on the recursive calls of the hill-climbing procedure in a terminating run. The proof uses a property that is proven as follows: First we show that from the condition ∃o. (o ∈ O ∧ correct(O \ o)) it follows that
O1 := select-one-correct(O, generate-successors(O)) ∈ generate-successors(O).
From the axiom for generate-successors it follows that an o1 ∈ O must exist such that O1 = O \ o1, i.e., O1 ≠ O. To give an impression of how to work with KIV, Figure 4 is a screen dump of the KIV system when proving iv-lemma. The current proof window on the right shows the partial proof tree currently under development. Each node represents a sequent (of a sequent calculus for dynamic logic); the root contains the theorem to prove. In the messages window the KIV system reports its ongoing activities. The KIV-Strategy window is the main window, which shows the sequent of the current goal, i.e., an open premise (leaf) of the (partial) proof tree. The user works either by selecting (clicking) one proof tactic (the list on the left) or by selecting a command from the menu bar above. Proof tactics reduce the current goal to subgoals and thereby make the proof tree grow. Commands include the selection of heuristics, backtracking, pruning the proof tree, saving the proof, etc.
Proving Total Correctness of the PSM
When introducing a PSM into a library, we have to prove two aspects of the operational specification of the PSM: we have to ensure the termination of the procedure and the competence as specified. When reusing the PSM, this proof need not be repeated and can (implicitly) be reused. The proof obligations are automatically generated by KIV as formulas in dynamic logic (cf. [14], [15]).
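The key step behind both the termination argument of i-lemma and the property above — that a successor chosen by select-one-correct is O minus exactly one element, hence a proper subset — can be checked on a toy instance. This Python fragment is our own illustration; the correctness predicate is an arbitrary stand-in:

```python
# generate-successors removes exactly one object at a time, so any
# successor returned by select-one-correct is a proper subset of O.
# This strict decrease is the well-founded order behind the termination
# proof of the hill-climbing procedure.
def generate_successors(O):
    return [O - {o} for o in O]

def select_one_correct(O, successors, correct):
    # Return the first correct successor, or O itself if none exists.
    return next((s for s in successors if correct(s)), O)

O = frozenset({1, 2, 3})
correct = lambda s: 2 in s            # illustrative stand-in predicate
new = select_one_correct(O, generate_successors(O), correct)

# new ≠ O implies new ⊂ O, i.e. O1 = O \ {o1} for some o1 ∈ O.
assert new != O and new < O and len(O - new) == 1
```

Because every non-trivial step removes one element, the recursion depth is bounded by the size of the input set, which is the finiteness assumption footnote 5 refers to.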
In our example, it derives the following proof obligations (see [24], section 5.2, for more details on how the correctness of a module is translated into a set of proof obligations formulated in dynamic logic):
(i) true, i.e., termination
(ii) output ⊆ input, corresponds to axiom 1 of the competence
(iii) correct(output), corresponds to axiom 2 of the competence
(iv) o ∈ output → ¬ correct(output \ o), corresponds to axiom 3 of the competence.
These proof obligations ensure that the PSM terminates and that it terminates in a state that respects the axioms used to characterize the competence of the PSM ("<>" is the diamond operator of dynamic logic). The next step is to actually prove these obligations using KIV. For constructing proofs, KIV provides an integration of automated reasoning and interactive proof engineering. The user constructs proofs interactively but has only to give the key steps of the proof (e.g., induction, case distinction); all the numerous tedious steps (e.g., simplification) are performed by the machine. Automation is achieved by rewriting and by heuristics which can be chosen, combined, and tailored by the proof engineer. If the chosen set of heuristics gets stuck in applying proof tactics, the user has to select tactics on his own or activate a different set of heuristics in order to continue the partial proof constructed so far. Most of these user interactions

psm-domain requirements = enrich objects with
  constants input : objects;
  predicates correct : objects,
  axioms correct(input)
end enrich

object-sets = generic specification
  parameter psm-domain requirements
  target
    constants ∅ : object-sets;
    functions os-insert : objects x object-sets → object-sets;
    predicates . ∈ . : objects x object-sets;
    axioms
      object-sets generated by ∅, os-insert,
      ¬ o ∈ ∅,

Fig. 3 The sub-specifications and modules of set-minimizer.
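The four obligations can also be tested at runtime for a single terminating call. The following hypothetical Python harness is our own illustration — minimize and correct are stand-ins for the verified procedure and predicate, and checking one run does not replace the KIV proofs, which cover all inputs:

```python
# Checking the four generated proof obligations on one run of a
# set-minimizing procedure.
def correct(s):                        # illustrative stand-in predicate
    return {1, 4} <= s

def minimize(inp):
    # Remove any single element whose removal preserves correctness;
    # the set strictly shrinks on each recursive call, so this terminates.
    for o in inp:
        if correct(inp - {o}):
            return minimize(inp - {o})
    return inp

inp = frozenset({1, 2, 3, 4})
out = minimize(inp)                    # (i) termination: the call returns

assert out <= inp                      # (ii) output ⊆ input
assert correct(out)                    # (iii) correct(output)
assert all(not correct(out - {o}) for o in out)  # (iv) local minimality
```

On this input the procedure strips the redundant elements 2 and 3 and stops at {1, 4}, which satisfies all three competence axioms.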
control = module
  export competence
  refinement
    representation of operations
    control implements local-minimal-set
  import inferences
  procedures hill-climbing(objects) : objects
  variables output, current, new : objects;
  implementation
    control(var output)
    begin
      hill-climbing(input, output)
    end
    hill-climbing(current, var output)
    begin
      var new = select-one-correct(current, generate-successors(current))
      if new = current
        then output := current
        else hill-climbing(new, output)
    end

competence = generic specification
  parameter psm-domain requirements
  target
    constants local-minimal-set : objects;
    axioms
      (1) local-minimal-set ⊆ input,
end generic specification

Formalizing a Problem-Solving Method

The concept of a PSM is present in many current knowledge-engineering frameworks (e.g., Generic Tasks [4], CommonKADS [26], the method-to-task approach [6]). In general, PSMs are used to describe the reasoning process of a KBS. Aside from some differences between the approaches, there is a consensus that a PSM decomposes the entire reasoning task into more elementary inferences; defines the types of knowledge that are needed by the inference steps; and defines the control and knowledge flow between the inferences. Extending this, [1] defined the competence of a PSM (i.e., a functional black-box specification) independently of the specification of its operational reasoning behaviour. Proving that a PSM has some competence has the clear advantage that the selection of a method for a given problem, and the verification of whether a PSM fulfils its task, can be done independently from the details of the internal reasoning behaviour of the method. Finally, a PSM has requirements on domain knowledge. Each inference step, and therefore the competence description of a PSM, requires specific types of domain knowledge. These complex requirements on domain knowledge distinguish a PSM from usual software products: preconditions on valid inputs are extended to complex requirements on available domain knowledge.
Libraries of PSMs are described in [2], [4], and [23]. Reusing PSMs enhances the development process of KBSs. They can either be directly reused as a reasoning component or, in the case of a more complex task, be used to decompose the task. In the latter case, each inference step of the PSM defines a new task. Each of these subtasks requires again the selection of a PSM that may either directly solve it or recursively refine it into new subtasks.

abduction problem = enrich hypotheses, data with
  functions explain : hypotheses → data;
  predicates complete : hypotheses,
             parsimonious : hypotheses;
  axioms
    complete(H) ↔ explain(H) = all-data,
    parsimonious(H) ↔ ¬ ∃ H1. H1 ⊂ H ∧ explain(H) ⊆ explain(H1)
end enrich

explanation = enrich abduction problem with
  constants explanation : hypotheses;
  axioms complete(explanation), parsimonious(explanation)
end enrich

Fig. 2 The specification of the abductive task.

In the first case, a PSM is reused as a component (black-box reuse of a PSM), whereas in the second case, the PSM defines an architecture (white-box reuse) for selecting further components. (3) We use the very simple PSM set-minimizer of [10] for our example. It receives a set of objects as input and tries to find a minimized version of the set that still fulfils a correctness requirement. The search strategy applied is one-step look-ahead. The overall structure of the PSM specification is provided in Figure 1, and the definition of some of its subspecifications and modules is given in Figure 3.
Domain Requirements
The main requirements on available knowledge and input that are introduced by the method are: the existence of a possible set of objects (a sort), the existence of a predicate correct holding true for some sets, and, finally, the assumption that the input is a correct set. These requirements on knowledge and input data are specified as (formal) parameters of the specification of the method.
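The two task predicates of Figure 2 translate directly into executable checks over a finite instance. The following Python sketch is illustrative only: the coverage relation COVERS and the concrete data are our own assumptions.

```python
from itertools import chain, combinations

# Hypothetical finite coverage relation between hypotheses and data.
COVERS = {"h1": {"d1", "d2"}, "h2": {"d2"}, "h3": {"d3"}}
ALL_DATA = {"d1", "d2", "d3"}

def explain(H):
    return set().union(*(COVERS[h] for h in H)) if H else set()

def complete(H):
    # complete(H) ↔ explain(H) = all-data
    return explain(H) == ALL_DATA

def parsimonious(H):
    # parsimonious(H) ↔ ¬ ∃ H1. H1 ⊂ H ∧ explain(H) ⊆ explain(H1)
    proper = chain.from_iterable(combinations(H, r) for r in range(len(H)))
    return not any(explain(H) <= explain(set(H1)) for H1 in proper)

assert complete({"h1", "h3"}) and parsimonious({"h1", "h3"})
assert not parsimonious({"h1", "h2", "h3"})   # h2 is redundant here
```

An explanation in the sense of the specification is then any set of hypotheses satisfying both predicates, such as {"h1", "h3"} in this toy instance.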
They are replaced by concrete parameters when the method is applied for a specific task and domain, as we will see in Section 6. (4)
Operational Specification
The method works as follows: first, we take the input; then we recursively generate the successors of the current set and select one of its correct successors. If there is no new correct successor, we return the current set. The functions generate-successors and select-one-correct in the specification inferences correspond to elementary inference actions in CommonKADS [26]. The procedural control (in KADS located at the task body) is defined by the module control.
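The control loop just described can be sketched in Python, assuming finite sets. This is a hypothetical rendering of the module control from Figure 3; the correct predicate in the usage example is made up for illustration:

```python
# generate-successors: all sets obtained by removing one object.
def generate_successors(current):
    return [current - {o} for o in current]

# select-one-correct: the first correct successor, or current if none.
def select_one_correct(current, successors, correct):
    return next((s for s in successors if correct(s)), current)

# hill-climbing: one-step look-ahead until no correct successor remains.
def hill_climbing(current, correct):
    new = select_one_correct(current, generate_successors(current), correct)
    if new == current:
        return current
    return hill_climbing(new, correct)

def control(inp, correct):
    return hill_climbing(inp, correct)

# Usage: minimize a set while keeping element 0; the method's requirement
# that the input itself is correct holds for this input.
correct = lambda s: 0 in s
out = control(frozenset({0, 1, 2}), correct)
assert out == frozenset({0})
```

Each recursive call strictly shrinks the set, mirroring the well-founded ⊂ order used in the termination proof, and the result is locally minimal: removing any further element would violate correct.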
doi:10.1109/ase.1997.632826 dblp:conf/kbse/FenselS97