Goal-Directed Metacontrol for Integrated Procedure Learning [chapter]

Jihie Kim, Karen Myers, Melinda Gervasio, Yolanda Gil
2011 Metareasoning  
Developing systems that learn how to perform complex tasks presents a significant challenge to the artificial intelligence community. As the knowledge to be learned becomes complex, with diverse procedural constructs and uncertainties to be validated, the system needs to integrate a wide range of learning and reasoning methods with different focuses and strengths. For example, one learning method may be used to generalize from user demonstrations, another to learn by practice and exploration,
and another to test hypotheses with experiments. The POIROT system pursues such a multistrategy learning methodology that employs multiple integrated learners and knowledge validation modules to acquire complex process knowledge for a medical logistics domain (Burstein et al., 2008). For a learning system of such complexity, activities of participating agents must be coordinated to ensure that their collective activities produce the desired procedural knowledge. This kind of control is inherently metalevel (Anderson & Oates, 2007; Cox & Raja, Chapter 1) in that it requires the system to reflect on what it is doing and why, to monitor its progress, and to make adjustments to its behavior when performance falls short of expectations. Without such introspection, effective coordination and prioritization of the base-level learning and reasoning components would not be possible. This type of introspection corresponds to a form of metareasoning centered on "stepping back" from the system to analyze its behavior, as discussed by Perlis (Chapter 2). As such, it contrasts with the majority of work to date on metareasoning, which has focused on the problem of bounded rationality, as described by Zilberstein (Chapter 3). Developing a metalevel reasoner for such a complex, integrated learning system poses several challenges, including:
• Assessing the progress of learning over time;
• Systematically addressing conflicts and failures that arise during learning;
• Addressing gaps and shortcomings of the individual and aggregate learning results;
• Supporting flexible interactions among agents that pursue different learning strategies.
We describe a metalevel framework for coordinating the activities of a community of learners to create an integrated learning system. The metalevel framework is organized around learning goals, which are formulated through introspective reasoning to identify problems and requirements for the ongoing learning process.
These learning goals are posted to a shared blackboard to direct the other components in the system. Goals can be either process or knowledge oriented. Process goals define specific tasks to be performed as part of the learning process and are used to coordinate the activities of the various learning and reasoning components. Examples of process goals for task learning include hypothesis creation, hypothesis merging, explanation of observations, and hypothesis validation through experimentation. Knowledge goals provide the means for a component to convey the need for additional information to further the learning process. In particular, the quality of learned knowledge could be compromised by missing critical information, and the efficiency of learning may be impaired by ambiguity arising from insufficient knowledge. In the succeeding sections of this chapter, we describe the modules within our metalevel framework that are responsible for addressing process (Maven) and knowledge (QUAIL) goals. Maven (Moderating ActiVitiEs of iNtegrated learners) formulates and achieves metalevel process goals to support integrated learning. Maven's design is based on our prior work on metalevel goals and reasoning for interactive knowledge capture (Kim & Gil, 2007; Gil & Kim, 2002). Maven explicitly represents plans for achieving learning goals along with high-level strategies to prioritize learning goals. By generating assessment annotations on learned knowledge, Maven keeps track of learning progress and makes decisions on learning goals to pursue (Kim & Gil, 2008). QUAIL (Question Asking to Inform Learning) addresses knowledge goals by managing a process of selecting and posing questions to a human expert to fill identified knowledge gaps (Gervasio & Myers, 2008; Gervasio et al., 2009). Question selection trades off the utility of missing knowledge against the cost of obtaining it.
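The utility-versus-cost trade-off in question selection can be illustrated with a minimal sketch. This is not QUAIL's actual selection method, which is not detailed here; it assumes a fixed cost budget and a simple greedy ranking by utility-to-cost ratio. The names `Question`, `select_questions`, `utility`, `cost`, and `budget`, as well as the example questions and numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    text: str
    utility: float  # assumed estimate of how much the answer would help learning
    cost: float     # assumed estimate of the burden on the human expert (must be > 0)


def select_questions(candidates, budget):
    """Greedily pick questions by utility-to-cost ratio within a cost budget.

    Illustrative assumption only: a greedy ratio heuristic, not the
    selection strategy used by QUAIL.
    """
    ranked = sorted(candidates, key=lambda q: q.utility / q.cost, reverse=True)
    chosen, spent = [], 0.0
    for q in ranked:
        # Skip any question that would push total cost past the budget.
        if spent + q.cost <= budget:
            chosen.append(q)
            spent += q.cost
    return chosen


# Hypothetical candidate questions with made-up utility and cost values.
candidates = [
    Question("Why was this step performed?", utility=8.0, cost=4.0),
    Question("Is the order of these two steps required?", utility=3.0, cost=1.0),
    Question("What alternatives exist for this action?", utility=5.0, cost=5.0),
]
selected = select_questions(candidates, budget=5.0)
```

With these made-up values, the two questions with the highest ratios fit within the budget and the third is skipped. A greedy ratio ranking is only one possible way to balance utility and cost; it is chosen here because it is short and easy to follow.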
Figure 6.1 shows how our goal-oriented metalevel framework for integrated learning maps into a more general model of metareasoning described by Cox and Raja (Chapter 1). The base-level actions correspond to the performance tasks for which procedural knowledge is being learned. Learning occurs at the reasoning level, while the metalevel supports control of learning through two essential mechanisms: metalevel process management (realized by Maven) and metalevel management of learned knowledge (realized by QUAIL). The metalevel influences the components at the reasoning level by posting appropriate learning goals and information to direct their activities.

[Figure 6.1 near here]

Metalevel process management tracks progress toward current learning goals by monitoring the performance of base-level components and their results; it also initiates additional goals to drive the system toward achieving overall learning objectives. Metalevel management of learned knowledge identifies knowledge gaps by introspection over the current state of the learned knowledge, takes actions to eliminate gaps by posing questions to the human demonstrator, and then provides information back to the base-level learners and reasoners to address unresolved knowledge goals. Although not yet supported in QUAIL, conceptually it could also coordinate with the metalevel process management module to initiate additional activity by the base level as a means of addressing knowledge gaps (as opposed to relying solely on the human demonstrator to provide answers).
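As a rough illustration of the goal-posting mechanism described above, the following sketch models a shared blackboard that holds process and knowledge goals, plus a single metalevel step that inspects a simplified learning state and posts goals. It is not the Maven or QUAIL implementation. All names (`GoalKind`, `LearningGoal`, `Blackboard`, `metalevel_step`) and the two state inputs are hypothetical simplifications.

```python
from dataclasses import dataclass, field
from enum import Enum


class GoalKind(Enum):
    PROCESS = "process"      # tasks to perform as part of learning
    KNOWLEDGE = "knowledge"  # requests for missing information


@dataclass
class LearningGoal:
    kind: GoalKind
    description: str
    achieved: bool = False


@dataclass
class Blackboard:
    """Hypothetical shared store that components read goals from."""
    goals: list = field(default_factory=list)

    def post(self, goal):
        self.goals.append(goal)

    def open_goals(self, kind):
        # Goals of the given kind that have not yet been achieved.
        return [g for g in self.goals if g.kind is kind and not g.achieved]


def metalevel_step(board, hypotheses_validated, knowledge_gaps):
    """Inspect an assumed, simplified learning state and post goals."""
    if not hypotheses_validated:
        board.post(LearningGoal(GoalKind.PROCESS,
                                "validate hypothesis through experimentation"))
    for gap in knowledge_gaps:
        board.post(LearningGoal(GoalKind.KNOWLEDGE,
                                f"ask expert about: {gap}"))


board = Blackboard()
metalevel_step(board, hypotheses_validated=False,
               knowledge_gaps=["step ordering"])
process_goals = board.open_goals(GoalKind.PROCESS)
knowledge_goals = board.open_goals(GoalKind.KNOWLEDGE)
```

In this sketch, reasoning-level components would poll `open_goals` for the kind they handle and mark goals as achieved when done; how the real system dispatches and prioritizes goals is not specified in this summary.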
doi:10.7551/mitpress/9780262014809.003.0006 fatcat:nmlnhsznlrcjldxd53olcf5epi