Partial observability and learnability

Loizos Michael
Artificial Intelligence, 2010
When sensing its environment, an agent often receives information that only partially describes the current state of affairs. The agent then attempts to predict what it has not sensed, by using other pieces of information available through its sensors. Machine learning techniques can naturally aid this task, by providing the agent with the rules to be used for making these predictions. For this to happen, however, learning algorithms need to be developed that can deal with missing information in the learning examples in a principled manner, and without the need for external supervision. We investigate this problem herein.

We show how the Probably Approximately Correct (PAC) semantics can be extended to deal with missing information during both the learning and the evaluation phases. Learning examples are drawn from some underlying probability distribution, but parts of them are hidden before being passed to the learner. The goal is to learn rules that can accurately recover the information hidden in these learning examples. We show that for this to be done, one should first dispense with the requirement that rules always make definite predictions; answering "don't know" is sometimes necessary. On the other hand, such abstentions should not be made freely, but only when insufficient information is present for a definite prediction. Under this premise, we show that to accurately recover missing information, it suffices to learn rules that are highly consistent, i.e., rules that simply do not contradict the agent's sensory inputs. We establish that high consistency implies a somewhat discounted accuracy, and that this discount is, in a well-defined sense, unavoidable, depending on how adversarially information is hidden in the learning examples.

Within our proposed learning model we prove that any PAC-learnable class of monotone or read-once formulas is also learnable from incomplete learning examples. By contrast, we prove that parities and monotone-term 1-decision lists, which are properly PAC learnable, are not properly learnable under the new learning model. In the process of establishing our positive and negative results, we re-derive some basic PAC learnability machinery, such as Occam's Razor and reductions between learning tasks. We finally consider a special case of learning from partial learning examples, where some prior bias exists on the manner in which information is hidden, and show how this provides a unified view of many previous learning models that deal with missing information.

We suggest that the proposed learning model goes beyond a simple extension of supervised learning to the case of incomplete learning examples. The principled and general treatment of missing information during learning, we argue, allows an agent to employ learning entirely autonomously, without relying on the presence of an external teacher, as is the case in supervised learning. We call our learning model autodidactic to emphasize the explicit disassociation of this model from any form of external supervision.
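
To make the setting concrete, here is a minimal Python sketch of the kind of masked-example setup the abstract describes. It is illustrative only: the function names (mask, conjunction_rule, consistency, completeness), the uniform random hiding, and the toy conjunction target are our own assumptions for exposition, not constructs taken from the paper, which allows a far more general (possibly adversarial) hiding process.

    import random

    HIDDEN = None  # a masked attribute: the learner sees "?" instead of 0/1

    def mask(example, hide_prob, rng):
        # Hide each attribute independently with probability hide_prob.
        # (The paper's hiding process can be adversarial; uniform hiding
        # is used here purely for illustration.)
        return [HIDDEN if rng.random() < hide_prob else v for v in example]

    def conjunction_rule(indices):
        # A monotone conjunction over the given attribute indices that
        # abstains ("don't know") whenever the visible bits do not settle it.
        def predict(partial):
            values = [partial[i] for i in indices]
            if any(v == 0 for v in values):
                return 0       # one observed 0 falsifies the conjunction
            if all(v == 1 for v in values):
                return 1       # every conjunct is observed to be 1
            return None        # insufficient information: abstain
        return predict

    def consistency(rule, masked_examples, target):
        # Fraction of checkable definite predictions that agree with the
        # sensed value of the target attribute; abstentions, and examples
        # whose target is hidden, cannot contradict the sensors.
        agree = definite = 0
        for ex in masked_examples:
            if ex[target] is HIDDEN:
                continue       # nothing observed to contradict
            guess = rule(ex)
            if guess is None:
                continue       # abstention: no contradiction possible
            definite += 1
            agree += int(guess == ex[target])
        return agree / definite if definite else 1.0

    def completeness(rule, masked_examples):
        # How often the rule commits to a definite prediction; abstaining
        # everywhere would trivialize consistency, so this must stay high.
        preds = [rule(ex) for ex in masked_examples]
        return sum(p is not None for p in preds) / len(preds)

    rng = random.Random(0)
    # Toy distribution over 4 attributes where x3 = x0 AND x1.
    full = []
    for _ in range(1000):
        x = [rng.randint(0, 1) for _ in range(3)]
        full.append(x + [x[0] & x[1]])
    masked = [mask(ex, 0.3, rng) for ex in full]
    rule = conjunction_rule([0, 1])
    print("consistency:", consistency(rule, masked, target=3))
    print("completeness:", completeness(rule, masked))

The sketch exhibits the tension the abstract points to: a rule could achieve perfect consistency simply by abstaining everywhere, which is why definite predictions cannot be withheld freely; and a hiding process that conceals the target attribute exactly where a rule errs can make the rule look perfectly consistent while its true accuracy on the hidden values is discounted, which is the unavoidable gap mentioned above.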
doi:10.1016/j.artint.2010.03.004