Computerizing Mathematical Text with MathLang

Fairouz Kamareddine, J.B. Wells
2008 Electronical Notes in Theoretical Computer Science  
Mathematical texts can be computerized in many ways that capture differing amounts of the mathematical meaning. At one end, there is document imaging, which captures the arrangement of black marks on paper, while at the other end there are proof assistants (e.g., Mizar, Isabelle, Coq, etc.), which capture the full mathematical meaning and have proofs expressed in a formal foundation of mathematics. In between, there are computer typesetting systems (e.g., L A T E X and Presentation MathML) and
more » ... emantically oriented systems (e.g., Content MathML, OpenMath, OMDoc, etc.). The MathLang project was initiated in 2000 by Fairouz Kamareddine and Joe Wells with the aim of developing an approach for computerizing mathematical texts and knowledge which is flexible enough to connect the different approaches to computerization, which allows various degrees of formalization, and which is compatible with different logical frameworks (e.g., set theory, category theory, type theory, etc.) and proof systems. The approach is embodied in a computer representation, which we call MathLang, and associated software tools, which are being developed by ongoing work. Three Ph.D. students (Manuel Maarek (2002 /2007 ), Krzysztof Retel (since 2004, and Robert Lamar (since 2006)) and over a dozen master's degree and undergraduate students have worked on MathLang. The project's progress and design choices are driven by the needs for computerizing real representative mathematical texts chosen from various branches of mathematics. Currently, MathLang supports entry of mathematical text either in an XML format or using the T E X MACS editor. Methods are provided for adding, checking, and displaying various information aspects. One aspect is a kind of weak type system that assigns categories (term, statement, noun (class), adjective (class modifier), etc.) to parts of the text, deals with binding names to meanings, and checks that a kind of grammatical sense is maintained. Another aspect allows weaving together mathematical meaning and visual presentation and can associate natural language text with its mathematical meaning. Another aspect allows identifying chunks of text, marking their roles (theorem, definition, explanation, example, section, etc.), and indicating relationships between the chunks (A uses B, A contradicts B, A follows from B, etc.). Software tool support can use this aspect to check and explain the overall logical structure of a text. Further aspects are being designed to allow adding additional formality to a text such as proof structure and details of how a humanreadable proof is encoded into a fully formalized version (so far this has only been done for Mizar and started for Isabelle). A number of mathematical texts have been computerized, helping with the development of these aspects, and indicating what additional work is needed for the future. This paper surveys the past and future work of the MathLang project.
doi:10.1016/j.entcs.2008.03.063 fatcat:km2j4mpujfarpbhkxrn5evu7eq