The Amsterdam Toolkit for Language Archaeology

Ralf Lämmel
2005 Electronic Notes in Theoretical Computer Science
GRK (the Grammar Recovery Kit) illustrates options for automation and corresponding tool support in the context of developing quality language references that readily cater for the derivation of parsers. GRK provides the proof of concept for two notions: (i) semi-automatic grammar recovery; (ii) language-reference re-engineering. GRK's support for semi-automatic grammar recovery means that GRK can be used to obtain a relatively correct and complete as well as implementable grammar from a language reference. GRK's support for language-reference re-engineering means that GRK can be used to update the original language reference such that it reflects the completed and corrected grammar knowledge. As of today, GRK is particularly fit for Cobol archaeology, more specifically for IBM's VS Cobol II. That is, GRK offers a fully mechanised process, where IBM's reference is used as an input, and the output is a transformed language reference whose grammar portions are correct and complete. (The recovery required several hundred simple transformation steps in order to deliver a grammar that is fit for parser derivation.) As a byproduct, GRK also generates a slow, Prolog-based parser. Via export to GRK's sibling, GDK (the Grammar Deployment Kit), a reasonably fast, btyacc-based parser can be generated as well. Both parsers accept all of the VS Cobol II code that is available to us (several million lines of code).
doi:10.1016/j.entcs.2005.07.004 fatcat:e2rynnu4u5at5kc7nz25syclyy