Pruning DOM Trees for Structured Document Processing

Yasushi Hayashi, Zhenjiang Hu, Masato Takeichi, Nobuaki Wake, Masafumi Hara, Norio Oshima
2004 Conference Proceedings of Japan Society for Software Science and Technology  
PSD (Programmable Structured Document) is a framework in which structured documents are edited efficiently and safely by evaluating the expressions embedded in them. The PSD processing system we are currently developing requires an external evaluator to obtain the DOM data of documents held in the editor. In this work, we propose a method for pruning DOM trees that improves the performance of document manipulation by avoiding unnecessary data communication between the editor and the external evaluator. Based on information about references given by the user, it generates a pruned DOM tree by eliminating from the original DOM tree the parts that the evaluation does not need. The mechanism of tree pruning is explained and its efficiency is evaluated using examples.

* The project is supported by the Comprehensive Development of e-Society Foundation Software of the Ministry of Education, Culture, Sports, Science and Technology, Japan.

... be restricted by this assumption. It could be a language that cannot directly deal with XML as DOM objects. This demand for language independence leads to a PSD system structure with an editor that holds the XML documents and an external evaluator that evaluates the embedded code. For such systems, an effective interface through which the external evaluator can access and manipulate the documents efficiently and flexibly is critical. For example, in TreeCalc, we embed Haskell in XML and make use of an XML editor/viewer, developed in Java, which employs a DOM for editing and presentation.
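As a rough illustration of the pruning idea described in the abstract, the following Python sketch keeps only the referenced nodes (with their subtrees) and the chain of ancestors leading to them. It is not the authors' implementation: the function name `prune`, the encoding of reference information as tuples of child indices, and the sample `sheet`/`row`/`cell` document are all assumptions made for this example.

```python
import copy
import xml.etree.ElementTree as ET


def prune(node, refs, path=()):
    """Return a pruned copy of `node`, or None if nothing under it is referenced.

    `refs` is a set of index paths (tuples of child positions counted from the
    root). This encoding of the user's reference information is an assumption
    of the sketch, not something specified in the source.
    """
    if path in refs:
        # A referenced node is kept together with its whole subtree.
        return copy.deepcopy(node)

    kept = []
    for i, child in enumerate(node):
        pruned_child = prune(child, refs, path + (i,))
        if pruned_child is not None:
            kept.append(pruned_child)

    if not kept:
        # Neither this node nor any descendant is referenced: drop it.
        return None

    # An ancestor of a referenced node survives as a bare skeleton
    # (tag and attributes only, without its own text).
    skeleton = ET.Element(node.tag, dict(node.attrib))
    skeleton.extend(kept)
    return skeleton


# Hypothetical sample document; only the second cell of the first row is referenced.
doc = ET.fromstring(
    "<sheet><row><cell>1</cell><cell>2</cell></row>"
    "<row><cell>3</cell><cell>4</cell></row></sheet>"
)
pruned = prune(doc, {(0, 1)})
print(ET.tostring(pruned, encoding="unicode"))
# prints <sheet><row><cell>2</cell></row></sheet>
```

Note that child positions shift in the pruned tree, so a real system would presumably need to preserve or remap the original positions before sending the result to the evaluator; the sketch leaves that out.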
doi:10.11309/jssstconference. fatcat:wwtvxut2ibflddkbn24d32rugy