Konvertierung von PDF in XML für die Langzeitarchivierung und Weiterverarbeitung [Conversion of PDF to XML for long-term archiving and further processing]

Thomas Bähr, Merle Friedrichsen
2017 ABI-Technik  
Like PDF, XML is media-neutral and platform-independent.  ...  The Technische Informationsbibliothek (TIB) carried out a feasibility analysis of PDF-to-XML conversion.  ...  In: International Journal on Digital Libraries 14,3-4 (2014): 84; and Constantin, Alexandru, Steve Pettifer, Andrei Voronkov: "PDFX. Fully-automated PDF-to-XML Conversion of Scientific Literature."  ... 
doi:10.1515/abitech-2017-0004 fatcat:3tcsxh23bjgevpvm4tovrsn4qm

The Document Components Ontology (DoCO)

Alexandru Constantin, Silvio Peroni, Steve Pettifer, David Shotton, Fabio Vitali, Oscar Corcho
2016 Semantic Web Journal  
The availability in machine-readable form of descriptions of the structure of documents, as well as of the document discourse (e.g. the scientific discourse within scholarly articles), is crucial for facilitating  ...  that rely on DoCO to annotate and retrieve document components of scholarly articles.  ...  Acknowledgements We would like to thank Angelo Di Iorio and Francesco Poggi for their support and contribution for the development of the structural pattern theory summarised in Section 3.1, and for the  ... 
doi:10.3233/sw-150177 fatcat:3bkjh5e3ebelblzsacyvlfqlk4

Structured Affiliations Extraction from Scientific Literature

Dominika Tkaczyk, Bartosz Tarnawski, Łukasz Bolikowski
2015 D-Lib Magazine  
PDFX: fully-automated pdf-to-xml conversion of scientific literature. In ACM Symposium on Document Engineering, pages 177-180, 2013. http://doi.org/10.1145/2494266.2494271 [3] I. G. Councill, C.  ...  PDFX [2] is a rule-based system for converting scholarly articles in PDF format to XML representation by annotating fragments of the input documents.  ... 
doi:10.1045/november2015-tkaczyk fatcat:jqatgvnvkffovfqqnxjzasypym

A Comparison of Two Unsupervised Table Recognition Methods from Digital Scientific Articles

Stefan Klampfl, Kris Jack, Roman Kern
2014 D-Lib Magazine  
PDFX: Fully-automated PDF-to-XML Conversion of Scientific Literature.  ...  It reconstructs the logical structure of scientific articles in PDF format in a rule-based manner and outputs the result in an XML document, including the tables.  ... 
doi:10.1045/november14-klampfl fatcat:6ja7x3llrfhl5m7zu5wjzlsn24

An Ontological Framework for Information Extraction From Diverse Scientific Sources

Gohar Zaman, Hairulnizam Mahdin, Khalid Hussain, Atta-Ur-Rahman, Jemal Abawajy, Salama A. Mostafa
2021 IEEE Access  
The results of the experiment show that the proposed approach is less sensitive to changes in the document format and has a significantly better average accuracy of 89.14% and F-score as 89%.  ...  Although various information extraction techniques have been proposed in the literature, their efficiency demands domain specific documents with static and well-defined format.  ...  ACKNOWLEDGMENT The help of Maliha Omar, Mohib Ullah Khan and Umar Farooq is greatly appreciated.  ... 
doi:10.1109/access.2021.3063181 fatcat:bsoz4v7ndvb7xeltpmvo2wmkym

Mining Zenodo: Data extraction and indexing of a research repository

Pastor, Horacio Saggion
2019 Zenodo  
Due to this scientific literature overload, an exhaustive research of a certain topic becomes overwhelming and researchers cannot get a solid grasp of all this valuable knowledge.  ...  The work will be based on the employment of the Dr Inventor tool (a text mining framework that enables the automated analysis of scientific publications) to support the extraction of specific information  ...  In this case, we are going to distinguish between general-purpose pdf-to-text conversion and pdf-to-text conversion for the special case of scientific publications.  ... 
doi:10.5281/zenodo.3542093 fatcat:n2n3qpxq7fhnzkmq4z3knn63la

BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains

Toshiaki Katayama, Mark D Wilkinson, Kiyoko F Aoki-Kinoshita, Shuichi Kawashima, Yasunori Yamamoto, Atsuko Yamaguchi, Shinobu Okamoto, Shin Kawano, Jin-Dong Kim, Yue Wang, Hongyan Wu, Yoshinobu Kano (+73 others)
2014 Journal of Biomedical Semantics  
specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability  ...  The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual  ...  In confronting this issue, the BioHackers worked on a novel software project called PDFX [31, 32], which automatically converts the PDF scientific articles to XML form.  ... 
doi:10.1186/2041-1480-5-5 pmid:24495517 pmcid:PMC3978116 fatcat:hyb4sddegjbjncdhugr5lm4wu4