Representing and Utilising Knowledge for Understanding Structured Documents

Thomas Bayer
1992 IAPR International Workshop on Machine Vision Applications  
This paper presents a document analysis system which is capable of extracting the semantics of specific text portions of structured documents.  ...  The flexibility of the representation formalism Fresco and the properties of the inference algorithm are shown in two different applications, in interpreting amount fields on cheques and in analysing business  ...  The task has been to extract the amount field from the bitmap and to recognise the amount completely. The digits of the amount are handprinted and are composed in very different styles.  ... 
dblp:conf/mva/Bayer92 fatcat:7pnjspgixzdilct67q333z6fau

Automatic metadata extraction and indexing for reusing e-learning multimedia objects

Paolo Bolettieri, Fabrizio Falchi, Claudio Gennaro, Fausto Rabitti
2007 Workshop on multimedia information retrieval on The many faces of multimedia semantics - MS '07  
The objective is to demonstrate the reuse of digital content, as video documents or PowerPoint presentations, by exploiting existing technologies for automatic extraction of metadata (OCR, speech recognition  ...  In this paper we present the architecture of a Digital Library for enabling the reusing of audiovisual documents in an e-Learning context.  ...  CONCLUSIONS Although from the theoretical point of view the idea of using automatic tools for the extraction and the enhancement of metadata in the field of the digital libraries is not at all new, it  ... 
doi:10.1145/1290067.1290072 dblp:conf/mm/BolettieriFGR07 fatcat:a6zpvazmabhf5oebqj3ziv7uzi

Robotic Process Automation and Artificial Intelligence in Industry 4.0 – A Literature review

Jorge Ribeiro, Rui Lima, Tiago Eckhardt, Sara Paiva
2021 Procedia Computer Science  
as digital services.  ...  techniques for the extraction of information and consequent process of optimization and of forecasting scenarios in improving the operational and business processes of organizations.  ...  natural language processing for the extraction of information from documents and consequently improve efficiency in document validation.  ... 
doi:10.1016/j.procs.2021.01.104 fatcat:2znmc6loyzemzjfvepvvgqtba4

A Layout-Free Method for Extracting Elements from Document Images [chapter]

Tsukasa Kochi, Takashi Saitoh
1999 Lecture Notes in Computer Science  
By using 145 pages of documents as a learning set, the system recognized 99.2% of feature sets from 148 various types of unknown documents. S.-W. Lee and Y.  ...  SGML is a language for defining the layout structure of a document. Various attempts at generating SGML from a document image have not been successful.  ...  Koichi Ejiri and Dr. Hirobumi Nishida for their valuable comments on earlier drafts of this paper.  ... 
doi:10.1007/3-540-48172-9_18 fatcat:su33lqhwlrggbhwf7yb2xjndsi

Information and logical modeling in construction

Myasnikov Alexey Georgievich
2020 International Journal of Advanced Trends in Computer Science and Engineering  
The approach of parameterization and identification of information-logical models, their evolution during the life cycle of construction, from design to operation, is justified.  ...  The article discusses the tasks and peculiarities of the system approach to information and logical modeling in construction.  ...  For example, it's necessary for selection of technologies, tools for identification of infological models of urban planning, extraction of information from models for decision-making.  ... 
doi:10.30534/ijatcse/2020/46912020 fatcat:er2u5a2zazhlroqwryojzek44y

Recognition of Tables and Forms [chapter]

Bertrand Coüasnon, Aurélie Lemaitre
2014 Handbook of Document Image Processing and Recognition  
ABBYY FlexiCapture: ABBYY FlexiCapture processes business documents and can extract data from forms. This extraction is done automatically after a form definition and configuration.  ...  Not automatically extracted Logical structure extraction of hierarchical tables.  ...  The survey proposed by Dengel in [56] presents several challenges that are related to document and table analysis. In 2004, Zanibbi et al. [5] provided a complete survey of table recognition.  ... 
doi:10.1007/978-0-85729-859-1_20 fatcat:lxyn3pcn2zehvk4zw5ydoidewa

Improvement of Business Productivity by Applying Robotic Process Automation

Younggeun Hyun, Dongseop Lee, Uri Chae, Jindeuk Ko, Jooyeoun Lee
2021 Applied Sciences  
It is clearly shown that CoPA as a business RPA can improve business productivity in terms of time consumption and document quality.  ...  Digitalization has been bringing about various changes and innovations not only in our daily life but also in our business environment.  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/app112210656 fatcat:iar3xkj2fvcp3ots3272t6rdeq

Genre Classification in Automated Ingest and Appraisal Metadata [chapter]

Yunhyong Kim, Seamus Ross
2006 Lecture Notes in Computer Science  
We have segmented the problem and this paper discusses results in genre classification as a first step toward automating metadata extraction from documents.  ...  Metadata needed to document and manage digital materials are extensive and manual creation of them expensive.  ...  Additional support for this research comes from the DELOS: Network of Excellence on Digital Libraries (G038-507618) funded under the European Commission's IST 6th Framework Programme [12].  ... 
doi:10.1007/11863878_6 fatcat:enze3w32efcjrpchcldz6eezj4

RegMiner: Taming the Complexity of Regulatory Documents for Digitalized Compliance Management

Karolin Winter, Manuel Gall, Stefanie Rinderle-Ma
2020 International Conference on Business Process Management  
By employing NLP and data mining techniques, compliance constraints can be automatically extracted, grouped, and visualized leading to a separation of relevant and nonrelevant document parts and insights  ...  Business process compliance has become a crucial aspect for companies due to severe fines that can be imposed if constraints and rules emerging from regulatory documents are violated.  ...  Logic Tier The logic tier is written in Python 3 and consists of two components.  ... 
dblp:conf/bpm/WinterGR20 fatcat:mvudcdt245eu7damd2ip2ct2ku

Introduction to Document Analysis and Recognition [chapter]

Simone Marinai
2008 Studies in Computational Intelligence  
Recent research directions are widening the use of the DAR techniques, significant examples are the processing of ancient/historical documents in digital libraries, the information extraction from "digital  ...  Layout analysis methods are aimed at extracting the physical and/or logical structure of the document image.  ...  Layout analysis methods are aimed at extracting the physical and/or logical structure of the document image.  ... 
doi:10.1007/978-3-540-76280-5_1 fatcat:hsy7sjlx2bfb5lf5fkp6ix34ji

A survey of document image classification: problem statement, classifier architecture and performance evaluation

Nawei Chen, Dorothea Blostein
2006 International Journal on Document Analysis and Recognition  
Document image classification is an important step in Office  ...  Logical layout analysis (also called logical labeling) extracts logical structure: a hierarchy of logical objects, based on the human-perceptible meaning of the document contents [54].  ...  [21] use Inductive Logic Programming to induce a set of rules from a set of labeled training samples. There are challenges in automatically learning models from training samples.  ... 
doi:10.1007/s10032-006-0020-2 fatcat:2ssef27glvh7dik37emkr4zpd4


Shubham Nagmoti, Kapil Bhoyar, Shantanu Raut, Saransh Jamgade, Nikhil Mangrulkar, Aniket Pathade
2021 Journal of research in engineering and applied sciences  
Nowadays paperless offices and digitized documents are becoming ordinary for every kind of business or work. It is a good idea to find an easy way to create, store, and protect important documents.  ...  In this paper we propose document scanning in terms of a software interface, i.e. a web application that performs automated digitization of documents with various features such as image enhancement  ...  Using the Tesseract module, text is extracted from the input image.  ... 
doi:10.46565/jreas.2021.v06i02.008 fatcat:5fq46lsglfhvhmj7xq2il7473e

Preservation Metadata: National Library of New Zealand Experience

Steve Knight
2005 Library Trends  
to support their long-term goal of preserving digital assets in perpetuity.  ...  Development of approaches to preservation metadata has been an integral component of international efforts in the field of digital preservation.  ...  Acknowledgments I would like to thank Seamus Ross (Director, Humanities Computing and Information Management, University of Glasgow) and Frank Bischoff  ... 
doi:10.1353/lib.2006.0003 fatcat:m5xb7b5vffen3dsxjabub5vtka

Content features for logical document labeling

Jian Liang, David S. Doermann, Tapas Kanungo, Elisa H. Barney Smith, Jianying Hu, Paul B. Kantor
2003 Document Recognition and Retrieval X  
The use of content features extracted from recognized text is valuable in labeling logical elements in documents without rigid layout structure, like business letters.  ...  Models are automatically initialized and adaptively improved using training samples. Satisfactory experimental results are presented.  ...  In work on extracting logical objects from business letters [1,2], Bayer and Walischewski used dictionaries and string-match functions to identify Opening/Closing regions or a month within a Date.  ... 
doi:10.1117/12.476061 dblp:conf/drr/LiangD03 fatcat:giudpupfdvgxre2wjqjmynaasm

Technical forum: Using logical data models for understanding and transforming legacy business applications

Satish Chandra, Jackie de Vries, John Field, Howard Hess, Manivannan Kalidasan, Komondoor V. Raghavan, Frans Nieuwerth, Ganesan Ramalingam, Justin Xue
2006 IBM Systems Journal  
ACKNOWLEDGMENTS We thank an anonymous referee for bringing some related work, as well as Fred Brooks' quotation at the beginning of the article, to our attention.  ...  In addition, we outline the goals and status of the Mastery project at IBM Research, which aims to build a suite of tools for automatically extracting logical models from legacy applications, focusing  ...  Our long-term goal is also to address process-model and business-rule extraction, in addition to data-model extraction.  ... 
doi:10.1147/sj.453.0647 fatcat:ovocvez7q5bipcty7fq6zaukkm