Parsing Ink Annotations on Heterogeneous Documents [article]

Xin Wang, Michael Shilman, Sashi Raghupathy
2006 Sketch-Based Interfaces and Modeling  
First, our approach handles annotations on ink notes, which are significantly more ambiguous than annotations on printed documents and hence more difficult to recognize.  ...  Annotation is an integral part of reading, comprehending, commenting, and authoring notes and documents.  ...  Peter Slavik for discussion on Gestures; Benoit Jurion and Marie Millet for many discussions on the definition and user scenarios of annotations and their efforts on data collection; Forrest Oswald, Chengyang  ... 
doi:10.2312/sbm/sbm06/043-050 fatcat:x2sko3xhyva6lgasjdaqrzyznq

Grouping text lines in freeform handwritten notes

M. Ye, H. Sutanto, S. Raghupathy, C. Li, M. Shilman
2005 Eighth International Conference on Document Analysis and Recognition (ICDAR'05)  
On average, the proposed technique processes each note page in less than a second at a 90% accuracy.  ...  Handwritten text lines are prominent structures in freeform digital ink notes and their reliable detection is the foundation to a natural and intelligent interface for note editing and repurposing.  ...  The strong assumptions DIA makes about document regularity do not hold true for digital ink which is far more variable and heterogeneous [7, 1].  ... 
doi:10.1109/icdar.2005.121 dblp:conf/icdar/YeSRLS05 fatcat:ytaothoxfbd4tdsfgo27i6zlfu

Identifying Useful Passages in Documents Based on Annotation Patterns [chapter]

Frank Shipman, Morgan Price, Catherine C. Marshall, Gene Golovchinsky
2003 Lecture Notes in Computer Science  
Based on this study we have designed a mark parser that analyzes freeform digital ink to identify such high-value annotations.  ...  parts of the documents.  ...  These types of marks are based on opaque digital ink; highlights are similar to underlines using translucent ink and circles are looped text in either opaque or translucent ink - the bottom of a circle  ... 
doi:10.1007/978-3-540-45175-4_11 fatcat:5vsf7quux5bmvk6xrr2h2qpcha

Peer-to-peer ink messaging across heterogeneous devices and platforms

Manoj Prasad A, Muthuselvam Selvaraj, Sriganesh Madhvanath
2008 Proceedings of the 1st Bangalore annual Compute conference on - Compute '08  
We also plan to explore the transmission of digital ink annotations of images and text documents along with the underlying content, and voice as an additional modality apart from ink.  ...  In order to provide digital ink-based instant messaging capability in a heterogeneous environment, one must necessarily address the following issues: (i) Representation of digital ink captured so that  ... 
doi:10.1145/1341771.1341798 dblp:conf/compute/ASM08 fatcat:oawjgmsjr5bolit75qvq2gkqvi

Natural language processing systems for pathology parsing in limited data environments with uncertainty estimation

Anobel Y Odisho, Briton Park, Nicholas Altieri, John DeNero, Matthew R Cooperberg, Peter R Carroll, Bin Yu
2020 JAMIA Open  
Materials and methods Our data comes from the Urologic Outcomes Database at UCSF which includes 3232 annotated prostate cancer pathology reports from 2001 to 2018.  ...  Conclusions We find that when applying machine learning to pathology parsing, large datasets may not always be needed, and that calibration methods can improve the reliability of uncertainty estimates.  ...  Of note, there was heterogeneity in report structure and style over 20 years.  ... 
doi:10.1093/jamiaopen/ooaa029 pmid:33381748 pmcid:PMC7751177 fatcat:l2sxnhvpojg5zb3tsrfnesmf74

Paper-digital meeting support and review

Adriana Ispas, Nan Li, Moira C. Norrie, Beat Signer
2010 Proceedings of the 6th International ICST Conference on Collaborative Computing: Networking, Applications, Worksharing  
work in digitally enhanced interaction, private and shared documents as well as pre- and in-shared spaces.  ...  information created in meetings. digital media, such as Anoto's digital pen and paper technology However, personal notes are limited in terms of providing facilitate the integration of information captured on  ...  In the case of ink  ... 
doi:10.4108/icst.collaboratecom.2010.29 dblp:conf/colcom/IspasLNS10 fatcat:rksfbxihtjbh3iveq2lqxsasbm

A Streaming Digital Ink Framework for Multi-party Collaboration [chapter]

Rui Hu, Vadim Mazalov, Stephen M. Watt
2012 Lecture Notes in Computer Science  
Sessions may be recorded and stored for later playback, analysis or annotation.  ...  The digital ink stream is transmitted as InkML, allowing special recognizers for different content types, such as mathematics and diagrams.  ...  As the stream is received by a participant, InkChat immediately parses the stream and saves the strokes to the current ink session.  ... 
doi:10.1007/978-3-642-31374-5_6 fatcat:qjgrynk4pndbtobs6talvlkdke

Recognition of Tables and Forms [chapter]

Bertrand Coüasnon, Aurélie Lemaitre
2014 Handbook of Document Image Processing and Recognition  
However, it is dedicated for documents with Manhattan layout and may not work for complex documents with heterogeneous arrangements.  ...  datasets with ground truth (see Chap. 29 (Datasets and Annotations for Document Analysis and Recognition)), as different authors pointed it out.  ...  To know more about the work that has been achieved on this topic, one may read some state of the art that are dedicated to table and form analysis.  ... 
doi:10.1007/978-0-85729-859-1_20 fatcat:lxyn3pcn2zehvk4zw5ydoidewa

RiverInk--An Extensible Framework for Multimodal Interoperable Ink

Jonathan Neddenriep, William Griswold
2007 2007 40th Annual Hawaii International Conference on System Sciences (HICSS'07)  
However, prevailing ink representations are not compatible across devices or even within vendors, compromising interoperability and hence true ubiquity.  ...  This paper motivates the interoperability problems created by ubiquity, and then describes the design of RiverInk's format, APIs, and ink controls.  ...  Figure 10. Ink annotations composited on a professor's slide in ActiveClass.  ... 
doi:10.1109/hicss.2007.470 dblp:conf/hicss/NeddenriepG07 fatcat:irqdeq2jmbhdlox5ocggbfixiy

The Labeled Segmentation of Printed Books

Lara McConnaughey, Jennifer Dai, David Bamman
2017 Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing  
We introduce the task of book structure labeling: segmenting and assigning a fixed category (such as TABLE OF CONTENTS, PREFACE, INDEX) to the document structure of printed books.  ...  We manually annotate the page-level structural categories for a large dataset totaling 294,816 pages in 1,055 books evenly sampled from 1750-1922, and present empirical results comparing the performance  ...  Data In order to support the analysis and prediction of labeled document structure, we present a manually annotated dataset of 1,055 books, where each page has been labeled according to one of 10 categories  ... 
doi:10.18653/v1/d17-1077 dblp:conf/emnlp/McConnaugheyDB17 fatcat:jgmjmeol2nfmzoigmy25vbrcha

A synthetic document image dataset for developing and evaluating historical document processing methods

Daniel Walker, William Lund, Eric Ringger, Christian Viard-Gaudin, Richard Zanibbi
2012 Document Recognition and Retrieval XIX  
Keywords: synthetic document images, OCR, datasets, document degradation models, historical document processing * Due to our agreement with the LDC, only the raw corrupted data, and not the topic annotations  ...  Additionally, research into improving the performance of such methods often requires further annotation of training and test data (e.g., topical document labels).  ...  For instructions on how to download the synthetic datasets, the code used to produce it, the Eisenhower Communiqués, and document image samples please visit: https://facwiki.cs.byu.edu/nlp/index.php/Synthetic_OCR_Data  ... 
doi:10.1117/12.912203 dblp:conf/drr/WalkerLR12 fatcat:aliqsvo4jze3fkwxyjssguncuu

CArDIS: A Swedish Historical Handwritten Character and Word Dataset

Amir Yavariabdi, Huseyin Kusetogullari, Turgay Celik, Shivani Thummanapally, Sakib Rijwan, Johan Hall
2022 IEEE Access  
The samples in CArDIS are collected from 64, 084 Swedish historical documents written by several anonymous priests between 1800 and 1900.  ...  The experiments show that the machine learning methods trained on existing handwritten character datasets struggle to recognize characters efficiently on the CArDIS dataset, proving that characters in  ...  RELATED WORK OCR is one of the leading research topics in pattern recognition, and it has been widely used to recognize handwritten or machine-printed characters in document images collected from heterogeneous  ... 
doi:10.1109/access.2022.3175197 fatcat:3vab2zq3srapzjto6n3xa7pwne

Automatic negation detection in narrative pathology reports

Ying Ou, Jon Patrick
2015 Artificial Intelligence in Medicine  
global view of the entire document.  ...  Detailed annotation schemas and guidelines were developed in an iterative process to ensure annotation consistency.  ...  to a document rather than annotating the whole document from scratch.  ... 
doi:10.1016/j.artmed.2015.03.001 pmid:25990897 fatcat:yrijkncnsvht7lonmaqt7uyya4

Modeling the Complexity of Music Metadata in Semantic Graphs for Exploration and Discovery

Pasquale Lisena, Raphaël Troncy, Konstantin Todorov, Manel Achichi
2017 Proceedings of the 4th International Workshop on Digital Libraries for Musicology - DLfM '17  
controlled vocabularies that provide common identifiers that overcome the differences in language and alternative forms of needed concepts. These graphs are interlinked to each other and to external resources on  ...  Each triplet contains an information that at the same time can live autonomously and be linked to the other entities. Thinking about a classic work, we will have a triplet for the composition, one for any  ...  On the data side, we are working on the improvement of the parsing of the data using Named Entity Recognition (NER) techniques, that will link also the DOREMUS data to external LOD datasets, like DBpedia  ... 
doi:10.1145/3144749.3144754 dblp:conf/ismir/LisenaTTA17 fatcat:jusuqa4o4fgovo7aot2znkoxaa

Towards an Interoperable Digital Scholarly Edition

Desmond Schmidt
2014 Journal of the Text Encoding Initiative  
And annotation should point to the document, not the other way around. Otherwise, any alteration to the annotations will break the document.  ...  The latter are now ubiquitous on the Web and have the advantage of offering a forgiving syntax. Rather than parsing the user-input formally, they convert it into approximate HTML, then "tidy" it into  ... 
doi:10.4000/jtei.979 fatcat:sosd25lqufgsbdqlewikdysmhm