Chinese Clinical Named Entity Recognition with ALBERT and MHA Mechanism
2022
Evidence-Based Complementary and Alternative Medicine
Furthermore, we use the RAdam optimizer to boost the convergence speed and improve the generalization ability of our model. ...
We propose a model based on ALBERT and a multihead attention (MHA) mechanism to solve this problem. ...
BERT's size with an acceptable tradeoff on performance to speed up the training progress. ...
doi:10.1155/2022/2056039
pmid:35656458
pmcid:PMC9152388
fatcat:7b7t57cw6rfkporsikwiucy34a
Machine printed text and handwriting identification in noisy document images
2004
IEEE Transactions on Pattern Analysis and Machine Intelligence
We are especially focused on segmenting and identifying between handwriting and machine printed text because: 1) Handwriting in a document often indicates corrections, additions, or other supplemental ...
A Markov Random Field-based (MRF) approach is used to model the geometrical structure of the printed text, handwriting, and noise to rectify misclassifications. ...
This research is supported by the US Department of Defense under contract MDA-9040-2C-0406. ...
doi:10.1109/tpami.2004.1262324
pmid:15376881
fatcat:lgsgdijwtfdhhpx5sugjmpsaiu
Feature extraction using Radon transform and Discrete Wavelet Transform for facial emotion recognition
2016
2016 2nd IEEE International Symposium on Robotics and Manufacturing Automation (ROMA)
The classification stage is designed by using a minimum distance classifier depending on Euclidean Distance which has a high speed performance. ...
The system design also includes a modest postprocessing stage that makes a consistency between the recognized characters within the same word in relation to their upper and lower cases. The overall classification ...
[22] uses a mixture of 128 Gaussians associated to each state position of contextual models (trigraphs) corresponding to the same base character. ...
doi:10.1109/roma.2016.7847840
fatcat:kq2td2qtpvcafcwq7wbaqs6w7q
HIDDEN MARKOV MODELS IN TEXT RECOGNITION
1995
International journal of pattern recognition and artificial intelligence
A multi-level multifont character recognition is presented. The system proceeds by first delimiting the context of the characters. ...
The system finally uses combinations of stochastic and dictionary verification methods for word recognition and error-correction. Abstract ...
This process speeds up the character recognition and word reconstruction processes. The second level of the global recognition process uses several methods for contextual postprocessing. ...
doi:10.1142/s0218001495000389
fatcat:qintuy3ytjantfo36zvdmnx6ti
Online recognition of Chinese characters: the state-of-the-art
2004
IEEE Transactions on Pattern Analysis and Machine Intelligence
The recognition of Chinese characters is different from western handwriting recognition and poses a special challenge. ...
The research works are reviewed in terms of pattern representation, character classification, learning/adaptation, and contextual processing. ...
This two-stage recognition strategy has been widely adopted by now, though tree classification and multistage classification can further speed up the recognition. ...
doi:10.1109/tpami.2004.1262182
pmid:15376895
fatcat:j75xqkn3kvdgxo6yyu526xhsvi
Document Retrieval: Expertise in Identifying Relevant Documents
1990
IEEE Data Engineering Bulletin
All letters to the Editor will be considered for publication unless accompanied by a request to the contrary. ...
The next paper, by Faloutsos, addresses the other end of the search problem: how to efficiently search a large number of documents. ...
Acknowledgments The description of this research was put together with the help of students in the lit Laboratory: David Lewis, Bob Krovetz, Howard Turtle, and Raj Das. ...
dblp:journals/debu/Smith90
fatcat:dlyur6m4wjdylanby4cy23jouu
Scene Text Recognition Using Similarity and a Lexicon with Sparse Belief Propagation
2009
IEEE Transactions on Pattern Analysis and Machine Intelligence
Inference is accelerated with sparse belief propagation, a bottom-up method for shortening messages by reducing the dependency between weakly supported hypotheses. ...
and sparse belief propagation reduces the lexicon words considered by 99.9% with a 12X speedup and no loss in accuracy. ...
This work was supported in part by The Central Intelligence Agency, the National Security Agency, and National Science Foundation under NSF grants IIS-0100851, IIS-0326249 and IIS-0546666. ...
doi:10.1109/tpami.2009.38
pmid:19696446
pmcid:PMC3021989
fatcat:row3pqfhtjbw5fr6ctpvmk3mf4
A modular framework for biomedical concept recognition
2013
BMC Bioinformatics
Neji provides fast and multi-threaded data processing, annotating up to 1200 sentences/second when using dictionary-based concept identification. ...
Concept recognition is provided through dictionary matching and machine learning with normalization methods. ...
First of all, the large dictionaries used in our experiments, in combination with the achieved processing speeds, are good indicators of the scalability of the presented solution. ...
doi:10.1186/1471-2105-14-281
pmid:24063607
pmcid:PMC3849280
fatcat:xmjcjbm74jbudpf3gcqjiib46m
Integrated Sequence Tagging for Medieval Latin Using Deep Representation Learning
[article]
2017
arXiv
pre-print
Apart from the problems with out-of-lexicon items, error percolation is a major downside of such approaches. ...
For example, a lexicon is used to generate all the potential lemma-tag pairs for a token, and next, a context-aware PoS-tagger is used to select the most appropriate tag-lemma pair. ...
Latin Literature: A Stylometric Approach to Gender, Synergy and Authority", funded by the BOF research fund in Ghent. ...
arXiv:1603.01597v2
fatcat:5sz75lcvsfcdfj52gedprjoegq
Integrated Sequence Tagging for Medieval Latin Using Deep Representation Learning
2017
Journal of Data Mining and Digital Humanities
Apart from the problems with out-of-lexicon items, error percolation is a major downside of such approaches. ...
For example, a lexicon is used to generate all the potential lemma-tag pairs for a token, and next, a context-aware PoS-tagger is used to select the most appropriate tag-lemma pair. ...
Latin Literature: A Stylometric Approach to Gender, Synergy and Authority", funded by the BOF research fund in Ghent. ...
doi:10.46298/jdmdh.1398
fatcat:snafmx56lna5tllihxhp6a2kj4
OCR4all – An Open-Source Tool Providing a (Semi-)Automatic OCR Workflow for Historical Printings
[article]
2019
arXiv
pre-print
The drawback of these tools often is their limited applicability by non-technical users like humanist scholars and in particular the combined use of several tools in a workflow. ...
Experiments showed that users with minimal or no experience were able to capture the text of even the earliest printed books with manageable effort and great quality, achieving excellent character error ...
Furthermore, we would like to thank the Opera Camerarii team around Thomas Baier, Marion Gindhart, Joachim Hamm, and Ulrich Schlegelmilch for providing a valuable and challenging use case and test object ...
arXiv:1909.04032v1
fatcat:czzg6o6i5baxdcnsc2cacm5xmy
Effective Phrase Prediction
2007
Very Large Data Bases Conference
There are two main challenges: one is that the number of phrases (both the number possible and the number actually observed in a corpus) is combinatorially larger than the number of words; the second is ...
that a "phrase", unlike a "word", does not have a well-defined boundary, so that the autocompletion system has to decide not just what to predict, but also how far. ...
For single word completion, typical techniques involve building a dictionary of all words and possibly coding this as a trie (or suffix-tree), with each node representing one character and each root-leaf ...
dblp:conf/vldb/NandiJ07
fatcat:uucuy2ft5reotemzqdyr4lpivy
Pharmacovigilance Using Clinical Notes
2013
Clinical Pharmacology and Therapeutics
result from concomitant use, with an estimated 29.4% of elderly patients on six or more drugs. ...
Phase IV surveillance is a critical component of drug safety because not all safety issues associated with drugs are detected before market approval. ...
(a) A discharge summary is encoded internally using (b) a highly compressed, numerical representation. ...
doi:10.1038/clpt.2013.47
pmid:23571773
pmcid:PMC3846296
fatcat:4mzb4grjqvdpbdh62zen33kbmi
BioCreative II Workshop Proceedings
2007
Zenodo
with Combinations of Conditional Random Fields; Gene Mention Recognition Using Lexicon Match Based Two-Layer Support Vector Machines; Using Semi-Supervised Techniques to Detect Gene Mentions; BioCreative ...
by Alma Bioinformatics; Peregrine: Lightweight gene name normalization by dictionary lookup; Gene Mention and Gene Normalization Based on Machine Learning and Online Resources; Me and my friends ...
The AVIDD Linux Clusters used in our analysis are funded in part by NSF Grant CDA-9601632.
Acknowledgments We thank the two anonymous BioCreative reviewers for their insightful feedback. ...
doi:10.5281/zenodo.4274543
fatcat:3sa3fvgngffjrblxzgswof42tq
Automatic Methods and Neural Networks in Arabic Texts Diacritization: A Comprehensive Survey
2021
IEEE Access
Arabic diacritics are signs used in Arabic orthography to represent essential morphophonological and syntactic information. It is a common practice to leave out those diacritics in written Arabic. ...
They are often accompanied by either rule-based or neural networks in hybrid systems. ...
Two methods for creating ambiguity dictionaries from undiacritized text were proposed. They are using a morphological analyzer and unsupervised sense induction. ...
doi:10.1109/access.2021.3122977
fatcat:tux7f5c5b5emfozxqvfx75kcxq