Distance Based Strategy for Supervised Document Image Classification
[chapter]
2004
Lecture Notes in Computer Science
This paper deals with supervised document image classification. An original distance based strategy allows automatic feature selection. ...
Each iteration of the classification algorithm computes the distance d between the image to be classified and the chosen representative. ...
Our strategy simultaneously performs the feature selection for a given problem and the document classification. It is based on the computation of distance between documents. ...
doi:10.1007/978-3-540-27868-9_98
fatcat:4eurp6cftbb7jf34gdgctavyj4
IEEE Access Special Section Editorial: Data Mining and Granular Computing in Big Data and Knowledge Processing
2019
IEEE Access
ontologies are built by reusing and adapting the existing public categories of Chinese judgment documents and the WMD-based similarity computation was made for KNN based document classification. ...
Big data mining relies on distributed computational strategies; it is often impossible to store and process data on one single computing node. ...
doi:10.1109/access.2019.2908776
fatcat:7km2edtcuzeutnwy3pjbvg264e
A review of feature selection methods with applications
2015
2015 38th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)
The usual applications of FS are in classification, clustering, and regression tasks. This review considers most of the commonly used FS techniques. Particular emphasis is on the application aspects. ...
Since exhaustive search for optimal feature subset is infeasible in most cases, many search strategies have been proposed in literature. ...
ACKNOWLEDGEMENTS This work has been supported in part by the Croatian Science Foundation, within the project "De-identification Methods for Soft and Non-Biometric Identifiers" (DeMSI, UIP-11-2013-1544 ...
doi:10.1109/mipro.2015.7160458
dblp:conf/mipro/JovicBB15
fatcat:hrqcsfltbzg4vnnwxy3wr3ju4a
Document image retrieval based on texture features and similarity fusion
2016
2016 International Conference on Image and Vision Computing New Zealand (IVCNZ)
The similarity distances between each of the two feature vectors extracted for a given query and the feature vectors extracted from the document images in the training step are computed separately. ...
The document images are finally ranked based on the greatest visual similarity to the query obtained from the fusion similarity measures. ...
Creation of Knowledge-based Feature
are the corresponding weights for the distances computed based on the classifiers. ...
doi:10.1109/ivcnz.2016.7804437
dblp:conf/ivcnz/AlaeiABP16
fatcat:oeqvvjrklbb6vo5eypkmfofcfq
A Graph Lattice Approach to Maintaining Dense Collections of Subgraphs as Image Features
2011
2011 International Conference on Document Analysis and Recognition
Document classification and indexing methods depend on having informative image features. ...
Each feature is itself a subgraph, and a feature vector is a count of occurrences of subgraphs in the image. ...
Thank you to Fang Liu for discussions and for building a preliminary implementation of the graph lattice machinery. ...
doi:10.1109/icdar.2011.216
dblp:conf/icdar/Saund11
fatcat:favkvvory5b5xphnqav5dvcrf4
Unsupervised Classification of Structurally Similar Document Images
2013
2013 12th International Conference on Document Analysis and Recognition
The approach is based on multiple levels of content and structure. At a local level, a bag-of-visual words based on SURF features provides an effective way of computing content similarity. ...
In this paper, we present a learning based approach for computing structural similarities among document images for unsupervised exploration in large document collections. ...
[15] proposed a measure based on minimum edit-distance. ...
doi:10.1109/icdar.2013.248
dblp:conf/icdar/KumarD13
fatcat:tqw2frz4qzcvndlkfewc2oa4ka
Logo Recognition Based on the Dempster-Shafer Fusion of Multiple Classifiers
[chapter]
2013
Lecture Notes in Computer Science
In order to reduce recognition error, a powerful combination strategy based on the Dempster-Shafer theory is utilized to fuse the three classifiers trained on different sources of information. ...
However, the potential improvement in classification through feature fusion by ensemble-based methods has remained unattended. ...
The successful recognition of logos facilitates automatic classification of source documents, which is considered a key strategy for document image analysis and retrieval. ...
doi:10.1007/978-3-642-38457-8_1
fatcat:b3v5mmqnk5g27jog5seylp6tgm
kNN based image classification relying on local feature similarity
2010
Proceedings of the Third International Conference on SImilarity Search and APplications - SISAP '10
In this paper, we propose a novel image classification approach, derived from the kNN classification strategy, that is particularly suited to be used when classifying images described by local features ...
than similarity between images, opening up new opportunities to investigate more efficient and effective strategies. ...
Keypoints are selected by choosing the most stable points from a set of candidate locations. Each keypoint in an image is associated with one or more orientations, based on local image gradients. ...
doi:10.1145/1862344.1862360
dblp:conf/sisap/AmatoF10
fatcat:lencmytdungxhjdn65oo4tf5j4
A Graph Lattice Approach to Maintaining and Learning Dense Collections of Subgraphs as Image Features
2013
IEEE Transactions on Pattern Analysis and Machine Intelligence
Effective object and scene classification and indexing depend on extraction of informative image features. ...
Further performance gains are achieved on a more difficult dataset using a feature voting method and feature selection procedure. ...
Appreciation is also due to Fang Liu for discussions and for building a preliminary implementation of the graph lattice machinery. ...
doi:10.1109/tpami.2012.267
pmid:23267200
fatcat:2324ywfd5fgx7lmaylfcsoexji
Machine Learning of Generalized Document Templates for Data Extraction
[chapter]
2002
Lecture Notes in Computer Science
When comparing document images based on visual similarity, it is difficult to determine the correct scale and features for document representation. ...
Feature selection is used to reduce the dimensionality and redundancy of the size distributions, while preserving the essence of the visual appearance of a document. ...
Research is currently focused on feature selection strategies which also (re-)introduce spatial information into the size distribution representation. ...
doi:10.1007/3-540-45869-7_48
fatcat:eta3chv4yvbo3mp4wuwlnljiqq
SwiftLink: Serendipitous Navigation Strategy for Large-Scale Document Collections
2012
2012 23rd International Workshop on Database and Expert Systems Applications
The multiplication of large-scale document collections has created the need for robust and adaptive access strategies in many applicative areas. ...
In this paper, we depart from the traditional document search paradigm to move onto the construction of a collection navigation strategy. ...
EVALUATION STRATEGY AND EXPERIMENTS
A. Dataset: We consider images as documents here, but our model readily applies to all types of documents, using appropriate features. ...
doi:10.1109/dexa.2012.52
dblp:conf/dexaw/WylM12
fatcat:qt4upuuxjfaytcnjfeyzshcbie
Automatic Document Logo Detection
2007
Proceedings of the International Conference on Document Analysis and Recognition
At a coarse scale, a trained Fisher classifier performs initial classification using features from document context and connected components. ...
In this paper, we propose a new approach to logo detection and extraction in document images that robustly classifies and precisely localizes logos using a boosting strategy across multiple image scales ...
Figs. 2(b) and 2(c) show computed grayscale blobs at scale level σ = 8 and 16, respectively. We select the initial coarse scale σ_n based on the resolution of the input image. ...
doi:10.1109/icdar.2007.4377038
dblp:conf/icdar/ZhuD07
fatcat:zgqnkiyuevgtbgxmrm3bdgzso4
Asymmetric Learning and Dissimilarity Spaces for Content-Based Retrieval
[chapter]
2006
Lecture Notes in Computer Science
The proposed approach is evaluated on both artificial data and a real image database, and compared with state-of-the-art algorithms. This work is funded by the Swiss NCCR (IM)2 (Interactive Multimodal Information ...
This classification problem is known to be asymmetric, i.e. the negative class does not cluster in the original feature spaces. ...
Image retrieval: A final evaluation is conducted on a Corel image subset. The feature space consists of a 64 RGB histogram and embeds 18521 images annotated by several keywords. ...
doi:10.1007/11788034_34
fatcat:gkm2chboqbbuhbqyol4eelivpe
Fast Rule-Line Removal Using Integral Images and Support Vector Machines
2011
2011 International Conference on Document Analysis and Recognition
We use an integral-image representation which allows fast computation of features and apply techniques for large scale Support Vector learning using a data selection strategy to sample a small subset of ...
In this paper, we present a fast and effective method for removing pre-printed rule-lines in handwritten document images. ...
The main bottleneck of taking a pixel-based classification approach for rule-line removal is the feature computation and classification time for each pixel. ...
doi:10.1109/icdar.2011.123
dblp:conf/icdar/KumarD11
fatcat:iygkcvh3kvexfak7yswpski3ca
Comparing representative selection strategies for dissimilarity representations
2006
International Journal of Intelligent Systems
Several alternative representative strategies are proposed and empirically evaluated on a set of term vectors constructed from HTML documents. ...
, and when the representatives are selected randomly, the time required to create the embedded space is significantly reduced, also with a small penalty in accuracy. ...
Note that the outlier-based selection strategies take longer than the random-based strategies, since they need to compute ½ n² document vector distances. ...
doi:10.1002/int.20180
fatcat:3yiu2ei3wvefpftevknz6cuaza