15,458 Hits in 5.0 sec

Document re-ranking using cluster validation and label propagation

Lingpeng Yang, Donghong Ji, Guodong Zhou, Yu Nie, Guozheng Xiao
2006 Proceedings of the 15th ACM international conference on Information and knowledge management - CIKM '06  
Then the ranking of the documents can be conducted via label propagation.  ...  For pseudo relevant documents, we determine a cluster of documents from the top ones via cluster validation-based kmeans clustering; for pseudo irrelevant ones, we pick a set of documents from the bottom  ...  DOCUMENT RE-RANKING In this paper, document re-ranking is recast as a two-class label propagation problem.  ... 
doi:10.1145/1183614.1183713 dblp:conf/cikm/YangJZNX06 fatcat:checcs6atrgulfrh23lxt4hz6m

Information Retrieval Using Label Propagation Based Ranking

Lingpeng Yang, Donghong Ji, Yu Nie
2007 NTCIR Conference on Evaluation of Information Access Technologies  
Then the ranking of the documents can be conducted via label propagation.  ...  Our document re-ranking method is done by a label propagation-based semi-supervised learning algorithm to utilize the intrinsic structure underlying in the large document data.  ...  In section 4, we describe our document re-ranking method based on cluster validation and label propagation. In section 5, we describe how to do query expansion in our system.  ... 
dblp:conf/ntcir/YangJN07 fatcat:5dkvfvh2pvdjfd53nxb6pmhxvm

On the Robustness of Document Re-Ranking Techniques: A Comparison of Label Propagation, KNN, and Relevance Feedback

Yuen-Hsien Tseng, Chen-Yang Tsai, Ze-Jing Chuang
2007 NTCIR Conference on Evaluation of Information Access Technologies  
We compared label propagation (LP), K-nearest neighboring (KNN), and relevance feedback (RF) for document re-ranking and found that RF is a more robust technique for performance improvement, while LP and  ...  KNN are sensitive to the choice and the number of relevant documents for successful document re-ranking.  ...  Acknowledge This work is supported in part by NSC under the grant numbers: NSC 95-2221-E-003-016- and NSC 95-2524-S-003-012-.  ... 
dblp:conf/ntcir/TsengTC07 fatcat:4ulvw7nr5rhmvex3iti5rhg27a

Google based name search: Resolving mixed entities on the web

Byung-Won On, Ingyu Lee
2009 2009 Fourth International Conference on Digital Information Management  
For development of such a system, we propose a web service based interface, an unsupervised clustering scheme, and cluster ranking algorithms.  ...  In particular, since the correct number of clusters is often unknown, we study a state-of-the-art unsupervised clustering solution based on propagation of pairwise similarities of entities.  ...  Cluster Ranking Algorithms As the result of our clustering method using similarity propagation, relevant web pages are clustered to the same group.  ... 
doi:10.1109/icdim.2009.5356763 dblp:conf/icdim/OnL09 fatcat:62s2p4e7ebelpbzvjknnl5l3li

Annotating handwritten characters with minimal human involvement in a semi-supervised learning strategy

J. Richarz, S. Vajda, G. A. Fink
2012 2012 International Conference on Frontiers in Handwriting Recognition  
The first is based on cluster-level annotation followed by a majority decision, whereas the second casts the labeling process as a retrieval task and derives labels by voting among ranked lists.  ...  Both methods are thoroughly evaluated in a handwritten character recognition scenario using realistic document data.  ...  Acknowledgments This work is supported by the German Federal Ministry of Economics and Technology on a basis of a decision by the German Bundestag within project KF2442004LF0.  ... 
doi:10.1109/icfhr.2012.181 dblp:conf/icfhr/RicharzVF12 fatcat:x6gmsmg5cvhbbgettj4ba5232q

RESCRIPt: Reproducible sequence taxonomy reference database management

Michael S. Robeson, Devon R. O'Rourke, Benjamin D. Kaehler, Michal Ziemski, Matthew R. Dillon, Jeffrey T. Foster, Nicholas A. Bokulich, Mihaela Pertea
2021 PLoS Computational Biology  
Reproducibly generating, managing, using, and evaluating nucleotide sequence and taxonomy reference databases creates a significant bottleneck for researchers aiming to generate custom sequence databases  ...  We show that bigger is not always better, and reference databases with standardized taxonomies and those that focus on type strains have quantitative advantages, though may not be appropriate for all use  ...  Acknowledgments We would like to thank the QIIME 2 developers and users for providing feedback on RESCRIPt.  ... 
doi:10.1371/journal.pcbi.1009581 pmid:34748542 pmcid:PMC8601625 fatcat:y7r5ny5zqzhddaylx2psitwrpu

Feature propagation on image webs for enhanced image retrieval

Eric Brachmann, Marcel Spehr, Stefan Gumhold
2013 Proceedings of the 3rd ACM conference on International conference on multimedia retrieval - ICMR '13  
We establish image relations by image web construction and adapt a label propagation scheme from the domain of semisupervised learning for feature augmentation.  ...  While the benefit of feature augmentation has been shown before, our approach refrains from the use of semantic labels.  ...  WGC constraints were used by Jegou et al. [5] to re-rank retrieval results in BOF image searches. We deploy WGC to verify the validity of co-segmented feature pairs.  ... 
doi:10.1145/2461466.2461472 dblp:conf/mir/BrachmannSG13 fatcat:34uxmwljjzem3brujkijp7gyvq

Distances Correlation for Re-ranking in Content-Based Image Retrieval

D C G Pedronette, R da S Torres
2010 2010 23rd SIBGRAPI Conference on Graphics, Patterns and Images  
Content-based image retrieval relies on the use of efficient and effective image descriptors.  ...  This paper presents a clustering approach based on distances correlation for computing the similarity among images.  ...  ACKNOWLEDGMENT Authors thank CAPES, FAPESP and CNPq for financial support. Authors also thanks DGA/UNICAMP for its support in this work.  ... 
doi:10.1109/sibgrapi.2010.9 dblp:conf/sibgrapi/PedronetteT10 fatcat:y64xebrnjrecvkjdwgzdvjcgr4

RESCRIPt: Reproducible sequence taxonomy reference database management for the masses [article]

Michael S Robeson, Devon R O'Rourke, Benjamin D Kaehler, Michal Ziemski, Matthew R Dillon, Jeffrey T Foster, Nicholas A Bokulich
2020 bioRxiv   pre-print
Reproducibly generating, managing, using, and evaluating nucleotide sequence and taxonomy reference databases creates a significant bottleneck for researchers aiming to generate custom sequence databases  ...  We show that bigger is not always better, and reference databases with standardized taxonomies and those that focus on type strains have quantitative advantages, though may not be appropriate for all use  ...  However, applying rank propagation will yield the following for these same accessions:  ...  and Pezizomycotina) were propagated downward and used to fill in the unannotated ranks. Hence, forward filling  ... 
doi:10.1101/2020.10.05.326504 fatcat:7wg6yno7dbfhtgfosxnitt6jum

ANNOTATE: orgANizing uNstructured cOntenTs viA Topic labEls

Deepak Ajwani, Bilyana Taneva, Sourav Dutta, Patrick K. Nicholson, Ghasem Heyrani-Nobari, Alessandra Sala
2018 2018 IEEE International Conference on Big Data (Big Data)  
To do this, ANNOTATE clusters the disambiguated concepts based on their semantic similarity and only uses the concepts in the largest cluster for topic labeling.  ...  Unsupervised techniques view the taxonomy as an undirected graph and use undirected graph measures (e.g., centrality [10], clustering [13], PageRank [14]) to rank topics.  ... 
doi:10.1109/bigdata.2018.8622647 dblp:conf/bigdataconf/AjwaniTDNHS18 fatcat:nypsfzlrwzeqlo7mowwgw25afq

DWIE: An entity-centric dataset for multi-task document-level information extraction

Klim Zaporojets, Johannes Deleu, Chris Develder, Thomas Demeester
2021 Information Processing & Management  
First, the use of traditional mention-level evaluation metrics for NER and RE tasks on entity-centric DWIE dataset can result in measurements dominated by predictions on more frequently mentioned entities  ...  To realize this, we propose to use graph-based neural message passing techniques between document-level mention spans.  ...  CPN project, and (ii) the Flemish Government under the "Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen" programme.  ... 
doi:10.1016/j.ipm.2021.102563 fatcat:s2imreyj7rep7i56fv7cuy2iia

DWIE: an entity-centric dataset for multi-task document-level information extraction [article]

Klim Zaporojets, Johannes Deleu, Chris Develder, Thomas Demeester
2021 arXiv   pre-print
First, the use of traditional mention-level evaluation metrics for NER and RE tasks on entity-centric DWIE dataset can result in measurements dominated by predictions on more frequently mentioned entities  ...  To realize this, we propose to use graph-based neural message passing techniques between document-level mention spans.  ...  CPN project, and (ii) the Flemish Government under the "Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen" programme.  ... 
arXiv:2009.12626v2 fatcat:2ht56fk3l5bipgev2uttsnagvu

Person re-identification by manifold ranking

Chen Change Loy, Chunxiao Liu, Shaogang Gong
2013 2013 IEEE International Conference on Image Processing  
Existing person re-identification methods conventionally rely on labelled pairwise data to learn a task-specific distance metric for ranking.  ...  In this study, we show that it is possible to propagate the query information along the unlabelled data manifold in an unsupervised way to obtain robust ranking results.  ...  ii) we systematically formulate and validate existing MRank models for the re-identification task.  ... 
doi:10.1109/icip.2013.6738736 dblp:conf/icip/LoyLG13 fatcat:tiwr7txpnbgvxamo4e4ombiv6u

Uncertainty Reduction for Knowledge Discovery and Information Extraction on the World Wide Web

Heng Ji, Hongbo Deng, Jiawei Han
2012 Proceedings of the IEEE  
What are the fundamental techniques that can be used to reduce such uncertainty and achieve reasonable KD and IE performance on the WWW? What is the impact of each novel method?  ...  We hope this can provide a road map to advance KD and IE on the WWW to a higher level of performance, portability and utilization.  ...  and argument labeling to achieve cluster-wide consistency, by propagating the most frequent labels to replace low-confidence annotations. 2) Experiment Results: Now, we will show how to exploit the new  ... 
doi:10.1109/jproc.2012.2190489 fatcat:4rye7lknyvbe5ggxtpv7fqptgm

NLANGP: Supervised Machine Learning System for Aspect Category Classification and Opinion Target Extraction

Zhiqiang Toh, Jian Su
2015 Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)  
We extract a variety of lexicon and syntactic features, as well as cluster features induced from unlabeled data.  ...  Our system achieves state-of-the-art performances, ranking 1st for three of the evaluations (Slot 1 for both restaurant and laptop domains, and Slot 1 & 2) and 2nd for Slot 2 evaluation.  ...  We thank the anonymous reviewers for their helpful comments and suggestions.  ... 
doi:10.18653/v1/s15-2083 dblp:conf/semeval/TohS15 fatcat:4dg7mudgz5a4po4isczuaoemoa