1,691,996 Hits in 9.0 sec

Efficient construction of large test collections

Gordon V. Cormack, Christopher R. Palmer, Charles L. A. Clarke
1998 Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '98  
Test collections with a million or more documents are needed for the evaluation of modern information retrieval systems. Yet their construction requires a great deal of effort.  ...  Exhaustive judging, in which every document is examined and a judgement rendered, is infeasible for collections of this size.  ...  Acknowledgements The authors thank Ellen Voorhees and Donna Harman of the National Institute of Standards and Technology (NIST) for making available the TREC-6 submissions and also thank Ilana Rosenshein  ... 
doi:10.1145/290941.291009 dblp:conf/sigir/CormackPC98 fatcat:pln3hycm2rdddnn7eoebplykqi

Collection-Document Summaries [chapter]

Nils Witt, Michael Granitzer, Christin Seifert
2018 Lecture Notes in Computer Science  
We devise evaluation metrics that do not require human judgement, and three algorithms for extracting CDS that are based on single-document keyword-extraction methods.  ...  Therefore, we propose collection-document (CDS) summaries that highlight commonalities and differences between a collection (or a single document) and a single document.  ...  Future work includes devising new algorithms, for instance by combining ∆Rake (best diversity) and ∆TF (best comparability), and an evaluation of the methods on a human-generated ground truth to  ... 
doi:10.1007/978-3-319-76941-7_56 fatcat:lqyqrjwomnemxnbnuk3g55rm3u

Enterprise Search: Identifying Relevant Sentences and Using Them for Query Expansion

Maheedhar Kolla, Olga Vechtomova
2007 Text Retrieval Conference  
Our method is based on selecting sentences from the given relevant documents and using those selected sentences for query expansion.  ...  We observed that our method of query expansion improves the system's performance over the baseline run, under various methods of comparison.  ...  Evaluation results are presented in Table 2.  ...  In the residual collection method of evaluation, all provided relevant documents (i.e., documents from the pages fields) are removed from evaluation  ... 
dblp:conf/trec/KollaV07 fatcat:o6uskdihv5ew5opdfdiwwoxiba

SEAGLE: A Platform for Comparative Evaluation of Semantic Encoders for Information Retrieval

Fabian David Schmidt, Markus Dietsche, Simone Paolo Ponzetto, Goran Glavaš
2019 Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations  
We introduce SEAGLE, a platform for comparative evaluation of semantic text encoding models on information retrieval (IR) tasks.  ...  SEAGLE implements (1) word embedding aggregators, which represent texts as algebraic aggregations of pretrained word embeddings and (2) pretrained semantic encoders, and allows for their comparative evaluation  ...  Additionally, we thank Hans-Peter Zorn from inovex GmbH for his feedback over the course of the same student project.  ... 
doi:10.18653/v1/d19-3034 dblp:conf/emnlp/SchmidtDPG19 fatcat:kadnmiwinjduhmgn2ycmc5svwy

Active Sampling for Large-scale Information Retrieval Evaluation

Dan Li, Evangelos Kanoulas
2017 Proceedings of the 2017 ACM on Conference on Information and Knowledge Management - CIKM '17  
To alleviate this effort different methods for constructing collections have been proposed in the literature, falling under two broad categories: (a) sampling, and (b) active selection of documents.  ...  We validate the proposed method using TREC data and demonstrate the advantages of this new method compared to past approaches.  ...  All content represents the opinion of the authors, which is not necessarily shared or endorsed by their respective employers and/or sponsors.  ... 
doi:10.1145/3132847.3133015 dblp:conf/cikm/LiK17 fatcat:qbk6yzqkwne33o6d4gotbkxr3y

Rank-Ordering Documents According to Their Relevance in Information Retrieval Using Refinements of Ordered-Weighted Aggregations [chapter]

Mohand Boughanem, Yannick Loiseau, Henri Prade
2006 Lecture Notes in Computer Science  
The proposal is evaluated on a standard collection that allows comparing the effectiveness of this approach with a classical one.  ...  Moreover, the proposed approach uses a possibilistic framework for evaluating queries to a document collection, which distinguishes between descriptors that are certainly relevant and those which are possibly  ... 
doi:10.1007/11670834_4 fatcat:mp5ivid6wjf3fo2nvhf2ic7eyq

Term Extraction for User Profiling: Evaluation by the User

Suzan Verberne, Maya Sappelli, Wessel Kraaij
2013 User Modeling, Adaptation, and Personalization  
We compared the methods in two different evaluation scenarios, both from the perspective of the user: a per-term evaluation, and a holistic (term cloud) evaluation.  ...  We compared three term scoring methods in their ability to extract descriptive terms from a knowledge worker's document collection.  ...  In Section 2 we describe three methods for collecting the descriptive terms from a user's self-authored document collection, and the evaluation setup.  ... 
dblp:conf/um/VerberneSK13 fatcat:grstcnzvtbdbvmt6loqlpdl5ly

Improving Document Ranking in Information Retrieval Using Ordered Weighted Aggregation and Leximin Refinement

Mohand Boughanem, Yannick Loiseau, Henri Prade
2005 European Society for Fuzzy Logic and Technology  
Classical information retrieval (IR) methods often lose valuable information when aggregating weights, which may diminish the discriminating power between documents.  ...  To cope with this problem, the paper presents an approach for ranking documents in IR, based on a vector-based ordering technique already considered in fuzzy logic for multiple criteria analysis purpose  ...  The second method amounts to comparing the criteria evaluation vectors directly by using a refinement of Pareto ordering. This latter method is discussed in this paper.  ... 
dblp:conf/eusflat/BoughanemLP05 fatcat:uc6izdh54fgkxc2mhvqapu4loi

Improving test collection pools with machine learning

Gaya K. Jayasinghe, William Webber, Mark Sanderson, J. Shane Culpepper
2014 Proceedings of the 2014 Australasian Document Computing Symposium on - ADCS '14  
IR experiments typically use test collections for evaluation. Such test collections are formed by judging a pool of documents retrieved by a combination of automatic and manual runs for each topic.  ...  The proportion of relevant documents found for each topic depends on the diversity across each of the runs submitted and the depth to which runs are assessed (pool depth).  ...  Culpepper is the recipient of an ARC DECRA Research Fellowship (DE140100275).  ... 
doi:10.1145/2682862.2682864 dblp:conf/adcs/JayasingheWSC14 fatcat:u6milu4jkndwle6hdfhcl2hkre

Extending test collection pools without manual runs

Gaya K. Jayasinghe, William Webber, Mark Sanderson, J. Shane Culpepper
2014 Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval - SIGIR '14  
The quality of the final judgments produced for a collection is a product of the variety across each of the runs submitted and the pool depth.  ...  Information retrieval test collections traditionally use a combination of automatic and manual runs to create a pool of documents to be judged.  ...  INTRODUCTION Successful evaluation and reproducibility of experiments in information retrieval (IR) depends on building reusable test collections composed of documents, topics, and relevance judgments.  ... 
doi:10.1145/2600428.2609473 dblp:conf/sigir/JayasingheWSC14 fatcat:b27qe5hkt5hbfhac22hqf6qgom

Using RankBoost to compare retrieval systems

Huyen-Trang Vu, Patrick Gallinari
2005 Proceedings of the 14th ACM international conference on Information and knowledge management - CIKM '05  
Experimental results obtained on an XML document collection demonstrate the effectiveness of the approach according to different evaluation criteria.  ...  This paper presents a new pooling method for constructing the assessment sets used in the evaluation of retrieval systems. Our proposal is based on RankBoost, a machine learning voting algorithm.  ...  INTRODUCTION For evaluating retrieval systems, relevance assessments of test collections are essential.  ... 
doi:10.1145/1099554.1099641 dblp:conf/cikm/VuG05 fatcat:cii6twbm65egrn3inhcdrvatgu

A logistic regression approach to distributed IR

Ray R. Larson
2002 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '02  
The algorithm is compared to other methods for distributed search using test collections developed for distributed search evaluation.  ...  This poster session examines a probabilistic approach to distributed information retrieval using a Logistic Regression algorithm for estimation of collection relevance.  ...  Since the "collection documents" used for this evaluation represent collections of documents and not individual documents, a number of differences from the usual logistic regression measures were used.  ... 
doi:10.1145/564376.564463 dblp:conf/sigir/Larson02 fatcat:z2kxoexm7beotknlxpjkaxag6e

Plagiarism Detection - State-of-the-art systems (2016) and evaluation methods [article]

Christina Kraus
2016 arXiv   pre-print
While the need for a reliable and performant plagiarism detection system increases with an increasing amount of publications, current systems still have shortcomings.  ...  In particular, intelligent research plagiarism detection still leaves room for improvement. An important factor for progress in research is a suitable evaluation framework.  ...  The authors evaluate their approach on a subset of 100 documents of the PAN-PC 09 data set and achieve an improvement of more than 35% for both precision and recall, compared to the approaches that originally  ... 
arXiv:1603.03014v1 fatcat:mxfrxqyhznghfa7n7cc3xpslwa

Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols

Fabien Campagne
2008 BMC Bioinformatics  
Defined sets of relevant and non-relevant documents make it possible to evaluate the performance of a search  ...  For comparison, coefficients in the range 0.86-0.94 can be observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations.  ...  Dorff for technical help and acknowledges support from the resources of the HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine and the David A.  ... 
doi:10.1186/1471-2105-9-132 pmid:18312673 pmcid:PMC2292696 fatcat:exdq6lijnzen7almovdrp2g2ji