Applying Support Vector Machines to the TREC-2001 Batch Filtering and Routing Tasks

David D. Lewis
2001 Text Retrieval Conference  
I made the above bets based on a suspicion that high scoring documents are different from collections as a whole.  ...  On a serious note, it's clear that Avi Arampatzis' original worry was on the mark: evaluating routing runs on only the top 1000 documents meant that there was very little distinction among systems for  ... 
dblp:conf/trec/Lewis01 fatcat:4ocfczw3ubhibhejx7dqv5swnu

Boosting for document routing

Raj D. Iyer, David D. Lewis, Robert E. Schapire, Yoram Singer, Amit Singhal
2000 Proceedings of the ninth international conference on Information and knowledge management - CIKM '00  
We describe the algorithm and present experimental results on applying it to the document routing problem.  ...  RankBoost is a recently proposed algorithm for learning ranking functions. It is simple to implement and has strong justifications from computational learning theory.  ...  While the TREC-3 routing dataset and other TREC routing collections are the most widely used benchmark for machine learning of ranking functions, their size was problematic for our prototype code.  ... 
doi:10.1145/354756.354794 dblp:conf/cikm/IyerLSSS00 fatcat:utz4evnb5jcolccjgm4mlxhimy

TAKES: Two-step Approach for Knowledge Extraction in Biomedical Digital Libraries

Min Song
2014 Journal of Information Science Theory and Practice  
First, we introduce a novel query expansion technique based on keyphrases, while others are based on a single term or are combined with a rule-based learning technique like Ripper (Cohen & Singer, 1996  ...  Using the precision at rank n for the IR evaluation is based on the assumption that the most relevant hits must be in the top few documents returned for a query.  ... 
doi:10.1633/jistap.2014.2.1.1 fatcat:zouat6iefzecldddnylkojzmiu

The Text REtrieval Conferences (TRECs)

Donna Harman
1996 Proceedings of a workshop on held at Vienna, Virginia May 6-8, 1996 -  
Thanks also go to the TREC program committee and the staff at NIST. The TREC tracks could not happen without the efforts of the track coordinators; our special thanks to them.  ...  Acknowledgments The authors gratefully acknowledge the continued support of the TREC conferences by the Intelligent Systems Office of the Defense Advanced Research Projects Agency.  ...  Based on the lessons learned from the TREC-4 track on how difficult it was to fairly compare results in interactive experiments, the track concentrated on experimental design in TREC-5.  ... 
doi:10.3115/1119018.1119026 dblp:conf/tipster/Harman96 fatcat:wjklhj3nwnenfkfxben2756yla

The Text REtrieval Conferences (TRECs)

Donna Harman
1996 Proceedings of a workshop on held at Vienna, Virginia May 6-8, 1996 -  
Thanks also go to the TREC program committee and the staff at NIST. The TREC tracks could not happen without the efforts of the track coordinators; our special thanks to them.  ...  Acknowledgments The authors gratefully acknowledge the continued support of the TREC conferences by the Intelligent Systems Office of the Defense Advanced Research Projects Agency.  ...  Based on the lessons learned from the TREC-4 track on how difficult it was to fairly compare results in interactive experiments, the track concentrated on experimental design in TREC-5.  ... 
doi:10.3115/1119018.1119070 dblp:conf/tipster/Harman96a fatcat:bxx2gy23cbeenn5ryxedzg3ixe

The text retrieval conferences (TRECS)

Ellen M. Voorhees, Donna Harman
1996 Proceedings of a workshop on held at Baltimore, Maryland October 13-15, 1998 -  
Thanks also go to the TREC program committee and the staff at NIST. The TREC tracks could not happen without the efforts of the track coordinators; our special thanks to them.  ...  Acknowledgments The authors gratefully acknowledge the continued support of the TREC conferences by the Intelligent Systems Office of the Defense Advanced Research Projects Agency.  ...  Based on the lessons learned from the TREC-4 track on how difficult it was to fairly compare results in interactive experiments, the track concentrated on experimental design in TREC-5.  ... 
doi:10.3115/1119089.1119127 dblp:conf/tipster/VoorheesH98 fatcat:yo3fnbp2tffqzfjhndiwd4zsre

Genetic Programming-Based Discovery of Ranking Functions for Effective Web Search

WEIGUO FAN, MICHAEL D. GORDON, PRAVEEN PATHAK
2005 Journal of Management Information Systems  
Query improvement in information retrieval using genetic algorithms: A report on the experiments of the TREC project. In D.K. Harman (ed.), Proceedings of the First Text Retrieval Conference.  ...  A comparative study on feature selection in text categorization. In D.H. Fisher (ed.), Proceedings of the Fourteenth International Conference on Machine Learning.  ... 
doi:10.1080/07421222.2005.11045828 fatcat:2qn6uz3k6baptbc4qklrz46xle

Impact of Surrogate Assessments on High-Recall Retrieval

Adam Roegiest, Gordon V. Cormack, Charles L.A. Clarke, Maura R. Grossman
2015 Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '15  
We are concerned with the effect of using a surrogate assessor to train a passive (i.e., batch) supervised-learning method to rank documents for subsequent review, where the effectiveness of the ranking  ...  ; and, a more liberal view of relevance can be adopted by having the surrogate label borderline documents as relevant.  ...  Figure 2: Relative recall depth plots for the TREC-4 experiments, using (a) J1, (b) J2, and (c) J3, as the authority.  ... 
doi:10.1145/2766462.2767754 dblp:conf/sigir/RoegiestCCG15 fatcat:3voipk547jdsjgcnj6yutrhiy4

Training algorithms for linear text classifiers

David D. Lewis, Robert E. Schapire, James P. Callan, Ron Papka
1996 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '96  
We propose that two machine learning algorithms, the Widrow-Hoff and EG algorithms, be used in training linear text classifiers.  ...  Experimental data is presented showing Widrow-Hoff and EG to be more effective than the widely used Rocchio algorithm on several categorization and routing tasks.  ...  Acknowledgments Thanks to William Cohen, Isabelle Moulinier, Amit Singhal, Yoram Singer, Manfred Warmuth, and Yiming Yang for helpful comments on this work.  ... 
doi:10.1145/243199.243277 dblp:conf/sigir/LewisSCP96 fatcat:irhy2gpiwrfhdiin3cc4gfyf6a

Skierarchy: Extending the Power of Crowdsourcing Using a Hierarchy of Domain Experts, Crowd and Machine Learning

Ramesh Nellapati, Sanga Peerreddy, Prateek Singhal
2012 Text Retrieval Conference  
based on their learning needs.  ...  a Machine Learning system serving as a personal assistant to the crowd, at the bottom level.  ...  Step 2: Building a Machine Learning Model, and a Curated Keyword Set: The annotation data was then used to train the Machine Learning algorithm, which in this case is a Logistic Regression based binary  ... 
dblp:conf/trec/NellapatiPS12 fatcat:zpbqryyqifatvlerkaee22mbhe

Autonomy and Reliability of Continuous Active Learning for Technology-Assisted Review [article]

Gordon V. Cormack, Maura R. Grossman
2015 arXiv   pre-print
We enhance the autonomy of the continuous active learning method shown by Cormack and Grossman (SIGIR 2014) to be effective for technology-assisted review, in which documents from a collection are retrieved  ...  We show that our enhancements consistently yield superior results to Cormack and Grossman's version of continuous active learning, and other methods, not only on average, but on the vast majority of topics  ...  Based on these results, the "best" 50 topics were selected for use in the TREC 2002 Filtering Track.  ... 
arXiv:1504.06868v1 fatcat:vg344hhgxvcphpfqwpwwcn3iiq

Learning Ranking vs. Modeling Relevance

D. Roussinov, Weiguo Fan
2006 Proceedings of the 39th Annual Hawaii International Conference on System Sciences (HICSS'06)  
To accomplish this, we have designed a representation scheme, which is based on the discretized form of the high level statistics of the query term occurrences (such as tf, df, and document length) rather  ...  Our SVM-based classifier learns from the relevance judgments available with the standard test collections and generalizes to new, previously unseen queries its ability to compare and rank documents with  ...  Instead, he applied a regression trained on Cranfield to CISI collection but with a negative effect. Recently, the approaches based on learning have reported several important breakthroughs. Fan et al  ... 
doi:10.1109/hicss.2006.252 dblp:conf/hicss/RoussinovF06 fatcat:qsbt7hyc4vab5ovg6v4gmepvwi

A New Approach for Indexing & Clustering in Twitter Blogosphere

Avinash Samuel, Dilip Kumar Sharma
2017 International Journal of Web Science and Engineering for Smart Devices  
In this paper, a learning based model is proposed for Web data gathering.  ...  The model uses a world learning base and client nearby occurrence stores for client profile obtaining and the catch of client data needs.  ...  Knowledge bases are basically shut or open data stores and can be arranged under three primary headings: • Machine-readable knowledge bases • Human-readable knowledge bases The system of learning based  ... 
doi:10.21742/ijwsesd.2017.4.1.01 fatcat:63bh7rnh5zasheyehljcd36fsa

JHU/APL at TREC 2001: Experiments in Filtering and in Arabic, Video, and Web Retrieval

James Mayfield, Paul McNamee, Cash Costello, Christine D. Piatko, Amit Banerjee
2001 Text Retrieval Conference  
We also found that we were able to choose reasonable score thresholds for the routing task when using a language model for estimating document relevance.  ...  We investigated the use of Support Vector Machines (SVMs) for batch text classification and noticed a large sensitivity to parameter settings for these classifiers.  ...  This was JHU/APL's first experience with Arabic document processing and we learned quite a lot from the experience. We had no personnel who could read Arabic.  ... 
dblp:conf/trec/MayfieldMCPB01 fatcat:mhbbas2scnacxawjtkxdzmijna

DMINR at TREC News Track

Sondess Missaoui, Andrew MacFarlane, Stephann Makri, Marisela Gutierrez-Lopez
2019 Text Retrieval Conference  
Then, it utilises a scoring model which uses the given entities to provide a score for them based on evidence from the news articles.  ...  Our approach to each of these tasks draws on prior work done by City, University of London at the TREC conference.  ...  For the Background Linking task, we experimented with a simple heuristic learning method that has been optimized on an initial set of entities to retrieve background documents.  ... 
dblp:conf/trec/MissaouiMMG19 fatcat:azqxt7mdczcklbxro7kd4mstti