Overview of the TREC 2019 deep learning track [article]

Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos, Ellen M. Voorhees
2020 arXiv   pre-print
The Deep Learning Track is a new track for TREC 2019, with the goal of studying ad hoc ranking in a large data regime.  ...  It is the first track with large human-labeled training sets, introducing two sets corresponding to two tasks, each with rigorous TREC-style blind evaluation and reusable test sets.  ...  Through a combination of test collection reuse (from past years) and blind evaluation (submitting runs) the Deep Learning Track is offering a framework for studying ad hoc search in the large data regime  ... 
arXiv:2003.07820v2 fatcat:a4wghnw6fzbmfe4m24lpgpuwhy

Overview of the TREC 2020 deep learning track [article]

Nick Craswell, Bhaskar Mitra, Emine Yilmaz, Daniel Campos
2021 arXiv   pre-print
This is the second year of the TREC Deep Learning Track, with the goal of studying ad hoc ranking in the large training data regime.  ...  This year we have further evidence that rankers with BERT-style pretraining outperform other rankers in the large data regime.  ...  Hence, test collections generated as part of this year's track may be more reusable compared to last year since these test collections may be fairer towards evaluating the quality of unseen non-neural  ... 
arXiv:2102.07662v1 fatcat:ydpxzcongzfenn6uiwcpywm7ce

Multiple testing in statistical analysis of systems-based information retrieval experiments

Benjamin A. Carterette
2012 ACM Transactions on Information Systems  
We investigate this phenomenon in the context of simultaneous testing of many hypotheses using a fixed set of data.  ...  High-quality reusable test collections and formal statistical hypothesis testing have together allowed a rigorous experimental environment for information retrieval research.  ...  In short, MCP is a deep problem that colors all of the evaluation we do, particularly with reusable test collections.  ... 
doi:10.1145/2094072.2094076 fatcat:deqmu2e5yrbopflouyhnkesaom

Multi-Stage Document Ranking with BERT [article]

Rodrigo Nogueira, Wei Yang, Kyunghyun Cho, Jimmy Lin
2019 arXiv   pre-print
On two large-scale datasets, MS MARCO and TREC CAR, experiments show that our model produces results that are either at or comparable to the state of the art.  ...  The advent of deep neural networks pre-trained via language modeling tasks has spurred a number of successful applications in natural language processing.  ...  Electronics (Improving Deep Learning using Latent Structure).  ... 
arXiv:1910.14424v1 fatcat:7qpsevc4lvduhkwk7cgyol2lru

Pretrained Transformers for Text Ranking: BERT and Beyond [article]

Jimmy Lin, Rodrigo Nogueira, Andrew Yates
2021 arXiv   pre-print
Although the most common formulation of text ranking is search, instances of the task can also be found in many natural language processing applications.  ...  The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query.  ...  In addition, we would like to thank the TPU Research Cloud for resources used to obtain new results in this work.  ... 
arXiv:2010.06467v3 fatcat:obla6reejzemvlqhvgvj77fgoy

Explicit web search result diversification

Rodrygo L.T. Santos
2012 SIGIR Forum  
Test Collection Our experiments use the WT09 test collection, comprising 49 queries from the TREC 2009 Web track (Clarke et al., 2009a), as described in Table 5.1.  ...  The TREC Web track provides test collections for the assessment of adhoc and diversity search approaches in a web setting.  ...  In Proceedings of the 24th Inter-  ... 
doi:10.1145/2492189.2492205 fatcat:g3f4j6r6ivhtzbm6mfi2zigsm4

Statistical source expansion for question answering

Nico Schlaefer, Jennifer Chu-Carroll, Eric Nyberg, James Fan, Wlodek Zadrozny, David Ferrucci
2011 Proceedings of the 20th ACM international conference on Information and knowledge management - CIKM '11  
In addition, we demonstrate that active learning reduces the amount of labeled data required to fit a relevance model by two orders of magnitude with little loss in ranking performance.  ...  In this thesis, we propose a novel algorithm that expands a collection of seed documents by (1) retrieving related content from the Web or other large external sources, (2) extracting self-contained text  ...  to an unseen test collection.  ... 
doi:10.1145/2063576.2063632 dblp:conf/cikm/SchlaeferCNFZF11 fatcat:whoy62klazctbdo4p57wbevkdu

Conversational Search – A Report from Dagstuhl Seminar 19461 [article]

Avishek Anand, Lawrence Cavedon, Matthias Hagen, Hideo Joho, Mark Sanderson, Benno Stein
2020 arXiv   pre-print
Systems were invited to share the latest development in the area of Conversational Search and discuss its research agenda and future directions.  ...  The ideas and findings presented in this report should serve as one of the main sources for diverse research programs on Conversational Search.  ...  of interaction that's appropriate in such dialogues.  ... 
arXiv:2005.08658v1 fatcat:ekkylrysezfkfcwkuk5ynrcwki

Conversational Search (Dagstuhl Seminar 19461)

Avishek Anand, Lawrence Cavedon, Hideo Joho, Mark Sanderson, Benno Stein
2020 Dagstuhl Reports  
Systems were invited to share the latest development in the area of Conversational Search and discuss its research agenda and future directions.  ...  The ideas and findings presented in this report should serve as one of the main sources for diverse research programs on Conversational Search.  ...  We also thank the staff of Schloss Dagstuhl for providing a great venue for a successful seminar. The organisers were in part supported by JSPS KAKENHI Grant Number 19H04418.  ... 
doi:10.4230/dagrep.9.11.34 dblp:journals/dagstuhl-reports/AnandCJSS19 fatcat:ctqyitsfifcbbecio5dnx2i62y

MIRages: an account of music audio extractors, semantic description and context-awareness, in the three ages of MIR

Perfecto Herrera Boyer, Xavier Serra, Emilia Gómez
2018 Zenodo  
collections is reported.  ...  In the age of semantic descriptors work on describing music with high-level concepts, such as mood, instruments, similarities, cover versions or genres, usually inferred with machine learning from annotated  ...  We finally want to thank Michel Plu and Valérie Botherel from Orange Labs for the user evaluation data and Piero Fraternali, Alessandro Bozzon and Marco Brambilla from WebModels for the user interface.  ... 
doi:10.5281/zenodo.2278110 fatcat:uturvyw2gnfzdgtelvtxot3etq

MIRages: an account of music audio extractors, semantic description and context-awareness, in the three ages of MIR

Perfecto Herrera Boyer, Xavier Serra, Emilia Gómez
2018 Zenodo  
collections is reported.  ...  In the age of semantic descriptors work on describing music with high-level concepts, such as mood, instruments, similarities, cover versions or genres, usually inferred with machine learning from annotated  ...  We finally want to thank Michel Plu and Valérie Botherel from Orange Labs for the user evaluation data and Piero Fraternali, Alessandro Bozzon and Marco Brambilla from WebModels for the user interface.  ... 
doi:10.5281/zenodo.1882316 fatcat:6yhrlcyexrgyhhwayeau2gu7f4

Community interpreting, translation, and technology

Christopher D. Mellinger, Nike K. Pokorn
2018 Translation and Interpreting Studies  
In Proceedings of the IEEE Conference on Bibliographical References  ...  The content presented has benefited from discussion with Harald Traue of University of Ulm and further participants of the ELSI networking workshop InterEmotio organised by the German BMBF in Stuttgart  ...  data, crowd-sourced annotation by large groups of individuals with often unknown reliability and high subjectivity, and "deep" and partially less supervised learning with limited transparency of what  ... 
doi:10.1075/tis.00019.int fatcat:3gdc2iojrze4dhyvvs52fo7cuu

Software Architecture for Language Engineering

Hamish Cunningham, Donia Scott
2004 Natural Language Engineering  
The thesis represents the first discussion of software infrastructure for language computation that covers a large portion of the field.  ...  In order to demonstrate the theory developed in relation to SALE, we present the design, implementation and evaluation of GATE, a General Architecture for Text Engineering, which illustrates in practice  ...  Many of these approaches involve supervised learning, where the results produced by humans for a particular task are collected in large quantities and used as inputs to machine learning algorithms [Mitchell  ... 
doi:10.1017/s1351324904003481 fatcat:xzkpj2edozgidfrknmergcyyga

2019 CIS Annual Meeting: Immune Deficiency & Dysregulation North American Conference

2019 Journal of Clinical Immunology  
BMBF 01 EO003 (Freiburg) The authors would like to thank the Director General of Health of Malaysia for permission to publish this scientific presentation.  ...  Raif Geha and Janet Chou at the Division of Immunology, Allergy, Rheumatology and Dermatology, Boston Children's Hospital, Harvard Medical School. The following grants are acknowledged: 1.  ...  Methods: Data were analyzed from the IDEaL (Immunoglobulin, Diagnosis, Evaluation, and key Learnings) Patient Registry.  ... 
doi:10.1007/s10875-019-00597-5 pmid:30809743 fatcat:ophhlkvazzhjxl4t4eoct3tfna

Penilaian Kualiti Laporan EIA Dalam Aspek Kajian Hakisan Tanah [article]

Abdul Mahmud
2020 Figshare  
Test and retest method is conducted to ensure the reliability of review data meet the standard. The overall quality is calculated by aggregating the each value quality of review areas.  ...  : Soil erosion and sedimentation has become one of the major issues in the implementation of the EIA project in Malaysia.  ...  The DUC2002 data collection contains 567 documents in 59 sets. DUC2002 contains various English news articles collected from TREC-9 for the document summarization task.  ... 
doi:10.6084/m9.figshare.12095736 fatcat:fzsit3wyqjc3fni4no4zzopz5e