An English-translated parallel corpus for the CJK Wikipedia collections

Ling-Xiang Tang, Shlomo Geva, Andrew Trotman
2012 Proceedings of the Seventeenth Australasian Document Computing Symposium on - ADCS '12  
In this paper, we describe a machine-translated parallel English corpus for the NTCIR Chinese, Japanese and Korean (CJK) Wikipedia collections.  ...  Furthermore, the translated CJK articles could be used to further expand the current coverage of the English Wikipedia.  ...  The number of identified links for each identified anchor was limited to 5, in accordance with the NTCIR-9 CrossLink task specification.  ... 
doi:10.1145/2407085.2407099 dblp:conf/adcs/TangGT12 fatcat:rchf2w7imjem3jyk7mkjufxrwa

To translate or not to translate?

Chia-Jung Lee, Chin-Hui Chen, Shao-Hang Kao, Pu-Jen Cheng
2010 Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '10  
An in-depth analysis is also provided for discussing the impact of out-of-vocabulary and wrongly-translated query terms on CLIR performance.  ...  Experiments on the NTCIR-4 and NTCIR-5 English-Chinese CLIR tasks demonstrate that the proposed approach can significantly improve CLIR performance.  ...  We use NTCIR-4 and NTCIR-5 English-Chinese tasks for evaluation and consider both of the <title> and <desc> fields as queries.  ... 
doi:10.1145/1835449.1835558 dblp:conf/sigir/LeeCKC10 fatcat:f4mcswzperaepkl6fir4hwihhy

Probabilistic models for answer-ranking in multilingual question-answering

Jeongwoo Ko, Luo Si, Eric Nyberg, Teruko Mitamura
2010 ACM Transactions on Information Systems  
This article presents two probabilistic models for answer ranking in the multilingual question-answering (QA) task, which finds exact answers to a natural language question written in different languages  ...  This article first describes a probabilistic model that predicts the probabilities of correctness for individual answers in an independent way.  ...  ACKNOWLEDGMENTS We would like to thank NTCIR for providing the Japanese and Chinese corpora and data set. We would also like to thank Jamie Callan for his valuable discussion and suggestions.  ... 
doi:10.1145/1777432.1777439 fatcat:2aftfesfrfhs5c5synxz5rvkqm

ALBAYZIN Query-by-example Spoken Term Detection 2016 evaluation

Javier Tejedor, Doroteo T. Toledano, Paula Lopez-Otero, Laura Docio-Fernandez, Jorge Proença, Fernando Perdigão, Fernando García-Granada, Emilio Sanchis, Anna Pompili, Alberto Abad
2018 EURASIP Journal on Audio, Speech, and Music Processing  
This is a highly valuable task for blind people or devices that do not have a text-based input, and consequently, the query must be given in another format such as speech.  ...  Special attention was given to the evaluation design so that a thorough post-analysis of the main results could be carried out.  ...  Acknowledgements This work was partially supported by Fundação para a Ciência e Tecnologia (FCT) under the projects UID/EEA/50008/2013 (pluriannual funding in the scope of the LETSREAD project) and UID  ... 
doi:10.1186/s13636-018-0125-9 fatcat:ccrh3ur67nffnf5hpfwg45kmhq

Emulating Human Conversations using Convolutional Neural Network-based IR [article]

Abhay Prakash, Chris Brockett, Puneet Agrawal
2016 arXiv   pre-print
To design a system that is capable of emulating human-like interactions, a conversational layer that can serve as a fabric for chat-like interaction with the agent is needed.  ...  significantly outperforms several conventional baselines in terms of the relevance of responses retrieved.  ...  Other recent neural retrieval models include a multilayer perceptron classifier [14] and three-layered neural networks [2] [6] in the NTCIR-12 short text conversation tasks.  ... 
arXiv:1606.07056v1 fatcat:ul4zaxafmvbzbbsamz6p4rnvfa

Cross-language Information Retrieval [article]

Petra Galuščáková, Douglas W. Oard, Suraj Nair
2021 arXiv   pre-print
This chapter reviews the state of the art for cross-language information retrieval and outlines some open research questions.  ...  Two key assumptions shape the usual view of ranked retrieval: (1) that the searcher can choose words for their query that might appear in the documents that they wish to see, and (2) that ranking retrieved  ...  The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of ODNI, IARPA, or the  ... 
arXiv:2111.05988v1 fatcat:fgnaux4lcbe5jlpczhbxka5cqq

Different Facets of Text Based Automated Question Answering System

Vaishali Singh
2018 International Journal for Research in Applied Science and Engineering Technology  
Therefore, this paper attempts to present the state-of-the-art in the field of text based automatic question answering systems and provide a qualitative analysis of different facets.  ...  Finally, a summarized representation of these facets based on certain features of QA system has been done, to bring an insight for the future research.  ...  Such systems are quite similar to the text based dialogue system built upon mechanism of question answering systems [34] .  ... 
doi:10.22214/ijraset.2018.1017 fatcat:z2l3qicgvjawpcahqyr3alk3wy

On the Effectiveness of Contextualisation Techniques in Spoken Query Spoken Content Retrieval

David N. Racca, Gareth J.F. Jones
2016 Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR '16  
Experimental results over the Japanese NTCIR SpokenQuery&Doc collection show that combining global and local context is beneficial for SCR and that models usually benefit from using larger amounts of context  ...  In this paper, we evaluate different contextualisation techniques, including a recently proposed technique based on positional language models (PLM) on the task of retrieving relevant spoken passages in  ...  SPOKEN TEST COLLECTION The SDPWS dataset has been used at recent editions of the NTCIR SpokenQuery&Doc (SQD) tasks [1] .  ... 
doi:10.1145/2911451.2914730 dblp:conf/sigir/RaccaJ16 fatcat:y45iufmgq5a6lex3qfnva2yklu

LeQua@CLEF2022: Learning to Quantify

Esuli, Moreo, Sebastiani
2022 Zenodo  
LeQua 2022 is a new lab for the evaluation of methods for "learning to quantify" in textual datasets, i.e., for training predictors of the relative frequencies of the classes of interest in sets of unlabelled  ...  While these predictions could be easily achieved by first classifying all documents via a text classifier and then counting the numbers of documents assigned to the classes, a growing body of literature  ...  The authors' opinions do not necessarily reflect those of the European Commission.  ... 
doi:10.5281/zenodo.6367102 fatcat:ewjjjtfkb5gzjc5iymruk5ckou

LeQua@CLEF2022: Learning to Quantify [article]

Andrea Esuli, Alejandro Moreo, Fabrizio Sebastiani
2021 arXiv   pre-print
LeQua 2022 is a new lab for the evaluation of methods for "learning to quantify" in textual datasets, i.e., for training predictors of the relative frequencies of the classes of interest in sets of unlabelled  ...  While these predictions could be easily achieved by first classifying all documents via a text classifier and then counting the numbers of documents assigned to the classes, a growing body of literature  ...  The authors' opinions do not necessarily reflect those of the European Commission.  ... 
arXiv:2111.11249v3 fatcat:esvckl7s6jbdhedsl2bznziply

Web Forum Retrieval and Text Analytics: A Survey

Doris Hoogeveen, Li Wang, Timothy Baldwin, Karin M. Verspoor
2018 Foundations and Trends in Information Retrieval  
Answers, between April 2004 and October 2005. Reddit Comment Corpus: A periodical dump of all the comments.  ...  Answers are supposed to directly answer the question, while comments can be used to correct someone, ask for clarification on a certain point, make a small addition to a post, or provide similar short  ... 
doi:10.1561/1500000062 fatcat:pmhu4xe56vgg3h2xrrlhswsd5m

A Survey on Retrieval of Mathematical Knowledge [article]

F. Guidi, C. Sacerdoti Coen
2015 arXiv   pre-print
We present a short survey of the literature on indexing and retrieval of mathematical knowledge, with pointers to 72 papers and tentative taxonomies of both retrieval problems and recurring techniques.  ...  For example, the system that scored better at the last NTCIR task reports better scores when applied to content markup generated from LaTeX w.r.t. presentation only markup [RSL14].  ...  The situation is improving since 2013 with the creation of a math oriented task in the NTCIR initiative [AKO13, AKO14] that is attracting a small, but increasing number of participants [GWHT14, GPBB14,  ... 
arXiv:1505.06646v3 fatcat:5geageq73bavlesdhxeiftfnmy

Automatic text summarization: What has been done and what has to be done [article]

Abdelkrime Aries, Djamel eddine Zegour, Walid Khaled Hidouci
2019 arXiv   pre-print
Most of these challenges are much more related to the nature of processed languages. These challenges are interesting for academics and developers, as a path to follow in this field.  ...  Automatic summarization and, in particular, automatic text summarization (ATS) is not a new research field; it has been known since the 50s.  ...  The aim of this track is to develop ATS systems that afford short, coherent summaries of documents. This task is meant to promote a deep linguistic analysis for ATS.  ... 
arXiv:1904.00688v1 fatcat:xvvjdpu3xzdsdn4piksz2s4pve

Translation techniques in cross-language information retrieval

Dong Zhou, Mark Truran, Tim Brailsford, Vincent Wade, Helen Ashman
2012 ACM Computing Surveys  
Over the last 15 years, the CLIR community has developed a wide range of techniques and models supporting free text translation.  ...  Like IR, CLIR is centred upon the search for documents, and for information contained within those documents.  ...  ACKNOWLEDGMENTS This research was partially supported by a PHD scholarship from the University of Nottingham and funding from the Science Foundation Ireland (Grant 07/CE/I1142) as part of the Centre for  ... 
doi:10.1145/2379776.2379777 fatcat:mu5p5djufjghvn3xjppekmwnwu

Comparative Evaluation of Cross-language Information Retrieval Systems [chapter]

Carol Peters
2005 Lecture Notes in Computer Science  
With the increasing importance of the "Global Information Society" and as the world's depositories of online collections proliferate, there is a growing need for systems that enable access to information  ...  the research community with an infrastructure for testing and evaluating systems operating in multilingual contexts and a common platform for the comparison of methodologies and results.  ...  ImageCLEF aims to provide the necessary collection(s) and framework in which to analyze the link between image and text and promote the discovery of alternate methods for cross-language image retrieval  ... 
doi:10.1007/978-3-540-31842-2_16 fatcat:4ad3pwr6pnhr5n6v3gwzsqd4cu