Towards a Unified Exploitation of Electronic Dialectal Corpora: Problems and Perspectives [chapter]

Nikitas N. Karanikolas, Eleni Galiotou, Angela Ralli
2014 Lecture Notes in Computer Science  
In this paper, we deal with the problem of storing and retrieving dialectal data in a unified framework.  ...  Then we discuss the possibilities and limitations of a retrieval module aiming at combining different linguistic levels for a unified exploitation of oral and written corpora.  ...  Table 4 depicts a simple example of a query defined by the Search and Retrieve query builder: The G. Oral (GUI for Oral sources) and G.  ... 
doi:10.1007/978-3-319-10816-2_32 fatcat:gyiqtdd6czaexghtylrqc54nmi

Experiments in Query Paraphrasing for Information Retrieval [chapter]

Ingrid Zukerman, Bhavani Raskutti, Yingying Wen
2002 Lecture Notes in Computer Science  
These information sources are: WordNet, a Webster-based thesaurus, and a combination of Webster and WordNet.  ...  Our experiments show that query paraphrasing improves retrieval performance, and that performance is influenced both by the number of paraphrases generated for a query and by their quality.  ...  Our results show that: (1) query paraphrasing improves document retrieval performance, (2) document recall is mainly influenced by the average number of paraphrases generated for a query, and (3) question-answering  ... 
doi:10.1007/3-540-36187-1_3 fatcat:obe5ogos7fbzfdrhrbevhd2b5q

Processing Tools for Greek and Other Languages of the Christian Middle East

Bastien Kindt
2018 Journal of Data Mining and Digital Humanities  
The main goal is to provide scholars with tools (lemmatized indexes and concordances) making corpus-based linguistic information available.  ...  It focuses on the questions of text processing, lemmatization, information retrieval, and bitext alignment.  ...  In the first case, tagged data is used in creating lemmatized concordances and indexes. It is also exploited with corpus analysis software or data retrieval system.  ... 
doi:10.46298/jdmdh.4184 fatcat:iskoaz4tyjbgpluohu4y6yf5yq

Cross-lingual search over 22 european languages

Blaž Fortuna, Jan Rupnik, Boštjan Pajntar, Marko Grobelnik, Dunja Mladenič
2008 Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08  
In this paper we present a system for cross-lingual information retrieval, which can handle tens of languages and millions of documents.  ...  The system uses an interactive web interface, which can take advantage of a predefined thesaurus allowing the user to dynamically re-rank the retrieval results based on the mapping onto a predefined thesaurus  ...  The documents are indexed by the discovered latent concepts and a standard inverted index is used for the retrieval.  ... 
doi:10.1145/1390334.1390557 dblp:conf/sigir/FortunaRPGM08 fatcat:kwvzandzevgypbde4kt3juo4wq

Benefits of the 'Massively Parallel Rosetta Stone': Cross-Language Information Retrieval with over 30 Languages

Peter A. Chew, Ahmed Abdelali
2007 Annual Meeting of the Association for Computational Linguistics  
In this paper, we describe our experiences in extending a standard cross-language information retrieval (CLIR) approach which uses parallel aligned corpora and Latent Semantic Indexing.  ...  First, we make use of a parallel aligned corpus consisting of almost 50 parallel translations in over 30 distinct languages, each in over 30,000 documents.  ...  Acknowledgement Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under  ... 
dblp:conf/acl/ChewA07 fatcat:df3hs4jdwraorkvvk64xbmc7ai

The Cultural Heritage Language Technologies Consortium

Jeffrey A. Rydberg-Cox
2005 D-Lib Magazine  
The existing texts included a six-million-word corpus of classical Greek and a four-million-word corpus of classical Latin from the Perseus project, selected works of Isaac Newton from the Newton Project  ... 
doi:10.1045/may2005-rydberg-cox fatcat:jenldpdeqbbbnohaqkcmecblhq

The Quest for 'Falsehood', or a Survey of Tools for the Study of Greek-Syriac-Arabic Translations

Grigory Kessel, Rüdiger Arnzen, Slavomír Čéplö, Yury Arzhanov, Nicolás Bamballi
2020 Zenodo  
The remit of this project is to offer a new approach for research into translation techniques and into the history of the transmission of classical Greek literature in Late Antiquity and the Middle Ages  ...  Thereafter, it sets out the work-in-progress of the ERC project Transmission of Classical Scientific and Philosophical Literature from Greek into Syriac and Arabic (HUNAYNNET).  ...  Nevertheless, Philitas may again choose the opposite approach and look up bāṭilun/bāṭilan in the Arabic-Greek index at the end of WGAÜ and its supplements, where he encounters three further Greek equivalents  ... 
doi:10.5281/zenodo.4382406 fatcat:lgh26i5ufjaozpola7wgj5vw4y

Guest editorial: Artificial intelligence and software multilinguality

Constantine D. Spyropoulos, Vangelis Karkaletsis
1999 Applied Artificial Intelligence  
Our goal is to show the readers that existing AI methods, although not always mature and near-to-market, can offer a lot of help to support multilinguality in the software industry.  ...  Two of them cover research work on "localisation of user interfaces and documentation" (Boutsis et al., Bateman et al.) and are published in the present Issue no 6.  ...  A system performing within-language retrieval, for instance in English and Greek, is actually composed of an English and a Greek retrieval system, allowing the user to use English (Greek) queries to retrieve  ... 
doi:10.1080/088395199117261 fatcat:u6nx67pfevcgbmq5azsx5mdlcm

Music Retrieval By Rhythmic Similarity Applied On Greek And African Traditional Music

Iasonas Antonopoulos, Aggelos Pikrakis, Sergios Theodoridis, Olmo Cornelis, Dirk Moelants, Marc Leman
2007 Zenodo  
EXPERIMENTS AND RESULTS: Corpus of Greek Traditional Dance Music. The first corpus of our study consists of 220 tracks of Greek Traditional Dance Music, which are drawn from various Greek regions.  ...  CONCLUSIONS: This paper presented a music retrieval method based on rhythmic similarity measurement. The method yielded satisfactory results on corpora of traditional Greek and African music.  ... 
doi:10.5281/zenodo.1417503 fatcat:tmkizrafwjbkxaafwwg3mgcq74

A word spotting framework for historical machine-printed documents

A. L. Kesidis, E. Galiotou, B. Gatos, I. Pratikakis
2010 International Journal on Document Analysis and Recognition  
A user feedback process is used in order to refine the search procedure. The methodology has been evaluated in early Modern Greek documents printed during the seventeenth and eighteenth century.  ...  In order to improve the efficiency of accessing and search, natural language processing techniques have been addressed that comprise a  ...  Acknowledgments The research leading to these results has received funding from the Greek Ministry of Research funded R&D (POLY-TIMO project) as well as from the European Community's Seventh Framework  ... 
doi:10.1007/s10032-010-0134-4 fatcat:2vqu3k6qjzbclagqmebyszmt4y

The influence of basic tokenization on biomedical document retrieval

Dolf Trieschnigg, Wessel Kraaij, Franciska de Jong
2007 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '07  
Tokenization is a fundamental preprocessing step in Information Retrieval systems in which text is turned into index terms.  ...  This paper quantifies and compares the influence of various simple tokenization techniques on document retrieval effectiveness in two domains: biomedicine and news.  ...  ACKNOWLEDGEMENTS This work was part of the BioRange programme of the Netherlands Bioinformatics Centre (NBIC), which is supported by a BSIK grant through the Netherlands Genomics Initiative (NGI).  ... 
doi:10.1145/1277741.1277917 dblp:conf/sigir/TrieschniggKJ07 fatcat:avrmupuhnjfz5msgiwgafqtwpi

Approaching the Problem of Multi-lingual Information Retrieval and Visualization in Greek and Latin and Old Norse Texts [chapter]

Jeffrey A. Rydberg-Cox, Lara Vetter, Stefan Rüger, Daniel Heesch
2004 Lecture Notes in Computer Science  
In this paper, we explore approaches to multi-lingual information retrieval for Greek, Latin, and Old Norse texts.  ...  We also describe an information retrieval tool that allows users to formulate Greek, Latin, or Old Norse queries in English and display the results in an innovative clustering and visualization facility  ...  Introduction Cross-lingual information retrieval is a particularly intriguing technology for students and scholars of Ancient and Early-Modern Greek and Latin or Old Norse.  ... 
doi:10.1007/978-3-540-30230-8_16 fatcat:urdxl6a2q5cqho44qece3dsdcm

Asia Minor Greek: Towards a Computational Processing

Eleni Galiotou, Nikitas Karanikolas, Ioanna Manolessou, Nikolaos Pantelidis, Dimitris Papazachariou, Angela Ralli, George Xydopoulos
2014 Procedia - Social and Behavioral Sciences  
storing, processing and retrieving oral and written dialectal data.  ...  , Aivali: In search of Asia Minor Greek"-AmiGre) In fact, the project constitutes the first attempt to describe dialectal phenomena at a phonological, morphological, and structural level.  ...  Acknowledgements This research is co-financed by the European Union (European Social Fund -ESF) and Greek national funds through the Operational Program "Education and Life-long Learning" of the National  ... 
doi:10.1016/j.sbspro.2014.07.138 fatcat:tzb7i2htnzhjzhx7mvq7zc4zai

Annotating Corpora from Various Sources in the Humanities Domain

Voula Giouli
2011 Journal for Language Technology and Computational Linguistics  
Annotating corpora from various sources in the humanities domain: shortcomings and issues. In this paper, we present work aimed at the linguistic annotation of Greek corpora that belong to the humanities  ...  , economics, etc.; we elaborate on the initial steps taken towards customization of the tools.  ...  These annotations served a two-fold purpose, that is, to enhance efficient indexing and retrieval of the textual documents, and to further facilitate the study of textual data and the elicitation of meaningful  ... 
dblp:journals/ldvf/Giouli11 fatcat:wwzsj77gdbfuzlxsnzqjjm7wqe

Exploring New Languages with HAIRCUT at CLEF 2005 [chapter]

Paul McNamee
2006 Lecture Notes in Computer Science  
JHU/APL has long espoused the use of language-neutral methods for cross-language information retrieval.  ...  In our bilingual experiments we used several nontraditional CLEF query languages such as Greek, Hungarian, and Indonesian, in addition to several western European languages.  ...  In addition to the use of character n-gram tokenization we make use of a statistical language model of retrieval and combination of evidence from multiple retrievals.  ... 
doi:10.1007/11878773_17 fatcat:ugsqm4uofzbo3kixvk2pswsszy