Query structuring and expansion with two-stage term dependence for Japanese web retrieval

Koji Eguchi, W. Bruce Croft
2009 Information retrieval (Boston)  
that may appear for instance within a compound word in a target document.  ...  We assume two types of dependencies of terms given in a query: (i) long-range dependencies that may appear for instance within a passage or a sentence in a target document, and (ii) short-range dependencies  ...  This work was supported in part by the Grants-in-Aid for Scientific Research (#19024055 and #18650057, and #20300038) from the Ministry of Education, Culture, Sports, Science and Technology, Japan, and  ... 
doi:10.1007/s10791-009-9092-1 fatcat:6cicwma3vrcplbzdaeshtf46zm

Syntax Vector Learning using correspondence for Natural Language Understanding

Hyein Seo, Sangkeun Jung, Taewook Hwang, Hyunji Kim, Yoon-Hyung Roh
2021 IEEE Access  
For example, NOUN (Common Noun), ADJ (Adjective), ADV (Adverb). We represented the input sequence as a set of tags in the POS tagging.  ...  For all fine-tuning experiments, we optimized a variety of hyperparameters: Text reader dimension, syntax reader dimension, batch size, epochs, optimizer, learning rates, and weight decay.  ... 
doi:10.1109/access.2021.3087271 fatcat:pt6lftkj2jbn5o3lddnrzdcqji

Corpus-Based Teaching of German Compound Nouns and Lexical Bundles for Improving Academic Writing Skills

Marina Kogan, Anna Yaroshevich, Olga Ni
2018 Lidil  
The authors also express their gratitude to the guest editors for comments on an earlier version of this paper.  ...  and Anna Tilmans, and Dr Vitor Zakharov from St Petersburg State University for their support during this research and discussion of its results.  ...  We queried the German-Russian parallel subcorpus of the Russian National Corpus (RNC 11 ) for the selected lexical bundles and compound nouns from the Kod.ING corpus (Appendix A), but none of the compound  ... 
doi:10.4000/lidil.5438 fatcat:rckrstbws5cajntv7mgv65xnly

Experiments in Japanese text retrieval and routing using the NEAT system

Gareth J. F. Jones, Tetsuya Sakai, Masahiro Kajiura, Kazuo Sumita
1998 Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '98  
The study includes a comparison of different indexing strategies for documents and queries, investigation of term weighting strategies principally derived for use with English texts, and the application  ...  of relevance feedback for query expansion.  ...  A significant source of this segmentation ambiguity is the free generation of new compound nouns in Japanese.  ... 
doi:10.1145/290941.290992 dblp:conf/sigir/JonesSKS98 fatcat:rgvygdng6bfidlirmagy6sllxe

Sentence Reduction for Syntactic Analysis of Compound Sentences in Punjabi Language

S.K Sharma
2018 EAI Endorsed Transactions on Scalable Information Systems  
Machine learning has application in many areas including medical science [59], finance [60], query optimization, pattern recognition, healthcare sectors [61] [62] etc.  ...  in Italian [51], simplification of Korean sentences for deaf readers [52], direct manipulation of parse tree [53], removing unnecessary parts of sentences [54] and Spanish sentence simplification  ... 
doi:10.4108/eai.13-7-2018.156440 fatcat:rlucpapx2vcilk3mderatsfp2u

Learning a merge model for multilingual information retrieval

Ming-Feng Tsai, Hsin-Hsi Chen, Yu-Ting Wang
2011 Information Processing & Management  
To the best of our knowledge, this practice is the first attempt to use a learning-based ranking algorithm to construct a merge model for MLIR merging.  ...  To conduct the learning approach, we present a number of features that may influence the MLIR merging process. These features are mainly extracted from three levels: query, document, and translation.  ...  For a query, the feature set comprises the number of query terms (#QT) and compound words (#CW).  ... 
doi:10.1016/j.ipm.2009.12.002 fatcat:syvnpkv2qjeufhhwoazuatn5oy

A study of learning a merge model for multilingual information retrieval

Ming-Feng Tsai, Yu-Ting Wang, Hsin-Hsi Chen
2008 Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08  
a merge model for MLIR merging.  ...  To conduct the learning approach, we also present a large number of features that may influence the MLIR merging process; these features are mainly extracted from three levels: query, document, and translation  ...  After the above labeling, we then extract query-level features from the labeled dataset. For a query, the feature set comprises the number of query terms (#QT) and compound words (#CW).  ... 
doi:10.1145/1390334.1390370 dblp:conf/sigir/TsaiWC08 fatcat:nphpd2j5sjhnbi6ks442xyptl4

Semantic passage segmentation based on sentence topics for question answering

Hyo-Jung Oh, Sung Hyon Myaeng, Myung-Gil Jang
2007 Information Sciences  
We propose a semantic passage segmentation method for a Question Answering (QA) system.  ...  With the template-filling task used for information extraction in the QA system, the value of the sentence topic assignment method was reinforced.  ...  When it is a compound noun in Korean, however, it can be divided into multiple simple nouns. 2 Most of the terms in the hierarchy at or above the right level of abstraction were found to be within the  ... 
doi:10.1016/j.ins.2007.02.038 fatcat:3j6o2hodovd77j44lquqkdtevi

Structural and lexical factors in adjective placement in complex noun phrases across Romance languages

Kristina Gulordava, Paola Merlo
2015 Proceedings of the Nineteenth Conference on Computational Natural Language Learning  
We show that differences in the prenominal and postnominal placement of adjectives in the noun phrase across five main Romance languages is not only subject to heaviness effects, as previously reported  ...  , but also to subtler structural interactions among dependencies that are better explained as effects of the principle of dependency length minimisation.  ...  Part of this work was conducted during the visit of the first author to the Labex EFL in Paris and Alpage INRIA group. We thank Benoit Crabbé for the fruitful discussions during this period.  ... 
doi:10.18653/v1/k15-1025 dblp:conf/conll/GulordavaM15 fatcat:3denxcyfbzaehnra4rwqkvhizi

Some thoughts on the contrastive analysis of features in second language acquisition

Donna Lardiere
2009 Second Language Research  
I illustrate the nature of the problem by comparing the assembly and expression of features involved in plural-marking in English, Mandarin Chinese and Korean, and situate this comparison with respect  ...  to specific claims of the Nominal Mapping Parameter and within a discussion of parameter (re)setting more generally.  ...  I am grateful to Myong-Hee Choi and Sun Hee Hwang for their help, and most especially to Jong-Un Park, who patiently answered my endless queries about Korean plural-marking, and to Héctor Campos for equally  ... 
doi:10.1177/0267658308100283 fatcat:iqpe6xn3ovbjtc5cckkvfiurhm

Constituent order in compounds and syntax: typology and diachrony

Livio Gaeta
2008 Morphology  
For this specific property of compounds, morphology does not seem to be autonomous from syntax, albeit the relation between morphology and syntax must be thought of as a multi-faceted one.  ...  On the basis of a large language sample, it is shown that constituent order in compounds heavily relies on syntax.  ...  Nonetheless, "it seems that there is a slight preference for modifier noun + head structures, independent of the syntactic order of adjective and noun" (Bauer 2001, p. 697).  ... 
doi:10.1007/s11525-009-9125-x fatcat:3ytsclga5fb2tjg2xvgvmn3apa

Parsing the Penn Chinese Treebank with Semantic Knowledge [chapter]

Deyi Xiong, Shuanglong Li, Qun Liu, Shouxun Lin, Yueliang Qian
2005 Lecture Notes in Computer Science  
Further analysis of performance improvement indicates that semantic knowledge is helpful for nominal compounds, coordination, and N V tagging disambiguation, as well as alleviating the sparseness of information  ...  After being optimized on the held out data by the EM algorithm, our improved parser achieves 79.4% (F1 measure), as well as a 4.4% relative decrease in error rate on the Penn Chinese Treebank (CTB).  ...  (see [16]) studied the pattern of N V+noun, which will be analyzed as a predicate-object structure if N V is a verb and a modifier-noun structure if N V is a noun.  ... 
doi:10.1007/11562214_7 fatcat:676kd4stkjecxl262r3sisnc7q

Word Sense Disambiguation Using Prior Probability Estimation Based on the Korean WordNet

Minho Kim, Hyuk-Chul Kwon
2021 Electronics  
A primary reason for the performance degradation of unsupervised disambiguation is that the semantic occurrence probability of ambiguous words is not available.  ...  An experiment was conducted with Korean, English, and Chinese to evaluate the performance of our proposed lexical disambiguation method.  ...  [19] proposed a lexical disambiguation model utilizing mutual information extracted from the Korean Noun Concept Network (ETRINET), a compound noun sense-tagged dictionary and raw corpus.  ... 
doi:10.3390/electronics10232938 fatcat:g5sylv5tprav3ca27if5wwf3pe

Automated FAQ answering with question-specific knowledge representation for web self-service

Eriks Sneiders
2009 2009 2nd Conference on Human System Interactions  
Automated FAQ answering is a valuable complement to web self-service: while the vast majority of site searches fail, our FAQ answering solution for restricted domains answers two thirds of the queries  ...  most appropriate for restricted domains, and what peculiarities of running an FAQ answering service for a customer service may be expected.  ...  In another study 25 most popular queries to a US legislation digital library accounted for 17% of the query flow [7] .  ... 
doi:10.1109/hsi.2009.5090996 fatcat:jsgueau74jdbjfqmyobkpn4j4e

Overlapping statistical word indexing

Yasushi Ogawa, Toru Matsuda
1997 Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '97  
This paper proposes a new statistical indexing method. We first propose a segmentation method for Japanese text which uses statistical information of characters.  ...  It needs only a small amount of statistical information and computation, and does not need constant maintenance.  ...  It was used only in a post-processing of query word extraction, i.e. to identify component words in a kanji compound word which was extracted from query text using a closed lexicon parser [1].  ... 
doi:10.1145/258525.258576 dblp:conf/sigir/OgawaM97 fatcat:kszpjgje2bfaxo7ljx2f3qgnqi