
Lexical cohesion and term proximity in document ranking

Olga Vechtomova, Murat Karamuftuoglu
2008 Information Processing & Management  
Two types of lexical cohesive relationship information between query terms are used in document ranking: short-distance collocation relationship between query terms, and long-distance relationship, determined  ...  We demonstrate effective new methods of document ranking based on lexical cohesive relationships between query terms.  ...  The paper describes in detail how lexical cohesion between query terms in documents could be used in document ranking, and points to the potential of the cohesion theory in improving effectiveness of retrieval  ... 
doi:10.1016/j.ipm.2008.01.003 fatcat:uejo5g7uwbdsvoep7lynyrdi6u

A graph based approach to estimating lexical cohesion

Hayrettin Gürkök, Murat Karamuftuoglu, Markus Schaal
2008 Proceedings of the second international symposium on Information interaction in context - IIiX '08  
In this paper we make use of a graph-based approach to capture term contexts and estimate the level of lexical cohesion in a document.  ...  Lexical cohesion is a characteristic of natural language texts, which can be used to determine whether the query terms are used in the same context in the document.  ...  CONCLUSION AND FUTURE WORK We have investigated different methods for document ranking based on lexical cohesion among query terms in a document.  ... 
doi:10.1145/1414694.1414704 dblp:conf/iiix/GurkokKS08 fatcat:jf7sm6m3mff6xm3vsjqljk4d4e

On document relevance and lexical cohesion between query terms

Olga Vechtomova, Murat Karamuftuoglu, Stephen E. Robertson
2006 Information Processing & Management  
A document ranking method based on lexical cohesion shows some performance improvements.  ...  Lexical cohesion between distinct query terms in a document is estimated on the basis of the lexical-semantic relations (repetition, synonymy, hyponymy and sibling) that exist between their collocates  ...  Ranking of a document set by lexical cohesion scores results in significant performance improvement over term-based document ranking techniques.  ... 
doi:10.1016/j.ipm.2006.01.008 fatcat:guhda6zkyreclfbdpja4vltlhu

Juru at TREC 2003 - Topic Distillation using Query-Sensitive Tuning and Cohesiveness Filtering

Einat Amitay, David Carmel, Adam Darlow, Michael Herscovici, Ronny Lempel, Aya Soffer, Reiner Kraft, Jason Y. Zien
2003 Text Retrieval Conference  
Lexical Affinity weight (LA-Weight). Our ranking algorithm takes into account lexical affinities common to the query and the document, in addition to simple query terms.  ...  Lexical affinities are pairs of closely related terms frequently found in proximity to each other [7] .  ... 
dblp:conf/trec/AmitayCDHLSKZ03 fatcat:c6lzevchbjf4tc6agk7gmgrwmm

Query expansion with terms selected using lexical cohesion analysis of documents

Olga Vechtomova, Murat Karamuftuoglu
2007 Information Processing & Management  
We present new methods of query expansion using terms that form lexical cohesive links between the contexts of distinct query terms in documents (i.e., words surrounding the query terms in text).  ...  The link-forming terms (link-terms) and short snippets of text surrounding them are evaluated in both interactive and automatic query expansion (QE).  ...  Acknowledgements We would like to thank Susan Jones (City University, London) and anonymous referees for their valuable comments and suggestions.  ... 
doi:10.1016/j.ipm.2006.09.004 fatcat:cmhbfip4i5b7vm7ukprxyp5f7y

Lexical cohesion for evaluation of machine translation at document level

Billy T.M. Wong, Cecilia F.K. Pun, Chunyu Kit, Jonathan J. Webster
2011 2011 7th International Conference on Natural Language Processing and Knowledge Engineering  
While most state-of-the-art evaluation metrics focus on the sentence level, we emphasize the importance of document structure, showing that lexical cohesion is a critical feature to highlight the superior  ...  An experiment shows that this feature can bring forth a 3-5% improvement in the correlation of automatic evaluation results with human judgments of machine translation outputs at the document level.  ...  The research described in this paper was partially supported by City University of Hong Kong through the SRG grants 7002267 and 7008003 and by the Research Grants Council (RGC) of HKSAR, China, through  ... 
doi:10.1109/nlpke.2011.6138201 dblp:conf/nlpke/WongPKW11 fatcat:56pauxl5zbgrhfjj3tadcsufd4

An effective coherence measure to determine topical consistency in user-generated content

Jiyin He, Wouter Weerkamp, Martha Larson, Maarten de Rijke
2009 International Journal on Document Analysis and Recognition  
The properties that make the coherence score more appropriate than lexical cohesion, a common measure of topical structure, are discussed.  ...  The coherence score must, however, be used judiciously in order to avoid boosting the ranking of irrelevant but topically focused blogs.  ...  Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided  ... 
doi:10.1007/s10032-009-0089-5 fatcat:ql7sz23vnnc5plpqdbclyq23he

A study of the effect of term proximity on query expansion

Olga Vechtomova, Ying Wang
2006 Journal of Information Science  
Query expansion terms are often used to enhance original query formulations in document retrieval.  ...  Such terms are usually selected from the entire documents or from windows or passages surrounding query term occurrences.  ...  Term proximity has been explored extensively in document ranking studies [16] [17] [18] [19] [20] , where several distance factors were proposed.  ... 
doi:10.1177/0165551506065787 fatcat:4y6gfyudyjdldl4npri23ukunq

Intra-content term weighting for topic segmentation

Abdessalam Bouchekif, Geraldine Damnati, Delphine Charlet
2014 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
This paper deals with term weighting strategies in the context of lexical cohesion based topic segmentation.  ...  It translates the capacity of a term to discriminate a document within a collection, or a part of a document within a whole document.  ...  Lexical cues borrowed from traditional text segmentation. Main lexical approaches include the notions of lexical cohesion introduced in [1] and lexical chaining [2] , [3] , [4] .  ... 
doi:10.1109/icassp.2014.6854980 dblp:conf/icassp/BouchekifDC14 fatcat:wdtuxem2lfhihl6svu3hexskpe

Extracting Clusters of Specialist Terms from Unstructured Text

Aaron Gerow
2014 Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)  
Term clusters are identified by extracting communities in the cooccurrence graph, after which the largest is discarded and the remaining words are ranked by centrality within a community.  ...  Automatically identifying related specialist terms is a difficult and important task required to understand the lexical structure of language.  ...  Duede and three reviewers for comments on earlier versions of this manuscript.  ... 
doi:10.3115/v1/d14-1149 dblp:conf/emnlp/Gerow14 fatcat:jxwufwgjbbdzplrmvvwmxxy6ai

ReaderBench, an Environment for Analyzing Text Complexity and Reading Strategies [chapter]

Mihai Dascalu, Philippe Dessus, Ştefan Trausan-Matu, Maryse Bianco, Aurélie Nardy
2013 Lecture Notes in Computer Science  
ReaderBench allows the assessment of three main textual features: cohesion-based assessment, reading strategies identification and textual complexity evaluation, which have been subject to empirical validations  ...  ReaderBench covers a complete cycle, from the initial complexity assessment of reading materials, the assignment of texts to learners, the capture of metacognitions reflected in one's textual verbalizations  ...  Secondly, a variety of metrics based on the span and the coverage of lexical chains [20] provide insight in terms of lexicon variety and of cohesion.  ... 
doi:10.1007/978-3-642-39112-5_39 fatcat:6ppzerxwyvh27kiowro26ise7m

Mining Texts, Learner Productions and Strategies with ReaderBench [chapter]

Mihai Dascalu, Philippe Dessus, Maryse Bianco, Stefan Trausan-Matu, Aurélie Nardy
2013 Studies in Computational Intelligence  
Of particular importance are the cohesion and coherence properties of texts that can help or impair [9] and, moreover, interact with reader's personal characteristics [8, 10] .  ...  It has long been recognized that the comprehension performance differs according to lexical and syntactical complexity, as well as to the thematic content and to how information is structured [7, 8] .  ...  the POSDRU/107/1.5/S/76909 Harnessing human capital in research through doctoral scholarships (ValueDoc) projects.  ... 
doi:10.1007/978-3-319-02738-8_13 fatcat:czzpnaqadjbotksfyigo2u64ly

Quantitative Discourse Cohesion Analysis of Scientific Scholarly Texts using Multilayer Networks [article]

Vasudha Bhatnagar, Swagata Duari, S.K. Gupta
2022 arXiv   pre-print
Exploiting the hierarchical structure of scientific scholarly texts, we design section-level and document-level metrics to assess the extent of lexical cohesion in text.  ...  In this study, we aim to computationally analyze the discourse cohesion in scientific scholarly texts using multilayer network representation and quantify the writing quality of the document.  ...  Acknowledgement This work is supported by Department of Science and Technology, Govt. of India, grant MTR/2019/000604.  ... 
arXiv:2205.07532v1 fatcat:2dsv33cv7fgcneesq25dhtdhm4

Extracting Informative Content Units in Text Documents

Jürgen Reischer
2020 Zenodo  
The notion of semantic and thematic informativeness of text is explored in theory and practice.  ...  Possible applications for semantic text processing including conceptual indexing and passage extraction are presented and discussed.  ...  The user wants informative documents, not just relevant ones. This should be considered at least in passage or document ranking.  ... 
doi:10.5281/zenodo.4134777 fatcat:ajo3ka472nhrtalpulfrp4wlai

Web Search Result Clustering using Heuristic Search and Latent Semantic Indexing

Mansaf Alam, Kishwar Sadaf
2012 International Journal of Computer Applications  
Traditional search engines use the hyperlink structure of the web to retrieve documents or pages and give them in a ranked fashion to the user.  ...  General Terms Document Clustering, Heuristic Search, Semantic Similarity, LSI et al.  ...  This query vector is then compared to all documents in the dataset to determine the proximity between query and documents.  ... 
doi:10.5120/6342-8633 fatcat:yioawe6ehvdmphywsdoq3ls7oy