Latent semantic indexing (LSI) fails for TREC collections

Avinash Atreya, Charles Elkan
2011 SIGKDD Explorations  
The aim of latent semantic indexing (LSI) is to uncover the relationships between terms, hidden concepts, and documents.  ...  All proposed methods are evaluated experimentally on the four TREC collections mentioned above. The experiments show that the new variants of LSI improve upon previous LSI methods.  ...  Latent semantic indexing (LSI).  ... 
doi:10.1145/1964897.1964900 fatcat:kz4u27rvbbcyvj4cj23oiwvja4

Essential Dimensions of Latent Semantic Indexing (LSI)

April Kontostathis
2007 2007 40th Annual Hawaii International Conference on System Sciences (HICSS'07)  
Latent Semantic Indexing (LSI) is commonly used to match queries to documents in information retrieval applications.  ...  In this paper, we first develop a model for understanding which values in the reduced dimensional space contain the term relationship (latent semantic) information.  ...  Introduction Latent Semantic Indexing (LSI) has been applied to a wide variety of learning tasks involving textual data.  ... 
doi:10.1109/hicss.2007.213 dblp:conf/hicss/Kontostathis07 fatcat:ogdvacbaivgyjl2vgxckcr7p6i

A framework for understanding Latent Semantic Indexing (LSI) performance

April Kontostathis, William M. Pottenger
2006 Information Processing & Management  
In this paper we present a theoretical model for understanding the performance of Latent Semantic Indexing (LSI) search and retrieval applications.  ...  Many models for understanding LSI have been proposed. Ours is the first to study the values produced by LSI in the term by dimension vectors.  ...  Wei-Min Huang in developing the proof of the transitivity in LSI as well as in reviewing drafts of this article. The authors also would like to express their gratitude to Dr. Brian D.  ... 
doi:10.1016/j.ipm.2004.11.007 fatcat:ax2h5j7xnbdyneuya3hqfvwbfa

Regularized Latent Semantic Indexing

Quan Wang, Jun Xu, Hang Li, Nick Craswell
2013 ACM Transactions on Information Systems  
Regularized latent semantic indexing: A new approach to large-scale topic modeling.  ...  In this article we introduce Regularized Latent Semantic Indexing (RLSI), including a batch version and an online version, referred to as batch RLSI and online RLSI, respectively, to scale up topic modeling  ...  A document is viewed as a bag of terms generated from a mixture of latent topics. Various topic modeling methods, such as Latent Semantic Indexing (LSI) [Deerwester et al. 1990], Probabilistic Latent  ... 
doi:10.1145/2414782.2414787 fatcat:wqq4oljj5vc3dlqcwz7ydnzs2m

Regularized latent semantic indexing

Quan Wang, Jun Xu, Hang Li, Nick Craswell
2011 Proceedings of the 34th international ACM SIGIR conference on Research and development in Information - SIGIR '11  
Regularized latent semantic indexing: A new approach to large-scale topic modeling.  ...  In this article we introduce Regularized Latent Semantic Indexing (RLSI), including a batch version and an online version, referred to as batch RLSI and online RLSI, respectively, to scale up topic modeling  ...  A document is viewed as a bag of terms generated from a mixture of latent topics. Various topic modeling methods, such as Latent Semantic Indexing (LSI) [Deerwester et al. 1990], Probabilistic Latent  ... 
doi:10.1145/2009916.2010008 dblp:conf/sigir/WangXLC11 fatcat:yf7it3hhmrddhjbr5v5xgnqwr4

Straightforward Feature Selection for Scalable Latent Semantic Indexing [chapter]

Jun Yan, Shuicheng Yan, Ning Liu, Zheng Chen
2009 Proceedings of the 2009 SIAM International Conference on Data Mining  
Latent Semantic Indexing (LSI) has been validated to be effective on many small scale text collections.  ...  In this paper, we propose a straightforward feature selection strategy, which is named as Feature Selection for Latent Semantic Indexing (FSLSI), as a preprocessing step such that LSI can be efficiently  ...  Introduction Latent Semantic Indexing (LSI) [5] was originally proposed for dealing with the synonymy and polysemy problems in text analysis.  ... 
doi:10.1137/1.9781611972795.99 dblp:conf/sdm/YanYLC09 fatcat:g2dsvjlorjhrvjfmbvjgoa3mj4

Identification of Critical Values in Latent Semantic Indexing [chapter]

April Kontostathis, William M. Pottenger, Brian D. Davison
2005 Studies in Computational Intelligence  
In this chapter we analyze the values used by Latent Semantic Indexing (LSI) for information retrieval.  ...  Removal of 90% of the values degrades retrieval performance slightly for smaller collections, but improves retrieval performance by 60% on the large TREC collection we tested.  ...  Background and Related Work Latent Semantic Indexing (LSI) [5] is a well-known technique used in information retrieval.  ... 
doi:10.1007/11498186_19 fatcat:cy3wi36p3vcojhj6xqnklqbrcu

On scaling latent semantic indexing for large peer-to-peer systems

Chunqiang Tang, Sandhya Dwarkadas, Zhichen Xu
2004 Proceedings of the 27th annual international conference on Research and development in information retrieval - SIGIR '04  
One pioneering work along this direction is pSearch [32, 33]. pSearch places documents onto a peer-to-peer overlay network according to semantic vectors produced using Latent Semantic Indexing (LSI).  ...  and documents improves recall by 76%. (3) To further improve retrieval quality, we use low-dimensional subvectors of semantic vectors to cluster documents in the overlay and then use Okapi to guide the  ...  from Latent Semantic Indexing (LSI) [3, 8].  ... 
doi:10.1145/1008992.1009014 dblp:conf/sigir/TangDX04 fatcat:mv7ixilijfglhckj26zsuk4ry4

Fast updating algorithms for latent semantic indexing [article]

Eugene Vecharynski, Yousef Saad
2014 arXiv   pre-print
This paper discusses a few algorithms for updating the approximate Singular Value Decomposition (SVD) in the context of information retrieval by Latent Semantic Indexing (LSI) methods.  ...  Latent Semantic Indexing (LSI), introduced in [9], is a well-established text mining technique that aims at finding documents in a given collection that are relevant to a user's query.  ...  In the projected space, semantically similar documents tend to be close to each other in a certain measure, which allows to compare them according to their latent semantics rather than a straightforward  ... 
arXiv:1310.2008v2 fatcat:mbstc4ljjrf5xnaykjkku6spma

Latent Semantic Indexing Via a Semi-Discrete Matrix Decomposition [chapter]

Tamara G. Kolda, Dianne P. O'Leary
1999 IMA Volumes in Mathematics and its Applications  
Latent Semantic Indexing represents documents by approximations and tends to cluster documents on similar topics even if their term profiles are somewhat different.  ...  We recommend the original LSI paper [3], a paper by Dumais reporting the effectiveness of the LSI approach on the TREC-3 dataset [4], and a more mathematical paper by Berry, Dumais and O'Brien [1]  ...  Latent semantic indexing (LSI) is based on the assumption that exact matching of the query does not necessarily retrieve the most relevant documents.  ... 
doi:10.1007/978-1-4612-1524-0_5 fatcat:cvpc7dzqdna6bpcgxr55ofvmwq

A semidiscrete matrix decomposition for latent semantic indexing information retrieval

Tamara G. Kolda, Dianne P. O'Leary
1998 ACM Transactions on Information Systems  
Latent semantic indexing (LSI) replaces the document matrix with an approximation generated by the truncated singular-value decomposition (SVD).  ...  We will describe the SDD approximation, show how to compute it, and compare the SDD-based LSI method to the SVD-based LSI method.  ...  Latent semantic indexing (LSI) overcomes this problem by automatically discovering latent relationships in the document collection.  ... 
doi:10.1145/291128.291131 fatcat:7wou5qo3zfeqrlicbaajcohhei


Radha Guha
2020 International Research Journal of Computer Science  
This paper explores information retrieval models and experiments with Latent Semantic Indexing (LSI) first and then with the more efficient topic modeling algorithm of Latent Dirichlet Allocation (LDA).  ...  Latent Semantic Indexing (LSI) is also known as Latent Semantic Analysis (LSA). LSI/LSA is a topic-modeling machine learning algorithm. A topic is a cluster of words that frequently occur together.  ...  This way of using SVD for de-noising topics is called Latent Semantic Indexing or Latent Semantic Analysis.  ... 
doi:10.26562/irjcs.2020.v0705.001 fatcat:3mmmcy5kuve5hetxfh456bxwoy

Incorporating latent semantic indexing into a neural network model for information retrieval

Inien Syu, S. D. Lang, Narsingh Deo
1996 Proceedings of the fifth international conference on Information and knowledge management - CIKM '96  
We incorporate the Latent Semantic Indexing (LSI) technique into a competition-based neural network model for information retrieval.  ...  Since the process of creating or updating a thesaurus is rather expensive, we apply the LSI technique to provide an automated procedure that captures the semantic relationship between the documents and  ...  approach to modeling the latent semantic relationships between the documents and the index terms.  ... 
doi:10.1145/238355.238475 dblp:conf/cikm/SyuLD96 fatcat:gnb7ex6i2jgavjjyj4potun3ru

Distributed, Large-Scale Latent Semantic Analysis by Index Interpolation

Sebastiano Vigna
2008 Proceedings of the Third International ICST Conference on Scalable Information Systems  
Latent semantic analysis [12] is a well-known technique to extrapolate concepts from a set of documents; it discards noise by reducing the rank of (a variant of) the term/document matrix of a document  ...  Moreover, our approach is advantageous when the document collection is large, because the number of terms over which latent semantic analysis has to be performed is inherently limited by the size of a  ...  semantic indexing.  ... 
doi:10.4108/icst.infoscale2008.3500 dblp:conf/infoscale/Vigna08 fatcat:rfz2h2dyvvc6jfnoy2pekeh3x4

Sentence Retrieval with LSI and Topic Identification [chapter]

David Parapar, Álvaro Barreiro
2006 Lecture Notes in Computer Science  
We have compared the performance of the Latent Semantic Indexing (LSI) retrieval model against the performance of a topic identification method, also based on Singular Value Decomposition (SVD) but with  ...  We used the TREC Novelty Track collections from years 2002 and 2003 for the evaluation.  ...  Our idea was to test Latent Semantic Indexing (LSI) because it had not been used before in this task and because it can lead to more general and less ad-hoc solutions and because only a small set of documents  ... 
doi:10.1007/11735106_12 fatcat:vgj5ywlu2rddbhzt2lgmbe5d7i