Latent semantic indexing (LSI) fails for TREC collections
2011
SIGKDD Explorations
The aim of latent semantic indexing (LSI) is to uncover the relationships between terms, hidden concepts, and documents. ...
All proposed methods are evaluated experimentally on the four TREC collections mentioned above. The experiments show that the new variants of LSI improve upon previous LSI methods. ...
Latent semantic indexing (LSI). ...
doi:10.1145/1964897.1964900
fatcat:kz4u27rvbbcyvj4cj23oiwvja4
Essential Dimensions of Latent Semantic Indexing (LSI)
2007
2007 40th Annual Hawaii International Conference on System Sciences (HICSS'07)
Latent Semantic Indexing (LSI) is commonly used to match queries to documents in information retrieval applications. ...
In this paper, we first develop a model for understanding which values in the reduced dimensional space contain the term relationship (latent semantic) information. ...
Introduction Latent Semantic Indexing (LSI) has been applied to a wide variety of learning tasks involving textual data. ...
doi:10.1109/hicss.2007.213
dblp:conf/hicss/Kontostathis07
fatcat:ogdvacbaivgyjl2vgxckcr7p6i
A framework for understanding Latent Semantic Indexing (LSI) performance
2006
Information Processing & Management
In this paper we present a theoretical model for understanding the performance of Latent Semantic Indexing (LSI) search and retrieval applications. ...
Many models for understanding LSI have been proposed. Ours is the first to study the values produced by LSI in the term by dimension vectors. ...
Wei-Min Huang in developing the proof of the transitivity in LSI as well as in reviewing drafts of this article. The authors also would like to express their gratitude to Dr. Brian D. ...
doi:10.1016/j.ipm.2004.11.007
fatcat:ax2h5j7xnbdyneuya3hqfvwbfa
Regularized Latent Semantic Indexing
2013
ACM Transactions on Information Systems
Regularized latent semantic indexing: A new approach to large-scale topic modeling. ...
In this article we introduce Regularized Latent Semantic Indexing (RLSI), including a batch version and an online version, referred to as batch RLSI and online RLSI, respectively, to scale up topic modeling ...
A document is viewed as a bag of terms generated from a mixture of latent topics. Various topic modeling methods, such as Latent Semantic Indexing (LSI) [Deerwester et al. 1990], Probabilistic Latent ...
doi:10.1145/2414782.2414787
fatcat:wqq4oljj5vc3dlqcwz7ydnzs2m
Regularized latent semantic indexing
2011
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information - SIGIR '11
Regularized latent semantic indexing: A new approach to large-scale topic modeling. ...
In this article we introduce Regularized Latent Semantic Indexing (RLSI), including a batch version and an online version, referred to as batch RLSI and online RLSI, respectively, to scale up topic modeling ...
A document is viewed as a bag of terms generated from a mixture of latent topics. Various topic modeling methods, such as Latent Semantic Indexing (LSI) [Deerwester et al. 1990], Probabilistic Latent ...
doi:10.1145/2009916.2010008
dblp:conf/sigir/WangXLC11
fatcat:yf7it3hhmrddhjbr5v5xgnqwr4
Straightforward Feature Selection for Scalable Latent Semantic Indexing
[chapter]
2009
Proceedings of the 2009 SIAM International Conference on Data Mining
Latent Semantic Indexing (LSI) has been validated to be effective on many small scale text collections. ...
In this paper, we propose a straightforward feature selection strategy, which is named as Feature Selection for Latent Semantic Indexing (FSLSI), as a preprocessing step such that LSI can be efficiently ...
Introduction Latent Semantic Indexing (LSI) [5] was originally proposed for dealing with the synonymy and polysemy problems in text analysis. ...
doi:10.1137/1.9781611972795.99
dblp:conf/sdm/YanYLC09
fatcat:g2dsvjlorjhrvjfmbvjgoa3mj4
Identification of Critical Values in Latent Semantic Indexing
[chapter]
2005
Studies in Computational Intelligence
In this chapter we analyze the values used by Latent Semantic Indexing (LSI) for information retrieval. ...
Removal of 90% of the values degrades retrieval performance slightly for smaller collections, but improves retrieval performance by 60% on the large TREC collection we tested. ...
Background and Related Work Latent Semantic Indexing (LSI) [5] is a well-known technique used in information retrieval. ...
doi:10.1007/11498186_19
fatcat:cy3wi36p3vcojhj6xqnklqbrcu
On scaling latent semantic indexing for large peer-to-peer systems
2004
Proceedings of the 27th annual international conference on Research and development in information retrieval - SIGIR '04
One pioneering work along this direction is pSearch [32, 33]. pSearch places documents onto a peer-to-peer overlay network according to semantic vectors produced using Latent Semantic Indexing (LSI). ...
and documents improves recall by 76%. (3) To further improve retrieval quality, we use low-dimensional subvectors of semantic vectors to cluster documents in the overlay and then use Okapi to guide the ...
from Latent Semantic Indexing (LSI) [3, 8]. ...
doi:10.1145/1008992.1009014
dblp:conf/sigir/TangDX04
fatcat:mv7ixilijfglhckj26zsuk4ry4
Fast updating algorithms for latent semantic indexing
[article]
2014
arXiv
pre-print
This paper discusses a few algorithms for updating the approximate Singular Value Decomposition (SVD) in the context of information retrieval by Latent Semantic Indexing (LSI) methods. ...
Latent Semantic Indexing (LSI), introduced in [9], is a well-established text mining technique that aims at finding documents in a given collection that are relevant to a user's query. ...
In the projected space, semantically similar documents tend to be close to each other in a certain measure, which allows them to be compared according to their latent semantics rather than a straightforward ...
arXiv:1310.2008v2
fatcat:mbstc4ljjrf5xnaykjkku6spma
Latent Semantic Indexing Via a Semi-Discrete Matrix Decomposition
[chapter]
1999
IMA Volumes in Mathematics and its Applications
Latent Semantic Indexing represents documents by approximations and tends to cluster documents on similar topics even if their term profiles are somewhat different. ...
We recommend the original LSI paper [3], a paper by Dumais reporting the effectiveness of the LSI approach on the TREC-3 dataset [4], and a more mathematical paper by Berry, Dumais and O'Brien [1] ...
Latent semantic indexing (LSI) is based on the assumption that exact matching of the query does not necessarily retrieve the most relevant documents. ...
doi:10.1007/978-1-4612-1524-0_5
fatcat:cvpc7dzqdna6bpcgxr55ofvmwq
A semidiscrete matrix decomposition for latent semantic indexing information retrieval
1998
ACM Transactions on Information Systems
Latent semantic indexing (LSI) replaces the document matrix with an approximation generated by the truncated singular-value decomposition (SVD). ...
We will describe the SDD approximation, show how to compute it, and compare the SDD-based LSI method to the SVD-based LSI method. ...
Latent semantic indexing (LSI) overcomes this problem by automatically discovering latent relationships in the document collection. ...
doi:10.1145/291128.291131
fatcat:7wou5qo3zfeqrlicbaajcohhei
EXPLORING INFORMATION RETRIEVAL BY LATENT SEMANTIC INDEXING AND LATENT DIRICHLET ALLOCATION TECHNIQUES
2020
International Research Journal of Computer Science
This paper explores information retrieval models and experiments with Latent Semantic Indexing (LSI) first and then with the more efficient topic modeling algorithm of Latent Dirichlet Allocation (LDA). ...
Latent Semantic Indexing (LSI) is also known as Latent Semantic Analysis (LSA). LSI/LSA is a topic-modeling machine learning algorithm. A topic is a cluster of words that frequently occur together. ...
This way of using SVD for de-noising topics is called Latent Semantic Indexing or Latent Semantic Analysis. ...
doi:10.26562/irjcs.2020.v0705.001
fatcat:3mmmcy5kuve5hetxfh456bxwoy
Incorporating latent semantic indexing into a neural network model for information retrieval
1996
Proceedings of the fifth international conference on Information and knowledge management - CIKM '96
We incorporate the Latent Semantic Indexing (LSI) technique into a competition-based neural network model for information retrieval. ...
Since the process of creating or updating a thesaurus is rather expensive, we apply the LSI technique to provide an automated procedure that captures the semantic relationship between the documents and ...
approach to modeling the latent semantic relationships between the documents and the index terms. ...
doi:10.1145/238355.238475
dblp:conf/cikm/SyuLD96
fatcat:gnb7ex6i2jgavjjyj4potun3ru
Distributed, Large-Scale Latent Semantic Analysis by Index Interpolation
2008
Proceedings of the Third International ICST Conference on Scalable Information Systems
Latent semantic analysis [12] is a well-known technique to extrapolate concepts from a set of documents; it discards noise by reducing the rank of (a variant of) the term/document matrix of a document ...
Moreover, our approach is advantageous when the document collection is large, because the number of terms over which latent semantic analysis has to be performed is inherently limited by the size of a ...
semantic indexing. ...
doi:10.4108/icst.infoscale2008.3500
dblp:conf/infoscale/Vigna08
fatcat:rfz2h2dyvvc6jfnoy2pekeh3x4
Sentence Retrieval with LSI and Topic Identification
[chapter]
2006
Lecture Notes in Computer Science
We have compared the performance of the Latent Semantic Indexing (LSI) retrieval model against the performance of a topic identification method, also based on Singular Value Decomposition (SVD) but with ...
We used the TREC Novelty Track collections from years 2002 and 2003 for the evaluation. ...
Our idea was to test Latent Semantic Indexing (LSI) because it had not been used before in this task and because it can lead to more general and less ad-hoc solutions and because only a small set of documents ...
doi:10.1007/11735106_12
fatcat:vgj5ywlu2rddbhzt2lgmbe5d7i