Iterative Residual Rescaling: An Analysis and Generalization of LSI [article]

Rie Kubota Ando, Lillian Lee
2001-06-17, arXiv, pre-print
We consider the problem of creating document representations in which inter-document similarity measurements correspond to semantic similarity. We first present a novel subspace-based framework for formalizing this task. Using this framework, we derive a new analysis of Latent Semantic Indexing (LSI), showing a precise relationship between its performance and the uniformity of the underlying distribution of documents over topics. This analysis helps explain the improvements gained by Ando's
... 0) Iterative Residual Rescaling (IRR) algorithm: IRR can compensate for distributional non-uniformity. A further benefit of our framework is that it provides a well-motivated, effective method for automatically determining the rescaling factor IRR depends on, leading to further improvements. A series of experiments over various settings and with several evaluation metrics validates our claims.
arXiv:cs/0106039v1 (https://arxiv.org/abs/cs/0106039v1), fatcat:ikldqvjgmrbalaw7jfgoowrnzi (https://fatcat.wiki/release/ikldqvjgmrbalaw7jfgoowrnzi)