
Augmenting conversational dialogue by means of latent semantic googling

Robin Senior, Roel Vertegaal
2005 Proceedings of the 7th international conference on Multimodal interfaces - ICMI '05  
This paper presents Latent Semantic Googling, a variant of Landauer's Latent Semantic Indexing that uses the Google search engine to judge the semantic closeness of sets of words and phrases.  ...  This concept is implemented via Ambient Google, a system for augmenting conversations through the classification of discussed topics.  ...  LSI Word Culling The general concept behind LSI is the search for word cooccurrence in large sets of documents.  ... 
doi:10.1145/1088463.1088490 dblp:conf/icmi/SeniorV05 fatcat:zgvkpebnnngvjlh2e76oh3pfui

Mapping texts through dimensionality reduction and visualization techniques for interactive exploration of document collections

Alneu de Andrade Lopes, Rosane Minghim, Vinícius Melo, Fernando V. Paulovich, Robert F. Erbacher, Jonathan C. Roberts, Matti T. Gröhn, Katy Börner
2006 Visualization and Data Analysis 2006  
The projection is followed by hierarchical clustering to support sub-area identification. The map can be interactively explored, helping to narrow down the search for relevant articles.  ...  The tool can deal with the exploration of inter-topics and intra-topic relationship and is useful in many contexts that need deciding on relevant articles to read, such as scientific research, education  ...  We wish to acknowledge the work of our undergraduate and research students as well as research colleagues in processing some data and discussing various issues of the work.  ... 
doi:10.1117/12.650899 dblp:conf/vda/LopesMMP06 fatcat:5tcnwxtov5fivlzzwqabcnkyuq

Text Mining Through Label Induction Grouping Algorithm Based Method [article]

Gulshan Saleem, Nisar Ahmed, Usman Qamar
2021 arXiv   pre-print
As LINGO uses VSM in cluster content discovery, our task is to replace VSM with LSI for cluster content discovery and to analyze the feasibility of using LSI with Okapi BM25.  ...  The research is applied to five different text-based data sets to get more reliable results for every method.  ...  Our research is based on semantic-based clustering of web search results and uses latent semantic indexing for content discovery to add the quality of synonymy, and it gives better results when compared to  ... 
arXiv:2112.08486v1 fatcat:q4jr2zlfhbh4lcqxdosto6jrxu

Evaluation of related news recommendations using document similarity methods

Marko Pranjić, Vid Podpečan, Marko Robnik-Šikonja, Senja Pollak
2020 Zenodo  
A set of related articles is a useful addition to the newly published news. Such news articles contain more context and background information and provide a richer experience to the reader.  ...  Our results show that the tf-idf weighting applied to bag-of-words document representation offers better matching with links manually selected by journalists than more sophisticated approaches, such as  ...  The results of this publication reflect only the author's view and the Commission is not responsible for any use that may be made of the information it contains.  ... 
doi:10.5281/zenodo.4059710 fatcat:nhauxf25ezfs3fahvhjf2xeuty

A semantic relatedness approach for traceability link recovery

Anas Mahmoud, Nan Niu, Songhua Xu
2012 2012 20th IEEE International Conference on Program Comprehension (ICPC)  
In this paper, we propose an approach, based on semantic relatedness (SR), which brings human judgment to an earlier stage of the tracing process by integrating it into the underlying retrieval mechanism  ...  Index Terms: information search and retrieval, automated tracing, semantic relatedness, experimentation.  ...  measure for SR estimation. 4) Practicality: In general, web search-based SR measures require initiating a web search request for each query [27].  ... 
doi:10.1109/icpc.2012.6240487 dblp:conf/iwpc/MahmoudNX12 fatcat:rw6hkt3ubzdvvfdt7bysmq6zdq

Combining text and link analysis for focused crawling—An application for vertical search engines

G. Almpanidis, C. Kotropoulos, I. Pitas
2007 Information Systems  
The number of vertical search engines and portals has rapidly increased over the last years, making the importance of a topic-driven (focused) crawler self-evident.  ...  Our implementation presents a different approach to focused crawling and aims to overcome the limitations imposed by the need to provide initial data for training, while maintaining a high recall/precision  ...  We would like to thank Mr. Athanasios Papaioannou for his contributions in the implementation of the PLSI algorithm.  ... 
doi:10.1016/ fatcat:pqcpgeyu4revtavnhmz4ger2ju

A Literature Review on Patent Information Retrieval Techniques

Alok Khode, Sagar Jambhorkar
2017 Indian Journal of Science and Technology  
Patentability search is an important step in the patent process, and missing any relevant patent may cause expensive legal consequences.  ...  Objective: Patents are critical intellectual assets for any competitive business. They can prove to be a gold mine if retrieved, analyzed and utilized appropriately.  ...  Further, the study uses Support Vector Machines (SVM) [42] to merge the ranked results and again re-rank them using additional training sets created from the patent collection.  ... 
doi:10.17485/ijst/2017/v10i37/116435 fatcat:sux6dzrm3re7dig44xrpl7agya

Weak signal identification with semantic web mining

Dirk Thorleuchter, Dirk Van den Poel
2013 Expert systems with applications  
In contrast to related research, a methodology is provided that uses latent semantic indexing (LSI) for the identification of weak signals.  ...  A new weak signal maximization approach is introduced that replaces the commonly used prediction modeling approach in LSI.  ...  The words that are used to formulate the hypothesis are considered for the next step, the creation of search queries.  ... 
doi:10.1016/j.eswa.2013.03.002 fatcat:hq53ovz2lffwlb4gvzpucw5wi4

A Case Study on the Impact of Similarity Measure on Information Retrieval based Software Engineering Tasks [article]

Md Masudur Rahman, Saikat Chakraborty, Gail Kaiser, Baishakhi Ray
2018 arXiv   pre-print
In contrast, simple keyword-based bag-of-words models perform better on code artifacts.  ...  The performance of any IR method critically depends on selecting an appropriate similarity measure for the given application domain.  ...  We use these categories as queries to search GitHub for relevant Java projects using GitHub search API [26].  ... 
arXiv:1808.02911v1 fatcat:frgps4ys3zcwvnrzkhkd6hfgh4

Improving the Transcription of Academic Lectures for Information Retrieval

Audrey Mbogho, Stephen Marquard
2013 2013 12th International Conference on Machine Learning and Applications  
This paper looks into the use of Wikipedia to dynamically adapt language models for scholarly speech.  ...  Transcribing recordings greatly enhances their usefulness by making them easy to search. However, the number of recordings accumulates rapidly, rendering manual transcription impractical.  ...  The output of this pass is a sparse matrix of words by articles. In Pass 3, the sparse matrix is used to create the LSI model for 400 topics.  ... 
doi:10.1109/icmla.2013.177 dblp:conf/icmla/MboghoM13 fatcat:sfxozqjzrnfe7ocsmba7imyp3q

Learning similarity measures in non-orthogonal space

Ning Liu, Benyu Zhang, Jun Yan, Qiang Yang, Shuicheng Yan, Zheng Chen, Fengshan Bai, Wei-Ying Ma
2004 Proceedings of the Thirteenth ACM conference on Information and knowledge management - CIKM '04  
Experimental results on a synthetic data set, real MSN search click-through logs, and the 20NG dataset show that our algorithm outperforms the traditional Cosine similarity and is superior to LSI.  ...  Various algorithms such as Latent Semantic Indexing (LSI) were used to solve this problem by projecting the original data into an orthogonal space.  ...  Dataset: In order to study the effectiveness of SNOS for measuring the similarity of web objects, experiments are conducted on a real user query click-through log collected by the MSN Web search engine  ... 
doi:10.1145/1031171.1031240 dblp:conf/cikm/LiuZYYYCBM04 fatcat:j3syget3qjhqfnp3mt527gsdke

Unifying Textual and Visual Cues for Content-Based Image Retrieval on the World Wide Web

Stan Sclaroff, Marco La Cascia, Saratendu Sethi, Leonid Taycher
1999 Computer Vision and Image Understanding  
Textual statistics are captured in vector form using latent semantic indexing based on text in the containing HTML document.  ...  A system is proposed that combines textual and visual statistics in a single index vector for content-based search of a WWW image database.  ...  Also, the dictionary used to create the word histograms is ad hoc and was chosen according to the frequencies of various words in a set of representative documents.  ... 
doi:10.1006/cviu.1999.0765 fatcat:2ecfke6agrhtdbncn2trsunbqu

Semantics-Based Automated Service Discovery

A. V. Paliwal, B. Shafiq, J. Vaidya, Hui Xiong, N. Adam
2012 IEEE Transactions on Services Computing  
Additionally, we utilize clustering for accurately classifying the web services based on service functionality.  ...  We propose a solution for achieving functional level service categorization based on an ontology framework.  ...  ACKNOWLEDGMENTS This work was supported in part by the US National Science Foundation under grant IIS-0306838 and SAP Labs, LLC.  ... 
doi:10.1109/tsc.2011.19 fatcat:g3pnujbmubdsrczkjlxgq7dvbe

A Retrieval Sorting Approach for Online Forums Based on Domain Topics

Yu Yan Zhang
2013 Advanced Materials Research  
Focusing on the development of Web 2.0 applications, a result ranking approach is proposed on the basis of LDA model to rank the search results from Web forums.  ...  This work has important significance for the research of improving the performance of retrieval results of web forums.  ...  ACKNOWLEDGMENT The author is most grateful to the anonymous referees for their constructive and helpful comments on the earlier version of the manuscript that helped to improve the presentation of the  ... 
doi:10.4028/ fatcat:uangrd6yafe4tjkeyxjyz4okaa

Distributed Semantic Overlay Networks [chapter]

Christos Doulkeridis, Akrivi Vlachou, Kjetil Nørvåg, Michalis Vazirgiannis
2009 Handbook of Peer-to-Peer Networking  
A classification of existing algorithms according to a set of qualitative criteria is also provided.  ...  efficient search.  ...  One of the important problems in P2P search is the high number of contacted peers that do not contribute to the final result set.  ... 
doi:10.1007/978-0-387-09751-0_17 fatcat:g7tbfz33mncpnjgywn67ayx22q