3,141 Hits in 6.1 sec

Exploiting parallelism to support scalable hierarchical clustering

Rebecca J. Cathey, Eric C. Jensen, Steven M. Beitzel, Ophir Frieder, David Grossman
2007 Journal of the American Society for Information Science and Technology  
Finally, we show how our parallel hierarchical agglomerative clustering algorithm can be used as the clustering subroutine for a parallel version of the Buckshot algorithm to cluster the complete TREC  ...  A distributed memory parallel version of the group average Hierarchical Agglomerative Clustering algorithm is proposed to enable scaling the document clustering problem to large collections.  ...  We implemented our algorithm in Java 1.4, using the MPI for Java library [48] as a wrapper for the MPICH [46] implementation of MPI.  ... 
doi:10.1002/asi.20596 fatcat:niybb3f4zzdvpna2vlu537rvza

Development of an Efficient Hierarchical Clustering Analysis using an Agglomerative Clustering Algorithm

Arshia Naeem, Mariam Rehman, Maria Anjum, Muhammad Asif
2019 Current Science  
Among the different groups of clustering algorithms, the agglomerative algorithm is widely used in the document clustering domain.  ...  This study aimed to examine the effectiveness of the agglomerative clustering algorithm in document clustering by enhancing its efficiency and evaluating it through implementation.  ...  Hierarchical clustering algorithms create nested clusters by successively merging instances (agglomerative mode) or splitting them (divisive mode).  ... 
doi:10.18520/cs/v117/i6/1045-1053 fatcat:cvqcmjenn5ajvo6ygqfh6o6lsi

A new method for fuzzy information retrieval based on fuzzy hierarchical clustering and fuzzy inference techniques

Yih-Jen Horng, Shyi-Ming Chen, Yu-Chuan Chang, Chia-Hoang Lee
2005 IEEE transactions on fuzzy systems  
First, we present a fuzzy agglomerative hierarchical clustering algorithm for clustering documents and to get the document cluster centers of document clusters.  ...  In this paper, we extend the work of Kraft et al. to present a new method for fuzzy information retrieval based on fuzzy hierarchical clustering and fuzzy inference techniques.  ...  For example, in the clustering process using the agglomerative hierarchical clustering method (e.g., the complete link method and the proposed fuzzy agglomerative hierarchical clustering method), two document  ... 
doi:10.1109/tfuzz.2004.840134 fatcat:w3afn4q2qnhgnlvldhju3nty64

Comparing the Performance of SOM with Traditional Methods for Document Clustering Using Wordnet Ontologies

Abhishek Sawalkar, Mohit Mandlecha, Dnyanesh Kulkarni, Dr. Ratnamala S. Paswan
2022 International Journal for Research in Applied Science and Engineering Technology  
The suggested method employs WordNet to determine the relevance of the concepts in the text, and then clusters the content using several document clustering algorithms (K-means, Agglomerative Clustering  ...  We wish to compare alternative ways of making document clustering algorithms more successful.  ...  Agglomerative Clustering Algorithm: Agglomerative clustering is a frequently used algorithm for hierarchical clustering. It is also known as AGNES (Agglomerative Nesting).  ... 
doi:10.22214/ijraset.2022.41554 fatcat:zotnkyzlsveodjujbssucetuxm

Using Data Fusion for a Context Aware Document Clustering

P. Venkateshkumar, A. Subramani
2013 International Journal of Computer Applications  
Agglomerative clustering and Bisecting K-Means are used to cluster the extracted features.  ...  In this paper, a new method for clustering documents is proposed. In the proposed method, the term frequency of the document collection is computed and context-based terms are fused.  ...  Babu et al. [12] proposed a relevant document information clustering algorithm for web search machines. k-means partitioning algorithms and Hierarchical clustering algorithms used in clustering process  ... 
doi:10.5120/12497-7430 fatcat:dfd5edpk7bbnddmnu5cyhqoil4

Query-based Multi-Document Summarization by Clustering of Documents

Gopal K. R. Naveen, Prema Nedungadi
2014 Proceedings of the 2014 International Conference on Interdisciplinary Advances in Applied Computing - ICONIAAC '14  
Our studies using the DUC 2002 dataset show an increase in both the efficiency and accuracy of clusters when compared to both the conventional Hierarchical Agglomerative Clustering (HAC) algorithm and  ...  In the Clustering phase, we extend the Potential-based Hierarchical Agglomerative (PHA) clustering method to a hybrid PHA-ClusteringGain-K-Means clustering approach.  ...  In Clustering phase, the retrieved documents are clustered into different topic clusters using generalized spherical k-means algorithm.  ... 
doi:10.1145/2660859.2660972 fatcat:eacyurgeargvldp6au7pm4mbcy

Recent trends in hierarchic document clustering: A critical review

Peter Willett
1988 Information Processing & Management  
This article reviews recent research into the use of hierarchic agglomerative clustering methods for document retrieval.  ...  After an introduction to the calculation of interdocument similarities and to clustering methods that are appropriate for document clustering, the article discusses algorithms that can be used to allow  ...  clustering methods and to algorithms for their efficient implementation.  ... 
doi:10.1016/0306-4573(88)90027-1 fatcat:ahh3fofnu5fobceesj6czkdzbq

An Analytical Assessment on Document Clustering

Pushplata, Ram Chatterjee
2012 International Journal of Computer Network and Information Security  
Clustering is related to data mining for information retrieval. Clustering the documents allows relevant information to be retrieved quickly.  ...  The algorithms merge clusters bottom-up in Agglomerative Hierarchical Clustering and split them top-down in Divisive Hierarchical Clustering.  ...  Document clustering is a method used for information retrieval from text documents (data mining) and web documents (web mining).  ... 
doi:10.5815/ijcnis.2012.05.08 fatcat:4nt2ryrsbjetzfpbzccloqwnmq

Hierarchical Document Clustering Based on Tolerance Rough Set Model [chapter]

Saori Kawasaki, Ngoc Binh, Tu Bao
2000 Lecture Notes in Computer Science  
Clustering is a powerful tool for knowledge discovery in text collections. The quality of document clustering depends not only on clustering algorithms but also on document representation models.  ...  We develop a hierarchical document clustering algorithm based on a tolerance rough set model (TRSM) for representing documents, which offers a way of considering semantic relatedness between documents  ...  Figure 1 describes the general TRSM-based hierarchical clustering algorithm that is an extension of the hierarchical agglomerative clustering algorithm.  ... 
doi:10.1007/3-540-45372-5_51 fatcat:k3kwl3somjcn7fd2lxxqsjlkq4

Performance Analysis of Clustering using Partitioning and Hierarchical Clustering Techniques

S. C. Punitha, P. Ranjith Jeba Thangaiah, M. Punithavalli
2014 International Journal of Database Theory and Application  
A new method called Hierarchical Agglomerative Clustering (HAC) manages clusters as a tree-like structure, which makes browsing possible.  ...  Text mining is used in several text tasks, such as information and concept/entity extraction, document summarization, entity relation modeling, categorization/classification, and clustering  ...  Document clustering is used in a large number of areas in text mining and information retrieval. Agglomerative clustering can be categorized as greedy, in the algorithmic sense.  ... 
doi:10.14257/ijdta.2014.7.6.21 fatcat:efevx6hxxrfk5jvu3reax3bxom

TALP at WePS-3 2010

Daniel Ferrés, Horacio Rodríguez
2010 Conference and Labs of the Evaluation Forum  
In our experiments we used a simple approach with three algorithms: Lingo, Hierarchical Agglomerative Clustering (HAC), and a 2-step HAC algorithm.  ...  In this paper we present our system and experiments at the Third Web People Search Workshop (WePS-3) task for clustering web people search documents in English.  ...  The clustering algorithms implemented for Lemur and used in this paper are described in [3]. These algorithms use cosine similarity in the vector space model as their metric.  ... 
dblp:conf/clef/FerresR10 fatcat:jxy6giommnfxrb3aegin3enumm

Performance Evaluation of Cluster Based Algorithm used for Text Document Classification

2015 International Journal of Science and Research (IJSR)  
In this paper we develop a complete methodology for document classification and clustering.  ...  We use these findings in the construction of a Gaussian Mixture Document Clustering (GMDC) algorithm. This algorithm models the data as a sample from a Gaussian mixture.  ...  Clustering is especially useful for organizing documents to improve retrieval and support browsing. The study of the clustering problem precedes its applicability to the text domain.  ... 
doi:10.21275/v5i5.7051602 fatcat:7astteuharayne2uyjnxljto4u

Influence of stemming on Clustering of Arabic texts: Comparative Study in Document Retrieval

Abdessalem Kelaiaia, Hayet Farida Merouani
2013 International Journal of Computer Applications  
improve Document Retrieval (DR).  ...  A classical local document system generally employs statistical methods for calculating the similarity between the introduced query and each document in the target collection to finally provide an ordered  ...  The authors have not yet been able to explore the effect of reducing the size of the representation vectors on the quality of clusters in the retrieval of Arabic documents; this task remains first on future  ... 
doi:10.5120/10536-5529 fatcat:klcwnlnflrhd3kbetz5pimehqu

An Adaptive Ontology Based Hierarchical Browsing System for CiteSeerx

Nanhong Ye, Susan Gauch, Qiang Wang, Hiep Luong
2010 2010 Second International Conference on Knowledge and Systems Engineering  
As an indispensable technique in addition to the field of Information Retrieval, Ontology based Retrieval System (or Browsing Hierarchy) has been well studied and developed both in academia and industry  ...  Then, we give an empirical analysis of unsupervised learning methods for adding new clusters to the existing browsing hierarchy.  ...  ACKNOWLEDGMENT: The authors appreciate the anonymous reviewers for their extensive and informative comments for the improvement of this paper.  ... 
doi:10.1109/kse.2010.32 fatcat:yn5znegtwzckfnpbxp63ytifyq

Efficient Document Clustering for Web Search Result

Sumathi Rani Manukonda (Asst. Prof., KMIT, Narayanguda, Hyderabad), Nomula Divya (Asst. Prof., CMRIT, Medchal, Hyderabad)
2018 International Journal of Engineering & Technology  
This paper concentrates on hierarchical clustering and k-means algorithms, and argues that k-means and its variant are more efficient than hierarchical clustering; along with this, by implementing  ...  a greedy fast k-means algorithm (GFA) for clustering documents efficiently is considered.  ...  Agglomerative and divisive are the two hierarchical clustering approaches; the more widely used is agglomerative, which treats each object as a single cluster and sequentially  ... 
doi:10.14419/ijet.v7i3.3.14494 fatcat:4q7s2dhisvhmbgwufwhk3lmxia