Exploiting parallelism to support scalable hierarchical clustering
2007
Journal of the American Society for Information Science and Technology
Finally, we show how our parallel hierarchical agglomerative clustering algorithm can be used as the clustering subroutine for a parallel version of the Buckshot algorithm to cluster the complete TREC ...
A distributed memory parallel version of the group average Hierarchical Agglomerative Clustering algorithm is proposed to enable scaling the document clustering problem to large collections. ...
We implemented our algorithm in Java 1.4, using the MPI for Java library [48] as a wrapper for the MPICH [46] implementation of MPI. ...
doi:10.1002/asi.20596
fatcat:niybb3f4zzdvpna2vlu537rvza
Development of an Efficient Hierarchical Clustering Analysis using an Agglomerative Clustering Algorithm
2019
Current Science
Among the different groups of clustering algorithms, the agglomerative algorithm is widely used in the document clustering domain. ...
This study aimed to examine the effectiveness of the agglomerative clustering algorithm in document clustering by enhancing its efficiency and evaluating it through implementation. ...
Hierarchical clustering algorithms create nested clusters by continuously splitting the instances in agglomerative mode or divisive mode. ...
doi:10.18520/cs/v117/i6/1045-1053
fatcat:cvqcmjenn5ajvo6ygqfh6o6lsi
A new method for fuzzy information retrieval based on fuzzy hierarchical clustering and fuzzy inference techniques
2005
IEEE transactions on fuzzy systems
First, we present a fuzzy agglomerative hierarchical clustering algorithm for clustering documents and to get the document cluster centers of document clusters. ...
In this paper, we extend the work of Kraft et al. to present a new method for fuzzy information retrieval based on fuzzy hierarchical clustering and fuzzy inference techniques. ...
For example, in the clustering process using the agglomerative hierarchical clustering method (e.g., the complete link method and the proposed fuzzy agglomerative hierarchical clustering method), two document ...
doi:10.1109/tfuzz.2004.840134
fatcat:w3afn4q2qnhgnlvldhju3nty64
Comparing the Performance of SOM with Traditional Methods for Document Clustering Using Wordnet Ontologies
2022
International Journal for Research in Applied Science and Engineering Technology
The suggested method employs WordNet to determine the relevance of the concepts in the text, and then clusters the content using several document clustering algorithms (K-means, Agglomerative Clustering ...
We wish to compare alternative ways for making document clustering algorithms more successful. ...
Agglomerative Clustering Algorithm: Agglomerative clustering is a frequently used algorithm for hierarchical clustering. It is also known as AGNES (Agglomerative Nesting). ...
doi:10.22214/ijraset.2022.41554
fatcat:zotnkyzlsveodjujbssucetuxm
Using Data Fusion for a Context Aware Document Clustering
2013
International Journal of Computer Applications
Agglomerative clustering and Bisecting K-Means are used to cluster the extracted features. ...
In this paper, a new method for clustering documents is proposed. In the proposed method, the term frequency of the document collection is computed and contexts based terms are fused. ...
Babu et al. [12] proposed a relevant document information clustering algorithm for web search machines. K-means partitioning algorithms and hierarchical clustering algorithms are used in the clustering process ...
doi:10.5120/12497-7430
fatcat:dfd5edpk7bbnddmnu5cyhqoil4
Query-based Multi-Document Summarization by Clustering of Documents
2014
Proceedings of the 2014 International Conference on Interdisciplinary Advances in Applied Computing - ICONIAAC '14
Our studies using the DUC 2002 dataset show an increase in both the efficiency and accuracy of clusters when compared to both the conventional Hierarchical Agglomerative Clustering (HAC) algorithm and ...
In the Clustering phase, we extend the Potential-based Hierarchical Agglomerative (PHA) clustering method to a hybrid PHA-ClusteringGain-K-Means clustering approach. ...
In Clustering phase, the retrieved documents are clustered into different topic clusters using generalized spherical k-means algorithm. ...
doi:10.1145/2660859.2660972
fatcat:eacyurgeargvldp6au7pm4mbcy
Recent trends in hierarchic document clustering: A critical review
1988
Information Processing & Management
This article reviews recent research into the use of hierarchic agglomerative clustering methods for document retrieval. ...
After an introduction to the calculation of interdocument similarities and to clustering methods that are appropriate for document clustering, the article discusses algorithms that can be used to allow ...
clustering methods and to algorithms for their efficient implementation. ...
doi:10.1016/0306-4573(88)90027-1
fatcat:ahh3fofnu5fobceesj6czkdzbq
An Analytical Assessment on Document Clustering
2012
International Journal of Computer Network and Information Security
Clustering is related to data mining for information retrieval. Relevant information is retrieved quickly while doing the clustering of documents. ...
The algorithms merge the clusters like bottom-up approach for Agglomerative Hierarchical Clustering and top-down approach for Divisive Hierarchical Clustering. ...
Document clustering is a method, which is used for information retrieval from the text documents (data mining) and web documents (web mining). ...
doi:10.5815/ijcnis.2012.05.08
fatcat:4nt2ryrsbjetzfpbzccloqwnmq
Hierarchical Document Clustering Based on Tolerance Rough Set Model
[chapter]
2000
Lecture Notes in Computer Science
Clustering is a powerful tool for knowledge discovery in text collections. The quality of document clustering depends not only on clustering algorithms but also on document representation models. ...
We develop a hierarchical document clustering algorithm based on a tolerance rough set model (TRSM) for representing documents, which offers a way of considering semantics relatedness between documents ...
Figure 1 describes the general TRSM-based hierarchical clustering algorithm that is an extension of the hierarchical agglomerative clustering algorithm. ...
doi:10.1007/3-540-45372-5_51
fatcat:k3kwl3somjcn7fd2lxxqsjlkq4
Performance Analysis of Clustering using Partitioning and Hierarchical Clustering Techniques
2014
International Journal of Database Theory and Application
A new method called Hierarchical Agglomerative Clustering (HAC) manages clusters as a tree-like structure that makes browsing possible. ...
Text mining is used in several text tasks, such as information and concept/entity extraction, document summarization, entity-relation modeling, categorization/classification, and clustering ...
Document clustering is used in a large number of areas within text mining and information retrieval. Agglomerative clustering can be categorized as greedy, in the algorithmic sense. ...
doi:10.14257/ijdta.2014.7.6.21
fatcat:efevx6hxxrfk5jvu3reax3bxom
TALP at WePS-3 2010
2010
Conference and Labs of the Evaluation Forum
In our experiments we used a simple approach with three algorithms: Lingo, Hierarchical Agglomerative Clustering (HAC), and a 2-step HAC algorithm. ...
In this paper we present our system and experiments at the Third Web People Search Workshop (WePS-3) task for clustering web people search documents in English. ...
The clustering algorithms implemented for Lemur and used in this paper are described in [3]. These algorithms use cosine similarity in the vector space model as their metric. ...
dblp:conf/clef/FerresR10
fatcat:jxy6giommnfxrb3aegin3enumm
Performance Evaluation of Cluster Based Algorithm used for Text Document Classification
2015
International Journal of Science and Research (IJSR)
In this paper we develop a complete methodology for document classification and clustering. ...
We use these findings in the construction of a Gaussian Mixture Document Clustering (GMDC) algorithm. This algorithm models the data as a sample from a Gaussian mixture. ...
Clustering is especially useful for organizing documents to improve retrieval and support browsing. The study of the clustering problem precedes its applicability to the text domain. ...
doi:10.21275/v5i5.7051602
fatcat:7astteuharayne2uyjnxljto4u
Influence of stemming on Clustering of Arabic texts: Comparative Study in Document Retrieval
2013
International Journal of Computer Applications
improve Document Retrieval (DR). ...
A classical local document system generally employs statistical methods for calculating the similarity between the introduced query and each document in the target collection to finally provide an ordered ...
The authors have not yet been able to explore the effect of reducing the size of the representation vectors on the quality of clusters in the retrieval of Arabic documents; this task remains first on future ...
doi:10.5120/10536-5529
fatcat:klcwnlnflrhd3kbetz5pimehqu
An Adaptive Ontology Based Hierarchical Browsing System for CiteSeerx
2010
2010 Second International Conference on Knowledge and Systems Engineering
As an indispensable technique in addition to the field of Information Retrieval, Ontology based Retrieval System (or Browsing Hierarchy) has been well studied and developed both in academia and industry ...
Then, we give an empirical analysis of unsupervised learning methods for adding new clusters to the existing browsing hierarchy. ...
ACKNOWLEDGMENT The authors appreciate the anonymous reviewers for their extensive and informative comments for the improvement of this paper. ...
doi:10.1109/kse.2010.32
fatcat:yn5znegtwzckfnpbxp63ytifyq
Efficient Document Clustering for Web Search Result
2018
International Journal of Engineering & Technology
This paper concentrates on hierarchical clustering and k-means algorithms and argues that k-means and its variants are more efficient than hierarchical clustering; alongside this, implementing ...
a greedy fast k-means algorithm (GFA) for clustering documents efficiently is considered. ...
Agglomerative and divisive algorithms are the two hierarchical clustering approaches; the more widely used is the agglomerative one, which treats each object as a single cluster and sequentially ...
doi:10.14419/ijet.v7i3.3.14494
fatcat:4q7s2dhisvhmbgwufwhk3lmxia
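Several of the abstracts above describe agglomerative clustering as starting from one cluster per document and merging bottom-up, with cosine similarity in the vector space model as the metric. The following is a minimal sketch of that general idea and is not taken from any of the listed papers. The function names (`cosine`, `average_link`, `agglomerate`), the average-link merge criterion, and the stopping rule (a target number of clusters `k`) are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat an all-zero vector as having no similarity to anything.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average_link(c1, c2, vectors):
    """Mean pairwise cosine similarity between the members of two clusters."""
    total = sum(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(vectors, k):
    """Merge clusters bottom-up until only k remain (hypothetical stopping rule)."""
    # Start with every document in its own singleton cluster.
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > max(k, 1):
        best = None
        # Greedy step: find the most similar pair of clusters.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = average_link(clusters[a], clusters[b], vectors)
                if best is None or sim > best[0]:
                    best = (sim, a, b)
        _, a, b = best
        # Merge cluster b into cluster a, then drop b.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Calling `agglomerate(vectors, k)` on a list of term-weight vectors returns lists of document indices. The pairwise search makes this sketch far too slow for large collections, which is the scaling problem the parallel and k-means-based approaches in the results above set out to address.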
Showing results 1 — 15 out of 3,141 results