Toward the highest effectiveness in text description-based service retrieval
2015
Document Numérique
From the results of the experiments, we conclude that the IR model of this family, which is based on query expansion via a co-occurrence thesaurus, outperforms the effectiveness of all the models studied ...
Therefore, we have implemented this model in a text description-based service search engine, which is part of a system designed to provide nomad users with services that fulfil users' needs expressed in ...
Another hybrid approach, in which the K-means algorithm is used to divide the corpus into several clusters of documents, is proposed (see (Pan, Zhang, 2009)). ...
doi:10.3166/dn.18.2-3.155-177
fatcat:bmlz2vfo7ncntmuvc3nuonaqlm
Semi-Supervised Linear Discriminant Clustering
2014
IEEE Transactions on Cybernetics
The goal is to find a feature space where the K-means can perform well in the new space. ...
The proposed algorithm considers clustering and dimensionality reduction simultaneously by connecting K-means and linear discriminant analysis (LDA). ...
[12], [13], and clustering-based approaches [14]-[16]. ...
doi:10.1109/tcyb.2013.2278466
pmid:23996591
fatcat:dpxxp6lcyraxhb2rzrbryy2pqa
Clustering with Balancing Constraints
[chapter]
2008
Constrained Clustering
data such as text documents. ...
This chapter describes several approaches to obtaining balanced clustering results that also scale well to large data sets. ...
This research was supported in part by the Digital Technology Center Data Mining Consortium (DDMC) at the University of Minnesota, Twin Cities, and NSF grants IIS 0307792 and III-0713142. ...
doi:10.1201/9781584889977.ch8
fatcat:kj5gtm37ebbmtcvk3zw2dw2bde
Seed-Guided Deep Document Clustering
[chapter]
2020
Lecture Notes in Computer Science
This seed-guided constrained document clustering problem was recently addressed through topic modeling approaches. ...
In this paper, we jointly learn deep representations and bias the clustering results through the seed words, leading to a Seed-guided Deep Document Clustering approach. ...
Conclusion: We have introduced in this paper the SD2C framework, the first attempt, to the best of our knowledge, to constrain document clustering with seed words using a deep clustering approach. ...
doi:10.1007/978-3-030-45439-5_1
fatcat:cug7brgy6bdxzcrwynaiarcz6y
Clustering Genes Using Heterogeneous Data Sources
2010
International Journal of Knowledge Discovery in Bioinformatics
For the constrained clustering algorithm, we have studied the effectiveness of various constraints sets. ...
To deal with incomplete data sources, we have adopted the MPCK-means clustering algorithm, which is a constrained clustering algorithm, to perform exploratory analysis on one complete source (such as gene ...
In the second approach, the spherical K-means algorithm, which is a K-means algorithm using cosine-based distance, was applied to the gene-term matrix T times (we chose T = 100 here). ...
doi:10.4018/jkdb.2010040102
fatcat:i65e5huzurcord6yaojw44jknu
The optimum clustering framework: implementing the cluster hypothesis
2011
Information retrieval (Boston)
The key idea is to base cluster analysis and evaluation on a set of queries, by defining documents as being similar if they are relevant to the same queries. ...
In this paper, we present a theoretic foundation for optimum document clustering. ...
Other clustering approaches that are based on more advanced document representations use the document features in the collection (or a subset thereof) as queries. ...
doi:10.1007/s10791-011-9173-9
fatcat:6vg4ismou5hunihe7mz5kvspe4
Large-scale multi-dimensional document clustering on GPU clusters
2010
2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)
of flocking-based document clustering. ...
This method is superior to other clustering algorithms, including k-means, in the sense that the outcome is not sensitive to the initial state. ...
Acknowledgement: This work was supported in part by NSF grants CCF-0429653 and CCR-0237570 and a subcontract from ORNL. The ...
doi:10.1109/ipdps.2010.5470429
dblp:conf/ipps/ZhangMCP10
fatcat:i43znbjewfaxzbb73lxwjh6kpi
Clustering tagged documents with labeled and unlabeled documents
2013
Information Processing & Management
This study employs our proposed semi-supervised clustering method called Constrained-PLSA to cluster tagged documents with a small amount of labeled documents and uses two data sets for system performance ...
The first data set is a document set whose boundaries among the clusters are not clear; while the second one has clear boundaries among clusters. ...
In addition to the above K-means variant approaches, there are many semi-supervised clustering approaches that are extended from the other algorithms. ...
doi:10.1016/j.ipm.2012.12.004
fatcat:kdmjhrvg6fhnhpidrgrfq4dhpq
Neural Gas Clustering Adapted for Given Size of Clusters
2016
Mathematical Problems in Engineering
Common clustering approaches cannot impose constraints on sizes of clusters. However, in many applications, sizes of clusters are bounded or known in advance. ...
The convergence of algorithm towards an optimum is tested on simple illustrative examples. ...
Since we do not allow the cluster size constraints to be relaxed, we did not compare our adapted neural gas algorithm with a constrained k-means algorithm but we compared our algorithm with balanced k-means ...
doi:10.1155/2016/9324793
fatcat:mbtsbmv6onb53atxdhry467mi4
Semi-supervised model-based document clustering: A comparative study
2006
Machine Learning
The first two are extensions of the seeded k-means and constrained k-means algorithms studied by Basu et al. (2002); the last one is motivated by Cohn et al. (2000). ...
We compare three (slightly) different semi-supervised approaches for clustering documents: Seeded damnl, Constrained damnl, and Feedback-based damnl, where damnl stands for multinomial model-based deterministic ...
Basu et al. (2002) compared seeded spherical k-means and constrained spherical k-means for clustering documents and showed that the constrained version performs better. ...
doi:10.1007/s10994-006-6540-7
fatcat:n52fgsxlgfhgxkmzgtpuejpy2a
Characterizing pattern preserving clustering
2008
Knowledge and Information Systems
Experimental results on document data show that HICAP can produce overlapping clusters that preserve useful patterns, but has relatively worse clustering performance than bisecting K-means with respect ...
By contrast, in terms of entropy, K-CAP can perform substantially better than the bisecting K-means algorithm when data sets contain clusters of widely different sizes, a common situation in the real-world ...
Constrained clustering (Tung, Ng, Lakshmanan and Han, 2001) is based on the idea of using standard clustering approaches, but restricting the clustering process. ...
doi:10.1007/s10115-008-0148-0
fatcat:d23657nmerdpfe43ivfexq7efq
XML data clustering
2011
ACM Computing Surveys
In the last few years we have observed a proliferation of approaches for clustering XML documents and schemas based on their structure and content. ...
The presence of such a huge amount of approaches is due to the different applications requiring the XML data to be clustered. ...
In this phase, a partitional clustering algorithm based on a modified version of k-means is used. Evaluation Criteria. ...
doi:10.1145/1978802.1978804
fatcat:zgparleb6nbkdnoxlcxn3vyrhm
Scalable, Balanced Model-based Clustering
[chapter]
2003
Proceedings of the 2003 SIAM International Conference on Data Mining
This paper presents a general framework for adapting any generative (model-based) clustering algorithm to provide balanced solutions, i.e., clusters of comparable sizes. ...
Instead of a maximum-likelihood (ML) assignment, a balance-constrained approach is used for the sample assignment step. ...
In this paper, we take a balance-constrained approach built upon the framework of probabilistic, model-based clustering [40]. ...
doi:10.1137/1.9781611972733.7
dblp:conf/sdm/ZhongG03
fatcat:5bjfvo2u2baz7cthogdgcqievi
Co-Bidding Graphs for Constrained Paper Clustering
2016
Symposium on Languages, Applications and Technologies
We present a two-tier constrained clustering method for automatic conference scheduling that can automatically assign paper presentations into predefined schedule slots instead of requiring the program ...
We demonstrate a methodology which is capable of enriching textual information with graph-based data and utilizing both in an innovative machine learning application of clustering. ...
A methodology for mining document-enriched heterogeneous information networks. The Computer Journal, 2012. 7 John A Hartigan and Manchek A Wong. Algorithm AS 136: A k-means clustering algorithm. ...
doi:10.4230/oasics.slate.2016.1
dblp:conf/slate/SkvorcLR16
fatcat:5mnubxxg4vbl7pnednbezylnka
Improving document clustering using automated machine translation
2012
Proceedings of the 21st ACM international conference on Information and knowledge management - CIKM '12
In this work, we propose an alternative approach to address this problem using the constrained clustering framework. ...
This gives rise to an intriguing question: can we use the extra information to achieve a better clustering of the documents? ...
research via ONR grants N00014-09-1-0712 Automated Discovery and Explanation of Event Behavior, N00014-11-1-0108 Guided Learning in Dynamic Environments and NSF Grant NSF IIS-0801528 Knowledge Enhanced Clustering ...
doi:10.1145/2396761.2396844
dblp:conf/cikm/WangQD12
fatcat:k3idu2evvvexxhpb23gvezvgam