An Efficient Agglomerative Clustering Algorithm for Web Navigation Pattern Identification
2016
Circuits and Systems
Rough set based clustering with validity measure 0.54 Generation of dense clusters is essential for finding interesting patterns needed for further mining and analysis. ...
In this paper, a new agglomerative clustering technique is proposed to identify users with similar interest, and to determine the motivation for visiting a website. ...
The greatest deficiency of PNN is its slow speed. A straightforward implementation requires O(N^3) time, which is rather slow for large training sets. ...
doi:10.4236/cs.2016.79205
fatcat:ifjmc7rc65hbrkuga7lzkykxgq
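The cubic cost quoted in the snippet above can be illustrated with a minimal sketch of a naive pairwise nearest neighbor (PNN) merge loop. This is not the paper's proposed algorithm; it only shows why the straightforward version is slow: each of the roughly N merge steps rescans all O(N^2) cluster pairs. The function names `merge_cost` and `naive_pnn`, and the use of the size-weighted squared centroid distance as the merge cost, are assumptions made for illustration.

```python
from itertools import combinations


def merge_cost(a, b):
    """Increase in total squared error if clusters a and b are merged.

    Each cluster is a (size, centroid) pair. The size-weighted squared
    centroid distance is an assumed cost, chosen for illustration.
    """
    (na, ca), (nb, cb) = a, b
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * dist2


def naive_pnn(points, k):
    """Merge the cheapest pair of clusters until only k clusters remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Start with one singleton cluster per point.
    clusters = [(1, tuple(float(x) for x in p)) for p in points]
    while len(clusters) > k:
        # Exhaustive O(N^2) search over all pairs; repeated for each
        # merge step, this gives the O(N^3) total.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        (na, ca), (nb, cb) = clusters[i], clusters[j]
        n = na + nb
        centroid = tuple((na * x + nb * y) / n for x, y in zip(ca, cb))
        # combinations() guarantees i < j, so deleting j leaves i valid.
        del clusters[j]
        clusters[i] = (n, centroid)
    return clusters


if __name__ == "__main__":
    print(naive_pnn([(0, 0), (0, 1), (10, 10), (10, 11)], k=2))
```

Faster variants typically avoid the full rescan by caching each cluster's nearest neighbor, at the price of extra bookkeeping after every merge.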
Biketastic
2010
Proceedings of the 28th international conference on Human factors in computing systems - CHI '10
Using a mobile phone application and online map visualization, bikers are able to document and share routes, ride statistics, sensed information to infer route roughness and noisiness, and media that documents ...
In this paper, we present architecture and algorithms for route data inferences and visualization. We evaluate the system based on feedback from bicyclists provided during a two-week pilot. ...
The steps include: de-clustering, simplification, and smoothing. In the de-clustering step, the route is processed so that clusters of points are removed for visualization appeal. ...
doi:10.1145/1753326.1753598
dblp:conf/chi/ReddySDCES10
fatcat:cr7ca42s2fbn3ceejzcwgb4cuq
A process for mining science & technology documents databases, illustrated for the case of "knowledge discovery and data mining"
1999
Ciência da Informação
The paper offers a set of specific indicators suitable for mining such databases to understand innovation prospects. ...
This paper presents a process of mining research & development abstract databases to profile current status and to project potential developments for target technologies. The process is called "technology ...
The relative emphasis indicator shows that association rules, very large databases, deductive databases, knowledge acquisition, rule induction, spatial database, background knowledge, rough sets, etc., ...
doi:10.1590/s0100-19651999000100002
fatcat:yltrenvudzgkrhrvba5ffptuma
Improving Quality of Search Results Clustering with Approximate Matrix Factorisations
[chapter]
2006
Lecture Notes in Computer Science
For our experiments we use the standard merge-then-cluster approach based on the Open Directory Project web catalogue as a source of human-clustered document summaries. ...
We also compare our approach with two other clustering algorithms: Suffix Tree Clustering (STC) and Tolerance Rough Set Clustering (TRC). ...
Acknowledgment: The author would like to thank anonymous reviewers for helpful suggestions. The experiments were performed within the Carrot 2 Search Results Clustering Framework. ...
doi:10.1007/11735106_16
fatcat:hwwmahkjbzey5h45de2naynksu
Construction and Application of College English Blended Teaching System Based on Multidata Fusion
2022
Discrete Dynamics in Nature and Society
data is also very fast, which is suitable for processing large amounts of data. ...
Users can build Hadoop cluster infrastructure without understanding its underlying principles, make full use of the advantages of distributed high-speed computing, and combine the advantages of Hadoop's ...
doi:10.1155/2022/4990844
fatcat:hfbxnwb7zvb3lp23gfljvduxxe
Clustered Distributed Index for Efficient Text Retrieval Using Threads
2010
International Journal of Grid Computing & Applications
In this research paper, a novel method of improving the clustered distributed indices for efficient text retrieval using threads is presented. ...
The indexing stage scans the text of all the documents and builds a list of search terms, often called an index. ...
When used to classify large data sets, clustering algorithms are computationally demanding and require high-performance machines to get results in a reasonable time. ...
doi:10.5121/ijgca.2010.1201
fatcat:tx5tounsuzdh7eax6fmu42bnui
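The indexing stage described in the snippet above (scan every document, collect its search terms into an index) can be sketched as a small inverted index. This is a single-threaded illustration only, not the clustered, threaded method the paper proposes; the names `tokenize`, `build_index`, and `search`, and the simple lowercase alphanumeric tokenizer, are assumptions.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    docs = {
        1: "Clustered index for text retrieval",
        2: "Text retrieval using threads",
        3: "Clustering algorithms for large data sets",
    }
    idx = build_index(docs)
    print(sorted(search(idx, "text retrieval")))
```

A lookup then touches only the posting sets of the query terms instead of scanning every document.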
Automatic Algorithms For Medieval Manuscript Analysis
2018
Zenodo
and efficient system for document analysis. ...
The study and browsing of such digital libraries is invaluable for scholars in the Cultural Heritage field, but requires automatic tools for analyzing and indexing these datasets. ...
This work was partially supported by a Yale University and CRS4 research agreement, as well as by the Scan4Reco project funded by European Union's Horizon 2020 Framework Programme for Research and Innovation ...
doi:10.5281/zenodo.1468270
fatcat:crgnfwkhpndhpb2vj35sgm3j7i
Survey of Clustering Algorithms for Categorization of Patient Records in Healthcare
2016
Indian Journal of Science and Technology
To improve clustering accuracy on large datasets, HFKHM is used. ...
This paper presents a related work on the existing clustering algorithms for categorizing the tumors as benign or malignant. ...
Because the volume of data is increasing rapidly year over year, a data-driven approach has been followed. Data partitioning is very important for high-dimensional large datasets. ...
doi:10.17485/ijst/2016/v9i8/87971
fatcat:ijpow2entvcxjemvdnu77vpkcu
Terrain understanding for robot navigation
2007
2007 IEEE/RSJ International Conference on Intelligent Robots and Systems
Currently, we are using fuzzy c-means clustering and exploring a number of different features for characterizing the visual appearance of the terrain. ...
We are reasonably satisfied with the VTI parameters for roughness and ground resistance. A simple and reliable technique for measuring wheel slip would also be of interest. ...
doi:10.1109/iros.2007.4399223
dblp:conf/iros/KarlsenW07
fatcat:3ujuanqacnaajez4hcadhy5zla
Bibliometric Analysis of Specific Energy Consumption (SEC) in Machining Operations: A Sustainable Response
2021
Sustainability
The selection criteria of documents are set for citation analysis. ...
A systematic approach collects information on SEC documents' primary data; their types, publications, citations, and predictions are presented. ...
Gujral Punjab Technical University, Kapurthala-Jalandhar (Punjab), India, for allowing them to carry out this research work.
Conflicts of Interest: The authors declare no conflict of interest. ...
doi:10.3390/su13105617
fatcat:cdvsx7chhngzrpuczi7fn5sj5q
Evolutionary Feature Selection For Big Data Processing Using Mapreduce And Apso
2017
Zenodo
Our experimental results show that Big Data -A significantly speeds up data movement in Map Reduce and doubles the throughput of Big Data. ...
Big Data -A, an acceleration framework that optimizes Big Data with plug-in components for fast data movement, overcoming the existing limitations. ...
Relevance Feature Discovery for Text Mining: It is a big challenge to guarantee the quality of discovered relevance features in text documents for describing user preferences because of large scale terms ...
doi:10.5281/zenodo.376807
fatcat:wsqltk3vfzdzzhuzjjghto5yjq
A tree algorithm for nearest neighbor searching in document retrieval systems
1978
SIGIR Forum
For large collections, the average search time for this algorithm is less than that for a sequential search and greater than that for a clustered search. ...
A nearest neighbors associative retrieval algorithm, suitable for document retrieval using similarity matching, is described. ...
The sequential search is straightforward, but in large collections it is very time-consuming to examine the entire collection. ...
doi:10.1145/1013234.803139
fatcat:jgtydnj435ci5fcsvmsn5ynfka
A tree algorithm for nearest neighbor searching in document retrieval systems
1978
Proceedings of the 1st annual international ACM SIGIR conference on Information storage and retrieval - SIGIR '78
For large collections, the average search time for this algorithm is less than that for a sequential search and greater than that for a clustered search. ...
A nearest neighbors associative retrieval algorithm, suitable for document retrieval using similarity matching, is described. ...
The sequential search is straightforward, but in large collections it is very time-consuming to examine the entire collection. ...
doi:10.1145/800096.803139
dblp:conf/sigir/EastmanW78
fatcat:s6z5axdzezb6xbty3tadynzc4u
Applications of Clustering Techniques in Data Mining: A Comparative Study
2020
International Journal of Advanced Computer Science and Applications
It also presents one of the most common clustering techniques for identifying data patterns by analyzing sample data. ...
Clustering, recognized as an essential issue of unsupervised learning, deals with the segmentation of the data structure in an unknown region and is the basis for further understanding. ...
To achieve high accuracy in terms of time and space, K-means would be the best choice for large and categorical data. ...
doi:10.14569/ijacsa.2020.0111218
fatcat:s7trnkmupfdynovm447xoisgyi
A Clustering Framework Based on Adaptive Space Mapping and Rescaling
[chapter]
2009
Lecture Notes in Computer Science
Specifically, documents are first mapped into a low dimensional space with respect to the cluster centers so that the distribution statistics of each cluster could be analyzed on the corresponding dimension ...
These two steps are conducted iteratively along with the clustering algorithm to constantly improve the clustering performance. ...
Introduction: With the explosion of documents on the Web, there has been an increasing need for efficient and effective analysis methods to manage massive text collections. ...
doi:10.1007/978-3-642-04769-5_32
fatcat:q6xhsog5cvfwndvzkgfxy2hpqi