An Efficient Agglomerative Clustering Algorithm for Web Navigation Pattern Identification

A. Anitha
2016 Circuits and Systems  
Rough set based clustering with validity measure 0.54 Generation of dense clusters is essential for finding interesting patterns needed for further mining and analysis.  ...  In this paper, a new agglomerative clustering technique is proposed to identify users with similar interest, and to determine the motivation for visiting a website.  ...  The greatest deficiency of PNN is its slow speed. A straightforward implementation of this requires O(N^3) time, which is rather slow for a large training set.  ... 
doi:10.4236/cs.2016.79205 fatcat:ifjmc7rc65hbrkuga7lzkykxgq

Biketastic

Sasank Reddy, Katie Shilton, Gleb Denisov, Christian Cenizal, Deborah Estrin, Mani Srivastava
2010 Proceedings of the 28th international conference on Human factors in computing systems - CHI '10  
Using a mobile phone application and online map visualization, bikers are able to document and share routes, ride statistics, sensed information to infer route roughness and noisiness, and media that documents  ...  In this paper, we present architecture and algorithms for route data inferences and visualization. We evaluate the system based on feedback from bicyclists provided during a two-week pilot.  ...  The steps include: de-clustering, simplification, and smoothing. In the de-clustering step, the route is processed so that clusters of points are removed for visualization appeal.  ... 
doi:10.1145/1753326.1753598 dblp:conf/chi/ReddySDCES10 fatcat:cr7ca42s2fbn3ceejzcwgb4cuq

A process for mining science & technology documents databases, illustrated for the case of "knowledge discovery and data mining"

Donghua Zhu, Alan Porter, Scott Cunningham, Judith Carlisie, Anustup Nayak
1999 Ciência da Informação  
The paper offers a set of specific indicators suitable for mining such databases to understand innovation prospects.  ...  This paper presents a process of mining research & development abstract databases to profile current status and to project potential developments for target technologies. The process is called "technology  ...  The relative emphasis indicator shows that association rules, very large databases, deductive databases, knowledge acquisition, rule induction, spatial database, background knowledge, rough sets, etc.,  ... 
doi:10.1590/s0100-19651999000100002 fatcat:yltrenvudzgkrhrvba5ffptuma

Improving Quality of Search Results Clustering with Approximate Matrix Factorisations [chapter]

Stanislaw Osinski
2006 Lecture Notes in Computer Science  
For our experiments we use the standard merge-then-cluster approach based on the Open Directory Project web catalogue as a source of human-clustered document summaries.  ...  We also compare our approach with two other clustering algorithms: Suffix Tree Clustering (STC) and Tolerance Rough Set Clustering (TRC).  ...  Acknowledgment The author would like to thank anonymous reviewers for helpful suggestions. The experiments were performed within the Carrot2 Search Results Clustering Framework.  ... 
doi:10.1007/11735106_16 fatcat:hwwmahkjbzey5h45de2naynksu

Construction and Application of College English Blended Teaching System Based on Multidata Fusion

Aiqin Pan, Zaoli Yang
2022 Discrete Dynamics in Nature and Society  
data is also very fast, which is suitable for processing large amounts of data.  ...  Users can build Hadoop cluster infrastructure without understanding its underlying principles, make full use of the advantages of distributed high-speed computing, and combine the advantages of Hadoop's  ... 
doi:10.1155/2022/4990844 fatcat:hfbxnwb7zvb3lp23gfljvduxxe

Clustered Distributed Index for Efficient Text Retrieval Using Threads

M Basavaraju, R Prabhakar
2010 International Journal of Grid Computing & Applications  
In this research paper, a novel method of improving the clustered distributed indices for efficient text retrieval using threads is presented.  ...  The indexing stage scans for text of all the documents and builds a list of search terms, often called an index.  ...  When used to classify large data sets, clustering algorithms are very computing demanding and require high performance machines to get results in reasonable time.  ... 
doi:10.5121/ijgca.2010.1201 fatcat:tx5tounsuzdh7eax6fmu42bnui

Automatic Algorithms For Medieval Manuscript Analysis

Ruggero Pintus, Ying Yang, Holly Rushmeier, Enrico Gobbetti
2018 Zenodo  
and efficient system for document analysis.  ...  The study and browsing of such digital libraries is invaluable for scholars in the Cultural Heritage field, but requires automatic tools for analyzing and indexing these datasets.  ...  This work was partially supported by a Yale University and CRS4 research agreement, as well as by the Scan4Reco project funded by European Union's Horizon 2020 Framework Programme for Research and Innovation  ... 
doi:10.5281/zenodo.1468270 fatcat:crgnfwkhpndhpb2vj35sgm3j7i

Survey of Clustering Algorithms for Categorization of Patient Records in Healthcare

D. Narmadha, Appavu Alias Balamurugan, G. Naveen Sundar, S. Jeba Priya
2016 Indian Journal of Science and Technology  
To improve the accuracy of clustering the large dataset HFKHM is used.  ...  This paper presents a related work on the existing clustering algorithms for categorizing the tumors as benign or malignant.  ...  Hence data is increased rapidly on for coming year data driven approach has been followed. Data partitioning is very important on high dimensional large dataset.  ... 
doi:10.17485/ijst/2016/v9i8/87971 fatcat:ijpow2entvcxjemvdnu77vpkcu

Terrain understanding for robot navigation

Robert E. Karlsen, Gary Witus
2007 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems  
Currently, we are using fuzzy c-means clustering and exploring a number of different features for characterizing the visual appearance of the terrain.  ...  We are reasonably satisfied with the VTI parameters for roughness and ground resistance. A simple and reliable technique for measuring wheel slip would also be of interest.  ... 
doi:10.1109/iros.2007.4399223 dblp:conf/iros/KarlsenW07 fatcat:3ujuanqacnaajez4hcadhy5zla

Bibliometric Analysis of Specific Energy Consumption (SEC) in Machining Operations: A Sustainable Response

Raman Kumar, Sehijpal Singh, Ardamanbir Singh Sidhu, Catalin I. Pruncu
2021 Sustainability  
The selection criteria of documents are set for citation analysis.  ...  A systematic approach collects information on SEC documents' primary data; their types, publications, citations, and predictions are presented.  ...  Gujral Punjab Technical University, Kapurthala-Jalandhar (Punjab), India, for allowing them to carry out this research work. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/su13105617 fatcat:cdvsx7chhngzrpuczi7fn5sj5q

Evolutionary Feature Selection For Big Data Processing Using Mapreduce And Apso

D. Anusuya, R. Senthilkumar, Dr. T. Senthil Prakash
2017 Zenodo  
Our experimental results show that Big Data -A significantly speeds up data movement in Map Reduce and doubles the throughput of Big Data.  ...  Big Data -A, an acceleration framework that optimizes Big Data with plug-in components for fast data movement, overcoming the existing limitations.  ...  Relevance Feature Discovery for Text Mining: It is a big challenge to guarantee the quality of discovered relevance features in text documents for describing user preferences because of large scale terms  ... 
doi:10.5281/zenodo.376807 fatcat:wsqltk3vfzdzzhuzjjghto5yjq

A tree algorithm for nearest neighbor searching in document retrieval systems

Caroline M. Eastman, Stephen F. Weiss
1978 SIGIR Forum  
For large collections, the average search time for this algorithm is less than that for a sequential search and greater than that for a clustered search.  ...  A nearest neighbors associative retrieval algorithm, suitable for document retrieval using similarity matching, is described.  ...  The sequential search is straightforward, but in large collections it is very time-consuming to examine the entire collection.  ... 
doi:10.1145/1013234.803139 fatcat:jgtydnj435ci5fcsvmsn5ynfka

A tree algorithm for nearest neighbor searching in document retrieval systems

Caroline M. Eastman, Stephen F. Weiss
1978 Proceedings of the 1st annual international ACM SIGIR conference on Information storage and retrieval - SIGIR '78  
For large collections, the average search time for this algorithm is less than that for a sequential search and greater than that for a clustered search.  ...  A nearest neighbors associative retrieval algorithm, suitable for document retrieval using similarity matching, is described.  ...  The sequential search is straightforward, but in large collections it is very time-consuming to examine the entire collection.  ... 
doi:10.1145/800096.803139 dblp:conf/sigir/EastmanW78 fatcat:s6z5axdzezb6xbty3tadynzc4u

Applications of Clustering Techniques in Data Mining: A Comparative Study

Muhammad Faizan, Megat F., Shahrinaz Ismail, Sara Sultan
2020 International Journal of Advanced Computer Science and Applications  
It also presents one of the most common clustering techniques for identifying data patterns by performing an analysis of sample data.  ...  Clustering, recognized as an essential issue of unsupervised learning, deals with the segmentation of the data structure in an unknown region and is the basis for further understanding.  ...  To achieve high accuracy in terms of time and space, K-means would be the best choice for large and categorical data.  ... 
doi:10.14569/ijacsa.2020.0111218 fatcat:s7trnkmupfdynovm447xoisgyi

A Clustering Framework Based on Adaptive Space Mapping and Rescaling [chapter]

Yiling Zeng, Hongbo Xu, Jiafeng Guo, Yu Wang, Shuo Bai
2009 Lecture Notes in Computer Science  
Specifically, documents are first mapped into a low dimensional space with respect to the cluster centers so that the distribution statistics of each cluster could be analyzed on the corresponding dimension  ...  These two steps are conducted iteratively along with the clustering algorithm to constantly improve the clustering performance.  ...  Introduction With the explosion of documents on the Web, there has been increasing need for efficient and effective analysis methods to manage massive text collections.  ... 
doi:10.1007/978-3-642-04769-5_32 fatcat:q6xhsog5cvfwndvzkgfxy2hpqi