Performance modeling in CUDA streams — A means for high-throughput data processing

Hao Li, Di Yu, Anand Kumar, Yi-Cheng Tu
2014 2014 IEEE International Conference on Big Data (Big Data)  
The high data rate of such systems requires large computing power provided by the query engine.  ...  Push-based database management system (DBMS) is a new type of data processing software that streams large volume of data to concurrent query operators.  ...  The world has entered an age of big data where many applications have to deal with massive amount of data that can easily overwhelm a traditional DBMS.  ... 
doi:10.1109/bigdata.2014.7004245 pmid:26566545 pmcid:PMC4640924 dblp:conf/bigdataconf/LiY0T14 fatcat:2e2p4kjmgfecta3h4yfozebxgm

Utilizing digital health applications as a means to diffuse knowledge to improve family planning outcomes in Bangladesh

Rupali J. Limaye, Nandita Kapadia-Kundu, Rebecca Arnold, Jessica Gergen, Tara M. Sullivan
2017 Clinical Obstetrics Gynecology and Reproductive Medicine  
of 361 clusters, r(i) is the estimate computed from the reduced sample of 360 clusters (ith cluster excluded), and k is the total number of clusters.  ...  Each asset was assigned a weight (factor score) generated through principal components analysis, and the resulting asset scores were standardized in relation to a normal distribution with a mean of zero  ...  IN COLUMN 3, ENTER CODES FOR DISCONTINUATION NEXT TO LAST MONTH OF USE. NUMBER OF CODES IN COLUMN 3 MUST BE SAME AS NUMBER OF INTERRUPTIONS OF METHOD USE IN COLUMN 1.  ... 
doi:10.15761/cogrm.1000176 fatcat:adwa7yqdg5ffvjyyoketwqjuzi

Adaptive Neural Network Classifier-Based Analysis of Big Data in Health Care [chapter]

Manaswini Pradhan
2018 Data Mining  
The FCM based Map-Reduce, clusters the large medical datasets into smaller groups of certain similarity and assigns each data cluster to one Mapper, where the training of neural networks are done by the  ...  Because of the massive volume, variety, and continuous updating of medical data, the efficient processing of medical data and the real-time response of the treatment recommendation has become an important  ...  Phase 2: Mapper phase For large scale mobile data process, the mapper is a programming model and a Phase 2(a): assigning each data groups to separate Mappers In the Mapper phase, the clustered data  ... 
doi:10.5772/intechopen.77225 fatcat:enag735qdzbill5gwsp3rlww3a

Using data to build a better EM: EM* for big data

Hasan Kurban, Mark Jenne, Mehmet M. Dalkilic
2017 International Journal of Data Science and Analytics  
the data that is more difficult to cluster.  ...  While parallelism is an obvious and, usually, necessary strategy, we observe that both (1) continually revisiting data and (2) visiting all data are two of the most prominent problems especially for iterative  ...  Acknowledgements The authors thank the editor and anonymous reviewers for their informative comments and suggestions. This work was partially supported by NCI Grant 1R01CA213466-01.  ... 
doi:10.1007/s41060-017-0062-1 dblp:journals/ijdsa/KurbanJD17 fatcat:flvfllrcqjhmpmkjc5eo3h663e

Suppressed epidemics in multirelational networks

Elvis H. W. Xu, Wei Wang, C. Xu, Ming Tang, Younghae Do, P. M. Hui
2015 Physical Review E  
A mean field theory that ignores spatial correlation is shown to give qualitative agreement and capture all the key features.  ...  For large w1/w0 ratios, the suppression leads to an absorbing phase consisting only of healthy nodes within a range p_L ≤ p ≤ p_R, and an active phase with mixed infected and healthy nodes for p < p_L  ...  For sufficiently large w1/w0, 100% healthy phase is achieved for a range of p when the cluster sizes are big enough to disrupt the spread via w0 links and yet not so big for the disease to sustain  ... 
doi:10.1103/physreve.92.022812 pmid:26382459 fatcat:dazekjwaknamtmjb6e4bcjocse

Learning Analytics Through Serious Games: Data Mining Algorithms for Performance Measurement and Improvement Purposes

Abdelali Slimani, Fatiha Elouaai, Lotfi Elaachak, Othman Bakkali Yedri, Mohammed Bouhorma, Mateu Sbert
2018 International Journal of Emerging Technologies in Learning (iJET)  
This paper presents methods and approaches of educational data mining such as EM and K-Means to discuss the learning analytics through serious games, and then we provide an analysis of the player experience  ...  data collected from the educational game "ELISA" used to teach students of biology the immunological technique for determination of ANTI-HIV antibodies.  ...  The K-Means algorithm The k-means algorithm [10] selects randomly k number of objects, each of them initially represents a cluster mean or center, an object is assigned to the cluster to which it is  ... 
doi:10.3991/ijet.v13i01.7518 fatcat:wtbnsy7tkrb7vkbdmhhafriu7i

A hybrid recommender system based-on link prediction for movie baskets analysis

Mohammadsadegh Vahidi Farashah, Akbar Etebarian, Reza Azmi, Reza Ebrahimzadeh Dastjerdi
2021 Journal of Big Data  
The proposed method in this paper consists of four phases: (1) Running the CBRS that in this phase, all users are clustered using Density-based spatial clustering of applications with noise algorithm (  ...  are connected through the link. (4) This phase is related to the combination of collaborative recommender system's output and improved Friendlink algorithm.  ...  online site and system. • The methods used need improvements on big data.  ... 
doi:10.1186/s40537-021-00422-0 fatcat:tbusqxuewfh7fhwsq6kdr2gvhq

Summarizing Relational Data Using Semi-Supervised Genetic Algorithm-Based Clustering Techniques

2010 Journal of Computer Science  
k-means clustering.  ...  Approach: We proposed a genetic semi-supervised clustering technique as a means of aggregating data stored in multiple tables to facilitate the task of solving a classification problem in relational database  ...  With this representation, the data can be conveniently clustered by using the hierarchical or partitioning clustering technique, as a means of summarizing them.  ... 
doi:10.3844/jcssp.2010.775.784 fatcat:izqiaqebcja2nbkmxuxxx3qf7u

Harnessing Interpretable and Unsupervised Machine Learning to Address Big Data from Modern X-ray Diffraction [article]

Jordan Venderley, Michael Matty, Krishnanand Mallayya, Matthew Krogstad, Jacob Ruff, Geoff Pleiss, Varsha Kishore, David Mandrus, Daniel Phelan, Lekhanath Poudel, Andrew Gordon Wilson, Kilian Weinberger (+5 others)
2021 arXiv   pre-print
Now, the primary challenge is to understand and discover scientific principles from big data sets when a comprehensive analysis is beyond human reach.  ...  Our approach can radically transform XRD experiments by allowing in-operando data analysis and enabling researchers to refine experiments by discovering interesting regions of phase space on-the-fly.  ...  We thank Jeffrey Lynn and Johnpierre Paglione for assistance in preparing the (Ca_xSr_{1−x})_3Rh_4Sn_{13} samples. Initial development of X-TEC (EAK, AW, KW,  ... 
arXiv:2008.03275v4 fatcat:s3hs6f5jtncadlhb3a2tyr4gfm

Differential Approximation and Sprinting for Multi-Priority Big Data Engines [article]

Robert Birke, Isabelly Rocha, Juan Perez, Valerio Schiavoni, Pascal Felber, Lydia Y. Chen
2019 arXiv   pre-print
Today's big data clusters based on the MapReduce paradigm are capable of executing analysis jobs with multiple priorities, providing differential latency guarantees.  ...  The unique combination of approximation and sprinting avoids the eviction of low-priority jobs and its consequent latency degradation and resource waste.  ...  Acknowledgment The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under the LEGaTO Project (legato-project.eu), grant agreement  ... 
arXiv:1909.05531v2 fatcat:ayn2xzf3rfaappni4o2ct35hby

Predictive Health Analytic Model in Federated Cloud

2019 International journal of recent technology and engineering  
Simulation result reveals that the proposed architecture is essential for the present needs of human life.  ...  Each provider has provided the awareness about the distinct diseases, predict the possible level of diseases affected and the mode of treatment.  ...  A clustering method based on the combination of the particle swarm optimization (PSO) and K-mean is suggested to identify the type and priority of the disease.  ... 
doi:10.35940/ijrte.b2309.078219 fatcat:o2ke7hrbz5aorj5qcgb2tpy6xq

Parallel Optimal Grid-Clustering algorithm exploration on MapReduce Framework

B. Hanmanthu, R. Rajesh, P. Niranjan
2018 International Journal of Computer Applications  
Taking this as a disadvantage we are exploring the optimal grid clustering techniques for big data analysis using MapReduce architecture.  ...  Where as in the conventional data mining techniques the clustering technique is proven as that the most useful technique for effective data analysis.  ...  Distributed implementation of MapReduce needs a means of connecting the processes performing the Map and Reduce phases. This may be a distributed file system.  ... 
doi:10.5120/ijca2018917041 fatcat:oztecc7lr5gj7ges5xiwrkjs34

Parallel Hierarchical Affinity Propagation with MapReduce [article]

Dillon Mark Rose, Jean Michel Rouly, Rana Haber, Nenad Mijatovic, Adrian M. Peter
2014 arXiv   pre-print
Paramount amongst the desires to manipulate and extract actionable intelligence from vast big data volumes is the need for scalable, performance-conscious analytics algorithms.  ...  Experimental validation of our clustering methodology on a variety of synthetic and real data sets (e.g. images and point data) demonstrates our competitiveness against other state-of-the-art MapReduce  ...  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.  ... 
arXiv:1403.7394v1 fatcat:j262egarp5ayvpfy6qxntydrxq

Parallel Hierarchical Affinity Propagation with MapReduce

Dillon Mark Rose, Jean Michel Rouly, Rana Haber, Nenad Mijatovic, Adrian M. Peter
2013 2013 IEEE 5th International Conference on Cloud Computing Technology and Science  
Paramount amongst the desires to manipulate and extract actionable intelligence from vast big data volumes is the need for scalable, performance-conscious analytics algorithms.  ...  Experimental validation of our clustering methodology on a variety of synthetic and real data sets (e.g. images and point data) demonstrates our competitiveness against other state-of-the-art MapReduce  ...  As an exemplar-based clustering approach, the technique does not seek to find a mean for each cluster center, instead certain representative data points are selected as the exemplars of the clustered subgroups  ... 
doi:10.1109/cloudcom.2013.97 dblp:conf/cloudcom/RoseRHMP13 fatcat:o7u6opuhkbflfcl37ukwzz64iq

Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms [article]

Erich Schubert, Peter J. Rousseeuw
2019 arXiv   pre-print
In Euclidean geometry the mean (as used in k-means) is a good estimator for the cluster center, but this does not hold for arbitrary dissimilarities.  ...  It can easily be combined with earlier approaches to use PAM and CLARA on big data (some of which use PAM as a subroutine, hence can immediately benefit from these improvements), where the performance  ...  (for example with a mean, the precision only improves with √n, so adding more data eventually does barely improve the results).  ... 
arXiv:1810.05691v3 fatcat:wg2indlvzjhctikzii5jv4eefa