Exploiting Parallel R in the Cloud with SPRINT
2012
Methods of Information in Medicine
Methods: The SPRINT parallel implementations of correlation, permutation testing, partitioning around medoids and the multi-purpose papply have been benchmarked on data sets of various size on Amazon EC2 ...
High Performance Computing (HPC) in the Cloud offers an affordable way of meeting this need. ...
Sornthep Vannarat, Head of the Large Scale Simulation Laboratory from the National Electronics and Computer Technology Center when performing the experiments in Thailand. ...
doi:10.3414/me11-02-0039
pmid:23223611
pmcid:PMC3547073
fatcat:hb4fbgcfvne23gydfhyyn6zqvq
Clustering Acoustic Segments Using Multi-Stage Agglomerative Hierarchical Clustering
2015
PLoS ONE
The algorithm does not require the number of clusters to be specified in advance, and is shown to be comparable in performance to parallel spectral clustering (PSC) [18] which has also been developed specifically ...
These segments are time series of varying length and are not easily represented as vectors in a vector space. However, they can be compared and similarities can be computed. ...
Acknowledgments Computations were performed using the University of Stellenbosch's Rhasatsha HPC: http://www.sun.ac.za/hpc.
Author Contributions ...
doi:10.1371/journal.pone.0141756
pmid:26517376
pmcid:PMC4627777
fatcat:jdhmz6krkrdzfhizbcv3i5cwpy
A framework for machine-learning-augmented multiscale atomistic simulations on parallel supercomputers
2015
International Journal of Quantum Chemistry
We argue that the ML approach allows computational effort to be concentrated on the most chemically active subregions of the QM zone, significantly improving the overall efficiency of the simulation. ...
We thus propose a novel method to partition large QM regions into multiple subregions which can be computed in parallel to achieve optimal scaling. ...
the sizes and optimise the shapes of C_i. ...
doi:10.1002/qua.24952
fatcat:hkr5aoh47beynbqo65isf7ydwq
SingleCAnalyzer: Interactive Analysis of Single Cell RNA-Seq Data on the Cloud
2022
Frontiers in Bioinformatics
prediction, unsupervised clustering of cells, pseudotime/trajectory analysis, expression comparisons between groups, functional enrichment of differentially expressed genes and gene set expression analysis ...
However, the proper application of these computational methods requires extensive bioinformatics expertise. Otherwise, it is often difficult to obtain reliable and reproducible results. ...
At present, SingleCAnalyzer applies the following unsupervised clustering methods: 1) k-means, which is computed using the kmeans function of the stats R package (with iter_max = 15); 2) partition around ...
doi:10.3389/fbinf.2022.793309
fatcat:d2imgql2hbcaldivak5kk23khy
Extending the SACOC algorithm through the Nyström method for dense manifold data analysis
2017
International Journal of Bio-Inspired Computation (IJBIC)
Spectral-based methods, which are one of the main used methodologies in this area, are sensitive to metric parameters and noise. ...
We evaluated the performance of the proposed approach, called SACON, comparing it against online clustering algorithms and the Nyström extension of the Spectral Clustering algorithm using several benchmark ...
Acknowledgement This work has been supported by the research projects: TIN2014-56494-C4-4-P, CIBERDINE S2013/ICE-3095, SeMaMatch EP/K032623/1 and Airbus Defence & Space (FUAM-076914 and FUAM-076915). ...
doi:10.1504/ijbic.2017.085894
fatcat:6n7tg5lqb5e2bfeeyopa52bl44
Supporting Autonomic Management of Clouds: Service Clustering With Random Forest
2016
IEEE Transactions on Network and Service Management
of the cloud domain including the sensitivity of the services to hardware configurations and priorities. ...
As future works, we will investigate the characteristics of RF, such as variable importance and feature selection, to improve our methodology. ...
To cluster the observations, the dissimilarity matrix is passed as input to a compatible clustering algorithm, for example, the Partitioning Around Medoids (PAM) [6]. ...
doi:10.1109/tnsm.2016.2569000
fatcat:mbcon775wvcqdkjqgn4giz2d5y
Modelling and Recognition of Protein Contact Networks by Multiple Kernel Learning and Dissimilarity Representations
2020
Entropy
The core of the training procedure is the joint optimisation of kernel weights and representatives selection in the dissimilarity spaces. ...
Computational results show remarkable classification capabilities and the knowledge discovery analysis is in line with current biological knowledge, suggesting the reliability of the proposed system. ...
The choice behind a genetic algorithm stems from them being widely famous in the context of derivative-free optimisation, embarrassingly easy to parallelise and for the sake of consistency with competing ...
doi:10.3390/e22070794
pmid:33286565
pmcid:PMC7517365
fatcat:w66fbfzgjncydlg5fmjcsxffdm
Unsupervised extraction of stable expression signatures from public compendia with eADAGE
[article]
2016
bioRxiv
pre-print
While we expected PhoB activity in limiting phosphate conditions, our analyses found PhoB activity in other media with moderate phosphate and predicted that a second stimulus provided by the sensor kinase ...
Cross experiment comparisons in public data compendia are challenged by unmatched conditions and technical noise. ...
., and Hill, J. (2011). Optimisation and parallelisation of the partitioning around medoids function in R, in: 2011 International Conference on High Performance Computing & Simulation. ...
doi:10.1101/078659
fatcat:mze4x6sikfbxndhrsduv7eh3ma
Large-scale data mining analytics based on MapReduce
[article]
2015
This version is called YARN and is very flexible in supporting various kinds of distributed data processing other than the batch-mode processing of MapReduce. ...
Apache Spark claimed that it is much faster than Hadoop MapReduce as it exploits the advantages of in-memory computations which is particularly more beneficial for iterative workloads in case of data mining ...
(medoids) and assigns them to the closest medoids. • Reduce function uses medoids centers as the key and calculates the new medoids using the optimal search of medoids. • If the medoids have changes, ...
doi:10.18419/opus-3493
fatcat:aa7hzahdtjbprf4ebxzxj6bba4
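The map and reduce steps mentioned in the snippet above can be illustrated with a minimal single-process sketch. It is not taken from the cited work: the function names (`map_assign`, `reduce_update`, `kmedoids_mapreduce`), the Euclidean distance, the exhaustive per-cluster medoid search, and the "repeat until the medoids stop changing" loop are all assumptions made for illustration, since the snippet is truncated at that point.

```python
from collections import defaultdict


def dist(a, b):
    """Euclidean distance between two equal-length tuples (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def map_assign(points, medoids):
    """Map step: emit (index of closest medoid, point) for every point."""
    for p in points:
        key = min(range(len(medoids)), key=lambda i: dist(p, medoids[i]))
        yield key, p


def reduce_update(members):
    """Reduce step: pick the member with the smallest total distance
    to all other members of the same cluster (exhaustive search)."""
    return min(members, key=lambda c: sum(dist(c, m) for m in members))


def kmedoids_mapreduce(points, medoids, max_iter=20):
    """Alternate map and reduce until the medoids no longer change
    (assumed stopping rule) or max_iter is reached."""
    medoids = list(medoids)
    for _ in range(max_iter):
        groups = defaultdict(list)
        for key, p in map_assign(points, medoids):
            groups[key].append(p)  # shuffle: group points by medoid key
        new_medoids = [
            reduce_update(groups[k]) if k in groups else medoids[k]
            for k in range(len(medoids))
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmedoids_mapreduce(pts, [(0, 1), (11, 10)]))
```

In a real Hadoop or Spark job the grouping by key would be done by the framework's shuffle phase; here a dictionary stands in for it.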
Space-Time Window Reconstruction In Parallel High Performance Numeric Simulations. Application For CFD (PhD Thesis)
2011
Zenodo
The size of the output originating from large scale, numerical simulations poses major bottlenecks in high performance, parallel computing. ...
Recently it became more and more evident that a radical change has to take place in the way scientists and engineers handle numerical simulations. ...
Both partitioning around medoids [29] Most of the papers have been selected from conference proceedings and journals published by the Association for Computing Machinery (ACM) and the Institute of Electrical ...
doi:10.5281/zenodo.15938
fatcat:xh4i6ig7qvfkjapq76phiadpje
A review of urban energy system models: Approaches, challenges and opportunities
2012
Renewable & Sustainable Energy Reviews
However, such a broad topic inevitably results in a number of alternative interpretations of the problem domain and the modelling tools used in its study. ...
We also highlight a sixth field, land use and transportation modelling, which has direct relevance to the use of energy in cities but has been somewhat overlooked by the literature to date. ...
Acknowledgments The financial support of BP via the Urban Energy Systems project at Imperial College London (www.imperial.ac.uk/urbanenergysystems) is gratefully acknowledged. ...
doi:10.1016/j.rser.2012.02.047
fatcat:qzlxzdptcrcjbahjfg74wqquxe
Discovering latent topical structure by second-order similarity analysis
2011
Journal of the American Society for Information Science and Technology
and sparseness of the vector space, in combination with the effects of feature independence and vocabulary mismatch conspire to limit the structural validity of computed similarities. ...
In the next section, the motivation for this work is explained in greater detail along with formal definitions of the second-order similarity and other methods applied in this study. ...
Continuity, in contrast, is derived from the recall function. ...
doi:10.1002/asi.21519
fatcat:xx2dyggapnb5zbbtp7slfrwqeu
Efficient clustering techniques for big data
2018
Many studies introduced a parallel implementation of Lloyd's K-Means on Hadoop in order to improve the algorithm's scalability. ...
The majority of the running time in the original K-Means algorithm (known as Lloyd's algorithm) is spent on computing distances from each data point to all cluster centres to find the closest centre to ...
Clustering Large Applications based upon RANdomized Search (CLARANS) [35] extends the clustering algorithm called K-Medoid, which clusters data points around medoids instead of cluster centroids as in ...
doi:10.48683/1926.00084926
fatcat:svndi2iqbjgk3nf7vj5lti7suy
Sparse multivariate models for pattern detection in high-dimensional biological data
2015
In NsRRR, pairwise relatedness between predictors and between responses are represented by two networks, and the model identifies associations between a subnetwork of predictors and a subnetwork of responses ...
Parsimonious models, which may refer to parsimony in model structure and/or model parameters, have been shown to improve both biological interpretability of the model and the generalisability to new data ...
When Q > 1, the constrained region is a strictly convex set and the optimisation problem in (1.2) and (2.1) is convex optimisation for both squared and logistic loss functions. ...
doi:10.25560/25762
fatcat:6c2xsoazlnbpjoamcsdw4cb6pu
Understanding High Dimensional Visual Data
2019
This thesis has focused on the advancement of these computer vision systems. ...
The algorithms produced by this research have established new levels of efficiency for retrieving and comparing visual data, as well as defining new machine learning techniques that extend our current ...
Here, our optimisation function encourages the formation of class specific Gaussian clusters around each anchor. ...
doi:10.26180/5c53bf3eadb54
fatcat:im4u5eytfnhyrkjyk5hacgsb5u