Transcriptome prediction performance across machine learning models and diverse ancestries
2021
Human Genetics and Genomics Advances
While EN generally outperformed random forest (RF), support vector regression (SVR), and K nearest neighbor (KNN), we found that RF outperformed EN for some genes, particularly between disparate ancestries ...
We show that the prediction performance is highest when the training and the testing population share similar ancestries regardless of the prediction algorithm used. ...
The average R² for each of the prediction algorithms is EN = 0.0733, SVR = 0.0476, RF = 0.0409, and KNN = 0.0103. ...
doi:10.1016/j.xhgg.2020.100019
pmid:33937878
pmcid:PMC8087249
fatcat:mzgznh6gfvhqfcrtp65b446fla
Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data
2005
Bioinformatics
In this paper, an innovative missing value imputation algorithm called collateral missing value estimation (CMVE) is presented which uses multiple covariance-based imputation matrices for the final prediction ...
(KNN). ...
THE CMVE ALGORITHM The complete CMVE algorithm, which is detailed in Figure 1, introduces the concept of multiple parallel estimations of missing values. ...
doi:10.1093/bioinformatics/bti345
pmid:15731210
fatcat:2wyk46ap2bdfrknteypfxgzi5y
An Examination of Machine Learning Algorithms for Missing Values Imputation
2019
VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE
It represents the research and imputation of missing values in gene expression data. By using the local or global correlation of the data we focus mostly on the contrast of the algorithms. ...
The purpose of our review article is to focus on the developments of current techniques. For scientists rather applying different or newly develop algorithms with the identical functional goal. ...
ACKNOWLEDGEMENTS We would like to thank Universiti Malaysia Pahang for supporting this work under the RDU Grant, Grant number: RDU180344 and RDU190113. ...
doi:10.35940/ijitee.l1081.10812s219
fatcat:ixexhti6jvcjbnkqj2jmrncqge
Impact of imputation methods on the amount of genetic variation captured by a single-nucleotide polymorphism panel in soybeans
2016
BMC Bioinformatics
The genotypic matrix captured the highest amount of genetic variance when missing loci were imputed by the method proposed in this paper. ...
The procedures these technologies use to impute genetic data, therefore, greatly affect downstream analyses. ...
Acknowledgement We thank the SoyNAM collaborators for their contributions to the experiment: Dr. ...
doi:10.1186/s12859-016-0899-7
pmid:26830693
pmcid:PMC4736474
fatcat:6bgi5ki7evgxljyn66doho6db4
A Hybrid Modified Deep Learning Data Imputation Method for Numeric Datasets
2021
International Journal of Intelligent Systems and Applications in Engineering
The imputation performance of RF-DLI is compared to K-Nearest Neighbors (KNN), Multiple Imputation by Chained Equations (MICE), MEAN imputation, and Principal Component Analysis (PCA) imputation approaches ...
Datawig is a deep learning-based library that supports missing value imputation for all types of data. RF-DLI approach includes the following steps to impute missing data. ...
The method benefits from deep learning for predicting the missing data and genetic algorithm for optimizing the weights of the neural network. ...
doi:10.18201/ijisae.2021167931
fatcat:qbgfg3st4vci5nzqz3e5v3m2qy
Data Imputation in Wireless Sensor Networks Using a Machine Learning-Based Virtual Sensor
2020
Journal of Sensor and Actuator Networks
The MLP was trained using a genetic algorithm which efficiently reached an optimal solution for each sensor node. ...
Data imputation allows for a system to counteract the effect of data loss by substituting faulty or missing sensor values with system-defined virtual values. ...
The genetic algorithm was able to converge on an optimal solution in a relatively small amount of time despite no parallelism being implemented into the training algorithm. ...
doi:10.3390/jsan9020025
fatcat:4xifoaih25ewnhubgu5x47se2y
A robust hybrid between genetic algorithm and support vector machine for extracting an optimal feature gene subset
2005
Genomics
genetic algorithm and K nearest neighbors. ...
We have formalized a robust gene selection approach based on a hybrid between genetic algorithm and support vector machine. ...
Acknowledgments This work was supported in part by the National Natural Science Foundation of China (Grants 30170515 and 30370798), the National High Tech Development Project, the Chinese 863 Program ( ...
doi:10.1016/j.ygeno.2004.09.007
pmid:15607418
fatcat:wlguq3kumjcvhgizssxw5eaxyu
A survey on missing data in machine learning
2021
Journal of Big Data
We propose and evaluate two methods, the k nearest neighbor and an iterative imputation method (missForest) based on the random forest algorithm. ...
are most suitable for. ...
Also, an imputation experiment was done on the KNN and RF algorithms for imputation on the Iris and novel ID fan datasets to demonstrate how popular imputation algorithms perform. ...
doi:10.1186/s40537-021-00516-9
pmid:34722113
pmcid:PMC8549433
fatcat:2swvf2dp5rfgjddoswh6bmpfjq
DreamAI: algorithm for the imputation of proteomics data
[article]
2020
bioRxiv
pre-print
The final resulting algorithm, DreamAI, is based on an ensemble of six different imputation methods. ...
To address this problem, the NCI-CPTAC Proteogenomics DREAM Challenge was carried out to develop effective imputation algorithms for labelled LC-MS/MS proteomics data through crowd learning. ...
"McImpute: Matrix completion based imputation for single cell RNA-seq data." Frontiers in genetics 10 (2019): 9. 25. ...
doi:10.1101/2020.07.21.214205
fatcat:r7g5oy6ptnb2hmoiewhsthh2ne
CF-GeNe: Fuzzy Framework for Robust Gene Regulatory Network Inference
2006
Journal of Computers
including: Least Square Impute (LSImpute), K-Nearest Neighbour Impute (KNN), Bayesian Principal Component Analysis Impute (BPCA) and ZeroImpute. ...
The approach uses the Collateral Missing Value Estimation (CMVE) algorithm as its core to estimate missing values in microarray gene expression data. ...
including: Least Square Impute (LSImpute), K-Nearest Neighbour Impute (KNN) and Bayesian Principal Component Analysis (BPCA). ...
doi:10.4304/jcp.1.7.1-8
fatcat:2spuzrkn6ng3zmbdwzznxnfhwq
Dynamic Feature Scaling for K-Nearest Neighbor Algorithm
[article]
2018
arXiv
pre-print
Nearest Neighbors Algorithm is a Lazy Learning Algorithm, in which the algorithm tries to approximate the predictions with the help of similar existing vectors in the training dataset. ...
A majority of the metrics such as Euclidean distance are scale variant, meaning that the results could vary for different range of values used for the features. ...
The KNN algorithm is a very rigid algorithm i.e all the data points are considered for every iteration. ...
arXiv:1811.05062v1
fatcat:xidkfbyjjvfc5bpj5jxnzpvwwm
Empirical Analysis of Software Effort Preprocessing Techniques Based on Machine Learning
2021
International Journal of Intelligent Engineering and Systems
The purpose of this paper is to evaluate and compare the performance of the proposed technique with the k-nearest neighbor imputation (kNNI) technique, random forest imputation, and multiple imputation ...
The results show that the three imputation methods have almost the same performance. ...
Feature selection The use of genetic algorithms (GA) as feature selection (FS) uses a parallel search random strategy, directed to the search for high fitness points, i.e. the point at which the function ...
doi:10.22266/ijies2021.1231.49
fatcat:w4wasj2mzvgyva3y2elyyo3g2a
An Efficient Missing Data Imputation Based On Co-Cluster Sparse Matrix Learning
2019
International Journal of Scientific Research in Computer Science Engineering and Information Technology
This algorithm learns without reference class, and even with data continuous missing rate as high as the existing techniques. ...
This makes the task of data processing challenging. This paper aims to design a solution for this problem which is very different from traditional approaches. ...
To extend MIAEC for large-scale data processing, they apply the map reduce programming model to realize the distribution and parallelization of MIAEC. ...
doi:10.32628/cseit195220
fatcat:rsi4z5kh4ncbvmeyofl7j37cre
Missing Value Aware Optimal Feature Selection Method for Efficient Big Data Mining Process
2019
International journal of recent technology and engineering
In this research method Improved KNN imputation algorithm is introduced to handle the missing values. ...
This is achieved in our previous research work by introducing the Enhanced Particle Swarm Optimization with Genetic Algorithm – Modified Artificial Neural Network (EPSOGA-MANN) which can select the optimal ...
doi:10.35940/ijrte.b1055.0982s1119
fatcat:wb7fohnfofc6zmogvhb3r5cxgu
A COMPARATIVE ANALYSIS OF CLASSIFICATION TECHNIQUES ON MEDICAL DATA SETS
2014
International Journal of Research in Engineering and Technology
The work has been implemented in WEKA environment and obtained results show that SVM is the most robust classification method and KNN is the least effective classifier for medical data sets. ...
In this paper, the analysis has been performed for five different classification algorithms in terms of accuracy, kappa statistics, execution time, mean absolute error under three datasets, collected from ...
Binary-coded genetic algorithms and Real-coded genetic algorithms are used for assigning weights to the features, so that set of optimal features can be deduced from high dimensional data. ...
doi:10.15623/ijret.2014.0306085
fatcat:2vbplzqd4fb3tkkvlxx5ytisba