Machine learning based imputation techniques for estimating phylogenetic trees from incomplete distance matrices [article]

Ananya Bhattacharjee, Md. Shamsuzzoha Bayzid
2019 bioRxiv   pre-print
Moreover, our proposed techniques can handle substantial amount of missing data, to the extent where the best alternate method fails.  ...  However, substantial challenges remain in leveraging this huge amount of molecular data. One of the foremost among these challenges is the need for efficient tools that can handle missing data.  ...  Matrix Factorization has previously been used in imputing missing data in various domains of bioinformatics, including analyzing scRNA-seq with missing data [52] , handling missing data in genome-wide  ... 
doi:10.1101/744789 fatcat:icrb4xdmqjcuzgtl4x642bzxp4

C3IMD: An Efficient Class-Based Clustering Classifier for Imputation Intelligent Medical Data

P. Premalatha, S Subasree, N K Sakthivel
2018 International Journal of Engineering & Technology  
The fast evolution in medical application yields to abundance of huge amount of data in volume and velocity.  ...  From the results obtained, it was revealed that the proposed Class-Based Clustering Classifier for Imputation Intelligent Medical Data (C3IMD) is outperforming both the existing models in terms of Classification  ...  of incomplete data using the concept of mean and median.  ... 
doi:10.14419/ijet.v7i2.27.12717 fatcat:tztydsexxvad3cxubcy7myjjkm

Deep Distribution-preserving Incomplete Clustering with Optimal Transport [article]

Mingjie Luo, Siwei Wang, Xinwang Liu, Wenxuan Tu, Yi Zhang, Xifeng Guo, Sihang Zhou, En Zhu
2021 arXiv   pre-print
Although various methods have been proposed, the performance of existing approaches drops dramatically when handling incomplete high-dimensional data (which is common in real world applications).  ...  Extensive experiments demonstrate that the proposed network achieves superior and stable clustering performance improvement against existing state-of-the-art incomplete clustering methods over different  ...  of the log-likelihood of the observed data. (9) MDIOT [19]: Missing Data Imputing using Optimal Transport.  ... 
arXiv:2103.11424v1 fatcat:zecohbhhrrdxvhcnhfc5vdjqo4

Clustering-Based Multiple Imputation via Gray Relational Analysis for Missing Data and Its Application to Aerospace Field

Jing Tian, Bing Yu, Dan Yu, Shilong Ma
2013 The Scientific World Journal  
Then, it utilizes the entropy of the proximal category for each incomplete instance in terms of the similarity metric based on gray relational analysis.  ...  In this paper, we propose a missing data completion method named CBGMI. Firstly, it separates the nonmissing data instances into several clusters by excluding the missing-valued entries.  ...  Acknowledgments This work is supported by Project of the State Key Laboratory of Software Development Environment, Beihang University (SKLSDE-2011ZX-09) and National Natural Science Foundation of China  ... 
doi:10.1155/2013/720392 pmid:23737724 pmcid:PMC3659482 fatcat:xwhg5d4kf5cz5hpgfxjxn2hbwy

Updating mortality risk estimation in intensive care units from high-dimensional electronic health records with incomplete data [article]

Bertrand Bouvarel, Fabrice Carrat, Nathanael Lapidus
2022 medRxiv   pre-print
Missing data were handled using either complete case analysis or multiple imputation.  ...  Multiple imputation allowed to include 70 predictors and keep 95% of patients, with similar performances, hence allowing predictions in patients with incomplete data.  ...  Complete case Imputed-19 Missing data Selected predictors were subject to missing values, to a large extent for some of them. Three approaches were compared to handle incomplete data.  ... 
doi:10.1101/2022.04.28.22274405 fatcat:vgflzacfazgrrn7zmmkkh5alhy

Pattern recognition based speed forecasting methodology for urban traffic network

Tamás Tettamanti, Alfréd Csikós, Krisztián Balázs Kis, Zsolt János Viharos, István Varga
2017 Transport  
As another contribution of the paper, a built-in incomplete data handling is provided as input data (originating from traffic sensors or Floating Car Data (FCD)) might be absent or biased in practice.  ...  Therefore, input data handling can assure a robust operation of speed forecasting also in case of missing data.  ...  Acknowledgements This paper was supported by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences.  ... 
doi:10.3846/16484142.2017.1352027 fatcat:cefovcnqdbdxjndf5zlx5qph7u

Machine learning based imputation techniques for estimating phylogenetic trees from incomplete distance matrices

Ananya Bhattacharjee, Md. Shamsuzzoha Bayzid
2020 BMC Genomics  
Moreover, these methods are scalable to large datasets with hundreds of taxa, and can handle a substantial amount of missing data.  ...  However, substantial challenges remain in leveraging these large scale molecular data. One of the foremost challenges is to develop efficient methods that can handle missing data.  ...  Matrix Factorization has previously been used in imputing missing data in various domains of bioinformatics, including analyzing scRNA-seq with missing data [78] , handling missing data in genome-wide  ... 
doi:10.1186/s12864-020-06892-5 pmid:32689946 fatcat:xw5uelfpgzft3jroeu4zbrhq7a

Treatment of non-response in longitudinal network studies

Mark Huisman, Christian Steglich
2008 Social Networks  
In the framework of stochastic actor-driven models for network change ("SIENA models"), different methods to cope with such incomplete data are investigated.  ...  The collection of longitudinal data on complete social networks often faces the problem of actor nonresponse.  ...  As the methods repeatedly sample from the conditional distribution of the missing data, they can also be used to impute the data sets.  ... 
doi:10.1016/j.socnet.2008.04.004 fatcat:idtv7nrgxje3dc327mx6otsdkq

Topic modeling for systematic review of visual analytics in incomplete longitudinal behavioral trial data

Joshua Rumbut, Hua Fang, Honggong Wang
2020 Smart Health  
, an integrated and comprehensive soft computing tool for behavioral trajectory pattern recognition, validation, and visualization of incomplete longitudinal data.  ...  analytic methods and actual working algorithms for longitudinal behavioral trial data.  ...  Methods for the analysis of incomplete data was rarely integrated into the development of visual analytics systems.  ... 
doi:10.1016/j.smhl.2020.100142 pmid:33344744 pmcid:PMC7745978 fatcat:oq2z3apep5dyna57dcnodiuzvy

Enhanced SVM based Ensemble Algorithm to Improve the Classification for High Dimensional Data

Kavitha S., M. Hemalatha
2015 International Journal of Computer Applications  
The preprocessing step consists of cleaning algorithms like normalization, missing value handling routines which enhance the quality of the gene microarray data and help to improve the subsequent steps  ...  This research work focuses on using machine learning classification algorithms for predicting the presence or absence of cancer.  ...  Missing Value Handling: Two major issues with the frequently used KNNImpute (K-Nearest Neighbour Imputation) algorithm for handling missing values in microarray data are its high time complexity and performance  ... 
doi:10.5120/ijca2015907340 fatcat:r4oua2rthrcehcc2xukzi4ukxa

Missing Data Imputation using Genetic Algorithm for Supervised Learning

Waseem Shahzad, Qamar Rehman, Ejaz Ahmed
2017 International Journal of Advanced Computer Science and Applications  
We show that our proposed methods outperform when compare with another state of the art missing data imputation techniques.  ...  Genetic algorithm (GA) is used for the estimation of missing values in datasets.  ...  data. • The method used in the imputation step must foresee the intended complete-data analysis.  ... 
doi:10.14569/ijacsa.2017.080360 fatcat:opaf7nbbgjax5f6sriceb7pmc4

Multiple Imputation: an attempt to retell the evolutionary process

Florian Meinfelder
2014 AStA Wirtschafts- und Sozialstatistisches Archiv  
The general concept of Multiple Imputation is explained using a simulated trivariate data set, and the imputation model is based on the standard Bayesian linear model, in order to explain the method as  ...  In this article, we are going to give a rough overview of the shortcomings of methods for handling missing data prior to Rubin's work in the late 1970s, and we explore the conceptual innovations that might  ...  Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided  ... 
doi:10.1007/s11943-014-0151-8 fatcat:fvnurt4r4zd7benra3izx5lfx4

Efficient and Effective Incomplete Multi-View Clustering

Xinwang Liu, Xinzhong Zhu, Miaomiao Li, Chang Tang, En Zhu, Jianping Yin, Wen Gao
Instead of completing the incomplete kernel matrices, EE-IMVC proposes to impute each incomplete base matrix generated by incomplete views with a learned consensus clustering matrix.  ...  Further, we conduct comprehensive experiments to study the proposed EE-IMVC in terms of clustering accuracy, running time, evolution of the learned consensus clustering matrix and the convergence.  ...  Xinzhong Zhu is the corresponding author of this paper.  ... 
doi:10.1609/aaai.v33i01.33014392 fatcat:m7biowcgqbe63ovkmknyh44hha

Systematic Review on Missing Data Imputation Techniques with Machine Learning Algorithms for Healthcare

Amelia Ritahani Ismail, Nadzurah Zainal Abidin, Mhd Khaled Maen
2022 Journal of Robotics and Control (JRC)  
This paper provides a comprehensive review of different imputation techniques used to replace the missing data.  ...  Therefore, to accurately deal with incomplete data, a sophisticated algorithm is proposed to impute those missing values.  ...  ACKNOWLEDGMENT This research was supported by the Ministry of Education for Fundamental Research Grant Scheme (FRGS): FRGS/1/2018/ICT02/UIAM/02/1.  ... 
doi:10.18196/jrc.v3i2.13133 fatcat:7e6olnt75zd3fbctrgmvsp6x2e

Optimization of Feature Selection Using Genetic Algorithm in Naïve Bayes Classification for Incomplete Data

Bain Khotimah, Miswanto Miswanto, Herry Suprajitno (University of Airlangga Surabaya)
2020 International Journal of Intelligent Engineering and Systems  
In the experiment, preprocessing the data using SOMI yielded error results that were up to 10% for various data sets with missing data compared to other methods.  ...  SOMI can be used for homogeneous, heterogeneous and mixed data sets.  ...  Hot Deck is an improvement of missing data imputation using mean/average, modus or median mode to provide better results compared to deleting incomplete data or when compared to imputation methods with  ... 
doi:10.22266/ijies2020.0229.31 fatcat:jg4rodstmveajflqx73vpc5nyy