Comparison of location-scale and matrix factorization batch effect removal methods on gene expression datasets

Emilie Renard, P.-A. Absil
2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
Merging gene expression datasets is a simple way to increase the number of samples in an analysis. However, experimental and data processing conditions, which are specific to each dataset or batch, generally influence the expression values and can hide the biological effect of interest. It is therefore important to normalize the larger merged dataset, as failing to adjust for those batch effects may adversely impact statistical inference. Batch effect removal methods are generally based on a location-scale approach; however, less widespread methods based on matrix factorization have also been proposed. Using breast cancer data, we investigate how those batch effect removal methods improve (or possibly degrade) the performance of simple classifiers. Our results indicate that the matrix factorization approach deserves greater attention, as it gives results at least as good as common location-scale methods, and even significantly better results in specific cases.
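As a rough illustration of what a location-scale adjustment does, the sketch below standardizes each gene within each batch so that per-batch means and variances match. It is a minimal example on synthetic data, not the procedure evaluated in the paper; the function name `location_scale_adjust`, the samples-by-genes layout, and the simulated shift and scale are all assumptions made for illustration.

```python
import numpy as np

def location_scale_adjust(X, batches):
    """Center and scale every gene separately within each batch.

    X       : array of shape (n_samples, n_genes), assumed layout
    batches : array of length n_samples with one batch label per sample
    """
    X = np.asarray(X, dtype=float)
    batches = np.asarray(batches)
    out = np.empty_like(X)
    for b in np.unique(batches):
        rows = batches == b
        mean = X[rows].mean(axis=0)   # per-gene location within the batch
        std = X[rows].std(axis=0)     # per-gene scale within the batch
        std[std == 0] = 1.0           # leave constant genes unscaled
        out[rows] = (X[rows] - mean) / std
    return out

# Synthetic data: the second batch is shifted and rescaled (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
X[3:] = 2.0 * X[3:] + 5.0
batches = np.array([0, 0, 0, 1, 1, 1])

adjusted = location_scale_adjust(X, batches)
```

After the adjustment, every gene has zero mean and unit standard deviation inside each batch. Note that this naive version also removes any biological signal that happens to be confounded with the batch, which is one way such a correction can degrade classifier performance.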
doi:10.1109/bibm.2017.8217888 dblp:conf/bibm/RenardA17 fatcat:d2ijzlnkk5erbdqcvtlwswt67q