Robust rank aggregation for gene list integration and meta-analysis

Raivo Kolde, Sven Laur, Priit Adler, Jaak Vilo
2012 Computer applications in the biosciences : CABIOS  
Motivation: The continued progress in developing technological platforms, the availability of many published experimental data sets, and the different statistical methods for analyzing those data have made it possible to approach the same research question with several methods simultaneously. To get the best out of all these alternatives, we need to integrate their results in an unbiased manner. Prioritized gene lists are a common way of presenting results in genomic data analysis applications. Thus, rank aggregation methods can become a useful and general solution for the integration task.
Results: Standard rank aggregation methods are often ill-suited for biological settings where the gene lists are inherently noisy. As a remedy, we propose a novel robust rank aggregation (RRA) method. Our method detects genes that are ranked consistently better than expected under the null hypothesis of uncorrelated inputs and assigns a significance score to each gene. The underlying probabilistic model makes the algorithm parameter-free and robust to outliers, noise and errors. Significance scores also provide a rigorous way to keep only the statistically relevant genes in the final list. These properties make our approach robust and compelling for many settings.
Availability: All the methods are implemented as a GNU R package RobustRankAggreg, freely available at the Comprehensive R Archive Network.
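
The abstract does not spell out how the significance score is computed, so the following is only an illustrative sketch of one way such a score could be built, not the package's implementation. It assumes that each gene's rank in each list is normalized to (0, 1], that under the null hypothesis of uncorrelated inputs these normalized ranks behave like independent uniform values, and that a gene missing from a list is given normalized rank 1.0. The function names (order_stat_pvalue, rho_score, aggregate) and the Bonferroni-style correction by the number of lists are assumptions made for this example.

```python
from math import comb


def order_stat_pvalue(x, k, n):
    """P(k-th smallest of n independent Uniform(0,1) values <= x).

    Equal to the upper tail of a Binomial(n, x) distribution at k.
    """
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def rho_score(normalized_ranks):
    """Smallest order-statistic p-value, multiplied by n and capped at 1.

    Assumption: multiplying by the number of lists is used here as a
    simple correction for having taken the minimum over k.
    """
    r = sorted(normalized_ranks)
    n = len(r)
    best = min(order_stat_pvalue(x, k, n) for k, x in enumerate(r, start=1))
    return min(1.0, n * best)


def aggregate(ranked_lists):
    """Score every gene seen in any list; return (gene, score) pairs, best first.

    Assumptions: ranks are normalized by the number of distinct genes
    across all lists, and a gene absent from a list gets rank 1.0.
    """
    universe = sorted({gene for lst in ranked_lists for gene in lst})
    size = len(universe)
    scores = {}
    for gene in universe:
        ranks = [
            (lst.index(gene) + 1) / size if gene in lst else 1.0
            for lst in ranked_lists
        ]
        scores[gene] = rho_score(ranks)
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    lists = [
        ["A", "B", "C", "D"],
        ["A", "C", "B", "D"],
        ["A", "D", "B", "C"],
    ]
    for gene, score in aggregate(lists):
        print(gene, score)
```

Because the score takes the minimum over order statistics, a gene ranked near the top in most lists keeps a small score even if one list places it last, which is the kind of robustness to outliers the abstract describes.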
doi:10.1093/bioinformatics/btr709 pmid:22247279 pmcid:PMC3278763 fatcat:7sn45nzggnd5lofj5jfkzmcgyi