BrumiR: A toolkit for de novo discovery of microRNAs from sRNA-seq data [article]

Carol Moraga, Evelyn Sanchez, Mariana Galvao Ferrarini, Rodrigo A Gutierrez, Elena A Vidal, Marie-France Sagot
2020 bioRxiv pre-print
MicroRNAs (miRNAs) are small non-coding RNAs that are key players in the regulation of gene expression. In the last decade, with the increasing accessibility of high-throughput sequencing technologies, different methods have been developed to identify miRNAs, most of which rely on pre-existing reference genomes. However, when a reference genome is absent or is not of high quality, such identification becomes more difficult. In this context, we developed BrumiR, an algorithm that is able to discover miRNAs directly and exclusively from sRNA-seq data. We benchmarked BrumiR with datasets encompassing animal and plant species using real and simulated sRNA-seq experiments. The results demonstrate that BrumiR reaches the highest recall for miRNA discovery, while at the same time being much faster and more efficient than the state-of-the-art tools evaluated. This efficiency allows BrumiR to analyze a large number of sRNA-seq experiments from plant or animal species. Moreover, BrumiR detects additional information regarding other expressed sequences (sRNAs, isomiRs, etc.), thus maximizing the biological insight gained from sRNA-seq experiments. Finally, when a reference genome is available, BrumiR provides a new mapping tool (BrumiR2ref) that performs an a posteriori exhaustive search to identify the precursor sequences. The code of BrumiR is freely available at https://github.com/camoragaq/BrumiR.
doi:10.1101/2020.08.07.240689 fatcat:mgmoxdbhfrhxxmokcjptdjscgi