Classification of Precursor MicroRNAs from Different Species Based on K-mer Distance Features

Malik Yousef, Jens Allmer
<span title="">2021</span> <i title="MDPI AG"> <a target="_blank" rel="noopener" href="" style="color: black;">Algorithms</a> </i> &nbsp;
MicroRNAs (miRNAs) are short RNA sequences that are actively involved in gene regulation. These regulators on the post-transcriptional level have been discovered in virtually all eukaryotic organisms. Additionally, miRNAs seem to exist in viruses and might also be produced in microbial pathogens. Initially, transcribed RNA is cleaved by Drosha, producing precursor miRNAs. We have previously shown that it is possible to distinguish between microRNA precursors of different clades by representing
more &raquo; ... he sequences in a k-mer feature space. The k-mer representation considers the frequency of a k-mer in the given sequence. We further hypothesized that the relationship between k-mers (e.g., distance between k-mers) could be useful for classification. Three different distance-based features were created, tested, and compared. The three feature sets were entitled inter k-mer distance, k-mer location distance, and k-mer first–last distance. Here, we show that classification performance above 80% (depending on the evolutionary distance) is possible with a combination of distance-based and regular k-mer features. With these novel features, classification at closer evolutionary distances is better than using k-mers alone. Combining the features leads to accurate classification for larger evolutionary distances. For example, categorizing Homo sapiens versus Brassicaceae leads to an accuracy of 93%. When considering average accuracy, the novel distance-based features lead to an overall increase in effectiveness. On the contrary, secondary-structure-based features did not lead to any effective separation among clades in this study. With this line of research, we support the differentiation between true and false miRNAs detected from next-generation sequencing data, provide an additional viewpoint for confirming miRNAs when the species of origin is known, and open up a new strategy for analyzing miRNA evolution.
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="">doi:10.3390/a14050132</a> <a target="_blank" rel="external noopener" href="">doaj:ef01af41922545cc878e3a182b36b050</a> <a target="_blank" rel="external noopener" href="">fatcat:74ejw4ia7faxfgcduorginrige</a> </span>
<a target="_blank" rel="noopener" href="" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href=""> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> </button> </a>