Reproductive phasiRNAs in grasses are compositionally distinct from other classes of small RNAs [article]

Parth Patel, Sandra Mathioni, Atul Kakrana, Hagit Shatkay, Blake C. Meyers
2018 bioRxiv   pre-print
Little is known about the characteristics and function of reproductive phased, secondary, small interfering RNAs (phasiRNAs) in the Poaceae, despite the availability of significant genomic resources, experimental data, and a growing number of computational tools. We utilized machine-learning methods to identify sequence-based and structural features that distinguish phasiRNAs in rice and maize from other small RNAs (sRNAs). We developed Random Forest classifiers that can distinguish phasiRNAs from other sRNAs in complex sets of sequencing data, utilizing sequence-based features (k-mers) and features describing position-specific sequence biases. The classification performance attained is >80% in accuracy, sensitivity, specificity, and positive predictive value. Feature selection identified important features at both ends of phasiRNAs. We demonstrated that phasiRNAs have strand specificity and position-specific nucleotide biases that potentially influence AGO sorting; we also predicted targets to infer functions of phasiRNAs, and computationally assessed their sequence characteristics relative to other sRNAs. Our results demonstrate that machine-learning methods effectively identify phasiRNAs despite the lack of characteristic features typically present in precursor loci of other small RNAs, such as sequence conservation or structural motifs. The 5'-end features we identified provide insights into AGO-phasiRNA interactions; we describe a hypothetical model of competition for AGO loading between phasiRNAs of different nucleotide compositions.
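As an illustration of the two feature families named in the abstract (k-mer composition and position-specific nucleotide identity), the following is a minimal sketch and not the authors' implementation. The function names, the choice of k=2, the three-nucleotide end window, and the example sequence are all assumptions made for illustration; vectors of this kind could then be passed to any Random Forest implementation.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style "T" is converted to "U" in featurize()


def kmer_features(seq, k=2):
    """Return normalized counts of every overlapping k-mer over ACGU."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    n_windows = max(len(seq) - k + 1, 0)
    for i in range(n_windows):
        window = seq[i:i + k]
        if window in counts:  # windows with ambiguous bases (e.g. N) are skipped
            counts[window] += 1
    return [counts[m] / n_windows if n_windows else 0.0 for m in kmers]


def positional_features(seq, n_end=3):
    """One-hot encode the first and last n_end nucleotides (assumes len(seq) >= n_end >= 1)."""
    positions = list(seq[:n_end]) + list(seq[-n_end:])
    features = []
    for nt in positions:
        features.extend(1 if nt == base else 0 for base in ALPHABET)
    return features


def featurize(seq, k=2, n_end=3):
    """Concatenate k-mer and end-position features for one small RNA read."""
    seq = seq.upper().replace("T", "U")
    return kmer_features(seq, k) + positional_features(seq, n_end)


# Hypothetical 21-nt read used only to show the shape of the output.
example = "UGACAGAAGAGAGUGAGCACA"
vector = featurize(example)
```

With these assumed settings each read maps to 40 values: 16 dinucleotide frequencies followed by 24 one-hot entries covering three positions at each end.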
doi:10.1101/242727 fatcat:b5emy7o5wfggzpo7inage33bfi